We can categorize the databases in AWS into the following types:
- OLTP (RDS)
- OLAP
- NoSQL
- In-memory
- Graph
1. OLTP (Online Transaction Processing) DB:
In AWS, OLTP workloads are typically served by Amazon RDS (Relational Database Service). Features of OLTP databases in AWS are:
- Relational DB
- Row-wise data storage
- Predefined schema
- Strong transactional (ACID) guarantees.
- Multi-AZ deployment
- Read Replicas in the same AZ, in other AZs, or across Regions.
- Storage backed by EBS SSD volumes (e.g., gp2 or io1); Aurora instead uses its own cluster storage volume.
- e.g., Amazon RDS (Relational Database Service) with the Oracle, MySQL, Microsoft SQL Server, PostgreSQL, and MariaDB engines, plus Amazon Aurora (Aurora is AWS’s proprietary relational engine, which AWS promotes heavily. It comes in two flavors: MySQL-compatible and PostgreSQL-compatible).
- However, you can’t SSH into RDS instances.
- Fully managed.
- Storage can scale automatically (RDS storage autoscaling is opt-in; Aurora storage grows automatically).
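To make "predefined schema" and "strong transactional guarantees" concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` module purely as a local stand-in for an RDS engine (an assumption for illustration; a real application would connect to the RDS endpoint with the engine's own driver). The `accounts` table and the `transfer` helper are hypothetical.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; undo everything on failure."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance, so the credit
        # made by the first statement was rolled back as well.
        return False

conn = sqlite3.connect(":memory:")
# Predefined schema: column types and constraints are declared up front.
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # enough funds: both updates are committed
failed = transfer(conn, 1, 2, 500)  # overdraft: both updates are rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

The second transfer credits account 2 before the debit fails, yet the stored balances stay unchanged, which is the all-or-nothing behavior OLTP systems are chosen for.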
Amazon Aurora:
Amazon Aurora typically involves a cluster of DB instances instead of a single instance. Each connection is handled by a specific DB instance. When you connect to an Aurora cluster, the hostname and port that you specify point to an intermediate handler called an endpoint. Aurora uses the endpoint mechanism to abstract these connections. Thus, you don’t have to hardcode all the hostnames or write your own logic for load-balancing and rerouting connections when some DB instances aren’t available.
For certain Aurora tasks, different instances or groups of instances perform different roles. For example, the primary instance handles all data definition language (DDL) and data manipulation language (DML) statements. Up to 15 Aurora Replicas handle read-only query traffic.
Using endpoints, you can map each connection to the appropriate instance or group of instances based on your use case. For example, to perform DDL statements, you can connect to whichever instance is the primary instance. To perform queries, you can connect to the reader endpoint, with Aurora automatically performing load-balancing among all the Aurora Replicas. For clusters with DB instances of different capacities or configurations, you can connect to custom endpoints associated with different subsets of DB instances. For diagnosis or tuning, you can connect to a specific instance endpoint to examine details about a specific DB instance.
A reader endpoint for an Aurora DB cluster provides load-balancing support for read-only connections to the DB cluster. Use the reader endpoint for read operations, such as queries. By processing those statements on the read-only Aurora Replicas, this endpoint reduces the overhead on the primary instance. It also helps the cluster to scale the capacity to handle simultaneous SELECT queries, proportional to the number of Aurora Replicas in the cluster. Each Aurora DB cluster has one reader endpoint.
If the cluster contains one or more Aurora Replicas, the reader endpoint load balances each connection request among the Aurora Replicas. In that case, you can only perform read-only statements such as SELECT in that session. If the cluster only contains a primary instance and no Aurora Replicas, the reader endpoint connects to the primary instance. In that case, you can perform write operations through the endpoint.
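The endpoint mapping described above could be sketched in application code roughly as follows. The two hostnames are placeholders (not a real cluster), and `choose_endpoint` is a hypothetical helper; it only looks at the first keyword of a statement, which is a simplification.

```python
# Placeholder hostnames for illustration; a real cluster's endpoints are
# listed in the RDS console for that cluster.
WRITER_ENDPOINT = "mycluster.cluster-abc123.us-east-1.rds.amazonaws.com"
READER_ENDPOINT = "mycluster.cluster-ro-abc123.us-east-1.rds.amazonaws.com"

# Statements treated as read-only in this sketch (an assumption, not a full list).
READ_ONLY_KEYWORDS = {"SELECT", "SHOW", "EXPLAIN"}

def choose_endpoint(sql: str) -> str:
    """Send read-only statements to the reader endpoint, everything else
    (DDL and DML) to the endpoint that points at the primary instance."""
    words = sql.split()
    first_word = words[0].upper() if words else ""
    if first_word in READ_ONLY_KEYWORDS:
        return READER_ENDPOINT
    return WRITER_ENDPOINT

read_target = choose_endpoint("SELECT * FROM orders WHERE id = 7")
write_target = choose_endpoint("INSERT INTO orders (id) VALUES (7)")
ddl_target = choose_endpoint("CREATE TABLE orders (id INT)")
```

A real router would also need to account for cases this sketch ignores, such as reads that must see a write made moments earlier, since the replicas can lag slightly behind the primary.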
Aurora DB Cluster:
RDS Read Replicas vs. Multi-AZ:
Read Replicas:
- They help you scale your read traffic.
- They can be in the same AZ, in a different AZ, or in a different Region.
- Replication is asynchronous.
- Up to 15 Read Replicas can be created (the exact limit depends on the DB engine).
- A Read Replica can also be promoted to a standalone Master DB if required.
- P.S.: Read Replicas do not increase the read throughput of the primary DB instance itself; they give your application additional instances to fetch data from, which takes read traffic off the primary.
(In the below diagram the “M” stands for Master and “R” stands for Read-Replica)
Multi-AZ Deployments:
- It’s mainly for disaster recovery and high availability rather than for scaling.
- When you create or modify your DB instance to run as a Multi-AZ deployment, Amazon RDS automatically provisions and maintains a synchronous standby replica in a different Availability Zone. Updates to your DB Instance are synchronously replicated across Availability Zones to the standby in order to keep both in sync and protect your latest database updates against DB instance failure.
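The difference between the asynchronous replication used by Read Replicas and the synchronous replication used by a Multi-AZ standby can be illustrated with a toy in-memory model. This is only a sketch of the idea, not how RDS is implemented; the `Primary` class and its attributes are hypothetical.

```python
from collections import deque

class Primary:
    """Toy primary with one synchronous standby and one asynchronous replica."""

    def __init__(self):
        self.data = {}
        self.standby = {}        # Multi-AZ standby: updated synchronously
        self.replica = {}        # Read Replica: updated asynchronously
        self._pending = deque()  # changes not yet applied to the replica

    def write(self, key, value):
        self.data[key] = value
        self.standby[key] = value           # applied before the write returns
        self._pending.append((key, value))  # shipped to the replica later

    def replicate(self):
        """Apply queued changes to the Read Replica (it catches up)."""
        while self._pending:
            key, value = self._pending.popleft()
            self.replica[key] = value

db = Primary()
db.write("order-1", "paid")
standby_has_it = "order-1" in db.standby  # standby is already in sync
replica_before = "order-1" in db.replica  # replica has not caught up yet
db.replicate()
replica_after = "order-1" in db.replica   # replica is now up to date
```

This is why a standby can take over without losing committed writes, while a read from a replica may briefly return slightly stale data.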
The table below shows the difference between the two:
RDS Security:
2. OLAP (Online Analytical Processing) DB:
- Relational DB
- Column-wise data storage
- Far more reads than writes to the DB.
- Predefined schema
- Used for analytics, business intelligence, and data warehousing.
- Used for ETL (Extract, Transform, Load) use cases.
- Supports standard SQL.
- e.g., Amazon Redshift.
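To see why column-wise storage suits analytics while row-wise storage suits transactions, consider this small sketch. The `rows` data is made up for illustration, and plain Python lists and dicts stand in for the storage engine.

```python
# Row-wise layout: each record is stored together (typical for OLTP).
rows = [
    {"order_id": 1, "region": "EU", "amount": 120},
    {"order_id": 2, "region": "US", "amount": 80},
    {"order_id": 3, "region": "EU", "amount": 200},
]

# Column-wise layout: all values of one column are stored together (typical for OLAP).
columns = {name: [row[name] for row in rows] for name in rows[0]}

# OLTP-style access: fetch one complete record by key.
order_2 = next(row for row in rows if row["order_id"] == 2)

# OLAP-style access: an aggregate only needs to scan a single column.
total_amount = sum(columns["amount"])
```

With the column layout, the aggregate never touches `order_id` or `region`, which is the main reason columnar warehouses can scan very large tables efficiently.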
3. NoSQL (Document & Key-Value) DB:
- Schema-less
- Stores semi-structured data.
- Suited to quickly evolving data.
- Millisecond response times
- e.g., Amazon DynamoDB (key-value), Amazon DocumentDB (document)
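"Schema-less" means that items in the same table do not have to share the same attributes. A minimal sketch, using a plain dict as a stand-in for a key-value or document table (the `put_item` and `get_item` helpers and the sample items are hypothetical and are not calls into any AWS SDK):

```python
table = {}  # stand-in for a key-value/document table, keyed by user_id

def put_item(item):
    """Store an item under its key; no fixed set of attributes is enforced."""
    table[item["user_id"]] = item

def get_item(user_id):
    """Look an item up by key; returns None if it does not exist."""
    return table.get(user_id)

# Two items in the same table with different shapes.
put_item({"user_id": "u1", "name": "Asha"})
put_item({"user_id": "u2", "name": "Ben", "tags": ["beta"],
          "address": {"city": "Pune"}})

city = get_item("u2")["address"]["city"]
u1_has_tags = "tags" in get_item("u1")
```

Only the key is required; new attributes can appear on later items without a schema migration, which is what makes this model convenient for quickly evolving data.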
4. In-Memory/Cache DB:
- For microsecond response times
- e.g., Amazon ElastiCache (Memcached: simple cache; Redis: persistent cache)
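One common way to use an in-memory cache in front of a database is to check the cache first and fall back to the database on a miss. The sketch below assumes that pattern; a dict stands in for the cache, and `slow_db_lookup` is a hypothetical placeholder for a real database query.

```python
import time

cache = {}    # stand-in for an in-memory cache: key -> (value, expiry time)
db_reads = 0  # counts how often the "database" is actually queried

def slow_db_lookup(key):
    """Hypothetical placeholder for a comparatively slow database read."""
    global db_reads
    db_reads += 1
    return f"value-for-{key}"

def get(key, ttl_seconds=60):
    """Return the cached value if it is still fresh, otherwise load and cache it."""
    entry = cache.get(key)
    if entry is not None and entry[1] > time.monotonic():
        return entry[0]                  # cache hit
    value = slow_db_lookup(key)          # cache miss: go to the database
    cache[key] = (value, time.monotonic() + ttl_seconds)
    return value

first = get("user:1")   # miss: reads from the database
second = get("user:1")  # hit: served from memory
```

The second call never reaches the database, which is where the latency gain comes from.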
5. Graph DB:
- For highly connected relationship data, such as fraud detection and social networking data.
- e.g., Amazon Neptune
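As a rough illustration of the fraud-detection use case, the sketch below links accounts that share a device or a card and then walks the graph to find every account connected to a given one. The data and the `linked_accounts` helper are made up, and a real graph database would express this as a graph query instead of hand-written traversal code.

```python
from collections import deque

# Hypothetical relationships: which account used which device or card.
edges = [
    ("acct:A", "device:1"),
    ("acct:B", "device:1"),
    ("acct:B", "card:9"),
    ("acct:C", "card:9"),
    ("acct:D", "device:2"),
]

# Build an undirected adjacency map.
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

def linked_accounts(start):
    """Breadth-first search for all accounts reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return sorted(n for n in seen if n.startswith("acct:") and n != start)

ring = linked_accounts("acct:A")
isolated = linked_accounts("acct:D")
```

Account C never shared anything directly with account A, yet it shows up through account B; multi-hop questions like this are awkward in relational joins and natural in a graph model.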