AWS and other interesting stuff

Relational Database Service (RDS)

Amazon RDS

A managed Relational Database Service

  • Fast
  • Cost efficient
  • Resizable
  • Secure - VPCs and encryption is supported, both at rest and in transit (SSL).
    • IAM users and permissions can control who has access to the RDS databases
  • Highly available
  • Minimal administration
  • Shell access is not provided

RDS Components

  • DB Instance
    • Each instance has a customer supplied “instance identifier” which must be unique for a customer within a region.
  • DB Parameter Groups
    • i.e. command line parameters e.g. MySQL use innodb
  • DB Option Groups
    • Add ons e.g. MySQL memcache support, SQL Server Transparent Data Encryption, Oracle Application Express (APEX).

Pricing

Pricing is determined by:

  • Instance Class (micro, small etc)
  • Storage
    • GB per month
    • If storage is scaled the bill will be pro-rated
    • Aurora charges for storage are different:
    • e.g. $0.110 per GB-month + $0.222 per million requests
  • Data
    • Data in is free
    • Data out to another AWS region is ~ $0.140 per GB
    • Data out to the internet is free up to 1GB then ~ $0.140 per GB or lower
  • Backups
    • There is no additional charge for backups up to 100% of your provisioned space.
    • Backups are often smaller than provisioned space, meaning no additional cost.
    • Backup storage is free for active DB instances only

Engines

  • MySQL
  • MariaDB
  • MSSQL
  • PostgreSQL
  • Oracle
  • Amazon Aurora

Responsibilities

  • RDS
    • Provision infrastructure
    • Install database software
    • Automatic backups
    • Automatic patching
    • Synchronous data replication
    • Automatic Failover
  • Customer
    • Settings
    • Schema
    • Performance tuning

Scaling

Vertical

  • Change the instance type (you can use reserved instance with RDS)
  • Upgrade storage
    • 5GB to 6TB
    • Storage types:
    • Magnetic
    • General Purpose (SSD) storage
      • 3 IOPS/GB
      • Burst up to 3,000 IOPS
    • Provisioned IOPS (SSD) storage can be used too
      • Fast, predictable, consistent
    • IOPS available varies by engine
    • RDS automatically stripes across multiple EBS volumes to enhance IOPS performance, depending on the amount of storage requested.
    • Can be scaled live (but is slow)
    • MSSSQL does not let you scale your storage

Horizontal

  • Read replicas (Automatic backups must be enabled for these to work)
  • Master/Slave
  • Up to 5 read replicas
  • Replicas can be promoted to master
  • You can create a replica of a replica

Sharding

Split database into multiple databases. Works for tables you don’t need to join. e.g. create a read replica, promote it to master, and drop the tables you no longer need

Best Practice

  • Max 10,000 tables for provisioned IOPs and 1000 for standard storage
  • Partition tables so that file sizes are well under the 6TB limit
  • Different instance sizes have different IO capacity, so upgrade to get more capacity
  • Use Amazon RDS EB event to monitor failovers
  • Set a TTL of less than 30 seconds as failover is done with DNS
  • Test failover to understand how long the process takes
  • Provision enough RAM so the working set can reside in memory
    • i.e. The value of ReadIOPS should be small and stable
  • If migrating to an instance with more RAM results in a large drop in ReadIOPS was not all in memory

Enhanced Monitoring

Realtime metrics for the OS: CPU, memory, disk space, IOPS, db connection, network traffic.

MySQL and MariaDB

  • Innodb is better for recover, MyISAM is better for full text search.
  • XtraDB is the recommended/default engine for MariaDB

Maintenance

  • You set a weekly 30 minute maintenance window.
  • Can be applied manually or automatically in the defined window.
  • The maintenance window defines when operations starts only; how long it takes is variable
  • Maintenance items can be marked as Required or Available. Only Available ones can be optionally deferred indefinitely. Required can be deferred for a short time.
  • Required patching is automatically scheduled only for security and reliability patches.
  • Major version upgrades are not applied automatically, but Minor ones are.
  • A snapshot is taken before and after an upgrade
  • Read replicas are upgraded before the source

Performance Metrics

RDS uses 1 minute intervals

  • IOPS
  • Latency
  • Throughput
  • Queue Depth

Multi-AZ

  • Multi-AZ deployments for Oracle, PostgreSQL, MySQL, and MariaDB DB instances use Amazon technology, while SQL Server DB instances use SQL Server Mirroring.
  • Failover is done by changing the DNS record (CNAME)
  • There is a reboot with failover option (AKA Forced Failover)

Read Replicas

  • When creating read replicas there is a brief pause < 1 min on the master, unless multi-AZ is enabled as the stand-by can be used.
  • If the source is deleted all read replicas will be promoted to masters
  • Read replicas will switch masters when there is a multi-AZ failover
  • As of June 2016, MySQL, MariaDB and PostgresSQL can have read replicas in a different region to the master.

Amazon Aurora

Fast and reliable like high-end commercial databases

Simple and cost effective like open source databases

  • Up to 5x improvement from MySQL on the same hardware
  • Up to 15 read replicas
  • Compatible with MySQL 5.6
  • Storage is fault tolerant and self healing
  • Detects crashes and restarts
  • Storage Auto-scaling from 10GB to 64TB (no resizing disruption)

Backups

  • Automatic, continuous, incremental backups
  • Point-in-time restore within a second
  • Up to 35 day retention period
  • Stored in S3
  • No impact on database performance

Snapshots

  • User initiated
  • Stored in S3
  • Kept until you delete them
  • Incremental

Failure and Fault Tolerance

  • Aurora keeps 6 copies across 3 AZs
  • Data divided into 10GB segments across many disks
  • You can lose up to 2 copies of your data without affecting writes
  • You can lose up to 3 copies of your data without affecting reads

Replicas

Amazon Aurora Replicas

  • Share underlying volume with primary instance
  • Updates made to primary are visible to all replicas
  • Up to 15 of them
  • Low performance impact on primary
  • Replica can be a failover target with no data loss

MySQL Read Replicas

  • Primary instance data is replayed on your replica as transactions
  • Up to 5 of them
  • High performance impact on primary
  • Replica can be a failover target, but you may potentially have minutes of data loss

Security

  • Must be in a VPC
  • SSL encrypts data in transit
  • You can encrypt databases using KMS
  • Encryption is applied to
    • Storage
    • Backups
    • Snapshots
    • Replicas
  • You can’t encrypt an unencrypted database