AWS and other interesting stuff

Amazon Elastic Block Store

Elastic Block Store (EBS)

Amazon EBS provides highly available, reliable, durable, block-level storage volumes that can be attached to a running instance.

  • EBS volumes are constrained to the AZ they’re in.
  • The root EBS volume is deleted by default on instance termination, but this can be changed with the DeleteOnTermination flag
  • EBS volumes can be created and attached to a running EC2 instance by specifying a block device mapping
  • EBS volumes can be detached from an instance explicitly or by terminating the instance
    • EBS data volumes attached to a running instance should be unmounted from the instance before being detached. If a volume is detached without being unmounted, it might get stuck in the busy state, and the file system or the data it contains could be damaged

Performance

Fundamental Definitions

  • Capacity - data in GB that can be stored on a volume
  • Throughput - the amount of data read or written per second, measured in MB/s
  • Block Size - the size of each read / write operation, measured in KB
  • IOPS - the number of Input and Output operations per second
  • Latency - the delay between a read / write request and its completion, measured in ms

Things That Influence Performance

  • Instance Type and Settings
    • EBS optimised flag: separates EBS traffic from other traffic, minimising contention
    • Provides dedicated throughput to EBS, with options ranging from 500 Mb/s to 12,000 Mb/s depending on the instance type
  • IO Profile
    • IOPS (up to 10,000 or 20,000 depending on the type of volume) and Throughput (up to 160 MB/s or 320 MB/s depending on the type of volume)
    • These limits are independent, so a workload usually hits one before the other
      • e.g. 20,000 IOPS x 256KB blocks = 5,120,000 KB/s = 5,000 MB/s, which is far above the throughput limit, i.e. you can’t reach the max IOPS in this situation
      • e.g. if you have an application that has an 8KB or 16KB block size requirement, you’ll hit the IOPS limit before the throughput limit.
  • Network Speed
    • This is dependent on instance size e.g. c4.large has a smaller network throughput (500 Mb/s) than a m4.10xlarge (4,000 Mb/s)
    • Using EBS optimised instances frees up network capacity by removing contention between storage and network traffic.
  • EBS Volume Type
    • GP2 = baseline + burst
    • PIOPS = Provisioned IOPS
    • ST1 = Throughput optimized HDD
    • SC1 = Cold HDD
    • Standard = Magnetic volumes

Examples

Block Size Influence

Throughput = Block Size x IOPS

For example, an EBS volume: 1TB GP2 = 3,000 IOPS and 160MB/s

  • 32KB Block Size
    • 1,000 IOPS = 32MB/s
    • 2,000 IOPS = 64MB/s
    • 3,000 IOPS = 96 MB/s
  • 64KB Block Size
    • 3,000 IOPS = 192 MB/s > 160MB/s
  • 256KB Block Size
    • 750 IOPS = 192 MB/s > 160MB/s
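
The block size arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name is made up for this example, the limits are the 1TB GP2 figures quoted above, and 1 MB is treated as 1,000 KB to match the rounded numbers in the list.

```python
def effective_throughput_mb_s(block_size_kb, iops_limit, throughput_limit_mb_s):
    """Return (achievable MB/s, achievable IOPS, limit that binds) for one volume."""
    # Throughput the workload would need in order to run at the full IOPS limit.
    # Assumption: 1 MB = 1,000 KB, matching the rounded figures in these notes.
    demanded_mb_s = block_size_kb * iops_limit / 1000
    if demanded_mb_s <= throughput_limit_mb_s:
        return demanded_mb_s, iops_limit, "iops"
    # Throughput saturates first, so the achievable IOPS drops below the limit.
    achievable_iops = throughput_limit_mb_s * 1000 / block_size_kb
    return throughput_limit_mb_s, achievable_iops, "throughput"


# 1TB GP2 volume from the example: 3,000 IOPS and 160 MB/s.
for block_size_kb in (32, 64, 256):
    print(block_size_kb, effective_throughput_mb_s(block_size_kb, 3000, 160))
```

With 32KB blocks the volume is IOPS-bound at 96 MB/s. With 64KB or 256KB blocks it is throughput-bound at 160 MB/s, which works out to 2,500 and 625 IOPS respectively.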

Exceeding The Limits Of EBS

Large EBS optimised instances can deliver 32,000 IOPS at a 16KB block size, or from 500 Mb/s to 10,000 Mb/s of throughput

Max IOPS is 48,000 IOPS @ 16K delivered by the larger 10 Gb/s network stacks.

How is this possible when EBS doesn’t support this level of IOPS and throughput?

The answer is:

  • Use a larger EC2 instance type that is EBS Optimised, or has 10 Gb/s networking, or both.
  • Attach multiple EBS volumes and configure them as a RAID 0 or LVM Stripe volume
  • Use 128 or 256 KB stripe size to ensure the best possible performance.
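
As a rough sketch of why striping helps, the snippet below adds up the limits of identical volumes and caps the result at the instance's own EBS limits. It assumes ideal scaling (I/O spread evenly across the stripe). The function name and the example figures are illustrative only: 10,000 IOPS and 160 MB/s per volume, and an instance limited to 48,000 IOPS and 1,250 MB/s, i.e. 10 Gb/s.

```python
def striped_limits(volume_count, volume_iops, volume_mb_s, instance_iops, instance_mb_s):
    """Estimate (IOPS, MB/s) for identical volumes striped as RAID 0 or LVM."""
    # Striping spreads I/O across all members, so the volume limits add up...
    total_iops = volume_count * volume_iops
    total_mb_s = volume_count * volume_mb_s
    # ...until the instance's own EBS limits become the bottleneck.
    return min(total_iops, instance_iops), min(total_mb_s, instance_mb_s)


instance_mb_s = 10_000 / 8  # 10,000 Mb/s expressed in MB/s

print(striped_limits(4, 10_000, 160, 48_000, instance_mb_s))  # volume-bound
print(striped_limits(6, 10_000, 160, 48_000, instance_mb_s))  # IOPS capped by the instance
```

In this model four volumes give 40,000 IOPS and 640 MB/s, while six volumes are capped at the instance's 48,000 IOPS.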

EBS Types

Monitoring

For burstable volumes (gp2, st1, sc1) you should use CloudWatch to monitor your BurstBalance.

gp2 volumes

  • Can be 1 GiB to 16 TiB
  • 99% performance consistency

gp2 Burst Bucket

  • The smallest baseline performance for a volume is 100 IOPS e.g. a 1GB volume still gets this amount.
  • All GP2 volumes start with a balance of 5,400,000 I/O credits
    • The pool provides enough burst performance for boot and traditional usage spikes. A 500 GB volume can burst @ 3,000 IOPS for 60 minutes. A boot will need much less than this, leaving lots of capacity for spikes.
  • It’s replenished every second by the number of Base IOPS (3 per GB)
  • The pool can be spent at up to the 3,000 Max Burst Rate per second
  • Deliver within 10% of their baseline performance 99% of the time in a given year.
  • Volumes >= 1,000 GB don’t use burst, as their baseline of 3,000 IOPS or more already matches the burst rate
  • 214 GB is the minimum size for maximum throughput
    • 214 x 3 = 642 IOPS
    • 642 IOPS x 256KB Blocks = 160 MB/s
  • 3,334 GB is the minimum size for maximum IOPS
    • 3,334 x 3 = 10,002 IOPS, capped at 10,000 IOPS

Burst duration (seconds) = Credit Balance / (Burst IOPS - 3 x (Volume size in GiB))
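
The burst bucket rules can be checked with a short Python sketch. The constants are the figures quoted in these notes, and the function names are made up for this example. One assumption goes beyond the formula as written: the 100 IOPS floor is applied to the refill rate of very small volumes.

```python
GP2_INITIAL_CREDITS = 5_400_000  # starting I/O credit balance
GP2_BURST_IOPS = 3_000           # maximum burst rate
GP2_MIN_IOPS = 100               # smallest baseline
GP2_MAX_IOPS = 10_000            # largest baseline


def gp2_baseline_iops(size_gib):
    """Baseline is 3 IOPS per GiB, clamped to the 100..10,000 range."""
    return min(max(GP2_MIN_IOPS, 3 * size_gib), GP2_MAX_IOPS)


def gp2_burst_seconds(size_gib, credits=GP2_INITIAL_CREDITS):
    """Seconds a volume can sustain the full burst rate, or None if it never bursts."""
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= GP2_BURST_IOPS:
        return None  # baseline already meets or exceeds the burst rate
    # Credits drain at the burst rate while refilling at the baseline rate.
    return credits / (GP2_BURST_IOPS - baseline)


print(gp2_burst_seconds(500) / 60)  # minutes of burst for a 500 GiB volume
print(gp2_baseline_iops(3334))      # capped baseline for a 3,334 GiB volume
```

For a 500 GiB volume this gives 3,600 seconds, matching the 60 minutes quoted above.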

io1 volumes

  • Can be 4 GiB to 16 TiB
  • Maximum IOPS is 20,000
    • The maximum ratio of provisioned IOPS to requested volume size (in GiB) is 50:1. For example, a 100 GiB volume can be provisioned with up to 5,000 IOPS.
  • Throughput limit is 320 MiB/s
  • Deliver within 10% of their provisioned performance 99.9% of a given year.
  • For the best per-I/O latency experience, we recommend that you provision an IOPS-to-GiB ratio greater than 2:1. For example, a 2,000 IOPS volume should be smaller than 1,000 GiB.
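
The io1 sizing rules above reduce to two small checks. The following Python sketch uses the limits quoted in this section (50:1 ratio, 20,000 IOPS maximum, 2:1 latency guideline). The function names are illustrative, not part of any AWS API.

```python
IO1_MAX_IOPS = 20_000  # per-volume maximum quoted above
IO1_MAX_RATIO = 50     # maximum provisioned IOPS per GiB


def io1_max_provisioned_iops(size_gib):
    """Largest IOPS value that can be provisioned for a volume of this size."""
    return min(IO1_MAX_RATIO * size_gib, IO1_MAX_IOPS)


def io1_meets_latency_guideline(size_gib, provisioned_iops):
    """True when the IOPS-to-GiB ratio is greater than 2:1."""
    return provisioned_iops / size_gib > 2


print(io1_max_provisioned_iops(100))           # 5000
print(io1_max_provisioned_iops(1000))          # 20000, the per-volume cap
print(io1_meets_latency_guideline(500, 2000))  # True: ratio is 4:1
```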

st1 volumes

  • Low cost magnetic storage that defines performance in terms of throughput rather than IOPS
  • Can be 500 GiB to 16 TiB
  • Useful for large sequential IO

sc1 volumes

  • Low cost magnetic storage that defines performance in terms of throughput rather than IOPS
  • Can be 500 GiB to 16 TiB
  • Useful for large sequential IO

standard volumes

  • Previous generation volume
  • Magnetic drives suited for workloads where data is accessed infrequently and scenarios where low-cost storage for small volume sizes is important i.e. near-archival or cold workloads
  • Can be 1 GiB to 1 TiB
  • Deliver 100 IOPS on average, with burst capability up to hundreds of IOPS.
  • Variable tens of MB/s throughput
  • 2-40ms latency

EBS Optimized

  • EBS-optimized instances use an optimized configuration stack and provide additional, dedicated capacity for EBS IO.
  • Dedicated bandwidth options range from 500Mbps to 10,000Mbps
  • Can be set at launch, or when an instance is stopped
  • m4, c4, r4 and d2 are EBS Optimized by default

EBS Encryption

  • Uses 256-bit Advanced Encryption Standard algorithms (AES-256) and an Amazon-managed key infrastructure.
  • Encryption occurs on the server that hosts the EC2 instance, providing encryption of data-in-transit from the EC2 instance to EBS storage.
  • Supported by all EBS volume types: gp2, io1, st1, sc1
  • There is no way to turn it on or off once a volume is created; instead, you can apply a new encryption status when copying a snapshot
  • Uses AWS KMS and Customer Master Keys (CMK)
    • A CMK is created by default, or you can specify another one you’ve created
  • Public snapshots of encrypted volumes are not supported, but you can share an encrypted snapshot with specific accounts
  • Each encrypted volume (and its subsequent snapshots) is encrypted with a unique volume encryption key that is then encrypted with a region-specific secure master key. The volume encryption keys are used in memory on the server that hosts your EC2 instance; they are never stored on disk in plain text
  • Supported on m3, m4, t2, c3, c4, cr1, r3, r4, x1, d2, i2, g2, p2 instance types

EBS Snapshots

  • Snapshots are incremental and stored on S3
  • Use snapshots to create new volumes, increase the size of the volumes or replicate data across Availability Zones.
  • A snapshot can be smaller than the volume it was taken from, as the data is compressed before being saved to S3.
  • The snapshot deletion process is designed so that you only have to retain the most recent snapshot in order to restore the volume
  • Snapshots are asynchronous and are in pending state until complete
    • Having multiple snapshots in pending state at the same time can result in reduced performance
    • There is a limit of 5 pending snapshots for gp2, io1 or magnetic volumes and 1 pending snapshot for st1 and sc1 volumes
  • Snapshots are constrained to the region in which they were created.
  • You can copy snapshots across regions, which is useful for DR and migrations
  • Encryption
    • Snapshots of encrypted EBS volumes are automatically encrypted
    • Volumes that are created from encrypted snapshots are automatically encrypted.
    • When you copy an unencrypted snapshot that you own, you can encrypt it during the copy process.
    • When you copy an encrypted snapshot that you own, you can reencrypt it with a different key during the copy process.
  • Sharing
    • Snapshots can be shared with the public, or with a list of specific account IDs
    • Encrypted snapshots can be shared, but they must be encrypted with a custom CMK (one you created) rather than the default one.
    • Accounts you’re sharing the snapshot with must have DescribeKey and ReEncrypt permissions on the CMK.
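
To illustrate why incremental snapshots stay restorable after older ones are deleted, here is a toy Python model. It is a simplification and not how EBS is implemented: a snapshot is modelled as a set of references to block contents, and a stored block only goes away once no remaining snapshot references it. All names, including snap-1 and snap-2, are hypothetical.

```python
def stored_blocks(snapshots):
    """Unique (block number, contents) pairs referenced by any snapshot."""
    return {item for snap in snapshots.values() for item in snap.items()}


def take_snapshot(snapshots, name, volume):
    """Record a snapshot and return how many blocks had to be newly stored."""
    already_stored = stored_blocks(snapshots)
    snapshots[name] = dict(volume)  # references to every block in the volume
    return len(set(volume.items()) - already_stored)


volume = {0: "a", 1: "b", 2: "c"}
snapshots = {}

first = take_snapshot(snapshots, "snap-1", volume)   # full copy: 3 blocks
volume[1] = "B"                                      # one block changes
second = take_snapshot(snapshots, "snap-2", volume)  # incremental: 1 block

del snapshots["snap-1"]               # drop the older snapshot
restored = dict(snapshots["snap-2"])  # the newest one still has every block

print(first, second, restored == volume)
```

Only the block unique to snap-1 (the old contents of block 1) stops being referenced after the deletion. Everything snap-2 needs is kept.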

EBS CloudWatch Events

  • Amazon EBS emits notifications based on Amazon CloudWatch Events for a variety of snapshot and encryption status changes
  • With CloudWatch Events, you can establish rules that trigger programmatic actions in response to a change in snapshot or encryption key state. For example, when a snapshot is created, you can trigger an AWS Lambda function to share the completed snapshot with another account or copy it to another region for disaster-recovery purposes.
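
A minimal sketch of the Lambda side of such a rule follows. The event field names (source, detail.event, detail.result, detail.snapshot_id) and the ARN-style snapshot_id are written from memory of the EBS snapshot notification format and should be treated as assumptions to check against the AWS documentation. The handler only decides what to do; the actual copy or share call is left as a comment.

```python
def completed_snapshot_id(event):
    """Return the snapshot ID from a successful createSnapshot event, else None."""
    if event.get("source") != "aws.ec2":
        return None
    detail = event.get("detail", {})
    if detail.get("event") != "createSnapshot" or detail.get("result") != "succeeded":
        return None
    # Assumption: snapshot_id is an ARN ending in ".../snap-xxxxxxxx".
    return detail.get("snapshot_id", "").rsplit("/", 1)[-1] or None


def handler(event, context):
    """Lambda entry point: decide whether the snapshot should be copied."""
    snapshot_id = completed_snapshot_id(event)
    if snapshot_id is None:
        return {"action": "ignored"}
    # A real function would call the EC2 API here to copy or share the snapshot.
    return {"action": "copy", "snapshot_id": snapshot_id}


# Hypothetical event, shaped like the assumed notification format.
sample_event = {
    "source": "aws.ec2",
    "detail-type": "EBS Snapshot Notification",
    "detail": {
        "event": "createSnapshot",
        "result": "succeeded",
        "snapshot_id": "arn:aws:ec2::us-west-2:snapshot/snap-01234567",
    },
}

print(handler(sample_event, None))
```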

EBS Tips

  • Pre-warming EBS is no longer required for new volumes
  • Volumes created from snapshots are lazy restored from S3 - force a full read of the volume to speed up the restore
  • If you use RAID 0 or LVM striping, quiesce IO and freeze the file systems before taking snapshots
  • Snapshots only store the data changed since the last snapshot, so you can improve RTO and RPO by taking snapshots often. Each snapshot will be quicker, and the cost will be the same as with less frequent snapshots.
  • You can use a block device mapping to specify additional EBS volumes when you launch your instance, or you can attach additional EBS volumes after your instance is running.
    • Instances can have block device mappings, and so can AMIs
    • Note: You can specify the instance store volumes for your instance only when you launch an instance. You can’t attach instance store volumes to an instance after you’ve launched it

Backups

  • Volumes are replicated within AZs, but it is still a good idea to back them up as they have an annual failure rate (AFR) between 0.1% and 0.2%
  • Snapshots are more reliable than EBS volumes because:
    • They’re stored on S3, which provides 11 9s of durability over a given year and can sustain the loss of data in 2 facilities at the same time.
    • They can be copied to other regions to increase durability even more, or to migrate to another region
  • To partially restore a snapshot, we can create a new volume from the snapshot and attach it to an instance so files can be copied across.
    • $ aws ec2 attach-volume --volume-id <value> --instance-id <value> --device /dev/sdc
  • To create a new instance from the snapshot we can create an image from the snapshot and use that to launch the new instance.
  • It’s recommended that you pause IO operations before performing a snapshot
    • fsfreeze for ext2/3/4, xfs_freeze for XFS
  • Or, unmount the volume

LVM:

When you snapshot EBS volumes that are used in LVM, you need to use an LVM snapshot too, i.e. take a snapshot with LVM’s internal snapshot option, then take an EBS snapshot of all the volumes used in LVM. You can then remove the LVM snapshot to avoid using up disk space and affecting performance.

RAID:

Taking snapshots of RAID volumes also requires a stop to I/O operations and a flush of the cache to disk. If you use LVM for software RAID you can take an LVM snapshot first.

Using AMIs:

We can pre-bake an AMI with application code, configurations, software etc…

Exam Tips

  • Understand throughput, IOPS, block size and latency
  • Know the storage types and their strengths and weaknesses
  • Know that either IOPS or throughput can be saturated on its own, not necessarily both