Amazon EC2 provides each instance with a consistent and predictable amount of CPU capacity, regardless of its underlying hardware.
Amazon EC2 dedicates some resources of the host computer, such as CPU, memory, and instance storage, to a particular instance. Amazon EC2 shares other resources of the host computer, such as the network and the disk subsystem, among instances. If each instance on a host computer tries to use as much of one of these shared resources as possible, each receives an equal share of that resource. However, when a resource is under-utilized, an instance can consume a higher share of that resource while it’s available.
Each instance type provides higher or lower minimum performance from a shared resource. For example, instance types with high I/O performance have a larger allocation of shared resources.
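The sharing behaviour described above (an equal share under contention, with spare capacity going to whoever can use it) resembles max-min fair allocation. The sketch below models that idea only; AWS does not publish how its hosts actually arbitrate shared resources, and the function name `fair_share` and the capacity units are assumptions made for illustration.

```python
def fair_share(capacity, demands):
    """Split `capacity` among instances using max-min fairness.

    `demands` maps an instance name to how much of the shared
    resource it would like to use (same units as `capacity`).
    This is an illustrative model, not AWS's actual scheduler.
    """
    allocation = {name: 0.0 for name in demands}
    remaining = float(capacity)
    unsatisfied = {name for name, demand in demands.items() if demand > 0}

    while unsatisfied and remaining > 1e-9:
        # Offer every still-hungry instance an equal slice of what is left.
        share = remaining / len(unsatisfied)
        satisfied = set()
        for name in unsatisfied:
            need = demands[name] - allocation[name]
            grant = min(share, need)
            allocation[name] += grant
            remaining -= grant
            if demands[name] - allocation[name] <= 1e-9:
                satisfied.add(name)
        if not satisfied:
            break  # everyone took a full equal slice, so capacity is used up
        unsatisfied -= satisfied  # leftovers go to the remaining instances

    return allocation


if __name__ == "__main__":
    # Both instances want everything: each gets an equal share.
    print(fair_share(10, {"a": 10, "b": 10}))
    # "a" under-utilizes the resource, so "b" can consume more.
    print(fair_share(10, {"a": 2, "b": 20}))
```

When one instance asks for less than its equal share, the unused part is redistributed, which is the "higher share while it is available" case.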
Para-Virtualization vs Hardware Virtualization (HVM)
- Para-Virtualization (PV)
- Used to be the primary form of virtualization on the AWS platform
- Requires instance OS and driver modifications; the host OS presents an API that the guest OS needs to support
- Originally had better performance than HVM
- Hardware Virtualization (HVM)
- Was originally very slow
- Modern CPUs have added specific virtualization instructions to improve performance, particularly for memory translation
- Network and I/O remain slow, but they’re improved by guest drivers, and recent CPUs have network/I/O acceleration
- AMIs support Para-Virtualization or HVM
- HVM AMIs are becoming more prevalent
- Some instance types are only supported using HVM e.g. T2
- Some features require HVM e.g. enhanced networking
- Some instance types don’t support HVM e.g. T1
- Para-Virtualization instances often have cheap spot prices
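The compatibility rules in this list can be written as a small lookup. This is a minimal sketch that encodes only the two examples given in these notes (T2 is HVM-only, T1 is PV-only) plus the rule that enhanced networking requires HVM. The table and the `can_launch` helper are hypothetical and not an AWS API; the strings `"hvm"` and `"paravirtual"` match the values EC2 reports for an AMI's virtualization type.

```python
# Only the families named in the notes are listed; extend the table from
# the official instance type documentation before relying on it.
SUPPORTED_VIRTUALIZATION = {
    "t1": {"paravirtual"},  # T1 does not support HVM
    "t2": {"hvm"},          # T2 is HVM only
}


def can_launch(instance_type, ami_virtualization, enhanced_networking=False):
    """Return True if the AMI's virtualization type suits the instance type.

    Raises ValueError for families missing from the table rather than
    guessing, because support differs from family to family.
    """
    family = instance_type.split(".")[0].lower()
    if family not in SUPPORTED_VIRTUALIZATION:
        raise ValueError(f"no virtualization data for family {family!r}")
    if ami_virtualization not in SUPPORTED_VIRTUALIZATION[family]:
        return False
    # Enhanced networking is an HVM-only feature.
    if enhanced_networking and ami_virtualization != "hvm":
        return False
    return True


if __name__ == "__main__":
    print(can_launch("t2.micro", "hvm"))          # HVM AMI on T2
    print(can_launch("t1.micro", "hvm"))          # HVM AMI on T1
    print(can_launch("t1.micro", "paravirtual"))  # PV AMI on T1
```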
When you select an instance type, this determines the networking and storage features that are available.
Key features to be aware of:
- EBS Optimisation - dedicated bandwidth, consistency, higher throughput.
- Provides separate transit capacity for storage and network, thereby improving the performance of both.
- m4, c4, r4 and d2 are EBS Optimised by default
- Enhanced Networking - AWS supported SR-IOV, less jitter, higher throughput
- SR-IOV - Single Root IO Virtualization
- The alternative is a software-defined NIC, which means the host inspects all traffic for its VMs to provide features like security groups and Network Access Control Lists. Host CPU cycles are consumed by processing this traffic.
- The host has to apply stateful firewall rules to every packet, switch the packet and encapsulate it; some estimates put this at 70% of a CPU at 10 Gb/s network speed.
- Introduces problems like noisy neighbours and network jitter
- SR-IOV provides a direct path from the instance to the underlying network card; traffic bypasses the hypervisor, leading to line-rate performance.
- AMI needs to have SR-IOV drivers
- Needs to be run on an instance that supports it, e.g. M4
- Lower latency
- Consistent latency
- Improved throughput
- AWS gets lower CPU usage on its servers, reducing the price for everyone
- Instance Store Volumes - No resilience, high throughput, high I/O
- Can only be added when you launch an instance
- Higher performance and throughput
- GPU Availability - Media conversion, graphical or scientific compute e.g. Genomics
- Placement Group
- A Placement Group is a logical grouping of instances within a single Availability Zone. Placement groups are recommended for applications that benefit from low network latency, high network throughput (10 Gbps), or both.
- Instances must be in the same VPC or in peered VPCs
- Has a unique name within the AWS account
- Placement Groups cannot be merged. You need to terminate instances in one placement group and relaunch them in another.
- Create an AMI from the existing instance
- Launch a new instance from the AMI in the placement group
- AWS recommends using the same instance type for all instances in a placement group
- They’re not a good fit for horizontally scalable web services, or services that require AZ redundancy.
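The placement group constraints above can be checked before launch with a few lines of code. The sketch below is illustrative only: the `Instance` record, the `placement_group_issues` helper and the way peering is represented (a set of VPC ID pairs) are assumptions made for the example, not part of any AWS SDK.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Instance:
    name: str
    availability_zone: str
    vpc_id: str
    instance_type: str


def placement_group_issues(instances, peered_vpcs=frozenset()):
    """Return a list of human-readable problems; an empty list means none found.

    `peered_vpcs` is a set of frozenset({vpc_a, vpc_b}) pairs describing
    which VPCs are peered with each other (hypothetical representation).
    """
    issues = []

    zones = sorted({i.availability_zone for i in instances})
    if len(zones) > 1:
        issues.append("multiple Availability Zones: " + ", ".join(zones))

    vpcs = sorted({i.vpc_id for i in instances})
    for a, b in combinations(vpcs, 2):
        if frozenset((a, b)) not in peered_vpcs:
            issues.append(f"VPCs {a} and {b} are not peered")

    # Mixed types are allowed, but the notes say AWS recommends one type.
    types = sorted({i.instance_type for i in instances})
    if len(types) > 1:
        issues.append("mixed instance types: " + ", ".join(types))

    return issues


if __name__ == "__main__":
    group = [
        Instance("node-1", "us-east-1a", "vpc-1", "m4.large"),
        Instance("node-2", "us-east-1b", "vpc-1", "m4.large"),
    ]
    for issue in placement_group_issues(group):
        print(issue)
```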
Launching A New Instance
When you launch a new instance you can select an existing key pair or create a new one. You cannot import an existing one at this point; this needs to be done via the Key Pairs option in the EC2 console.
- Each type is limited to 20 reserved instances
- Each type is limited to various on-demand limits e.g. t2.* is 20, m4.10xlarge is 5, hi1.4xlarge is 2 etc
- Pricing is linear based on resource usage i.e. you’re not penalised for buying lots of small instances instead of a few large ones.
Instance Type Details
T2 - General Purpose Burstable Instances
- Instances have a base level of performance and the ability to burst to higher levels using CPU credits.
- The difference between the credit earned and the CPU used is stored as credit for up to 24 hours
- 1 CPU credit = 1 vCPU running at 100% for 1 minute, or
- 2 vCPUs running at 25% for 2 minutes, etc.
- An initial CPU credit is allocated to give good start-up performance
- For instance types with multiple vCPUs, the percentage is split between the vCPUs
- The maximum CPU credit balance does not include the initial CPU credits, which are used first and do not expire
- Max = per-hour x 24 hours, e.g. 3 x 24 = 72
- When you’re running low on credits, the instance’s credit consumption is gradually lowered to the base level; you will not experience a sharp performance drop-off
- CPU credits are measured at millisecond-level resolution
- CPU credits do not persist across an instance stop and start. However, after the start, the instance receives the initial CPU credits again
- Metrics: CPUCreditUsage, CPUCreditBalance
- HVM only
- VPC only
- EBS-backed only
- Available as On-Demand or Reserved but NOT Spot
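The CPU credit arithmetic above can be sketched as a small simulation. It is a simplified model under stated assumptions: credits accrue evenly per minute, the balance is capped at 24 hours of earnings, and both the special handling of initial credits and the gradual throttling near zero are ignored. The function names are hypothetical, and the rate of 3 credits per hour is taken from the `3 x 24 = 72` example.

```python
def credits_used(vcpus, utilization, minutes):
    """Credits consumed: 1 credit = 1 vCPU at 100% for 1 minute.

    `utilization` is a fraction between 0.0 and 1.0 per vCPU.
    """
    return vcpus * utilization * minutes


def max_credit_balance(credits_per_hour):
    """Maximum balance: earned credits are kept for up to 24 hours."""
    return credits_per_hour * 24


def simulate_balance(credits_per_hour, utilization_per_minute,
                     vcpus=1, start_balance=0.0):
    """Return the credit balance after each simulated minute."""
    cap = max_credit_balance(credits_per_hour)
    earned_per_minute = credits_per_hour / 60
    balance = start_balance
    history = []
    for utilization in utilization_per_minute:
        balance += earned_per_minute - credits_used(vcpus, utilization, 1)
        # Simplification: clamp between zero and the 24-hour maximum.
        balance = max(0.0, min(balance, cap))
        history.append(balance)
    return history


if __name__ == "__main__":
    print(credits_used(1, 1.0, 1))    # one vCPU flat out for a minute
    print(credits_used(2, 0.25, 2))   # two vCPUs at 25% for two minutes
    print(max_credit_balance(3))      # 3 credits/hour kept for 24 hours
    # An hour idle, then ten minutes at 50% on one vCPU.
    print(simulate_balance(3, [0.0] * 60 + [0.5] * 10)[-1])
```

Watching the simulated balance fall toward zero mirrors what the `CPUCreditBalance` metric shows for a real instance that runs above its base level for too long.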
M* - General Purpose Instances
- M4 instances provide a balance of compute, memory and network resources.
- EBS-optimized by default
- Support for enhanced networking
m4.16xlarge instances use the Elastic Network Adapter (ENA) at 20 Gbps; all other M* instances use the Intel 82599 VF interface at 10 Gbps instead.
C* - Compute Optimized Instances
- C4 are EBS optimized by default, and you can turn EBS optimization on for C3 instances for a low per-hour charge.
- The c4.8xlarge instance type
- provides the ability to control processor C-states and P-states on Linux. C-states control the sleep levels that a core can enter when it is inactive, while P-states control the desired performance (in CPU frequency) from a core.
- has 36 vCPUs, which requires OS support to handle more than 32
June 2016: C4 does not use Elastic Network Adapters (ENAs), whereas R4 does. C4 uses Intel 82599 VF.
R* and X1 - Memory Optimized Instances
R4 instances are suitable for:
- High performance DBs, both relational and NoSQL
- In-memory databases e.g. SAP HANA
- Distributed web caching
- Realtime processing of big unstructured data
- High-performance computing
X1 instances are suited for:
- In-memory databases e.g. SAP HANA
- Big data processing e.g. Apache Spark
- High-performance computing
- R4 instances can have up to 488 GiB of RAM
- R3 instances can have up to 244 GiB of RAM
- X1 instances include Intel Scalable Memory Buffers, providing 300 GiB/s of sustainable memory-read bandwidth and 140 GiB/s of sustainable memory-write bandwidth.
- R4 instances have up to 64 vCPUs that run on AWS customized Intel Xeon with high-memory bandwidth and larger L3 cache.
- X1 instances have up to 128 vCPUs with high-memory bandwidth and larger L3 cache.
Large instances allow you to set C-states and P-states
- You can’t launch R4 instances as Spot or Dedicated
June 2016: R4 and X1 instances use Elastic Network Adapters (ENA). ENA is a next-generation interface with accompanying drivers that provides enhanced networking. Previously, enhanced networking referred to the use of the Intel 82599 VF interface (R3 still uses this).
D2 and I2 - Storage Optimized Instances
D2 instances are suited for:
- Log or data processing applications
- MapReduce and Hadoop distributed computing
- Massive parallel processing (MPP) data warehouse
I2 instances are suited for:
- NoSQL databases
- Clustered databases
- Online Transaction Processing (OLTP) systems
D2 instance features:
- HDD-based instance store volumes
- D2 instances are EBS-optimized by default at no additional cost.
- D2 instances provide the best disk performance when you use a Linux kernel that supports persistent grants, an extension to the Xen block ring protocol that significantly improves disk throughput and scalability.
- The d2.8xlarge instance type provides the ability to control processor C-states and P-states on Linux
- The d2.8xlarge instance type provides 36 vCPUs
I2 instance features:
- SSD-based instance store volumes
- You can enable EBS optimization for your I2 instances for an additional low, hourly fee.
- Prone to write amplification: As you fill the SSD-based instance store volumes for your instance, the number of write IOPS that you can achieve decreases. This is due to the extra work the SSD controller must do to find available space, rewrite existing data, and erase unused space so that it can be rewritten.
- To reduce write amplification, you should leave 10% of the volume unpartitioned so that the SSD controller can use it for over-provisioning. This decreases the storage that you can use, but increases performance.
- I2 instance store volumes support TRIM. You can use the TRIM command to notify the SSD controller whenever you no longer need data that you’ve written. This provides the controller with more free space, which can reduce write amplification and increase performance.
- HI1 is the equivalent previous generation
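The over-provisioning advice for I2 volumes comes down to simple arithmetic. The sketch below works out how large a partition to create when a percentage of the volume is left unpartitioned. The helper name and the 800 GiB volume in the usage example are hypothetical; only the 10% figure comes from the notes.

```python
def partition_size_gib(volume_gib, reserve_percent=10):
    """Size of the partition to create, leaving `reserve_percent` of the
    volume unpartitioned for the SSD controller to use as spare area.

    Integer arithmetic keeps the result exact; any remainder is rounded
    down so the reserve is never smaller than requested.
    """
    if not 0 <= reserve_percent < 100:
        raise ValueError("reserve_percent must be in the range 0-99")
    return volume_gib * (100 - reserve_percent) // 100


if __name__ == "__main__":
    volume = 800  # hypothetical volume size in GiB
    usable = partition_size_gib(volume)
    print(f"partition {usable} GiB, leave {volume - usable} GiB unpartitioned")
```

The trade-off is the one stated above: less usable storage in exchange for more consistent write performance.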
P2 and G2 - Accelerated Computing Instances
Accelerated computing instance families use hardware accelerators, or co-processors, to perform some functions, such as floating point number calculation and graphics processing, more efficiently than is possible in software running on CPUs.
- Requires HVM
- Must have NVIDIA drivers installed to access GPU
- P2 instances use NVIDIA Tesla K80 GPUs and are designed for general purpose GPU computing using the CUDA or OpenCL programming models.
- Use cases: deep learning, graph databases, high performance databases, computational fluid dynamics, computational finance, seismic analysis, molecular modeling, genomics, rendering, and other server-side GPU compute workloads.
- P2 instances are EBS optimized by default
- The p2.16xlarge instance type provides the ability to control processor C-states and P-states on Linux
- G2 instances use NVIDIA GRID K520 GPUs and provide a cost-effective, high-performance platform for graphics applications using DirectX or OpenGL. NVIDIA GRID GPUs also support NVIDIA’s fast capture and encode API operations.
- Use cases: video creation services, 3D visualizations, streaming graphics-intensive applications, and other server-side graphics workloads.
- CG1 instances use NVIDIA Tesla M2050 GPUs and are designed for general purpose GPU computing using the CUDA or OpenCL programming models.
- Know when to use a specific instance - for specific workloads
- Rendering, media conversion, scientific analysis, big data etc.
- Know how to identify when an instance type is causing a performance issue
- How to remove the issue via instance type change
- Know how instance features such as EBS optimisation and Enhanced Networking change the performance of running instances
- Know when feature use is appropriate, and when it isn’t
- e.g. don’t use instance store volumes for persistent data, no matter how important performance is.
- Know restrictions on the features
- i.e. certain features aren’t available on some generation or families