Amazon - Auto Scaling

What is Auto Scaling?

Auto Scaling allows you to automatically scale your Amazon EC2 capacity up or down according to conditions you define. With Auto Scaling, you can ensure that the number of Amazon EC2 instances you’re using scales up seamlessly during demand spikes to maintain performance, and scales down automatically during demand lulls to minimize costs. Auto Scaling is particularly well suited for applications that experience hourly, daily, or weekly variability in usage.

Is there a cost associated with using Auto Scaling?

Not for Auto Scaling itself. Auto Scaling is enabled by Amazon CloudWatch and is available at no additional charge beyond Amazon CloudWatch fees. However, each instance launched by Auto Scaling is automatically enabled for monitoring, and the applicable Amazon CloudWatch charges apply.

---

In an AWS cloud architecture, instances can be provisioned on the fly based on a set of triggers that scale the fleet out and back in. Auto Scaling can be used to create capacity groups of servers that grow or shrink on demand.

Auto Scaling also works directly with CloudWatch for metrics data and with the Elastic Load Balancing (ELB) service for adding and removing hosts for load distribution. For example, if the web servers report more than 80% CPU utilization over a period of time, an additional web server can be deployed quickly and then automatically added to the Elastic Load Balancer for immediate inclusion in the load-balancing rotation.
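The 80% CPU example above can be sketched in a few lines of Python. This is only an illustrative simulation, not the Auto Scaling or CloudWatch API: the CpuTrigger class, the scale_out function, the three-sample window, and the instance names are all hypothetical, and the load balancer is modeled as a plain list.

```python
from collections import deque


class CpuTrigger:
    """Hypothetical trigger: fires when every sample in the window exceeds the threshold."""

    def __init__(self, threshold=80.0, window=3):
        self.threshold = threshold
        self.samples = deque(maxlen=window)  # keeps only the most recent samples

    def observe(self, cpu_percent):
        """Record one average-CPU sample; return True if a scale-out is due."""
        self.samples.append(cpu_percent)
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and all(s > self.threshold for s in self.samples)


def scale_out(fleet, load_balancer, new_instance_id):
    """Simulate launching one instance and adding it to the load-balancer rotation."""
    fleet.append(new_instance_id)
    load_balancer.append(new_instance_id)


fleet = ["web-1", "web-2"]
load_balancer = list(fleet)
trigger = CpuTrigger(threshold=80.0, window=3)

# Assumed sample stream: one average CPU reading per monitoring period.
for cpu in [70.0, 85.0, 90.0, 88.0]:
    if trigger.observe(cpu):
        scale_out(fleet, load_balancer, f"web-{len(fleet) + 1}")
        trigger.samples.clear()  # start a fresh window after acting

print(fleet)
```

Requiring the whole window to breach the threshold is one way to model "over a period of time": a single brief spike does not launch a server, but a sustained one does.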

If you choose to have a load balancer at each layer of your architecture, you can create a separate auto-scaling group for each layer so that each layer scales independently. For example, the web server auto-scaling group might scale out and in on network I/O, whereas the application server auto-scaling group might scale out and in on CPU utilization.
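As a rough sketch of per-layer scaling, each layer can carry its own policy keyed to a different metric. The ScalingPolicy class, the metric names, and every threshold below are assumptions made for illustration; none of them come from the Auto Scaling service itself.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical per-layer policy: which metric to watch and where to act."""
    metric: str
    scale_out_above: float
    scale_in_below: float

    def decide(self, metrics):
        """Return +1 to scale out, -1 to scale in, or 0 to hold steady."""
        value = metrics[self.metric]
        if value > self.scale_out_above:
            return 1
        if value < self.scale_in_below:
            return -1
        return 0


# One policy per layer; the thresholds are illustrative only.
policies = {
    "web": ScalingPolicy("network_io_mbps", scale_out_above=400.0, scale_in_below=100.0),
    "app": ScalingPolicy("cpu_percent", scale_out_above=80.0, scale_in_below=30.0),
}

# Assumed current readings for each layer.
metrics_by_layer = {
    "web": {"network_io_mbps": 450.0, "cpu_percent": 35.0},
    "app": {"network_io_mbps": 50.0, "cpu_percent": 55.0},
}

decisions = {
    layer: policy.decide(metrics_by_layer[layer])
    for layer, policy in policies.items()
}
print(decisions)
```

Here the web layer scales out on heavy network I/O even though its CPU is low, while the application layer holds steady because its CPU sits between its two thresholds.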

Minimums and maximums can also be set to help ensure 24/7 availability and to cap usage within a group. Auto Scaling triggers can be set to both grow and shrink the total fleet at that layer so that resource utilization matches actual traffic needs.
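The effect of a group's minimum and maximum can be illustrated with a small clamp. The function names and the default bounds of 2 and 10 instances are hypothetical; this is a sketch of the idea rather than how the service implements it.

```python
def clamp_capacity(desired, minimum, maximum):
    """Keep the desired instance count within the group's configured bounds."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return max(minimum, min(desired, maximum))


def apply_adjustment(current, adjustment, minimum=2, maximum=10):
    """Apply a trigger's adjustment (+/- instances), then enforce the bounds."""
    return clamp_capacity(current + adjustment, minimum, maximum)


print(apply_adjustment(4, 1))    # normal growth inside the bounds
print(apply_adjustment(2, -1))   # a scale-in is held at the minimum
print(apply_adjustment(10, 3))   # a scale-out is capped at the maximum
```

The minimum keeps enough instances running for availability even when traffic is low, and the maximum caps usage (and therefore cost) even when a trigger keeps firing.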

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License