EC2 Auto Scaling – Simple Scaling based on CPU utilization
The basics of EC2 Auto Scaling are covered on separate pages.
On this page, we review the behavior of dynamic scaling.
There are three types of dynamic scaling in EC2 Auto Scaling:
- Simple Scaling
- Step Scaling
- Target Tracking Scaling
Of these, this page checks the behavior of Simple Scaling.
The number of instances is scaled based on CPU utilization.
Step Scaling and Target Tracking Scaling are covered on separate pages.
Environment
Create an ALB and attach an EC2 Auto Scaling group placed in private subnets.
Set the number of instances in the Auto Scaling group as follows:
- Minimum number: 1
- Maximum number: 2
- Desired number: 1
Configure scaling to run based on CPU utilization.
Scale out when average CPU utilization is 30% or higher, and scale in when it is 30% or lower.
The EC2 instances launched in the Auto Scaling group use the latest version of Amazon Linux 2.
Install Apache from the yum repository on S3 and configure it to act as a web server.
Use SSM Session Manager to access the launched instance.
CloudFormation template files
Build the above configuration with CloudFormation.
The CloudFormation templates are located at the following URL:
https://github.com/awstut-an-r/awstut-fa/tree/main/087
Explanation of key points of the template files
This page focuses on how to configure Simple Scaling in EC2 Auto Scaling.
How to attach resources in a private subnet to an ALB, and how to run yum on instances in a private subnet, are covered on separate pages.
(Reference) Launch Template
Resources:
  LaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateData:
        IamInstanceProfile:
          Arn: !GetAtt InstanceProfile.Arn
        ImageId: !Ref ImageId
        InstanceType: !Ref InstanceType
        SecurityGroupIds:
          - !Ref InstanceSecurityGroup
        UserData: !Base64 |
          #!/bin/bash -xe
          yum update -y
          yum install -y httpd
          systemctl start httpd
          systemctl enable httpd
          ec2-metadata -i > /var/www/html/index.html
      LaunchTemplateName: !Sub "${Prefix}-LaunchTemplate"
A launch template is a resource that holds the configuration of the EC2 instances launched within an Auto Scaling group.
To configure EC2 Auto Scaling, you must create either a launch template or a launch configuration.
However, launch configurations are no longer recommended, as the AWS documentation states:
We strongly recommend that you do not use launch configurations. They do not provide full functionality for Amazon EC2 Auto Scaling or Amazon EC2. We provide information about launch configurations for customers who have not yet migrated from launch configurations to launch templates.
Launch configurations
The configuration items in a launch template are basically the same as those for EC2 instances.
For example, specify the AMI and instance type of the instances to be launched with the ImageId and InstanceType properties.
User data can be set with the UserData property.
Here, the user data installs and starts Apache, then writes the instance ID to an HTML file that is served as the Apache root page.
Auto Scaling Group
Resources:
  AutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      AutoScalingGroupName: !Sub "${Prefix}-AutoScalingGroup"
      DesiredCapacity: !Ref DesiredCapacity
      LaunchTemplate:
        LaunchTemplateId: !Ref LaunchTemplate
        Version: !GetAtt LaunchTemplate.LatestVersionNumber
      MaxSize: !Ref MaxSize
      MinSize: !Ref MinSize
      VPCZoneIdentifier:
        - !Ref PrivateSubnet1
        - !Ref PrivateSubnet2
      TargetGroupARNs:
        - !Ref ALBTargetGroup
The Auto Scaling group itself needs no special settings for Simple Scaling.
Set the number of instances to be created in the group as follows:
- Set the desired number to 1 in the DesiredCapacity property.
- Set the maximum number to 2 in the MaxSize property.
- Set the minimum number to 1 in the MinSize property.
Scaling Policy
Resources:
  ScalingPolicy1:
    Type: AWS::AutoScaling::ScalingPolicy
    Properties:
      AdjustmentType: ChangeInCapacity
      AutoScalingGroupName: !Ref AutoScalingGroup
      Cooldown: !Ref Cooldown
      PolicyType: SimpleScaling
      ScalingAdjustment: !Ref ScalingAdjustment1
  ScalingPolicy2:
    Type: AWS::AutoScaling::ScalingPolicy
    Properties:
      AdjustmentType: ChangeInCapacity
      AutoScalingGroupName: !Ref AutoScalingGroup
      Cooldown: !Ref Cooldown
      PolicyType: SimpleScaling
      ScalingAdjustment: !Ref ScalingAdjustment2
To configure Simple Scaling, two scaling policies are created.
The first policy is for scaling out and the second is for scaling in.
Set the type of scaling policy in the PolicyType property.
In this case, we specify “SimpleScaling” because we are configuring Simple Scaling.
You can set the cooldown period in the Cooldown property.
The explanation of the cooldown is as quoted below.
After your Auto Scaling group launches or terminates instances, it waits for a cooldown period to end before any further scaling activities initiated by simple scaling policies can start. The intention of the cooldown period is to prevent your Auto Scaling group from launching or terminating additional instances before the effects of previous activities are visible.
Scaling cooldowns for Amazon EC2 Auto Scaling
In this case, we specify “300”, so the group waits 5 minutes before the next simple scaling activity can start.
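As a rough mental model of the cooldown, the sketch below treats a simple scaling policy as blocked until the cooldown has elapsed since the previous scaling activity. The function name `cooldown_elapsed` and its epoch-second arguments are hypothetical and exist only for illustration; the real service tracks this internally and this is not its implementation.

```shell
# Hypothetical helper: succeeds (exit 0) when a new simple scaling activity may start.
# $1 = time the previous scaling activity completed (epoch seconds)
# $2 = current time (epoch seconds)
# $3 = cooldown in seconds (300 in this configuration)
cooldown_elapsed() {
  last_activity=$1
  now=$2
  cooldown=$3
  [ $((now - last_activity)) -ge "$cooldown" ]
}

# Example: 120 seconds after the last activity, a 300-second cooldown is still active.
if cooldown_elapsed 1000 1120 300; then
  echo "scaling allowed"
else
  echo "still cooling down"
fi
```

With a 300-second cooldown, an alarm that fires again two minutes after a scale-out would not start another simple scaling activity until the remaining three minutes have passed.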
The ScalingAdjustment property sets the amount by which the number of instances in the Auto Scaling group is increased or decreased when the policy runs.
In this case, “1” and “-1” are specified for the two policies, respectively.
The AdjustmentType property sets how that adjustment value is interpreted.
Here, we specify “ChangeInCapacity”.
The description of this type is quoted below.
ChangeInCapacity — Increment or decrement the current capacity of the group by the specified value. A positive value increases the capacity and a negative adjustment value decreases the capacity. For example: If the current capacity of the group is 3 and the adjustment is 5, then when this policy is performed, we add 5 capacity units to the capacity for a total of 8 capacity units.
Scaling adjustment types
In this configuration, one instance runs in the group during normal operation. On scale-out, one instance is added for a total of two, and on scale-in, one instance is removed to return to one.
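The arithmetic behind ChangeInCapacity with this group's limits can be sketched as follows. The helper name `apply_change_in_capacity` is hypothetical; it only models the fact that the resulting capacity stays within MinSize and MaxSize, and it is not how the service is implemented.

```shell
# Hypothetical helper: prints the capacity after a ChangeInCapacity adjustment.
# $1 = current capacity, $2 = adjustment, $3 = MinSize, $4 = MaxSize
apply_change_in_capacity() {
  new=$(( $1 + $2 ))
  # The group never goes below MinSize or above MaxSize.
  if [ "$new" -lt "$3" ]; then new=$3; fi
  if [ "$new" -gt "$4" ]; then new=$4; fi
  echo "$new"
}

apply_change_in_capacity 1 1 1 2    # scale-out from 1 instance: prints 2
apply_change_in_capacity 2 -1 1 2   # scale-in from 2 instances: prints 1
apply_change_in_capacity 2 1 1 2    # scale-out at MaxSize: prints 2
```

The last call shows why a repeated scale-out alarm has no further effect here: the group is already at its maximum of 2.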
CloudWatch Alarm
Resources:
  Alarm1:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmActions:
        - !Ref ScalingPolicy1
      AlarmName: !Sub "${Prefix}-Alarm1"
      ComparisonOperator: GreaterThanOrEqualToThreshold
      Dimensions:
        - Name: AutoScalingGroupName
          Value: !Ref AutoScalingGroup
      EvaluationPeriods: !Ref AlarmEvaluationPeriod
      MetricName: CPUUtilization
      Namespace: AWS/EC2
      Period: !Ref AlarmPeriod
      Statistic: Average
      Threshold: !Ref AlarmThreshold
  Alarm2:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmActions:
        - !Ref ScalingPolicy2
      AlarmName: !Sub "${Prefix}-Alarm2"
      ComparisonOperator: LessThanOrEqualToThreshold
      Dimensions:
        - Name: AutoScalingGroupName
          Value: !Ref AutoScalingGroup
      EvaluationPeriods: !Ref AlarmEvaluationPeriod
      MetricName: CPUUtilization
      Namespace: AWS/EC2
      Period: !Ref AlarmPeriod
      Statistic: Average
      Threshold: !Ref AlarmThreshold
CloudWatch alarms act as triggers for the scaling policies.
One alarm is created for each scaling policy.
The first alarm triggers the scale-out policy and the second alarm triggers the scale-in policy.
The policy to run is specified in the AlarmActions property.
Set the target of the metric in the Dimensions property.
To measure the CPU utilization of the Auto Scaling group, specify “AutoScalingGroupName” for Name and the name of the group for Value.
Scaling is triggered according to CPU utilization.
By specifying “CPUUtilization” for the MetricName property, “AWS/EC2” for the Namespace property, and “Average” for the Statistic property, the average CPU utilization across the entire Auto Scaling group is measured.
The measurement period of CPU utilization is set with the Period property.
In this case, “60” is specified, so the metric is evaluated every minute.
The alarm condition is set with the ComparisonOperator, Threshold, and EvaluationPeriods properties.
For Alarm1, specify “GreaterThanOrEqualToThreshold”, “30”, and “2”, respectively, so that the alarm triggers when CPU utilization is 30% or higher for two consecutive periods.
For Alarm2, specify “LessThanOrEqualToThreshold”, “30”, and “2”, respectively, so that the alarm triggers when CPU utilization is 30% or lower for two consecutive periods.
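To make the "two consecutive periods" condition concrete, here is a simplified sketch of how such a check could be written. The helper `alarm_breached` and the sample datapoints are hypothetical, and the operator names "ge" and "le" are shorthand invented for this sketch; CloudWatch's real evaluation also handles missing data and other settings that are ignored here.

```shell
# Hypothetical helper: succeeds when the last N datapoints all breach the threshold.
# $1 = operator ("ge" or "le"), $2 = threshold, $3 = evaluation periods,
# remaining arguments = datapoints, oldest first
alarm_breached() {
  op=$1; threshold=$2; periods=$3
  shift 3
  # Not enough datapoints yet: treat as not breached.
  [ "$#" -ge "$periods" ] || return 1
  # Drop older datapoints so only the last $periods remain.
  while [ "$#" -gt "$periods" ]; do shift; done
  for value in "$@"; do
    # awk handles the decimal comparison; exit status 0 means "breaching".
    awk -v v="$value" -v t="$threshold" -v op="$op" 'BEGIN {
      if (op == "ge") exit !(v >= t)
      exit !(v <= t)
    }' || return 1
  done
  return 0
}

# Made-up one-minute averages: the last two are both 30% or higher.
alarm_breached ge 30 2 12.5 31.0 48.2 && echo "Alarm1: ALARM"
alarm_breached le 30 2 12.5 31.0 48.2 || echo "Alarm2: OK"
```

Because both alarms use the same threshold with inclusive operators, a value of exactly 30% satisfies both conditions in this model.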
Use CloudFormation to build this environment and check the actual behavior.
Create CloudFormation stacks and check resources in stacks
Create the CloudFormation stacks.
How to create stacks and check each stack is covered on a separate page.
After checking the resources in each stack, the main resources created this time are as follows:
- ALB: fa-087-ALB
- DNS name of ALB: fa-087-alb-737613323.ap-northeast-1.elb.amazonaws.com
- ALB target group: fa-087-albTargetGroup
- Launch template: fa-087-LaunchTemplate
- EC2 Auto Scaling group: fa-087-AutoScalingGroup
- CloudWatch alarm 1: fa-087-Alarm1
- CloudWatch alarm 2: fa-087-Alarm2
Confirm the created resources from the AWS Management Console.
First, confirm the ALB and its DNS name.
Confirm the Auto Scaling group.
The desired/minimum number is 1 and the maximum number is 2. In other words, within the Auto Scaling group, one instance will be launched during normal operation and two instances during scale-out.
Check the scaling policies.
Two scaling policies have been created, one for scale-out and one for scale-in.
You can also check the conditions of the CloudWatch alarms that trigger the two policies.
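The same check can also be done from the command line. The sketch below wraps two AWS CLI calls in small functions; the wrapper names `show_policies` and `show_alarm_states` are hypothetical, and the resource names are the ones listed earlier on this page. It assumes the AWS CLI is installed and configured for the account and region where the stacks were created.

```shell
# Assumed name, taken from the resource list on this page.
ASG_NAME="fa-087-AutoScalingGroup"

# Hypothetical wrapper: list the scaling policies attached to the group.
show_policies() {
  aws autoscaling describe-policies \
    --auto-scaling-group-name "$1" \
    --query 'ScalingPolicies[].[PolicyName,PolicyType,ScalingAdjustment,Cooldown]' \
    --output text
}

# Hypothetical wrapper: show the current state of the given alarms.
show_alarm_states() {
  aws cloudwatch describe-alarms \
    --alarm-names "$@" \
    --query 'MetricAlarms[].[AlarmName,StateValue]' \
    --output text
}

# Usage (requires AWS credentials and the deployed stacks):
#   show_policies "$ASG_NAME"
#   show_alarm_states fa-087-Alarm1 fa-087-Alarm2
```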
Looking at the activity history of the Auto Scaling group, we can see that one instance was launched in the initially empty group.
Checking the instances in the group, we can see that one instance is indeed running.
At this point, the CPU utilization is below 30%, so scale-out does not start.
Check Operation
Normal
Now that everything is ready, access the ALB.
The response comes from the instance in the group.
This confirms that the Auto Scaling group is indeed attached to the ALB.
Scale-out
Check the behavior during scale-out.
Use SSM Session Manager to access the instance in the Auto Scaling group.
% aws ssm start-session --target i-013a345e86bf1eda1
Starting session with SessionId: root-0d5c2cb28ca211e00
sh-4.2$
SSM Session Manager is covered in more detail on a separate page.
Use the yes command to increase the CPU utilization of the instance.
sh-4.2$ yes > /dev/null &
After waiting for a while, CPU utilization exceeds 30%.
Now the conditions for scale-out to start have been met.
The activity history of the Auto Scaling group shows that a new instance has been started.
Checking the instances in the group, we see that two instances have indeed been launched.
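For readers who prefer the command line over the console, the activity history and the instances in the group can be listed with the AWS CLI. The wrapper names `show_activities` and `show_instances` are hypothetical; the group name in the usage comment is the one created by this page's stacks, and a configured AWS CLI is assumed.

```shell
# Hypothetical wrapper: show up to five scaling activities for the group.
show_activities() {
  aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name "$1" \
    --query 'Activities[:5].[StartTime,StatusCode,Description]' \
    --output text
}

# Hypothetical wrapper: list the instances currently in the group.
show_instances() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query 'AutoScalingGroups[0].Instances[].[InstanceId,LifecycleState,HealthStatus]' \
    --output text
}

# Usage (requires AWS credentials and the deployed stacks):
#   show_activities fa-087-AutoScalingGroup
#   show_instances fa-087-AutoScalingGroup
```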
Access the ALB again.
Requests are now routed alternately to the first and second instances.
This shows that the scale-out policy ran and a second instance was launched in the Auto Scaling group.
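One way to observe the alternating responses without a browser is to request the ALB repeatedly and count the distinct response bodies. This is only a sketch: the helper name `poll_alb` is hypothetical, it assumes curl is available, and it relies on each instance returning a single line containing its instance ID, as set up by the user data above.

```shell
# Hypothetical helper: request the given URL N times and count responses per body.
# $1 = URL, $2 = number of requests
poll_alb() {
  url=$1
  count=$2
  i=0
  while [ "$i" -lt "$count" ]; do
    curl -s "$url"
    i=$((i + 1))
  done | sort | uniq -c
}

# Usage (DNS name taken from the resource list on this page):
#   poll_alb "http://fa-087-alb-737613323.ap-northeast-1.elb.amazonaws.com" 10
```

After scale-out, the output would be expected to contain two lines, one per instance ID, each with its request count.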
Scale-In
Wait for a while to check the behavior during scale-in.
With the second instance running, the average CPU utilization for the entire Auto Scaling group has dropped below 30%.
The conditions for scale-in to start have now been met.
Looking at the activity history of the Auto Scaling group, we can see that one instance has been terminated.
Checking the instances in the group shows that only one instance is now running.
The terminated instance is the one whose CPU utilization was increased by the yes command.
Check the CPU utilization again.
The CPU utilization has returned to less than 1%.
This is because the yes command is not running on the remaining instance.
This shows that the scale-in policy ran and one instance in the Auto Scaling group was terminated.
Summary
We have confirmed the behavior of Simple Scaling, a type of dynamic scaling of EC2 Auto Scaling.