Periodically delete old AMIs – Step Functions version

For periodic creation and deletion of AMIs, AWS provides Amazon Data Lifecycle Manager (DLM).

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/snapshot-lifecycle.html

The following pages cover an introduction to DLM.

https://awstut.com/en/2023/03/19/introduction-to-data-lifecycle-manager-create-ebs-snapshot-ami-periodically-en

However, as cited below, DLM cannot be used to manage all AMIs.

Amazon Data Lifecycle Manager cannot be used to manage snapshots or AMIs that are created by any other means.

Amazon Data Lifecycle Manager

In this case, we will consider how to periodically delete old AMIs without using DLM.
Specifically, we will use Lambda functions and Step Functions.

Environment

Diagram of periodically delete old AMIs - Step Functions version.

Create a Step Functions state machine.
The state machine consists of two main states.

  • First state: A Task state that runs a Lambda function to list the old AMIs.
  • Second state: A Map state that processes each AMI returned by the previous state.

The Map state consists of the following two sub-states:

  • First substate: delete the AMI using a Lambda function.
  • Second substate: Delete the snapshot associated with the AMI using the Lambda function.
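
As a rough mental model, the control flow of the state machine can be sketched in plain Python. The names `run_state_machine`, `list_expired_images`, `delete_image` and `delete_snapshots` are hypothetical stand-ins for the state machine and the three Lambda functions. They are not part of the templates, and the stubs below make no AWS calls.

```python
# Plain-Python sketch of the state machine's control flow.
# The three callables are hypothetical stand-ins for the Lambda functions.

def run_state_machine(list_expired_images, delete_image, delete_snapshots, valid_hours):
    # First state: collect the AMIs that are older than valid_hours.
    images = list_expired_images({"valid_hours": valid_hours})

    results = []
    # Second state (Map): process the AMIs one at a time.
    for image in images:
        item = {"describe-image": image}
        # First sub-state: deregister the AMI.
        item["delete-image"] = delete_image(image)
        # Second sub-state: delete the snapshots behind the AMI.
        item["delete-snapshots"] = delete_snapshots(image)
        results.append(item)
    return results


# Stubbed usage: no AWS calls are made.
fake_images = [{"ImageId": "ami-00000000000000000", "BlockDeviceMappings": []}]
output = run_state_machine(
    list_expired_images=lambda event: fake_images,
    delete_image=lambda image: {"deleted": image["ImageId"]},
    delete_snapshots=lambda image: [],
    valid_hours=1,
)
print(output)
```

If the first step returns an empty list, the loop body never runs, which mirrors a Map state that receives no items.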

For this verification, AMIs created more than one hour ago are targeted for deletion.

The runtime environment for the function is Python 3.8.

Create an EventBridge rule to run this state machine periodically.
Specifically, set the state machine to run once per hour.

CloudFormation template files

The above configuration is built with CloudFormation.
The CloudFormation templates are placed at the following URL

https://github.com/awstut-an-r/awstut-soa/tree/main/02/004

Explanation of key points of template files

Step Functions State Machine

Resources:
  StateMachine:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      Definition:
        Comment: !Sub "${Prefix}-StateMachine"
        StartAt: ListExpiredImagesState
        States:
          ListExpiredImagesState:
            Type: Task
            Resource: !Ref Function1Arn
            Parameters:
              valid_hours: !Ref ValidHours
            ResultPath: $.images
            Next: DeleteImageMapState
          DeleteImageMapState:
            Type: Map
            MaxConcurrency: 1
            InputPath: $.images
            ItemSelector:
              describe-image.$: $$.Map.Item.Value
            ItemProcessor:
              ProcessorConfig:
                Mode: INLINE
              StartAt: DeleteImageState
              States:
                DeleteImageState:
                  Type: Task
                  Resource: !Ref Function2Arn
                  InputPath: $.describe-image
                  ResultPath: $.delete-image
                  Next: DeleteSnapshotState
                DeleteSnapshotState:
                  Type: Task
                  Resource: !Ref Function3Arn
                  InputPath: $.describe-image
                  ResultPath: $.delete-snapshots
                  End: true
            End: true
      LoggingConfiguration:
        Destinations:
          - CloudWatchLogsLogGroup:
              LogGroupArn: !GetAtt LogGroup.Arn
        IncludeExecutionData: true
        Level: ALL
      RoleArn: !GetAtt StateMachineRole.Arn
      StateMachineName: !Ref Prefix
      StateMachineType: STANDARD

For more information on the basics of the Step Functions state machine, please see the following pages.

https://awstut.com/en/2022/06/18/introduction-to-step-functions-with-cfn-en

First State

The Resource property specifies the Lambda function to execute.
In this state, Lambda function 1, described below, is specified.

The Parameters property sets the parameters passed to Lambda function 1.
Here, the retention period for AMIs is passed as follows.

{
  "valid_hours": 1
}

Specifying “1” for this parameter targets AMIs created more than one hour ago for deletion.

The ResultPath property specifies where the data returned by function 1 is placed in the state's output.
In this case, the returned list is stored under images.
The following is an example.

{
  "images": [
    {
      "CreationDate": "2023-02-18T23:59:14.000Z",
      "ImageId": "ami-0871af2accbb69d20",
      "BlockDeviceMappings": [
        {
          "DeviceName": "/dev/xvda",
          "Ebs": {
            "SnapshotId": "snap-0470d37f1fce13f94",
            ...
          }
        }
      ],
      ...
    },
    {
      "CreationDate": "2023-02-19T12:45:33.000Z",
      "ImageId": "ami-0a95e4ddd73a24fc9",
      "BlockDeviceMappings": [
        {
          "Ebs": {
            "SnapshotId": "snap-0819cb0ef6defdafa",
            ...
          }
        },
        {
          "Ebs": {
            "DeleteOnTermination": false,
            "SnapshotId": "snap-01449bfd40017a2bf",
            ...
          }
        }
      ],
      ...
    },
    ...
  ]
}
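
To make the effect of ResultPath concrete, the following is a simplified Python model of what happens for a single top-level key such as `$.images`. The helper `apply_result_path` is hypothetical and only handles this simple case. The real service supports full reference paths, and the empty state input is an assumption made for the example.

```python
import copy

def apply_result_path(state_input, result, result_path):
    """Simplified model of ResultPath for paths of the form '$.key'."""
    if not result_path.startswith("$.") or "." in result_path[2:]:
        raise ValueError("this sketch only supports paths like '$.key'")
    # The task result is merged into a copy of the state's input.
    output = copy.deepcopy(state_input)
    output[result_path[2:]] = result
    return output

# Assumed (empty) input of the first state.
state_input = {}
function1_result = [{"ImageId": "ami-0871af2accbb69d20"}]
state_output = apply_result_path(state_input, function1_result, "$.images")
print(state_output)
```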

Lambda Function 1

Function
Resources:
  Function1:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        ZipFile: |
          import boto3
          import datetime
          import os

          ACCOUNT_ID = os.environ['ACCOUNT_ID']
          REGION_NAME = os.environ['REGION_NAME']

          client = boto3.client('ec2', region_name=REGION_NAME)

          def lambda_handler(event, context):
            valid_hours = event['valid_hours']

            now = datetime.datetime.now(datetime.timezone.utc)

            response = client.describe_images(
              Owners=[ACCOUNT_ID])
            expired_images = []
            for image in response['Images']:
              creation_date_str = image['CreationDate']
              creation_date_dt = datetime.datetime.fromisoformat(creation_date_str.replace('Z', '+00:00'))

              diff = now - creation_date_dt
              diff_hour = diff.total_seconds() / (60 * 60)

              if diff_hour > valid_hours:
                expired_images.append(image)

            return expired_images
      Environment:
        Variables:
          ACCOUNT_ID: !Ref AWS::AccountId
          REGION_NAME: !Ref AWS::Region
      FunctionName: !Sub "${Prefix}-function-01"
      Handler: !Ref Handler
      Runtime: !Ref Runtime
      Role: !GetAtt FunctionRole1.Arn

The code executed by the Lambda function is written inline in the template.
For more information, please refer to the following page.

https://awstut.com/en/2022/02/02/3-parterns-to-create-lambda-with-cloudformation-s3-inline-container

The code works as follows:

  • Read valid_hours from the event object to get the retention period in hours.
  • After creating a client object for EC2 in boto3, execute the describe_images method to obtain a list of AMIs owned by the account.
  • Compare the creation date and time of each AMI with the current date and time, and extract those older than the retention period.

The date and time of AMI creation can be obtained from the “CreationDate” value, which is in ISO 8601 format, such as “2023-02-19T12:45:33.000Z”.
In this case, we followed the approach on the following page to parse the string, convert it to a datetime object, and compare it with the current date and time.

https://note.nkmk.me/python-datetime-isoformat-fromisoformat/
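
As an illustration of the comparison, here is a minimal standalone sketch. The helper name `is_expired` and the fixed `now` value are assumptions made for the example. It uses `total_seconds()` so that ages longer than one day are counted in full, whereas the `seconds` attribute of a timedelta excludes whole days.

```python
import datetime

def is_expired(creation_date_str, valid_hours, now):
    """Return True if the AMI is older than valid_hours at time `now`."""
    # Older Python versions of fromisoformat() do not accept a trailing 'Z'.
    created = datetime.datetime.fromisoformat(creation_date_str.replace("Z", "+00:00"))
    age_hours = (now - created).total_seconds() / 3600
    return age_hours > valid_hours

# Hypothetical "current time" chosen for the example.
now = datetime.datetime(2023, 2, 19, 13, 0, 0, tzinfo=datetime.timezone.utc)

print(is_expired("2023-02-19T12:45:33.000Z", 1, now))  # about 15 minutes old
print(is_expired("2023-02-18T23:59:14.000Z", 1, now))  # about 13 hours old
print(is_expired("2023-02-18T12:50:00.000Z", 1, now))  # just over one day old
```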

IAM Role
Resources:
  FunctionRole1:
    Type: AWS::IAM::Role
    DeletionPolicy: Delete
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - lambda.amazonaws.com
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyName: !Sub "${Prefix}-DescribeImagesPolicy"
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - ec2:DescribeImages
                Resource: "*"

The inline policy grants permission to retrieve AMI information.

Second State

This is a Map state.
Please refer to the following page for basic information on the Map state.

https://awstut.com/en/2023/03/12/iteration-using-map-in-step-functions-en

By specifying “$.images” in the InputPath property, the data set in the previous state is read.

The ItemSelector property allows you to set the data format to be passed to each iteration.
By setting “describe-image.$: $$.Map.Item.Value”, for example, the following data will be generated and passed.

{
  "describe-image": {
    "CreationDate": "2023-02-18T23:59:14.000Z",
    "ImageId": "ami-0871af2accbb69d20",
    "BlockDeviceMappings": [
      {
        "DeviceName": "/dev/xvda",
        "Ebs": {
          "SnapshotId": "snap-0470d37f1fce13f94"
        },
        ...
      }
    ],
    ...
  }
}
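
A simplified Python model of this behavior is shown below. The function `build_iteration_inputs` is a hypothetical helper. It only models the one selector used here, in which `$$.Map.Item.Value` refers to the current array element.

```python
def build_iteration_inputs(items, key="describe-image"):
    """Model ItemSelector {key + '.$': '$$.Map.Item.Value'} for a Map state."""
    # Each element of the input array is wrapped under the given key.
    return [{key: item} for item in items]

images = [
    {"ImageId": "ami-0871af2accbb69d20"},
    {"ImageId": "ami-0a95e4ddd73a24fc9"},
]
iteration_inputs = build_iteration_inputs(images)
for iteration_input in iteration_inputs:
    print(iteration_input)
```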

First substate

This state executes Lambda function 2.

By specifying “$.describe-image” in the InputPath property, you can receive the value set in the ItemSelector property described above and pass it to the function.
The following is a concrete example of data passed to the function.

{
  "CreationDate": "2023-02-18T23:59:14.000Z",
  "ImageId": "ami-0871af2accbb69d20",
  "BlockDeviceMappings": [
    {
      "DeviceName": "/dev/xvda",
      "Ebs": {
        "SnapshotId": "snap-0470d37f1fce13f94"
      },
      ...
    }
  ],
  ...
}

The ResultPath property specifies where the data returned by function 2 is placed in the state's output.
This time, the returned value is stored under delete-image.
The following is an example.

{
  "describe-image": {
    ...
  },
  "delete-image": {
    "ResponseMetadata": {
      "RequestId": "951ccf72-e138-4026-b851-33bb4364c6f3",
      "HTTPStatusCode": 200,
      "HTTPHeaders": {
        ...
      },
      "RetryAttempts": 0
    }
  }
}

Lambda Function 2

Function
Resources:
  Function2:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        ZipFile: |
          import boto3
          import os

          REGION_NAME = os.environ['REGION_NAME']

          client = boto3.client('ec2', region_name=REGION_NAME)

          def lambda_handler(event, context):
            image_id = event['ImageId']
            deregister_image_response = client.deregister_image(
              ImageId=image_id
            )
            return deregister_image_response
      Environment:
        Variables:
          REGION_NAME: !Ref AWS::Region
      FunctionName: !Sub "${Prefix}-function-02"
      Handler: !Ref Handler
      Runtime: !Ref Runtime
      Role: !GetAtt FunctionRole2.Arn

The function deletes (deregisters) the AMI with the deregister_image method.

It returns the response data from the method call.
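
The handler logic can be exercised without touching AWS by passing in a stand-in client. The sketch below is only an illustration: `deregister` and `FakeEc2Client` are hypothetical names, and the fake merely records the call instead of calling the EC2 API.

```python
class FakeEc2Client:
    """Hypothetical stand-in for the boto3 EC2 client; it only records calls."""

    def __init__(self):
        self.deregistered = []

    def deregister_image(self, ImageId):
        self.deregistered.append(ImageId)
        return {"ResponseMetadata": {"HTTPStatusCode": 200}}


def deregister(event, client):
    # Same logic as the handler, with the client passed in for testing.
    return client.deregister_image(ImageId=event["ImageId"])


fake_client = FakeEc2Client()
response = deregister({"ImageId": "ami-0871af2accbb69d20"}, fake_client)
print(fake_client.deregistered, response)
```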

IAM Role
Resources:
  FunctionRole2:
    Type: AWS::IAM::Role
    DeletionPolicy: Delete
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - lambda.amazonaws.com
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyName: !Sub "${Prefix}-DeleteImagePolicy"
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - ec2:DeregisterImage
                Resource: "*"

The contents grant the necessary permissions for AMI deletion.

Second substate

This state executes Lambda function 3.

This state also receives the value set in the ItemSelector property described above by specifying “$.describe-image” in the InputPath property, and passes it to the function.

The ResultPath property specifies where the data returned by function 3 is placed in the state's output.
This time, the returned values are stored under delete-snapshots.
The following is an example.

{
  "describe-image": {
    ...
  },
  "delete-image": {
    ...
  },
  "delete-snapshots": [
    {
      "ResponseMetadata": {
        "RequestId": "95088418-f80f-4501-a538-64d7a1f71711",
        "HTTPStatusCode": 200,
        "HTTPHeaders": {
          ...
        },
        "RetryAttempts": 0
      }
    }
  ]
}

Lambda Function 3

Function
Resources:
  Function3:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        ZipFile: |
          import boto3
          import os

          REGION_NAME = os.environ['REGION_NAME']

          client = boto3.client('ec2', region_name=REGION_NAME)

          def lambda_handler(event, context):
            image_id = event['ImageId']

            responses = []
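            for block_device in event['BlockDeviceMappings']:
              # Skip mappings without an EBS snapshot (e.g. instance store volumes).
              snapshot_id = block_device.get('Ebs', {}).get('SnapshotId')
              if not snapshot_id:
                continue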
              delete_snapshot_response = client.delete_snapshot(
                SnapshotId=snapshot_id
              )
              print(delete_snapshot_response)
              responses.append(delete_snapshot_response)
            return responses
      Environment:
        Variables:
          REGION_NAME: !Ref AWS::Region
      FunctionName: !Sub "${Prefix}-function-03"
      Handler: !Ref Handler
      Runtime: !Ref Runtime
      Role: !GetAtt FunctionRole3.Arn

The function calls the delete_snapshot method for each snapshot associated with the AMI.

It returns a list of the responses from those calls.
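
Collecting the snapshot IDs is the only data handling in this function. A standalone sketch of that step follows. The helper `snapshot_ids` is hypothetical, and skipping mappings without an `Ebs.SnapshotId` entry (for example instance store volumes) is a defensive choice made in this sketch.

```python
def snapshot_ids(image):
    """Return the snapshot IDs referenced by an AMI's block device mappings."""
    ids = []
    for mapping in image.get("BlockDeviceMappings", []):
        snapshot_id = mapping.get("Ebs", {}).get("SnapshotId")
        if snapshot_id:  # skip mappings that have no EBS snapshot
            ids.append(snapshot_id)
    return ids


image = {
    "ImageId": "ami-0a95e4ddd73a24fc9",
    "BlockDeviceMappings": [
        {"Ebs": {"SnapshotId": "snap-0819cb0ef6defdafa"}},
        {"Ebs": {"SnapshotId": "snap-01449bfd40017a2bf"}},
        {"DeviceName": "/dev/sdb", "VirtualName": "ephemeral0"},
    ],
}
print(snapshot_ids(image))
```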

IAM Role
Resources:
  FunctionRole3:
    Type: AWS::IAM::Role
    DeletionPolicy: Delete
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - lambda.amazonaws.com
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyName: !Sub "${Prefix}-DeleteSnapshotPolicy"
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - ec2:DeleteSnapshot
                Resource: "*"

The contents grant the necessary permissions for deleting snapshots.

EventBridge Rule

Resources:
  Rule:
    Type: AWS::Events::Rule
    Properties:
      Name: !Sub "${Prefix}-EventsRule"
      ScheduleExpression: rate(1 hour)
      State: ENABLED
      Targets:
        - Arn: !Ref StateMachineArn
          Id: !Ref StateMachineName
          RoleArn: !GetAtt EventsRuleRole.Arn

This rule runs the Step Functions state machine periodically.

The schedule uses a rate expression so that the state machine runs once per hour.

For information on how to use EventBridge to periodically run the Step Functions state machine, please see the following page.

https://awstut.com/en/2023/03/05/use-eventbridge-to-execute-step-functions-periodically-en

Architecting

Use CloudFormation to build this environment and check its actual behavior.

Create CloudFormation stacks and check their resources

Create the CloudFormation stacks.
For information on how to create stacks and check each stack, please refer to the following page.

https://awstut.com/en/2021/12/11/cloudformations-nested-stack

After reviewing the resources in each stack, the main resources created in this case are as follows:

  • Step Functions state machine: soa-02-004
  • Lambda function 1: soa-02-004-function-01
  • Lambda function 2: soa-02-004-function-02
  • Lambda function 3: soa-02-004-function-03
  • EventBridgeRule: fa-120-EventsRule

Check the resources created from the AWS Management Console.

Check the state machine.

Detail of Step Functions 1.

It has been successfully created.
You can see that the state machine consists of a Map state and three Lambda functions.

Check the EventBridge rules.

Detail of EventBridge 1.
Detail of EventBridge 2.

The EventBridge rule has been created.
It runs the above state machine every hour.

Operation Check

AMI Creation

Now that you are ready, create an AMI for testing.

Detail of EC2 1.

An AMI and its snapshot have been created.

Detail of EC2 2.
Detail of EC2 3.

Details are as follows:

  • AMI: ami-0d548220085c3a287
  • Snapshot: snap-03741d041ff197f39

First Step Functions state machine execution

Verify the operation of the state machine.

Detail of Step Functions 2.

State machine is running.
It was executed automatically by an EventBridge rule.

Detail of Step Functions 3.

The execution details show that the run completed successfully.

However, the Map state ended without running its sub-states.
This is because there was no AMI eligible for deletion.
The AMI had just been created, so the one-hour retention period had not yet passed.

Second Step Functions state machine execution

Wait until the next execution of the state machine.

After one hour, the state machine is run again.

Detail of Step Functions 4.

Check the execution details.

Detail of Step Functions 5.

This time, all states have been executed and successfully completed.

Check the output at the end of the process.

[
  {
    "describe-image": {
      "CreationDate": "2023-02-24T12:49:49.000Z",
      "ImageId": "ami-0d548220085c3a287",
      "BlockDeviceMappings": [
        {
          "DeviceName": "/dev/xvda",
          "Ebs": {
            "SnapshotId": "snap-03741d041ff197f39",
            ...
          }
        }
      ],
      ...
    },
    "delete-image": {
      "ResponseMetadata": {
        "HTTPStatusCode": 200,
        ...
      }
    },
    "delete-snapshots": [
      {
        "ResponseMetadata": {
          "HTTPStatusCode": 200,
          ...
        }
      }
    ]
  }
]

Indeed, we can see that the AMI and Snapshot deletion states in the Map state have been executed.

Finally, check the status of AMI and snapshots.

Detail of EC2 4.
Detail of EC2 5.

Indeed, the AMI and snapshot have been deleted.

Summary

We have shown how to use Lambda functions and Step Functions to periodically delete old AMIs.