Call CodeBuild from Step Functions

Step Functions is integrated with a variety of AWS services.

With AWS service integrations, you can call API actions and coordinate executions directly from your workflow.

Call other AWS services

In this article, we will demonstrate how to call CodeBuild from Step Functions to download a YouTube video and save it to an S3 bucket. By following this guide, you will learn how to set up an automated video download flow and verify its results.

Architecture

Diagram of executing CodeBuild within Step Functions.
  • The state machine runs the following steps
    1. Call Lambda function 1 to obtain the ID of the video to be downloaded
    2. Call CodeBuild to download the video and save it to S3
    3. Call Lambda function 2 to obtain a list of objects in the S3 bucket

Resources

Step Functions

State Machine

Detail of Step Functions 01.

The first state GetVideoIdState calls Lambda function 1 to obtain the ID of the YouTube video to download. This function returns a string (video ID).

The second state, DownloadVideoState, calls a CodeBuild API. See the following page for the CodeBuild APIs that Step Functions can call.

See also
Manage AWS CodeBuild builds with Step Functions - AWS Step Functions Learn how to integrate Step Functions with AWS CodeBuild to trigger, stop, and manage builds

This time we will run the StartBuild API to execute the CodeBuild project. This API supports the Job Execution (.sync) integration pattern.

For integrated services such as AWS Batch and Amazon ECS, Step Functions can wait for a request to complete before progressing to the next state. To have Step Functions wait, specify the “Resource” field in your task state definition with the .sync suffix appended after the resource URI.

Run a Job (.sync)

In summary, the value specified for the Resource property is “arn:aws:states:::codebuild:startBuild.sync”.

The full list of parameters accepted by the StartBuild API is on the following page. This time we set the following two.

See also
StartBuild - AWS CodeBuild Starts running a build with the settings defined in the project. These setting include: how to run a build, where to get the source code, which build environmen...

The first is environment variables, which can be set by specifying EnvironmentVariablesOverride. In this case, we will set the ID of the video to be downloaded as an environment variable. The second is the name of the CodeBuild project to run: specify the name of the project to run in ProjectName.
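As a rough illustration of the two parameters described above, the task state can be written out as a Python dictionary and dumped to JSON. This is only a sketch: the function name build_download_state and the project name "example-project" are made up for the example, while the Resource value, parameter names, and paths are the ones used in this article.

```python
import json


def build_download_state(project_name: str, next_state: str) -> dict:
    """Return a Task state that calls StartBuild and waits for the build to finish."""
    return {
        "Type": "Task",
        # The .sync suffix selects the Run a Job integration pattern.
        "Resource": "arn:aws:states:::codebuild:startBuild.sync",
        "Parameters": {
            # Name of the CodeBuild project to run.
            "ProjectName": project_name,
            # Environment variables that override the project defaults.
            "EnvironmentVariablesOverride": [
                {
                    "Name": "VIDEO_ID",
                    "Type": "PLAINTEXT",
                    # A key ending in ".$" is read from the state input.
                    "Value.$": "$.video_id",
                }
            ],
        },
        "ResultPath": "$.codebuild_response",
        "Next": next_state,
    }


# "example-project" is a placeholder, not the project created in this article.
state = build_download_state("example-project", "ListBucketState")
print(json.dumps(state, indent=2))
```

The printed JSON has the same shape as the DownloadVideoState entry in the CloudFormation template later in the article.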

For basic information on state machines, please refer to the following pages.

See also
Introduction to Step Functions with CFN 【Introduction to Step Functions with CloudFormation】 This course is about refactoring, which is the scope of AWS DVA. Step Functions is a serverless orches...

IAM Role

Detail of IAM role 01.
Detail of IAM role 02.

The trust policy allows the states.amazonaws.com service to assume this IAM role. The inline policy attached to the role grants the permissions needed to start and stop CodeBuild builds, manage EventBridge rules, and invoke the Lambda functions.

We referred to the following page.

See also
Manage AWS CodeBuild builds with Step Functions - AWS Step Functions Learn how to integrate Step Functions with AWS CodeBuild to trigger, stop, and manage builds

The AWS documentation explains the EventBridge permissions as follows:

Events sent from AWS services to Amazon EventBridge are directed to Step Functions using a managed rule, and require permissions for events:PutTargets, events:PutRule, and events:DescribeRule.

Additional permissions for tasks using the Run a Job pattern

CodeBuild

Build Project

Detail of CodeBuild 01.
Detail of CodeBuild 02.

The CodeBuild project uses a Python environment running on Lambda compute. The build installs yt-dlp and downloads the video from YouTube using the video ID received in the VIDEO_ID environment variable. The downloaded video is then uploaded to the specified S3 bucket.
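The two commands the build runs can be sketched in Python to show how the video ID and bucket name fit together. This is an illustrative helper only: the function name build_commands and the bucket name "example-bucket" are hypothetical, and the real project runs the commands from its buildspec rather than from Python.

```python
import shlex


def build_commands(video_id: str, bucket_name: str) -> list[str]:
    """Return the download and upload commands as shell-safe strings."""
    local_path = f"/tmp/{video_id}.mp4"
    # Download the video to /tmp, named after its ID.
    download = ["yt-dlp", "-o", local_path, video_id]
    # Copy the downloaded file to the root of the bucket.
    upload = ["aws", "s3", "cp", local_path, f"s3://{bucket_name}/"]
    return [shlex.join(download), shlex.join(upload)]


# "example-bucket" is a placeholder bucket name.
commands = build_commands("TqqaSD2qTdY", "example-bucket")
for command in commands:
    print(command)
```

Building the commands as argument lists and quoting them with shlex.join keeps an unusual video ID from being interpreted by the shell.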

For details on CodeBuild with Lambda compute, please refer to the following page.

See also
CodeBuild – Lambda Version 【CodeBuild - Lambda Version】 On January 6, an update to the CodeBuild build environment was announced. Customers can now select AWS Lambda as a new compute...

IAM Role

Detail of IAM role 03.
Detail of IAM role 04.

The IAM role for this project grants permission to store objects, i.e., the downloaded videos, in the S3 bucket.

Lambda 1

Function

Detail of Lambda 01.

This simple function returns the ID of a specific YouTube video. The runtime is Python 3.12, and the video ID is hard-coded in the function.

IAM Role

Detail of IAM role 05.
Detail of IAM role 06.

Attach the AWS managed policy AWSLambdaBasicExecutionRole to the IAM role, giving it the minimum permissions necessary to execute the function.

Lambda 2

Function

Detail of Lambda 02.

This function retrieves a list of objects in an S3 bucket and returns their names, using Python 3.12 as the runtime environment and boto3 to manipulate the S3 bucket.
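A single list_objects_v2 call returns at most 1,000 keys, so a bucket with more objects needs pagination. The following is a minimal sketch of a paginated variant, not the function used in this article. The helper name list_object_keys is hypothetical, and the FakeS3Client class is a stand-in so the example runs without AWS access; in a real function you would pass boto3.client('s3') instead.

```python
def list_object_keys(s3_client, bucket_name: str) -> list[str]:
    """Collect every object key in the bucket, following pagination."""
    keys = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        # "Contents" is missing from a page that holds no objects.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


class FakePaginator:
    """Stand-in for a boto3 paginator that yields canned pages."""

    def __init__(self, pages):
        self._pages = pages

    def paginate(self, Bucket):
        return iter(self._pages)


class FakeS3Client:
    """Stand-in for boto3.client('s3'), used only for this demonstration."""

    def get_paginator(self, operation_name):
        return FakePaginator([{"Contents": [{"Key": "TqqaSD2qTdY.mp4"}]}, {}])


keys = list_object_keys(FakeS3Client(), "example-bucket")
print(keys)
```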

IAM Role

Detail of IAM role 07.

In addition to the AWS managed policy AWSLambdaBasicExecutionRole, the role is granted permission to list the objects in the S3 bucket.

CloudFormation Template

Step Functions

State Machine

Resources:
  StateMachine:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      Definition:
        Comment: !Sub "${Prefix}-statemachine"
        StartAt: GetVideoIdState
        States:
          GetVideoIdState:
            Type: Task
            Resource: !Ref FunctionArn1
            ResultPath: $.video_id
            Next: DownloadVideoState
          DownloadVideoState:
            Type: Task
            Resource: arn:aws:states:::codebuild:startBuild.sync
            Parameters:
              EnvironmentVariablesOverride:
                - Name: VIDEO_ID
                  Type: PLAINTEXT
                  Value.$: $.video_id
              ProjectName: !Ref CodeBuildProjectName
            ResultPath: $.codebuild_response
            Next: ListBucketState
          ListBucketState:
            Type: Task
            Resource: !Ref FunctionArn2
            End: true
      LoggingConfiguration:
        Destinations:
          - CloudWatchLogsLogGroup:
              LogGroupArn: !GetAtt StateMachineLogGroup.Arn
        IncludeExecutionData: true
        Level: ALL
      RoleArn: !GetAtt StateMachineRole.Arn
      StateMachineName: !Sub "${Prefix}-statemachine"
      StateMachineType: STANDARD
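To make the data flow in the definition above concrete, the following Python sketch imitates how ResultPath stores the output of the first state and how a "Value.$" parameter picks it up in the second state. It is a deliberately simplified simulation that only handles top-level paths such as $.video_id, not how Step Functions is implemented. The helper names and "example-project" are hypothetical.

```python
import copy


def apply_result_path(state_input: dict, result, result_path: str) -> dict:
    """Store a task result in a copy of the input under a path like '$.key'."""
    if not result_path.startswith("$.") or "." in result_path[2:]:
        raise ValueError("this sketch only supports top-level paths like '$.key'")
    output = copy.deepcopy(state_input)
    output[result_path[2:]] = result
    return output


def resolve_parameters(params, state_input: dict):
    """Replace keys ending in '.$' with the value found in the state input."""
    if isinstance(params, list):
        return [resolve_parameters(item, state_input) for item in params]
    if not isinstance(params, dict):
        return params
    resolved = {}
    for key, value in params.items():
        if key.endswith(".$"):
            if not value.startswith("$."):
                raise ValueError("this sketch only supports paths like '$.key'")
            resolved[key[:-2]] = state_input[value[2:]]
        else:
            resolved[key] = resolve_parameters(value, state_input)
    return resolved


# GetVideoIdState: the Lambda result is stored under $.video_id.
after_first = apply_result_path({}, "TqqaSD2qTdY", "$.video_id")

# DownloadVideoState: "Value.$" is resolved against the state input.
build_request = resolve_parameters(
    {
        "EnvironmentVariablesOverride": [
            {"Name": "VIDEO_ID", "Type": "PLAINTEXT", "Value.$": "$.video_id"}
        ],
        "ProjectName": "example-project",  # placeholder project name
    },
    after_first,
)
print(after_first)
print(build_request)
```

ListBucketState sets no ResultPath, so its result replaces the whole state input and becomes the output of the execution.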

IAM Role

Resources:
  StateMachineRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - states.amazonaws.com
      Policies:
        - PolicyName: StateMachinePolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - codebuild:StartBuild
                  - codebuild:StopBuild
                  - codebuild:BatchGetBuilds
                Resource:
                  - !Sub "arn:aws:codebuild:${AWS::Region}:${AWS::AccountId}:project/${CodeBuildProjectName}"
              - Effect: Allow
                Action:
                  - events:PutTargets
                  - events:PutRule
                  - events:DescribeRule
                Resource:
                  - !Sub "arn:aws:events:${AWS::Region}:${AWS::AccountId}:rule/StepFunctionsGetEventForCodeBuildStartBuildRule"
              - Effect: Allow
                Action:
                  - lambda:InvokeFunction
                Resource:
                  - !Ref FunctionArn1
                  - !Ref FunctionArn2
              - Effect: Allow
                Action:
                  - logs:CreateLogDelivery
                  - logs:GetLogDelivery
                  - logs:UpdateLogDelivery
                  - logs:DeleteLogDelivery
                  - logs:ListLogDeliveries
                  - logs:PutLogEvents
                  - logs:PutResourcePolicy
                  - logs:DescribeResourcePolicies
                  - logs:DescribeLogGroups
                Resource: "*"

CodeBuild

Build Project

Resources:
  CodeBuildProject:
    Type: AWS::CodeBuild::Project
    Properties: 
      Artifacts:
        Type: NO_ARTIFACTS
      Cache: 
        Type: NO_CACHE
      Environment: 
        ComputeType: !Ref ProjectEnvironmentComputeType
        EnvironmentVariables:
          - Name: BUCKET_NAME
            Type: PLAINTEXT
            Value: !Ref BucketName
          - Name: VIDEO_ID
            Type: PLAINTEXT
            Value: ""
        Image: !Ref ProjectEnvironmentImage
        ImagePullCredentialsType: CODEBUILD
        Type: !Ref ProjectEnvironmentType
      LogsConfig: 
        CloudWatchLogs:
          GroupName: !Ref CodeBuildLogGroup
          Status: ENABLED
      Name: !Sub "${Prefix}-project"
      ServiceRole: !GetAtt CodeBuildRole.Arn
      Source: 
        Type: NO_SOURCE
        BuildSpec: !Sub |
          version: 0.2
          
          phases:
            install:
              commands:
                - pip3 install yt-dlp
            build:
              commands:
                - echo $VIDEO_ID
                - yt-dlp -o /tmp/$VIDEO_ID.mp4 $VIDEO_ID
                
                - ls -al /tmp
                - aws s3 cp /tmp/$VIDEO_ID.mp4 s3://$BUCKET_NAME/
      Visibility: PRIVATE

IAM Role

Resources:
  CodeBuildRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - codebuild.amazonaws.com
            Action:
              - sts:AssumeRole
      Policies:
        - PolicyName: CodeBuildPolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - s3:PutObject
                Resource:
                  - !Sub "arn:aws:s3:::${BucketName}/*"
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource:
                  - !Sub "arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:${CodeBuildLogGroup}"
                  - !Sub "arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:${CodeBuildLogGroup}:log-stream:*"

Lambda 1

Function

Resources:
  Function1:
    Type: AWS::Lambda::Function
    Properties:
      Architectures:
        - !Ref Architecture
      Code:
        ZipFile: |
          def lambda_handler(event, context):
            return 'TqqaSD2qTdY'
      FunctionName: !Sub "${Prefix}-function-01"
      Handler: !Ref Handler
      Runtime: !Ref Runtime
      Role: !GetAtt FunctionRole1.Arn

IAM Role

Resources:
  FunctionRole1:
    Type: AWS::IAM::Role
    DeletionPolicy: Delete
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - lambda.amazonaws.com
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

Lambda 2

Function

Resources:
  Function2:
    Type: AWS::Lambda::Function
    Properties:
      Architectures:
        - !Ref Architecture
      Environment:
        Variables:
          REGION: !Ref AWS::Region
          BUCKET_NAME: !Ref BucketName
      Code:
        ZipFile: |
          import boto3
          import os
          
          region = os.environ['REGION']
          bucket_name = os.environ['BUCKET_NAME']
          
          s3_client = boto3.client('s3', region_name=region)

          def lambda_handler(event, context):
            response = s3_client.list_objects_v2(
              Bucket=bucket_name
            )
            # 'Contents' is absent when the bucket is empty
            return [obj['Key'] for obj in response.get('Contents', [])]
      FunctionName: !Sub "${Prefix}-function-02"
      Handler: !Ref Handler
      Runtime: !Ref Runtime
      Role: !GetAtt FunctionRole2.Arn

IAM Role

Resources:
  FunctionRole2:
    Type: AWS::IAM::Role
    DeletionPolicy: Delete
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - lambda.amazonaws.com
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyName: FunctionPolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - s3:ListBucket
                Resource:
                  - !Sub "arn:aws:s3:::${BucketName}"

Full Template

The full template is available on GitHub:
awstut-fa/150 at main · awstut-an-r/awstut-fa

Verification

Start the state machine from the Step Functions console. After a short wait, the execution completes successfully.

Detail of Step Functions 02.

It passed through three states and completed successfully.
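The same check can be scripted instead of using the console. The sketch below starts an execution and polls until it leaves the RUNNING state, using the start_execution and describe_execution calls of the boto3 Step Functions client. The helper name run_and_wait, the state machine ARN, and the FakeStepFunctions class are hypothetical; the fake client is there only so the example runs without AWS access, and in practice you would pass boto3.client('stepfunctions').

```python
import json
import time


def run_and_wait(sfn_client, state_machine_arn: str, poll_seconds: float = 5.0) -> str:
    """Start an execution and return its status once it is no longer RUNNING."""
    started = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps({}),  # the state machine needs no input
    )
    execution_arn = started["executionArn"]
    while True:
        status = sfn_client.describe_execution(executionArn=execution_arn)["status"]
        if status != "RUNNING":
            return status
        time.sleep(poll_seconds)


class FakeStepFunctions:
    """Stand-in client that reports RUNNING once, then SUCCEEDED."""

    def __init__(self):
        self._statuses = iter(["RUNNING", "SUCCEEDED"])

    def start_execution(self, stateMachineArn, input):
        return {"executionArn": stateMachineArn + ":demo-execution"}

    def describe_execution(self, executionArn):
        return {"status": next(self._statuses)}


final_status = run_and_wait(
    FakeStepFunctions(),
    "arn:aws:states:us-east-1:123456789012:stateMachine:example",  # placeholder ARN
    poll_seconds=0,
)
print(final_status)
```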

Check the results of each state execution from the log.

Check the first log.

Detail of Step Functions 03.

The first Lambda function has been executed and the ID has been returned.

Check the second log.

Detail of Step Functions 04.

Above is a portion of the log; if you look at the BuildStatus value, it says “SUCCEEDED”. This indicates that the build has indeed completed.

Incidentally, the result of executing the CodeBuild StartBuild API is in the following format.

See also
StartBuild - AWS CodeBuild Starts running a build with the settings defined in the project. These setting include: how to run a build, where to get the source code, which build environmen...
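If you want to check the build result programmatically, the BuildStatus value can be read from the output of DownloadVideoState. The sketch below assumes that the build details sit under a "Build" key inside codebuild_response (the ResultPath used in this article); that nesting is an assumption based on the shape of the StartBuild response, so confirm it against your own execution log. The function name build_succeeded and the sample data are hypothetical.

```python
def build_succeeded(state_output: dict) -> bool:
    """Return True when the CodeBuild result stored in the state reports success."""
    # Assumed layout: {"codebuild_response": {"Build": {"BuildStatus": ...}}}
    build = state_output.get("codebuild_response", {}).get("Build", {})
    return build.get("BuildStatus") == "SUCCEEDED"


# Hand-written sample in the assumed layout, not a real execution log.
sample_output = {
    "video_id": "TqqaSD2qTdY",
    "codebuild_response": {"Build": {"BuildStatus": "SUCCEEDED"}},
}
print(build_succeeded(sample_output))
```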

Check the third log.

Detail of Step Functions 05.

In the last state, Lambda function 2 is executed and the list of objects in the S3 bucket is retrieved. The name of the downloaded video file is included in the list, indicating that it was successfully saved to the S3 bucket.

Finally, check the S3 bucket.

Detail of S3 01.

A direct check of the S3 bucket shows that the specified video file is stored correctly. This confirms that the entire state machine is working as expected.

Conclusion

We showed how to call CodeBuild from AWS Step Functions to save a YouTube video to an S3 bucket automatically. One Lambda function supplies the video ID, CodeBuild downloads the video and uploads it to S3, and a second Lambda function lists the bucket contents to confirm the result. Please refer to this article when building your own automated flow.
