Call CodeBuild from Step Functions
Step Functions integrates with a variety of AWS services.
With AWS service integrations, you can call API actions and coordinate executions directly from your workflow.
Call other AWS services
In this article, we will demonstrate how to call CodeBuild from Step Functions to download a YouTube video and save it to an S3 bucket. By following this guide, you will learn how to set up an automated video download flow and verify its results.
Architecture
- State Machine Creation
- Call Lambda function 1 to obtain the ID of the video to be downloaded
- Call CodeBuild to download the video and save it to S3
- Call Lambda function 2 to obtain a list of objects in the S3 bucket
Resources
Step Functions
State Machine
The first state GetVideoIdState calls Lambda function 1 to obtain the ID of the YouTube video to download. This function returns a string (video ID).
The second state, DownloadVideoState, calls CodeBuild through its API; see the following page for the CodeBuild APIs that Step Functions can call.
This time we will call the StartBuild API to run the CodeBuild project. This API supports the Run a Job (.sync) integration pattern.
For integrated services such as AWS Batch and Amazon ECS, Step Functions can wait for a request to complete before progressing to the next state. To have Step Functions wait, specify the “Resource” field in your task state definition with the .sync suffix appended after the resource URI.
Run a Job (.sync)
In summary, the value specified for the Resource property is “arn:aws:states:::codebuild:startBuild.sync”.
The parameters that can be set when calling the StartBuild API are listed on the following page; this time we will set the following two.
The first is EnvironmentVariablesOverride, which sets environment variables for the build. Here we pass the ID of the video to be downloaded as an environment variable. The second is ProjectName, which specifies the name of the CodeBuild project to run.
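As a rough sketch of how these two parameters fit together, the task state can be written as a plain Python dictionary and serialized to Amazon States Language JSON. The helper name build_download_state and the project name "example-project" are hypothetical and used only for illustration; the field names follow the CloudFormation template shown later in this article.

```python
import json


def build_download_state(project_name: str, next_state: str) -> dict:
    """Return a Task state that starts a CodeBuild build and waits for it."""
    return {
        "Type": "Task",
        # The .sync suffix makes Step Functions wait until the build finishes.
        "Resource": "arn:aws:states:::codebuild:startBuild.sync",
        "Parameters": {
            "ProjectName": project_name,
            "EnvironmentVariablesOverride": [
                {
                    "Name": "VIDEO_ID",
                    "Type": "PLAINTEXT",
                    # A key ending in ".$" reads its value from the state input.
                    "Value.$": "$.video_id",
                }
            ],
        },
        "ResultPath": "$.codebuild_response",
        "Next": next_state,
    }


if __name__ == "__main__":
    # "example-project" is a placeholder, not a name from this article.
    state = build_download_state("example-project", "ListBucketState")
    print(json.dumps(state, indent=2))
```

Keeping the state in a function like this is only one way to organize it; the template in this article defines the same state directly in YAML.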
For basic information on state machines, please refer to the following pages.
IAM Role
The trust policy allows the states.amazonaws.com service to assume this IAM role. The IAM policy attached to the role grants the permissions needed to start and stop CodeBuild builds, manipulate events in EventBridge, and invoke the Lambda functions.
The policy is based on the following pages.
Regarding the EventBridge permissions, the official AWS explanation is as follows:
Events sent from AWS services to Amazon EventBridge are directed to Step Functions using a managed rule, and require permissions for events:PutTargets, events:PutRule, and events:DescribeRule.
Additional permissions for tasks using the Run a Job pattern
CodeBuild
Build Project
The CodeBuild project uses a Lambda compute Python environment. The build installs yt-dlp, downloads the video from YouTube using the video ID received in the environment variable, and uploads the downloaded video to the specified S3 bucket.
For details on CodeBuild with Lambda compute, please refer to the following page.
IAM Role
The IAM role for this project grants permission to store objects (the downloaded videos) in the S3 bucket.
Lambda 1
Function
This simple function returns the ID of a specific YouTube video. It runs on Python 3.12, and the video ID is hard-coded in the function.
IAM Role
Attach the AWS managed policy AWSLambdaBasicExecutionRole to the IAM role. It grants the minimum permissions the function needs to run, namely writing logs to CloudWatch Logs.
Lambda 2
Function
This function retrieves the list of objects in the S3 bucket and returns their keys. It runs on Python 3.12 and uses boto3 to access the bucket.
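One detail worth keeping in mind: list_objects_v2 returns at most 1,000 keys per call, and the response has no Contents key when the bucket is empty. The following is a minimal sketch of a listing helper that handles both cases. The function name list_all_keys is hypothetical, and the client is passed in as an argument so the logic can be exercised without AWS access; in a Lambda function it would be a boto3 S3 client.

```python
def list_all_keys(s3_client, bucket_name: str) -> list:
    """Collect every object key in the bucket, following continuation tokens."""
    keys = []
    kwargs = {"Bucket": bucket_name}
    while True:
        response = s3_client.list_objects_v2(**kwargs)
        # "Contents" is omitted from the response when no objects match.
        keys.extend(obj["Key"] for obj in response.get("Contents", []))
        if not response.get("IsTruncated"):
            return keys
        # Ask for the next page, starting where this one ended.
        kwargs["ContinuationToken"] = response["NextContinuationToken"]
```

For the single video in this article one call is enough, so this only matters once the bucket grows beyond one page of results.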
IAM Role
In addition to the AWS managed policy AWSLambdaBasicExecutionRole, the role is granted permission to list the objects in the S3 bucket.
CloudFormation Template
Step Functions
State Machine
Resources:
  StateMachine:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      Definition:
        Comment: !Sub "${Prefix}-statemachine"
        StartAt: GetVideoIdState
        States:
          # Lambda function 1 returns the video ID, stored at $.video_id.
          GetVideoIdState:
            Type: Task
            Resource: !Ref FunctionArn1
            ResultPath: $.video_id
            Next: DownloadVideoState
          # Start the CodeBuild project and wait for the build to finish (.sync).
          DownloadVideoState:
            Type: Task
            Resource: arn:aws:states:::codebuild:startBuild.sync
            Parameters:
              EnvironmentVariablesOverride:
                - Name: VIDEO_ID
                  Type: PLAINTEXT
                  Value.$: $.video_id
              ProjectName: !Ref CodeBuildProjectName
            ResultPath: $.codebuild_response
            Next: ListBucketState
          # Lambda function 2 lists the objects in the S3 bucket.
          ListBucketState:
            Type: Task
            Resource: !Ref FunctionArn2
            End: true
      LoggingConfiguration:
        Destinations:
          - CloudWatchLogsLogGroup:
              LogGroupArn: !GetAtt StateMachineLogGroup.Arn
        IncludeExecutionData: true
        Level: ALL
      RoleArn: !GetAtt StateMachineRole.Arn
      StateMachineName: !Sub "${Prefix}-statemachine"
      StateMachineType: STANDARD
IAM Role
Resources:
  StateMachineRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - states.amazonaws.com
      Policies:
        - PolicyName: StateMachinePolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - codebuild:StartBuild
                  - codebuild:StopBuild
                  - codebuild:BatchGetBuilds
                Resource:
                  - !Sub "arn:aws:codebuild:${AWS::Region}:${AWS::AccountId}:project/${CodeBuildProjectName}"
              - Effect: Allow
                Action:
                  - events:PutTargets
                  - events:PutRule
                  - events:DescribeRule
                Resource:
                  - !Sub "arn:aws:events:${AWS::Region}:${AWS::AccountId}:rule/StepFunctionsGetEventForCodeBuildStartBuildRule"
              - Effect: Allow
                Action:
                  - lambda:InvokeFunction
                Resource:
                  - !Ref FunctionArn1
                  - !Ref FunctionArn2
              - Effect: Allow
                Action:
                  - logs:CreateLogDelivery
                  - logs:GetLogDelivery
                  - logs:UpdateLogDelivery
                  - logs:DeleteLogDelivery
                  - logs:ListLogDeliveries
                  - logs:PutLogEvents
                  - logs:PutResourcePolicy
                  - logs:DescribeResourcePolicies
                  - logs:DescribeLogGroups
                Resource: "*"
CodeBuild
Build Project
Resources:
  CodeBuildProject:
    Type: AWS::CodeBuild::Project
    Properties:
      Artifacts:
        Type: NO_ARTIFACTS
      Cache:
        Type: NO_CACHE
      Environment:
        ComputeType: !Ref ProjectEnvironmentComputeType
        EnvironmentVariables:
          - Name: BUCKET_NAME
            Type: PLAINTEXT
            Value: !Ref BucketName
          # Overridden by the state machine at build start.
          - Name: VIDEO_ID
            Type: PLAINTEXT
            Value: ""
        Image: !Ref ProjectEnvironmentImage
        ImagePullCredentialsType: CODEBUILD
        Type: !Ref ProjectEnvironmentType
      LogsConfig:
        CloudWatchLogs:
          GroupName: !Ref CodeBuildLogGroup
          Status: ENABLED
      Name: !Sub "${Prefix}-project"
      ServiceRole: !GetAtt CodeBuildRole.Arn
      Source:
        Type: NO_SOURCE
        BuildSpec: !Sub |
          version: 0.2
          phases:
            install:
              commands:
                - pip3 install yt-dlp
            build:
              commands:
                - echo $VIDEO_ID
                - yt-dlp -o /tmp/$VIDEO_ID.mp4 $VIDEO_ID
                - ls -al /tmp
                - aws s3 cp /tmp/$VIDEO_ID.mp4 s3://$BUCKET_NAME/
      Visibility: PRIVATE
IAM Role
Resources:
  CodeBuildRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - codebuild.amazonaws.com
            Action:
              - sts:AssumeRole
      Policies:
        - PolicyName: CodeBuildPolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - s3:PutObject
                Resource:
                  - !Sub "arn:aws:s3:::${BucketName}/*"
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource:
                  - !Sub "arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:${CodeBuildLogGroup}"
                  - !Sub "arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:${CodeBuildLogGroup}:log-stream:*"
Lambda 1
Function
Resources:
  Function1:
    Type: AWS::Lambda::Function
    Properties:
      Architectures:
        - !Ref Architecture
      Code:
        ZipFile: |
          def lambda_handler(event, context):
              return 'TqqaSD2qTdY'
      FunctionName: !Sub "${Prefix}-function-01"
      Handler: !Ref Handler
      Runtime: !Ref Runtime
      Role: !GetAtt FunctionRole1.Arn
IAM Role
Resources:
  FunctionRole1:
    Type: AWS::IAM::Role
    DeletionPolicy: Delete
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - lambda.amazonaws.com
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
Lambda 2
Function
Resources:
  Function2:
    Type: AWS::Lambda::Function
    Properties:
      Architectures:
        - !Ref Architecture
      Environment:
        Variables:
          REGION: !Ref AWS::Region
          BUCKET_NAME: !Ref BucketName
      Code:
        ZipFile: |
          import boto3
          import os

          region = os.environ['REGION']
          bucket_name = os.environ['BUCKET_NAME']
          s3_client = boto3.client('s3', region_name=region)

          def lambda_handler(event, context):
              response = s3_client.list_objects_v2(
                  Bucket=bucket_name
              )
              # 'Contents' is absent from the response when the bucket is empty.
              return [obj['Key'] for obj in response.get('Contents', [])]
      FunctionName: !Sub "${Prefix}-function-02"
      Handler: !Ref Handler
      Runtime: !Ref Runtime
      Role: !GetAtt FunctionRole2.Arn
IAM Role
Resources:
  FunctionRole2:
    Type: AWS::IAM::Role
    DeletionPolicy: Delete
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - lambda.amazonaws.com
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyName: FunctionPolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - s3:ListBucket
                Resource:
                  - !Sub "arn:aws:s3:::${BucketName}"
Full Template
Verification
Start the state machine from the Step Functions console. After a short wait, the execution completes successfully.
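This article starts the execution from the console, but the same can be done from a script. The following is a minimal sketch, assuming a boto3 Step Functions client is passed in; the helper name run_and_wait and the polling interval are hypothetical choices, not part of the original setup.

```python
import json
import time


def run_and_wait(sfn_client, state_machine_arn: str, payload: dict,
                 poll_seconds: float = 5.0) -> dict:
    """Start an execution and poll until it is no longer RUNNING."""
    started = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(payload),
    )
    while True:
        description = sfn_client.describe_execution(
            executionArn=started["executionArn"]
        )
        # Any status other than RUNNING (e.g. SUCCEEDED, FAILED) ends the wait.
        if description["status"] != "RUNNING":
            return description
        time.sleep(poll_seconds)
```

With real credentials this could be called as run_and_wait(boto3.client("stepfunctions"), arn, {}), where arn is the ARN of the state machine created by the template.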
The execution passed through all three states and completed successfully.
Next, check the result of each state in the execution log.
Check the first log.
The first Lambda function has been executed and the ID has been returned.
Check the second log.
Above is a portion of the log; if you look at the BuildStatus value, it says “SUCCEEDED”. This indicates that the build has indeed completed.
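The same check can be sketched in code. The example below assumes the state output nests the build details under codebuild_response (the ResultPath used in this article) and then under a Build key; that nesting is an assumption about the response shape and should be confirmed against your own log output. The function name build_status and the sample payload are made up for illustration.

```python
import json


def build_status(state_output_json: str) -> str:
    """Extract BuildStatus from the JSON output of DownloadVideoState."""
    output = json.loads(state_output_json)
    # Assumed layout: {"codebuild_response": {"Build": {"BuildStatus": ...}}}
    return output["codebuild_response"]["Build"]["BuildStatus"]


if __name__ == "__main__":
    # Hand-written sample, not an actual log entry.
    sample = json.dumps({
        "video_id": "TqqaSD2qTdY",
        "codebuild_response": {"Build": {"BuildStatus": "SUCCEEDED"}},
    })
    print(build_status(sample))
```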
Incidentally, the result of executing the CodeBuild StartBuild API is in the following format.
Check the third log.
In the last state, Lambda function 2 is executed and the list of objects in the S3 bucket is retrieved. The name of the downloaded video file is included in the list, indicating that it was successfully saved to the S3 bucket.
Finally, check the S3 bucket.
A direct check of the S3 bucket shows that the specified video file is stored correctly. This confirms that the entire state machine is working as expected.
Conclusion
We showed how to call CodeBuild from AWS Step Functions to automatically save a YouTube video to an S3 bucket. Lambda functions retrieve the video ID and list the contents of the S3 bucket, while CodeBuild performs the actual download and upload. Please refer to this article when building your own automated flow.