Introduction to Step Functions with CloudFormation
This page covers a topic related to refactoring, which is within the scope of the AWS DVA exam.
Step Functions is a serverless orchestration service.
In this introduction to Step Functions, we will create a simple state machine with CloudFormation.
Environment
We will create a single Step Functions state machine and two tasks inside it.
Within the tasks, we will invoke Lambda functions.
The functions do the following:
- Lambda function 1: Obtains the current date and time information.
- Lambda function 2: Calculates and returns the UNIX time from the date/time information.
The runtime environment for the functions is Python 3.8.
CloudFormation Template Files
The above configuration is built using CloudFormation.
The CloudFormation templates are located at the following URL:
https://github.com/awstut-an-r/awstut-dva/tree/main/04/002
Explanation of key points of the template files
Step Functions
State Machine
Resources:
  StateMachine:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      Definition:
        Comment: !Sub "${Prefix}-StateMachine"
        StartAt: FirstState
        States:
          FirstState:
            Type: Task
            Resource: !Ref Function1Arn
            Next: LastState
          LastState:
            Type: Task
            Resource: !Ref Function2Arn
            End: true
      LoggingConfiguration:
        Destinations:
          - CloudWatchLogsLogGroup:
              LogGroupArn: !GetAtt LogGroup.Arn
        IncludeExecutionData: true
        Level: ALL
      RoleArn: !GetAtt StateMachineRole.Arn
      StateMachineName: !Ref Prefix
      StateMachineType: STANDARD
The Definition property defines the structure of the state machine.
The structure is expressed in a notation called the Amazon States Language.
The Amazon States Language is JSON-based, but in a CloudFormation template the definition can be written in JSON or YAML format, and the latter is used in this case.
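For comparison, the same definition can be rendered as JSON. The following is a minimal sketch that builds the definition as a Python dictionary and serializes it. The two ARNs and the comment string are placeholder values standing in for the Function1Arn, Function2Arn, and Prefix parameters; they are assumptions for illustration, not real resources.

```python
import json

# Placeholder ARNs (assumptions for illustration only).
FUNCTION1_ARN = "arn:aws:lambda:ap-northeast-1:123456789012:function:dva-04-002-function-01"
FUNCTION2_ARN = "arn:aws:lambda:ap-northeast-1:123456789012:function:dva-04-002-function-02"

# The same structure as the YAML Definition property, as a Python dict.
definition = {
    "Comment": "dva-04-002-StateMachine",
    "StartAt": "FirstState",
    "States": {
        "FirstState": {
            "Type": "Task",
            "Resource": FUNCTION1_ARN,
            "Next": "LastState",
        },
        "LastState": {
            "Type": "Task",
            "Resource": FUNCTION2_ARN,
            "End": True,
        },
    },
}

# Serialize to the JSON form of the Amazon States Language.
asl_json = json.dumps(definition, indent=2)
print(asl_json)
```

Note that the YAML value true becomes the JSON literal true, and the field names are identical in both formats.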
The state machine structure is composed of states.
A state defines the action to be executed.
In this configuration, one action, executing a Lambda function, is defined in one state.
Since two Lambda functions will be invoked, this also means that two states will be created.
In the state machine structure, two fields are required.
The first is the StartAt field.
This field names the state from which execution starts.
In this case, execution starts from the state that runs Lambda function 1, which obtains the current date and time.
The second is the States field.
This field defines the states that make up the state machine.
Let's look at the fields that make up each state.
The Type field specifies the type of state. If you want to perform some action such as executing a Lambda function as in this case, specify “Task”.
The Resource field specifies the ARN of the resource to be executed. In this case, specify the Lambda function to be invoked.
The Next field specifies the next state to be executed. In this case, function 2 will be executed after function 1, so this field is defined for the state of function 1, and the state for function 2 is specified as the value.
The End field indicates the last state to be executed. In this case, the state for function 2 will be executed last, so define this field for this state and specify “true” as the value.
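To make the roles of StartAt, Next, and End concrete, the following is a toy interpreter written in Python. It is only a sketch of the control flow described above, not how Step Functions is actually implemented. The resource names "function-1" and "function-2" and the handlers are hypothetical local stand-ins for the Lambda functions.

```python
def run_state_machine(definition, handlers, execution_input):
    """Walk Task states from StartAt until a state with End: true is reached."""
    state_name = definition["StartAt"]
    data = execution_input
    visited = []
    while True:
        state = definition["States"][state_name]
        if state["Type"] != "Task":
            raise ValueError(f"unsupported state type: {state['Type']}")
        # The output of one Task becomes the input of the next state.
        data = handlers[state["Resource"]](data)
        visited.append(state_name)
        if state.get("End"):
            return visited, data
        state_name = state["Next"]


definition = {
    "StartAt": "FirstState",
    "States": {
        "FirstState": {"Type": "Task", "Resource": "function-1", "Next": "LastState"},
        "LastState": {"Type": "Task", "Resource": "function-2", "End": True},
    },
}

# Hypothetical stand-ins for the two Lambda functions.
handlers = {
    "function-1": lambda event: {"value": 1},
    "function-2": lambda event: event["value"] + 1,
}

visited, output = run_state_machine(definition, handlers, {})
print(visited, output)
```

The walk starts at the state named by StartAt, follows Next, and stops at the state marked with End, which mirrors the order in which the two Lambda functions are invoked.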
The LoggingConfiguration property allows you to configure settings related to logging during state machine execution.
Logs can be stored in CloudWatch Logs.
Specify the ARN for the log group described below.
Specify “true” for the IncludeExecutionData property and “ALL” for the Level property to collect all logs.
Specify the type of state machine in the StateMachineType property.
There are two types: Standard and Express.
We select Standard this time; for details on the differences, please refer to the following official page.
https://docs.aws.amazon.com/step-functions/latest/dg/concepts-standard-vs-express.html
IAM Role
Resources:
  StateMachineRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - states.amazonaws.com
      Policies:
        - PolicyName: !Sub "${Prefix}-InvokeTaskFunctions"
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - lambda:InvokeFunction
                Resource:
                  - !Ref Function1Arn
                  - !Ref Function2Arn
        - PolicyName: !Sub "${Prefix}-DeliverToCloudWatchLogPolicy"
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - logs:CreateLogDelivery
                  - logs:GetLogDelivery
                  - logs:UpdateLogDelivery
                  - logs:DeleteLogDelivery
                  - logs:ListLogDeliveries
                  - logs:PutLogEvents
                  - logs:PutResourcePolicy
                  - logs:DescribeResourcePolicies
                  - logs:DescribeLogGroups
                Resource: "*"
This IAM role grants the permissions the state machine needs to run.
It contains two main sets of permissions.
The first is the permission required to invoke the Lambda functions.
The second is the permission to deliver logs to CloudWatch Logs.
For the latter, we referred to the following official page:
https://docs.aws.amazon.com/step-functions/latest/dg/cw-logs.html#cloudwatch-iam-policy
Log Group
Resources:
  LogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: !Sub "${Prefix}-StateMachineLogGroup"
No special configuration is required.
Simply create a log group.
Lambda Functions
Each Lambda function is created from code written inline in the template.
For details, please refer to the following page.
This page will focus on the code to be executed.
Function 1
Resources:
  Function1:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        ZipFile: |
          import datetime
          def lambda_handler(event, context):
              now = datetime.datetime.now()
              return {
                  'year': now.year,
                  'month': now.month,
                  'day': now.day,
                  'hour': now.hour,
                  'minute': now.minute,
                  'second': now.second,
                  'microsecond': now.microsecond
              }
      FunctionName: !Sub "${Prefix}-function-01"
      Handler: !Ref Handler
      Runtime: !Ref Runtime
      Role: !GetAtt FunctionRole.Arn
This function is used to obtain the current date and time.
After obtaining the current date and time, this function extracts 7 elements from the date and time object and returns them.
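The handler can also be tried locally before deploying. The following sketch copies the handler body from the template and calls it with an empty event and no context, which is an assumption about how a local test might look rather than how Lambda invokes it. Note that datetime.datetime.now() returns a naive value in the local time zone, which in the Lambda runtime defaults to UTC.

```python
import datetime


def lambda_handler(event, context):
    # Same logic as function 1: split the current date and time into 7 fields.
    now = datetime.datetime.now()
    return {
        'year': now.year,
        'month': now.month,
        'day': now.day,
        'hour': now.hour,
        'minute': now.minute,
        'second': now.second,
        'microsecond': now.microsecond
    }


# Local call with an empty event; the context object is not used by the handler.
result = lambda_handler({}, None)
print(sorted(result))
```

Because every value is a plain integer, the returned dictionary can be serialized to JSON and passed to the next state as is.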
Function 2
Resources:
  Function2:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        ZipFile: |
          import datetime
          import pprint
          import time
          def lambda_handler(event, context):
              pprint.pprint(event)
              year = event['year']
              month = event['month']
              day = event['day']
              hour = event['hour']
              minute = event['minute']
              second = event['second']
              microsecond = event['microsecond']
              dt = datetime.datetime(year, month, day, hour, minute, second, microsecond)
              epoch_time = int(time.mktime(dt.timetuple()))
              print(epoch_time)
              return epoch_time
      FunctionName: !Sub "${Prefix}-function-02"
      Handler: !Ref Handler
      Runtime: !Ref Runtime
      Role: !GetAtt FunctionRole.Arn
This function calculates the UNIX time from date/time information.
In Python, the arguments passed when the function is invoked are available in the event object.
In this case, the function retrieves the seven values returned by function 1.
After retrieving them, it creates a datetime object, calculates the UNIX time from it, and returns the result.
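One point worth noting is that time.mktime interprets the fields as local time, so the result depends on the time zone of the machine running the code. The sketch below puts the conversion used in function 2 next to a variant based on calendar.timegm, which always treats the fields as UTC. The helper names and the sample event are hypothetical and are used only for illustration.

```python
import calendar
import datetime
import time


def _to_datetime(event):
    """Rebuild a naive datetime from the seven fields returned by function 1."""
    return datetime.datetime(
        event['year'], event['month'], event['day'],
        event['hour'], event['minute'], event['second'],
        event['microsecond'],
    )


def to_epoch_local(event):
    """Same conversion as function 2: the fields are read as local time."""
    return int(time.mktime(_to_datetime(event).timetuple()))


def to_epoch_utc(event):
    """Variant that always reads the fields as UTC."""
    return calendar.timegm(_to_datetime(event).timetuple())


# Hypothetical sample event: 2023-01-01 00:00:00.
sample_event = {
    'year': 2023, 'month': 1, 'day': 1,
    'hour': 0, 'minute': 0, 'second': 0, 'microsecond': 0,
}

print(to_epoch_utc(sample_event))    # 1672531200
print(to_epoch_local(sample_event))  # depends on the local time zone
```

When the local time zone is UTC, both functions return the same value, so the two approaches should agree in an environment where the time zone is left at UTC.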
Architecting
Using CloudFormation, we will build this environment and check the actual behavior.
Create CloudFormation stacks and check resources in stacks
Create a CloudFormation stack.
For information on how to create stacks and check each stack, please refer to the following page
After checking the resources in each stack, the main resources created this time are as follows:
- Step Functions state machine: dva-04-002
- Lambda function 1: dva-04-002-function-01
- Lambda function 2: dva-04-002-function-02
- CloudWatch Logs log group: dva-04-002-StateMachineLogGroup
Check the state machine creation status from the AWS Management Console.
It has been created successfully.
Next, check the Lambda functions.
They have also been created successfully.
Operation Check
Now that everything is ready, we can execute the state machine.
To execute from the AWS Management Console, click “Start execution”.
A page for setting execution options will be displayed.
Leave Input empty and press “Start execution”.
The state machine starts.
The Execution Status is “Running”.
This means that the state machine is running.
Wait a little longer.
The Execution Status is “Succeeded”.
The state machine has completed.
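The same start-and-wait flow can be scripted instead of using the console. The following is a minimal sketch, assuming a client object that exposes start_execution and describe_execution in the way boto3's Step Functions client does. To keep the example runnable without AWS credentials, it uses a hypothetical FakeStepFunctionsClient; the state machine ARN and the output value are placeholders as well.

```python
import json
import time


def start_and_wait(sfn_client, state_machine_arn, payload=None, poll_seconds=2.0):
    """Start an execution and poll until it leaves the RUNNING status."""
    started = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(payload or {}),  # an empty JSON object as input
    )
    while True:
        described = sfn_client.describe_execution(
            executionArn=started["executionArn"]
        )
        if described["status"] != "RUNNING":
            return described
        time.sleep(poll_seconds)


class FakeStepFunctionsClient:
    """Hypothetical offline stand-in for boto3.client("stepfunctions")."""

    def __init__(self):
        self._polls = 0

    def start_execution(self, stateMachineArn, input):
        arn = stateMachineArn.replace(":stateMachine:", ":execution:") + ":demo"
        return {"executionArn": arn}

    def describe_execution(self, executionArn):
        # Report RUNNING on the first poll and SUCCEEDED afterwards.
        self._polls += 1
        if self._polls < 2:
            return {"executionArn": executionArn, "status": "RUNNING"}
        return {"executionArn": executionArn, "status": "SUCCEEDED",
                "output": "1672531200"}


# Placeholder ARN (assumption for illustration only).
STATE_MACHINE_ARN = "arn:aws:states:ap-northeast-1:123456789012:stateMachine:dva-04-002"

result = start_and_wait(FakeStepFunctionsClient(), STATE_MACHINE_ARN, poll_seconds=0)
print(result["status"])
```

Against a real account, the fake client would be replaced with boto3.client("stepfunctions"), and the status would move from RUNNING to SUCCEEDED just as it does in the console.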
You can see the execution log.
If we look at the Step column, we can see that the state machine passed through two states from start to end.
These are the states of Lambda functions 1 and 2, respectively.
This means that the two functions have been successfully executed.
Let’s review the detailed log of the state machine execution.
We start with the logs for the state machine start and the invocation of function 1.
In the ExecutionStarted log, we see that the input is empty.
This is because this parameter was left empty in the state machine execution options.
The LambdaFunctionScheduled log shows that the resource field specifies Lambda function 1 as the function to invoke.
Also, since the state machine input was empty, the input passed as the argument when invoking the function was also empty.
The TaskStateExited log shows that the output is set to the result returned by function 1.
We can see the seven pieces of data that make up the current date and time.
Next, we check the logs for the invocation of function 2 and the end of the state machine.
Looking at the TaskStateEntered log, we can see the result returned by function 1 in the input.
The LambdaFunctionScheduled log shows that the resource field specifies Lambda function 2 as the function to invoke.
Also, since the state input is set, it is passed as the argument when the function is invoked.
The TaskStateExited log shows that the output is set to the result returned by function 2.
This is the UNIX time corresponding to the date and time passed from function 1, i.e., the current date and time.
The ExecutionSucceeded log shows that the output is also set to the result of function 2.
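The flow of data through these events can be traced with a short script. The sketch below assumes events shaped like the entries returned by the Step Functions GetExecutionHistory API (a type field plus a details object per event type). The sample events are hand-written for illustration and are not the actual log output of this execution.

```python
def summarize_state_events(events):
    """Collect (state name, direction, payload) for state entry and exit events."""
    summary = []
    for event in events:
        if event["type"] == "TaskStateEntered":
            details = event["stateEnteredEventDetails"]
            summary.append((details["name"], "input", details.get("input")))
        elif event["type"] == "TaskStateExited":
            details = event["stateExitedEventDetails"]
            summary.append((details["name"], "output", details.get("output")))
    return summary


# Hand-written sample events (assumed values, not real log output).
sample_events = [
    {"type": "ExecutionStarted",
     "executionStartedEventDetails": {"input": "{}"}},
    {"type": "TaskStateEntered",
     "stateEnteredEventDetails": {"name": "FirstState", "input": "{}"}},
    {"type": "TaskStateExited",
     "stateExitedEventDetails": {"name": "FirstState", "output": '{"year": 2023}'}},
    {"type": "TaskStateEntered",
     "stateEnteredEventDetails": {"name": "LastState", "input": '{"year": 2023}'}},
    {"type": "TaskStateExited",
     "stateExitedEventDetails": {"name": "LastState", "output": "1672531200"}},
    {"type": "ExecutionSucceeded",
     "executionSucceededEventDetails": {"output": "1672531200"}},
]

summary = summarize_state_events(sample_events)
for name, direction, payload in summary:
    print(name, direction, payload)
```

The output of FirstState appears again as the input of LastState, which is the hand-off between the two functions described above.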
Finally, we also check the logs delivered to the CloudWatch Logs log group.
A log stream has been created.
We can see that the logs we just checked are also delivered to the CloudWatch Logs log group.
Summary
As an introduction to Step Functions, we have built a simple Step Functions configuration using CloudFormation.