Accessing DynamoDB Accelerator (DAX) with EC2/Lambda
This is one of the AWS SAA topics related to designing a high-performance architecture.
DynamoDB Accelerator (DAX) is a caching service for DynamoDB.
Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for Amazon DynamoDB that delivers up to a 10 times performance improvement—from milliseconds to microseconds—even at millions of requests per second.
Amazon DynamoDB Accelerator (DAX)
The goal here is to create a DAX cluster and access it from an EC2 instance and a Lambda function.
Environment
Create a DynamoDB table outside the VPC.
Set up a partition key, sort key, etc. so that items for verification can be stored.
Deploy a DAX cluster on private subnets.
Place two nodes in the cluster.
A primary node and a read replica node.
Create two resources in the private subnet as clients to access the DAX cluster.
The first is an EC2 instance.
It will use the official AWS package to connect to the DAX cluster.
To install it, we access the Internet via a NAT gateway.
The second is a Lambda function.
This also uses a dedicated package to connect to the DAX cluster, so it is prepared in the form of a Lambda layer.
The runtime environment for the function is Python 3.8.
CloudFormation template files
The above configuration is built using CloudFormation.
The CloudFormation templates are located at the following URL
https://github.com/awstut-an-r/awstut-saa/tree/main/02/009
Explanation of key points of the template files
DAX
To create a DAX cluster, the following four resources are needed:
- DAX cluster
- Parameter Group
- Subnet group
- IAM Roles
DAX Cluster
Resources:
  DAXCluster:
    Type: AWS::DAX::Cluster
    Properties:
      AvailabilityZones:
        - !Sub "${AWS::Region}${AvailabilityZone1}"
        - !Sub "${AWS::Region}${AvailabilityZone2}"
      ClusterEndpointEncryptionType: NONE
      ClusterName: !Sub "${Prefix}-Cluster"
      Description: Test DAX Cluster
      IAMRoleARN: !GetAtt DAXRole.Arn
      NodeType: !Ref DAXNodeType
      ParameterGroupName: !Ref DAXParameterGroup
      ReplicationFactor: 2
      SecurityGroupIds:
        - !Ref DAXSecurityGroup
      SubnetGroupName: !Ref DAXSubnetGroup
Code language: YAML (yaml)
In the AvailabilityZones property, specify the AZs where the cluster nodes will be deployed.
Regarding these AZs, the official AWS documentation states the following:
For production usage, we strongly recommend using DAX with at least three nodes, where each node is placed in different Availability Zones. Three nodes are required for a DAX cluster to be fault-tolerant.
DAX cluster components
Since this setup is only for verification, we deploy the cluster across two AZs.
The NodeType property allows you to set the instance type of the nodes in the cluster.
In this case, we will specify “dax.t3.small,” which is the smallest size.
The ReplicationFactor property sets the number of nodes in the cluster.
This time, “2” is specified since two nodes will be created in the cluster.
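The sizing guidance quoted above can be written down as a small check. The following is a minimal sketch: the function name `check_dax_layout`, its return format, and the AZ names in the calls are illustrative assumptions and not part of any AWS API.

```python
def check_dax_layout(replication_factor, availability_zones):
    """Return a list of warnings for a planned DAX cluster layout.

    Hypothetical helper: it only encodes the guidance quoted above
    (at least three nodes, each in a different Availability Zone).
    """
    warnings = []
    if replication_factor < 3:
        warnings.append("fewer than three nodes: not fault-tolerant per AWS guidance")
    if len(set(availability_zones)) < replication_factor:
        warnings.append("some nodes would share an Availability Zone")
    return warnings


# The verification setup in this article: two nodes in two AZs (example AZ names).
print(check_dax_layout(2, ["ap-northeast-1a", "ap-northeast-1c"]))
# A production-style layout: three nodes in three AZs (example AZ names).
print(check_dax_layout(3, ["ap-northeast-1a", "ap-northeast-1c", "ap-northeast-1d"]))
```

The first call returns one warning because the cluster has fewer than three nodes, which is acceptable for this verification but not for production.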
Specify the security group to be applied to the DAX cluster in the SecurityGroupIds property.
Regarding the security group to be applied to the cluster, the official AWS documentation states the following:
When you launch a cluster in your VPC, you add an ingress rule to your security group to allow incoming network traffic. The ingress rule specifies the protocol (TCP) and port number (8111) for your cluster.
DAX cluster components
So this time we will apply the following security group
Resources:
  DAXSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: !Sub "${Prefix}-DAXSecurityGroup"
      GroupDescription: Allow DAX.
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: !Ref DAXPort
          ToPort: !Ref DAXPort
          SourceSecurityGroupId: !Ref InstanceSecurityGroup
        - IpProtocol: tcp
          FromPort: !Ref DAXPort
          ToPort: !Ref DAXPort
          SourceSecurityGroupId: !Ref FunctionSecurityGroup
Code language: YAML (yaml)
This security group allows inbound 8111/tcp traffic from the security groups of the EC2 instance and the Lambda function.
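One way to confirm from a client that the ingress rule works is to try opening a TCP connection to port 8111. This is a minimal sketch using only the Python standard library; the helper name `can_connect` is hypothetical, and the check only proves network reachability, not that the DAX client itself can authenticate.

```python
import socket


def can_connect(host, port=8111, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Usage (run from the EC2 instance, passing the host part of the cluster endpoint):
#   can_connect("<cluster endpoint host>", 8111)
```

If this returns False from the instance, the security group rules and the subnet routing are the first things to check.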
Parameter Groups
The following is the official AWS description of parameter groups.
A named set of parameters that are applied to all of the nodes in a DAX cluster.
AWS::DAX::ParameterGroup
Resources:
  DAXParameterGroup:
    Type: AWS::DAX::ParameterGroup
    Properties:
      Description: Test DAX Parameter Group
      ParameterGroupName: !Sub "${Prefix}-ParameterGroup"
      ParameterNameValues:
        query-ttl-millis: 75000
        record-ttl-millis: 88000
Code language: YAML (yaml)
In this case, we used the above page as a reference for setting up the system.
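The two parameters set cache lifetimes in milliseconds: record-ttl-millis for cached items and query-ttl-millis for cached query results. As a rough illustration of what a TTL of 88000 ms means, here is a toy read-through cache. It is a simplified model under stated assumptions, not how DAX is implemented; the class name `TtlCache` and the loader function are hypothetical.

```python
import time


class TtlCache:
    """Toy read-through cache with a per-entry TTL given in milliseconds."""

    def __init__(self, ttl_millis, loader, clock=time.monotonic):
        self._ttl = ttl_millis / 1000.0
        self._loader = loader      # called on a miss, e.g. a database read
        self._clock = clock        # injectable so the example runs instantly
        self._entries = {}         # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]                      # still fresh: serve from cache
        value = self._loader(key)              # expired or missing: reload
        self._entries[key] = (value, now + self._ttl)
        return value


# Simulated clock and backing store, so no real waiting is needed.
fake_now = [0.0]
loads = []


def load(key):
    loads.append(key)
    return "value-for-{}".format(key)


cache = TtlCache(88000, load, clock=lambda: fake_now[0])
cache.get("item")          # miss: loaded from the backing store
cache.get("item")          # hit: served from the cache
fake_now[0] = 89.0         # 89 s later, past the 88000 ms TTL
cache.get("item")          # expired: loaded again
print(len(loads))          # prints 2: two backing-store reads in total
```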
Subnet Group
The subnet group is a resource that specifies the subnets where the cluster will be deployed.
Resources:
  DAXSubnetGroup:
    Type: AWS::DAX::SubnetGroup
    Properties:
      Description: Test DAX Subnet Group
      SubnetGroupName: !Sub "${Prefix}SubnetGroup"
      SubnetIds:
        - !Ref DAXSubnet1
        - !Ref DAXSubnet2
Code language: YAML (yaml)
This time, two private subnets are specified.
IAM Role
When a client such as an EC2 instance or Lambda function accesses DAX, DAX internally queries DynamoDB and returns the results to the client.
In other words, it is necessary to authorize DAX to access DynamoDB.
Authorization to DAX is done by associating an IAM role with the cluster.
Resources:
  DAXRole:
    Type: AWS::IAM::Role
    DeletionPolicy: Delete
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - dax.amazonaws.com
      Policies:
        - PolicyName: !Sub "${Prefix}-DAXPolicy"
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - dynamodb:DescribeTable
                  - dynamodb:PutItem
                  - dynamodb:GetItem
                  - dynamodb:UpdateItem
                  - dynamodb:DeleteItem
                  - dynamodb:Query
                  - dynamodb:Scan
                  - dynamodb:BatchGetItem
                  - dynamodb:BatchWriteItem
                  - dynamodb:ConditionCheckItem
                Resource:
                  - !Ref DynamoDBTableArn
Code language: YAML (yaml)
Refer to the official AWS page for the configuration.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DAX.access-control.html
This policy grants DAX the permissions it needs to read from and write to the DynamoDB table.
DynamoDB
Resources:
  Table:
    Type: AWS::DynamoDB::Table
    Properties:
      AttributeDefinitions:
        - AttributeName: partition_key
          AttributeType: N
        - AttributeName: sort_key
          AttributeType: N
      BillingMode: PROVISIONED
      KeySchema:
        - AttributeName: partition_key
          KeyType: HASH
        - AttributeName: sort_key
          KeyType: RANGE
      ProvisionedThroughput:
        ReadCapacityUnits: !Ref ReadCapacityUnits
        WriteCapacityUnits: !Ref WriteCapacityUnits
      TableClass: STANDARD
      TableName: TryDaxTable
Code language: YAML (yaml)
Since this is for DAX verification, we have reproduced the table structure shown in the following official AWS page.
The attribute named “partition_key” is the partition key, and the attribute named “sort_key” is the sort key.
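Because both key attributes are declared with AttributeType N, the low-level API expects them as strings under the "N" type tag. The following minimal sketch builds such a key; the helper name `make_key` is hypothetical and is used here only for illustration.

```python
def make_key(partition_key, sort_key):
    """Build a low-level DynamoDB key for the table defined above.

    Both attributes are declared with AttributeType N, and the low-level
    API expects numbers to be sent as strings under the "N" type tag.
    """
    return {
        "partition_key": {"N": str(partition_key)},
        "sort_key": {"N": str(sort_key)},
    }


print(make_key(1, 10))
```

A dictionary of this shape is what the `Key` parameter of `get_item` receives in the sample programs later in this article.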
EC2 instance
Resources:
  Instance:
    Type: AWS::EC2::Instance
    Properties:
      IamInstanceProfile: !Ref InstanceProfile
      ImageId: !Ref ImageId
      InstanceType: !Ref InstanceType
      NetworkInterfaces:
        - DeviceIndex: 0
          SubnetId: !Ref InstanceSubnet
          GroupSet:
            - !Ref InstanceSecurityGroup
      UserData: !Base64 |
        #!/bin/bash -xe
        pip3 install amazon-dax-client
        wget http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/samples/TryDax.zip
        unzip TryDax.zip
Code language: YAML (yaml)
You can use a dedicated client to access DAX.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DAX.client.html
We use the official AWS documentation as a reference.
Specifically, the user data installs the client and downloads the sample programs.
For more information on how to initialize an EC2 instance using user data, please refer to the following page
Next, check the IAM role for the instance.
Resources:
  InstanceRole:
    Type: AWS::IAM::Role
    DeletionPolicy: Delete
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - ec2.amazonaws.com
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
      Policies:
        - PolicyName: DynamoDBAndDAXFullAccessPolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - "dax:*"
                Resource:
                  - !Ref DAXClusterArn
              - Effect: Allow
                Action:
                  - "dynamodb:*"
                Resource:
                  - !Ref DynamoDBTableArn
Code language: YAML (yaml)
Define IAM roles with reference to the following official AWS page.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DAX.client.create-user-policy.html
This role allows the instance to perform all actions on the DAX cluster and the DynamoDB table mentioned above.
Lambda Function
Resources:
  Function2:
    Type: AWS::Lambda::Function
    Properties:
      Architectures:
        - !Ref Architecture
      Code:
        ZipFile: |
          import amazondax
          import json
          import os
          dax_endpoint_url = os.environ['DAX_ENDPOINT_URL']
          dynamodb_table = os.environ['DYNAMODB_TABLE']
          region = os.environ['REGION']
          def lambda_handler(event, context):
              dax_client = amazondax.AmazonDaxClient(
                  endpoint_url=dax_endpoint_url,
                  region_name=region
              )
              partion_value = '1'
              result = dax_client.query(
                  TableName=dynamodb_table,
                  ExpressionAttributeNames={
                      '#name0': 'partition_key'
                  },
                  ExpressionAttributeValues={
                      ':value0': {'N': partion_value}
                  },
                  KeyConditionExpression='#name0 = :value0'
              )
              print(result)
              return {
                  'statusCode': 200,
                  'body': json.dumps(result, indent=2)
              }
      Environment:
        Variables:
          DAX_ENDPOINT_URL: !Ref DAXClusterDiscoveryEndpointURL
          DYNAMODB_TABLE: !Ref DynamoDBTable
          REGION: !Ref AWS::Region
      FunctionName: !Sub "${Prefix}-function2"
      Handler: !Ref Handler
      Layers:
        - !Ref LambdaLayer
      Runtime: !Ref Runtime
      Role: !GetAtt FunctionRole2.Arn
      VpcConfig:
        SecurityGroupIds:
          - !Ref FunctionSecurityGroup
        SubnetIds:
          - !Ref FunctionSubnet
  FunctionRole2:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - lambda.amazonaws.com
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole
      Policies:
        - PolicyName: DAXFullAccessPolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - "dax:*"
                Resource:
                  - !Ref DAXClusterArn
Code language: YAML (yaml)
The function also uses the Python DAX client to access DAX.
The following page was used as a reference for how to use the DAX client.
https://pypi.org/project/amazon-dax-client/
The function runs a query that retrieves the items whose partition key is “1”.
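The query result comes back in the low-level format, with each attribute wrapped in a type tag such as "N" or "S". If plain Python values are easier to work with, a small conversion step can help. This is a minimal sketch: the helper name `unwrap_item` is hypothetical, it handles only the two type tags used in this article, and the sample response is hand-written rather than real output.

```python
def unwrap_item(item):
    """Convert one low-level item into a plain dict.

    Handles only the N and S type tags used in this article; other tags
    are returned unchanged.
    """
    plain = {}
    for name, typed in item.items():
        if "N" in typed:
            text = typed["N"]
            plain[name] = int(text) if text.lstrip("-").isdigit() else float(text)
        elif "S" in typed:
            plain[name] = typed["S"]
        else:
            plain[name] = typed
    return plain


# Hand-written sample shaped like a Query response; not real output.
sample = {
    "Items": [
        {"partition_key": {"N": "1"}, "sort_key": {"N": "1"}, "someData": {"S": "XXX"}},
        {"partition_key": {"N": "1"}, "sort_key": {"N": "2"}, "someData": {"S": "XXX"}},
    ],
    "Count": 2,
}
print([unwrap_item(item) for item in sample["Items"]])
```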
(Reference) Lambda Layer
As in the case of EC2 instances, a DAX client (amazon-dax-client) is also used for Lambda functions.
By creating a Lambda layer and including the DAX client in it, the Lambda function will be able to import the client module.
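For Python runtimes, the libraries in a layer need to sit under a top-level python/ folder inside the ZIP file so that the function can import them. The following is a minimal sketch of packaging a directory that way by hand. The helper name `build_layer_zip` is hypothetical, and the demonstration uses a placeholder package directory instead of a real `pip3 install amazon-dax-client -t <dir>` result.

```python
import os
import tempfile
import zipfile


def build_layer_zip(package_dir, zip_path):
    """Zip package_dir so that its files sit under a top-level python/ folder."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                full = os.path.join(root, name)
                relative = os.path.relpath(full, package_dir)
                archive.write(full, os.path.join("python", relative))
    return zip_path


# Demonstration with a stand-in package directory instead of a real pip install.
with tempfile.TemporaryDirectory() as work:
    packages = os.path.join(work, "packages")
    os.makedirs(os.path.join(packages, "amazondax"))
    with open(os.path.join(packages, "amazondax", "__init__.py"), "w") as handle:
        handle.write("# placeholder\n")
    archive_path = build_layer_zip(packages, os.path.join(work, "layer.zip"))
    with zipfile.ZipFile(archive_path) as archive:
        names = archive.namelist()
print(names)
```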
In this case, we will use a CloudFormation custom resource to automatically perform this Lambda layer creation.
For more information, please see the following page
Architecting
Use CloudFormation to build this environment and check its actual behavior.
Create CloudFormation stacks and check resources in stacks
Create CloudFormation stacks.
For information on how to create stacks and check each stack, please refer to the following page
After checking the resources in each stack, the main resources created this time are as follows:
- DynamoDB: TryDaxTable
- DAX cluster: saa-02-009-cluster
- DAX endpoint: dax://saa-02-009-cluster.ryxnym.dax-clusters.ap-northeast-1.amazonaws.com
- EC2 instance: i-098244d885d0d7b13
- Lambda function: saa-02-009-function2
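The DAX endpoint is a URL with the dax:// scheme and no explicit port. The following minimal sketch splits it into its parts with the standard library. The helper name `split_dax_endpoint` is hypothetical, and falling back to port 8111 when none is given is an assumption based on the port used in the security group above.

```python
from urllib.parse import urlparse


def split_dax_endpoint(endpoint, default_port=8111):
    """Split a dax:// endpoint URL into (scheme, host, port)."""
    parsed = urlparse(endpoint)
    return parsed.scheme, parsed.hostname, parsed.port or default_port


print(split_dax_endpoint(
    "dax://saa-02-009-cluster.ryxnym.dax-clusters.ap-northeast-1.amazonaws.com"
))
```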
Check each resource from the AWS Management Console.
First, check DynamoDB.
The table has been successfully created.
We can see that both the partition key and sort key have been set correctly.
Check DAX.
The DAX cluster is successfully created.
We can also see that two nodes have been created in the cluster.
The subnet group and parameter group are also successfully created.
Checking Action
Now that everything is ready, let's check the actual behavior.
Accessing DAX from EC2 instance
Connect to the instance using SSM Session Manager.
% aws ssm start-session --target i-098244d885d0d7b13
...
sh-4.2$
Code language: Bash (bash)
For more information on SSM Session Manager, please visit
First, check that the user data ran as expected.
Start with the installation status of the DAX client.
[ssm-user@ip-10-0-1-51 /]$ sudo pip3 list installed
---------------------- -------
amazon-dax-client 2.0.3
antlr4-python3-runtime 4.9.3
botocore 1.29.32
jmespath 1.0.1
pip 21.3.1
python-dateutil 2.8.2
setuptools 57.0.0
six 1.16.0
urllib3 1.26.13
websockets 10.1
wheel 0.36.2
Code language: plaintext (plaintext)
Sure enough, the client (amazon-dax-client) is installed.
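The same check can be made from Python itself. This is a minimal sketch that assumes Python 3.8 or later, where `importlib.metadata` is part of the standard library; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Prints the version if the DAX client is installed, otherwise None.
print(installed_version("amazon-dax-client"))
```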
Next, check the download status of the sample programs.
[ssm-user@ip-10-0-1-51 /]$ ls -l
total 40
drwxr-xr-x 6 root root 60 Dec 19 12:20 TryDax
-rw-r--r-- 1 root root 20931 Dec 19 05:03 TryDax.zip
...
Code language: plaintext (plaintext)
Indeed, the ZIP file has been downloaded and extracted.
02-write-data.py
Execute the code on the following page
This script stores sample data in the DynamoDB table.
Modify part of the code as follows:
#
# Copyright 2010-2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# This file is licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://aws.amazon.com/apache2.0/
#
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
# CONDITIONS OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.
#
#!/usr/bin/env python3
from __future__ import print_function
import os
import amazondax
import botocore.session
#region = os.environ.get('AWS_DEFAULT_REGION', 'us-west-2')
region = os.environ.get('AWS_DEFAULT_REGION', 'ap-northeast-1')
session = botocore.session.get_session()
dynamodb = session.create_client('dynamodb', region_name=region) # low-level client
table_name = "TryDaxTable"
some_data = 'X' * 1000
pk_max = 10
sk_max = 10
for ipk in range(1, pk_max+1):
    for isk in range(1, sk_max+1):
        params = {
            'TableName': table_name,
            'Item': {
                "partition_key": {'N': str(ipk)},
                "sort_key": {'N': str(isk)},
                #"pk": {'N': str(ipk)},
                #"sk": {'N': str(isk)},
                "someData": {'S': some_data}
            }
        }
        dynamodb.put_item(**params)
        print("PutItem ({}, {}) suceeded".format(ipk, isk))
Code language: Python (python)
Run the modified program.
[ssm-user@ip-10-0-1-150 python]$ python3 02-write-data.py
PutItem (1, 1) suceeded
PutItem (1, 2) suceeded
PutItem (1, 3) suceeded
PutItem (1, 4) suceeded
PutItem (1, 5) suceeded
PutItem (1, 6) suceeded
PutItem (1, 7) suceeded
PutItem (1, 8) suceeded
PutItem (1, 9) suceeded
PutItem (1, 10) suceeded
...
Code language: Bash (bash)
Sample data has been saved.
We also check the stored items on the DynamoDB table side.
Indeed, the data is saved.
03-getitem-test.py
Execute the code on the following page
This script compares retrieving items directly from DynamoDB with retrieving them through DAX.
The sample program is partially modified as follows:
#
# Copyright 2010-2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# This file is licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://aws.amazon.com/apache2.0/
#
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
# CONDITIONS OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.
#
#!/usr/bin/env python
from __future__ import print_function
import os, sys, time
import amazondax
import botocore.session
#region = os.environ.get('AWS_DEFAULT_REGION', 'us-west-2')
region = os.environ.get('AWS_DEFAULT_REGION', 'ap-northeast-1')
session = botocore.session.get_session()
dynamodb = session.create_client('dynamodb', region_name=region) # low-level client
table_name = "TryDaxTable"
if len(sys.argv) > 1:
    endpoint = sys.argv[1]
    dax = amazondax.AmazonDaxClient(session, region_name=region, endpoints=[endpoint])
    client = dax
else:
    client = dynamodb
pk = 10
sk = 10
iterations = 50
start = time.time()
for i in range(iterations):
    for ipk in range(1, pk+1):
        for isk in range(1, sk+1):
            params = {
                'TableName': table_name,
                'Key': {
                    #"pk": {'N': str(ipk)},
                    #"sk": {'N': str(isk)}
                    "partition_key": {'N': str(ipk)},
                    "sort_key": {'N': str(isk)}
                }
            }
            result = client.get_item(**params)
            print('.', end='', file=sys.stdout); sys.stdout.flush()
print()
end = time.time()
print('Total time: {} sec - Avg time: {} sec'.format(end - start, (end-start)/iterations))
Code language: Python (python)
Run the modified program.
[ssm-user@ip-10-0-1-150 python]$ python3 03-getitem-test.py
...
Total time: 35.522796869277954 sec - Avg time: 0.7104559373855591 sec
[ssm-user@ip-10-0-1-150 python]$ python3 03-getitem-test.py dax://saa-02-009-cluster.ryxnym.dax-clusters.ap-northeast-1.amazonaws.com
...
Total time: 8.720930099487305 sec - Avg time: 0.1744186019897461 sec
Code language: Bash (bash)
In this run, retrieving items via DAX was about 4 times faster than accessing DynamoDB directly.
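Note that the "Avg time" printed by the script is the average per iteration, and each iteration issues 10 x 10 = 100 GetItem requests. The following minimal sketch converts the totals above into a per-request latency; the helper name `per_request_ms` is hypothetical.

```python
def per_request_ms(total_seconds, iterations, requests_per_iteration):
    """Average latency of a single request, in milliseconds."""
    return total_seconds / (iterations * requests_per_iteration) * 1000.0


# Totals copied from the run above; 50 iterations of 10 x 10 GetItem calls.
direct = per_request_ms(35.522796869277954, 50, 100)
via_dax = per_request_ms(8.720930099487305, 50, 100)
print("direct: {:.2f} ms, via DAX: {:.2f} ms, ratio: {:.1f}x".format(
    direct, via_dax, direct / via_dax))
# prints: direct: 7.10 ms, via DAX: 1.74 ms, ratio: 4.1x
```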
04-query-test.py
Execute the code on the following page
This script compares running a query directly against DynamoDB with running it through DAX.
The sample program is partially modified as follows:
#
# Copyright 2010-2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# This file is licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://aws.amazon.com/apache2.0/
#
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
# CONDITIONS OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.
#
#!/usr/bin/env python
from __future__ import print_function
import os, sys, time
import amazondax
import botocore.session
#region = os.environ.get('AWS_DEFAULT_REGION', 'us-west-2')
region = os.environ.get('AWS_DEFAULT_REGION', 'ap-northeast-1')
session = botocore.session.get_session()
dynamodb = session.create_client('dynamodb', region_name=region) # low-level client
table_name = "TryDaxTable"
if len(sys.argv) > 1:
    endpoint = sys.argv[1]
    dax = amazondax.AmazonDaxClient(session, region_name=region, endpoints=[endpoint])
    client = dax
else:
    client = dynamodb
pk = 5
sk1 = 2
sk2 = 9
iterations = 5
params = {
    'TableName': table_name,
    #'KeyConditionExpression': 'pk = :pkval and sk between :skval1 and :skval2',
    'KeyConditionExpression': 'partition_key = :pkval and sort_key between :skval1 and :skval2',
    'ExpressionAttributeValues': {
        ":pkval": {'N': str(pk)},
        ":skval1": {'N': str(sk1)},
        ":skval2": {'N': str(sk2)}
    }
}
start = time.time()
for i in range(iterations):
    result = client.query(**params)
end = time.time()
print('Total time: {} sec - Avg time: {} sec'.format(end - start, (end-start)/iterations))
Code language: Python (python)
Run the modified program.
[ssm-user@ip-10-0-1-150 python]$ python3 04-query-test.py
Total time: 0.05969858169555664 sec - Avg time: 0.011939716339111329 sec
[ssm-user@ip-10-0-1-150 python]$ python3 04-query-test.py dax://saa-02-009-cluster.ryxnym.dax-clusters.ap-northeast-1.amazonaws.com
Total time: 0.024791479110717773 sec - Avg time: 0.004958295822143554 sec
Code language: plaintext (plaintext)
In this run, the query via DAX was about 2.4 times faster.
05-scan-test.py
Execute the code on the following page
This script compares running a scan directly against DynamoDB with running it through DAX.
The sample program is partially modified as follows:
#
# Copyright 2010-2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# This file is licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://aws.amazon.com/apache2.0/
#
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
# CONDITIONS OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.
#
#!/usr/bin/env python
from __future__ import print_function
import os, sys, time
import amazondax
import botocore.session
#region = os.environ.get('AWS_DEFAULT_REGION', 'us-west-2')
region = os.environ.get('AWS_DEFAULT_REGION', 'ap-northeast-1')
session = botocore.session.get_session()
dynamodb = session.create_client('dynamodb', region_name=region) # low-level client
table_name = "TryDaxTable"
if len(sys.argv) > 1:
    endpoint = sys.argv[1]
    dax = amazondax.AmazonDaxClient(session, region_name=region, endpoints=[endpoint])
    client = dax
else:
    client = dynamodb
iterations = 5
params = {
    'TableName': table_name
}
start = time.time()
for i in range(iterations):
    result = client.scan(**params)
end = time.time()
print('Total time: {} sec - Avg time: {} sec'.format(end - start, (end-start)/iterations))
Code language: Python (python)
Run the modified program.
[ssm-user@ip-10-0-1-150 python]$ python3 05-scan-test.py
Total time: 0.1732480525970459 sec - Avg time: 0.03464961051940918 sec
[ssm-user@ip-10-0-1-150 python]$ python3 05-scan-test.py dax://saa-02-009-cluster.ryxnym.dax-clusters.ap-northeast-1.amazonaws.com
Total time: 0.05351376533508301 sec - Avg time: 0.010702753067016601 sec
Code language: Bash (bash)
In this run, the scan via DAX was about 3.2 times faster.
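The three ratios quoted in this article can be reproduced from the measured totals. The following minimal sketch does the arithmetic; the helper name `speedup` is hypothetical, and the numbers are copied from the runs above.

```python
# (direct seconds, via-DAX seconds) copied from the runs in this article.
measurements = {
    "get_item": (35.522796869277954, 8.720930099487305),
    "query": (0.05969858169555664, 0.024791479110717773),
    "scan": (0.1732480525970459, 0.05351376533508301),
}


def speedup(direct_seconds, dax_seconds):
    """How many times faster the DAX run was than the direct run."""
    return direct_seconds / dax_seconds


for name, (direct_total, dax_total) in measurements.items():
    print("{}: {:.1f}x".format(name, speedup(direct_total, dax_total)))
```

Keep in mind that the query and scan tests run only five iterations each, so the first, uncached request through DAX weighs heavily in those totals. These figures are best read as a rough indication from a single run.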
Lambda Function
The Lambda function also accesses DAX.
The following is the result of the function execution.
The function was executed successfully.
Check the execution log of the function in CloudWatch Logs.
Indeed, the query to DAX was able to retrieve the item with a partition key value of “1”.
Summary
In this article, we confirmed how to create a DAX cluster and access it from an EC2 instance and a Lambda function.