
Accessing DynamoDB Accelerator (DAX) with EC2/Lambda


This is one of the AWS SAA topics related to designing a high-performance architecture.

DynamoDB Accelerator (DAX) is a caching service for DynamoDB.

Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for Amazon DynamoDB that delivers up to a 10 times performance improvement—from milliseconds to microseconds—even at millions of requests per second.

Amazon DynamoDB Accelerator (DAX)

In this article, the goal is to create a DAX cluster and access it from an EC2 instance and a Lambda function.

Environment

Diagram of accessing DynamoDB Accelerator (DAX) with EC2/Lambda.

Create a DynamoDB table outside the VPC.
Set up a partition key, sort key, etc. so that items for verification can be stored.

Deploy a DAX cluster on private subnets.
Place two nodes in the cluster: a primary node and a read replica node.

Create two resources in the private subnet as clients to access the DAX cluster.

The first is an EC2 instance.
It will use the official AWS package to connect to the DAX cluster.
To install it, we access the Internet via a NAT gateway.

The second is a Lambda function.
This also uses a dedicated package to connect to the DAX cluster, so it is prepared in the form of a Lambda layer.
The runtime environment for the function is Python 3.8.

CloudFormation template files

The above configuration is built using CloudFormation.
The CloudFormation templates are located at the following URL:

awstut-saa/02/009 at main · awstut-an-r/awstut-saa
Contribute to awstut-an-r/awstut-saa development by creating an account on GitHub.

Explanation of key points of the template files

DAX

To create a DAX cluster, the following four resources are required:

  • DAX cluster
  • Parameter Group
  • Subnet group
  • IAM Roles

DAX Cluster

Resources:
  DAXCluster:
    Type: AWS::DAX::Cluster
    Properties:
      AvailabilityZones:
        - !Sub "${AWS::Region}${AvailabilityZone1}"
        - !Sub "${AWS::Region}${AvailabilityZone2}"
      ClusterEndpointEncryptionType: NONE
      ClusterName: !Sub "${Prefix}-Cluster"
      Description: Test DAX Cluster
      IAMRoleARN: !GetAtt DAXRole.Arn
      NodeType: !Ref DAXNodeType
      ParameterGroupName: !Ref DAXParameterGroup
      ReplicationFactor: 2
      SecurityGroupIds:
        - !Ref DAXSecurityGroup
      SubnetGroupName: !Ref DAXSubnetGroup
Code language: YAML (yaml)

The AvailabilityZones property specifies the AZs in which the cluster nodes are deployed.
On the choice of AZs, the official AWS documentation says the following:

For production usage, we strongly recommend using DAX with at least three nodes, where each node is placed in different Availability Zones. Three nodes are required for a DAX cluster to be fault-tolerant.

DAX cluster components

Since this setup is only for verification, we deploy the cluster across two AZs.

The NodeType property allows you to set the instance type of the nodes in the cluster.
In this case, we will specify “dax.t3.small,” which is the smallest size.

The ReplicationFactor property sets the number of nodes in the cluster.
This time, “2” is specified since two nodes will be created in the cluster.

The SecurityGroupIds property specifies the security group applied to the DAX cluster.
On this security group, the official AWS documentation says the following:

When you launch a cluster in your VPC, you add an ingress rule to your security group to allow incoming network traffic. The ingress rule specifies the protocol (TCP) and port number (8111) for your cluster.

DAX cluster components

Based on this, we apply the following security group:

Resources:
  DAXSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: !Sub "${Prefix}-DAXSecurityGroup"
      GroupDescription: Allow DAX.
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: !Ref DAXPort
          ToPort: !Ref DAXPort
          SourceSecurityGroupId: !Ref InstanceSecurityGroup
        - IpProtocol: tcp
          FromPort: !Ref DAXPort
          ToPort: !Ref DAXPort
          SourceSecurityGroupId: !Ref FunctionSecurityGroup
Code language: YAML (yaml)

This security group allows inbound 8111/tcp from the security groups of the EC2 instance and the Lambda function.
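To check from a client that this ingress rule is in effect, a plain TCP connection test against the cluster port is usually enough. The following is a minimal sketch that uses only the Python standard library. The function name can_connect is made up for illustration, and the host you pass would be the address of your own DAX cluster.

```python
import socket


def can_connect(host, port=8111, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    8111 is the port named in the documentation quoted above; pass a
    different value if your cluster listens on another port.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False
```

Run it from the EC2 instance (or inside the Lambda function) with the host name of the cluster endpoint. A result of False suggests that the security group, the subnet routing, or the host name needs another look.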

Parameter Groups

The following is the official AWS description of parameter groups.

A named set of parameters that are applied to all of the nodes in a DAX cluster.

AWS::DAX::ParameterGroup
Resources:
  DAXParameterGroup:
    Type: AWS::DAX::ParameterGroup
    Properties:
      Description: Test DAX Parameter Group
      ParameterGroupName: !Sub "${Prefix}-ParameterGroup"
      ParameterNameValues:
        query-ttl-millis: 75000
        record-ttl-millis: 88000
Code language: YAML (yaml)

The parameter values here follow the example on the page above.
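Both parameters are time-to-live values in milliseconds: record-ttl-millis applies to cached items, and query-ttl-millis applies to cached query and scan results. The sketch below is a deliberately simplified model of a read-through cache with a TTL, not DAX's actual implementation. The class name TTLCache and the loader callback are assumptions made for illustration.

```python
import time


class TTLCache:
    """Toy read-through cache whose entries expire after ttl_millis."""

    def __init__(self, ttl_millis, loader, clock=time.monotonic):
        self.ttl_seconds = ttl_millis / 1000.0
        self.loader = loader      # called on a miss, e.g. a read from DynamoDB
        self.clock = clock        # injectable so expiry can be tested
        self._entries = {}        # key -> (value, expires_at)

    def get(self, key):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]       # hit: serve the value from memory
        value = self.loader(key)  # miss or expired: read the backing store
        self._entries[key] = (value, now + self.ttl_seconds)
        return value
```

With ttl_millis=88000, a value loaded once is served from memory for the next 88 seconds, and the first read after that goes back to the loader.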

Subnet Group

The subnet group is a resource that specifies the subnets where the cluster will be deployed.

Resources:
  DAXSubnetGroup:
    Type: AWS::DAX::SubnetGroup
    Properties:
      Description: Test DAX Subnet Group
      SubnetGroupName: !Sub "${Prefix}SubnetGroup"
      SubnetIds:
        - !Ref DAXSubnet1
        - !Ref DAXSubnet2
Code language: YAML (yaml)

This time, two private subnets are specified.

IAM Role

When a client such as an EC2 instance or Lambda function accesses DAX, DAX internally queries DynamoDB and returns the results to the client.
In other words, it is necessary to authorize DAX to access DynamoDB.
This authorization is granted by associating an IAM role with the cluster.

Resources:
  DAXRole:
    Type: AWS::IAM::Role
    DeletionPolicy: Delete
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - dax.amazonaws.com
      Policies:
        - PolicyName: !Sub "${Prefix}-DAXPolicy"
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - dynamodb:DescribeTable
                  - dynamodb:PutItem
                  - dynamodb:GetItem
                  - dynamodb:UpdateItem
                  - dynamodb:DeleteItem
                  - dynamodb:Query
                  - dynamodb:Scan
                  - dynamodb:BatchGetItem
                  - dynamodb:BatchWriteItem
                  - dynamodb:ConditionCheckItem
                Resource:
                  - !Ref DynamoDBTableArn
Code language: YAML (yaml)

Refer to the official AWS page for the configuration.

DAX access control - Amazon DynamoDB
Amazon DynamoDB and DAX are separate AWS services and have different security models implementation of AWS Identity and Access Management (IAM) security roles a...

The policy allows DAX to perform the listed read and write actions on the DynamoDB table.

DynamoDB

Resources:
  Table:
    Type: AWS::DynamoDB::Table
    Properties:
      AttributeDefinitions:
        - AttributeName: partition_key
          AttributeType: N
        - AttributeName: sort_key
          AttributeType: N
      BillingMode: PROVISIONED
      KeySchema:
        - AttributeName: partition_key
          KeyType: HASH
        - AttributeName: sort_key
          KeyType: RANGE
      ProvisionedThroughput:
        ReadCapacityUnits: !Ref ReadCapacityUnits
        WriteCapacityUnits: !Ref WriteCapacityUnits
      TableClass: STANDARD
      TableName: TryDaxTable
Code language: YAML (yaml)

Since this is for DAX verification, we have reproduced the table structure shown in the following official AWS page.

01-create-table.py - Amazon DynamoDB
Test DAX functionality using the 01-create-table.py program in the Python sample application.

The attribute named “partition_key” is the partition key, and the attribute named “sort_key” is the sort key.

EC2 instance

Resources:
  Instance:
    Type: AWS::EC2::Instance
    Properties:
      IamInstanceProfile: !Ref InstanceProfile
      ImageId: !Ref ImageId
      InstanceType: !Ref InstanceType
      NetworkInterfaces:
        - DeviceIndex: 0
          SubnetId: !Ref InstanceSubnet
          GroupSet:
            - !Ref InstanceSecurityGroup
      UserData: !Base64 |
        #!/bin/bash -xe
        pip3 install amazon-dax-client
        wget http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/samples/TryDax.zip
        unzip TryDax.zip
Code language: YAML (yaml)

You can use a dedicated client to access DAX.

Developing with the DynamoDB Accelerator (DAX) client - Amazon DynamoDB
Learn to develop with the Amazon DynamoDB Accelerator (DAX) client to securely connect your applications to DAX and speed up your DynamoDB read throughput.

We use the following official AWS page as a reference:

Python and DAX - Amazon DynamoDB
Test DAX functionality using this Python sample application.

Specifically, use the user data to install the client and download the sample program.

For more information on how to initialize an EC2 instance using user data, please refer to the following page

Next, look at the IAM role for the instance.

Resources:
  InstanceRole:
    Type: AWS::IAM::Role
    DeletionPolicy: Delete
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - ec2.amazonaws.com
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
      Policies:
        - PolicyName: DynamoDBAndDAXFullAccessPolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - "dax:*"
                Resource:
                  - !Ref DAXClusterArn
              - Effect: Allow
                Action:
                  - "dynamodb:*"
                Resource:
                  - !Ref DynamoDBTableArn
Code language: YAML (yaml)

Define IAM roles with reference to the following official AWS page.

Step 2: Create an IAM user and policy - Amazon DynamoDB
Create an IAM user and policy that grants access to DynamoDB and your DAX cluster.

This role allows the instance to perform all actions on the DAX cluster and the DynamoDB table mentioned above.

Lambda Function

Resources:
  Function2:
    Type: AWS::Lambda::Function
    Properties:
      Architectures:
        - !Ref Architecture
      Code:
        ZipFile: |
          import amazondax
          import json
          import os

          dax_endpoint_url = os.environ['DAX_ENDPOINT_URL']
          dynamodb_table = os.environ['DYNAMODB_TABLE']
          region = os.environ['REGION']

          def lambda_handler(event, context):
              dax_client = amazondax.AmazonDaxClient(
                  endpoint_url=dax_endpoint_url,
                  region_name=region
              )

              partion_value = '1'

              result = dax_client.query(
                  TableName=dynamodb_table,
                  ExpressionAttributeNames={
                      '#name0': 'partition_key'
                  },
                  ExpressionAttributeValues={
                      ':value0': {'N': partion_value}
                  },
                  KeyConditionExpression='#name0 = :value0'
              )
              print(result)

              return {
                  'statusCode': 200,
                  'body': json.dumps(result, indent=2)
              }
      Environment:
        Variables:
          DAX_ENDPOINT_URL: !Ref DAXClusterDiscoveryEndpointURL
          DYNAMODB_TABLE: !Ref DynamoDBTable
          REGION: !Ref AWS::Region
      FunctionName: !Sub "${Prefix}-function2"
      Handler: !Ref Handler
      Layers:
        - !Ref LambdaLayer
      Runtime: !Ref Runtime
      Role: !GetAtt FunctionRole2.Arn
      VpcConfig:
        SecurityGroupIds:
          - !Ref FunctionSecurityGroup
        SubnetIds:
          - !Ref FunctionSubnet

  FunctionRole2:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - lambda.amazonaws.com
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole
      Policies:
        - PolicyName: DAXFullAccessPolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - "dax:*"
                Resource:
                  - !Ref DAXClusterArn
Code language: YAML (yaml)

The Lambda function also uses the Python version of the DAX client to access DAX.

The following page was used as a reference for working with the DAX client.

amazon-dax-client
Amazon DAX Client for Python

The function runs a query that retrieves the items whose partition key is “1”.
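The query parameters in the function follow the low-level DynamoDB request format, in which numeric values are sent as strings tagged with the type 'N'. As a small sketch, the same parameters can be built by a helper. The function name build_partition_query and the placeholder names #pk and :pk are made up for illustration.

```python
def build_partition_query(table_name, partition_value):
    """Build low-level Query parameters for one numeric partition key."""
    return {
        'TableName': table_name,
        'KeyConditionExpression': '#pk = :pk',
        # '#pk' is a placeholder that stands for the real attribute name
        'ExpressionAttributeNames': {'#pk': 'partition_key'},
        # Numbers travel as strings tagged with the 'N' type
        'ExpressionAttributeValues': {':pk': {'N': str(partition_value)}},
    }
```

The result can be passed to the client as keyword arguments, for example dax_client.query(**build_partition_query('TryDaxTable', 1)).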

(Reference) Lambda Layer

As in the case of EC2 instances, a DAX client (amazon-dax-client) is also used for Lambda functions.
By creating a Lambda layer and including the DAX client in it, the Lambda function will be able to import the client module.
In this case, we will use a CloudFormation custom resource to automatically perform this Lambda layer creation.
For more information, please see the following page

Architecting

Use CloudFormation to build this environment and check its actual behavior.

Create CloudFormation stacks and check resources in stacks

Create CloudFormation stacks.
For information on how to create stacks and check each stack, please refer to the following page

After checking the resources in each stack, information on the main resources created this time is as follows

  • DynamoDB: TryDaxTable
  • DAX cluster: saa-02-009-cluster
  • DAX endpoint: dax://saa-02-009-cluster.ryxnym.dax-clusters.ap-northeast-1.amazonaws.com
  • EC2 instance: i-098244d885d0d7b13
  • Lambda function: saa-02-009-function2
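The DAX endpoint in this list is a URL with the dax:// scheme and no explicit port. As a rough sketch, it can be split into its parts with the Python standard library. The function name split_dax_endpoint is made up for illustration, and falling back to port 8111 when no port is given is an assumption based on the port named in the security group section.

```python
from urllib.parse import urlparse


def split_dax_endpoint(endpoint, default_port=8111):
    """Split a dax:// endpoint URL into (scheme, host, port)."""
    parsed = urlparse(endpoint)
    if parsed.scheme != 'dax':
        raise ValueError('unexpected scheme: {!r}'.format(parsed.scheme))
    # parsed.port is None when the URL carries no explicit port
    return parsed.scheme, parsed.hostname, parsed.port or default_port


print(split_dax_endpoint(
    'dax://saa-02-009-cluster.ryxnym.dax-clusters.ap-northeast-1.amazonaws.com'
))
```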

Check each resource from the AWS Management Console.
First, check DynamoDB.

Detail of DynamoDB 1.

The table has been successfully created.
We can see that both the partition key and sort key have been set correctly.

Check DAX.

Detail of dax 1.

The DAX cluster is successfully created.
We can also see that two nodes have been created in the cluster.

Detail of dax 2.
Detail of dax 3.

The subnet group and parameter group are also successfully created.

Operation check

Now that everything is ready, let’s check the operation.

Accessing DAX from EC2 instance

Connect to the instance using SSM Session Manager.

% aws ssm start-session --target i-098244d885d0d7b13
...
sh-4.2$
Code language: Bash (bash)

For more information on SSM Session Manager, please visit

Check the result of the user data execution.

First is the installation status of the DAX client.

[ssm-user@ip-10-0-1-51 /]$ sudo pip3 list installed
---------------------- -------
amazon-dax-client      2.0.3
antlr4-python3-runtime 4.9.3
botocore               1.29.32
jmespath               1.0.1
pip                    21.3.1
python-dateutil        2.8.2
setuptools             57.0.0
six                    1.16.0
urllib3                1.26.13
websockets             10.1
wheel                  0.36.2
Code language: plaintext (plaintext)

Sure enough, the client (amazon-dax-client) is installed.

Next, check the download status of the sample programs.

[ssm-user@ip-10-0-1-51 /]$ ls -l
total 40
drwxr-xr-x 6 root root    60 Dec 19 12:20 TryDax
-rw-r--r-- 1 root root 20931 Dec 19 05:03 TryDax.zip
...
Code language: plaintext (plaintext)

Indeed, the ZIP file has been downloaded and extracted.

02-write-data.py

Execute the code on the following page

02-write-data.py - Amazon DynamoDB
Test DAX functionality using the 02-write-data.py program in the Python sample application.

This program stores sample data in a DynamoDB table.

We modify part of the code as follows, changing the default region and the key attribute names to match our table:

#
# Copyright 2010-2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# This file is licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://aws.amazon.com/apache2.0/
#
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
# CONDITIONS OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.
#
#!/usr/bin/env python3
from __future__ import print_function

import os
import amazondax
import botocore.session

#region = os.environ.get('AWS_DEFAULT_REGION', 'us-west-2')
region = os.environ.get('AWS_DEFAULT_REGION', 'ap-northeast-1')

session = botocore.session.get_session()
dynamodb = session.create_client('dynamodb', region_name=region) # low-level client

table_name = "TryDaxTable"

some_data = 'X' * 1000
pk_max = 10
sk_max = 10

for ipk in range(1, pk_max+1):
    for isk in range(1, sk_max+1):
        params = {
            'TableName': table_name,
            'Item': {
                "partition_key": {'N': str(ipk)},
                "sort_key": {'N': str(isk)},
                #"pk": {'N': str(ipk)},
                #"sk": {'N': str(isk)},
                "someData": {'S': some_data}
            }
        }

        dynamodb.put_item(**params)
        print("PutItem ({}, {}) suceeded".format(ipk, isk))
Code language: Python (python)

Run the modified program.

[ssm-user@ip-10-0-1-150 python]$ python3 02-write-data.py
PutItem (1, 1) suceeded
PutItem (1, 2) suceeded
PutItem (1, 3) suceeded
PutItem (1, 4) suceeded
PutItem (1, 5) suceeded
PutItem (1, 6) suceeded
PutItem (1, 7) suceeded
PutItem (1, 8) suceeded
PutItem (1, 9) suceeded
PutItem (1, 10) suceeded
...
Code language: Bash (bash)

Sample data has been saved.

We also confirm on the DynamoDB table side that the items are stored.

Detail of DynamoDB 2.

Indeed, the data is saved.

03-getitem-test.py

Execute the code on the following page

03-getitem-test.py - Amazon DynamoDB
Test DAX functionality using the 03-getitem-test.py program in the Python sample application.

This program compares retrieving items directly from DynamoDB with retrieving them through DAX.

The sample program is partially modified as follows:

#
# Copyright 2010-2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# This file is licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://aws.amazon.com/apache2.0/
#
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
# CONDITIONS OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.
#
#!/usr/bin/env python
from __future__ import print_function

import os, sys, time
import amazondax
import botocore.session

#region = os.environ.get('AWS_DEFAULT_REGION', 'us-west-2')
region = os.environ.get('AWS_DEFAULT_REGION', 'ap-northeast-1')

session = botocore.session.get_session()
dynamodb = session.create_client('dynamodb', region_name=region) # low-level client

table_name = "TryDaxTable"

if len(sys.argv) > 1:
    endpoint = sys.argv[1]
    dax = amazondax.AmazonDaxClient(session, region_name=region, endpoints=[endpoint])
    client = dax
else:
    client = dynamodb

pk = 10
sk = 10
iterations = 50

start = time.time()
for i in range(iterations):
    for ipk in range(1, pk+1):
        for isk in range(1, sk+1):
            params = {
                'TableName': table_name,
                'Key': {
                    #"pk": {'N': str(ipk)},
                    #"sk": {'N': str(isk)}
                    "partition_key": {'N': str(ipk)},
                    "sort_key": {'N': str(isk)}
                }
            }

            result = client.get_item(**params)
            print('.', end='', file=sys.stdout); sys.stdout.flush()
print()

end = time.time()
print('Total time: {} sec - Avg time: {} sec'.format(end - start, (end-start)/iterations))
Code language: Python (python)

Run the modified program.

[ssm-user@ip-10-0-1-150 python]$ python3 03-getitem-test.py
...
Total time: 35.522796869277954 sec - Avg time: 0.7104559373855591 sec

[ssm-user@ip-10-0-1-150 python]$ python3 03-getitem-test.py dax://saa-02-009-cluster.ryxnym.dax-clusters.ap-northeast-1.amazonaws.com
...
Total time: 8.720930099487305 sec - Avg time: 0.1744186019897461 sec
Code language: Bash (bash)

In this test, retrieving items via DAX was about 4 times faster than accessing DynamoDB directly.
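The "Avg time" printed by the program is the average per iteration, and each iteration issues 10 x 10 = 100 GetItem calls. The following sketch converts the totals above into a rough per-call latency. The function name per_call_millis is made up for illustration, and the figures include the overhead of printing a dot for every call, so treat them as approximate.

```python
def per_call_millis(total_seconds, iterations, calls_per_iteration):
    """Average latency of a single call, in milliseconds."""
    return total_seconds / (iterations * calls_per_iteration) * 1000.0


# 50 iterations x (10 partition keys x 10 sort keys) GetItem calls
direct = per_call_millis(35.522796869277954, 50, 100)
via_dax = per_call_millis(8.720930099487305, 50, 100)
print('direct: {:.2f} ms, via DAX: {:.2f} ms'.format(direct, via_dax))
# prints "direct: 7.10 ms, via DAX: 1.74 ms"
```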

04-query-test.py

Execute the code on the following page

04-query-test.py - Amazon DynamoDB
Test DAX functionality using the 04-query-test.py program in the Python sample application.

This program compares running a query directly against DynamoDB with running it through DAX.

The sample program is partially modified as follows:

#
# Copyright 2010-2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# This file is licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://aws.amazon.com/apache2.0/
#
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
# CONDITIONS OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.
#
#!/usr/bin/env python
from __future__ import print_function

import os, sys, time
import amazondax
import botocore.session

#region = os.environ.get('AWS_DEFAULT_REGION', 'us-west-2')
region = os.environ.get('AWS_DEFAULT_REGION', 'ap-northeast-1')

session = botocore.session.get_session()
dynamodb = session.create_client('dynamodb', region_name=region) # low-level client

table_name = "TryDaxTable"

if len(sys.argv) > 1:
    endpoint = sys.argv[1]
    dax = amazondax.AmazonDaxClient(session, region_name=region, endpoints=[endpoint])
    client = dax
else:
    client = dynamodb

pk = 5
sk1 = 2
sk2 = 9
iterations = 5

params = {
    'TableName': table_name,
    #'KeyConditionExpression': 'pk = :pkval and sk between :skval1 and :skval2',
    'KeyConditionExpression': 'partition_key = :pkval and sort_key between :skval1 and :skval2',
    'ExpressionAttributeValues': {
        ":pkval": {'N': str(pk)},
        ":skval1": {'N': str(sk1)},
        ":skval2": {'N': str(sk2)}
    }
}

start = time.time()
for i in range(iterations):
    result = client.query(**params)

end = time.time()
print('Total time: {} sec - Avg time: {} sec'.format(end - start, (end-start)/iterations))
Code language: Python (python)

Run the modified program.

[ssm-user@ip-10-0-1-150 python]$ python3 04-query-test.py
Total time: 0.05969858169555664 sec - Avg time: 0.011939716339111329 sec

[ssm-user@ip-10-0-1-150 python]$ python3 04-query-test.py dax://saa-02-009-cluster.ryxnym.dax-clusters.ap-northeast-1.amazonaws.com
Total time: 0.024791479110717773 sec - Avg time: 0.004958295822143554 sec
Code language: plaintext (plaintext)

In this test, the query via DAX was about 2.4 times faster.

05-scan-test.py

Execute the code on the following page

05-scan-test.py - Amazon DynamoDB
Test DAX functionality using the 05-scan-test.py program in the Python sample application.

This program compares running a scan directly against DynamoDB with running it through DAX.

The sample program is partially modified as follows:

#
# Copyright 2010-2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# This file is licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://aws.amazon.com/apache2.0/
#
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
# CONDITIONS OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.
#
#!/usr/bin/env python
from __future__ import print_function

import os, sys, time
import amazondax
import botocore.session

#region = os.environ.get('AWS_DEFAULT_REGION', 'us-west-2')
region = os.environ.get('AWS_DEFAULT_REGION', 'ap-northeast-1')

session = botocore.session.get_session()
dynamodb = session.create_client('dynamodb', region_name=region) # low-level client

table_name = "TryDaxTable"

if len(sys.argv) > 1:
    endpoint = sys.argv[1]
    dax = amazondax.AmazonDaxClient(session, region_name=region, endpoints=[endpoint])
    client = dax
else:
    client = dynamodb

iterations = 5

params = {
    'TableName': table_name
}

start = time.time()
for i in range(iterations):
    result = client.scan(**params)

end = time.time()
print('Total time: {} sec - Avg time: {} sec'.format(end - start, (end-start)/iterations))
Code language: Python (python)

Run the modified program.

[ssm-user@ip-10-0-1-150 python]$ python3 05-scan-test.py
Total time: 0.1732480525970459 sec - Avg time: 0.03464961051940918 sec

[ssm-user@ip-10-0-1-150 python]$ python3 05-scan-test.py dax://saa-02-009-cluster.ryxnym.dax-clusters.ap-northeast-1.amazonaws.com
Total time: 0.05351376533508301 sec - Avg time: 0.010702753067016601 sec
Code language: Bash (bash)

In this test, the scan via DAX was about 3.2 times faster.
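The speedup figures quoted for the three tests are simply the direct total time divided by the DAX total time. The sketch below recomputes them from the measured totals shown in the outputs above. The names measurements and speedup are made up for illustration. Each number comes from a single run, and the query and scan tests use only 5 iterations, so the ratios are indicative rather than precise.

```python
# (direct seconds, via-DAX seconds) taken from the outputs above
measurements = {
    'GetItem': (35.522796869277954, 8.720930099487305),
    'Query': (0.05969858169555664, 0.024791479110717773),
    'Scan': (0.1732480525970459, 0.05351376533508301),
}


def speedup(direct_seconds, dax_seconds):
    """How many times faster the DAX run was than the direct run."""
    return direct_seconds / dax_seconds


for name, (direct_s, dax_s) in measurements.items():
    print('{}: {:.1f}x'.format(name, speedup(direct_s, dax_s)))
# prints "GetItem: 4.1x", "Query: 2.4x", "Scan: 3.2x"
```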

Lambda Function

Next, access DAX from the Lambda function.

Detail of Lambda 1.

The following is the result of the function execution.

Detail of Lambda 2.

The function was executed successfully.

Check the execution log of the function in CloudWatch Logs.

Detail of Lambda 3.

Indeed, the query to DAX was able to retrieve the item with a partition key value of “1”.

Summary

In this article, we created a DAX cluster and confirmed how to access it from an EC2 instance and a Lambda function.
