AppSync – Data Source: Lambda

Configuring Lambda as Data Source for AppSync

AppSync allows you to select a data source from the following services

Lambda
DynamoDB
OpenSearch
None
HTTP endpoint
RDS

This time, we will check the configuration with Lambda as the data source.
For a basic explanation of AppSync and the configuration with DynamoDB as the data source, please refer to the following page.

Environment

Diagram of AppSync - Data Source: Lambda

Define a Lambda function that operates as a data source.
Lambda functions are defined to manipulate objects in S3 buckets.
Specifically, there are three types of functions

Put: Create a new object in the S3 bucket.
List: Obtain a list of objects in the S3 bucket.
Delete: Delete an object in the S3 bucket by specifying a key (object name).

AppSync defines a schema resolver to enable the above three operations.

Create another Lambda function as a client to execute the GraphQL API by AppSync.
Enable the Function URL so that the URL query parameter can specify the operation to be performed.

We will create two Lambda functions, both in Python 3.8.

CloudFormation template files

The above configuration is built using CloudFormation.
The CloudFormation template is located at the following URL

https://github.com/awstut-an-r/awstut-fa/tree/main/044

Explanation of key points of template files

AppSync with Lambda as data source

Data Source

Resources:
  DataSource:
    Type: AWS::AppSync::DataSource
    Properties:
      ApiId: !GetAtt GraphQLApi.ApiId
      LambdaConfig:
        LambdaFunctionArn: !Ref FunctionArn
      Name: DataSource
      ServiceRoleArn: !GetAtt DataSourceRole.Arn
      Type: AWS_LAMBDA
Code language: YAML (yaml)

There are two key points.

The first is the Type property.
When Lambda is used as the data source, specify “AWS_LAMBDA”.

The second is the LambdaConfig property.
Set the ARN of the Lambda function to be set as the data source to the LambdaFunctionArn property.

Schema

Resources:
  GraphQLSchema:
    Type: AWS::AppSync::GraphQLSchema
    Properties:
      ApiId: !GetAtt GraphQLApi.ApiId
      Definition: |
        schema {
          query: Query
          mutation: Mutation
        }

        type Query {
          listS3Objects: [S3Object]
        }

        type Mutation {
          putS3Object: S3Object
          deleteS3Object(Key: String!): S3Object
        }

        type S3Object {
          Key: String!
          LastModified: String
          Size: Int
          ETag: String
        }
Code language: YAML (yaml)

No special configuration is required to use Lambda as the data source.
Define the schema as usual.

As mentioned above, we will implement three processes for S3 buckets, so define the schema accordingly.

Resolver

As confirmed in the schema, we will define a total of three query mutations.
As a representative example, we will cover the mutation to delete an object in the S3 bucket (deleteS3Object).

Resources:
  DeleteS3ObjectResolver:
    Type: AWS::AppSync::Resolver
    DependsOn:
      - GraphQLSchema
    Properties:
      ApiId: !GetAtt GraphQLApi.ApiId
      DataSourceName: !GetAtt DataSource.Name
      FieldName: deleteS3Object
      Kind: UNIT
      RequestMappingTemplate: |
        {
          "version": "2018-05-29",
          "operation": "Invoke",
          "payload": {
            "field": "Delete",
            "arguments": $utils.toJson($context.arguments)
          }
        }
      ResponseMappingTemplate: |
        $context.result
      TypeName: Mutation
Code language: YAML (yaml)

The key points are the request mapping (RequestMappingTemplate property) and response mapping (ResponseMappingTemplate property).
For more information on these specifications, please refer to the following page.

https://docs.aws.amazon.com/appsync/latest/devguide/resolver-mapping-template-reference-lambda.html

Request Mapping

This is a setting related to the data to be passed when executing a data source Lambda function from AppSync.

Either of two values can be set for the operation.

The Lambda data source lets you define two operations: Invoke and BatchInvoke.
Resolver mapping template reference for Lambda

In this case, we will specify Invoke.

You can pass data to a Lambda function by defining a payload.

The payload field is a container that you can use to pass any well-formed JSON to the Lambda function.
Resolver mapping template reference for Lambda

In this case, we define two values to be passed to the Lambda function.

field: a value indicating the operation to be performed by the data source Lambda function, either Put, List, or Delete.
arguments: Arguments for the mutation execution. Arguments can be obtained in $context.arguments, which is converted to JSON using $utils.toJson and passed.

Response Mapping

This setting is for AppSync to receive the values returned from the data source Lambda function.
The result of the Lambda function execution can be obtained with $context.result.

The key point is that the results must be returned in JSON format.
In other words, the object must be converted to JSON either on the Lambda function side or on the response mapping side.
In this case, JSON is converted on the function side.
If you want to convert the object to JSON on the response mapping side, you can use $utils.toJson as described above.

IAM role for AppSync

Resources:
  DataSourceRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service: appsync.amazonaws.com
      Policies:
        - PolicyName: !Sub "${Prefix}-DataSourcePolicy"
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - lambda:InvokeFunction
                Resource:
                  - !Ref FunctionArn
Code language: YAML (yaml)

Permission is granted for AppSync to execute a Lambda function that is set as a data source.

Data source Lambda function

import boto3
import json
import os
from datetime import date, datetime

bucket_name = os.environ['BUCKET_NAME']
s3_client = boto3.client('s3')

PUT = 'Put'
LIST = 'List'
DELETE = 'Delete'

def json_serial(obj):
  # reference: https://www.yoheim.net/blog.php?q=20170703
  if isinstance(obj, (datetime, date)):
    return obj.isoformat()
  raise TypeError ("Type %s not serializable" % type(obj))

def lambda_handler(event, context):
  if event['field'] == PUT:
    now = datetime.now()
    now_str = now.strftime('%Y%m%d%H%M%S')

    key = "{datetime}.txt".format(datetime=now_str)

    put_response = s3_client.put_object(
      Bucket=bucket_name,
      Key=key,
      Body=now_str.encode())

    get_response = s3_client.get_object(
      Bucket=bucket_name,　Key=key)

    object_ = {
      'Key': key,
      'LastModified': get_response['LastModified'],
      'Size': get_response['ContentLength'],
      'ETtag': get_response['ETag']
    }
    return json.dumps(object_, default=json_serial)

  elif event['field'] == LIST:
    list_response = s3_client.list_objects_v2(
      Bucket=bucket_name)
    objects = list_response['Contents']
    return json.dumps(objects, default=json_serial)

  elif event['field'] == DELETE:
    key = event['arguments']['Key']
    delete_response = s3_client.delete_object(
      Bucket=bucket_name,　Key=key)
    object_ = {
      'Key': key
    }
    return json.dumps(object_, default=json_serial)
Code language: Python (python)

There are two points.

The first is how to receive the values set in the request mapping.
In Python, they are stored in the event object.
When receiving the value of a field, it becomes event[‘field’], and when receiving arguments, it becomes event[‘arguments’].

The second point is the value to be returned.
As mentioned above, the value returned by a function must be converted to JSON either on the function side or on the response mapping side.
For this reason, the values returned by the function are converted to JSON on the function side.
In addition, referring to the following site, we have prepared a function to enable serialization of datetime objects when converting to JSON with json.dumps.

https://www.yoheim.net/blog.php?q=20170703

With the above in mind, here is a summary of the code.

Refer to the value of field to determine the query content requested in the GraphQL query.
In the case of *Delete, reference the name (key) of the object to be deleted from the argument.
Operate the S3 bucket according to the contents of the request.
The result of the operation is converted to JSON and returned in the response mapping.

(Reference) GraphQL client Lambda function

import json
import os
import time
from gql import Client, gql
from gql.transport.aiohttp import AIOHTTPTransport

api_key = os.environ['API_KEY']
graphql_url = os.environ['GRAPHQL_URL']

transport = AIOHTTPTransport(
  url=graphql_url,
  headers={
    'x-api-key': api_key
  })
client = Client(transport=transport, fetch_schema_from_transport=True)

PUT = 'Put'
LIST = 'List'
DELETE = 'Delete'


def lambda_handler(event, context):
  field = ''
  document = None
  result = None

  if not 'queryStringParameters' in event or (
      not 'field' in event['queryStringParameters']):
    field = LIST
  else:
    field = event['queryStringParameters']['field']

  if field == PUT:
    document = gql(
      """
      mutation PutS3ObjectMutation {
        putS3Object {
          Key
          LastModified
          Size
          ETag
        }
      }
      """
      )
    result = client.execute(document)

  elif field == LIST:
    document = gql(
      """
      query ListS3ObjectsQuery {
        listS3Objects {
          Key
        }
      }
      """
      )
    result = client.execute(document)

  elif field == DELETE:
    object_name = event['queryStringParameters']['object_name']
    document = gql(
      """
      mutation DeleteS3ObjectsMutation($object_name: String!) {
        deleteS3Object(Key: $object_name) {
          Key
        }
      }
      """
      )

    params = {
      'object_name': object_name
    }
    result = client.execute(document, variable_values=params)

  return {
    'statusCode': 200,
    'body': json.dumps(result, indent=2)
  }
Code language: Python (python)

Lambda function to execute a GraphQL query.
In this case, we will use GQL as the GraphQL client library for Python.

https://github.com/graphql-python/gql

In Python, URL query parameters can be retrieved by event[‘queryStringParameters’].
Pass the required parameters in the URL query parameters.

field: The operation to perform.
object_name: Required if the operation is Delete. The name of the object to delete (key).

Architecting

Use CloudFormation to build this environment and check the actual behavior.

Prepare deployment package for Lambda function

There are three ways to create a Lambda function.
In this case, we will choose the method of uploading a deployment package to an S3 bucket.
For more information, please refer to the following page

Prepare deployment package for Lambda layer

Prepare the aforementioned GQL as a Lambda layer.
For more information on the Lambda layer, please refer to the following page

The command to create a package for the Lambda layer is as follows

$ sudo pip3 install --pre gql[all] -t python

$ zip -r layer.zip python
Code language: Bash (bash)

Create CloudFormation stacks and check resources in stacks

Create a CloudFormation stack.
For information on how to create stacks and check each stack, please refer to the following page

After checking the resources in each stack, information on the main resources created this time is as follows

S3 bucket: fa-044
AppSync API: fa-044-GraphQLApi
Lambda function for data source: fa-044-function-01
Function URL for GraphQL client Lambda function: https://khty5dwgudjyl6otndvxm3uora0hemhk.lambda-url.ap-northeast-1.on.aws

From the AWS Management Console, check AppSync.

You can see that the Lambda function is registered as a data source.

Confirmation of Operation

Now that everything is ready, access the Function URL of the GraphQL client Lambda function.

The first operation is to add an object to the S3 bucket (Put).
Specify “Put” as the value of field in the URL query.

The putS3Object mutation is executed.
The result of the execution is displayed.
Execute it one more time for the next operation.

The next operation (List) is to obtain a list of objects stored in the S3 bucket.

The listS3Objects query has been executed.
As specified in the query, only the Key was returned.

By the way, the status of the S3 bucket is as shown in the following image.

Indeed, two objects are stored.

The next operation is to delete an object in the S3 bucket (Delete).
Add the object name (key) of the object to be deleted to the parameter.

The deleteS3Object mutation has been executed.
It has been successfully executed and the key of the deleted object has been returned.

Check the list again.

There is now only one.
We can see that the object was indeed deleted in the previous operation.

Summary

We have introduced the configuration of setting a Lambda function to AppSync data source.