Server-side encryption of Kinesis Data Streams and Firehose using KMS

The following page shows how to save data received by Kinesis Data Streams to an S3 bucket via Kinesis Data Firehose.

Related reading: Store data received by Kinesis Data Streams in S3 buckets via Firehose

This time, the configuration is similar to the one on the page above, but the data handled by Kinesis Data Streams and Kinesis Data Firehose is encrypted.
Specifically, we will encrypt the data sent to Kinesis Data Streams and the data delivered to the S3 bucket by Kinesis Data Firehose.
Both are encrypted using customer managed keys (CMKs) created in KMS.

Environment

Diagram of Server-side encryption of Kinesis Data Streams and Firehose using KMS.

The structure is basically the same as on the page introduced at the beginning of this post.

There are two changes.
The first is to enable SSE in Kinesis Data Streams.
The second is to enable SSE for the data delivered from Kinesis Data Firehose to the S3 bucket.
Both encrypt using a KMS key (CMK).

The runtime environment for Lambda functions is Python 3.12.

CloudFormation template files

The above configuration is built with CloudFormation.
The CloudFormation template files are available at the following URL.

GitHub: https://github.com/awstut-an-r/awstut-dva/tree/main/02/005

Explanation of key points of template files

(Reference) Lambda function

Resources:
  DataSourceLambda:
    Type: AWS::Lambda::Function
    Properties:
      Architectures:
        - !Ref Architecture
      Code:
        # https://docs.aws.amazon.com/ja_jp/streams/latest/dev/get-started-exercise.html
        ZipFile: |
          import boto3
          import datetime
          import json
          import os
          import random
          
          STREAM_NAME = os.environ['KINESIS_STREAM_NAME']
          LIMIT = 10
          
          def get_data():
            return {
              'EVENT_TIME': datetime.datetime.now().isoformat(),
              'TICKER': random.choice(['AAPL', 'AMZN', 'MSFT', 'INTC', 'TBV']),
              'PRICE': round(random.random() * 100, 2)}
          
          def generate(stream_name, kinesis_client, limit):
            for i in range(limit):
              data = get_data()
              print(data)
              kinesis_client.put_record(
                StreamName=stream_name,
                Data=json.dumps(data).encode('utf-8'),
                PartitionKey="partitionkey")
        
          def lambda_handler(event, context):
            generate(STREAM_NAME, boto3.client('kinesis'), LIMIT)
      Environment:
        Variables:
          KINESIS_STREAM_NAME: !Ref KinesisDataStreamName
      Handler: !Ref Handler
      Role: !GetAtt DataSourceLambdaRole.Arn
      Runtime: !Ref Runtime
      Timeout: !Ref Timeout

This function creates test data to be delivered to the S3 bucket via Kinesis.

We use the code discussed in the following pages.

Related reading: Create and run a Managed Service for Apache Flink application - Amazon Kinesis Data Streams

The following is the IAM role for this function.

Resources:
  DataSourceLambdaRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - lambda.amazonaws.com
      Policies:
        - PolicyName: DataSourceLambdaPolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - kms:GenerateDataKey
                Resource:
                  - !Ref KinesisKeyArn
              - Effect: Allow
                Action:
                  - kinesis:PutRecord
                Resource:
                  - !Ref KinesisDataStreamArn

The point is that this role grants permission (kms:GenerateDataKey) on the KMS key for Kinesis Data Streams.
From the perspective of Kinesis Data Streams, this function is a producer, and the permission a producer needs is described in the AWS documentation as follows.

Your Kinesis stream producers must have the kms:GenerateDataKey permission.

(From the example producer permissions in the Amazon Kinesis Data Streams Developer Guide.)
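
As a sketch of what happens when that permission is missing, the following boto3 snippet (the stream name is a hypothetical placeholder) catches the error Kinesis raises when a producer without kms:GenerateDataKey writes to an SSE-enabled stream.

import json

import boto3

kinesis = boto3.client('kinesis')

try:
    # Against an SSE-enabled stream, PutRecord makes Kinesis call
    # kms:GenerateDataKey on the producer's behalf.
    kinesis.put_record(
        StreamName='dva-02-005-DataStream',  # hypothetical stream name
        Data=json.dumps({'TICKER': 'AAPL'}).encode('utf-8'),
        PartitionKey='partitionkey')
except kinesis.exceptions.KMSAccessDeniedException as error:
    # Raised when the producer lacks kms:GenerateDataKey on the stream's key.
    print(f'Missing kms:GenerateDataKey: {error}')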

Kinesis Data Streams

Resources:
  KinesisDataStream:
    Type: AWS::Kinesis::Stream
    Properties:
      Name: !Sub "${Prefix}-DataStream"
      RetentionPeriodHours: 24
      ShardCount: !Ref ShardCount
      StreamEncryption: 
        EncryptionType: KMS
        KeyId: !Ref KinesisKeyArn

The key to enabling SSE in Kinesis Data Streams is the StreamEncryption property.
Specify KMS for the EncryptionType property and the ARN of the KMS key for the KeyId property.
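
SSE can also be enabled on an existing stream outside of CloudFormation. Below is a minimal boto3 sketch, assuming a hypothetical stream name and key ARN.

import boto3

kinesis = boto3.client('kinesis')

# Enable SSE with a customer managed KMS key on an existing stream.
# The stream name and key ARN are hypothetical placeholders.
kinesis.start_stream_encryption(
    StreamName='dva-02-005-DataStream',
    EncryptionType='KMS',
    KeyId='arn:aws:kms:ap-northeast-1:123456789012:key/11111111-2222-3333-4444-555555555555')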

This time, the following KMS key is specified.

Resources:
  KinesisKey:
    Type: AWS::KMS::Key
    Properties:
      Enabled: true
      KeyPolicy:
        Version: 2012-10-17
        Id: !Sub "${Prefix}-kinesis"
        Statement:
          - Effect: Allow
            Principal:
              AWS: "*"
            Action:
              - kms:Encrypt
              - kms:Decrypt
              - kms:ReEncrypt*
              - kms:GenerateDataKey*
              - kms:DescribeKey
            Resource: "*"
            Condition:
              StringEquals:
                kms:CallerAccount: !Ref AWS::AccountId
                kms:ViaService: !Sub "kinesis.${AWS::Region}.amazonaws.com"
          - Effect: Allow
            Principal:
              AWS: !Sub "arn:aws:iam::${AWS::AccountId}:root"
            Action: "*"
            Resource: "*"
      KeySpec: SYMMETRIC_DEFAULT
      KeyUsage: ENCRYPT_DECRYPT
      Origin: AWS_KMS

The key policy was set with reference to the AWS managed key for Kinesis Data Streams.

The following is the IAM role that Kinesis Data Firehose assumes to read from Kinesis Data Streams.

Resources:
  KinesisStreamSourceRole:
    Type: AWS::IAM::Role
    DeletionPolicy: Delete
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - firehose.amazonaws.com
      Policies:
        - PolicyName: KinesisStreamSourcePolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - kinesis:DescribeStream
                  - kinesis:GetShardIterator
                  - kinesis:GetRecords
                  - kinesis:ListShards
                Resource:
                  - !GetAtt KinesisDataStream.Arn

Note that this IAM role has no KMS permissions.
This is because the key policy shown earlier already allows kms:Decrypt for requests made through Kinesis Data Streams (the kms:ViaService condition).
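
To confirm this from the key side, the key policy itself can be retrieved with boto3; the key ID below is a hypothetical placeholder.

import json

import boto3

kms = boto3.client('kms')

# 'default' is the only key policy name that KMS supports.
response = kms.get_key_policy(
    KeyId='11111111-2222-3333-4444-555555555555',  # hypothetical key ID
    PolicyName='default')

# The policy document is returned as a JSON string.
print(json.dumps(json.loads(response['Policy']), indent=2))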

Kinesis Data Firehose

Resources:
  KinesisFirehoseDeliveryStream:
    Type: AWS::KinesisFirehose::DeliveryStream
    Properties:
      DeliveryStreamName: !Ref KinesisFirehoseDeliveryStreamName
      DeliveryStreamType: KinesisStreamAsSource
      KinesisStreamSourceConfiguration: 
        KinesisStreamARN: !GetAtt KinesisDataStream.Arn
        RoleARN: !GetAtt KinesisStreamSourceRole.Arn
      S3DestinationConfiguration: 
        BucketARN: !Ref BucketArn
        CloudWatchLoggingOptions: 
          Enabled: true
          LogGroupName: !Ref LogGroup
          LogStreamName: !Ref LogStream
        CompressionFormat: UNCOMPRESSED
        EncryptionConfiguration: 
          KMSEncryptionConfig: 
            AWSKMSKeyARN: !Ref S3KeyArn
        Prefix: firehose/
        RoleARN: !GetAtt KinesisS3DestinationRole.Arn

In this configuration, the destination of the Kinesis Data Firehose delivery stream is an S3 bucket.
To encrypt with KMS when delivering data to this bucket, set the EncryptionConfiguration property.
Within it, specify the ARN of the KMS key for S3 in the AWSKMSKeyARN property.
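
The delivery stream's destination-side encryption setting can be verified with boto3; a sketch, assuming a hypothetical delivery stream name.

import boto3

firehose = boto3.client('firehose')

response = firehose.describe_delivery_stream(
    DeliveryStreamName='dva-02-005-FirehoseDeliveryStream')  # hypothetical name

# For an S3 destination, the encryption settings appear under
# S3DestinationDescription.EncryptionConfiguration.
destination = response['DeliveryStreamDescription']['Destinations'][0]
print(destination['S3DestinationDescription']['EncryptionConfiguration'])
# Expected: {'KMSEncryptionConfig': {'AWSKMSKeyARN': '...'}}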

Below is the KMS key for S3.

Resources:
  S3Key:
    Type: AWS::KMS::Key
    Properties:
      Enabled: true
      KeyPolicy:
        Version: 2012-10-17
        Id: !Sub "${Prefix}-s3"
        Statement:
          - Effect: Allow
            Principal:
              AWS: "*"
            Action:
              - kms:Encrypt
              - kms:Decrypt
              - kms:ReEncrypt*
              - kms:GenerateDataKey*
              - kms:DescribeKey
            Resource: "*"
            Condition:
              StringEquals:
                kms:CallerAccount: !Ref AWS::AccountId
                kms:ViaService: !Sub "s3.${AWS::Region}.amazonaws.com"
          - Effect: Allow
            Principal:
              AWS: !Sub "arn:aws:iam::${AWS::AccountId}:root"
            Action: "*"
            Resource: "*"
      KeySpec: SYMMETRIC_DEFAULT
      KeyUsage: ENCRYPT_DECRYPT
      Origin: AWS_KMS

The key policy is almost the same as the one for Kinesis Data Streams; only the kms:ViaService value differs.

The following is the IAM role for Kinesis Data Firehose.

Resources:
  KinesisS3DestinationRole:
    Type: AWS::IAM::Role
    DeletionPolicy: Delete
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sts:AssumeRole
            Principal:
              Service:
                - firehose.amazonaws.com
      Policies:
        - PolicyName: KinesisS3DestinationPolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - s3:AbortMultipartUpload
                  - s3:GetBucketLocation
                  - s3:GetObject
                  - s3:ListBucket
                  - s3:ListBucketMultipartUploads
                  - s3:PutObject
                Resource:
                  - !Ref BucketArn
                  - !Sub "${BucketArn}/*"
              - Effect: Allow
                Action:
                  - logs:PutLogEvents
                Resource:
                  - !GetAtt LogGroup.Arn

This IAM role also has no KMS permissions, because the key policy for the S3 key allows requests made through S3.

(Reference) S3 bucket

Resources:
  Bucket:
    Type: AWS::S3::Bucket
    Properties:
      AccessControl: Private
      BucketName: !Ref Prefix

No SSE-related settings are made on the bucket itself.
That is, the bucket's default encryption is SSE-S3, using S3 managed keys.
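
The bucket default can be confirmed with boto3; the bucket name below is a hypothetical placeholder.

import boto3

s3 = boto3.client('s3')

response = s3.get_bucket_encryption(Bucket='dva-02-005')  # hypothetical bucket name

rule = response['ServerSideEncryptionConfiguration']['Rules'][0]
# Expected: 'AES256', i.e. SSE-S3 with S3 managed keys.
print(rule['ApplyServerSideEncryptionByDefault']['SSEAlgorithm'])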

Architecting

Use CloudFormation to build this environment and check its actual behavior.

Create CloudFormation stacks and check the resources in the stacks

Create CloudFormation stacks.
For information on how to create stacks and check each stack, please see the following page.

Related reading: CloudFormation’s nested stack (how to build an environment with nested CloudFormation stacks)
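
Stacks can also be created from code; a minimal boto3 sketch, where the stack name and template URL are hypothetical placeholders.

import boto3

cloudformation = boto3.client('cloudformation')

# The stack name and template URL are hypothetical placeholders.
cloudformation.create_stack(
    StackName='dva-02-005',
    TemplateURL='https://example-bucket.s3.amazonaws.com/dva-02-005.yaml',
    Capabilities=['CAPABILITY_IAM'])  # required because the stack creates IAM roles

# Wait until stack creation (including nested stacks) completes.
cloudformation.get_waiter('stack_create_complete').wait(StackName='dva-02-005')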

Check the resources created from the management console.

Detail of KMS 01.
Detail of KMS 02.

Two KMS keys have been successfully created.

Check Kinesis Data Streams.

Detail of Kinesis 01.

We can see that SSE with KMS is indeed enabled.
We can also see that the KMS key used is the one created for Kinesis.
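
The same can be checked from code; a boto3 sketch with a hypothetical stream name.

import boto3

kinesis = boto3.client('kinesis')

response = kinesis.describe_stream(
    StreamName='dva-02-005-DataStream')  # hypothetical stream name

description = response['StreamDescription']
# Expected: 'KMS' and the ARN of the key created for Kinesis.
print(description['EncryptionType'], description['KeyId'])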

Check Kinesis Data Firehose.

Detail of Kinesis 02.
Detail of Kinesis 03.

The Kinesis Data Firehose delivery stream has also been created successfully.
Checking the details, we can see that it is configured to deliver the data received from the aforementioned Kinesis Data Stream to the S3 bucket.
We can also see that the data is encrypted with the KMS key for S3 mentioned above when it is delivered to the bucket.

Operation Check

Ready to go.

Execute a Lambda function to generate test data.

Detail of Lambda 01.

The function has been successfully executed and test data has been generated.
This data will be sent to Kinesis Data Streams.
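
The function can also be invoked from code; a sketch assuming a hypothetical function name.

import boto3

lambda_client = boto3.client('lambda')

# The function name is a hypothetical placeholder.
response = lambda_client.invoke(
    FunctionName='dva-02-005-DataSourceLambda',
    InvocationType='RequestResponse')

print(response['StatusCode'])  # 200 on a successful synchronous invocation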

After a short wait, Kinesis monitoring confirms that the data has been sent.

Detail of Kinesis 04.
Detail of Kinesis 05.

You can see that data has been sent to Kinesis Data Streams and Kinesis Data Firehose.

Check the S3 bucket.

Detail of S3 01.

Indeed, an object was created in the S3 bucket.

Detail of S3 02.

The object's encryption settings show that it is encrypted with the KMS key for S3 mentioned earlier.
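
The same can be confirmed with a head_object call; the bucket name and object key below are hypothetical placeholders.

import boto3

s3 = boto3.client('s3')

# The bucket name and object key are hypothetical placeholders.
response = s3.head_object(
    Bucket='dva-02-005',
    Key='firehose/dva-02-005-FirehoseDeliveryStream-1-2024-03-27-10-57-58-2f6bd68e-e997-444a-b231-2c9f4e277df7')

# Expected: 'aws:kms' and the ARN of the S3 key.
print(response['ServerSideEncryption'], response.get('SSEKMSKeyId'))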

Finally, download this file and check its contents.

% cat dva-02-005-FirehoseDeliveryStream-1-2024-03-27-10-57-58-2f6bd68e-e997-444a-b231-2c9f4e277df7 | jq  
{
  "EVENT_TIME": "2024-03-27T10:57:58.105290",
  "TICKER": "INTC",
  "PRICE": 61.97
}
{
  "EVENT_TIME": "2024-03-27T10:57:58.691875",
  "TICKER": "AAPL",
  "PRICE": 99.22
}
{
  "EVENT_TIME": "2024-03-27T10:57:58.725142",
  "TICKER": "MSFT",
  "PRICE": 30.67
}
{
  "EVENT_TIME": "2024-03-27T10:57:58.745062",
  "TICKER": "INTC",
  "PRICE": 12.41
}
{
  "EVENT_TIME": "2024-03-27T10:57:58.765025",
  "TICKER": "INTC",
  "PRICE": 39.63
}
{
  "EVENT_TIME": "2024-03-27T10:57:58.805054",
  "TICKER": "MSFT",
  "PRICE": 56.14
}
{
  "EVENT_TIME": "2024-03-27T10:57:58.825060",
  "TICKER": "AAPL",
  "PRICE": 4.66
}
{
  "EVENT_TIME": "2024-03-27T10:57:58.845055",
  "TICKER": "AAPL",
  "PRICE": 85.08
}
{
  "EVENT_TIME": "2024-03-27T10:57:58.885011",
  "TICKER": "AMZN",
  "PRICE": 70.01
}
{
  "EVENT_TIME": "2024-03-27T10:57:58.905023",
  "TICKER": "INTC",
  "PRICE": 75.95
}

It is indeed test data generated by the Lambda function.

Thus, we have confirmed that KMS keys encrypt both the data put into Kinesis Data Streams and the objects placed in the S3 bucket by Kinesis Data Firehose.

Summary

We have shown how to encrypt data handled by Kinesis Data Streams and Kinesis Data Firehose using KMS keys.