DynamoDB

Prerequisites

  • DynamoDB tables with streams enabled, using the "New and old images" view type
  • For capturing historic (older than 24 hours) data:
    • An S3 bucket (for the data exports)
    • Point-in-time recovery (PITR) enabled on DynamoDB tables

DynamoDB Setup

A DynamoDB stream is an ordered flow of information about changes to items in a DynamoDB table. When you enable a stream on a table, DynamoDB captures information about every modification to data items in the table and stores this information in a log for up to 24 hours.

The easiest way to manage DynamoDB Streams is by using the AWS Management Console.

  1. Sign in to the AWS Management Console and open the DynamoDB console
  2. On the DynamoDB console dashboard, choose Tables and select an existing table
  3. Choose the Exports and streams tab
  4. In the DynamoDB stream details section, choose Turn on
    1. Choose the information that will be written to the stream whenever the data in the table is modified. From the list, select New and old images (required)
    2. Choose Turn on stream
  5. Repeat the above steps for every table to be captured by Streamkap; if you have many tables, see the scripted sketch below
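The console works well for a handful of tables. For many tables, the same change can be scripted; below is a minimal sketch using boto3 (Python). The table names are placeholders, and AWS credentials and a default region are assumed to be configured in your environment.

import boto3

# Assumes AWS credentials and a default region are configured in the environment.
dynamodb = boto3.client("dynamodb")

# Placeholder table names; replace with the tables Streamkap should capture.
for table_name in ["orders", "customers"]:
    # Turn on the stream with both new and old images, as Streamkap requires.
    # Note: a table that already has an enabled stream raises a ValidationException.
    dynamodb.update_table(
        TableName=table_name,
        StreamSpecification={
            "StreamEnabled": True,
            "StreamViewType": "NEW_AND_OLD_IMAGES",
        },
    )
    print(f"Enabled NEW_AND_OLD_IMAGES stream on {table_name}")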

📘

See Change data capture for DynamoDB Streams for more information.

Capturing historic data

Amazon DynamoDB point-in-time recovery (PITR) provides automatic backups of your DynamoDB table data. This is required by Streamkap for capturing data older than 24 hours.

  1. Navigate to the DynamoDB console
  2. Choose Tables from the left navigation and select an existing table
  3. From the Backups tab, for the Point-in-time recovery (PITR) option, choose Edit
  4. Choose Turn on point-in-time recovery and then Save changes
  5. Repeat the above steps for every table where historic data is to be captured by Streamkap; a scripted sketch follows these steps
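Like streams, PITR can also be enabled from a script. A minimal boto3 sketch, again with placeholder table names:

import boto3

dynamodb = boto3.client("dynamodb")

# Placeholder table names; replace with every table whose historic data
# Streamkap should capture.
for table_name in ["orders", "customers"]:
    dynamodb.update_continuous_backups(
        TableName=table_name,
        PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
    )
    print(f"Enabled point-in-time recovery on {table_name}")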

📘

See Point-in-time backups: how it works for more information.

S3 exports

An S3 bucket is used in conjunction with point-in-time recovery to export data older than 24 hours.

You can export to an existing Amazon S3 bucket you have permission to write to. The destination bucket doesn't need to be in the same AWS Region as the source table or have the same owner. However, it's recommended to create a new bucket for Streamkap to use when snapshotting historic data.

Your AWS Identity and Access Management (IAM) policy needs to allow you to perform the S3 actions (s3:AbortMultipartUpload, s3:PutObject, and s3:PutObjectAcl) and the DynamoDB export action (dynamodb:ExportTableToPointInTime).

Here's a sample policy that grants your user permission to perform exports to an S3 bucket:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowDynamoDBExportAction",
            "Effect": "Allow",
            "Action": "dynamodb:ExportTableToPointInTime",
            "Resource": "arn:aws:dynamodb:us-east-1:111122223333:table/my-table"
        },
        {
            "Sid": "AllowWriteToDestinationBucket",
            "Effect": "Allow",
            "Action": [
                "s3:AbortMultipartUpload",
                "s3:PutObject",
                "s3:PutObjectAcl"
            ],
            "Resource": "arn:aws:s3:::your-bucket/*"
        }
    ]
}
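To confirm the permissions are in place before handing the credentials to Streamkap, you can trigger a one-off export yourself. A minimal boto3 sketch; the table ARN and bucket name below are placeholders and must match the resources in your policy, and the table must already have PITR enabled:

import boto3

dynamodb = boto3.client("dynamodb")

# Placeholder table ARN and bucket; use the values from your own policy.
response = dynamodb.export_table_to_point_in_time(
    TableArn="arn:aws:dynamodb:us-east-1:111122223333:table/my-table",
    S3Bucket="your-bucket",
    ExportFormat="DYNAMODB_JSON",
)
export_arn = response["ExportDescription"]["ExportArn"]

# Exports run asynchronously; re-run describe_export until the status is
# COMPLETED (or FAILED).
status = dynamodb.describe_export(ExportArn=export_arn)["ExportDescription"]["ExportStatus"]
print(export_arn, status)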

If you need to write to an S3 bucket in another account, or one you don't have permission to write to, the bucket owner must add a bucket policy that allows you to export from DynamoDB to that bucket. Here's an example policy on the target S3 bucket:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ExampleStatement",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789012:user/{name}"
            },
            "Action": [
                "s3:AbortMultipartUpload",
                "s3:PutObject",
                "s3:PutObjectAcl"
            ],
            "Resource": "arn:aws:s3:::awsexamplebucket1/*"
        }
    ]
}
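If the bucket owner prefers to apply that policy from a script rather than the console, a minimal boto3 sketch follows. The bucket name and principal are placeholders, and the call must be made with the bucket owner's credentials:

import json
import boto3

s3 = boto3.client("s3")

# Placeholder bucket and principal; replace with the real destination bucket
# and the ARN of the user or role that performs the export.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ExampleStatement",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:user/export-user"},
            "Action": ["s3:AbortMultipartUpload", "s3:PutObject", "s3:PutObjectAcl"],
            "Resource": "arn:aws:s3:::awsexamplebucket1/*",
        }
    ],
}

s3.put_bucket_policy(Bucket="awsexamplebucket1", Policy=json.dumps(bucket_policy))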

📘

See DynamoDB data export to Amazon S3: how it works for more information.

Recommended policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "dynamodb:DescribeExport",
                "dynamodb:ExportTableToPointInTime",
                "dynamodb:ListExports",
                "dynamodb:DescribeStream",
                "dynamodb:GetRecords",
                "dynamodb:GetShardIterator",
                "dynamodb:DescribeTable",
				        "dynamodb:Scan",
				        "dynamodb:Query",
				        "dynamodb:GetItem"
            ],
            "Resource": "arn:aws:dynamodb:us-west-2:424842667740:table/<your-table-name>"
        },
        {
            "Sid": "AccessGameScoresStreamOnly",
            "Effect": "Allow",
            "Action": [
                "dynamodb:DescribeStream",
                "dynamodb:GetRecords",
                "dynamodb:GetShardIterator",
                "dynamodb:ListStreams"
            ],
            "Resource": "arn:aws:dynamodb:us-west-2:424842667740:table/<your-table-name>/stream/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:ListBucket",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::<your-bucket-name>",
                "arn:aws:s3:::<your-bucket-name>/*"
            ]
        }
    ]
}

Some permissions, e.g. dynamodb:Scan, dynamodb:Query, and dynamodb:GetItem, are not used by the connector itself; they are needed only for manual troubleshooting and Auto QA tools.
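If you manage IAM from scripts, the recommended policy can be created and attached to the user whose keys you give to Streamkap. A minimal boto3 sketch, assuming the JSON above is saved as streamkap-policy.json and a user named streamkap already exists (both names are placeholders):

import boto3

iam = boto3.client("iam")

# Placeholder file name; the file contains the recommended policy JSON above.
with open("streamkap-policy.json") as f:
    policy_document = f.read()

# Create a managed policy and attach it to the (placeholder) streamkap user.
policy = iam.create_policy(
    PolicyName="streamkap-dynamodb-source",
    PolicyDocument=policy_document,
)
iam.attach_user_policy(
    UserName="streamkap",
    PolicyArn=policy["Policy"]["Arn"],
)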

Streamkap Setup

  • Create a new DynamoDB Source
  • Enter the following information:
    • Name - A unique and memorable name for this Connector
    • AWS Region - The region of the S3 bucket used for historic data exports
    • AWS Access Key ID - The access key ID for the IAM credentials carrying the permissions above
    • AWS Secret Key - The secret access key paired with that access key ID
    • S3 export bucket name - The name of the bucket used for historic data exports
  • Click Next
    • Add Schemas/Tables. You can also bulk upload here; the format is a simple list with one schema or table per row, saved as a CSV file without a header (see the example below)
  • Click Save
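For the bulk upload, the file is simply one schema or table name per row with no header. A hypothetical example (the names are placeholders):

orders
customers
invoices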

The connector will take approximately 1 minute to start processing data.