Prerequisites

  • SQL Server version ≥ 2016 Service Pack 1 Standard or Enterprise Edition
  • A database user with sufficient privileges to configure the database, including enabling CDC, CT and creating users

SQL Server Setup

1. Grant Database Access

2. Enable Change Data Capture

Restores

During restores, CDC will be disabled. All of the related metadata is automatically removed from the database. This applies to snapshot restores, point-in-time restores, and SQL Server Native restores from S3. After performing one of these types of restores, you can re-enable CDC and re-specify tables to track.
Before you can use CDC with your Amazon RDS DB instance, you must enable it on the database. You must have master user privileges to enable CDC in the Amazon RDS DB instance. After CDC is enabled, any user who is db_owner of that database can enable or disable CDC on tables in that database.
EXEC msdb.dbo.rds_cdc_enable_db '{database}';
GO
Microsoft recommends keeping CDC data files separate from your primary database files.
-- Replace { ... } placeholders as required
ALTER DATABASE [{database}] ADD FILEGROUP Streamkap_ChangeTracking;
ALTER DATABASE [{database}] ADD FILE (
	NAME      = Streamkap_ChangeTracking_Data,
	FILENAME  = N'D:\rdsdbdata\DATA\Streamkap_ChangeTracking_Data.ndf'
) TO FILEGROUP Streamkap_ChangeTracking;
GO
After CDC is enabled on the database, for every table you want Streamkap to capture, enable change tracking on that using the script below:
-- Replace { ... } placeholders as required
EXEC sys.sp_cdc_enable_table  
@source_schema        = N'{schema}',
@source_name          = N'{table}',
@role_name            = N'streamkap_role',
@filegroup_name       = N'Streamkap_ChangeTracking',
@supports_net_changes = 0
GO

3. Create Database User

It’s recommended to create a separate user and role for Streamkap to access your SQL Server database. Below is an example script that does that.
-- Replace {database}, {schema} and {password} placeholders as required
USE {database};
GO
CREATE LOGIN streamkap_user WITH PASSWORD = '{password}';
CREATE USER streamkap_user FOR LOGIN streamkap_user;

CREATE ROLE streamkap_role;
ALTER ROLE streamkap_role ADD MEMBER streamkap_user;

GRANT SELECT ON SCHEMA::{schema} TO streamkap_role;
GRANT SELECT ON SCHEMA::cdc TO streamkap_role;
GO

USE master;
GO
GRANT VIEW SERVER STATE TO streamkap_user;
GO

4. Enable Snapshots

To backfill your data, the Connector needs to be able to perform snapshots (See Snapshots & Backfilling for more information). To enable this process, a table must be created for the Connector to use.
Please create the signal table with the name streamkap_signal. It will not be recognised if given another name.
CREATE SCHEMA streamkap;

-- Create the table within the schema
CREATE TABLE streamkap.streamkap_signal (
  id VARCHAR(255) PRIMARY KEY, 
  type VARCHAR(32) NOT NULL, 
  data VARCHAR(2000) NULL
);

-- Enable change tracking on the signal table
EXEC sys.sp_cdc_enable_table  
@source_schema        = N'streamkap',
@source_name          = N'streamkap_signal',
@role_name            = N'streamkap_role',
@filegroup_name       = N'Streamkap_ChangeTracking',
@supports_net_changes = 0
GO

-- Grant permission on the signal table to the Streamkap role
GRANT SELECT, UPDATE, INSERT ON streamkap.streamkap_signal TO streamkap_role;

Streamkap Setup

Follow these steps to configure your new connector:

1. Create the Source

2. Connection Settings

  • Name: Enter a name for your connector.
  • Endpoint: Specify the endpoint of the SQL Server database.
  • Port: Default is 1433.
  • Connect via SSH Tunnel: The Connector will connect to an SSH server in your network which has access to your database. This is necessary if the Connector cannot connect directly to your database.
  • Username: Username to access the database. By default, Streamkap scripts use streamkap_user.
  • Password: Password to access the database.
  • Source Database: Specify the database to stream data from.
  • Heartbeats: Crucial for low and intermittent traffic databases. Enabled by default.
    • Heartbeat Table Schema: If configured, this allows the connector to receive and acknowledge changes from low and intermittent traffic databases and prevent offsets from become stale.

3. Snapshot Settings

  • Signal Table Schema: The Connector will use a table in this schema to manage snapshots. See Enable Snapshots for setup instructions.

4. Advanced Parameters

  • Represent binary data as: Specifies how the data for binary columns e.g. blob, binary, varbinary should be interpreted. Your destination for this data can impact which option you choose. Default is bytes.
Click Next.

5. Schema and Table Capture

  • Add Schemas/Tables: Specify the schema(s) and table(s) for capture
    • You can bulk upload here. The format is a simple list of schemas and tables, with each entry on a new row. Save as a .csv file without a header.
Click Save.

Troubleshooting