Amazon RDS Aurora MySQL
MySQL Change Data Capture Setup on Amazon RDS Aurora with Streamkap
Prerequisites
- MySQL version ≥ 5.7
- MySQL binlog enabled
- Streamkap user and role
Granting Privileges
It's recommended to create a separate user and role for Streamkap to access your MySQL database. Below is an example script that does that.
-- Replace { ... } placeholders as required
-- Identify version
SHOW VARIABLES LIKE 'VERSION';
--On MySQL version 5.6 to 8.0
CREATE USER streamkap_user@'%' IDENTIFIED BY '{password}';
--On MySQL version 8.0+
CREATE USER streamkap_user@'%' IDENTIFIED WITH mysql_native_password BY '{password}';
--Grant Permissions
GRANT REPLICATION CLIENT, RELOAD, SHOW DATABASES, REPLICATION SLAVE ON *.* TO streamkap_user@'%';
--Grant Select on all schemas needed
GRANT SELECT ON {schema}.* TO 'streamkap_user';
Enable Snapshots
You can perform ad-hoc snapshots of all or some of your tables in the Streamkap app. See Snapshots & Backfilling for more information.
To enable this feature, there are 2 methods available for MySQL databases.
Method 1: Enable GTID (Recommended)
GTIDs only available in MySQL version ≥ 5.6.5
Global transaction identifiers (GTIDs) uniquely identify transactions that occur on a server within a cluster. Though not required, using GTIDs simplifies replication and enables you to more easily confirm if primary and replica servers are consistent as well as carry out incremental snapshots.
Set up following these instructions. Ensure you follow the guide for your version: MySQL Replication GTID - Configuring Aurora
This method is recommended if you cannot create and grant Streamkap read/write privileges on a 'signal' table (method 2) for any reason. It's the equivalent of a 'read only' connection.
Method 2: Create a table in the source database
If you cannot enable GTID mode, you will need to create the table and give permissions to the streamkap_user
. Streamkap will use this collection for managing snapshots.
This table can exist in a different database (on the same MySQL instance) to the database Streamkap captures data from.
Please create the signal table with the name
streamkap_signal
. It will not be recognised if given another name.
CREATE TABLE streamkap_signal (
id VARCHAR(255) PRIMARY KEY,
type VARCHAR(32) NOT NULL,
data VARCHAR(2000) NULL
);
GRANT SELECT, UPDATE, INSERT ON streamkap_signal TO streamkap_user;
Configure binary logging
- Open the Amazon RDS console at https://console.aws.amazon.com/rds/
- In the navigation pane, choose Parameter groups
- Choose the parameter group used by the DB instance you want to modify
- You can't modify a default parameter group. If the DB instance is using a default parameter group, create a new parameter group and associate it with the DB instance
- From Parameter group actions, choose Edit
- Set the
binlog_format
parameter to the binary logging format ofROW
- Set the
binlog_row_image
parameter toFull
- Choose Save changes to save the updates to the DB parameter group
Set binary log retention period
- Connect to your master database with your SQL tool.
- View current settings with
CALL mysql.rds_show_configuration;
- If less than 24 hours or null run
CALL mysql.rds_set_configuration('binlog retention hours', 72);
Verify binary logs are enabled
You can either:
- Check the parameter group for the DB instance and that
log_bin
parameter isON
- Run the following SQL query on the DB instance
SHOW VARIABLES LIKE '%log_bin%';
. Result should beON
- Run
SHOW BINARY LOGS
Consider Access Restrictions
- Visit Connection Options to ensure Streamkap can reach your database
Setup MySQL Connector in Streamkap
- Go to Sources and click Create New
- Input
- Name for your Connector
- Hostname
- Port (Default
3306
) - Username (Username you chose earlier, our scripts use
streamkap_user
) - Password
- Timezone - The timezone of your database
- Use GTID
- If your database is using GTID, leave this as 'Yes'. See Enable GTID for more information.
- If 'No', please ensure you create the signal table as described here
- Signal Table Database: Streamkap will use a table in this database to manage snapshots e.g.
public
. See Enable Snapshots for more information
- Signal Table Database: Streamkap will use a table in this database to manage snapshots e.g.
- Advanced Parameters
- Snapshot Mode (Default
When Needed
) See MySQL Snapshot Modes for more information - Represent Binary Data As (Default
bytes
) - Snapshot Chunk Size (Default
1024
) - This is the number of rows read at a time when snapshotting. This is a low safe value. As a guide, if you have 100m + rows of data you may want to move this to5120
. If you have 1bn then a higher number still will allow you to backfill faster. - Max Batch Size (Default
2048
) - A value that specifies the maximum size of each batch of events that the connector processes. Only increase if experiencing lag
- Snapshot Mode (Default
- Add Schemas/Tables. Can also bulk upload here. The format is a simple list of each schema or table per row saved in csv format without a header.
- Click Save
The connector will take approximately 1 minute to start processing data.
Updated 20 days ago