Prerequisites
- MySQL version ≥ 5.7
- MySQL binlog enabled
- A database user with sufficient privileges to configure the database, including enabling binary logging and creating users
MySQL Setup
1. Grant Database Access
- Configure one of the Connection Options to ensure Streamkap can reach your database.
2. Configure Binary Logging
Binary logging records all changes to your database tables. The Connector relies on MySQL’s implementation of this.
- Open the Amazon RDS console at https://console.aws.amazon.com/rds/.
- In the navigation pane, choose Parameter groups.
Default parameter groups can’t be modified. If the DB instance is using a default parameter group, create a new one:
- Choose Create parameter group.
- Enter a Parameter group name and Description.
- For Engine type, choose your database engine.
- For Parameter group family, choose a DB parameter group family.
- Choose Create.
- Select the parameter group to edit.
- Choose Edit from Parameter group actions.
- Set `binlog_format` to `ROW`.
- Set `binlog_row_image` to `Full`.
- Choose Save changes.
- In the navigation pane, choose Databases and select the target DB instance.
- Choose Modify.
- Change the DB parameter group setting.
- Choose Continue and review modifications.
- On the confirmation page, choose Modify DB Instance.
Configuring RDS for MySQL binary logging
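Once the new parameter group is attached to the instance, you can confirm the settings took effect by querying the server directly from any MySQL client. A quick check:

```sql
-- Both values should reflect the parameter group:
-- binlog_format = ROW, binlog_row_image = FULL
SHOW GLOBAL VARIABLES LIKE 'binlog_format';
SHOW GLOBAL VARIABLES LIKE 'binlog_row_image';
```

If either value is unchanged, the instance may still need a reboot for the parameter group change to apply.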
3. Set Binary Log Retention Period
- Connect to your master database with your SQL tool.
- View current settings with `CALL mysql.rds_show_configuration;`
- If the retention period is less than 24 hours or NULL, run `CALL mysql.rds_set_configuration('binlog retention hours', 72);`
4. Verify Binary Logs Are Enabled
You can verify using any of these methods:
- Check the parameter group for the DB instance and confirm the `log_bin` parameter is `ON`.
- Run `SHOW VARIABLES LIKE '%log_bin%';` on the DB instance. The result should be `ON`.
- Run `SHOW BINARY LOGS;`
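The SQL checks from step 4 can be run together in a single client session:

```sql
-- Should report log_bin = ON (plus related variables such as
-- log_bin_basename with the binlog file path).
SHOW VARIABLES LIKE '%log_bin%';

-- Lists the binary log files currently on the server; an error or
-- empty result here means binary logging is not active.
SHOW BINARY LOGS;
```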
5. Create Database User
It’s recommended to create a separate user and role for the Connector to access your MySQL database. Below is an example script that does that.
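A sketch of such a script, assuming the user is named `streamkap_user` (the name Streamkap’s defaults expect) and granting the privileges a MySQL CDC connector typically needs; replace the password placeholder before running:

```sql
-- Create a dedicated user for the Connector.
CREATE USER 'streamkap_user'@'%' IDENTIFIED BY '<password>';

-- Privileges typically required for change data capture:
-- SELECT for snapshots, REPLICATION SLAVE/CLIENT to read the binlog,
-- RELOAD and SHOW DATABASES for consistent snapshot setup.
GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT
  ON *.* TO 'streamkap_user'@'%';

FLUSH PRIVILEGES;
```

Scope the grants more narrowly if your security policy requires it; the Connector only needs `SELECT` on the databases it captures.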
6. Enable Snapshots
To backfill your data, the Connector needs to be able to perform snapshots. See Snapshots & Backfilling for more information. To enable this feature, there are two methods available.

Method 1: Enable GTID (Recommended)
This method is recommended if you cannot create a ‘signal’ table and grant Streamkap read/write privileges on it (Method 2) for any reason. It is the equivalent of a ‘read-only’ connection.
Ensure GTID mode is ON.
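On Amazon RDS, GTID is controlled through the DB parameter group (the same Edit flow as step 2): set `gtid_mode` to `ON` and `enforce_gtid_consistency` to `ON`. You can then verify from a SQL client:

```sql
-- Both should return ON once the parameter group change has applied.
SELECT @@gtid_mode, @@enforce_gtid_consistency;
```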
Method 2: Create a table in the source database
If you cannot enable GTID mode, you will need to create the table and grant permissions on it to the `streamkap_user`. The Connector will use this table for managing snapshots.
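A sketch of the signal table, following the common CDC signal-table convention; the database name `streamkap` and the column sizes are assumptions, so use the database you will later select as the Signal Table Database:

```sql
-- Table the Connector reads and writes to coordinate snapshots.
CREATE TABLE streamkap.streamkap_signal (
  id   VARCHAR(42) PRIMARY KEY,  -- unique id for each signal
  type VARCHAR(32) NOT NULL,     -- signal type, e.g. a snapshot request
  data VARCHAR(2048)             -- signal payload (JSON)
);

-- The Connector needs read/write access to the table.
GRANT SELECT, INSERT, UPDATE, DELETE
  ON streamkap.streamkap_signal TO 'streamkap_user'@'%';
```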
7. Heartbeats
Heartbeats ensure the Connector stays active and continues capturing changes, which is especially important for low-traffic or intermittent databases. Connectors use “offsets” (like bookmarks) to track their position in the database’s log or change stream. When no changes occur for long periods, these offsets may become outdated, and the Connector might lose its place or stop capturing changes. There are two layers of heartbeat protection:

Layer 1: Connector heartbeats (enabled by default)
The Connector periodically emits heartbeat messages to an internal topic, even when no actual data changes are detected. This keeps offsets fresh and prevents staleness. No configuration is necessary for this layer; it is enabled automatically.

Layer 2: Source database heartbeats (requires configuration)
For read-write connections (when Read only is No during Streamkap Setup, if applicable), you can configure the Connector to update a dedicated heartbeat table in the source database at regular intervals. This simulates activity, ensuring change events are generated consistently, maintaining log progress and preventing staleness. It is especially useful for databases like PostgreSQL, where a lack of regular changes can cause the write-ahead log (WAL) to grow rapidly during inactivity. This layer requires you to set up a heartbeat table in your source database. See Configure source database heartbeats.
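A Layer 2 heartbeat table might be sketched as follows; the table name `streamkap_heartbeat` and its shape are assumptions, so follow Configure source database heartbeats for the exact definition Streamkap expects:

```sql
-- Minimal heartbeat table: a single row the Connector updates
-- periodically so that change events keep flowing through the binlog.
CREATE TABLE streamkap.streamkap_heartbeat (
  id        INT PRIMARY KEY,
  last_beat TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
                      ON UPDATE CURRENT_TIMESTAMP
);
INSERT INTO streamkap.streamkap_heartbeat (id) VALUES (1);

-- The Connector needs permission to update the table.
GRANT SELECT, INSERT, UPDATE
  ON streamkap.streamkap_heartbeat TO 'streamkap_user'@'%';
```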
Streamkap Setup
Follow these steps to configure your new connector:

1. Create the Source
- Navigate to Add Connectors.
- Choose MySQL.
2. Connection Settings
- Name: Enter a name for your connector.
- Hostname: Specify the hostname.
- Port: Default is `3306`.
- Connect via SSH Tunnel: The Connector will connect to an SSH server in your network which has access to your database. This is necessary if the Connector cannot connect directly to your database.
- See SSH Tunnel for setup instructions.
- Username: Username to access the database. By default, Streamkap scripts use `streamkap_user`.
- Password: Password to access the database.
- Heartbeats: Crucial for low and intermittent traffic databases. Enabled by default.
- See Heartbeats for setup instructions.
- Connection Timezone: The timezone of your database.
Timezone conversion
MySQL converts `TIMESTAMP` values from the current time zone to UTC for storage, and back from UTC to the current time zone for retrieval. By default, the current time zone for each connection is the database server’s time zone, but this option allows you to override that. As long as the time zones remain the same, you get back the same value you store.

We recommend using the default SERVER option, which attempts to detect the session time zone from the MySQL server session variables `time_zone` and `system_time_zone`. It also reduces the chance of problems with daylight saving ‘fall back’ and ‘spring forward’ adjustments.

If either time zone changes, an ad-hoc snapshot is recommended so your source and destination timestamps are consistent.

3. Snapshot Settings
- Use GTID: If your database is using GTID, leave this as ‘Yes’. See Enable GTID for more information.
- If ‘No’, please ensure you create the signal table as described in Method 2.
- Signal Table Database: Streamkap will use a table in this database to manage snapshots. See Enable Snapshots for more information.
4. Advanced Parameters
- Represent binary data as: Specifies how the data for binary columns (e.g. `blob`, `binary`, `varbinary`) should be interpreted. Your destination for this data can impact which option you choose. Default is `bytes`.
- Capture Only Captured Databases DDL: Controls whether the connector records schema structures from all databases on the server (the default) or only those databases for which you’ve explicitly configured the connector. Specify `true` to capture schema history only for the specific databases you’ve configured. This is particularly valuable when databases are large, as it reduces the volume of DDL stored in the schema history topic and improves startup times when the connector restarts or recovers from failures. Default is `false`. See Schema History Optimization for details.
- Capture Only Captured Tables DDL: Controls whether the connector records the schema structure for all tables in the configured databases (the default) or only the tables whose changes the connector captures. Specify `true` to capture schema history only for the specific tables you’ve configured. This is particularly valuable when tables are large, as it reduces the volume of DDL statements stored in the schema history topic and improves startup times when the connector restarts or recovers from failures. Default is `false`. See Schema History Optimization for details.
5. Schema and Table Capture
- Add Schemas/Tables: Specify the schema(s) and table(s) for capture.
- You can bulk upload here. The format is a simple list of schemas and tables, with each entry on a new row. Save as a `.csv` file without a header.
Have questions? See the MySQL Source FAQ for answers to common questions about MySQL sources, troubleshooting, and best practices.