Prerequisites

  • MySQL version ≥ 5.7
  • MySQL binlog enabled on the primary server
  • Connection details including Server ID
  • A database user with sufficient privileges to configure the database, including enabling binary logging and creating users

MySQL Setup

1. Grant Database Access

2. Configure Binary Logging

Binary logging records all changes to your database tables; the Connector relies on MySQL’s implementation of it. Open your MySQL server configuration file (usually /etc/my.cnf) and ensure it contains the following lines in the [mysqld] section. These lines enable ROW-format binary log replication, which the Connector needs to perform incremental updates:
  • Enable the binlog
  • Set a server-id if one is not already set
  • Set log expiry to at least 3 days
[mysqld]
binlog-format=ROW
log-bin=mysql-binlog
server-id=123456789
expire-logs-days=3
log-slave-updates=1
If your configuration already has a server-id entry, you don’t need to change it. Otherwise, choose any number between 1 and 4294967295 as the server-id.
  • Restart your MySQL server to apply these changes.

3. Verify Binary Logs Are Enabled

You can verify using any of these methods:
  • Run the SQL query SHOW VARIABLES LIKE '%log_bin%'; on the DB instance. The result should be ON.
  • Run SHOW BINARY LOGS; and confirm it returns at least one log file.
If you upgrade your database to version 8.0.23 or above, set the binlog_row_metadata parameter to MINIMAL before upgrading. Failing to do so may cause your MySQL connectors to fail.
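For example, you can check the relevant settings in one query (binlog_row_metadata applies to MySQL 8.0+ only):
-- Confirm binary logging is configured as the Connector expects
SHOW VARIABLES WHERE Variable_name IN
  ('log_bin', 'binlog_format', 'binlog_row_metadata');
-- Expect log_bin = ON, binlog_format = ROW,
-- and binlog_row_metadata = MINIMAL on MySQL 8.0.23+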

4. Create Database User

It’s recommended to create a dedicated user for the Connector to access your MySQL database. Below is an example script that does that.
-- Replace { ... } placeholders as required

-- Create user
CREATE USER 'streamkap_user'@'%' IDENTIFIED WITH mysql_native_password BY '{password}';

-- Grant Permissions
GRANT REPLICATION CLIENT, RELOAD, SHOW DATABASES, REPLICATION SLAVE ON *.* TO 'streamkap_user'@'%';

-- Grant Select on all schemas needed
GRANT SELECT ON {schema}.* TO 'streamkap_user'@'%';
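
After running the script, you can confirm the user received the expected privileges:
-- Verify the grants applied to the Connector user
SHOW GRANTS FOR 'streamkap_user'@'%';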

5. Enable Snapshots

To backfill your data, the Connector needs to be able to perform snapshots. See Snapshots & Backfilling for more information. There are two methods available to enable this feature:

Method 1: Enable GTID

This method is recommended if you cannot create and grant Streamkap read/write privileges on a ‘signal’ table (Method 2) for any reason. It’s the equivalent of a ‘read-only’ connection.
GTIDs are only available in MySQL version ≥ 5.6.5
Global transaction identifiers (GTIDs) uniquely identify transactions that occur on a server within a cluster. Though not required, using GTIDs simplifies replication, makes it easier to confirm that primary and replica servers are consistent, and enables incremental snapshots. Follow the setup guide for your version (Replication GTIDs - How To) and ensure that GTID mode is ON.
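To check whether GTID mode is already on, and to enable it online on MySQL 5.7 or later, the sequence looks roughly like the following. This is a condensed sketch of MySQL’s documented procedure; follow the linked guide for the full steps, including persisting the settings in your configuration file.
-- Check the current GTID status
SHOW VARIABLES LIKE 'gtid_mode';
SHOW VARIABLES LIKE 'enforce_gtid_consistency';

-- Enable online, one step at a time (MySQL 5.7+)
SET @@GLOBAL.enforce_gtid_consistency = WARN;  -- watch the error log for violations
SET @@GLOBAL.enforce_gtid_consistency = ON;
SET @@GLOBAL.gtid_mode = OFF_PERMISSIVE;
SET @@GLOBAL.gtid_mode = ON_PERMISSIVE;
SET @@GLOBAL.gtid_mode = ON;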

Method 2: Create a table in the source database

If you cannot enable GTID mode, you will need to create a signal table and grant permissions to streamkap_user. The Connector will use this table to manage snapshots.
Please create the signal table with the name streamkap_signal in a new schema called streamkap. It will not be recognised if given another name.
-- Create the schema
CREATE SCHEMA streamkap;

CREATE TABLE streamkap.streamkap_signal (
  id VARCHAR(255) PRIMARY KEY,
  type VARCHAR(32) NOT NULL,
  data VARCHAR(2000) NULL
);

GRANT SELECT, UPDATE, INSERT ON streamkap.streamkap_signal TO 'streamkap_user';
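
You don’t write to this table yourself during normal operation; Streamkap inserts signal rows when snapshots are triggered. For reference, a Debezium-style ad-hoc snapshot signal looks roughly like the following (the id value and table list are hypothetical):
-- Illustrative only: Streamkap manages these rows for you
INSERT INTO streamkap.streamkap_signal (id, type, data)
VALUES ('ad-hoc-1', 'execute-snapshot',
        '{"data-collections": ["{schema}.{table}"]}');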

6. Heartbeats

Heartbeats ensure the Connector stays active and continues capturing changes, which is especially important for low-traffic or intermittent databases. Connectors use “offsets” (like bookmarks) to track their position in the database’s log or change stream. When no changes occur for long periods, these offsets may become outdated, and the Connector might lose its place or stop capturing changes. There are two layers of heartbeat protection:

Layer 1: Connector heartbeats (enabled by default). The Connector periodically emits heartbeat messages to an internal topic, even when no actual data changes are detected. This keeps offsets fresh and prevents staleness. No configuration is necessary for this layer; it is enabled automatically.

Layer 2: Source database heartbeats (requires configuration). For read-write connections (when Read only is No during Streamkap Setup, if applicable), you can configure the Connector to update a dedicated heartbeat table in the source database at regular intervals. This simulates activity, ensuring change events are generated consistently, maintaining log progress and preventing staleness. It’s especially useful for databases like PostgreSQL, where a lack of regular changes can cause the write-ahead log (WAL) to grow rapidly during inactivity. This layer requires you to set up a heartbeat table in your source database.

Configure source database heartbeats

-- Create the streamkap schema
CREATE SCHEMA IF NOT EXISTS streamkap;

-- Switch to the streamkap schema
USE streamkap;

-- Create the heartbeat table with id, text, and last_update fields
CREATE TABLE streamkap_heartbeat (
    id INT AUTO_INCREMENT PRIMARY KEY,
    text TEXT,
    last_update TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Grant permission to the Streamkap user
GRANT SELECT, INSERT, UPDATE ON streamkap.streamkap_heartbeat TO 'streamkap_user';

-- Insert the first row into the heartbeat table
INSERT INTO streamkap_heartbeat (text) VALUES ('test_heartbeat');
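
Once enabled, the Connector keeps this table active with small periodic writes. Conceptually, each heartbeat resembles the following (the actual statement and interval are managed by Streamkap):
-- Illustrative only: issued by the Connector on a schedule
UPDATE streamkap.streamkap_heartbeat
SET text = 'heartbeat', last_update = CURRENT_TIMESTAMP
WHERE id = 1;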

Streamkap Setup

Follow these steps to configure your new connector:

1. Create the Source

2. Connection Settings

  • Name: Enter a name for your connector.
  • Hostname: Specify the hostname.
  • Port: Default is 3306.
  • Connect via SSH Tunnel: The Connector will connect to an SSH server in your network which has access to your database. This is necessary if the Connector cannot connect directly to your database.
  • Username: Username to access the database. By default, Streamkap scripts use streamkap_user.
  • Password: Password to access the database.
  • Heartbeats: Crucial for low and intermittent traffic databases. Enabled by default.
  • Connection Timezone: The timezone of your database.
Timezone conversion
MySQL converts TIMESTAMP values from the current time zone to UTC for storage, and back from UTC to the current time zone for retrieval. By default, the current time zone for each connection is the database server’s time zone, but this option allows you to override that. As long as the time zones remain the same, you get back the same value you store.
We recommend using the default SERVER option, which attempts to detect the session time zone from the values configured in the MySQL server session variables ‘time_zone’ and ‘system_time_zone’. It also reduces the chance of problems with daylight savings adjustments (‘fall back’ and ‘spring forward’).
If either time zone changes, an ad-hoc snapshot is recommended so your source and destination timestamps are consistent.
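To see which values the SERVER option would detect, you can inspect the variables directly:
-- Check global, session, and system time zone settings
SELECT @@global.time_zone, @@session.time_zone, @@system_time_zone;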

3. Snapshot Settings

  • Use GTID: If your database is using GTID, leave this as ‘Yes’. See Enable GTID for more information.
    • If ‘No’, please ensure you create the signal table as described in Method 2.
  • Signal Table Database: Streamkap will use a table in this database to manage snapshots. See Enable Snapshots for more information.

4. Advanced Parameters

  • Represent binary data as: Specifies how the data in binary columns (e.g. BLOB, BINARY, VARBINARY) should be interpreted. Your destination for this data can impact which option you choose. Default is bytes.
  • Capture Only Captured Databases DDL: Specifies whether the connector records schema structures from all logical databases in the database instance or only captured databases. Enabling this when you have many databases in your instance can improve performance and avoid timeouts. Default is false. See Schema History Optimization for details.
  • Capture Only Captured Tables DDL: Specifies whether the connector records schema structures from all logical tables in the captured schemas or databases, or only captured tables. Enabling this when you have many tables can improve performance and avoid timeouts. Default is false. See Schema History Optimization for details.
Click Next.

5. Schema and Table Capture

  • Add Schemas/Tables: Specify the schema(s) and table(s) for capture.
    • You can bulk upload here. The format is a simple list of schemas and tables, with each entry on a new row. Save as a .csv file without a header.
CDC only captures base tables, not Views
Change Data Capture reads the MySQL binary log (binlog), which only records changes to physical tables. Database Views are query-time computations with no physical storage, so they don’t generate binlog entries.
What you cannot capture: Views, temporary tables, MEMORY tables, BLACKHOLE tables, or system tables (information_schema, performance_schema).
Solution: Specify only the underlying base tables that feed your views. You can recreate the view logic in your destination or transformation layer.
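If you’re unsure which base tables feed a view, you can inspect its definition (schema and view names are placeholders):
-- Show the SELECT statement behind a view to find its base tables
SHOW CREATE VIEW {schema}.{view_name};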
Click Save.
Have questions? See the MySQL Source FAQ for answers to common questions about MySQL sources, troubleshooting, and best practices.