MariaDB (Generic)

MariaDB Change Data Capture Setup with Streamkap

Prerequisites

  • MariaDB version ≥ 11.4.3
  • MariaDB binlog enabled on the primary server
  • Connection details
  • Streamkap user and role

Granting Privileges

It's recommended to create a separate user and role for Streamkap to access your MariaDB database. Below is an example script that does that.

-- Replace { ... } placeholders as required

-- Create user
CREATE USER streamkap_user@'%' IDENTIFIED BY 'password';

-- Grant permissions
GRANT REPLICATION CLIENT, RELOAD, SHOW DATABASES, REPLICATION SLAVE ON *.* TO streamkap_user@'%';

-- Grant select on all schemas needed
GRANT SELECT ON {schema}.* TO 'streamkap_user';

Enable Snapshots

You can perform ad-hoc snapshots of all or some of your tables in the Streamkap app. See Snapshots & Backfilling for more information.

To enable this feature, there are 2 methods available for MariaDB databases.

Method 1: Enable GTID (default)

Global transaction identifiers (GTIDs) uniquely identify transactions that occur on a server within a cluster. Though not required, using GTIDs simplifies replication and enables you to more easily confirm if primary and replica servers are consistent as well as carry out incremental snapshots.

For MariaDB, this is enabled by default, no additional setup is necessary.

Method 2: Create a table in the source database

If for some reason you have disabled GTIDs and cannot enable them, you will need to create the table and give permissions to the streamkap_user. Streamkap will use this collection for managing snapshots.

❗️

Please create the signal table with the name streamkap_signal. It will not be recognised if given another name.

-- Create the schema
CREATE SCHEMA streamkap;

CREATE TABLE streamkap.streamkap_signal (
  id VARCHAR(255) PRIMARY KEY, 
  type VARCHAR(32) NOT NULL, 
  data VARCHAR(2000) NULL
);

GRANT SELECT, UPDATE, INSERT ON streamkap.streamkap_signal TO streamkap_user;

Update Server Configuration File

Open a connection to your MariaDB database's server. Access your MariaDB server configuration file (usually /etc/my.cnf). These lines enable ROW format binary log replication which Streamkap needs to perform incremental updates.

  • Enable binary logging
  • Set a unique server-id number if not set already. If your configuration already has a server-id entry, you don't need to change it. Otherwise, choose a number between 1 and 4294967295 as the server-id.
  • Set a minimum 3 days for log expiry (default is 30 days)
binlog_format=ROW
binlog_row_image=FULL
log_bin=mariadb-binlog  
server-id=123456789
binlog_expire_logs_seconds=259200
  • Restart your MariaDB server for these changes to take effect

Validate binlog row value options

To enable the connector to consume UPDATE events, this variable must be set to a value other than PARTIAL_JSON.

  • Check current variable value: show global variables where variable_name = 'binlog_row_value_options';
  • If the value of the variable is set to PARTIAL_JSON, run the following command to unset it: set @@global.binlog_row_value_options="";

Verify binary logs are enabled

You can either:

  • Run the following SQL query on the DB instance SHOW VARIABLES LIKE '%log_bin%';. Result should be ON
  • Run SHOW BINARY LOGS

Consider Access Restrictions

Setup MariaDB Connector in Streamkap

  • Go to Sources and click Create New
  • Input
    • Name for your Connector
    • Hostname
    • Port (Default 3306)
    • Username (Username you chose earlier, our scripts use streamkap_user)
    • Password
    • Heartbeat - Required for low volume connectors. See MariaDB Heartbeats
    • Connection Timezone - The timezone of your database
    • 📘

      Timezone conversion

      MariaDB converts TIMESTAMP values from the current time zone to UTC for storage, and back from UTC to the current time zone for retrieval. By default, the current time zone for each connection is the database server's time zone but this option allows you to override that.

      As long as the time zones remain the same, you get back the same value you store.

      We recommend using the default SERVER option which attempts to detect the session time zone from the values configured on the MariaDB server session variables 'time_zone' or 'system_time_zone'. It also reduces the chance of problems with daylight savings adjustment 'fall back' and 'spring forward'.

      If either time zones change, an ad-hoc snapshot is recommended so your source and destination timestamps are consistent.

    • Use GTID
      • MariaDB enables this by default. See Enable GTID for more information.
      • If you have disabled GTID mode and cannot enable it, please ensure you create the signal table as described here
        • Signal Table Database: Streamkap will use a table in this schema to manage snapshots e.g. streamkap. See Enable Snapshots for more information
  • Connect via SSH Tunnel. See SSH Tunnel
  • Advanced Parameters
    • Represent Binary Data As (Default bytes)
  • Add Schemas/Tables. Can also bulk upload here. The format is a simple list of each schema or table per row saved in csv format without a header.
  • Click Save
    The connector will take approximately 1 minute to start processing data.