This FAQ focuses on using MySQL as a source in Streamkap, including general self-hosted setups and cloud variants (AWS RDS/Aurora, Google Cloud SQL, Azure Database for MySQL). Streamkap’s MySQL connector provides real-time CDC with managed features like automatic scaling, UI setup, and ETL transformations.
A MySQL source in Streamkap enables real-time Change Data Capture (CDC) from MySQL databases, capturing row-level inserts, updates, and deletes with sub-second latency. It reads the binary log (binlog) to stream changes to destinations, supporting snapshots for initial loads and automatic handling of schema changes. Streamkap offers a serverless setup via UI or API.

What MySQL versions are supported as sources? Streamkap supports MySQL 5.7+ for basic CDC; 8.0+ for advanced features such as GTIDs and query log events; and 8.4+ for the latest enhancements. MySQL 5.6.5+ is compatible in limited modes.
Streamkap supports:
  • Self-hosted/Generic (on-prem/VM)
  • AWS RDS MySQL (including Aurora)
  • Google Cloud SQL for MySQL
  • Azure Database for MySQL
  • Any other MySQL-compatible platform
Streamkap also supports standalone, primary/replica, high availability clusters, and multi-primary topologies.
  • CDC: Binlog-based for inserts/updates/deletes; captures schema changes
  • Snapshots: Ad-hoc/initial using global/table locks; incremental or blocking
  • Schema Evolution: Automatic DDL tracking; emits change events
  • Data Types: Standard mappings (numerics, temporals, strings, binary configurable, JSON, ENUM/SET); temporal/decimal/binary handling modes
  • Ingestion Modes: Inserts (append) or upserts
  • Security: SSL, authentication
  • Monitoring: Latency, lag, binlog metrics
  • Heartbeats: For low-traffic databases to advance binlog position via a heartbeat table
  • Streamkap adds transaction metadata, row filtering, and original SQL queries if enabled
Streamkap reads MySQL’s binary log (binlog) to capture committed operations, emitting changes as events. This requires binlog_format=ROW and binlog_row_image=FULL.
Uses GTID mode for position tracking in multi-primary setups, recommended for reliability.
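The server settings above can be sketched as a my.cnf fragment; the variable names are standard MySQL, while the server-id and log file name are illustrative values:

```ini
# my.cnf — settings for binlog-based CDC (illustrative values)
[mysqld]
server-id        = 223344        # any unique, non-zero ID
log_bin          = mysql-bin     # enable the binary log
binlog_format    = ROW           # row-level events, required for CDC
binlog_row_image = FULL          # log all columns, not just changed ones

# Recommended for reliable position tracking and failover
gtid_mode                = ON
enforce_gtid_consistency = ON
```

Managed platforms (RDS, Cloud SQL, Azure) expose the same variables through parameter groups or flags rather than my.cnf.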
binlog_row_image controls logged data for changes:
  • MINIMAL: Logs only the primary key and changed columns (delete events carry partial before-images)
  • FULL: Logs all columns (complete records, recommended)
Set to FULL for full delete handling.
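You can check and adjust this setting with standard MySQL statements (SET PERSIST assumes MySQL 8.0+):

```sql
-- Check the row-image mode the server is using
SHOW VARIABLES LIKE 'binlog_row_image';

-- Switch to FULL at runtime and persist across restarts (MySQL 8.0+)
SET PERSIST binlog_row_image = 'FULL';
```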
Trigger ad-hoc at source/table level.
  • Methods: Global read lock or table-level; incremental (chunked by PK, default 1024 rows) or blocking
  • Modes: initial (default), always, initial_only, no_data, when_needed, configuration_based, custom
Streamkap simplifies triggering via UI.
  • Basics: Booleans (BOOLEAN), integers (INT8/16/32/64), floats (FLOAT32/64), strings (STRING), dates/timestamps (adaptive/connect modes)
  • Advanced: Binary (BYTES/base64/hex), decimals (precise/double/string modes), JSON (STRING/io.debezium.data.Json), ENUM/SET (STRING/io.debezium.data.Enum/Set)
  • Character: CHAR/VARCHAR/TEXT (STRING)
  • Unsupported/limited: Non-UTF-8 character sets; spatial types (GEOMETRY); MyISAM tables; events lost to binlog expiration
Heartbeats prevent binlog buildup in low-traffic DBs by periodically updating a heartbeat table, advancing the binlog position.
  • Setup: Create the heartbeat table in the source database; the connector updates it
  • How it Works: Generates change events to keep binlog moving
  • Verification: Check table updates; troubleshoot permissions
  • Limitations: Manual setup required for low-traffic scenarios
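A minimal heartbeat table might look like the following; the table and column names here are illustrative, so match whatever your Streamkap connector configuration expects:

```sql
-- Hypothetical heartbeat table; the connector updates it periodically
CREATE TABLE streamkap_heartbeat (
    id INT PRIMARY KEY,
    last_heartbeat TIMESTAMP DEFAULT CURRENT_TIMESTAMP
                             ON UPDATE CURRENT_TIMESTAMP
);
INSERT INTO streamkap_heartbeat (id) VALUES (1);

-- Each periodic update writes a row-change event to the binlog,
-- advancing the connector's position even with no application traffic
UPDATE streamkap_heartbeat SET last_heartbeat = NOW() WHERE id = 1;
```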
Use queries like SHOW BINARY LOGS for size/position; use tools like Datadog/New Relic for lag. Best practices: retain binlogs for 3–5 days; alert on expiration/growth.
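For example, these standard MySQL statements report binlog size, position, and retention:

```sql
-- List binlog files with their sizes
SHOW BINARY LOGS;

-- Current write position (the file + offset the connector must keep up with)
SHOW MASTER STATUS;

-- Current retention window in seconds (MySQL 8.0+)
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';
```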
  • MyISAM unsupported
  • Binlog expiration can lose events
  • PK changes require manual handling
  • Before-images incomplete without binlog_row_image=FULL
  • High binlog growth without monitoring
  • Schema changes during snapshots unsupported
  • Read replicas for snapshots in some cloud setups
  • Captures deletes as events with before images if binlog_row_image=FULL
  • Supports soft deletes
Encrypted connections (SSL), IP allowlisting, role-based access (dedicated user recommended).
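A dedicated CDC user can be created roughly as follows; the username, password, and database name are placeholders, and the grants shown are the typical minimum for binlog-based CDC (snapshots may additionally need RELOAD or LOCK TABLES depending on the method):

```sql
-- Hypothetical dedicated user for the Streamkap connector
CREATE USER 'streamkap_user'@'%' IDENTIFIED BY 'strong-password-here';

-- Read the binlog and replication metadata
GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'streamkap_user'@'%';

-- Read table data for snapshots (scope to the captured database)
GRANT SELECT ON mydb.* TO 'streamkap_user'@'%';

FLUSH PRIVILEGES;
```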
  • Binlog Buildup: Enable heartbeats; monitor retention; drop unused logs
  • Data Loss on Upgrade: Stop writes before upgrading, let the connector capture all remaining events, re-establish binlog positions after the upgrade, then resnapshot
  • Missing Events: Ensure binlog enabled and tables included; check REPLICATION privileges
  • Use dedicated replication user (not root)
  • Enable GTID mode for better failover
  • Limit databases/tables to reduce binlog size
  • Set binlog retention to 3–5 days (e.g., via binlog_expire_logs_seconds on MySQL 8.0+)
  • For deletes, use FULL binlog_row_image
  • Test snapshots in staging; monitor lag via tools
  • For cloud: Use provider monitoring; enable heartbeats for low traffic
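The 3–5 day retention guideline above can be applied like this; the SET PERSIST form assumes self-hosted MySQL 8.0+, and the stored procedure applies only to AWS RDS/Aurora:

```sql
-- Self-hosted MySQL 8.0+: retain binlogs for 4 days (345600 seconds)
SET PERSIST binlog_expire_logs_seconds = 345600;

-- AWS RDS / Aurora MySQL: retention is set via a stored procedure (hours)
CALL mysql.rds_set_configuration('binlog retention hours', 96);
```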