Prerequisites
- PostgreSQL version ≥ 10
- (Recommended) Instance size of XL or greater
- Supabase IPv4 add-on enabled
- A database user with sufficient privileges for creating users
Supabase Setup
1. Grant Database Access
- Configure one of the Connection Options to ensure Streamkap can reach your database.
2. Create Database User
If you are running a PostgreSQL version earlier than 17, you will need to use the postgres user for replication and continue to the next step: Enable Snapshots. This is because, in pre-17 versions, non-superusers cannot create or manage replication slots.
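On PostgreSQL 17 and later you can use a dedicated replication user instead. A minimal sketch, assuming your project allows granting the REPLICATION attribute, that the connector signs in as streamkap_user, and that your tables live in the public schema (the password is a placeholder):

```sql
-- Sketch only: dedicated replication user for Streamkap on PostgreSQL 17+.
-- Replace the password and adjust the schema to match your setup.
CREATE USER streamkap_user WITH PASSWORD 'a-strong-password' REPLICATION;
GRANT USAGE ON SCHEMA public TO streamkap_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO streamkap_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO streamkap_user;
```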
3. Enable Snapshots
To backfill your data, the Connector needs to be able to perform snapshots. See Snapshots & Backfilling for more information. To enable this feature, there are 2 methods available:
Method 1: Enable read only connection
Requires PostgreSQL version 13 or higher.
- Set Read only to Yes during Streamkap Setup. No other configuration should be necessary.
Method 2: Create a table in the source database
Not supported on read replicas. Please use method 1 instead.
Please create the signal table with the name streamkap_signal. It will not be recognised if given another name.
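The exact column definitions are covered in Snapshots & Backfilling; the following is a minimal sketch of a Debezium-style signal table, assuming it lives in the public schema and that streamkap_user is the connector's database user:

```sql
-- Sketch only: the column layout follows the common Debezium signal-table convention;
-- confirm the exact requirements against the Snapshots & Backfilling guide.
CREATE TABLE public.streamkap_signal (
    id   VARCHAR(255) PRIMARY KEY,  -- unique identifier for each signal
    type VARCHAR(32)  NOT NULL,     -- signal type, e.g. a snapshot request
    data VARCHAR(2048)              -- payload describing the signal
);

-- Allow the connector user to read and write signals.
GRANT SELECT, INSERT, UPDATE, DELETE ON public.streamkap_signal TO streamkap_user;
```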
Publications and signal tables
When you create the PostgreSQL publication in the next step, if you choose to specify tables for capture instead of all tables, you must include the streamkap_signal table.
4. Create Publication & Slot
Publications contain a set of change events for the tables you want the Connector to capture.
- Create a publication for your tables. You can create a publication for all tables or selected tables. A sketch follows the callout below.
Altering publications
You cannot alter FOR ALL TABLES publications to include/exclude tables. If you set up a FOR ALL TABLES publication and later decide to change that, you have to drop the publication and create another that includes specific tables (e.g. CREATE PUBLICATION ... TABLE table1, table2, table3, ...). However, any change events that occur before the new publication is created will not be included in it, so a snapshot is required to ensure they are not missed by your Streamkap pipelines. You should also stop the Source before changing the publication.
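A minimal sketch of both options, using the default publication name streamkap_pub (the table names are placeholders; include streamkap_signal if you created a signal table):

```sql
-- Sketch only: publication limited to specific tables (placeholder names shown).
CREATE PUBLICATION streamkap_pub
    FOR TABLE public.table1, public.table2, public.streamkap_signal;

-- Alternatively, publish every table in the database:
-- CREATE PUBLICATION streamkap_pub FOR ALL TABLES;
```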
- Create a replication slot. A sketch follows the callout below.
Replication slot auto-creation
If the Connector cannot find the replication slot named in Replication Settings during Streamkap Setup, it will attempt to create one automatically, as it cannot read the database transaction logs without one.
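To create the slot yourself, a minimal sketch using the pgoutput plugin and the default slot name streamkap_pgoutput_slot:

```sql
-- Sketch only: create a logical replication slot with the pgoutput plugin.
-- The name must match the Replication Slot Name configured in Streamkap.
SELECT pg_create_logical_replication_slot('streamkap_pgoutput_slot', 'pgoutput');
```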
Streamkap Setup
Follow these steps to configure your new connector:
1. Create the Source
- Navigate to Add Connectors.
- Choose Supabase.
2. Connection Settings
- Name: Enter a name for your connector.
- Hostname: Specify the hostname.
Connection modes
Supabase offers several connection modes, including connection poolers. However, Streamkap requires a direct connection because logical replication relies on the IDENTIFY_SYSTEM command to retrieve essential replication information from the database. If another mode is used, the Connector will fail with errors related to IDENTIFY_SYSTEM.
- Port: Default is 5432.
- Connect via SSH Tunnel: The Connector will connect to an SSH server in your network which has access to your database. This is necessary if the Connector cannot connect directly to your database.
- See SSH Tunnel for setup instructions.
- Username: Username to access the database.
- Password: Password to access the database.
- Database: Specify the database to stream data from.
- Read only: Whether or not to use a read-only connection. Requires PostgreSQL version 13 or higher.
- When connecting to a read replica, set this to Yes to support Streamkap snapshots.
- Heartbeats: Crucial for low and intermittent traffic databases. Enabled by default.
- Heartbeat Table Schema (For read-write connections only; Read only is No): If configured, this can mitigate unnecessary PostgreSQL log growth and retention.
- See PostgreSQL Heartbeats for setup instructions.
3. Snapshot Settings
If you set Read only to No, you will need to create a snapshot signal table and give permissions to the streamkap_user. See Enable Snapshots for setup instructions.
- Signal Table Schema: The Connector will use a table in this schema to manage snapshots.
4. Replication Settings
- Replication Slot Name: The name of the replication slot for the connector to use. Default is streamkap_pgoutput_slot.
- Publication Name: The name of the publication for the connector to use. Default is streamkap_pub. You can confirm that both exist with the query sketched below.
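A minimal check, run against the source database (the names shown are the defaults above):

```sql
-- Sketch only: list existing logical replication slots and publications
-- so you can verify the names entered in Replication Settings.
SELECT slot_name, plugin, active FROM pg_replication_slots;
SELECT pubname, puballtables FROM pg_publication;
```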
5. Advanced Parameters
- SSL mode: Whether to use an encrypted connection to the PostgreSQL server. By default, it’s required.
- Prefix with Database Name?: Changes the format of topics to DatabaseName_TopicName.
- Represent binary data as: Specifies how the data for binary columns (e.g. bytea) should be interpreted. Your destination for this data can impact which option you choose. Default is bytes.
6. Schema and Table Capture
- Add Schemas/Tables: Specify the schema(s) and table(s) for capture
- You can bulk upload here. The format is a simple list of schemas and tables, with each entry on a new row. Save as a .csv file without a header.
Troubleshooting
Managing PostgreSQL upgrades
When upgrading the PostgreSQL database used by Streamkap, there are specific steps to prevent data loss and ensure continued operation.
Streamkap handles network failures and outages well. If a monitored database stops, the connector resumes from the last recorded log sequence number (LSN) once communication is restored. It retrieves this offset and queries PostgreSQL for a matching LSN in the replication slot.
A replication slot is required for change capture, but PostgreSQL removes slots during upgrades and doesn't restore them. When the connector restarts, it requests the last known offset, but PostgreSQL cannot return it.
Creating a new replication slot isn't enough to prevent data loss. New slots only track changes from their creation point and lack earlier offsets. The connector fetches its last known offset from Kafka but can't retrieve corresponding data from the new slot. It skips older change events and resumes from the latest log position, causing silent data loss with no warnings.
Procedure
Follow these steps to minimize data loss. Note that a few steps may require support from Streamkap. We recommend notifying us about your database upgrade ahead of time to ensure you have the necessary support.
- Using your database's upgrade procedure, ensure writes to it have stopped.
- Allow the connector to capture all change events before starting the upgrade procedure. Ask Streamkap to confirm this for you (a rough self-check query is sketched after this procedure).
Interrupting change data capture
If all the events were not captured before stopping the Source and upgrading the database, you can perform a snapshot (after the procedure is completed) in Streamkap for that Source to ensure no change events were missed.
- Assuming all events are captured, stop the Source in the Streamkap app. This flushes the last records and saves the last offset.
- Stop the database and upgrade it using your upgrade procedure.
- Recreate the logical replication slot; otherwise, Streamkap will miss changes. See Create Publication & Slot.
- Verify the publication for Streamkap exists; recreate it if necessary (see Create Publication & Slot).
Update the Source in the Streamkap app
If you have chosen different names for the logical replication slot or publication, make sure to update them in the Streamkap setup page for the relevant PostgreSQL Source. In this case, contact Streamkap, as they may need to reset your Connector's offsets.
- Restore write access to the database.
- Resume or restart the Source in the Streamkap app.
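As a rough self-check before the upgrade (step 2 of the procedure), you can compare the slot's confirmed position against the current WAL position; a minimal sketch, assuming the default slot name streamkap_pgoutput_slot:

```sql
-- Sketch only: a near-zero lag suggests the connector has caught up,
-- but confirm with Streamkap before proceeding with the upgrade.
SELECT slot_name,
       confirmed_flush_lsn,
       pg_current_wal_lsn() AS current_wal_lsn,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_name = 'streamkap_pgoutput_slot';
```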
Replica identity and deleted records (PostgreSQL 13 and newer)
The REPLICA IDENTITY table setting controls what data is logged for row updates and deletes.
By default, only the primary key and Streamkap metadata column values are retained for deleted records. All other columns will be empty. This leaves you with an incomplete record.
If you require all column values for deleted records (for auditing and historical tracking purposes), or if your deletion strategy for your destination is 'soft deletes' (retain the deleted record with a deletion flag), you should set the REPLICA IDENTITY to FULL for all capture tables. This ensures complete data retention.
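A minimal sketch for one captured table (the table name is a placeholder; repeat for each table you capture):

```sql
-- Sketch only: log all column values for updates and deletes on this table.
ALTER TABLE public.my_table REPLICA IDENTITY FULL;
```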
Capturing partitioned tables (PostgreSQL 13 and newer)
For capturing partitioned tables, it's essential to enable publish_via_partition_root on the publication.
By default, changes to partitions are published from the partition itself. The connector expects changes to come from the root table.
To ensure compatibility and consistent replication, enable this setting:
ALTER PUBLICATION streamkap_pub SET (publish_via_partition_root = true);
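To confirm the setting took effect, a small check (the pubviaroot column is available in PostgreSQL 13 and newer):

```sql
-- Sketch only: pubviaroot should be true for the Streamkap publication.
SELECT pubname, pubviaroot FROM pg_publication WHERE pubname = 'streamkap_pub';
```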