Db2 (Generic)

⚠️

Early access

This Connector is currently in the early stages of testing and may have unexpected issues and glitches. The documentation is also a work in progress.

Prerequisites

❗️

IBM InfoSphere Data Replication License required

The connector requires the use of the abstract syntax notation (ASN) libraries, which are available as a standard part of Db2 for Linux. To use the ASN libraries, you must have a license for IBM InfoSphere Data Replication (IIDR). You do not have to install IIDR to use the ASN libraries.

  • Db2 version ≥ 11.5
  • db2inst1 user credentials

Db2 Setup

The Connector is based on the ASN Capture/Apply agents that enable SQL Replication in Db2. A capture agent:

  • Generates change-data tables for tables that are in capture mode.
  • Monitors tables in capture mode and stores change events for updates to those tables in their corresponding change-data tables.

The Connector uses a SQL interface to query change-data tables for change events.

Granting Privileges

It's recommended to create a separate user and role for Streamkap to access your Db2 database. Below is an example script that does that.

-- Replace { ... } placeholders as required
CREATE USER STREAMKAP_USER IDENTIFIED BY '{password}';

CREATE ROLE STREAMKAP_ROLE;

-- Grant Streamkap permissions on the database, schema and all tables to capture
GRANT CONNECT ON DATABASE TO STREAMKAP_ROLE; 

-- Grant SELECT on all tables Streamkap should capture
GRANT SELECT ON {schemaName}.{tableName} TO STREAMKAP_ROLE;

GRANT ROLE STREAMKAP_ROLE TO USER STREAMKAP_USER;

Enable Snapshots

You will need to create the table in the source database and give permissions to the STREAMKAP_USER. Streamkap will use this table for managing snapshots.

❗️

Please create the signal table with the name STREAMKAP_SIGNAL. It will not be recognised if given another name

-- Create the schema
CREATE SCHEMA STREAMKAP;

-- Create the table within the schema
CREATE TABLE STREAMKAP.STREAMKAP_SIGNAL (
  id VARCHAR(255) PRIMARY KEY, 
  type VARCHAR(32) NOT NULL, 
  data VARCHAR(2000) NULL
);

-- Grant necessary privileges on the table to the role
GRANT SELECT, UPDATE, INSERT ON STREAMKAP.STREAMKAP_SIGNAL TO STREAMKAP_ROLE;

Enable Change Data Capture

To put tables into capture mode, Debezium provides a set of user-defined functions (UDFs) for your convenience. The procedure here shows how to install and run these management UDFs. Alternatively, you can run Db2 control commands to put tables into capture mode. The administrator must then enable CDC for each table that you want Streamkap to capture.

Prerequisites

  • You are logged in to Db2 as the db2inst1 user
  • On the Db2 host, the Debezium management UDFs are available in the $HOME/asncdctools/src directory. UDFs are available from the Debezium examples repository
  • The Db2 command bldrtn is on PATH, e.g. by running export PATH=$PATH:/opt/ibm/db2/V11.5.0.0/samples/c/ with Db2 11.5

Compile the Debezium management UDFs on the Db2 server host by using the bldrtn command provided with Db2:

cd $HOME/asncdctools/src  
bldrtn asncdc

Ensure that JDBC can read the Db2 metadata catalog:

# Replace the {...} placeholders as required
cd $HOME/sqllib/bnd  
db2 connect to {databaseName}  
db2 bind db2schema.bnd blocking all grant public sqlerror continue

Ensure that the database was recently backed-up. The ASN agents must have a recent starting point to read from. If you need to perform a backup, run the following commands, which prune the data so that only the most recent version is available. If you do not need to retain the older versions of the data, specify dev/null for the backup location.

# Replace the {...} placeholders as required
db2 backup db {databaseName} to {backupLocation}
db2 restart db {databaseName}

Connect to the database to install the Debezium management UDFs. It is assumed that you are logged in as the db2inst1 user so the UDFs should be installed on the db2inst1 user. Copy the Debezium management UDFs and set permissions for them:

# Replace the {...} placeholders as required
db2 connect to {databaseName}
cp $HOME/asncdctools/src/asncdc $HOME/sqllib/function  
chmod 777 $HOME/sqllib/function

Enable the Debezium UDF that starts and stops the ASN capture agent:

db2 -tvmf $HOME/asncdctools/src/asncdc_UDF.sql

Create the ASN control tables:

$ db2 -tvmf $HOME/asncdctools/src/asncdctables.sql

Enable the Debezium UDF that adds tables to capture mode and removes tables from capture mode:

$ db2 -tvmf $HOME/asncdctools/src/asncdcaddremove.sql

After you set up the Db2 server, use the UDFs to control Db2 replication (ASN) with SQL commands. Some of the UDFs expect a return value in which case you use the SQL VALUE statement to invoke them. For other UDFs, use the SQL CALL statement. Start the ASN agent:

VALUES ASNCDC.ASNCDCSERVICES('start','asncdc');

Put tables into capture mode. Invoke the following statement for each table that you want to put into capture:

-- Replace {...} placeholders as required
CALL ASNCDC.ADDTABLE('{schemaName}', '{tableName}');

Reinitialize the ASN service:

VALUES ASNCDC.ASNCDCSERVICES('reinit','asncdc');

Grant the STREAMKAP_USER privileges on the ASN CDC tables:

GRANT SELECT ON TABLE ASNCDC.IBMQREP_COLVERSION TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMQREP_TABVERSION TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_APPLEVEL TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_CAPMON TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_CAPPARMS TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_CAPSCHEMAS TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_CAPTRACE TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_PRUNCNTL TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_PRUNE_LOCK TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_PRUNE_SET TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_REGISTER TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_RESTART TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_SIGNAL TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_SUBS_COLS TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_SUBS_MEMBR TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_SUBS_SET TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_SUBS_STMTS TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_UOW TO ROLE STREAMKAP_ROLE;
GRANT SELECT ON TABLE ASNCDC.IBMSNAP_CAPENQ TO ROLE STREAMKAP_ROLE;

Streamkap Setup

  • Go to Sources, click Add and choose Db2
  • Input the following information:
    • Name for your Connector
    • Hostname - IP address or hostname of the Db2 database server
    • Port (default: 50000) - Port number of the Db2 database server
    • Username - Name of the Db2 database user for connecting to the Db2 database server
    • Password - Password to use when connecting to the Db2 database server
    • Database Name - The name of the Db2 database from which to stream the changes
    • Signal Table Schema - Streamkap will use a collection in this schema to manage snapshots e.g. public. See Enable Snapshots for more information
    • Connect via SSH Tunnel. See SSH Tunnel
    • Add Schemas/Tables. Can also bulk upload here. The format is a simple list of each schema or table per row saved in csv format without a header.
    • Click Save

Db2 Schema Evolution

While the Connector can capture schema changes, to update a schema, you must collaborate with a database administrator to ensure that the connector continues to produce change events. This is required because of how Db2 implements replication.

For each table in capture mode, the replication feature in Db2 creates a change-data table that contains all changes to that source table. However, change-data table schemas are static. If you update the schema for a table in capture mode then you must also update the schema of its corresponding change-data table. The Connector cannot do this.

A database administrator with elevated privileges must update schemas for tables that are in capture mode.

Refresh change table structure (Online)

🚧

Online refresh limitation

In the interval between the source table structure changing in the source database, and _before _the change table structure is refreshed, change events continue to be captured with the outdated table structure.

For example, if you added a new column to a source table, change events that are captured before the change table is refreshed will not contain the new column.

If this cannot be tolerated, an Offline refresh has to be performed. However, that means downtime for whatever system made structural changes to your tables and your Streamkap pipelines.

Please contact us for assistance if an Offline refresh is required.

An online schema update does not require application and data processing downtime. That is, you do not stop the Connector before you perform an online schema update. Also, an online schema update procedure is simpler than the procedure for an offline schema update.

However, when a table is in capture mode, after a change to a column name, the Db2 replication feature continues to use the old column name. The new column name does not appear in the change events. You must restart the Connector to see the new column name in change events.

Procedure when adding a column to the end of a table

  1. Lock the source tables whose schema you want to change
  2. In the ASN register table, mark the locked tables as INACTIVE
  3. Reinitialize the ASN capture service
  4. Apply all changes to the schemas for the source tables
  5. Apply all changes to the schemas for the corresponding change-data tables
  6. In the ASN register table, mark the source tables as ACTIVE
  7. Reinitialize the ASN capture service
  8. Restart the connector to see updated column names in change events

Procedure when adding a column to the middle of a table

  1. Lock the source table(s) to be changed
  2. In the ASN register table, mark the locked tables as INACTIVE
  3. Reinitialize the ASN capture service
  4. For each source table to be changed:
    1. Export the data in the source table
    2. Truncate the source table
    3. Alter the source table and add the column
    4. Load the exported data into the altered source table
    5. Export the data in the source table’s corresponding change-data table
    6. Truncate the change-data table
    7. Alter the change-data table and add the column
    8. Load the exported data into the altered change-data table
  5. In the ASN register table, mark the tables as INACTIVE. This marks the old change-data tables as inactive, which allows the data in them to remain but they are no longer updated
  6. Reinitialize the ASN capture service
  7. Restart the connector to see updated column names in change events

Reference

Debezium Management UDFs and Db2 ASN replication

If you followed our Db2 (Generic) - Enable Change Data Capture guide you can use these Debezium management UDFs to control Db2 replication (ASN) with SQL commands.

DescriptionCommand and notes
Start the ASN agentVALUES ASNCDC.ASNCDCSERVICES('start','asncdc');
Stop the ASN agentVALUES ASNCDC.ASNCDCSERVICES('stop','asncdc');
Check the status of the ASN agentVALUES ASNCDC.ASNCDCSERVICES('status','asncdc');
Enable change data capture mode on a tableCALL ASNCDC.ADDTABLE('{SCHEMA}', '{TABLE}');

Replace {SCHEMA} with the name of the schema that contains the table you want to put into capture mode. Likewise, replace {TABLE} with the name of the table to put into capture mode.
Disable change data capture mode on a tableCALL ASNCDC.REMOVETABLE('{SCHEMA}', '{TABLE}');
Reinitialize the ASN capture serviceVALUES ASNCDC.ASNCDCSERVICES('reinit','asncdc');

Do this after you put a table into capture mode or after you remove a table from capture mode.