Redshift
Stream data into Redshift
Prerequisites
- Connection details
- Streamkap user and role
Granting Privileges
We recommend creating a dedicated user and role for Streamkap to access your Redshift database. The example script below creates the user and grants it access to a schema.
-- Connect to your Redshift cluster using a SQL client
-- Create the Streamkap User
CREATE USER streamkap_user PASSWORD '{password}';
-- Grant privileges to the user on specific schemas and tables
GRANT USAGE ON SCHEMA {schema} TO streamkap_user;
-- GRANT SELECT, INSERT, UPDATE, DELETE ON streamkap.streamkap_table TO streamkap_user;
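The script above creates only a user. If you also want the role mentioned earlier, a minimal sketch using Redshift role-based access control might look like the following. The role name `streamkap_role` is hypothetical, and `{schema}` is the same placeholder used above. The `CREATE` grant is an assumption: Redshift requires `CREATE` on a schema before a user can create tables in it, which the connector does when schema evolution is enabled.

```sql
-- Hypothetical role name; adjust to your naming conventions
CREATE ROLE streamkap_role;

-- Allow the role to access objects in the schema
GRANT USAGE ON SCHEMA {schema} TO ROLE streamkap_role;

-- Assumption: lets the connector create destination tables in the schema
GRANT CREATE ON SCHEMA {schema} TO ROLE streamkap_role;

-- Attach the role to the Streamkap user
GRANT ROLE streamkap_role TO streamkap_user;
```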
Streamkap Setup
- Go to Destinations and choose Redshift
- Input the following information:
  - Name - A unique and memorable name for this Connector
  - Hostname - The hostname connection string without the port
  - Port - The port number of the connection
  - Username (Case sensitive) - STREAMKAP_USER or the username you chose
  - Password - The password for your username
  - Database Name (Case sensitive) - STREAMKAPDB or the database name you chose
  - Schema Name (Case sensitive) - STREAMKAP or the schema name you chose
  - Insert Mode - Insert or Upserts (i.e. append only or replace)
  - Delete Mode - Delete records in destination if deleted at source
  - Schema Evolution - Handle additional columns automatically?
  - Tasks - Amount of parallelism in writing events
- Click Save
How this Connector Works
The Redshift connector supports idempotent write operations by using upsert semantics and basic schema evolution.
The following features are supported:
- At-least-once delivery
- Delete mode
- Idempotent writes (Insert/Upsert mode)
- Schema evolution
At-least-once delivery
The Redshift connector guarantees that the events it consumes are processed at least once.
Delete mode
The Redshift connector can delete rows in the destination database when a DELETE or tombstone event is consumed.
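As a rough illustration, consuming a delete event for a record whose key is `id = 42` has the same effect as the statement below. The `streamkap.orders` table and the `id` key column are hypothetical, and the SQL the connector actually issues may differ.

```sql
-- Hypothetical destination table and key column
DELETE FROM streamkap.orders WHERE id = 42;
```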
Idempotent writes
The Redshift connector supports idempotent writes, allowing the same records to be replayed repeatedly and the final database state to remain consistent. In order to support idempotent writes, the Redshift connector must be set to Upsert mode. An upsert operation is applied as either an update or an insert, depending on whether the specified primary key already exists. If the primary key value already exists, the operation updates values in the row. If the specified primary key value doesn’t exist, an insert adds a new row.
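The upsert behavior described above can be sketched with Redshift's `MERGE` statement. This is an illustration only: the tables `orders` and `orders_stage` and their columns are hypothetical, and the connector may implement upserts differently.

```sql
-- Hypothetical destination table keyed by id
CREATE TABLE IF NOT EXISTS orders (
    id     BIGINT NOT NULL,
    status VARCHAR(32),
    PRIMARY KEY (id)
);

-- Hypothetical staging table holding one incoming event
CREATE TEMP TABLE orders_stage (LIKE orders);
INSERT INTO orders_stage VALUES (1, 'shipped');

-- Update the row if the key already exists, otherwise insert it
MERGE INTO orders
USING orders_stage
ON orders.id = orders_stage.id
WHEN MATCHED THEN
    UPDATE SET status = orders_stage.status
WHEN NOT MATCHED THEN
    INSERT VALUES (orders_stage.id, orders_stage.status);
```

Running the `MERGE` a second time with the same staged row leaves `orders` unchanged, which is the idempotency property. Note that Redshift treats primary key constraints as informational and does not enforce them, so uniqueness depends on the write logic rather than on the table definition.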
Schema evolution
The Redshift connector supports schema evolution. It automatically detects fields that are in the event payload but do not exist in the destination table, and alters the destination table to add them.
When schema evolution is set to Yes, the connector automatically creates or alters the destination database table according to the structure of the incoming event.
When an event is received from a topic for the first time and the destination table does not yet exist, the Redshift connector uses the event’s key or the schema structure of the record to resolve the column structure of the table. If schema evolution is enabled, the connector prepares and executes a CREATE TABLE SQL statement before it applies the DML event to the destination table.
When the Redshift connector receives an event from a topic and the schema structure of the record differs from that of the destination table, the connector uses either the event’s key or its schema structure to identify which columns are new and must be added to the table. If schema evolution is enabled, the connector prepares and executes an ALTER TABLE SQL statement before it applies the DML event to the destination table. Because changing column data types, dropping columns, and adjusting primary keys can be dangerous operations, the connector does not perform them.
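As a sketch of the two statements described above, assume a hypothetical topic whose events first carry `id` and `status`, and later gain a `shipped_at` field. The table name, column names, and types are illustrative assumptions; the DDL the connector actually generates may differ.

```sql
-- First event for the topic: the destination table does not exist yet
CREATE TABLE IF NOT EXISTS streamkap.orders (
    id     BIGINT NOT NULL,
    status VARCHAR(32),
    PRIMARY KEY (id)
);

-- A later event carries a new field: add the missing column
ALTER TABLE streamkap.orders ADD COLUMN shipped_at TIMESTAMP;
```

Rows written before the `ALTER TABLE` hold NULL in the newly added column.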