MotherDuck (Native)

Stream data to MotherDuck from Streamkap

Overview

This guide explains how to stream data from a Kafka cluster into a MotherDuck database using the native JDBC (DuckDB) driver. Because this is a direct Kafka sink connector for MotherDuck, it ingests data into your database without using S3 as an intermediary.

Prerequisites

  • MotherDuck Account: A valid MotherDuck account and a database where the data will be loaded.

MotherDuck Setup

To set up the connector, you will need to gather connection details and configure your MotherDuck instance. Log in to your MotherDuck account, then follow the steps below.

Get the database name

You will need the database name to configure the connector. You can find it on the left side panel of the MotherDuck UI, under the Attached databases section.

Generate an access token

To generate the access token that Streamkap will use:

  1. Open the Settings page from the top-left menu
  2. Open the Access Tokens page, listed under the INTEGRATIONS section
  3. Click + Create token
  4. Enter a Name, select the Read/Write Token option under Token type, and leave Automatically expire this token set to false
  5. Click Create token
  6. Copy the access token
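Before entering the token in Streamkap, you can sanity-check it locally with the DuckDB client. Below is a minimal sketch; the `MOTHERDUCK_TOKEN` environment variable, the `my_db` database name, and the `motherduck_dsn` helper are all assumptions for illustration, not part of Streamkap.

```python
import os

# Assumed database name; use the one shown under "Attached databases".
DATABASE = "my_db"

def motherduck_dsn(database: str, token: str) -> str:
    # MotherDuck connection string understood by the DuckDB driver:
    # md:<database>?motherduck_token=<token>
    return f"md:{database}?motherduck_token={token}"

dsn = motherduck_dsn(DATABASE, os.environ.get("MOTHERDUCK_TOKEN", "<your-token>"))

# With the duckdb package installed (pip install duckdb), verify the token works:
#   import duckdb
#   con = duckdb.connect(dsn)
#   con.sql("SELECT current_database()").show()
```

If the connection succeeds and returns your database name, the token and database are ready to use in the Streamkap setup below.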

Streamkap Setup

  1. Go to Destinations and choose MotherDuck
  2. Input the following information:
    1. Name - A unique and memorable name for this Connector
    2. Ingestion Mode (default: upsert) - See Inserts/Upserts for information about ingestion modes
    3. Delete Mode (default: none) - Delete records in destination if deleted at source
    4. Tasks - If Pipelines for this Destination have lag that's continuously growing over time and not reducing, increase the number of Tasks, otherwise, leave as default
    5. Schema Evolution (default: true) - If enabled, the connector will automatically adapt to changes in the source schema by adding new columns in the target table
    7. MotherDuck Token - The access token you generated earlier
    7. Database (Case sensitive) - The name of the database
    8. Schema (Case sensitive - default: streamkap) - The schema within the database to write data to
  3. Click Save
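Once the destination is saved and a pipeline is running, you can confirm that tables are being created in the configured schema. The sketch below assumes the default streamkap schema; the `SCHEMA` constant and `CHECK_SQL` query string are illustrative, not Streamkap-provided.

```python
# Assumed schema name; use the value you set in the destination config.
SCHEMA = "streamkap"

# Lists the tables the connector has created in the target schema.
CHECK_SQL = (
    "SELECT table_name FROM information_schema.tables "
    f"WHERE table_schema = '{SCHEMA}'"
)

# With the duckdb package installed and a MotherDuck connection open:
#   import duckdb
#   con = duckdb.connect("md:my_db?motherduck_token=<your-token>")
#   print(con.sql(CHECK_SQL))
```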

How it works

As data is streamed from the source into topics (think of them as partitioned tables), the MotherDuck sink connector will:

  • Check whether tables for the topics exist in MotherDuck and, if not, create them
  • Detect changes between the source data schema and the target table schema, and if:
    • a new column (by name) is found, add it to the end of the table
    • an existing column's data type has changed, add a new column to the end of the table named <column_name>_<new_data_type_name>
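The schema-evolution rules above can be sketched as a small function. This is an illustrative model of the described behavior, not Streamkap's actual implementation; the column and type names are made up.

```python
def evolve_schema(table_schema: dict, record_schema: dict) -> dict:
    """Return the evolved table schema (column name -> data type).

    - A column name not yet in the table is appended to the end.
    - A known column whose data type changed gets a new column named
      <column_name>_<new_data_type_name>; the original column is kept.
    """
    evolved = dict(table_schema)
    for column, dtype in record_schema.items():
        if column not in evolved:
            evolved[column] = dtype               # new column: append
        elif evolved[column] != dtype:
            evolved[f"{column}_{dtype}"] = dtype  # type change: add suffixed column
    return evolved

table = {"id": "INTEGER", "amount": "INTEGER"}
record = {"id": "INTEGER", "amount": "DOUBLE", "note": "VARCHAR"}
print(evolve_schema(table, record))
# {'id': 'INTEGER', 'amount': 'INTEGER', 'amount_DOUBLE': 'DOUBLE', 'note': 'VARCHAR'}
```

Note that the original column is never dropped or retyped; downstream queries keep working against the old column while new data lands in the suffixed one.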