Join

Join multiple topics. Supports schema evolution and pass-through fields.

Setup

1. Settings

  • Name: Enter a name for the transform, e.g. filter out empty records
  • Language: Select the language used to define the transform, from those supported.
  • Input Pattern: Enter a regular expression that matches the topics this transform applies to, e.g. source_abcd1234.public.orders,source_abcd1234\.public\.order.*.
  • Output Pattern: Enter the desired output topic name, including the schema, e.g. enriched.orders.
  • Transform Parallelism: The degree of parallelism for this transform. It determines the number of tasks the transform uses when executing its logic. The default is sufficient in most cases.
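
To get a feel for how an input pattern selects topics, the following sketch tests two patterns against a handful of topic names using Python's re module. It is only an illustration: the topic names are made up, the matching_topics helper is hypothetical, and it assumes a pattern must match the whole topic name, which may differ from how the product evaluates patterns.

```python
import re


def matching_topics(pattern: str, topics: list[str]) -> list[str]:
    """Return the topics whose full name matches the pattern (assumed semantics)."""
    compiled = re.compile(pattern)
    return [t for t in topics if compiled.fullmatch(t)]


# Made-up topic names, for illustration only.
topics = [
    "source_abcd1234.public.orders",
    "source_abcd1234.public.order_items",
    "source_abcd1234.public.customers",
    "source_abcd1234xpublic.orders",
]

# Escaped dots match literal dots; the trailing .* matches any suffix.
escaped = matching_topics(r"source_abcd1234\.public\.order.*", topics)

# An unescaped dot matches any single character, not only a literal dot.
unescaped = matching_topics(r"source_abcd1234.public.orders", topics)

print(escaped)
print(unescaped)
```

Escaping the dots keeps the pattern from matching names that merely have some other character where a dot is expected.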

2. Advanced Parameters

  • Input Serialization: The format (e.g. JSON, AVRO, STRING, BYTES) of the input data. The default (Any) will try to determine the format automatically.
  • Output Serialization: The format (e.g. JSON, AVRO, STRING, BYTES) of the output data. The default (Any) will try to determine the format automatically.

Click Save.

Implementation

Now you need to define the logic for your transform and deploy it.

  • Go to the Implementation tab.

1. Objective / Goal

  • Describe in a short sentence the purpose of the transform and its expected output.

2. Input / Output Topics

These take the Input Pattern and Output Pattern entered earlier. You can also amend the input tables, and see a preview of the input topics captured by the input pattern as well as the corresponding output topic.

📘

No topics in preview

If no topics are shown, review and amend your input topic patterns. The topics page shows the full names of topics, usually in three parts. Starting your pattern with .* ignores the first part (usually the Connector's ID), so you can focus your pattern on matching the schema (middle) and table (last) parts.
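
As a rough illustration of the tip above, the sketch below splits made-up topic names into their three parts and shows a pattern that starts with .* so the Connector ID is ignored. The topic names and the whole-name matching are assumptions for the example, not a description of the product's internals.

```python
import re

# Made-up topic names in the usual three-part form: connector.schema.table
topics = [
    "source_abcd1234.public.orders",
    "source_efgh5678.public.orders",
    "source_abcd1234.sales.orders",
]

# Split each name into its connector, schema and table parts.
parts = [tuple(t.split(".", 2)) for t in topics]

# The leading .* skips the connector part; the rest pins schema and table.
pattern = re.compile(r".*\.public\.orders")
matched = [t for t in topics if pattern.fullmatch(t)]

print(parts[0])
print(matched)
```

Both connectors' public.orders topics match, while the sales.orders topic does not.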

  • Output Topic Replacement: Amend the desired output topic name using a regular expression pattern.
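
How the replacement is applied is not spelled out here. The sketch below assumes it behaves like an ordinary regular expression substitution with capture groups, as in Python's re.sub; the pattern, replacement string and topic name are hypothetical.

```python
import re

# Hypothetical input pattern with one capture group for the schema part.
input_pattern = r".*\.(public)\.orders"

# Hypothetical replacement that reuses the captured schema in the output name.
replacement = r"enriched.\g<1>_orders"

output_topic = re.sub(input_pattern, replacement, "source_abcd1234.public.orders")
print(output_topic)
```

Under these assumptions the output topic name would be enriched.public_orders.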

2.1 Input Tables

Here the input topic pattern is translated into individual table expressions for easier control over which topics are or are not included in the join.

  • Add or Remove topics as needed.

Click Save Topics Pattern / Replacement.

3. Transform Logic

Here is where you define the logic for the transform.

SQL Join

Depending on the number of input tables, several editable SQL scripts are generated. There are two types of script: the first defines, for each input table, the fields needed to join the tables together; the second is the JOIN query itself.
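
The split between per-table scripts and the JOIN query can be pictured with a small, self-contained sketch. It uses Python's built-in sqlite3 module purely as a stand-in SQL engine; the table names, columns and queries are invented for illustration and are not the scripts the product generates, and the SQL dialect it uses may differ.

```python
import sqlite3

# In-memory database standing in for two input tables (made-up data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
CREATE TABLE customers (customer_id INTEGER, name TEXT);
INSERT INTO orders VALUES (1, 10, 25.0), (2, 11, 40.0);
INSERT INTO customers VALUES (10, 'Ada'), (11, 'Grace');
""")

# First script type: one query per input table, selecting the join fields
# and any fields to pass through.
orders_fields_sql = "SELECT order_id, customer_id, amount FROM orders"
customers_fields_sql = "SELECT customer_id, name FROM customers"

# Second script type: the JOIN query that combines the per-table results.
join_sql = f"""
SELECT o.order_id, c.name, o.amount
FROM ({orders_fields_sql}) AS o
JOIN ({customers_fields_sql}) AS c ON o.customer_id = c.customer_id
ORDER BY o.order_id
"""

rows = conn.execute(join_sql).fetchall()
print(rows)
```

Keeping the field selection separate from the JOIN makes it easier to adjust one input table without touching the others.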

  • Click on each .sql script to edit it.

Once you're happy with them, continue to the next section.

4. Deploy & Check

Here you deploy your transform. You can also run diagnostic and validation checks at the click of a button.

4.0 Clear Output Topic

This purges the output topic of any data, as well as any cached job state. It acts as a "reset" of the transform's previous outputs so you can start fresh.

  • Output Topic pattern to clear: Enter a regular expression pattern that captures the output topics to be cleared.

Click Clear Output Topics and Job State.

4.1 Deploy

When deploying a transform, you can choose the point from which it starts processing the input topics' data, assuming the data is still available.

  • Replay Window:
    • Time-based: Enter the last number of days, hours or minutes to process.
    • Recency-based: Enter 0 to process from the latest data, or leave it blank to process from the earliest data.
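
The replay options above can be read as a small decision rule. The sketch below is one interpretation, not the product's implementation: the resolve_start function, its parameters and the "earliest"/"latest" return values are all hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Union


def resolve_start(
    now: datetime,
    window: Optional[timedelta] = None,
    recency: Optional[str] = "",
) -> Union[datetime, str]:
    """Work out where processing would start (hypothetical helper)."""
    # Time-based: go back the given number of days, hours or minutes.
    if window is not None:
        return now - window
    # Recency-based: blank means start from the earliest available data.
    if recency is None or recency.strip() == "":
        return "earliest"
    # Recency-based: 0 means start from the latest data.
    if recency.strip() == "0":
        return "latest"
    raise ValueError(f"unsupported recency value: {recency!r}")


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
six_hours_ago = resolve_start(now, window=timedelta(hours=6))
from_latest = resolve_start(now, recency="0")
from_earliest = resolve_start(now)

print(six_hours_ago, from_latest, from_earliest)
```

In each case the data must still be retained in the input topics for the chosen starting point to be usable.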

Click Deploy Transform.

4.2 Job Status

Check on the current status of the transform job.

Click Check Job Status.

4.3 Output Topics

Check what output topics were created (if any) and populated by the transform.

Click Check Output Topics.

4.4 Errors

Check whether the transform jobs are failing and, if so, what the problem is.

Click Check for Errors.