Setup
- Navigate to Create Transform.
- Choose Transform / Filter Records.
1. Settings
- Name: Enter a name for the transform, e.g. filter out empty records.
- Language: Select your preferred language (where supported) for defining the transform.
- Input Pattern: Enter a regular expression pattern to list the topics this transform is applied to, e.g. source_abcd1234.public.orders or source_abcd1234\.public\.order.*.
- Output Pattern: Enter the desired output topic name, including schema, e.g. enriched.orders.
- Transform Parallelism: The degree of parallelism for this transform. This determines the number of tasks the transform uses when executing the transform logic. The default should be more than sufficient in most cases.
2. Advanced Parameters
- Input Serialization: The format (e.g. JSON, AVRO, STRING, BYTES) of the input data. The default (Any) will try to determine the format automatically.
- Output Serialization: The format (e.g. JSON, AVRO, STRING, BYTES) of the output data. The default (Any) will try to determine the format automatically.
Implementation
Now you need to define the logic for your transform and deploy it.
- Go to the Implementation tab.
1. Objective / Goal
- Describe in a short sentence the purpose of the transform and its expected output.
2. Input / Output Topics
These take the Input Pattern and Output Pattern entered earlier. You can also amend the patterns and see a preview of the input topics captured by the input pattern, as well as the corresponding output topic.
No topics in preview
If no topics are shown, review and amend your input topic patterns. The topics page shows the full names of topics, usually in three parts. Starting your pattern with .* ignores the first part (usually the Connector’s ID), so you can focus your pattern on matching the schema (middle) and table (last) parts (see the example after this list).
- Input Topic Pattern: Amend the desired input topic using a regular expression pattern.
- Output Topic Replacement: Amend the desired output topic name using a regular expression pattern.
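If you are unsure whether a pattern captures your topics, you can sanity-check it outside Streamkap with any regular expression tool. The snippet below is a minimal Node.js sketch; the topic names are made up for illustration, and the leading .* mirrors the advice above about skipping the Connector’s ID.

```javascript
// Sanity-check an input topic pattern against sample three-part topic names:
// <connector ID>.<schema>.<table>. These topic names are illustrative only.
const inputPattern = /.*\.public\.order.*/;

const sampleTopics = [
  "source_abcd1234.public.orders",    // expected to match
  "source_abcd1234.public.customers", // expected not to match
];

for (const topic of sampleTopics) {
  console.log(`${topic} -> ${inputPattern.test(topic)}`);
}
```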
3. Transform Logic
Here is where you define the logic for the transform.
Transform / Filter Records
When a record is processed by Streamkap from a data source, it’s usually made up of at least 2 parts: key and value.
- The key usually represents the primary key of the record, i.e. a unique record identifier.
- The value usually represents the rest of the record i.e. all other record fields/columns.
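As a rough illustration (the table and field names below are invented, not taken from a real source), an orders record arriving at the transform might carry a key and value shaped like this:

```javascript
// Illustrative key/value shapes for an "orders" record; field names are hypothetical.
const key = { id: 1001 };        // the primary key, i.e. a unique record identifier

const value = {                  // the rest of the record's fields/columns
  id: 1001,
  customer_id: 42,
  status: "shipped",
  total: 49.95,
};
```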
- Click on the relevant .js scripts to edit them.
- Click on the topicTransform.js script to edit it (a sketch of possible filter logic follows below).
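The filtering logic itself can be very small. The sketch below assumes the script exposes a function that receives the record’s key and value and drops the record by returning null; the actual function name, signature and drop convention expected by Streamkap may differ, so treat this as a sketch against the template generated in the editor rather than the definitive API.

```javascript
// Hypothetical filter logic for topicTransform.js: drop "empty" records.
// The function name, its signature and the use of null to drop a record are
// assumptions; follow the template Streamkap generates for your transform.
function transform(key, value) {
  // Treat records with no value, or a value with no fields, as empty.
  if (value == null || Object.keys(value).length === 0) {
    return null; // assumed convention: returning null filters the record out
  }
  // Pass everything else through unchanged.
  return { key, value };
}
```

For example, a record whose value is null or {} would be dropped, while a fully populated order record would pass through untouched.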
4. Deploy & Check
Here you deploy your transform, but you can also perform diagnostic and validation checks at the click of a button.
4.0 Clear Output Topic
This purges the output topic of data (if any) as well as any cached job state, effectively resetting the transform’s previous outputs so you can start fresh.
- Output Topic pattern to clear: Enter a regular expression pattern that captures the output topics to be cleared.
4.1 Deploy
When deploying a transform, you can decide from what point the transform starts processing the input topics’ data, assuming the data is still available.
- Replay Window:
- Time-based: Enter the number of most recent days, hours or minutes of data to process.
- Recency-based: Enter 0 to process from the latest data, or leave it blank to process from the earliest data.