Transform / Filter Records
Apply custom logic to modify or filter records.
Setup
- Navigate to Create Transform.
- Choose Transform / Filter Records.
1. Settings
- Name: Enter a name for the transform, e.g. filter out empty records.
- Language: Select the language (from those supported) in which to define the transform.
- Input Pattern: Enter a regular expression that matches the topics this transform applies to, e.g. `source_abcd1234.public.orders`, `source_abcd1234\.public\.order.*`.
- Output Pattern: Enter the desired output topic name, including the schema, e.g. `enriched.orders`.
- Transform Parallelism: The degree of parallelism for this transform, i.e. the number of tasks used to execute the transform logic. The default is sufficient in most cases.
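Before saving, it can help to try an Input Pattern against a few sample topic names. The sketch below is illustrative only: the `matchesTopic` helper and the topic names are hypothetical, and it assumes the pattern must match the whole topic name, which may not be exactly how Streamkap evaluates it.

```javascript
// Hypothetical helper: true when the whole topic name matches the pattern.
// Assumption: full-match semantics; Streamkap's own matching may differ.
function matchesTopic(pattern, topic) {
  return new RegExp(`^(?:${pattern})$`).test(topic);
}

// Made-up topic names for illustration.
const topics = [
  "source_abcd1234.public.orders",
  "source_abcd1234.public.order_items",
  "source_abcd1234.public.customers",
];

// Backslashes are doubled here only because this is a JavaScript string literal.
const pattern = "source_abcd1234\\.public\\.order.*";

const matched = topics.filter((t) => matchesTopic(pattern, t));
console.log(matched);

// An unescaped "." matches any character, so this also returns true.
const looseMatch = matchesTopic(
  "source_abcd1234.public.orders",
  "source_abcd1234xpublicxorders"
);
console.log(looseMatch);
```

Escaping the dots (`\.`) keeps the pattern from matching unintended topic names.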
2. Advanced Parameters
- Input Serialization: The format of the input data (e.g. `JSON`, `AVRO`, `STRING`, `BYTES`). The default (`Any`) tries to determine the format automatically.
- Output Serialization: The format of the output data (e.g. `JSON`, `AVRO`, `STRING`, `BYTES`). The default (`Any`) tries to determine the format automatically.
Click Save.
Implementation
Now you need to define the logic for your transform and deploy it.
- Go to the Implementation tab.
1. Objective / Goal
- Describe in a short sentence the purpose of the transform and its expected output.
2. Input / Output Topics
These take the Input Pattern and Output Pattern entered earlier. You can also amend the input tables, and see a preview of the input topics captured by the input pattern as well as the corresponding output topic.
No topics in preview
If no topics are shown, review and amend your input topic patterns. The topics page shows the full names of topics, usually in three parts. Starting your pattern with `.*` ignores the first part (usually the Connector's ID), so you can focus your pattern on matching the schema (middle) and table (last) parts.
- Input Topic Pattern: Amend the desired input topic using a regular expression pattern.
- Output Topic Replacement: Amend the desired output topic name using a regular expression pattern.
Click Save Topics Pattern / Replacement.
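To illustrate the tip about starting a pattern with `.*`, the sketch below matches topics regardless of the Connector's ID and derives an output topic name from each match. The topic names are made up, and the use of a capture group (`$1`) in the replacement is an assumption for illustration; the preview on this tab is the reliable way to confirm how Streamkap applies your pattern and replacement.

```javascript
// Pattern that ignores the first part (connector ID) and captures the table name.
const inputPattern = /^.*\.public\.(orders|customers)$/;

// Hypothetical replacement: route each matched table to an "enriched" schema.
const outputReplacement = "enriched.$1";

// Made-up topic names for illustration.
const sampleTopics = [
  "source_abcd1234.public.orders",
  "source_wxyz9876.public.customers",
  "source_abcd1234.sales.orders",
];

// Keep only matching topics and compute the output topic for each.
const preview = sampleTopics
  .filter((t) => inputPattern.test(t))
  .map((t) => ({ input: t, output: t.replace(inputPattern, outputReplacement) }));

console.log(preview);
```

The third topic is excluded because its schema part is `sales`, not `public`.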
3. Transform Logic
Here is where you define the logic for the transform.
Transform / Filter Records
When a record is processed by Streamkap from a data source, it's usually made up of at least 2 parts: key and value.
- The key usually represents the primary key of the record i.e. a unique record identifier.
- The value usually represents the rest of the record i.e. all other record fields/columns.
Depending on your transform/filter goal, you can define the logic for one or both of these.
- Click on the relevant `.js` scripts to edit them.
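As a rough sketch of what key and value logic might look like, the example below drops empty records and normalizes one field. The function names (`transformValue`, `transformKey`), the `status` field, and the convention that returning `null` filters a record out are all assumptions for illustration; the script templates in the editor define the actual function signatures Streamkap expects.

```javascript
// Hypothetical value transform: drop empty records, normalize a field otherwise.
// Assumption: returning null means "filter this record out".
function transformValue(value) {
  if (value == null || Object.keys(value).length === 0) {
    return null; // nothing to keep
  }
  // Lower-case the (hypothetical) status field, defaulting to "unknown".
  return { ...value, status: String(value.status ?? "unknown").toLowerCase() };
}

// Hypothetical key transform: pass the key through unchanged.
function transformKey(key) {
  return key;
}

// Made-up records for illustration.
const records = [
  { key: { id: 1 }, value: { id: 1, status: "SHIPPED" } },
  { key: { id: 2 }, value: {} },
  { key: { id: 3 }, value: { id: 3 } },
];

// Apply both transforms, then discard the records the value transform filtered.
const output = records
  .map((r) => ({ key: transformKey(r.key), value: transformValue(r.value) }))
  .filter((r) => r.value !== null);

console.log(output);
```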
You can also transform the output topic name based on some value from the records rather than relying on the static/fixed Output Topic Replacement defined earlier.
- Click on the `topicTransform.js` script to edit it.
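A topic transform could, for example, pick the output topic from a field in the record. This is only a sketch: the `transformTopic` name, its parameters, and the `region` field are hypothetical, and the real signature is whatever the `topicTransform.js` template provides.

```javascript
// Hypothetical topic transform: choose the output topic from a record field.
// Assumption: the function receives the record value and the default topic name.
function transformTopic(value, defaultTopic) {
  const region =
    value && typeof value.region === "string"
      ? value.region.trim().toLowerCase()
      : "";
  // Fall back to the static output topic when the field is missing or blank.
  return region ? `${defaultTopic}_${region}` : defaultTopic;
}

const withRegion = transformTopic({ id: 1, region: "EU" }, "enriched.orders");
const withoutRegion = transformTopic({ id: 2 }, "enriched.orders");

console.log(withRegion);
console.log(withoutRegion);
```

Falling back to the default topic keeps records without the field flowing to the static Output Topic Replacement rather than to an unexpected topic.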
Once you're happy with them, continue to the next section.
4. Deploy & Check
Here you deploy your transform. You can also run diagnostic and validation checks at the click of a button.
4.0 Clear Output Topic
This purges the output topic of data (if any) as well as any cached job state. It effectively resets the transform's previous outputs so you can start fresh.
- Output Topic pattern to clear: Enter a regular expression pattern that captures the output topics to be cleared.
Click Clear Output Topics and Job State.
4.1 Deploy
When deploying a transform, you can decide from what point the transform starts processing the input topics' data, assuming the data is still available.
- Replay Window:
- Time-based: Enter the last number of days, hours or minutes to process.
- Recency-based: Enter `0` to process from the latest data, or leave it blank to process from the earliest data.
Click Deploy Transform.
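The Replay Window options can be summarized as a small decision function. The sketch below is a mental model only, not Streamkap's implementation: the `resolveReplayStart` helper, its option names, and the returned shape are all hypothetical.

```javascript
// Milliseconds per supported unit of a time-based window.
const MS_PER_UNIT = new Map([
  ["minutes", 60 * 1000],
  ["hours", 60 * 60 * 1000],
  ["days", 24 * 60 * 60 * 1000],
]);

// Hypothetical helper: describe where processing would start.
function resolveReplayStart(option, now = Date.now()) {
  if (option.mode === "recency") {
    // "0" means start from the latest data; blank means start from the earliest.
    if (option.value === "" || option.value == null) return { from: "earliest" };
    if (String(option.value) === "0") return { from: "latest" };
    throw new RangeError("Recency-based value must be 0 or blank");
  }
  if (option.mode === "time") {
    const perUnit = MS_PER_UNIT.get(option.unit);
    const amount = Number(option.amount);
    if (perUnit === undefined || !Number.isFinite(amount) || amount <= 0) {
      throw new RangeError("Time-based window needs a positive amount and a known unit");
    }
    // Look back the given duration from "now".
    return { from: "timestamp", timestamp: now - amount * perUnit };
  }
  throw new RangeError(`Unknown mode: ${option.mode}`);
}

const fixedNow = 1000000000000; // fixed clock so the example is repeatable
const lastTwoHours = resolveReplayStart({ mode: "time", amount: 2, unit: "hours" }, fixedNow);
const latest = resolveReplayStart({ mode: "recency", value: "0" });
const earliest = resolveReplayStart({ mode: "recency", value: "" });

console.log(lastTwoHours, latest, earliest);
```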
4.2 Job Status
Check on the current status of the transform job.
Click Check Job Status.
4.3 Output Topics
Check what output topics were created (if any) and populated by the transform.
Click Check Output Topics.
4.4 Errors
Check if the transform jobs are failing and if so, what the problem is.
Click Check for Errors.