Two Categories of Transforms
| Category | Where It Runs | Who Controls Order | User Action |
|---|---|---|---|
| Destination-side transforms | On the destination connector (Kafka Connect SMTs) | System (fixed order) | Configure which transforms are active |
| Pipeline transforms (Flink) | In the streaming layer (Apache Flink) | You | Define transforms and arrange their sequence |
These two categories operate at different stages of the pipeline. Destination-side transforms run after pipeline transforms. Data flows from the source, through any Flink pipeline transforms, and then through destination-side transforms before landing in the destination.
Destination-Side Transforms
Destination-side transforms are applied automatically by the system in a fixed order. You choose which transforms to enable on your destination connector, but the system determines the sequence in which they execute.Available Destination-Side Transforms
| Transform | What It Does |
|---|---|
| DropFields | Removes specified fields from the record before it reaches the destination. |
| RenameFields | Renames fields to match destination naming conventions or resolve conflicts. |
| RenameKeyFields | Renames fields in the message key (as opposed to the value). |
| CopyField | Copies the value of one field into a new field, preserving the original. |
| ToJsonJ | Converts structured fields (maps, arrays, nested objects) into JSON strings. |
| ToJsonbJ | Converts structured fields into PostgreSQL-compatible JSONB strings. |
| ToIntJ | Casts compatible field values to integer type. |
| ToFloatJ | Casts compatible field values to float type. |
| ToStringJ | Casts field values to string type. |
| AddStringSuffix | Appends a suffix string to the value of a specified field. |
| StringReplace | Replaces occurrences of a pattern within string field values. |
| HeaderToFieldCustom | Copies a Kafka message header value into a record field. |
| MarkColumnsAsOptional | Marks specified columns as optional in the schema, allowing null values. |
Fixed Execution Order
The system applies active destination-side transforms in a predetermined sequence. You do not need to set or manage this order — it is handled automatically. Because the order is fixed, certain interactions are predictable:- If you enable both ToJsonJ and DropFields, JSON conversion runs before fields are dropped. If you need a field excluded from JSON output, use DropFields to remove the entire JSON field after conversion.
- When both CopyField and RenameFields are enabled, the copy runs first. Rename operations apply to the already-copied fields.
- Enabling both ToJsonJ and ToStringJ means JSON conversion runs before string casting. Nested objects are converted to JSON strings first, then any remaining type casting to string is applied.
Pipeline Transforms (Flink)
Pipeline transforms run in the Apache Flink streaming layer. Unlike destination-side transforms, you control the order in which pipeline transforms execute. Each transform receives the output of the previous one, forming a chain. For the full list of available pipeline transform types, see Transforms.Why Order Matters
Pipeline transforms execute sequentially. Each transform receives the output of the one before it. This means:- A filter that removes records will reduce the volume of data that subsequent transforms process.
- An enrich transform adds fields that downstream transforms can reference.
- A fan-out transform routes records to separate output topics — any subsequent transforms must be configured per output branch, so place fan-out at the end of your chain.
Recommended Ordering
The following order works well for most pipelines that combine multiple transform types:Filter first
Apply Transform / Filter Records early to remove irrelevant records. This reduces the volume of data that all subsequent transforms must process.
Join or Enrich next
Use Join, Enrich, or Enrich (Async) to combine or augment records. Running these after filtering means fewer records to look up or join, which improves performance and reduces API call volume.
Transform / Map after enrichment
Apply Transform / Filter Records again (if needed) to reshape, rename, or compute fields using the enriched data.
Fan Out last
Use Fan Out at the end of your chain so that all preceding logic (filtering, enrichment, mapping) is applied before records are routed to their final output topics.
Not every pipeline needs all of these stages. Use only the transforms your use case requires. The recommended order applies to the transforms you do use.
Practical Examples
Example 1: Filter Before Enrich (Async)
Scenario: You stream order events and want to call an external API to enrich orders with customer details, but only for orders above $100.| Order | Transform | Effect |
|---|---|---|
| 1 | Filter Records | Keep only orders where amount > 100. Removes 80% of records. |
| 2 | Enrich (Async) | Call external API to fetch customer details for remaining orders. |
Example 2: Enrich Then Map
Scenario: You want to enrich product records with category data from a lookup table, then compute a new field based on the enriched data.| Order | Transform | Effect |
|---|---|---|
| 1 | Enrich | Add category_name from a cached lookup topic. |
| 2 | Transform / Filter Records | Compute display_label by combining product_name and category_name. |
category_name field, which only exists after enrichment. Reversing the order would mean the field is not yet available.
Example 3: Full Chain with Fan Out
Scenario: You stream user activity events and want to filter, enrich, and then route to different output topics based on activity type.| Order | Transform | Effect |
|---|---|---|
| 1 | Filter Records | Remove bot traffic and test accounts. |
| 2 | Enrich | Add user profile data from a cached topic. |
| 3 | Transform / Filter Records | Normalize field names and compute session duration. |
| 4 | Fan Out | Route purchase events to one topic and browse events to another. |
Common Ordering Mistakes
| Mistake | Problem | Fix |
|---|---|---|
| Enrich (Async) before Filter | Wastes API calls on records that are later discarded | Move Filter before Enrich (Async) |
| Map/Transform before Enrich | References enriched fields that do not exist yet | Move Enrich before the Map/Transform step |
| Fan Out before Map | Requires duplicating transform logic in each output branch | Move Fan Out to the end of the chain |
| Filter after Join | Joins all records first, then discards some — wastes join processing | Filter input topics before joining |
End-to-End Data Flow
To understand where each category of transforms fits, here is the full data path:- Source — Change events are captured from your database.
- Pipeline transforms (Flink) — Your user-defined transforms (Filter, Enrich, Join, Fan Out) execute in the order you specify.
- Destination-side transforms (SMTs) — System-managed transforms (DropFields, RenameFields, ToJsonJ, etc.) apply in a fixed order.
- Destination — Transformed data lands in your data warehouse or lake.
See Also
- Transform Types Overview - All available transform types
- Streaming Transforms - Managing transforms in the UI