Streaming Transforms allow you to apply real-time data transformations using Apache Flink. Transform records as they flow through Kafka topics with custom JavaScript logic for filtering, enrichment, aggregation, and more.

Overview

The Transforms page displays all transforms with their current status and performance metrics:

Page Actions

  • Create Transform: Launch the transform creation wizard
  • Search: Filter transforms by name
  • Refresh: Reload the transforms list

Transforms Table

The table displays all transforms with the following columns:
  • Name: User-defined transform name (click to view details)
  • Type: Transform type (e.g., “Transform/Filter Records”, “Aggregate”, “Join”, “Enrich”)
  • Status: Current transform job status
    • RUNNING: Transform is actively processing records
    • RESTARTING: Transform is restarting due to failure or configuration change
    • FAILED: Transform has failed and requires attention
    • UNKNOWN: Status cannot be determined
  • Start Time: Timestamp when the current transform job started
  • Duration: How long the current job has been running
  • End Time: Timestamp when the job ended (for completed/failed transforms)
  • Latency: Average time for records to be processed by this transform
  • Tasks: Number of parallel tasks executing the transform logic (degree of parallelism)

Quick Actions Menu

Click the three-dot menu on any transform row to access:
  • View: Navigate to transform detail page
  • Copy ID: Copy transform UUID to clipboard (useful for support tickets)
  • Delete: Remove the transform (requires confirmation)

Transform Detail Page

Click any transform name to view detailed information across three tabs: Status, Settings, and Implementation.

Status Tab

The Status tab displays real-time metrics and output topics:
Stats Cards
Key performance metrics for the transform:
  • Status: Current job status (RUNNING, RESTARTING, FAILED, or UNKNOWN)
  • Tasks: Number of parallel tasks (degree of parallelism)
  • Latency: Average record processing time
  • Duration: How long the current job has been running
Transform Information
  • Name: User-defined transform name
  • Type: Transform type (e.g., “Transform/Filter Records”, “Aggregate”, “Join”)
  • Job: Apache Flink job identifier
Topics Table
Lists all output topics created by this transform:
  • Name: Output topic name (click to view topic details)
  • Volume: Total data volume written to this topic
  • Errors: Number of processing errors
  • Written: Total number of records written
Transforms can write to multiple output topics depending on the transform logic and routing rules.

Settings Tab

Configure transform parameters and resource allocation:
Basic Settings
  • Name: Transform display name (editable)
  • Language: Transform logic language (currently JavaScript only)
Topic Patterns
Define which topics the transform processes and where output is written:
  • Input Pattern: Regular expression matching input topic names
    • Example: .*\.db_shard.* matches all topic names containing “.db_shard”
    • Supports standard regex syntax
  • Output Pattern: Template for output topic names
    • Example: db_users.users_aggregated creates a single output topic
    • Can include variables for dynamic topic creation
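To see how an input pattern selects topics, the matching can be sketched in plain JavaScript. This is illustrative only: the platform applies the pattern server-side, and the topic names below are made up for the example.

```javascript
// Illustrative only: how a regex input pattern selects topics.
// The pattern ".*\.db_shard.*" matches any topic name containing ".db_shard".
const inputPattern = /.*\.db_shard.*/;

// Hypothetical topic names for demonstration.
const topics = [
  "postgres.db_shard_1.users",   // matches: contains ".db_shard"
  "postgres.db_shard_2.orders",  // matches
  "postgres.public.users",       // no match: no ".db_shard" segment
];

const matched = topics.filter((name) => inputPattern.test(name));
console.log(matched); // keeps only the two db_shard topics
```

A pattern that is too broad (for example, a bare `.*`) would pull in every topic, which is why the best practices below recommend keeping input patterns specific.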
Performance Tuning
  • Transform Parallelism: Slider to adjust the number of parallel tasks (1-64)
    • Higher parallelism increases throughput but consumes more resources
    • Set based on input topic partition count for optimal performance
    • Recommended: Match input topic partitions or set to 2x for CPU-intensive transforms
Advanced Settings
Click to expand additional configuration options (transform-type specific).
For high-throughput workloads, set parallelism to at least the number of input topic partitions so every partition can be consumed in parallel.

Implementation Tab

The Implementation tab contains the transform logic code and testing interface.
Transform Code Editor
  • Write or edit JavaScript transformation logic
  • Access to Streamkap transform APIs and utilities
  • Syntax highlighting and error detection
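As a rough illustration, filter-and-enrich logic written in the editor might look like the sketch below. The function signature and helpers are assumptions for this example; refer to the Transform Types documentation for the actual APIs available to each transform type.

```javascript
// Hypothetical sketch of filter/enrich logic; the exact signature and
// utilities exposed by the Streamkap editor may differ.
// Assumption: the transform receives one record (with a `value` object)
// and returns the record to emit, or null to drop it.
function transform(record) {
  const value = record.value;

  // Filtering: drop records without a user id.
  if (!value || value.user_id == null) {
    return null;
  }

  // Enrichment: add a derived field.
  value.processed_at = new Date().toISOString();
  return record;
}

// Local check with a sample record:
const out = transform({ value: { user_id: 42, name: "Ada" } });
console.log(out.value.user_id); // 42
```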
Testing Interface
  • Test transform logic with sample records
  • View transformation output before deploying
  • Debug transformation errors
Refer to the Transform Types documentation for specific implementation examples and available APIs for each transform type.

Best Practices

  1. Right-Size Parallelism: Begin with parallelism matching the input partition count, then scale up if needed
  2. Monitor Latency: Watch the latency metric to identify performance bottlenecks
  3. Use Descriptive Names: Name transforms clearly to indicate their purpose
  4. Test Before Deploying: Use the Implementation tab testing interface to validate logic
  5. Handle Errors Gracefully: Implement error handling in transform code to prevent job failures
  6. Review Output Topics: Regularly check output topic metrics for unexpected errors or volume
  7. Optimize Input Patterns: Use specific regex patterns to avoid processing unnecessary topics
  8. Document Transform Logic: Add comments in Implementation tab to explain complex transformations
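Practice 5 above can be sketched in code: wrap fragile operations (such as parsing) in a try/catch so one malformed record doesn't fail the whole job. The record shape and signature below are assumptions for illustration, not the exact Streamkap API.

```javascript
// Sketch of defensive error handling inside a transform
// (assumed record-in/record-out signature).
function transform(record) {
  try {
    // Payloads may arrive as strings; parse defensively.
    const value = typeof record.value === "string"
      ? JSON.parse(record.value)
      : record.value;

    // Hypothetical enrichment: convert cents to dollars.
    value.amount_usd = Number(value.amount) / 100;
    return { ...record, value };
  } catch (err) {
    // Skip (or route aside) the bad record instead of throwing,
    // which could put the job into a RESTARTING loop.
    console.error("skipping malformed record:", err.message);
    return null;
  }
}
```

An uncaught exception in transform code is a common cause of the continuous-restart behavior described in Troubleshooting below.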

Troubleshooting

Transform Status Shows RESTARTING

If a transform continuously restarts:
  1. Check Logs: Review transform logs in the Logs page for error messages
  2. Verify Input Topics: Ensure input pattern matches existing topics
  3. Test Transform Logic: Use Implementation tab to test with sample data
  4. Check Resources: Verify service has sufficient resources for the parallelism level
  5. Review Recent Changes: Revert recent settings or code changes that may have caused the issue

High Latency

If transform latency is increasing:
  1. Increase Parallelism: Scale up the number of tasks in Settings tab
  2. Optimize Transform Code: Review and optimize JavaScript logic in Implementation tab
  3. Check Input Lag: Verify input topics don’t have excessive consumer lag
  4. Review Resource Usage: Ensure service has sufficient CPU and memory
  5. Partition Input Topics: Increase partitions on input topics for better parallelism

No Output Records

If the transform shows zero written records:
  1. Verify Input Pattern: Ensure regex pattern matches actual topic names
  2. Check Transform Logic: Confirm logic doesn’t filter out all records
  3. Review Input Topics: Verify input topics contain data
  4. Check for Errors: Look at Errors metric and review logs
  5. Test Implementation: Use Implementation tab testing to validate logic
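Steps 2 and 5 can be approximated locally before redeploying: run the transform function against a handful of representative records and count how many survive. The signature and records below are hypothetical.

```javascript
// Quick local sanity check (assumed record-in/record-out signature):
// confirm the filter doesn't drop every record before deploying.
function transform(record) {
  // Intended: keep only active users. A common bug is an inverted
  // condition (e.g. "!== active") that silently drops everything.
  return record.value.status === "active" ? record : null;
}

const samples = [
  { value: { status: "active" } },
  { value: { status: "inactive" } },
  { value: { status: "active" } },
];

const kept = samples.map(transform).filter((r) => r !== null);
console.log(`${kept.length} of ${samples.length} records pass the filter`);
// "0 of N" here would explain a transform with zero written records.
```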
Related Documentation

  • Transform Types - Detailed guides for each transform type (Filter, Aggregate, Join, Enrich)
  • Pipelines - How transforms integrate into data pipelines
  • Topics - Understanding input and output topics
  • Logs - Troubleshooting transform errors
  • Services - Managing service resources for transforms