Overview
The Transforms page displays all active transforms with their current status and performance metrics.
Page Actions
- Create Transform: Launch the transform creation wizard
- Search: Filter transforms by name
- Refresh: Reload the transforms list
Transforms Table
The table displays all transforms with the following columns:
- Name: User-defined transform name (click to view details)
- Type: Transform type (e.g., “Transform/Filter Records”, “Aggregate”, “Join”, “Enrich”)
- Status: Current transform job status
  - RUNNING: Transform is actively processing records
  - RESTARTING: Transform is restarting due to failure or configuration change
  - FAILED: Transform has failed and requires attention
  - UNKNOWN: Status cannot be determined
- Start Time: Timestamp when the current transform job started
- Duration: How long the current job has been running
- End Time: Timestamp when the job ended (for completed/failed transforms)
- Latency: Average time for records to be processed by this transform
- Tasks: Number of parallel tasks executing the transform logic (degree of parallelism)
Quick Actions Menu
Click the three-dot menu on any transform row to access:
- View: Navigate to transform detail page
- Copy ID: Copy transform UUID to clipboard (useful for support tickets)
- Delete: Remove the transform (requires confirmation)
Transform Detail Page
Click any transform name to view detailed information across three tabs: Status, Settings, and Implementation.
Status Tab
The Status tab displays real-time metrics and output topics:
- Status: Current job status (RUNNING, RESTARTING, FAILED, or UNKNOWN)
- Tasks: Number of parallel tasks (degree of parallelism)
- Latency: Average record processing time
- Duration: How long the current job has been running
- Name: User-defined transform name
- Type: Transform type (e.g., “Transform/Filter Records”, “Aggregate”, “Join”)
- Job: Apache Flink job identifier
For each output topic:
- Name: Output topic name (click to view topic details)
- Volume: Total data volume written to this topic
- Errors: Number of processing errors
- Written: Total number of records written
Transforms can write to multiple output topics depending on the transform logic and routing rules.
Settings Tab
Configure transform parameters and resource allocation:
- Name: Transform display name (editable)
- Language: Transform logic language (currently JavaScript only)
- Input Pattern: Regular expression matching input topic names
  - Example: `.*\.db_shard.*` matches all topics containing ".db_shard" (see the quick check below)
  - Supports standard regex syntax
- Output Pattern: Template for output topic names
  - Example: `db_users.users_aggregated` creates a single output topic
  - Can include variables for dynamic topic creation
- Transform Parallelism: Slider to adjust the number of parallel tasks (1-64)
- Higher parallelism increases throughput but consumes more resources
- Set based on input topic partition count for optimal performance
- Recommended: Match input topic partitions or set to 2x for CPU-intensive transforms
For high-throughput transforms, set parallelism equal to or greater than the number of input topic partitions to maximize throughput.
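As a quick sanity check before saving settings, an input pattern can be exercised against candidate topic names with plain JavaScript regular expressions. This is a minimal sketch; the topic names below are illustrative only and not taken from any real environment:

```js
// Minimal sketch: verify the Input Pattern regex against sample topic names.
// Topic names here are illustrative examples only.
const inputPattern = /.*\.db_shard.*/;

const candidateTopics = [
  "postgres.db_shard_01.users",   // matches
  "postgres.db_shard_02.orders",  // matches
  "postgres.public.customers"     // does not match
];

for (const topic of candidateTopics) {
  console.log(`${topic} -> ${inputPattern.test(topic)}`);
}
```

The same check applies to any standard regex syntax used in the Input Pattern field.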
Implementation Tab
The Implementation tab contains the transform logic code and testing interface (a sketch of typical transform logic follows the list below).
Transform Code Editor
- Write or edit JavaScript transformation logic
- Access to Streamkap transform APIs and utilities
- Syntax highlighting and error detection
- Test transform logic with sample records
- View transformation output before deploying
- Debug transformation errors
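As an illustration of the kind of logic written in the editor, the sketch below filters out invalid records and adds a derived field. The entry-point name, signature, and record shape are assumptions made for this example, not the actual Streamkap transform API; see the Transform Types documentation for the real interfaces.

```js
// Illustrative sketch only: the function name, signature, and record shape
// are assumptions for this example, not the actual Streamkap transform API.
function transform(record) {
  const value = record.value;

  // Filter: drop records without a usable amount
  // (assumption: returning null drops the record).
  if (value == null || typeof value.amount !== "number") {
    return null;
  }

  // Enrich: add derived fields while keeping the original payload intact.
  return {
    ...record,
    value: {
      ...value,
      amount_cents: Math.round(value.amount * 100),
      processed_at: new Date().toISOString()
    }
  };
}
```

Logic like this can be run against sample records in the testing interface before deploying.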
Refer to the Transform Types documentation for specific implementation examples and available APIs for each transform type.
Best Practices
- Start with Low Parallelism: Begin with parallelism matching input partitions, then scale up if needed
- Monitor Latency: Watch the latency metric to identify performance bottlenecks
- Use Descriptive Names: Name transforms clearly to indicate their purpose
- Test Before Deploying: Use the Implementation tab testing interface to validate logic
- Handle Errors Gracefully: Implement error handling in transform code to prevent job failures (see the sketch after this list)
- Review Output Topics: Regularly check output topic metrics for unexpected errors or volume
- Optimize Input Patterns: Use specific regex patterns to avoid processing unnecessary topics
- Document Transform Logic: Add comments in Implementation tab to explain complex transformations
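For the error-handling practice above, a common pattern is to wrap the risky step (parsing, type coercion, lookups) so a single malformed record is tagged rather than failing the whole job. As before, the function signature and record shape are assumptions used only for illustration:

```js
// Illustrative sketch only: signature and record shape are assumptions.
// Wrap the risky parsing step so one malformed record does not fail the job.
function transform(record) {
  try {
    const payload = JSON.parse(record.value.raw_json); // may throw on malformed input
    return {
      ...record,
      value: { ...record.value, parsed: payload }
    };
  } catch (err) {
    // Tag the record with the error instead of throwing, so processing continues.
    return {
      ...record,
      value: { ...record.value, parse_error: String(err) }
    };
  }
}
```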
Troubleshooting
Transform Status Shows RESTARTING
If a transform continuously restarts:
- Check Logs: Review transform logs in the Logs page for error messages
- Verify Input Topics: Ensure input pattern matches existing topics
- Test Transform Logic: Use Implementation tab to test with sample data
- Check Resources: Verify service has sufficient resources for the parallelism level
- Review Recent Changes: Revert recent settings or code changes that may have caused the issue
High Latency
If transform latency is increasing:
- Increase Parallelism: Scale up the number of tasks in the Settings tab
- Optimize Transform Code: Review and optimize JavaScript logic in Implementation tab
- Check Input Lag: Verify input topics don’t have excessive consumer lag
- Review Resource Usage: Ensure service has sufficient CPU and memory
- Partition Input Topics: Increase partitions on input topics for better parallelism
No Output Records
If the transform shows zero written records:
- Verify Input Pattern: Ensure regex pattern matches actual topic names
- Check Transform Logic: Confirm logic doesn’t filter out all records
- Review Input Topics: Verify input topics contain data
- Check for Errors: Look at Errors metric and review logs
- Test Implementation: Use Implementation tab testing to validate logic
Related Documentation
- Transform Types - Detailed guides for each transform type (Filter, Aggregate, Join, Enrich)
- Pipelines - How transforms integrate into data pipelines
- Topics - Understanding input and output topics
- Logs - Troubleshooting transform errors
- Services - Managing service resources for transforms