When a message cannot be processed by downstream consumers or fails during a transform, Streamkap routes it to a dead letter queue (DLQ) topic rather than blocking the entire pipeline. This ensures that healthy data continues to flow while problematic messages are captured for inspection and resolution. For a brief overview of DLQ topics and how they appear in the Topics page, see Topics.
Messages in the DLQ represent data that did NOT reach the destination. Until the root cause is resolved and data is backfilled, DLQ records represent data loss. Investigate DLQ messages promptly.

How the DLQ Works

When Streamkap encounters a message that cannot be processed, the following happens:
  1. The message is diverted from the normal data flow into a dedicated DLQ topic
  2. Error metadata is attached to the message as Kafka headers
  3. The pipeline continues processing remaining messages without interruption
  4. The DLQ message is retained for inspection and root-cause analysis
DLQ routing is automatic and requires no configuration. A DLQ topic is created for each destination connector and transform when the first error occurs.
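The four steps above can be sketched as a small routine. This is illustrative only: Streamkap's internal implementation is not public, and the function and header names below mirror the documented behavior rather than an actual API.

```python
# Illustrative sketch of DLQ routing: on failure, divert the message to a
# DLQ topic with error metadata attached as Kafka-style headers, and keep
# the pipeline flowing. Not a Streamkap API.

def process_with_dlq(message, transform, produce, dlq_topic):
    """Apply `transform` to `message`; on failure, send it to the DLQ.

    `produce(topic, value, headers)` stands in for a Kafka producer call.
    Returns the transformed value, or None if the message was diverted.
    """
    try:
        return transform(message)
    except Exception as exc:
        headers = [
            ("__connect.errors.exception.message", str(exc).encode()),
            ("__connect.errors.exception.class.name",
             type(exc).__name__.encode()),
        ]
        produce(dlq_topic, message, headers)  # pipeline keeps flowing
        return None
```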

Finding DLQ Topics

Naming Convention

DLQ topics follow a predictable naming pattern based on the connector type:
Connector Type | DLQ Topic Pattern                                 | Example
Destination    | destination_{entity_id}.streamkap.deadletterqueue | destination_abc123.streamkap.deadletterqueue
Transform      | {transform_output_topic_prefix}deadletterqueue    | my_transform.streamkap.deadletterqueue
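A hypothetical helper that applies this naming convention, useful when scripting topic lookups. The function name is ours, not a Streamkap API.

```python
# Hypothetical helper illustrating the DLQ naming convention; not a
# Streamkap API.

def dlq_topic_name(connector_type, identifier):
    """Return the expected DLQ topic for a destination or transform.

    For destinations, `identifier` is the entity id. For transforms it is
    the transform's output topic prefix, including its trailing separator
    (e.g. "my_transform.streamkap.").
    """
    if connector_type == "destination":
        return f"destination_{identifier}.streamkap.deadletterqueue"
    if connector_type == "transform":
        return f"{identifier}deadletterqueue"
    raise ValueError(f"unknown connector type: {connector_type}")
```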

Filtering in the Topics Page

  1. Navigate to the Topics page
  2. Check the Include DLQ checkbox in the top toolbar to show DLQ topics
  3. Use the Search Box to filter by deadletterqueue
  4. Optionally, use the By Destination or By Transform collection in the left sidebar to narrow results
  5. Check the Errors column for non-zero values to identify active DLQ topics
Use the Error Status filter in the left sidebar to quickly surface topics that have DLQ activity without manually searching.

Inspecting DLQ Messages

Once you locate a DLQ topic, click on it to open the Topic Details panel, then click Browse Messages to access the Messages tab.

Reading Error Information

Each DLQ message contains the original payload along with error headers that describe why the message failed. Streamkap inspects the following headers in order of priority:
  1. __connect.errors.exception.message — the primary exception message from the Kafka Connect framework
  2. __connect.errors.exception.class.name — the Java exception class that caused the failure
  3. exception.message — a general exception message
  4. error.message — an error description from the processing component
  5. _streamkap_error — a Streamkap-specific error message
  6. connect.errors.exception.message — fallback exception message (without leading underscores)
  7. connect.errors.exception.class.name — fallback exception class name (without leading underscores)
Streamkap parses error information by first checking message headers, then attempting to extract structured JSON values, then falling back to raw values, and finally adding contextual metadata. The first non-empty value found in the header priority list above is used as the primary error description.
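The priority order above can be expressed as a short lookup. This sketch assumes headers arrive as (key, bytes) pairs, the way Kafka client libraries typically return them.

```python
# Sketch of the documented header priority order. Header values are raw
# bytes, as a Kafka client would return them.

ERROR_HEADER_PRIORITY = [
    "__connect.errors.exception.message",
    "__connect.errors.exception.class.name",
    "exception.message",
    "error.message",
    "_streamkap_error",
    "connect.errors.exception.message",
    "connect.errors.exception.class.name",
]

def primary_error(headers):
    """Return the first non-empty error header value, or None.

    `headers` is a list of (key, bytes) tuples, e.g. the shape returned
    by confluent-kafka's Message.headers().
    """
    by_key = {k: v for k, v in headers if v}
    for key in ERROR_HEADER_PRIORITY:
        value = by_key.get(key)
        if value:
            return value.decode("utf-8", errors="replace")
    return None
```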

Inspecting a DLQ Message

  1. Open the DLQ topic in the Messages tab
  2. Click on a message row to expand it
  3. Review the Value field for the original message payload
  4. Review the Headers for error details — look for the headers listed above
  5. Use the Regex Filter to search for specific error patterns across multiple messages (e.g., .*schema.* or .*permission.*)
If the DLQ topic contains many messages, use the Timestamp Seek filter to jump to a specific time window. Combine it with a Regex filter to narrow results further. See Topics - Filtering Messages for details.
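If you export DLQ messages with a Kafka consumer, the same regex filtering can be reproduced locally. A minimal sketch, assuming each message is a dict with a bytes value and (key, bytes) header pairs:

```python
import re

# Local regex filtering over already-fetched DLQ messages, mirroring the
# Messages tab's Regex Filter. Message shape is an assumption, not a
# Streamkap format.

def matching_messages(messages, pattern):
    """Yield messages whose value or any header value matches `pattern`.

    Each message is a dict like {"value": bytes, "headers": [(k, bytes)]}.
    """
    rx = re.compile(pattern)
    for msg in messages:
        haystacks = [msg.get("value") or b""]
        haystacks += [v or b"" for _, v in msg.get("headers", [])]
        if any(rx.search(h.decode("utf-8", errors="replace"))
               for h in haystacks):
            yield msg
```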

Common Error Patterns and Fixes

Schema Errors

Symptoms:
  • Error headers reference schema, type, or conversion errors
  • Messages mention mismatched column types or unsupported data types
Common Causes:
  • A source column type was altered (e.g., INTEGER changed to TEXT)
  • The destination does not support a data type present in the source
  • A new column was added to the source but does not exist in the destination
Resolution: See Error Reference — Schema Errors for detailed resolution steps.

Permission Errors

Symptoms:
  • Error headers reference access denied, authorization, or permission failures
  • Messages mention insufficient privileges or forbidden operations
Common Causes:
  • The destination database user lacks INSERT, UPDATE, or DELETE privileges on the target table
  • Table-level or schema-level permissions were revoked or not granted
  • Role-based access changes were applied without updating the connector credentials
Resolution: See Error Reference — Permission Errors for detailed resolution steps.

Size and Limit Errors

Symptoms:
  • Error headers reference message size, row size, or value length limits
  • Messages mention payload too large or column value exceeds maximum
Common Causes:
  • A row exceeds the maximum message size supported by the destination
  • A column value (e.g., a large TEXT or BLOB field) exceeds the destination column length limit
  • Batch size settings cause aggregated payloads to exceed limits
Resolution: See Error Reference — Resource Errors for detailed resolution steps on size and limit issues.

Snowflake Maximum Record Size

Symptoms:
  • Records appearing in the DLQ for Snowflake destinations
  • Error headers reference maximum record size exceeded
Common Causes:
  • Individual records exceed Snowflake’s 16 MB maximum record size limit
  • Tables with large TEXT, BLOB, or BYTEA columns produce records that exceed the limit
Resolution:
  1. Identify the oversized columns in the DLQ message payload
  2. Exclude large columns from replication if they are not needed at the destination
  3. Add a transform to truncate large fields before they reach the destination
  4. If the data is required, consider splitting the source table or extracting large columns into a separate table
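Step 3 of the resolution above (truncating large fields) might look like the following sketch. The per-field byte budget is an illustrative assumption, not a Streamkap default; Snowflake's documented limit applies to the whole record, so a per-field budget must be chosen conservatively.

```python
# Hedged sketch of a truncation transform: cap large string fields so
# records stay under a size budget. The budget below is an example value,
# not a Streamkap or Snowflake default.

MAX_FIELD_BYTES = 1 * 1024 * 1024  # example per-field budget

def truncate_large_fields(record, limit=MAX_FIELD_BYTES):
    """Return a copy of `record` with oversized string values truncated."""
    out = {}
    for key, value in record.items():
        if isinstance(value, str) and len(value.encode("utf-8")) > limit:
            # Cut on a byte budget, then drop any split multi-byte char.
            value = value.encode("utf-8")[:limit].decode(
                "utf-8", errors="ignore")
        out[key] = value
    return out
```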

Column Value Length Exceeded

Symptoms:
  • Records diverted to DLQ with column value length errors
  • Error headers reference value exceeding maximum column length
Common Causes:
  • A source column value exceeds the VARCHAR size defined at the destination
  • Schema evolution created a column with a default size that is too small for the actual data
Resolution:
  1. Identify the column and value from the DLQ message
  2. Increase the column size at the destination (e.g., ALTER TABLE ... ALTER COLUMN ... TYPE VARCHAR(n))
  3. Configure a transform (SMT) to truncate values before delivery if increasing the column size is not feasible

Invalid Timestamp Values

Symptoms:
  • Records diverted to DLQ with timestamp-related errors
  • Error headers reference invalid date/time values
Common Causes:
  • Source data contains invalid timestamps such as year 0000 or dates before year 1
  • The destination rejects timestamps outside its supported range
Resolution:
  1. Identify the invalid timestamp values from the DLQ message payload
  2. Clean the source data to use valid timestamp values
  3. Configure a timestamp normalization transform (SMT) to convert out-of-range timestamps to valid values before delivery
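Step 3 of the resolution above (timestamp normalization) could be sketched as follows. The bounds are illustrative and happen to match Python's datetime range; check your destination's documented range before adopting them.

```python
from datetime import datetime

# Hedged sketch of timestamp normalization: clamp out-of-range timestamps
# before delivery. Bounds are illustrative, not destination-specific.

MIN_TS = datetime(1, 1, 1)      # "year 0000" values fail parsing entirely
MAX_TS = datetime(9999, 12, 31)

def normalize_timestamp(value):
    """Clamp out-of-range timestamps; pass valid ones through unchanged.

    `value` is an ISO-8601 string. Unparseable values (including year
    0000, which Python's datetime cannot represent) return None so the
    caller can decide whether to null the field or divert the record.
    """
    try:
        ts = datetime.fromisoformat(value)
    except ValueError:
        return None
    return max(MIN_TS, min(ts, MAX_TS)).isoformat()
```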

Null Constraint Violations

Symptoms:
  • Error headers reference null constraint, NOT NULL, or required field violations
  • Messages contain null values for columns that have NOT NULL constraints
Common Causes:
  • The source contains NULL values in a column that the destination defines as NOT NULL
  • A transform removed or nullified a required field
  • Schema evolution added a new NOT NULL column at the destination without a default value
Resolution: See Error Reference — Null constraint violation for detailed resolution steps.

Recovery

DLQ messages represent records that did not reach the destination. The recovery process focuses on identifying the error, fixing the root cause, and backfilling the missing data.
Step 1: Identify the error

Inspect the DLQ topic messages and error headers to understand the failure reason. Use the common error patterns above or the Error Reference as a guide.
Step 2: Fix the root cause

Apply the appropriate fix based on the error type: correct schema mismatches, grant missing permissions, adjust size limits, or resolve constraint violations.
Step 3: Backfill missing data

After fixing the root cause, backfill the data that was lost:
  • Trigger a snapshot (recommended): Snapshot the affected tables to backfill all missing records. This is the safest approach as it captures the current state of the source data. See Snapshots & Backfilling.
  • Reset consumer group offsets (for very recent events): If the DLQ messages are recent and you want to replay a specific time window, reset the consumer group offsets to replay the affected messages. See Consumer Groups.
Step 4: Verify and monitor

Monitor the pipeline to confirm that new messages are delivered successfully to the destination. Check that the DLQ topic stops receiving new messages. Review historical DLQ messages to ensure no additional issues remain.
DLQ messages are retained according to the topic's retention policy. By default, DLQ topics retain messages for 7 days (168 hours); after the retention period expires, messages are automatically deleted by Kafka. You can view the exact retention settings for a DLQ topic by opening the topic and checking the Metadata tab, which displays the configured retention.ms (time-based) and retention.bytes (size-based) values.

DLQ retention settings are managed by Streamkap and cannot be changed by the user. If you need messages retained for a longer period, contact Streamkap support.
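As a quick sanity check on the Metadata tab values: the default 7-day retention corresponds to retention.ms = 604800000. A small helper to render these values in human-readable form (the function name is ours):

```python
# Render Kafka retention settings human-readably. In Kafka, -1 means
# "unlimited" for both retention.ms and retention.bytes.

def describe_retention(retention_ms, retention_bytes):
    """Return a readable summary of retention.ms and retention.bytes."""
    time_part = ("unlimited" if retention_ms == -1
                 else f"{retention_ms / 3_600_000:g} hours")
    size_part = ("unlimited" if retention_bytes == -1
                 else f"{retention_bytes / 1_048_576:g} MiB")
    return f"time: {time_part}, size: {size_part}"
```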
Replay and skip/acknowledge operations are not self-service. Re-injecting DLQ messages back into the pipeline, skipping individual DLQ records, or acknowledging (clearing) DLQ messages all require assistance from Streamkap support. If you need to reprocess, skip, or clear failed messages, contact Streamkap support with the DLQ topic name, affected connector, and a description of the issue. The recommended self-service approach for recovering missing data is to trigger a snapshot of the affected tables (see Recovery above).

Preventing DLQ Records

While the DLQ ensures that pipeline failures do not block healthy data, the best approach is to minimize DLQ records in the first place. Follow these practices to reduce the likelihood of messages being routed to the DLQ.

Pre-deployment Schema Validation

Test all schema changes in a staging or development environment before applying them to production. Common DLQ-causing schema issues include:
  • Adding a NOT NULL column without a default value at the destination
  • Changing a column type at the source (e.g., INTEGER to TEXT) without updating the destination
  • Dropping columns that are referenced by transforms
Maintain a staging pipeline that mirrors your production setup. Apply schema changes to staging first and verify that data flows without DLQ errors before promoting to production.

Test Transforms Before Deploying

Transforms that produce unexpected output schemas, null values, or oversized fields are a common source of DLQ records. Before deploying a new or modified transform:
  1. Validate the transform logic with representative sample data
  2. Verify that the transform output schema is compatible with the destination table
  3. Check edge cases such as null input values, empty strings, and unusually large fields
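A minimal harness for step 3's edge cases, assuming your transform is a callable that takes and returns a record dict (the field names and sizes below are illustrative):

```python
# Hedged sketch: exercise a transform against edge-case records before
# deploying it. `transform` is any callable under test.

EDGE_CASES = [
    {"name": None},             # null input value
    {"name": ""},               # empty string
    {"name": "x" * 1_000_000},  # unusually large field
]

def smoke_test_transform(transform, cases=EDGE_CASES):
    """Return (case, error) pairs for each case the transform rejects."""
    failures = []
    for case in cases:
        try:
            transform(case)
        except Exception as exc:
            failures.append((case, repr(exc)))
    return failures
```

Run this against representative sample data as well as the edge cases; an empty result means the transform at least does not raise, though you should still verify its output schema against the destination table.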

Set Up Alert Thresholds

Configure monitoring alerts to detect DLQ activity early, before a small number of failed records becomes a large backlog:
  • Set up alerts on DLQ topic record counts (see Alerts)
  • Monitor for sudden spikes in DLQ activity, which often indicate a systemic issue such as a schema change or permission revocation
  • Review DLQ topics regularly, even when alerts are not firing, to catch low-volume issues

Enable Schema Evolution

If your source schemas may change over time (new columns, type changes), ensure that schema evolution is enabled on your destination connector. Schema evolution allows the destination to automatically adapt to non-breaking schema changes, reducing the chance of schema mismatch errors in the DLQ. See Schema Evolution Support for details on supported evolution types and configuration.

Best Practices

  1. Monitor DLQ topics proactively: Set up alerts for DLQ activity to catch issues early (see Alerts)
  2. Investigate promptly: DLQ messages often indicate systemic issues that affect multiple records
  3. Check error headers first: The error headers provide the fastest path to understanding the failure
  4. Fix at the source: Whenever possible, fix the root cause rather than working around it at the destination
  5. Use the Logs page: Cross-reference DLQ errors with connector logs for additional context (see Logs)
  6. Review after schema changes: Any time you alter source or destination schemas, check DLQ topics for new errors

Related Pages

  • Topics - Browse topics, inspect messages, and manage DLQ topic visibility
  • Logs - View connector and pipeline logs for additional troubleshooting context
  • Alerts - Configure notifications for DLQ activity and pipeline errors
  • Pipelines - Monitor pipeline health and data flow status