# Dead Letter Queue (DLQ)

> Monitor, inspect, and resolve messages that failed processing in your Streamkap pipelines.

When a message cannot be processed by downstream consumers or fails during a transform, Streamkap routes it to a **dead letter queue (DLQ) topic** rather than blocking the entire pipeline. This ensures that healthy data continues to flow while problematic messages are captured for inspection and resolution.

For a brief overview of DLQ topics and how they appear in the Topics page, see [Topics](/topics).

<Warning>
  Messages in the DLQ are data that did NOT reach the destination. Until the root cause is resolved and the data is backfilled, they amount to data loss. Investigate DLQ messages promptly.
</Warning>

## How the DLQ Works

When Streamkap encounters a message that cannot be processed, the following happens:

1. The message is diverted from the normal data flow into a dedicated DLQ topic
2. Error metadata is attached to the message as Kafka headers
3. The pipeline continues processing remaining messages without interruption
4. The DLQ message is retained for inspection and root-cause analysis

<Note>
  DLQ routing is automatic and requires no configuration. Streamkap creates the DLQ topic for a destination connector or transform when errors occur, so a component with no failures may not have one yet.
</Note>

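The flow above can be pictured with a small, self-contained sketch. It is an illustration only, not Streamkap's implementation: the `Message` class, the `route` function, and the in-memory lists standing in for Kafka topics are all hypothetical. Only the two header names come from this page.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    value: dict
    headers: dict = field(default_factory=dict)


def route(messages, process):
    """Deliver each message; divert failures to a DLQ list instead of stopping."""
    delivered, dlq = [], []  # in-memory stand-ins for the destination and the DLQ topic
    for msg in messages:
        try:
            delivered.append(process(msg.value))
        except Exception as exc:
            # Step 2: attach error metadata as headers.
            msg.headers["__connect.errors.exception.message"] = str(exc)
            msg.headers["__connect.errors.exception.class.name"] = type(exc).__name__
            # Steps 1 and 4: divert the message and keep it for inspection.
            dlq.append(msg)
            # Step 3: the loop simply continues with the next message.
    return delivered, dlq


def require_id(value):
    """Hypothetical processing step that rejects records without an id."""
    if value.get("id") is None:
        raise ValueError("id must not be null")
    return value


delivered, dlq = route(
    [Message({"id": 1}), Message({"id": None}), Message({"id": 3})],
    require_id,
)
```

Here the second record lands in `dlq` with its error headers, while the first and third are still delivered.
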
## Finding DLQ Topics

### Naming Convention

DLQ topics follow a predictable naming pattern based on the connector type:

| Connector Type | DLQ Topic Pattern                                   | Example                                        |
| -------------- | --------------------------------------------------- | ---------------------------------------------- |
| Destination    | `destination_{entity_id}.streamkap.deadletterqueue` | `destination_abc123.streamkap.deadletterqueue` |
| Transform      | `{transform_output_topic_prefix}deadletterqueue`    | `my_transform.streamkap.deadletterqueue`       |

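If you script against topic names, the patterns in the table can be expressed as a few helpers. This is a minimal sketch: the function names are hypothetical, and the transform helper assumes the output topic prefix already ends with its separator, as the table's example suggests.

```python
DLQ_SUFFIX = "deadletterqueue"


def destination_dlq_topic(entity_id: str) -> str:
    """Build the DLQ topic name for a destination connector."""
    return f"destination_{entity_id}.streamkap.{DLQ_SUFFIX}"


def transform_dlq_topic(output_topic_prefix: str) -> str:
    """Build the DLQ topic name for a transform.

    Assumption: the prefix already includes its trailing separator,
    for example "my_transform.streamkap.".
    """
    return f"{output_topic_prefix}{DLQ_SUFFIX}"


def is_dlq_topic(topic: str) -> bool:
    """Return True when a topic name follows either DLQ pattern."""
    return topic.endswith(DLQ_SUFFIX)
```

For example, `destination_dlq_topic("abc123")` yields the destination example from the table.
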
### Filtering in the Topics Page

1. Navigate to the **Topics** page
2. Check the **Include DLQ** checkbox in the top toolbar to show DLQ topics
3. Use the **Search Box** to filter by `deadletterqueue`
4. Optionally, use the **By Destination** or **By Transform** collection in the left sidebar to narrow results
5. Check the **Errors** column for non-zero values to identify active DLQ topics

<Tip>
  Use the **Error Status** filter in the left sidebar to quickly surface topics that have DLQ activity without manually searching.
</Tip>

## Inspecting DLQ Messages

Once you locate a DLQ topic, click it to open the **Topic Details** panel, then click **Browse Messages** to open the **Messages** tab.

### Reading Error Information

Each DLQ message contains the original payload along with error headers that describe why the message failed. Streamkap inspects the following headers in order of priority:

1. `__connect.errors.exception.message` -- the primary exception message from the Kafka Connect framework
2. `__connect.errors.exception.class.name` -- the Java exception class that caused the failure
3. `exception.message` -- a general exception message
4. `error.message` -- an error description from the processing component
5. `_streamkap_error` -- a Streamkap-specific error message
6. `connect.errors.exception.message` -- fallback exception message (without leading underscores)
7. `connect.errors.exception.class.name` -- fallback exception class name (without leading underscores)

<Info>
  Streamkap parses error information by first checking message headers, then attempting to extract structured JSON values, then falling back to raw values, and finally adding contextual metadata. The first non-empty value found in the header priority list above is used as the primary error description.
</Info>

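If you export DLQ headers and want to apply the same priority order yourself, a sketch along these lines may help. It approximates the behavior described above and is not Streamkap's parser. The `primary_error` function is hypothetical, and the assumption that a structured JSON value carries its text under a `message` key is mine.

```python
import json

HEADER_PRIORITY = [
    "__connect.errors.exception.message",
    "__connect.errors.exception.class.name",
    "exception.message",
    "error.message",
    "_streamkap_error",
    "connect.errors.exception.message",
    "connect.errors.exception.class.name",
]


def primary_error(headers: dict):
    """Return (header_name, text) for the first non-empty header, or None."""
    for name in HEADER_PRIORITY:
        raw = headers.get(name)
        if raw is None:
            continue
        if isinstance(raw, bytes):  # Kafka header values are bytes on the wire
            raw = raw.decode("utf-8", errors="replace")
        text = str(raw).strip()
        if not text:
            continue
        try:
            parsed = json.loads(text)  # try a structured JSON value first
        except ValueError:
            return name, text  # not JSON: fall back to the raw value
        # Assumption: structured values expose their text under "message".
        if isinstance(parsed, dict) and parsed.get("message"):
            return name, str(parsed["message"])
        return name, text
    return None
```

A whitespace-only header is skipped, so a lower-priority header can still supply the description.
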
### Inspecting a DLQ Message

1. Open the DLQ topic in the **Messages** tab
2. Click on a message row to expand it
3. Review the **Value** field for the original message payload
4. Review the **Headers** for error details -- look for the headers listed above
5. Use the **Regex Filter** to search for specific error patterns across multiple messages (e.g., `.*schema.*` or `.*permission.*`)

<Tip>
  If the DLQ topic contains many messages, use the **Timestamp Seek** filter to jump to a specific time window. Combine it with a Regex filter to narrow results further. See [Topics - Filtering Messages](/topics#filtering-messages) for details.
</Tip>

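Before pasting a pattern into the **Regex Filter**, you can try it against a few sample error strings locally. The sketch below uses Python's `re` module with made-up sample messages. How the Streamkap UI applies the pattern (case sensitivity, anchoring, which fields it searches) is not covered here, so treat a local match as a rough check only.

```python
import re


def matching_messages(pattern: str, error_texts: list) -> list:
    """Return the sample texts in which the pattern is found."""
    regex = re.compile(pattern)
    return [text for text in error_texts if regex.search(text)]


# Hypothetical error strings, for illustration only.
samples = [
    "Failed to convert schema for column price",
    "permission denied for table orders",
    "value too long for type character varying(50)",
]

schema_hits = matching_messages(r".*schema.*", samples)
permission_hits = matching_messages(r".*permission.*", samples)
```
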
## Common Error Patterns and Fixes

<AccordionGroup>
  <Accordion title="Schema Mismatch">
    **Symptoms:**

    * Error headers reference schema, type, or conversion errors
    * Messages mention mismatched column types or unsupported data types

    **Common Causes:**

    * A source column type was altered (e.g., `INTEGER` changed to `TEXT`)
    * The destination does not support a data type present in the source
    * A new column was added to the source but does not exist in the destination

    **Resolution:** See [Error Reference — Schema Errors](/error-reference#schema-errors) for detailed resolution steps.
  </Accordion>

  <Accordion title="Permission Errors">
    **Symptoms:**

    * Error headers reference access denied, authorization, or permission failures
    * Messages mention insufficient privileges or forbidden operations

    **Common Causes:**

    * The destination database user lacks `INSERT`, `UPDATE`, or `DELETE` privileges on the target table
    * Table-level or schema-level permissions were revoked or not granted
    * Role-based access changes were applied without updating the connector credentials

    **Resolution:** See [Error Reference — Permission Errors](/error-reference#permission-errors) for detailed resolution steps.
  </Accordion>

  <Accordion title="Size and Limit Errors">
    **Symptoms:**

    * Error headers reference message size, row size, or value length limits
    * Messages mention payload too large or column value exceeds maximum

    **Common Causes:**

    * A row exceeds the maximum message size supported by the destination
    * A column value (e.g., a large `TEXT` or `BLOB` field) exceeds the destination column length limit
    * Batch size settings cause aggregated payloads to exceed limits

    **Resolution:** See [Error Reference — Resource Errors](/error-reference#resource-errors) for detailed resolution steps on size and limit issues.
  </Accordion>

  <Accordion title="Oversized Records (Snowflake 16 MB limit)">
    **Symptoms:**

    * Records appearing in the DLQ for Snowflake destinations
    * Error headers reference maximum record size exceeded

    **Common Causes:**

    * Individual records exceed Snowflake's 16 MB maximum record size limit
    * Tables with large `TEXT`, `BLOB`, or `BYTEA` columns produce records that exceed the limit

    **Resolution:**

    1. Identify the oversized columns in the DLQ message payload
    2. Exclude large columns from replication if they are not needed at the destination
    3. Add a transform to truncate large fields before they reach the destination
    4. If the data is required, consider splitting the source table or extracting large columns into a separate table
  </Accordion>

  <Accordion title="VARCHAR Overflow">
    **Symptoms:**

    * Records diverted to DLQ with column value length errors
    * Error headers reference value exceeding maximum column length

    **Common Causes:**

    * A source column value exceeds the VARCHAR size defined at the destination
    * Schema evolution created a column with a default size that is too small for the actual data

    **Resolution:**

    1. Identify the column and value from the DLQ message
    2. Increase the column size at the destination (e.g., in PostgreSQL, `ALTER TABLE ... ALTER COLUMN ... TYPE VARCHAR(n)`; the exact syntax varies by destination)
    3. Configure a transform (SMT) to truncate values before delivery if increasing the column size is not feasible
  </Accordion>

  <Accordion title="Timestamp Range Errors">
    **Symptoms:**

    * Records diverted to DLQ with timestamp-related errors
    * Error headers reference invalid date/time values

    **Common Causes:**

    * Source data contains invalid timestamps such as year `0000` or dates before year 1
    * The destination rejects timestamps outside its supported range

    **Resolution:**

    1. Identify the invalid timestamp values from the DLQ message payload
    2. Clean the source data to use valid timestamp values
    3. Configure a timestamp normalization transform (SMT) to convert out-of-range timestamps to valid values before delivery
  </Accordion>

  <Accordion title="Null Constraint Violations">
    **Symptoms:**

    * Error headers reference null constraint, NOT NULL, or required field violations
    * Messages contain `null` values for columns that have NOT NULL constraints

    **Common Causes:**

    * The source contains `NULL` values in a column that the destination defines as NOT NULL
    * A transform removed or nullified a required field
    * Schema evolution added a new NOT NULL column at the destination without a default value

    **Resolution:** See [Error Reference — Null constraint violation](/error-reference#schema-errors) for detailed resolution steps.
  </Accordion>
</AccordionGroup>

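When a DLQ topic holds many messages, a rough keyword triage can show which of the patterns above dominates. The sketch below is illustrative: the keyword lists are guesses based on the symptoms described in this section, not an official mapping, and real error text varies by destination.

```python
# Checked in order; more specific categories come before broader ones.
CATEGORY_KEYWORDS = {
    "Null Constraint Violations": ("not null", "null constraint", "required field"),
    "Permission Errors": ("access denied", "permission", "authorization", "forbidden"),
    "Timestamp Range Errors": ("timestamp", "date/time", "datetime"),
    "VARCHAR Overflow": ("too long", "maximum column length"),
    "Size and Limit Errors": ("too large", "message size", "row size", "record size"),
    "Schema Mismatch": ("schema", "conversion", "convert", "data type"),
}


def classify(error_text: str) -> str:
    """Map an error message to the first category whose keyword it contains."""
    lowered = error_text.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "Unclassified"
```

Anything that comes back as `Unclassified` still needs a manual look at the headers and the [Error Reference](/error-reference).
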
## Recovery

Because DLQ messages are records that did not reach the destination, recovery means identifying the error, fixing the root cause, and backfilling the missing data.

<Steps>
  <Step title="Identify the error">
    Inspect the DLQ topic messages and error headers to understand the failure reason. Use the [common error patterns](#common-error-patterns-and-fixes) above or the [Error Reference](/error-reference) as a guide.
  </Step>

  <Step title="Fix the root cause">
    Apply the appropriate fix based on the error type: correct schema mismatches, grant missing permissions, adjust size limits, or resolve constraint violations.
  </Step>

  <Step title="Backfill missing data">
    After fixing the root cause, backfill the data that was lost:

    * **Trigger a snapshot** (recommended): Snapshot the affected tables to backfill all missing records. This is the safest approach because it captures the current state of the source data. See [Snapshots & Backfilling](/snapshots).
    * **Reset consumer group offsets** (for very recent events): If the affected messages are recent and fall within a known time window, reset the consumer group offsets to replay that window. See [Consumer Groups](/consumer-groups).
  </Step>

  <Step title="Verify and monitor">
    Monitor the pipeline to confirm that new messages are delivered successfully to the destination. Check that the DLQ topic stops receiving new messages. Review historical DLQ messages to ensure no additional issues remain.
  </Step>
</Steps>

<Info>
  DLQ messages are retained according to the topic's retention policy. By default, DLQ topics retain messages for **7 days** (168 hours). After the retention period expires, messages are automatically deleted by Kafka. You can view the exact retention settings for a DLQ topic by opening the topic and checking the **Metadata** tab, which displays the configured `retention.ms` (time-based) and `retention.bytes` (size-based) values. DLQ retention settings are managed by Streamkap and cannot be changed by the user. If you need messages retained for a longer period, contact Streamkap support.
</Info>

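To see how long you have to inspect a failed message, the default retention works out to 7 × 24 × 60 × 60 × 1000 = 604,800,000 ms. The sketch below turns that into an estimated expiry time. The function names are hypothetical, and the result is an estimate only: Kafka deletes data one log segment at a time, so a message can outlive its nominal expiry, and size-based retention (`retention.bytes`) can remove it sooner.

```python
from datetime import datetime, timedelta, timezone

# 7 days (168 hours) expressed in milliseconds, matching retention.ms units.
DEFAULT_RETENTION_MS = 7 * 24 * 60 * 60 * 1000


def expires_at(message_ts_ms: int, retention_ms: int = DEFAULT_RETENTION_MS) -> datetime:
    """Estimate when a message becomes eligible for time-based deletion (UTC)."""
    return datetime.fromtimestamp((message_ts_ms + retention_ms) / 1000, tz=timezone.utc)


def time_left(message_ts_ms: int, now: datetime,
              retention_ms: int = DEFAULT_RETENTION_MS) -> timedelta:
    """Estimate the remaining inspection window for a message."""
    return expires_at(message_ts_ms, retention_ms) - now
```
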
<Note>
  **Replay and skip/acknowledge operations are not self-service.** Re-injecting DLQ messages into the pipeline, skipping individual DLQ records, and acknowledging (clearing) DLQ messages all require assistance from Streamkap support. To request any of these, contact [Streamkap support](mailto:support@streamkap.com) with the DLQ topic name, the affected connector, and a description of the issue. The recommended self-service way to recover missing data is to trigger a snapshot of the affected tables (see the **Backfill missing data** step above).
</Note>

## Preventing DLQ Records

The DLQ keeps failures from blocking healthy data, but every DLQ record is still data that missed the destination. The practices below reduce how often messages are routed to the DLQ.

### Pre-deployment Schema Validation

Test all schema changes in a staging or development environment before applying them to production. Common DLQ-causing schema issues include:

* Adding a NOT NULL column without a default value at the destination
* Changing a column type at the source (e.g., `INTEGER` to `TEXT`) without updating the destination
* Dropping columns that are referenced by transforms

<Tip>
  Maintain a staging pipeline that mirrors your production setup. Apply schema changes to staging first and verify that data flows without DLQ errors before promoting to production.
</Tip>

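The schema issues listed above can be checked mechanically before a change goes live. The following is a minimal sketch, not a Streamkap feature. It assumes you have already extracted both schemas into plain dictionaries and normalized type names to a shared vocabulary. The `find_schema_risks` function and the dictionary layout are hypothetical.

```python
def find_schema_risks(source: dict, destination: dict, transform_columns=()) -> list:
    """Compare column definitions and list changes likely to cause DLQ records.

    Each schema maps a column name to a dict with "type" and, for the
    destination, optional "nullable" and "default" keys (assumed layout).
    """
    risks = []
    for name, dest_col in destination.items():
        src_col = source.get(name)
        if src_col is None:
            # A NOT NULL destination column with no default and no source data.
            if not dest_col.get("nullable", True) and dest_col.get("default") is None:
                risks.append(f"{name}: NOT NULL at destination with no default and no source column")
            continue
        if src_col["type"] != dest_col["type"]:
            risks.append(f"{name}: type {src_col['type']} at source vs {dest_col['type']} at destination")
    for name in transform_columns:
        if name not in source:
            risks.append(f"{name}: referenced by a transform but missing from source")
    return risks


risks = find_schema_risks(
    source={"id": {"type": "INTEGER"}, "price": {"type": "TEXT"}},
    destination={
        "id": {"type": "INTEGER", "nullable": False},
        "price": {"type": "INTEGER", "nullable": True},
        "region": {"type": "TEXT", "nullable": False, "default": None},
    },
    transform_columns=["discount"],
)
```

In this example `risks` flags `price` (type change), `region` (NOT NULL without a default), and `discount` (dropped column that a transform references).
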
### Test Transforms Before Deploying

Transforms that produce unexpected output schemas, null values, or oversized fields are a common source of DLQ records. Before deploying a new or modified transform:

1. Validate the transform logic with representative sample data
2. Verify that the transform output schema is compatible with the destination table
3. Check edge cases such as null input values, empty strings, and unusually large fields

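The checklist above can be turned into a small test harness. The sketch below is generic and does not reflect the Streamkap transform runtime or its API: the `transform` function, the `note` field, and the 256-character limit are all hypothetical stand-ins for your own logic and destination constraints.

```python
MAX_NOTE_LENGTH = 256  # hypothetical destination column limit


def transform(record: dict) -> dict:
    """Hypothetical transform: trim the 'note' field and cap its length."""
    note = record.get("note")
    if note is None:
        note = ""  # avoid emitting null for a required field
    out = dict(record)
    out["note"] = note.strip()[:MAX_NOTE_LENGTH]
    return out


def check_output(record: dict) -> list:
    """Return a list of problems that would likely send the record to the DLQ."""
    problems = []
    if set(record) != {"id", "note"}:
        problems.append("unexpected output columns")
    note = record.get("note")
    if not isinstance(note, str):
        problems.append("note must be a string")
    elif len(note) > MAX_NOTE_LENGTH:
        problems.append("note exceeds destination length")
    return problems


# Edge cases from the checklist: null input, empty string, oversized field, missing field.
EDGE_CASES = [
    {"id": 1, "note": None},
    {"id": 2, "note": ""},
    {"id": 3, "note": "x" * 10_000},
    {"id": 4},
]

failures = {case["id"]: check_output(transform(case)) for case in EDGE_CASES}
```

An empty problem list for every case suggests the transform handles these inputs. Any non-empty entry in `failures` points at a case worth fixing before deployment.
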
### Set Up Alert Thresholds

Configure monitoring alerts to detect DLQ activity early, before a small number of failed records becomes a large backlog:

* Set up alerts on DLQ topic record counts (see [Alerts](/alerts))
* Monitor for sudden spikes in DLQ activity, which often indicate a systemic issue such as a schema change or permission revocation
* Review DLQ topics regularly, even when alerts are not firing, to catch low-volume issues

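If you track DLQ record counts in your own monitoring, the two conditions above (an absolute threshold and a sudden spike) can be combined in a simple check. This is a sketch with hypothetical names and default values. It is separate from the Streamkap [Alerts](/alerts) feature and does not describe how that feature evaluates thresholds.

```python
from statistics import mean


def dlq_alert(counts: list, absolute_threshold: int = 100, spike_factor: float = 3.0):
    """Decide whether the latest DLQ count warrants an alert.

    counts: DLQ record counts per interval, oldest first.
    Returns a short reason string, or None when nothing stands out.
    """
    if not counts:
        return None
    latest, history = counts[-1], counts[:-1]
    if latest >= absolute_threshold:
        return "threshold"
    baseline = mean(history) if history else 0
    if latest > 0 and baseline > 0 and latest >= spike_factor * baseline:
        return "spike"
    if latest > 0 and history and baseline == 0:
        return "new activity"  # a previously quiet DLQ topic started receiving records
    return None
```

The `new activity` case covers a low-volume issue on a topic that was previously empty, which a pure threshold would miss.
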
### Enable Schema Evolution

If your source schemas may change over time (new columns, type changes), ensure that **schema evolution** is enabled on your destination connector. Schema evolution allows the destination to automatically adapt to non-breaking schema changes, reducing the chance of schema mismatch errors in the DLQ.

See [Schema Evolution Support](/schema-evolution-support) for details on supported evolution types and configuration.

## Best Practices

1. **Monitor DLQ topics proactively**: Set up alerts for DLQ activity to catch issues early (see [Alerts](/alerts))
2. **Investigate promptly**: DLQ messages often indicate systemic issues that affect multiple records
3. **Check error headers first**: The error headers provide the fastest path to understanding the failure
4. **Fix at the source**: Whenever possible, fix the root cause rather than working around it at the destination
5. **Use the Logs page**: Cross-reference DLQ errors with connector logs for additional context (see [Logs](/logs))
6. **Review after schema changes**: Any time you alter source or destination schemas, check DLQ topics for new errors

## Related Documentation

* [Topics](/topics) - Browse topics, inspect messages, and manage DLQ topic visibility
* [Logs](/logs) - View connector and pipeline logs for additional troubleshooting context
* [Alerts](/alerts) - Configure notifications for DLQ activity and pipeline errors
* [Pipelines](/pipelines) - Monitor pipeline health and data flow status
