Please ensure you’ve followed the relevant connector setup guide to enable the snapshots feature.
Streamkap Sources use snapshots to backfill historical data from your source tables.

Snapshot Options

When triggering a snapshot, you can choose from three options:

Filtered Snapshot

Apply filter conditions to capture specific rows. Streaming continues during snapshot.
  • Best for capturing a subset of data based on conditions (e.g., date ranges, specific statuses)
  • Uses incremental watermarking to capture data in small chunks
  • Requires tables to have primary keys (or a Surrogate Key)
  • Can continue from where it left off on failure or cancellation
The filter syntax depends on your Source type:
  • SQL-based Sources (PostgreSQL, MySQL, Oracle, etc.): Use SQL WHERE clause syntax
  • Document/NoSQL Sources (MongoDB, DocumentDB, DynamoDB): Use JSON filter expressions

Full Snapshot

Capture all rows from selected tables. Streaming continues during snapshot.
  • Best for complete data backfills where you need all historical data
  • Uses incremental watermarking to capture data in small chunks
  • Requires tables to have primary keys (or a Surrogate Key)
  • Can continue from where it left off on failure or cancellation

Blocking Snapshot

Database locks: Blocking snapshots may hold database locks for the duration of the operation. Use with caution on high-traffic tables.
Capture all rows while pausing streaming. Streaming resumes automatically after snapshot completes.
  • Required for keyless tables: Tables without primary keys cannot use incremental snapshots (unless a Surrogate Key is specified)
  • Point-in-time consistency: Guarantees a consistent view of data at a specific moment
  • Faster for large tables: Can be more performant since it captures all data in one operation
  • Multiple tables in parallel: Depending on connector configuration, multiple tables can be snapshotted simultaneously
However, blocking snapshots cannot be resumed on failure or cancellation—they must be re-triggered.
Blocking snapshots do not currently support filters. To apply filters, use the Filtered Snapshot option instead (requires primary keys).

Surrogate Key

Available under Advanced Options when configuring Filtered or Full snapshots.
A surrogate key allows you to specify an alternative column for the connector to use as the primary key during snapshot chunking. This is useful when:
  • Keyless tables: Tables without primary keys can use a surrogate key (e.g., a timestamp or auto-increment column) for incremental snapshots instead of requiring a blocking snapshot
  • Performance optimization: A different column may provide better chunking performance (e.g., using created_at instead of a UUID primary key for more efficient range queries)
Advanced Options showing Surrogate Key configuration
Limitations:
  • Only single-column surrogate keys are supported (composite keys are not available)
  • The surrogate key column must exist in the table and contain sortable values
Leave the field empty to use the table’s primary key for chunking (default behavior).
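As a rough illustration of why a sortable surrogate key enables incremental chunking, here is a minimal sketch (not Streamkap's implementation) of watermark-based chunked reads over a keyless table, using an in-memory SQLite database. The table, column names, and chunk size are all hypothetical, and the sketch assumes the surrogate key column holds unique, sortable values:

```python
import sqlite3

# Hypothetical keyless table with a sortable surrogate key (created_at).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (created_at TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(f"2025-01-{d:02d}", f"row-{d}") for d in range(1, 11)],
)

def snapshot_chunks(surrogate_key="created_at", chunk_size=4):
    """Yield rows in chunks ordered by the surrogate key (watermarking).

    Assumes the surrogate key is unique; with duplicate values, the
    strict `>` comparison below could skip rows.
    """
    last_seen = None  # low watermark: doubles as a resume point
    while True:
        if last_seen is None:
            rows = conn.execute(
                f"SELECT {surrogate_key}, payload FROM events "
                f"ORDER BY {surrogate_key} LIMIT ?", (chunk_size,)
            ).fetchall()
        else:
            rows = conn.execute(
                f"SELECT {surrogate_key}, payload FROM events "
                f"WHERE {surrogate_key} > ? ORDER BY {surrogate_key} LIMIT ?",
                (last_seen, chunk_size),
            ).fetchall()
        if not rows:
            break
        yield rows
        last_seen = rows[-1][0]  # advance the watermark

chunks = list(snapshot_chunks())
print(len(chunks))                   # 3
print(sum(len(c) for c in chunks))   # 10
```

Because each chunk only depends on the last watermark value, a failed or cancelled snapshot can pick up from `last_seen` rather than starting over, which is the resumability property described above.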

Snapshot Lifecycle

When | Behavior
  • At connector creation: The connector starts in streaming mode, reading any change data seen from this point onwards. No snapshots are triggered automatically.
  • After connector creation: You can trigger ad-hoc snapshots for any or all of the tables the connector is configured to capture. A confirmation prompt is required before the snapshot begins.
  • Pipeline creation and edit: You can choose to trigger snapshots for the topics the pipeline will stream to your destination. A confirmation prompt is required before the snapshot begins.

Behavior

Deletions are not captured during snapshots. Snapshots read existing rows at a point in time; deletion events can only be processed during streaming, or replayed if Streamkap data retention policies allow.

Filtered & Full Snapshots

These snapshots use incremental watermarking, capturing data in small chunks to minimize database impact. Streaming continues uninterrupted while historical data is being backfilled. When snapshotting multiple tables, tables are processed sequentially, one at a time; each table must complete before the next begins.
  • On failure: The snapshot resumes from where it left off. If it cannot resume automatically, you can re-trigger it at the Connector or Table level once the issue is resolved.
  • On cancellation: The snapshot stops at its current progress. Streaming continues uninterrupted. You can resume the snapshot later from where it left off.
When rows are modified while an incremental snapshot is running, event ordering may vary because the Connector streams and snapshots in parallel:
  • Updates: You may receive events in different orders (read → update, update → read, or just update)
  • Deletes: You may receive read → delete, or just delete
This is normal behavior: the Connector resolves these out-of-sequence events when the same row appears in both the snapshot and the streaming log, ensuring they are processed in the correct order and deduplicated.
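The reconciliation rule above can be sketched as a simple last-state resolver. This is illustrative only, not Streamkap's implementation: the key idea is that a snapshot "read" never overwrites a streamed update or delete for the same row:

```python
# Streamed change events take precedence over snapshot "read" events.
STREAM_OPS = {"update", "delete"}

def reconcile(events):
    """Collapse (key, op, value) events, letting streamed ops win."""
    latest = {}
    for key, op, value in events:
        prev = latest.get(key)
        # A late-arriving snapshot "read" must not clobber a newer
        # streamed update/delete for the same row.
        if prev and prev[0] in STREAM_OPS and op == "read":
            continue
        latest[key] = (op, value)
    return latest

events = [
    (1, "read", "a"), (1, "update", "a2"),    # read → update
    (2, "update", "b2"), (2, "read", "b"),    # update → read
    (3, "delete", None), (3, "read", "c"),    # delete → read
]
result = reconcile(events)
print(result)
# {1: ('update', 'a2'), 2: ('update', 'b2'), 3: ('delete', None)}
```

Whichever order the events arrive in, each row converges to the same final state, which is why the varying orderings listed above are harmless.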

Blocking Snapshots

Database locks: Blocking snapshots may hold database locks for the duration of the operation. Use with caution on high-traffic tables.
These snapshots capture all data in a single transaction. Streaming pauses until the snapshot completes, then resumes automatically. Multiple tables may be processed in parallel, depending on connector configuration.
  • On failure: Streaming resumes immediately. Re-trigger the snapshot once the issue is resolved; it will start from the beginning, since blocking snapshots capture all rows in one operation.
  • On cancellation: Because streaming is paused during a blocking snapshot, cancelling one causes Streamkap to restart the connector to terminate the snapshot immediately. Streaming resumes after the restart.
A brief delay exists between signaling a blocking snapshot and when streaming actually pauses. This may result in some duplicate events being emitted after the snapshot completes. Ensure your destination can handle idempotent writes or has deduplication enabled.
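One common way to make a destination tolerant of those duplicate events is an idempotent upsert keyed on the primary key. A minimal sketch using SQLite's `ON CONFLICT` clause (table and column names are hypothetical):

```python
import sqlite3

# Hypothetical destination table keyed on the row's primary key.
dest = sqlite3.connect(":memory:")
dest.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

def apply_event(row_id, name):
    # The upsert makes replays harmless: re-applying the same event
    # leaves the row in the same state instead of failing or duplicating.
    dest.execute(
        "INSERT INTO users (id, name) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET name = excluded.name",
        (row_id, name),
    )

# The same event delivered twice (e.g. emitted both by the snapshot
# and by streaming after it resumes) yields exactly one row.
apply_event(1, "alice")
apply_event(1, "alice")
count = dest.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(count)  # 1
```

Most warehouses and databases offer an equivalent (MERGE, upsert, or insert-ignore), so duplicates emitted around the pause window collapse to a single row at the destination.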
A high performance, bulk parallel snapshot feature is planned for future releases.

Triggering a Snapshot

You can trigger an ad-hoc snapshot at the Source level or per Table from the Connector’s page.

Source Level Snapshot

This will trigger an incremental snapshot for all tables/topics captured by the Source:
Source quick actions menu with Snapshot option

Snapshot Options Dialog

When triggering a source-level snapshot, you can choose between Full Snapshot and Blocking Snapshot:
Source snapshot options dialog showing Full and Blocking snapshot types

Table/Topic Level Snapshot

This will trigger a snapshot for the selected tables/topics only:
Topic quick actions menu with Snapshot option

Snapshot Options Dialog

When triggering a table/topic snapshot, you can choose the snapshot type and configure advanced options:
Snapshot options dialog showing Filtered, Full, and Blocking snapshot types with Advanced Options

Filtered Snapshot Configuration

Select Filtered Snapshot to apply filter conditions. Filter syntax varies by Source type:
  • SQL-based Sources: Use SQL WHERE clause syntax (e.g., created_at >= '2025-01-01' AND created_at < '2025-02-01')
  • Document/NoSQL Sources: Use JSON filter expressions (e.g., {"status": "active", "created_at": {"$gte": "2025-01-01", "$lt": "2025-02-01"}})
Filtered snapshot configuration with SQL filter editor
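To make the JSON filter shape concrete, here is an illustrative matcher for MongoDB-style equality and `$gte`/`$lt` operators. It is a sketch of the filter semantics, not Streamkap's implementation:

```python
def matches(doc, flt):
    """Return True if a document satisfies a MongoDB-style filter
    supporting direct equality and the $gte/$lt operators."""
    for field, cond in flt.items():
        if isinstance(cond, dict):  # operator form, e.g. {"$gte": ...}
            val = doc.get(field)
            if val is None:
                return False
            if "$gte" in cond and not val >= cond["$gte"]:
                return False
            if "$lt" in cond and not val < cond["$lt"]:
                return False
        elif doc.get(field) != cond:  # direct equality
            return False
    return True

# The example filter from above: active documents created in Jan 2025.
flt = {"status": "active",
       "created_at": {"$gte": "2025-01-01", "$lt": "2025-02-01"}}
print(matches({"status": "active", "created_at": "2025-01-15"}, flt))  # True
print(matches({"status": "active", "created_at": "2025-02-01"}, flt))  # False
```

Note that the `$lt` upper bound excludes the boundary, which matches the closed-range guidance in the best practices below.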

Best Practices for Filtered Snapshots

When using Filtered snapshots, we strongly recommend:
  • Use closed range filters when applying comparative operators on timestamp or date fields. For example: created_at >= '2025-01-01' AND created_at < '2025-02-01'. Closed ranges ensure you capture all intended data without gaps or overlaps.
  • Filter on indexed or primary key fields for optimal performance. Filtering on columns that are part of your table’s indices or primary key allows the database to efficiently locate matching rows, significantly reducing the load on your source database.
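The closed-range recommendation can be verified with a small sketch: half-open windows (inclusive start, exclusive end) partition a timeline so every value lands in exactly one window. The window values below are illustrative:

```python
# Consecutive half-open monthly windows, as in:
#   created_at >= '2025-01-01' AND created_at < '2025-02-01'
windows = [
    ("2025-01-01", "2025-02-01"),
    ("2025-02-01", "2025-03-01"),
]

def window_for(day):
    """Return every window containing the given day (ISO date string)."""
    return [w for w in windows if w[0] <= day < w[1]]

# Boundary days match exactly one window: no gaps, no overlaps.
print(window_for("2025-01-31"))  # [('2025-01-01', '2025-02-01')]
print(window_for("2025-02-01"))  # [('2025-02-01', '2025-03-01')]
```

By contrast, inclusive upper bounds (`<= '2025-01-31'`) can miss sub-day timestamps such as `2025-01-31 12:00:00`, and inclusive bounds on both ends capture boundary rows twice across consecutive snapshots.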

Confirmation Prompt

After initiating a snapshot, you must confirm your action by typing “snapshot” in the confirmation dialog:
Snapshot confirmation dialog

Snapshot Progress

Upon triggering a snapshot, the Connector status will update to reflect the snapshot operation:
Source status tab showing snapshot progress
Also, the Topics list will show the snapshot status per table/topic:
Topics table with snapshot status column

Cancelling a Snapshot

You can cancel an in-progress snapshot from the Connector’s quick actions menu:
Source quick actions menu with Cancel Snapshot option