> ## Documentation Index
> Fetch the complete documentation index at: https://docs.streamkap.com/llms.txt
> Use this file to discover all available pages before exploring further.

# SQL Server Source FAQ

## SQL Server Sources FAQ for Streamkap

This FAQ focuses on using SQL Server as a source in Streamkap, including self-hosted setups and cloud variants (AWS RDS SQL Server, Azure SQL Database, Google Cloud SQL for SQL Server). Streamkap's SQL Server connector provides real-time CDC with managed features like automatic scaling, UI setup, and ETL transformations.

<AccordionGroup>
  <Accordion title="What is a SQL Server source in Streamkap?">
    A SQL Server source in Streamkap enables real-time Change Data Capture (CDC) from SQL Server databases, capturing row-level inserts, updates, and deletes with sub-second latency. It reads CDC change tables to stream changes to destinations, supporting snapshots for initial loads, schema evolution, and heartbeats for low-traffic databases. Streamkap abstracts complexity, offering a serverless setup via UI or API.
  </Accordion>

  <Accordion title="What SQL Server versions are supported as sources?">
    * SQL Server 2016+ for basic CDC; 2017+ for enhanced features and Linux support; 2019+ for advanced data types and performance.
    * Compatible with SQL Server 2014+ in limited modes.
    * Azure SQL Database (all service tiers with CDC support)
  </Accordion>

  <Accordion title="What SQL Server deployments are supported?">
    Streamkap supports:

    * Self-hosted (on-prem/VM)
    * AWS RDS SQL Server (Standard, Enterprise, Web editions)
    * Azure SQL Database (including Managed Instance)
    * Google Cloud SQL for SQL Server

    Streamkap also supports standalone instances, Always On Availability Groups, and failover cluster instances.
  </Accordion>

  <Accordion title="What are the key features of SQL Server sources in Streamkap?">
    * **CDC**: Change table-based via SQL Server CDC; captures inserts/updates/deletes
    * **Snapshots**: Ad-hoc/initial backfills using signal table methods; locking or non-locking modes
    * **Schema Evolution**: Requires manual refresh of CDC change tables when source table structure changes
    * **Heartbeats**: Enabled by default to prevent offset staleness in low-traffic databases
    * **Data Types**: Standard mappings (numerics, temporals, strings, binary as bytes/base64/hex, XML, hierarchyid)
    * **Ingestion Modes**: Inserts (append) or upserts
    * **Security**: SSL, Windows/SQL authentication, IP allowlisting
    * **Monitoring**: Latency, lag, CDC cleanup metrics in-app
    * Streamkap adds transaction metadata and ad-hoc snapshots with filters
  </Accordion>

  <Accordion title="How does CDC work for SQL Server sources?">
    Streamkap reads SQL Server's CDC change tables, which are automatically populated by the database engine when CDC is enabled on tables. The SQL Server Agent must be running to process CDC data. Change events are emitted as they're read from these change tables.
  </Accordion>

  <Accordion title="What is CDC cleanup and why does it matter?">
    SQL Server CDC generates change tables that grow over time. The SQL Server Agent runs cleanup jobs to purge old data based on retention settings.

    * **Default retention**: 3 days
    * **Monitor cleanup**: Ensure SQL Server Agent is running and cleanup jobs are scheduled
    * **Storage impact**: Change tables can consume significant disk space if cleanup fails

    If the SQL Server Agent is stopped or cleanup jobs fail, change tables will grow indefinitely and may cause storage issues.
  </Accordion>

  <Accordion title="How do snapshots work for SQL Server sources?">
    * Trigger ad-hoc at source/table level
    * Requires a signal table (`streamkap_signal`) with CDC enabled
    * Methods: Blocking (locks tables briefly) or incremental (chunked reads)
    * Modes: `initial` (default), `always`, `initial_only`, `no_data`, `when_needed`, `configuration_based`, `custom`

    Streamkap simplifies triggering via UI.
  </Accordion>

  <Accordion title="What are heartbeats and how do they work?">
    Connectors use "offsets"—like bookmarks—to track their position in the database's log or change stream. When no changes occur for long periods, these offsets may become outdated, and the Connector might lose its place or stop capturing changes.

    Heartbeats ensure the Connector stays active and continues capturing changes.

    There are two layers of heartbeat protection:

    #### Layer 1: Connector heartbeats (enabled by default)

    The Connector periodically emits heartbeat messages to an internal topic, even when no actual data changes are detected. This keeps offsets fresh and prevents staleness.

    No configuration is necessary for this layer; it is automatically enabled. We recommend keeping this layer enabled for all deployments.

    #### Layer 2: Source database heartbeats (recommended)

    <Info>
      **Why we recommend configuring Layer 2**

      While Layer 2 is **crucial** for low-traffic or intermittent databases, we recommend configuring it for all deployments. It provides additional resilience and helps prevent issues during periods of inactivity.
    </Info>

    You can configure regular updates to a dedicated heartbeat table in the source database. This simulates activity, ensuring change events are generated consistently, maintaining log progress and providing additional resilience.

    How this layer is configured depends on the connection type (if supported by the Source):

    * **Read-write connections** (when **Read only** is **No** during Streamkap Setup): The Connector updates the heartbeat table directly.
    * **Read-only connections** (when **Read only** is **Yes** during Streamkap Setup): A scheduled job on the **primary** database updates the heartbeat table, and these changes replicate to the read replica for the Connector to consume.

    This layer requires you to set up a heartbeat table—and for read-only connections, a scheduled job (e.g., `pg_cron` for PostgreSQL, `event_scheduler` for MySQL)—on your source database.

    For read-write connections (when **Read only** is **No** during Streamkap Setup), the Connector writes to the heartbeat table directly.

    <CodeGroup>
      ```SQL SQL theme={null}
      -- Create the heartbeat table with id, text, and last_update fields
      CREATE TABLE streamkap.streamkap_heartbeat (
        id INT IDENTITY(1,1) PRIMARY KEY,
        text NVARCHAR(MAX),
        last_update DATETIME2 DEFAULT SYSUTCDATETIME()
      );

      -- Enable change tracking on the heartbeat table
      EXEC sys.sp_cdc_enable_table  
      @source_schema        = N'streamkap',
      @source_name          = N'streamkap_heartbeat',
      @role_name            = N'streamkap_role',
      @filegroup_name       = N'Streamkap_ChangeTracking', -- Not applicable for Azure SQL Databases
      @supports_net_changes = 0
      GO

      -- Grant permission on the heartbeat table to the Streamkap role
      GRANT SELECT, UPDATE, INSERT, DELETE ON streamkap.streamkap_heartbeat TO streamkap_role;

      -- Insert the first row into the heartbeat table
      INSERT INTO streamkap.streamkap_heartbeat (text) VALUES ('test_heartbeat');
      ```
    </CodeGroup>
  </Accordion>

  <Accordion title="What data types are supported?">
    * **Basics**: Integers (TINYINT, SMALLINT, INT, BIGINT), floats (REAL, FLOAT), strings (CHAR, VARCHAR, NCHAR, NVARCHAR, TEXT), dates/timestamps (DATE, DATETIME, DATETIME2, DATETIMEOFFSET, TIME)
    * **Advanced**: Binary (BINARY, VARBINARY, IMAGE as bytes/hex), decimals (DECIMAL, NUMERIC, MONEY), XML, uniqueidentifier (GUID)
    * **Custom**: Hierarchyid, spatial types (GEOMETRY, GEOGRAPHY) with limitations
    * **Unsupported**: Non-UTF8/UTF16 encodings, deprecated types (TEXT, NTEXT, IMAGE in older modes), sql\_variant without explicit casting
  </Accordion>

  <Accordion title="How to handle table schema changes?">
    When you modify a table structure (add/remove columns, change data types), the CDC change table does **not** automatically update.

    **You must manually refresh the change table:**

    1. Create a new capture instance using `sys.sp_cdc_enable_table` with a new `@capture_instance` name
    2. Wait for Streamkap to switch to the new capture instance
    3. Disable the old capture instance using `sys.sp_cdc_disable_table`

    **Limitation**: Maximum 2 capture instances per table. See setup docs for detailed scripts.
  </Accordion>

  <Accordion title="How to monitor CDC for SQL Server sources?">
    * Check SQL Server Agent status: `EXEC master.dbo.xp_servicecontrol N'QUERYSTATE', N'SQLSERVERAGENT'`
    * Monitor change table sizes: `SELECT * FROM sys.dm_cdc_log_scan_sessions`
    * Check cleanup job status: `SELECT * FROM msdb.dbo.cdc_jobs`
    * Monitor lag via Streamkap in-app metrics

    **Best Practices**: Ensure SQL Server Agent is always running; alert on CDC cleanup failures; monitor disk space for change tables
  </Accordion>

  <Accordion title="Can CDC capture database Views, Materialized Views, and other virtual objects?">
    **No, CDC cannot capture Views or most virtual database objects.**

    **Why Views cannot be captured:**\
    CDC captures changes by reading the database transaction log (binlog, WAL, oplog, redo log, etc.). Views are query-time computations over base tables—they don't store data or generate transaction log entries. When you query a view, the database engine executes the underlying SELECT statement against the base tables. Since views don't store data, they don't generate transaction log entries.

    **What cannot be captured:**

    * **Views**: Virtual tables with no physical storage or change tables
    * **Indexed Views (Materialized Views)**: While they have physical storage, they don't generate CDC events. Capture the base tables instead.
    * **Table Variables**: Session-scoped, not logged in transactions
    * **Temporary Tables (#temp, ##temp)**: Session or connection-scoped, excluded from CDC
    * **Memory-Optimized Tables** ([In-Memory OLTP](https://learn.microsoft.com/en-us/sql/relational-databases/in-memory-oltp/in-memory-oltp-in-memory-optimization)): Extremely limited CDC support. While technically possible with SQL Server 2016 SP2+ (compatibility level 130+), memory-optimized tables use In-Memory OLTP engine which bypasses traditional transaction logging. CDC can be enabled but with significant performance penalties that negate the benefits of memory optimization. **Not recommended for CDC; capture from traditional disk-based tables instead or use alternative change tracking approaches.**
    * **System Tables** (sys.\*, information\_schema): Metadata representations, not user data
    * **CTEs (Common Table Expressions)**: Query-time constructs with no persistent storage
    * **External Tables**: Reference external data sources (PolyBase), not local storage

    **Solution:**\
    Configure CDC on the underlying base tables that power your views. The view logic can be recreated in your destination or transformation layer.

    **SQL Server-specific notes:**

    * **Indexed Views**: Even though they store data physically, changes to indexed views are captured through the base table CDC, not the view itself
    * **Partitioned Views**: Capture each underlying base table separately
    * **Memory-Optimized Tables** ([In-Memory OLTP](https://learn.microsoft.com/en-us/sql/relational-databases/in-memory-oltp/in-memory-oltp-in-memory-optimization)): While CDC can technically be enabled on memory-optimized tables in SQL Server 2016 SP2+ with database compatibility level 130+, doing so introduces significant performance overhead. Memory-optimized tables use In-Memory OLTP engine designed for lock-free, log-optimized operations. Enabling CDC forces traditional transaction logging which defeats the performance benefits. **Recommendation**: Use traditional disk-based tables for CDC capture, or consider [Change Tracking](https://learn.microsoft.com/en-us/sql/relational-databases/track-changes/about-change-tracking-sql-server) (lighter weight alternative) or [Temporal Tables](https://learn.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables) for memory-optimized scenarios.

    **Example:**\
    If you have a view `vw_sales_summary` that queries tables `Orders` and `Customers`, enable CDC on the `Orders` and `Customers` tables, then recreate the view logic downstream.
  </Accordion>

  <Accordion title="What are common limitations?">
    * SQL Server Agent must be running for CDC to function
    * Maximum 2 capture instances per table (impacts schema evolution)
    * Change table cleanup requires active SQL Server Agent jobs
    * High transaction volume can cause change table growth
    * Schema changes require manual intervention
    * Read replicas may have limitations for CDC
    * Restores disable CDC (must re-enable after restore)
  </Accordion>

  <Accordion title="How to handle deletes?">
    Captures deletes as events. SQL Server CDC records before-images for deletes by default, providing complete record information.
  </Accordion>

  <Accordion title="What security features are available?">
    Encrypted connections (SSL/TLS), SQL Server or Windows authentication, role-based access, IP allowlisting.
  </Accordion>

  <Accordion title="Troubleshooting common issues">
    * **CDC Not Working**: Verify SQL Server Agent is running; check CDC is enabled on database and tables
    * **Change Table Growth**: Monitor cleanup job execution; adjust retention settings
    * **Missing Events**: Ensure capture instance is active; verify role membership for CDC access
    * **After Database Restore**: CDC is disabled; must re-enable on database and tables
    * **Schema Changes Not Reflected**: Must manually refresh change tables (see setup docs)
  </Accordion>

  <Accordion title="Table schema evolution and change tracking tables">
    When CDC is enabled for a table and changes occur to it, change events are persisted to a 'change table' on the SQL Server database server.

    If you introduce a change in the structure of the source table, for example, by adding a new column, that change is **not** reflected in the change table.

    For as long as the change table continues to use the outdated table structure, the Streamkap SQL Server Source is unable to capture change events for the table correctly.

    You **must** intervene to refresh the change table structure because of the way that CDC is implemented in SQL Server.

    ### Refresh change table structure (Online)

    <Warning>
      **Online refresh limitation**

      In the interval between the source table structure changing in the source database, and before the change table structure is refreshed, change events continue to be captured with the **outdated** table structure.

      For example, if you added a new column to a source table, change events that are captured before the change table is refreshed will not contain the new column.

      If this cannot be tolerated, an **Offline** refresh has to be performed. However, that means downtime for whatever system made structural changes to your tables **and** your Streamkap pipelines.

      Please contact us for assistance if an Offline refresh is required.
    </Warning>

    The procedure for completing an **Online** refresh is simpler than running it Offline, and you can complete it without any downtime to your systems and Streamkap pipelines.

    For every source table that has changed, copy paste the script below into SQL Server Management Studio, change placeholders as required, and then run all queries.

    <Check>
      If you're not sure what `{refresh_table}` name to use, use `{schema}_{table}_v{N}`. For example, if the source table is `sales.orders` then you'd use `sales_orders_v2`
    </Check>

    ```sql theme={null}
    -- Replace {database}, {schema} and {table} placeholders
    -- {database}: name of the CDC enabled database
    -- {schema}: name of the schema with tables to refresh
    -- {table}: name of the table to refresh
    -- {refresh_table}: a unique name for the refreshed change table
    USE {database};
    GO
    EXEC sys.sp_cdc_enable_table
    @source_schema        = N'{schema}',
    @source_name          = N'{table}',
    @role_name            = N'streamkap_role',
    @filegroup_name       = N'Streamkap_ChangeTracking',
    @supports_net_changes = 0,
    @capture_instance     = N'{refresh_table}'
    GO
    ```

    <Danger>
      **Refresh table limitation**

      There cannot be more than 2 change tables for every source table.

      After refreshing a change table using the above script, confirm with Streamkap Support that your SQL Server Source has started streaming from the refreshed change table. Once confirmed, disable CDC on the outdated change table. Use the script below to do that.

      ```sql theme={null}
      -- Replace {database}, {schema}, {table} and {refresh_table} placeholders
      -- {database}: name of the CDC enabled database
      -- {schema}: name of the schema with the table refreshed earlier
      -- {table}: name of the table refreshed earlier
      -- {refresh_table}: name of the previous refresh table, usually {schema}_{table}
      USE {database};
      GO
      EXEC sys.sp_cdc_disable_table
      @source_schema        = N'{schema}',
      @source_name          = N'{table}',
      @capture_instance     = N'{refresh_table}'
      GO
      ```
    </Danger>
  </Accordion>

  <Accordion title="Two capture instances already exist for source table">
    If you're getting this error message when refreshing the change table structure, it's because there cannot be more than 2 change tables for every source table.

    To fix the problem, 1 of the 2 change tables for the source table need to be disabled. The scripts below can help you do that.

    ```sql theme={null}
    -- Replace {database}, {schema} and {table} placeholders
    -- {database}: name of the CDC enabled database
    -- {schema}: name of the schema with the table refreshed earlier
    -- {table}: name of the table refreshed earlier
    USE {database};
    GO

    EXEC sys.sp_cdc_help_change_data_capture
    @source_schema = N'{schema}',
    @source_name   = N'{table}'
    GO
    ```

    The above script should return 2 results, the 2 change tables for the `{table}` specified. Typically you would disable the oldest change table, so use the `create_date` column to identify the oldest one.

    When you've identified the change table to disable, use its `source_schema`, `source_table` and `capture_instance` names in the query below and execute. Then, try your refresh table script again.

    ```sql theme={null}
    -- Replace {database}, {schema}, {table} and {refresh_table} placeholders
    -- {database}: name of the CDC enabled database
    USE {database};
    GO
    EXEC sys.sp_cdc_disable_table
    @source_schema        = N'{source_schema}',
    @source_name          = N'{source_table}',
    @capture_instance     = N'{capture_instance}'
    GO
    ```
  </Accordion>

  <Accordion title="The Connector's running but there is no data">
    There can be a number of reasons. The most common are misconfiguration of CDC and privilege grants.

    To diagnose, run the following script:

    ```sql theme={null}
    -- Replace {database} placeholder
    -- {database}: name of the CDC enabled database
    USE {database};
    GO

    EXEC sys.sp_cdc_help_change_data_capture
    GO
    ```

    If the query returns an **error** or **no results** check:

    * you are connected to the SQL Server database with a user that has `DB_OWNER` role privileges.
    * if the `EXEC sys.sp_cdc_help_change_data_capture` stored procedure returns **no results**, the source tables you want Streamkap to capture may not be CDC enabled.

    If the query returns **results**, check:

    * the table(s) captured by the Connector are listed.
    * there are `capture_instance`s for the capture tables.
    * the database user or role used by the Connector are members of the `rolename` for the capture table(s).

    If you are still having issues after following the above steps, please don't hesitate to reach out to us.
  </Accordion>

  <Accordion title="The SQL Server Setup scripts are failing">
    There can be many reasons for the Setup scripts to fail, but the scripts below can help you diagnose the issues.

    ```sql theme={null}
    -- Replace {database} placeholder
    -- {database}: name of the CDC enabled database
    USE {database};
    GO

    SELECT name, database_id, source_database_id, compatibility_level, is_read_only, state, state_desc, is_in_standby, is_cleanly_shutdown, is_cdc_enabled, is_encrypted, replica_id
    FROM sys.databases
    WHERE name = '{database}' AND is_cdc_enabled=1;

    EXEC sys.sp_cdc_help_change_data_capture
    GO
    ```

    If any of the queries return an **error** or **no results**:

    * Check you connected to the SQL Server database with a user that has `DB_OWNER` role privileges
    * If the `SELECT ... FROM sys.databases ...` returns **no results**, the database you want Streamkap to capture may not be CDC enabled
    * If the `EXEC sys.sp_cdc_help_change_data_capture` stored procedure returns **no results**, the source tables you want Streamkap to capture may not be CDC enabled

    If you are still having issues after following the above steps, please don't hesitate to reach out to us.
  </Accordion>

  <Accordion title="How can I optimize schema history for large databases?">
    For large database instances with many databases or tables, schema history can impact performance. Learn about optimization settings that can reduce schema history topic size and improve connector startup times.

    See the [Schema History Optimization](/schema-history-optimization) guide for detailed information on when and how to use these settings.
  </Accordion>

  <Accordion title="Best practices for SQL Server sources">
    * Use dedicated CDC user with minimal privileges
    * Ensure SQL Server Agent is always running and monitored
    * Set appropriate CDC cleanup retention (default 3 days)
    * Limit capture to needed tables only to reduce change table size
    * Test schema evolution procedures in staging
    * Monitor disk space for change tables
    * For cloud: Use provider monitoring; enable heartbeats for low traffic
    * Document schema change procedures for your team
  </Accordion>

  <Accordion title="Planning a database version upgrade?">
    Database version upgrades require careful planning to avoid data loss and minimize downtime for your CDC pipelines. See our [Database Upgrade Guide](/database-upgrade-guide) for step-by-step instructions.
  </Accordion>
</AccordionGroup>
