Oracle Sources FAQ for Streamkap
This FAQ focuses on using Oracle Database as a source in Streamkap, including self-hosted setups and cloud variants (AWS RDS Oracle). Streamkap’s Oracle connector provides real-time CDC with managed features like automatic scaling, UI setup, and ETL transformations.
What is an Oracle source in Streamkap?
An Oracle source is an Oracle database (self-hosted or cloud) that Streamkap connects to and continuously reads changes from using LogMiner-based CDC, delivering inserts, updates, and deletes to your destination in real time.
What Oracle versions are supported as sources?
- Oracle Database 11g+ for basic CDC; 12c+ for enhanced features like multitenant architecture and JSON support; 19c+ for latest optimizations and cloud compatibility.
- Compatible with Oracle Database 11.2+
- Oracle RAC (Real Application Clusters) supported
What Oracle deployments are supported?
- Self-hosted (on-prem/VM)
- AWS RDS Oracle (Standard Edition, Enterprise Edition)
- Oracle Cloud Infrastructure (OCI)
- Oracle RAC (Real Application Clusters)
What are the key features of Oracle sources in Streamkap?
- CDC: LogMiner-based via redo logs; captures inserts/updates/deletes with SCN (System Change Number) tracking
- Snapshots: Ad-hoc/initial backfills using signal table methods; locking or non-locking modes
- Schema Evolution: Automatic handling of DDL changes when enabled; tracks table structure modifications
- Heartbeats: Enabled by default to prevent redo log buildup in low-traffic databases
- Data Types: Standard mappings (numerics, temporals, strings, LOBs, RAW as bytes/base64/hex, XML, JSON in 12c+)
- Ingestion Modes: Inserts (append) or upserts
- Security: SSL/TLS, Oracle wallet, IP allowlisting
- Monitoring: Latency, lag, redo log metrics, LogMiner session stats in-app
- Streamkap adds transaction metadata and ad-hoc snapshots with filters
How does CDC work for Oracle sources?
- Supplemental logging must be enabled (database and table level)
- Archive log mode must be enabled
- LogMiner privileges required for the Streamkap user
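For reference, a sketch of the typical prerequisite checks and grants, run as a DBA. The user name streamkap_user and the captured table are illustrative, and the exact privilege list Streamkap requires may differ slightly from this:

```sql
-- Verify the database is running in archive log mode (should return ARCHIVELOG)
SELECT LOG_MODE FROM V$DATABASE;

-- If it is not, enabling it requires a brief restart:
--   SHUTDOWN IMMEDIATE;
--   STARTUP MOUNT;
--   ALTER DATABASE ARCHIVELOG;
--   ALTER DATABASE OPEN;

-- Typical LogMiner-related grants for the connector user
GRANT CREATE SESSION TO streamkap_user;
GRANT LOGMINING TO streamkap_user;                      -- LogMiner privilege (12c+)
GRANT EXECUTE ON DBMS_LOGMNR TO streamkap_user;
GRANT SELECT ON V_$DATABASE TO streamkap_user;
GRANT SELECT ON V_$LOG TO streamkap_user;
GRANT SELECT ON V_$LOGMNR_CONTENTS TO streamkap_user;
GRANT SELECT ANY TRANSACTION TO streamkap_user;

-- Plus SELECT on each captured table, e.g. (names are illustrative):
GRANT SELECT ON MYSCHEMA.CUSTOMERS TO streamkap_user;
```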
What is supplemental logging and why does it matter?
- Database-level: ALTER DATABASE ADD SUPPLEMENTAL LOG DATA
- Table-level: ALTER TABLE ... ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS (recommended for complete change records)
- Without supplemental logging, redo logs may not contain complete before/after images
- Affects ability to capture deletes and updates with full column data
- Must be enabled before CDC starts capturing
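A minimal sketch of enabling and verifying supplemental logging; MYSCHEMA.CUSTOMERS is a placeholder table:

```sql
-- Database-level (minimal) supplemental logging
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

-- Table-level, all columns (recommended so deletes/updates carry full column data)
ALTER TABLE MYSCHEMA.CUSTOMERS ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

-- Verify database-level logging (expect YES)
SELECT SUPPLEMENTAL_LOG_DATA_MIN FROM V$DATABASE;

-- Verify table-level logging (expect ALL COLUMN LOGGING)
SELECT LOG_GROUP_TYPE
FROM   DBA_LOG_GROUPS
WHERE  OWNER = 'MYSCHEMA' AND TABLE_NAME = 'CUSTOMERS';
```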
How do snapshots work for Oracle sources?
- Trigger ad-hoc at source/table level
- Requires a signal table (streamkap_signal); see the example insert after this list
- Methods: blocking (uses locks) or incremental (chunked by ROWID or primary key)
- Modes: initial (default), always, initial_only, no_data, when_needed, configuration_based, custom
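A sketch of triggering an ad-hoc incremental snapshot by inserting into the signal table. The column layout below follows the common Debezium signal-table convention (id, type, data), and MYSCHEMA.CUSTOMERS is a placeholder; check the Streamkap docs for the exact table definition your pipeline expects:

```sql
-- Request an incremental snapshot of one table; the id only needs to be unique
INSERT INTO STREAMKAP_SIGNAL (id, type, data)
VALUES (
  'adhoc-snapshot-1',
  'execute-snapshot',
  '{"data-collections": ["MYSCHEMA.CUSTOMERS"], "type": "incremental"}'
);
COMMIT;
```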
What are heartbeats and how do they work?
Heartbeats are enabled by default. The connector periodically emits heartbeat events so that its redo log position (SCN) keeps advancing even when the captured tables are idle, which prevents redo/archive log buildup in low-traffic databases. You can also configure source database heartbeats, where the connector runs a small statement against a dedicated heartbeat table to generate redo activity.
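One common pattern for source database heartbeats is a tiny dedicated table that the connector updates on an interval. The table and column names below are illustrative; follow the heartbeat configuration guide for the exact setup Streamkap expects:

```sql
-- Dedicated heartbeat table owned by the connector user (names are illustrative)
CREATE TABLE STREAMKAP_HEARTBEAT (
  ID             NUMBER PRIMARY KEY,
  LAST_HEARTBEAT TIMESTAMP
);
INSERT INTO STREAMKAP_HEARTBEAT (ID, LAST_HEARTBEAT) VALUES (1, SYSTIMESTAMP);
COMMIT;

-- Statement the connector can run periodically so its SCN position keeps moving
-- even when the captured tables are idle
UPDATE STREAMKAP_HEARTBEAT SET LAST_HEARTBEAT = SYSTIMESTAMP WHERE ID = 1;
```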
What data types are supported?
- Basics: Numbers (NUMBER, INTEGER, FLOAT), strings (VARCHAR2, CHAR, NVARCHAR2, NCHAR, CLOB), dates/timestamps (DATE, TIMESTAMP, TIMESTAMP WITH TIME ZONE)
- Advanced: Binary (RAW, LONG RAW, BLOB as bytes/hex), XML (XMLTYPE), JSON (12c+, stored as VARCHAR2/CLOB)
- Custom: ROWID, UROWID, spatial types (SDO_GEOMETRY) with limitations
- Unsupported: BFILE (external file pointers), deprecated LONG types (use CLOB instead), nested tables, VARRAYs without flattening, Oracle object types without conversion
How to monitor redo logs for Oracle sources?
- Check the archive log generation rate
- Monitor active LogMiner sessions
- Check redo log retention and space (example queries for these checks follow this list)
- Retain archive logs for at least 3-5 days
- Alert on archive log destination disk space
- Monitor supplemental logging overhead
- Track LogMiner query performance
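Example monitoring queries for the checks above, assuming the monitoring user can read the relevant V$ views; adapt thresholds and retention to your environment:

```sql
-- Archive log generation per day (count and volume in MB)
SELECT TRUNC(COMPLETION_TIME)                         AS log_date,
       COUNT(*)                                       AS archived_logs,
       ROUND(SUM(BLOCKS * BLOCK_SIZE) / 1024 / 1024)  AS mb_generated
FROM   V$ARCHIVED_LOG
GROUP  BY TRUNC(COMPLETION_TIME)
ORDER  BY log_date;

-- Active LogMiner sessions
SELECT * FROM V$LOGMNR_SESSION;

-- Online redo log groups, sizes, and status
SELECT GROUP#, THREAD#, BYTES / 1024 / 1024 AS size_mb, STATUS FROM V$LOG;

-- Fast recovery area usage (a common archive log destination)
SELECT NAME,
       SPACE_LIMIT / 1024 / 1024 AS limit_mb,
       SPACE_USED  / 1024 / 1024 AS used_mb
FROM   V$RECOVERY_FILE_DEST;
```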
Can CDC capture database Views, Materialized Views, and other virtual objects?
CDC captures changes by reading the database transaction log (binlog, WAL, oplog, redo log, etc.). Views are query-time computations over base tables: when you query a view, the database engine executes the underlying SELECT statement against the base tables. Because views don’t store any data of their own, they never generate transaction log entries, so there is nothing for CDC to read.
What cannot be captured:
- Views: Virtual tables with no physical storage or redo log entries
- Materialized Views (MVs): While they have physical storage, their refresh operations may not generate standard DML redo logs, or may generate bulk operations that are difficult to track incrementally. Solution: Capture the base tables that feed the materialized view, then recreate the MV logic downstream.
- Materialized View Logs: Internal structures for fast refresh; not user data
- Global Temporary Tables (GTT): Session or transaction-scoped, redo logging depends on ON COMMIT setting, generally excluded from CDC
- External Tables: Reference files outside the database, no redo logging
- Index-Organized Tables (IOT): Can be captured, but with special considerations for ROWID-based snapshots
- System Tables (DBA_*, ALL_*, USER_*, V$*): Metadata views, not user data
- CTEs (Common Table Expressions): Query-time constructs with no persistent storage
- Object Tables: Tables based on Oracle object types; require special handling
Configure CDC on the underlying base tables that power your views. The view logic can be recreated in your destination or transformation layer.
Oracle-specific notes:
- Materialized Views: Depending on refresh method (COMPLETE, FAST, FORCE), may generate redo logs, but tracking incremental changes is problematic. Always capture the base tables instead.
- Materialized View with ON COMMIT refresh: Generates redo logs but as part of transaction commit; difficult to associate with source changes
- Partitioned Views: Capture each base table in the partition view union
- Index-Organized Tables: Can be captured, but ensure supplemental logging includes the primary key (which is the physical organization key)
Example: if you have a view V_CUSTOMER_ORDERS that joins tables CUSTOMERS and ORDERS, enable supplemental logging and CDC on the CUSTOMERS and ORDERS base tables, then recreate the join logic in your destination or transformation layer. If you have a materialized view MV_DAILY_SALES that aggregates from the SALES table, capture the SALES table and perform the aggregation downstream.
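A sketch for the V_CUSTOMER_ORDERS example above; the schema MYAPP and the column names are hypothetical. Capture the base tables in Oracle, then rebuild the join in the destination or transformation layer:

```sql
-- On the Oracle source: capture the base tables, not the view
ALTER TABLE MYAPP.CUSTOMERS ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
ALTER TABLE MYAPP.ORDERS    ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

-- In the destination / transformation layer: recreate the join logic
-- (column names here are illustrative)
CREATE VIEW V_CUSTOMER_ORDERS AS
SELECT c.CUSTOMER_ID,
       c.CUSTOMER_NAME,
       o.ORDER_ID,
       o.ORDER_DATE,
       o.ORDER_TOTAL
FROM   CUSTOMERS c
JOIN   ORDERS    o ON o.CUSTOMER_ID = c.CUSTOMER_ID;
```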
What are common limitations?
- Archive log mode must be enabled (not enabled by default in many deployments)
- Supplemental logging overhead on write performance
- LogMiner has performance impact on high-transaction databases
- Large transactions can cause memory pressure in LogMiner sessions
- Redo log space management critical for continuous CDC
- Schema changes may require connector restart to refresh table metadata
- Long-running transactions can delay change event visibility
- Oracle RAC requires all nodes to have archive logs accessible
How to handle deletes?
Delete operations are captured from the redo log along with inserts and updates (see How does CDC work above). Capturing the full column data for a delete requires ALL COLUMNS supplemental logging on the table.
What security features are available?
SSL/TLS encryption in transit, Oracle wallet support, and IP allowlisting (see the key features list above).
Why does the Connector need CREATE TABLE privileges?
The Connector uses CREATE TABLE to create a small bookkeeping table, LOG_MINING_TABLE, in which it periodically records how far it has read in the redo log. If the Connector fails, it can use the last recorded value from that table to recover.
Do I need to resize my redo logs for a database with low data volume and traffic?
How do I find the Container and pluggable database names?
Connect to the root container and query V$DATABASE; if it is a container database, the CDB column should show YES. The pluggable database names can then be listed from V$PDBS.
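Commonly used queries for this check, run as a privileged user from the root container:

```sql
-- Confirm you are connected to a container database (CDB column should show YES)
SELECT NAME, CDB FROM V$DATABASE;

-- List the pluggable database names
SELECT NAME, OPEN_MODE FROM V$PDBS;

-- In SQL*Plus you can also use:
--   SHOW PDBS
--   SHOW CON_NAME
```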
Can I create an Oracle local user instead?
No. The Streamkap user should be created as a common (CDB-level) user in the root container, because the connector relies on LogMiner V$ tables which are not available from within a PDB.
How do I find the database endpoint and port? (AWS RDS)
- Sign into your AWS account
- Once signed in, navigate to the RDS dashboard by clicking Services in the top left corner, then Databases, then RDS, or by typing RDS into the search box at the top left
- From the AWS RDS Dashboard, click on DB Instances or Databases in the left side menu
- Click on the DB identifier for the Oracle database you want Streamkap to use. The Database Details page should appear
- Under the Connectivity & security section you will find Endpoint & port
Does the AWS automated backups retention period need to be large enough to accommodate the archivelog retention hours?
No, they are two independent settings:
- archivelog retention hours determines how long the logs are retained locally in the database storage
- AWS automated backups retention period determines how long the logs are retained by AWS outside of the database storage
Once logs exceed the archivelog retention hours they are removed from the database storage and are then retained by AWS, outside of the database storage, for the AWS automated backups retention period. The AWS automated backups exist so you can recover your database in the event of a disaster. That retention period does not affect the Streamkap Connector; only the archivelog retention hours does.
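On AWS RDS Oracle, the local archive log retention is set through the rdsadmin package. The 72-hour value below is only an example, in line with the 3-5 day guidance elsewhere in this FAQ; size it to cover your longest expected connector downtime:

```sql
-- Show the current RDS configuration, including archivelog retention hours
SET SERVEROUTPUT ON;
EXEC rdsadmin.rdsadmin_util.show_configuration;

-- Retain archive logs locally for 72 hours (example value)
EXEC rdsadmin.rdsadmin_util.set_configuration('archivelog retention hours', 72);
COMMIT;
```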
Troubleshooting common issues
- CDC Not Working: Verify archive log mode enabled; check supplemental logging at database and table levels
- Redo Log Buildup: Enable heartbeats; monitor archive log destination space; adjust retention
- Missing Events: Ensure supplemental logging includes ALL COLUMNS; verify LogMiner privileges
- Performance Issues: Monitor LogMiner session resource usage; limit captured tables; tune redo log size
- Connection Failures: Check listener status, TNS configuration, firewall rules
- Schema Changes Not Reflected: Restart connector to refresh metadata after DDL operations
How can I optimize schema history for large databases?
Best practices for Oracle sources
- Use dedicated CDC user with minimal required privileges (LogMiner, SELECT on tables)
- Enable supplemental logging with ALL COLUMNS for complete change records
- Ensure archive log mode is enabled and monitored
- Set appropriate archive log retention (3-5 days minimum)
- Limit capture to needed schemas/tables to reduce LogMiner overhead
- Monitor redo log generation rate and space
- Test schema evolution procedures in staging
- For cloud: Use provider monitoring; enable heartbeats for low traffic
- Document supplemental logging configuration for team reference
- Consider impact of supplemental logging on write-heavy workloads