DocumentDB
DocumentDB Change Data Capture Setup with Streamkap
Prerequisites
- DocumentDB version ≥ 4.x
- Connection details
- Streamkap user and role
Granting Privileges
- Using the MongoDB Shell, connect to your primary node or replica set.
- Create a user for Streamkap. Replace <password> with a password of your choice.
use admin
db.createUser({
  user: "streamkap_user",
  pwd: "<password>",
  roles: [ "readAnyDatabase", { role: "read", db: "local" } ]
})
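Once the user exists, its credentials go into the connection string that the connector asks for later. The following is a minimal sketch in plain JavaScript (Node.js), not Streamkap's own tooling: `buildConnectionString` is a hypothetical helper, and the hostname, port and query options are placeholder assumptions that should be checked against your own cluster. Its main point is that the password must be percent-encoded if it contains characters such as `@`, `:` or `/`.

```javascript
// Hypothetical helper: assemble a MongoDB-style connection string for the
// Streamkap user. Host, port and options are placeholders, not values taken
// from this guide.
function buildConnectionString({ user, password, host, port = 27017, options = {} }) {
  // Percent-encode the credentials so reserved characters in the password
  // do not break URI parsing.
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  const query = new URLSearchParams(options).toString();
  return `mongodb://${credentials}@${host}:${port}/${query ? "?" + query : ""}`;
}

const uri = buildConnectionString({
  user: "streamkap_user",
  password: "p@ss:word/1",                 // example password with reserved characters
  host: "docdb.example.internal",          // placeholder hostname
  options: { tls: "true", replicaSet: "rs0", retryWrites: "false" }, // assumed options
});
console.log(uri);
```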
Enable Snapshots
You will need to create a streamkap_signal collection and give permissions to the streamkap user/role. Streamkap will use this collection for managing snapshots.
This collection can exist in a different database (on the same DocumentDB instance) to the database Streamkap captures data from.
Please create the signal collection with the name streamkap_signal. It will not be recognised if given another name.
db.createCollection("streamkap_signal")
db.grantRolesToUser("streamkap_user", [
  { role: "read", db: "{database}" },
  { role: "readWrite", db: "{database}", collection: "streamkap_signal" }
])
// When you later set up the connector, you must include this collection
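Because the collection name is fixed, it can be worth confirming it before moving on. The sketch below is one possible check, not part of Streamkap's instructions: `hasSignalCollection` is a hypothetical helper that accepts any object exposing `getCollectionNames()`, which the shell's `db` handle does.

```javascript
// The name is fixed by this guide; other names are not recognised.
const SIGNAL_COLLECTION = "streamkap_signal";

// Returns true when the given database handle contains the signal collection.
// `database` only needs a getCollectionNames() method, like the shell's `db`.
function hasSignalCollection(database) {
  return database.getCollectionNames().includes(SIGNAL_COLLECTION);
}
```

In the shell, after switching to the database where the collection was created, this would be called as `hasSignalCollection(db)`.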
Enable Change Streams
Configure the change stream log retention policy
- Retention should be at least 48 hours; 7 days is recommended
- How to modify change stream log retention policy
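As a rough illustration of the two settings above, the sketch below converts a retention period in hours to seconds and builds a command document for enabling change streams. It assumes Amazon DocumentDB, where retention is set in seconds through the `change_stream_log_retention_duration` cluster parameter and change streams are enabled with the `modifyChangeStreams` admin command; neither detail comes from this guide, so verify both against the linked instructions. The function names are hypothetical.

```javascript
const MIN_RETENTION_HOURS = 48;             // minimum stated in this guide
const RECOMMENDED_RETENTION_HOURS = 7 * 24; // 7 days, the recommended value

// Convert hours to seconds, rejecting anything below the 48-hour minimum.
function retentionSeconds(hours) {
  if (!Number.isFinite(hours) || hours < MIN_RETENTION_HOURS) {
    throw new RangeError(`retention must be at least ${MIN_RETENTION_HOURS} hours, got ${hours}`);
  }
  return hours * 3600;
}

// Build the document passed to db.adminCommand(...) in the shell
// (assumed Amazon DocumentDB command shape).
function enableChangeStreamsCommand(database, collection) {
  return { modifyChangeStreams: 1, database, collection, enable: true };
}

console.log(retentionSeconds(RECOMMENDED_RETENTION_HOURS));
console.log(enableChangeStreamsCommand("{database}", "streamkap_signal"));
```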
Consider Access Restrictions
- Visit Connection Options to ensure Streamkap can reach your database
Set Up the DocumentDB Connector in Streamkap
- Go to Sources and click Create New
- Input
- Name for your Connector
- Connection String: Copy the connection string from earlier steps
- (Optional) Connection Mode (Default replica_set)
- Array Encoding: Specify how Streamkap should encode DocumentDB array types. Array is the optimal method but requires all elements in the array to be of the same type. Document or String should be used if the DocumentDB arrays have mixed types
- Signal Table Database: Streamkap will use a collection in this database to manage snapshots, e.g. public. See Enable Snapshots for more information
- (Optional) Include Schema?: If you plan on streaming data from this DocumentDB Source to Rockset, set this option to No
- Advanced Parameters
- Represent Binary Data As (Default bytes)
- Add Schemas/Tables. You can also bulk upload them here. The upload file is a CSV without a header row, listing one schema or table per row.
- Click Save
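The Array Encoding choice in the steps above depends on whether your arrays hold a single element type. A minimal sketch of that decision follows; the helper names are hypothetical, and JavaScript's type labels are coarser than the database's own (integers and doubles both appear as `number`), so treat the result as a hint only.

```javascript
// Coarse type label for one array element.
function elementType(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (value instanceof Date) return "date";
  return typeof value; // "string", "number", "boolean", "object", ...
}

// True when every element of the array has the same coarse type.
function isHomogeneous(values) {
  return new Set(values.map(elementType)).size <= 1;
}

// Suggest an Array Encoding setting from a sample of array values.
function suggestArrayEncoding(sampleArrays) {
  return sampleArrays.every(isHomogeneous) ? "Array" : "Document or String";
}

console.log(suggestArrayEncoding([[1, 2, 3], ["a", "b"]]));
console.log(suggestArrayEncoding([[1, "a", null]]));
```

Each array is checked on its own, so a sample in which one field holds numbers and another holds strings still suggests Array.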
The connector will take approximately 1 minute to start processing data.
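For the bulk upload option in the Add Schemas/Tables step, the file is a header-less CSV with one schema or table per row. The sketch below produces that text from a list of names; `toBulkUploadCsv` is a hypothetical helper, the `database.collection` naming in the example is an assumption rather than a format confirmed by this guide, and names are assumed to contain no commas or quotes.

```javascript
// Build the text of a header-less, single-column CSV: one name per row.
// Blank entries are dropped and surrounding whitespace is trimmed.
function toBulkUploadCsv(names) {
  const rows = names.map((name) => name.trim()).filter((name) => name.length > 0);
  return rows.join("\n") + "\n";
}

// Example names are placeholders; the naming format is an assumption.
const csvText = toBulkUploadCsv(["sales.orders", " sales.customers ", ""]);
console.log(csvText);
```

The resulting text can be saved to a `.csv` file and uploaded in the connector form.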