Prerequisites
- DocumentDB version ≥ 4.x
- A database user with sufficient privileges to create database users and collections
DocumentDB Setup
1. Grant Database Access
- Configure one of the Connection Options to ensure Streamkap can reach your database.
2. Enable Change Streams
Change streams allow applications to access real-time data changes. The Connector relies on DocumentDB’s implementation of this.
Set Change Stream Log Retention Policy
Change stream logs should be retained for a minimum of 48 hours. We recommend 7 days.
3. Create Database User
It’s recommended to create a separate user for the Connector to access your DocumentDB database.
- Using MongoDB Shell, connect to your primary node or replica set.
- Create a user for Streamkap using the script below. Replace password with a password of your choice.
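The shell steps above (enabling change streams and creating the user) might look like the following in the MongoDB Shell. This is a minimal sketch, not Streamkap's official script: the `readAnyDatabase` role and the use of the `admin` database for the user are assumptions, and the function and constant names are hypothetical. `modifyChangeStreams` is the DocumentDB admin command for toggling change streams; empty database and collection names enable them for every database and collection. Log retention is configured on the cluster parameter group rather than in the shell (the `change_stream_log_retention_duration` parameter, in seconds, to the best of our knowledge), so the script only computes the two durations mentioned above.

```javascript
// Sketch for the MongoDB Shell (mongosh). Roles and helper names are assumptions.

// Command document that enables change streams. Empty strings for
// "database" and "collection" enable them for all databases and collections;
// pass names to narrow the scope.
function enableChangeStreamsCommand(database = "", collection = "") {
  return { modifyChangeStreams: 1, database, collection, enable: true };
}

// User document for the Connector. The role list is an assumption;
// adjust it to your own security policy.
function streamkapUserDocument(password) {
  return {
    user: "streamkap_user",
    pwd: password,
    roles: [{ role: "readAnyDatabase", db: "admin" }],
  };
}

// Retention durations from the text above, expressed in seconds.
const MINIMUM_RETENTION_SECONDS = 48 * 60 * 60;          // 48 hours
const RECOMMENDED_RETENTION_SECONDS = 7 * 24 * 60 * 60;  // 7 days

// `db` only exists inside the shell, so this part is skipped elsewhere.
if (typeof db !== "undefined") {
  db.adminCommand(enableChangeStreamsCommand());
  // Replace "password" with a password of your choice.
  db.getSiblingDB("admin").createUser(streamkapUserDocument("password"));
}
```

One way to run it is to save the script to a file and pass it to `mongosh` with `--file` while connected to the cluster.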
4. Enable Snapshots
To backfill your data, the Connector needs to be able to perform snapshots. See Snapshots & Backfilling for more information. You will need to create a streamkap_signal collection and grant the streamkap_user permissions on it. The Connector will use this collection for managing snapshots.
This collection can live in a different database (on the same DocumentDB instance) from the one Streamkap captures data from.
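Creating the signal collection and granting access could be sketched as follows in the MongoDB Shell. The database name `streamkap` is a placeholder, and granting the built-in `readWrite` role on that database is an assumption; a narrower custom role may suit your environment better. The sketch also assumes the streamkap_user was created in the `admin` database.

```javascript
// Sketch for the MongoDB Shell (mongosh). The database name is a placeholder.
const SIGNAL_DATABASE = "streamkap";          // assumed name, choose your own
const SIGNAL_COLLECTION = "streamkap_signal"; // collection name from the text above

// Roles to grant so the Connector can read and write the signal collection.
// Using readWrite on the whole database is an assumption.
function signalRoles(database) {
  return [{ role: "readWrite", db: database }];
}

// `db` only exists inside the shell, so this part is skipped elsewhere.
if (typeof db !== "undefined") {
  db.getSiblingDB(SIGNAL_DATABASE).createCollection(SIGNAL_COLLECTION);
  db.getSiblingDB("admin").grantRolesToUser(
    "streamkap_user",
    signalRoles(SIGNAL_DATABASE)
  );
}
```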
Streamkap Setup
Follow these steps to configure your new connector:
1. Create the Source
- Navigate to Add Connectors.
- Choose DocumentDB.
2. Connection Settings
- Name: Enter a name for your connector.
- Connection String: The DocumentDB connection string.
- Connection Mode (optional): Default is replica_set.
- Array Encoding: Specify how Streamkap should encode DocumentDB array types. Array is the optimal method but requires all elements in the array to be of the same type. Document or String should be used if the DocumentDB arrays have mixed types.
- Include Schema? (optional): If you plan on streaming data from this DocumentDB Source to Rockset, set this option to No.
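To decide on an Array Encoding, it can help to check whether the arrays in a sample of your documents hold a single element type. The sketch below is a rough heuristic under stated assumptions: `typeTag` and `suggestArrayEncoding` are hypothetical helpers, and JavaScript's `typeof` is coarser than BSON typing (for example, integers and doubles both report as `number`), so treat the result as a hint only.

```javascript
// Rough type tag for a value. BSON distinguishes more types than this.
function typeTag(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (value instanceof Date) return "date";
  return typeof value;
}

// Suggest "Array" when every element shares one type tag,
// otherwise suggest one of the mixed-type encodings.
function suggestArrayEncoding(values) {
  const tags = new Set(values.map(typeTag));
  return tags.size <= 1 ? "Array" : "Document or String";
}

console.log(suggestArrayEncoding([1, 2, 3]));         // prints "Array"
console.log(suggestArrayEncoding([1, "two", null]));  // prints "Document or String"
```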
3. Snapshot Settings
- Signal Table Database: Streamkap will use a collection in this database to manage snapshots. See Enable Snapshots for more information.
4. Advanced Parameters
- Represent binary data as: Specifies how the data for binary columns should be interpreted. Your destination for this data can impact which option you choose. Default is bytes.
5. Database and Collection Capture
- Add Database/Collections: Specify the database(s) and collection(s) for capture.
- You can bulk upload here. The format is a simple list of databases and collections, with each entry on a new row. Save as a .csv file without a header.
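A bulk upload file can be generated rather than typed by hand. The sketch below is illustrative only: the exact row layout is not specified here, so writing each row as `database.collection` is an assumption (the separator is a parameter so it can be changed), and the database and collection names are made up. Check the upload dialog for the layout it expects.

```javascript
// Build header-less CSV text with one entry per row.
// The "database<separator>collection" row layout is an assumption.
function toBulkUploadCsv(entries, separator = ".") {
  return entries
    .map(({ database, collection }) => `${database}${separator}${collection}`)
    .join("\n") + "\n";
}

// Hypothetical databases and collections.
const csvText = toBulkUploadCsv([
  { database: "sales", collection: "orders" },
  { database: "sales", collection: "customers" },
]);
console.log(csvText);
```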
Have questions? See the DocumentDB Source FAQ for answers to common questions about DocumentDB sources, troubleshooting, and best practices.