> ## Documentation Index
> Fetch the complete documentation index at: https://docs.streamkap.com/llms.txt
> Use this file to discover all available pages before exploring further.

# ClickHouse Cloud

> ClickHouse Cloud destination configuration for Streamkap CDC

## Prerequisites

* ClickHouse User
* ClickHouse database hostname
* For nested arrays/structs: `flatten_nested=0` must be set on the ClickHouse server or user level
* Dialect version must match your ClickHouse server version (e.g., ClickHouse 22.x requires dialect version 22)

## Delivery Guarantees

Streamkap offers **at-least-once** delivery guarantee for ClickHouse and defaults to `upsert` mode.

For `append` (or insert) mode, duplicate records will be inserted into destination tables, but, ClickHouse materialized views can filter them out.

With `upsert` mode (default) we carry out deduplication using the records primary key.

<Info>
  See [Inserts/Upserts](/insertsupserts) for more information about ingestion modes.
</Info>

## Data duplication

<Info>
  Only applies to `upsert` configured ClickHouse Destinations and destination tables.
</Info>

Although using the ClickHouse [`FINAL`](https://clickhouse.com/docs/en/sql-reference/statements/select/from#final-modifier) query modifier has [drawbacks](https://clickhouse.com/docs/en/sql-reference/statements/select/from#drawbacks) , we have seen situations where users `SELECT` queries on the destination tables return some duplicate records and the `FINAL` query modifier helps avoid that.

Behind-the-scenes, `upsert` ingestion mode relies on ClickHouse's `ReplacingMergeTree` [table engine](https://clickhouse.com/docs/en/engines/table-engines) to deduplicate data in the background instead of when the data's inserted or merged (e.g. `MERGE INTO` for other data warehouses like Snowflake or DataBricks) into the destination tables.

To save having to specify the `FINAL` query modifier in every query, you can consider altering the querying user's role as shown below:

<CodeGroup>
  ```SQL SQL theme={null}
  -- Replace { ... } placeholders as required
  ALTER ROLE {ROLENAME} SETTINGS final = 1;
  ```
</CodeGroup>

## Data Type Support

<Info>
  All mappings are done on a 'best effort, nearest equivalent' basis because many data warehouses and data processing systems have unique data types or share data types but interpret them differently.
</Info>

Our Connector supports the following data types and mappings:

| Streamkap                                                                   | ClickHouse                                                                              |
| --------------------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
| `INT8`                                                                      | `Int8`                                                                                  |
| `INT16`                                                                     | `Int16`                                                                                 |
| `INT32`                                                                     | `Int32`                                                                                 |
| `FLOAT32`                                                                   | `Float32`                                                                               |
| `FLOAT64`                                                                   | `Float64`                                                                               |
| `BOOLEAN`                                                                   | `Bool`                                                                                  |
| `BYTES`                                                                     | `BLOB`(String)                                                                          |
| `org.apache.kafka.connect.data.Decimal`                                     | `DECIMAL(38, 0)`                                                                        |
| `org.apache.kafka.connect.data.Timestamp` `io.debezium.time.ZonedTimestamp` | `DateTime64`                                                                            |
| `org.apache.kafka.connect.data.Date`                                        | `Date`                                                                                  |
| `io.debezium.data.Json`                                                     | `String`(JSON) or `JSON`(if supported by ClickHouse server & `JSON Support` is enabled) |
| `STRUCT`(e.g. maps, key/value pairs)                                        | `Tuple`                                                                                 |
| `ARRAY`                                                                     | `Array`                                                                                 |

## ClickHouse Setup

It's recommended to create a separate user and role for Streamkap to access your ClickHouse database. Below is an example script that does that.

<CodeGroup>
  ```SQL SQL theme={null}
  -- Replace { ... } placeholders as required
  CREATE ROLE STREAMKAP_ROLE;
  GRANT SELECT, CREATE, SHOW, INSERT, ALTER ON {database}.* TO STREAMKAP_ROLE;

  CREATE USER streamkap_user IDENTIFIED WITH sha256_password BY '{password}'
  GRANT STREAMKAP_ROLE TO streamkap_user;
  ```
</CodeGroup>

## Allowing Streamkap access

IP access lists in ClickHouse determine what services can connect to your ClickHouse database.

From your ClickHouse Cloud services list:

1. Choose the service and then **Settings**
2. Click **+ Add entry** and enter [Streamkap IP Addresses](/streamkap-ip-addresses)
3. Enter a description e.g. **Streamkap app**
4. Click **Add entry**

<Info>
  See [Setting IP Filters](https://clickhouse.com/docs/en/cloud/security/setting-ip-filters) for more information about ClickHouse IP access lists.
</Info>

## Obtaining connection details

You will need the **hostname** of your ClickHouse database when setting up the Destination in Streamkap.

From your ClickHouse Cloud services list:

1. Choose the service and then **Connect**
2. At the bottom of the connection window you will see `https://` followed by the hostname like this `abc123.us-west-2.aws.clickhouse.cloud`. The part after `https://` (excluding the port `:port`) is the hostname

## Handling semi-structured data

To support complex data structures such as nested arrays of structs and nested structs containing arrays, the Connector's user role needs to be configured as shown below:

<CodeGroup>
  ```SQL SQL theme={null}
  ALTER ROLE STREAMKAP_ROLE SETTINGS allow_experimental_object_type = 1;
  ```
</CodeGroup>

## Streamkap Setup

1. Go to [Destinations](https://app.streamkap.com/connectors/add?tab=Destinations) and choose **ClickHouse**

2. Input the following information:

   1. **Name** - A unique and memorable name for this Connector
   2. **Hostname** - The IP or URL of your ClickHouse instance
   3. **Ingestion Mode** (default: `upsert`) - See [Inserts/Upserts](/insertsupserts) for information about ingestion modes
   4. **Port** (default: `8443`)
   5. **Username** (case sensitive) - `STREAMKAP_USER` or the username you chose
   6. **Password** - Password for the database user
   7. **Database** - The name of the database
   8. **SSL** - SSL is enabled by default on ClickHouse instances
   9. **Tasks** - If Pipelines for this Destination have lag that's continuously growing over time and not reducing, increase the number of Tasks, otherwise, leave as default
   10. **JSON Support** - Allow JSON data type in ClickHouse, make sure the ClickHouse server supports it.

3. Click **Save**

## Metadata Columns

<Snippet file="metadata-columns.mdx" />

For usage patterns, see [Metadata](/metadata).

## Troubleshooting

<AccordionGroup>
  <Accordion title="Performance tuning">
    If you are experiencing performance issues or lag with your ClickHouse destination, adjustments to poll records, partition counts, and task parallelism can help. See [Performance Tuning](/performance-tuning) for detailed guidance on optimizing pipeline throughput and latency.
  </Accordion>
</AccordionGroup>

## Related Documentation

* [Ingestion Modes](/insertsupserts) - Understand insert vs. upsert ingestion modes
* [Delete Handling](/insertsupserts#delete-handling) - Configure how delete events are processed at the destination
