Requirements
- AWS Access Key and Secret Access Key with the following permissions to the destination bucket:
  - s3:GetObject
  - s3:PutObject
  - s3:AbortMultipartUpload
  - s3:ListMultipartUploadParts
  - s3:ListBucketMultipartUploads
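The permissions listed above could be granted with an IAM policy along the following lines. This is a minimal sketch, not an official Streamkap policy: the helper name build_policy and the bucket name my-streamkap-bucket are hypothetical, and the split into object-level and bucket-level statements follows how AWS scopes these actions.

```python
import json


def build_policy(bucket: str) -> dict:
    """Build an IAM policy document granting the permissions listed above.

    `bucket` is the destination bucket name (hypothetical in this sketch).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level actions apply to keys inside the bucket.
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:AbortMultipartUpload",
                    "s3:ListMultipartUploadParts",
                ],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                # Listing in-progress multipart uploads is a bucket-level action.
                "Effect": "Allow",
                "Action": ["s3:ListBucketMultipartUploads"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be pasted into the IAM console.
    print(json.dumps(build_policy("my-streamkap-bucket"), indent=2))
```

Attach the resulting policy to the IAM user whose Access Key and Secret Access Key are entered in the connector.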
Configure S3 Connector
- Name: A descriptive name for the connector
- AWS Access Key: An Access Key with the appropriate permissions for the bucket to which Streamkap will load data
- AWS Secret Access Key: The Secret Access Key with the appropriate permissions for the bucket to which Streamkap will load data
- Region: Name of the region for the bucket to which Streamkap will load data
- Bucket Name: The name of the bucket to which Streamkap will load data
- Format: The format of the file. The following options are available: CSV, JSONL, JSON and Parquet
- Filename Template: The format of the filename. See below for more information about formatting options.
- Compression Type: Compression type for output files. Supported algorithms are gzip, snappy, zstd and none. Defaults to gzip.
- Output Fields: List of fields to include in output. Available options are: key, offset, timestamp, value and headers. Defaults to value.
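As a quick sanity check before filling in the form, the allowed values and defaults described above can be captured in a few lines. This is only an illustrative sketch: check_settings and the dictionary keys (format, compression_type, output_fields) are hypothetical names chosen for this example, not a Streamkap API.

```python
# Allowed values as described in the settings above.
ALLOWED_FORMATS = {"CSV", "JSONL", "JSON", "Parquet"}
ALLOWED_COMPRESSION = {"gzip", "snappy", "zstd", "none"}
ALLOWED_OUTPUT_FIELDS = {"key", "offset", "timestamp", "value", "headers"}


def check_settings(settings: dict) -> dict:
    """Apply the documented defaults and reject undocumented values.

    The keys used here are hypothetical and exist only for this sketch.
    """
    # Documented defaults: gzip compression, only the record value in output.
    result = {"compression_type": "gzip", "output_fields": ["value"], **settings}

    if result.get("format") not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    if result["compression_type"] not in ALLOWED_COMPRESSION:
        raise ValueError(f"compression_type must be one of {sorted(ALLOWED_COMPRESSION)}")

    unknown = set(result["output_fields"]) - ALLOWED_OUTPUT_FIELDS
    if unknown:
        raise ValueError(f"unknown output fields: {sorted(unknown)}")
    return result


if __name__ == "__main__":
    print(check_settings({"format": "Parquet", "output_fields": ["key", "value"]}))
```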
Filename Template (for example, {{topic}}--{{partition}}--{{start_offset}}): The format of the filename. You can combine any of the elements below with other text or characters, including dashes (-) and underscores (_).
| Element | Description |
|---|---|
| {{topic}} | The Streamkap topic name. For example, the topic for a PostgreSQL Source table web.salesorders would be named salesorders. |
| {{partition:padding=true\|false}} | The partition number of the records in the file, typically 0. Streamkap topics and their data can be partitioned for better performance in certain scenarios. For example, a topic salesorders with 10 partitions has partitions 0 through 9. If padding is set to true, the partition number is padded with leading zeroes. The default value is false. |
| {{start_offset:padding=true\|false}} | The offset number of the first record in the file. Every record streamed has an incrementing offset number. For example, a topic salesorders with 1000 records has offsets 0 through 999. Note that in a multi-partitioned topic, offset numbers are not unique across partitions. If padding is set to true, the offset is padded with leading zeroes. The default value is false. |
| {{timestamp:unit=yyyy\|MM\|dd\|HH}} | The timestamp for when the file was created by the Connector. For example, the template {{topic}}{{timestamp:unit=yyyy}}-{{timestamp:unit=MM}} with a timestamp of 2024-01-01 20:24 would create a file named salesorders2024-01. |
| {{key}} | The Kafka key. |
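To make the substitution rules concrete, here is a minimal sketch of how a template could be expanded into a filename. It is an illustration, not the connector's actual code: render_filename is a hypothetical helper, and the padding widths (partition_width, offset_width) are placeholder assumptions, since the width the connector uses is not stated here.

```python
import re
from datetime import datetime

# Maps the documented timestamp units to strftime directives.
_UNIT_TO_STRFTIME = {"yyyy": "%Y", "MM": "%m", "dd": "%d", "HH": "%H"}

# Matches {{name}} or {{name:param=value}}.
_PLACEHOLDER = re.compile(r"\{\{(\w+)(?::(\w+)=(\w+))?\}\}")


def render_filename(template, *, topic, partition, start_offset, created_at,
                    key=None, partition_width=10, offset_width=20):
    """Expand a filename template (hypothetical helper, assumed pad widths)."""

    def substitute(match):
        name, param, value = match.groups()
        if name == "topic":
            return topic
        if name in ("partition", "start_offset"):
            number = partition if name == "partition" else start_offset
            width = partition_width if name == "partition" else offset_width
            # Padding is off unless the template says padding=true.
            padded = param == "padding" and value == "true"
            return str(number).zfill(width) if padded else str(number)
        if name == "timestamp":
            if param != "unit" or value not in _UNIT_TO_STRFTIME:
                raise ValueError("timestamp needs unit=yyyy, MM, dd or HH")
            return created_at.strftime(_UNIT_TO_STRFTIME[value])
        if name == "key":
            if key is None:
                raise ValueError("template uses {{key}} but no key was given")
            return key
        raise ValueError(f"unknown template element: {name}")

    return _PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    print(render_filename(
        "{{topic}}{{timestamp:unit=yyyy}}-{{timestamp:unit=MM}}",
        topic="salesorders", partition=0, start_offset=0,
        created_at=datetime(2024, 1, 1, 20, 24),
    ))  # prints salesorders2024-01
```

The example call reproduces the salesorders2024-01 filename from the timestamp row of the table.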
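The elements used in the template determine how records are grouped into files:
- topic, partition, start_offset, and timestamp: records are grouped by topic, partition, and timestamp.
- key: records are grouped by key.
- key, topic, and partition: records are grouped by topic, partition, and key.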
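The grouping rules in the list above can be sketched as a small lookup from template to grouping fields. This is one reading of those rules rather than the connector's implementation: grouping_fields and group_records are hypothetical helpers, and treating any key-free template built from topic, partition, start_offset, and timestamp as the first case is an assumption.

```python
import re
from collections import defaultdict


def grouping_fields(template: str) -> tuple:
    """Return the fields records are grouped by for a given template.

    Hypothetical helper; the mapping is an interpretation of the list above.
    """
    used = set(re.findall(r"\{\{(\w+)", template))
    if used == {"key"}:
        return ("key",)
    if used == {"key", "topic", "partition"}:
        return ("topic", "partition", "key")
    # Assumption: any non-empty, key-free combination of these four
    # elements falls under the first rule.
    if used and used <= {"topic", "partition", "start_offset", "timestamp"}:
        return ("topic", "partition", "timestamp")
    raise ValueError(f"unsupported element combination: {sorted(used)}")


def group_records(records, fields):
    """Bucket record dicts by the values of the given fields."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[f] for f in fields)].append(record)
    return dict(groups)


if __name__ == "__main__":
    records = [
        {"topic": "salesorders", "partition": 0, "key": "a", "value": 1},
        {"topic": "salesorders", "partition": 0, "key": "b", "value": 2},
        {"topic": "salesorders", "partition": 0, "key": "a", "value": 3},
    ]
    fields = grouping_fields("{{topic}}-{{partition}}-{{key}}")
    for group_key, members in group_records(records, fields).items():
        print(group_key, len(members))
```

With the sample records, the two records sharing key a land in one group and the record with key b in another, so a template that includes the key yields one file per key within each topic and partition.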