AWS Glue
Stream data into AWS Glue
Prerequisites
- An AWS user with sufficient privileges to create and configure IAM users and roles.
AWS Glue Setup
1. Create IAM User
We recommend creating a dedicated IAM user and role with only the minimum necessary access.
- Go to the Amazon IAM Console.
- Click Users in the sidebar.
- Click Add users.
- Enter a User name, e.g., streamkap_user
- Set the Access type:
  - ✅ Check “Access key – Programmatic access”
  - ❌ Uncheck “Console access” (not needed)
- Click Next.
- Choose Attach policies directly and create a custom policy:
  - Click Create policy (opens in new tab).
  - Go to the JSON tab and paste this policy, replacing any <...> placeholders as required:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:CreateTable",
        "glue:GetTable",
        "glue:GetTables",
        "glue:UpdateTable",
        "glue:DeleteTable",
        "glue:CreatePartition",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:BatchCreatePartition",
        "glue:UpdatePartition",
        "glue:DeletePartition",
        "glue:BatchDeletePartition"
      ],
      "Resource": [
        "arn:aws:glue:<region>:<account_id>:catalog",
        "arn:aws:glue:<region>:<account_id>:database/<glue_database_name>",
        "arn:aws:glue:<region>:<account_id>:table/*/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<bucket_name>",
        "arn:aws:s3:::<bucket_name>/*"
      ]
    }
  ]
}
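Filling in the placeholders by hand is easy to get wrong, so the substitution can be scripted. The following is a minimal Python sketch, not part of any Streamkap or AWS tooling: the function name `render_policy`, the shortened template (same Resource ARNs as above, Action lists trimmed to one entry each), and all sample values are assumptions made for illustration.

```python
import json
import re

# Shortened template for illustration: same Resource ARNs as the policy above,
# with each Action list trimmed to a single entry.
POLICY_TEMPLATE = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["glue:GetTable"],
      "Resource": [
        "arn:aws:glue:<region>:<account_id>:catalog",
        "arn:aws:glue:<region>:<account_id>:database/<glue_database_name>",
        "arn:aws:glue:<region>:<account_id>:table/*/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::<bucket_name>",
        "arn:aws:s3:::<bucket_name>/*"
      ]
    }
  ]
}
"""

PLACEHOLDER = re.compile(r"<([a-z_]+)>")


def render_policy(template: str, values: dict) -> dict:
    """Replace every <name> placeholder and return the parsed policy."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder <{name}>")
        return values[name]

    rendered = PLACEHOLDER.sub(substitute, template)
    # json.loads fails loudly if the substitution produced invalid JSON.
    return json.loads(rendered)


# Hypothetical sample values; replace them with your own.
policy = render_policy(POLICY_TEMPLATE, {
    "region": "us-east-1",
    "account_id": "123456789012",
    "glue_database_name": "analytics",
    "bucket_name": "example-iceberg-bucket",
})
print(json.dumps(policy, indent=2))
```

Raising on a missing value means a forgotten placeholder is caught before the policy is pasted into the IAM console.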
For the Connector to access AWS Glue, it must be able to assume the role. Add the following trust policy to that role:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::300973880807:role/kafkaConnectTenantAccessRole"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
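Before saving the trust policy, it can help to confirm that the document actually names the expected principal and action. The sketch below is a simple structural check in Python, assuming the policy is available as JSON text. The helper names `build_trust_policy` and `allows_assume_role` are hypothetical, and the check is not a full IAM policy evaluation: it ignores conditions, wildcards, and Deny statements.

```python
import json

# Principal ARN copied from the trust policy above.
STREAMKAP_PRINCIPAL = "arn:aws:iam::300973880807:role/kafkaConnectTenantAccessRole"


def build_trust_policy(principal_arn: str) -> dict:
    """Build a trust policy that lets principal_arn assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def _as_list(value):
    """IAM JSON allows a single string or a list; normalize to a list."""
    return [value] if isinstance(value, str) else list(value)


def allows_assume_role(policy: dict, principal_arn: str) -> bool:
    """Return True if an Allow statement grants sts:AssumeRole to principal_arn."""
    for statement in policy.get("Statement", []):
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # e.g. "Principal": "*" is out of scope for this check
        principals = _as_list(principal.get("AWS", []))
        actions = _as_list(statement.get("Action", []))
        if (
            statement.get("Effect") == "Allow"
            and principal_arn in principals
            and "sts:AssumeRole" in actions
        ):
            return True
    return False


trust_policy = build_trust_policy(STREAMKAP_PRINCIPAL)
print(json.dumps(trust_policy, indent=2))
print(allows_assume_role(trust_policy, STREAMKAP_PRINCIPAL))
```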
Streamkap Setup
1. Create the Destination
- Navigate to Add Connectors.
- Choose Iceberg.
- Name: Enter a name for your connector.
2. Connection Settings
- Catalog Type: The type of Iceberg catalog.
  - GLUE:
    - AWS IAM Role: AWS IAM role (e.g., arn:aws:iam:::role/)
- Region: The AWS region to be used.
- S3 Bucket Path: Path to the storage location for the Iceberg tables.
- Schema: An Iceberg table name prefix—equivalent to a database schema (e.g., public, sales, analytics).
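A mistyped role ARN is a common reason for a connection to fail, so a quick format check before saving can be useful. The following Python sketch is an assumption-laden illustration, not a Streamkap feature: the function name `is_iam_role_arn` is hypothetical, and the pattern only checks the general shape of an IAM role ARN (partition, 12-digit account ID, `role/` resource), not whether the role exists.

```python
import re

# General shape of an IAM role ARN:
#   arn:<partition>:iam::<12-digit account id>:role/<optional path and name>
ROLE_ARN_PATTERN = re.compile(r"arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+")


def is_iam_role_arn(value: str) -> bool:
    """Return True if value looks like a complete IAM role ARN."""
    return ROLE_ARN_PATTERN.fullmatch(value) is not None


# Hypothetical values for illustration only.
print(is_iam_role_arn("arn:aws:iam::123456789012:role/streamkap_role"))
print(is_iam_role_arn("arn:aws:iam::123456789012:user/streamkap_user"))
```

The first call passes the check and the second fails it, because the resource is a user rather than a role.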
3. Ingestion Settings
- Ingestion Mode: Specifies the strategy used to insert events into the Iceberg tables.
  Changing ingestion mode: append and upsert modes use different, incompatible methods for loading data into the Iceberg tables. To change modes for an existing Iceberg Connector, create a new Iceberg Destination instead, i.e. a separate destination for insert and another for upsert.
  - upsert mode:
    - Primary key fields: Optional. A comma-separated list of field names to use as record identifiers when a primary key is not present.
Click Save.
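To make the difference between the append and upsert ingestion modes concrete, here is a small in-memory Python illustration. It is a conceptual sketch only and does not reflect how Streamkap or Iceberg actually load data. The function names, the sample events, and the use of `id` as the key field are all assumptions.

```python
def parse_key_fields(raw: str) -> list:
    """Split a comma-separated 'Primary key fields' value into field names."""
    return [field.strip() for field in raw.split(",") if field.strip()]


def apply_append(table: list, events: list) -> list:
    """Append mode: every event becomes a new row."""
    return table + [dict(event) for event in events]


def apply_upsert(table: list, events: list, key_fields: list) -> list:
    """Upsert mode: an event replaces the existing row with the same key."""
    rows = {tuple(row[k] for k in key_fields): dict(row) for row in table}
    for event in events:
        rows[tuple(event[k] for k in key_fields)] = dict(event)
    return list(rows.values())


# Hypothetical change events for a single source table.
events = [
    {"id": 1, "status": "new"},
    {"id": 2, "status": "new"},
    {"id": 1, "status": "shipped"},
]

key_fields = parse_key_fields("id")
appended = apply_append([], events)
upserted = apply_upsert([], events, key_fields)

print(len(appended))  # 3 rows: full history of events
print(upserted)       # 2 rows: latest state per id
```

Append keeps every event as its own row, while upsert keeps one row per key. Because the two results have different shapes, mixing both modes in one destination table would not be meaningful.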