Overview
Use the Webhook Source to ingest data into Streamkap by accepting HTTP requests at a dynamically generated endpoint. This is useful for integrating third-party services, webhooks from external platforms, or custom applications that push data via HTTP. The connector automatically routes incoming requests to Kafka topics based on the URL path or a default topic.

Prerequisites
- An external service or application capable of sending HTTP POST, PUT, or DELETE requests
- Understanding of your data format (JSON or string)
- Plan for topic naming and routing strategy
- API key for secure access to your webhook endpoint
How It Works
- Webhook Generation: Once created, the connector generates a unique webhook URL and API key.
- Flexible Routing: Requests to `https://webhook.endpoint/topic_name` are routed to the `topic_name` topic. Requests to the base URL route to the default topic.
- Automatic Topic Creation: If you POST to a new path (e.g., `/orders`), the connector can automatically create that topic, but you should pre-register it in the Schema section for proper configuration.
- Data Transformation: Incoming data is converted to JSON format and optionally enriched with metadata (e.g., HTTP method, deletion markers).
- Message Keys: The connector extracts Kafka message keys from HTTP headers, enabling downstream systems to perform upserts and deletes based on unique identifiers.
Streamkap Setup (UI)
- Navigate to Sources and choose Webhook.
- Fill in the fields:
- Name – A memorable identifier for this webhook source (e.g., “External Service Webhook”).
- Webhook URL – Auto-generated and displayed as read-only after creation. This URL is unique to your webhook. Share this URL with external services. Example: `https://wh-abc123xyz-tenant-internal.streamkap.net`.
- API Key – Auto-generated and displayed as read-only after creation. External services must include this in requests using the `X-API-Key` header.
- Data Format – Choose `json` or `string`:
  - JSON – Expects JSON payloads; automatically parsed and converted to Kafka records.
  - String – Treats the raw HTTP body as a string value.
- Add Delete Field – Enable to add a `delete: true` field when DELETE HTTP requests are received. Useful for marking deletions in downstream systems.
- Camel Message Header Key – The name of the HTTP header containing the Kafka message key (default: `key`). This is critical for enabling upserts and deletes in downstream consumers.
  - The header value must be a valid JSON object representing the key fields.
  - The same key field(s) must also be present in the request body/payload.
  - Example: If you want to upsert records by `id`, set this to `key` and include the `id` field in both the header and the body.
  - See the Message Key Extraction section below for detailed examples.
- Schema Configuration:
  - Define the topics where incoming data will be stored.
  - Default Topic Name – The topic that receives all requests to the base webhook URL (default: `default_topic`). Must be added to the Schema section below.
  - Topic Names – Additional topics to route requests to based on URL paths. Add topic names you plan to use. Examples:
    - `orders` – Requests to `https://wh-abc123xyz-tenant-internal.streamkap.net/orders` route here.
    - `payments` – Requests to `https://wh-abc123xyz-tenant-internal.streamkap.net/payments` route here.
    - `events` – Requests to `https://wh-abc123xyz-tenant-internal.streamkap.net/events` route here.
  - Important: Always include `default_topic` in the list. If you expect dynamic topics, pre-register them here or wait for the automatic topic discovery service to detect them (which may take time).
- Save the Source.
- Attach to Pipelines and configure downstream Sinks as needed.
Authentication
All requests must include the API key in the `X-API-Key` header. Requests with a missing or invalid key receive a `401 Unauthorized` response.
Example Request with API Key
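A minimal authenticated request might look like the following (the endpoint is the example URL from above; `YOUR_API_KEY` and the payload fields are placeholders):

```shell
# POST a JSON payload to the base URL (routes to the default topic).
curl -X POST "https://wh-abc123xyz-tenant-internal.streamkap.net" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"event": "signup", "user": "alice"}'
# A 202 Accepted response means the request was queued for processing.
```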
Message Key Extraction
The connector supports extracting Kafka message keys from HTTP headers. This enables downstream systems to perform upserts (update or insert) and deletes based on unique identifiers.

How It Works
- Header Configuration: Set the Camel Message Header Key to the name of the HTTP header containing the key (default: `key`).
- Key Format: The header value must be a valid JSON object representing the key fields.
- Dual Presence: The key field(s) must appear in both the HTTP header AND the request body for consistency.
- Single Record Assumption: The connector assumes each HTTP request contains exactly one message with one key.
Key Extraction Examples
Example 1: Simple ID-Based Key
Setup:
- Camel Message Header Key: `key`
- Body includes an `id` field

Result:
- Message Key: `{"id": 12345}`
- Message Value: the full JSON request body, including the same `id` field

Downstream sinks can then upsert on the `id` field, replacing old records with the same `id`.
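A request matching this setup might look like this (the endpoint, API key, and `status` field are illustrative):

```shell
# The key travels in the header, and the same id field appears in the body.
curl -X POST "https://wh-abc123xyz-tenant-internal.streamkap.net/orders" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H 'key: {"id": 12345}' \
  -d '{"id": 12345, "status": "shipped"}'
```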
Example 2: Composite Key
Setup:
- Camel Message Header Key: `key`
- Body includes both `customer_id` and `order_number` fields

Result:
- Message Key: `{"customer_id": "CUST-123", "order_number": "ORD-001"}`
- Message Value: the full JSON request body, including both key fields
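A matching request might look like this (the endpoint, API key, and `amount` field are illustrative):

```shell
# Composite key: both fields appear in the header and in the body.
curl -X POST "https://wh-abc123xyz-tenant-internal.streamkap.net/orders" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H 'key: {"customer_id": "CUST-123", "order_number": "ORD-001"}' \
  -d '{"customer_id": "CUST-123", "order_number": "ORD-001", "amount": 49.99}'
```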
Example 3: DELETE with Key
Setup:
- Camel Message Header Key: `key`
- Add Delete Field: Enabled

Result:
- Message Key: `{"user_id": "USER-456"}`
- Message Value: the request body with `"delete": true` appended

The downstream sink detects the `delete: true` field and removes the record with key `{"user_id": "USER-456"}` from the target system.
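A matching DELETE request might look like this (the endpoint and API key are illustrative):

```shell
# DELETE with the key header; with Add Delete Field enabled, the connector
# appends "delete": true to the record it writes to Kafka.
curl -X DELETE "https://wh-abc123xyz-tenant-internal.streamkap.net/users" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H 'key: {"user_id": "USER-456"}' \
  -d '{"user_id": "USER-456"}'
```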
Example 4: Custom Header Name
Setup:
- Camel Message Header Key: `msg_key` (custom header name)
- Body includes a `product_id` field

Result:
- Message Key: `{"product_id": "PROD-789"}`
- Message Value: the full JSON request body, including the same `product_id` field
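A matching request might look like this (the endpoint, API key, `/products` path, and `name` field are illustrative):

```shell
# The key travels in the custom msg_key header instead of the default "key".
curl -X POST "https://wh-abc123xyz-tenant-internal.streamkap.net/products" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H 'msg_key: {"product_id": "PROD-789"}' \
  -d '{"product_id": "PROD-789", "name": "Widget"}'
```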
Best Practices for Message Keys
- Always Include Both: Ensure the key field(s) exist in both the HTTP header and the request body.
- Use Immutable IDs: Use unique, immutable identifiers (e.g., order ID, user ID) as keys to avoid data inconsistencies.
- Composite Keys: For datasets requiring uniqueness across multiple fields, use composite keys (e.g., `customer_id` + `order_number`).
- Consistency Across Calls: Use the same key structure for all requests to the same topic.
- Valid JSON: The header value must be valid JSON. Invalid JSON in the key header will cause request failures.
Using Your Webhook
Once the webhook is created, you’ll see:
- Webhook URL: The endpoint external services should POST/PUT/DELETE to
- API Key: Required for authentication via the `X-API-Key` header
Routing Behavior
| Request URL | Resulting Topic | Notes |
|---|---|---|
| `POST https://wh-abc123xyz-tenant-internal.streamkap.net` | `default_topic` | Base URL routes to default topic |
| `POST https://wh-abc123xyz-tenant-internal.streamkap.net/orders` | `orders` | Path-based routing (must be pre-registered in Schema) |
| `PUT https://wh-abc123xyz-tenant-internal.streamkap.net/customers` | `customers` | PUT requests also supported |
| `DELETE https://wh-abc123xyz-tenant-internal.streamkap.net/users` with key header | `users` | DELETE requests require the key header to identify the record to delete. Body must also contain the key field(s). |
Data Format & Transformation
JSON Format
Incoming JSON payloads are automatically parsed and sent as Kafka records.

String Format
Raw string bodies are wrapped in a `data` field.
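The string-format transformation can be sketched with `jq` (illustrative only; the connector performs this wrapping internally):

```shell
# A raw string body such as "user signed up" becomes a JSON record
# with a single "data" field (-R: read raw text, -c: compact output).
echo 'user signed up' | jq -Rc '{data: .}'
# → {"data":"user signed up"}
```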
DELETE Method Handling
If Add Delete Field is enabled and a DELETE request is received, the connector appends a `delete: true` field to the record before writing it to Kafka.

Topic Discovery & Auto-Creation
The webhook connector supports flexible URL paths:
- Pre-registered Topics – Topics explicitly added in the Schema section always accept requests.
- Dynamic Topics – New paths automatically create topics, but:
- The topic must be added to the Schema section in advance for guaranteed routing.
- If not pre-registered, a background topic discovery service will detect and register new topics, but this may take time (typically a few minutes).
- To avoid delays, register all expected topics upfront.
Example Workflow
- You plan to receive data for `orders`, `payments`, and `events`.
- Add all three topics to the Topic Names field in the Schema section.
- External services can now reliably POST to `/orders`, `/payments`, or `/events`.
- If a request arrives for `/refunds` (not pre-registered), the connector will create the topic, but there may be a delay before it appears in your pipeline.
Request Limits & Performance
The webhook connector supports large payloads with the following defaults:

| Setting | Default | Purpose |
|---|---|---|
| Max Initial Line Length | 16 KB | HTTP request line size limit |
| Max Header Size | 64 KB | Total HTTP headers size limit |
| Max Chunk Size | 50 MB | Maximum payload size per request |
| Backlog Size | 200 | Queued connections before rejection |
Data Format Examples
JSON POST to Orders Topic with Key
Resulting record:
- Key: `{"order_id": "ORD-001"}`
- Value: the full JSON request body, including the same `order_id` field
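The request producing this record might look like this (the endpoint, API key, and `amount` field are illustrative):

```shell
curl -X POST "https://wh-abc123xyz-tenant-internal.streamkap.net/orders" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H 'key: {"order_id": "ORD-001"}' \
  -d '{"order_id": "ORD-001", "amount": 99.50}'
```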
String POST to Events Topic
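With the string data format, a plain-text POST can be sent like this (the endpoint, API key, and `text/plain` content type are illustrative):

```shell
# Plain-text body; the connector wraps it as {"data": "..."}.
curl -X POST "https://wh-abc123xyz-tenant-internal.streamkap.net/events" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: text/plain" \
  -d 'user 42 logged in'
```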
The raw request body is stored as a string in the record's `data` field.

DELETE Request with Delete Field and Key
Resulting record:
- Key: `{"user_id": "USER-456"}`
- Value: the request body with `"delete": true` appended
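The request producing this record might look like this (the endpoint and API key are illustrative):

```shell
# The key header identifies the record; the body repeats the key field.
curl -X DELETE "https://wh-abc123xyz-tenant-internal.streamkap.net/users" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H 'key: {"user_id": "USER-456"}' \
  -d '{"user_id": "USER-456"}'
```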
Common Scenarios
Scenario 1: Multi-Topic Event Ingestion with Upserts
Setup:
- Default Topic: `default_topic`
- Topic Names: `orders`, `payments`, `events`, `notifications`
- Data Format: JSON
- Camel Message Header Key: `key`
- Third-party service POSTs orders to `.../orders` with a `key` header containing the order ID.
- Billing system POSTs payments to `.../payments` with a `key` header containing the payment ID.
- Downstream sink connectors upsert records based on these keys, updating existing records instead of creating duplicates.
- Each goes to its respective topic, enabling independent processing.
Scenario 2: Generic Event Logging
Setup:
- Default Topic: `webhook_events`
- Topic Names: (empty or just default)
- Data Format: String
- All requests, regardless of path, go to `webhook_events`.
- Useful for centralized logging of third-party webhooks.
Scenario 3: Soft Deletes with Tombstones
Setup:
- Default Topic: `users`
- Add Delete Field: Enabled
- Data Format: JSON
- Camel Message Header Key: `key`
- POST `{"user_id": 123, "name": "Alice"}` with header `key: {"user_id": 123}` creates a record.
- DELETE with body `{"user_id": 123}` and header `key: {"user_id": 123}` adds `"delete": true` to mark deletion.
- Downstream systems can process deletes accordingly, removing records by key.
Integration Examples
Integrating Shopify Webhooks
Configure Shopify to send order events to your webhook endpoint. Order data then flows into the `shopify_orders` topic in real-time.
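A Shopify order delivery can be simulated with curl (the endpoint and API key are placeholders; real Shopify order payloads contain many more fields):

```shell
curl -X POST "https://wh-abc123xyz-tenant-internal.streamkap.net/shopify_orders" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H 'key: {"id": 820982911946154508}' \
  -d '{"id": 820982911946154508, "total_price": "254.98", "currency": "USD"}'
```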
Integrating GitHub Webhooks
Configure GitHub to send push and pull request events to your webhook endpoint. Events then flow into the `github_events` topic.
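A GitHub push event delivery can be simulated like this (the endpoint and API key are placeholders; real push payloads are much larger):

```shell
curl -X POST "https://wh-abc123xyz-tenant-internal.streamkap.net/github_events" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"ref": "refs/heads/main", "repository": {"full_name": "acme/app"}, "pusher": {"name": "alice"}}'
```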
Custom Application Integration with Keys
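From a custom application, a keyed request can be sent like this (a sketch; the endpoint, API key, and payload fields are placeholders):

```shell
# Send a keyed record so downstream sinks can upsert on order_id.
curl -X POST "https://wh-abc123xyz-tenant-internal.streamkap.net/orders" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H 'key: {"order_id": "ORD-123"}' \
  -d '{"order_id": "ORD-123", "status": "created"}'
```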
From your application, send data via HTTP and include the same key fields in both the `key` header and the request body so downstream sinks can upsert reliably.

Error Handling
| Status Code | Meaning | Action |
|---|---|---|
| 202 Accepted | Request received and queued for processing | Success; data will be in Kafka shortly |
| 400 Bad Request | Malformed request, invalid format, or invalid JSON in key header | Check JSON syntax, Content-Type header, and key header format |
| 401 Unauthorized | Missing or invalid API key | Verify the API key in the X-API-Key header |
| 413 Payload Too Large | Request exceeds max chunk size (50 MB) | Reduce payload size or split into multiple requests |
| 429 Too Many Requests | Rate limit exceeded | Implement backoff and retry logic |
| 500 Internal Server Error | Connector issue | Check connector logs; retry after a delay |
Security Notes
- API Key Protection: Treat the API key like a password. Do not commit it to version control or expose in client-side code. Store securely in environment variables or secrets management systems.
- HTTPS Only: Always use HTTPS for webhook requests in production. Plain HTTP is not recommended.
- Key Header Security: The `key` header contains sensitive identifying information. Ensure HTTPS is used to prevent interception.
- IP Whitelisting: If available, restrict webhook access to known IP ranges of external services.
- Payload Validation: Implement server-side validation in downstream consumers to ensure data integrity and key consistency.
- Rate Limiting: External services should implement exponential backoff for `429` responses.
Limitations
- No Built-in Deduplication: If external services retry requests, duplicate records may appear. Implement deduplication downstream using unique message keys or IDs.
- Single Record per Request: The connector assumes each HTTP request contains exactly one Kafka message. Batch requests are not supported at the connector level.
- Single-Instance Deployment: The connector runs as a single task (`tasks.max=1`). Horizontal scaling requires multiple connector instances.
- No Request Buffering: Incoming requests are processed sequentially. High-volume ingestion may experience latency.
- Topic Pre-Registration Recommended: While auto-creation works, pre-registering topics in the Schema section ensures reliable routing without delays.
- Key Header Validation: Invalid JSON in the key header will cause the request to fail. Always validate key format before sending.
Monitoring & Troubleshooting
Checking Message Flow
Use your pipeline dashboard to monitor:
- Events Written – Total messages ingested from the webhook
- Volume – Data volume in bytes
- Topic Distribution – Messages per topic
Common Issues
Issue: Requests to new topics fail or are delayed
Solution: Pre-register all expected topics in the Schema section.
Issue: 401 Unauthorized errors
Solution: Verify the API key is included in the `X-API-Key` header with the correct value.
Issue: 400 Bad Request with JSON format selected
Solution: Ensure the request body is valid JSON and the `Content-Type: application/json` header is set. Verify the key header contains valid JSON.
Issue: Data appears in wrong topic
Solution: Check the URL path matches a registered topic name. Typos or unregistered paths may route to the default topic.
Issue: Downstream upserts not working
Solution: Ensure the message key is correctly extracted. Verify the key header contains valid JSON and the same key fields exist in the request body.
Issue: Invalid JSON in key header
Solution: Use tools like `jq` or online JSON validators to ensure your key JSON is well-formed. Example: `key: $(echo '{"id": 123}' | jq -c .)`.
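As a quick sanity check before sending, `jq` can validate and compact the key in one step:

```shell
# -c prints compact JSON suitable for a header value; -e makes jq exit
# non-zero on invalid input, so a malformed key fails fast locally.
KEY=$(echo '{"id": 123}' | jq -ce .)
echo "$KEY"
# → {"id":123}
```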