Azure Blob Storage

Prerequisites

  • An Azure account with one of the following: the Storage Blob Data Contributor built-in role (for Shared Access Signature generation), the Storage Account Key Operator Service Role built-in role (for Account key access), or a custom role with permissions to:
    • Microsoft.Storage/storageAccounts/blobServices/generateUserDelegationKey/action
    • Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write
    • Microsoft.Storage/storageAccounts/listkeys/action
  • An Azure storage account

👍

If you have not yet created an Azure storage account, follow Microsoft's Create an Azure storage account guide.

Limitations

  • Currently, our Connector does not renew SAS tokens automatically. When credentials expire, you will need to generate new credentials and update them on your Azure Blob Storage Destination in the Streamkap app to resume Pipelines.

Azure Blob Storage Setup

Set up a blob container (optional)

You can use an existing blob container, but you may wish to create a new one to keep Streamkap-generated files completely separate from the rest of your files.

To create a new blob container:

  1. Log in to the Azure Portal
  2. At the top of the page, select Storage accounts and click on a storage account
  3. In the navigation pane for the storage account, scroll to the Data storage section and select Containers
  4. Within the Containers pane, click the + Container button to open the New container pane
  5. Provide a Name for your new container
  6. Once you're happy with the configuration, click the Create button

You will need the container name for Streamkap Setup later on.
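Azure rejects container names that break its naming rules (3 to 63 characters; lowercase letters, digits, and hyphens only; starting and ending with a letter or digit; no consecutive hyphens). The following is a minimal sketch for checking a name before you create the container. The function name `is_valid_container_name` is hypothetical and is not part of Streamkap or the Azure SDK.

```python
import re

# One letter or digit, followed by any number of (optional hyphen + letter or digit).
# This forbids leading, trailing, and consecutive hyphens, and uppercase letters.
_CONTAINER_NAME = re.compile(r"[a-z0-9](?:-?[a-z0-9])*")


def is_valid_container_name(name: str) -> bool:
    """Return True if `name` satisfies Azure's blob container naming rules."""
    return 3 <= len(name) <= 63 and _CONTAINER_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("streamkap-output", "Streamkap", "a--b", "ab"):
        print(candidate, is_valid_container_name(candidate))
```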

Get connection details

Currently our Connector supports 2 types of Azure Storage credentials:

  • Shared Access Signatures (SAS), which provide granular control over access to the storage account and its data
  • Account access keys, which are generated when Azure creates a storage account and provide full access to the account and its data

Account signed SAS (recommended)

You can use any type of SAS with our Connector, but we recommend an Account SAS as it offers longer expiration times while still providing granular control over access to the storage account and its data.

  1. Log in to the Azure Portal
  2. At the top of the page, select Storage accounts and click on a storage account
  3. In the navigation pane go to Storage browser and select Blob containers
    1. For the blob container Streamkap should use, click the ... button on the right and then Generate SAS
    2. For the Signing method, choose Account key
    3. For Permissions, select Write
    4. For the Start and expiry date/time, set a reasonably long time period (or as long as your security policies allow) to reduce how often you have to generate new SAS tokens and update your Azure Blob Storage Destinations in the Streamkap app. Azure also recommends setting the start date to at least 15 minutes in the past to avoid clock skew issues
    5. For the Allowed IP addresses, enter the Streamkap IP Addresses
    6. Click the Generate SAS token and URL button
    7. For the Blob SAS URL, click the Copy to clipboard button

You will need that URL to complete Streamkap Setup later on.
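Because the Connector does not renew SAS tokens, it can help to check what a copied Blob SAS URL actually grants and when it expires. The sketch below reads the standard SAS query parameters `sp` (permissions), `se` (expiry), and `sip` (allowed IP range) using only the Python standard library. The function name `inspect_sas_url` and the example URL are hypothetical placeholders.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def inspect_sas_url(sas_url: str, now: Optional[datetime] = None) -> dict:
    """Summarise the permissions and remaining lifetime of a Blob SAS URL."""
    now = now or datetime.now(timezone.utc)
    query = parse_qs(urlsplit(sas_url).query)

    permissions = query.get("sp", [""])[0]
    expiry_text = query.get("se", [""])[0]
    if not expiry_text:
        raise ValueError("no 'se' (signed expiry) parameter found")

    # SAS times are UTC; normalise the trailing "Z" so older Python versions parse it.
    expires_at = datetime.fromisoformat(expiry_text.replace("Z", "+00:00"))
    if expires_at.tzinfo is None:
        expires_at = expires_at.replace(tzinfo=timezone.utc)

    return {
        "can_write": "w" in permissions,
        "allowed_ips": query.get("sip", [None])[0],
        "expires_at": expires_at,
        "days_left": (expires_at - now).days,
    }


if __name__ == "__main__":
    # Placeholder URL: replace with the Blob SAS URL copied from the Azure Portal.
    url = (
        "https://myaccount.blob.core.windows.net/mycontainer"
        "?sp=w&st=2024-01-01T00:00:00Z&se=2024-07-01T00:00:00Z"
        "&sip=203.0.113.10&spr=https&sv=2022-11-02&sr=c&sig=REDACTED"
    )
    print(inspect_sas_url(url))
```

A `days_left` value approaching zero is a cue to generate a new SAS and update the Destination before Pipelines stop.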

Account access key

You can use an Account access key, which is generated when Azure creates a storage account. However, because it provides full access to the account and its data, it is neither recommended nor required for our Connector.

  1. Log in to the Azure Portal
  2. At the top of the page, select Storage accounts and click on a storage account
  3. In the navigation pane, under the Security + networking section go to Access keys
    1. For either of the keys, click the Show button next to its connection string
    2. Click the Copy button to copy the connection string to the clipboard

You will need that connection string to complete Streamkap Setup later on.
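A connection string is a list of `Key=Value` pairs separated by semicolons, whereas a Blob SAS URL is an `https://` URL carrying a `sig=` query parameter. As a rough sketch, the helpers below split a connection string into its parts and guess which kind of credential a pasted value is. Both function names are hypothetical, and the account name and key shown are placeholders.

```python
def parse_connection_string(conn_str: str) -> dict:
    """Split an Azure Storage connection string into a dict of its settings."""
    parts = {}
    for segment in conn_str.split(";"):
        if not segment.strip():
            continue
        # Split on the first "=" only: base64 account keys often end in "=".
        key, _, value = segment.partition("=")
        parts[key.strip()] = value
    return parts


def credential_kind(value: str) -> str:
    """Guess whether a value is a Blob SAS URL or an account-key connection string."""
    if value.lower().startswith("https://") and "sig=" in value:
        return "sas_url"
    if "AccountKey" in parse_connection_string(value):
        return "account_key"
    return "unknown"


if __name__ == "__main__":
    conn = (
        "DefaultEndpointsProtocol=https;AccountName=myaccount;"
        "AccountKey=abc123==;EndpointSuffix=core.windows.net"
    )
    print(parse_connection_string(conn)["AccountName"])
    print(credential_kind(conn))
```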

User delegated SAS (not recommended)

📘

Granting User delegated SAS permissions

When generating a User delegated SAS in the Azure Portal you may see the following warning message:

You don't have permissions to grant write access. You can still create a shared access signature, but you'll need an RBAC role with additional permissions before you can grant that level of access to your signature recipient.

If so, ensure that your Azure user has been assigned a built-in or custom Azure role with permissions equivalent to those you are granting. To grant Write access (required by our Connector), your Azure user needs the Storage Blob Data Contributor role or a custom role with the Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write permission.

You can use a User delegated SAS. However, because the maximum expiration period is 7 days and our Connector does not currently renew SAS tokens automatically, it's not recommended for our Connector.

  1. Log in to the Azure Portal
  2. At the top of the page, select Storage accounts and click on a storage account
  3. In the navigation pane go to Storage browser and select Blob containers
    1. For the blob container Streamkap should use, click the ... button on the right and then Generate SAS
    2. For the Signing method, choose User delegation key
    3. For Permissions, select Write
    4. For the Start and expiry date/time, set the longest time period that the 7-day maximum and your security policies allow, to reduce how often you have to generate new SAS tokens and update your Azure Blob Storage Destinations in the Streamkap app. Azure also recommends setting the start date to at least 15 minutes in the past to avoid clock skew issues
    5. For the Allowed IP addresses, enter the Streamkap IP Addresses
    6. Click the Generate SAS token and URL button
    7. For the Blob SAS URL, click the Copy to clipboard button
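To illustrate the start and expiry guidance above, here is a small sketch that computes a SAS validity window: the start is set 15 minutes in the past, and the lifetime is capped at 7 days. It assumes the 7-day limit is counted from the start time. The names `sas_window` and `format_sas_time` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

MAX_USER_DELEGATION_LIFETIME = timedelta(days=7)  # limit stated for user delegated SAS
CLOCK_SKEW_MARGIN = timedelta(minutes=15)         # start this far in the past


def sas_window(requested: timedelta, now: Optional[datetime] = None) -> Tuple[datetime, datetime]:
    """Return (start, expiry) for a user delegated SAS, capped at 7 days."""
    now = now or datetime.now(timezone.utc)
    start = now - CLOCK_SKEW_MARGIN
    expiry = start + min(requested, MAX_USER_DELEGATION_LIFETIME)
    return start, expiry


def format_sas_time(moment: datetime) -> str:
    """Format a UTC datetime as the ISO 8601 form used in SAS tokens."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    start, expiry = sas_window(timedelta(days=30))
    print(format_sas_time(start), format_sas_time(expiry))
```

Requesting 30 days still yields a 7-day window, which is why this SAS type needs frequent manual renewal.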

Streamkap Setup

  1. Go to Destinations and choose Azure Blob Storage

  2. Input the following information:

    1. Name: A unique and memorable name for this Connector

    2. Azure Connection String: The connection string (for Account access keys) or Blob SAS URL (for Shared Access Signatures)

    3. Blob container name: The name of an existing blob container to use

    4. Format (default: JSON): The format to use when writing the data to file storage

      1. Include headers (CSV format only): Include or exclude column name header row per file
    5. Storage directory: The top level directory for storing the data e.g. myfolder/subfolder

    6. Filename template (default: {{topic}}--{{partition}}--{{start_offset}}): The format of the filename. You can combine any of the elements below using other text or characters, including dashes (-) and underscores (_)

      | Element | Description |
      | :------ | :---------- |
      | {{topic}} | The Streamkap topic name. For example, the topic for a PostgreSQL Source table web.salesorders would be named salesorders. |
      | {{partition}} | The partition number of the records in the file, typically 0. Streamkap topics and their data can be partitioned for better performance in certain scenarios. For example, a topic salesorders has 10 partitions, 0 through to 9. |
      | {{start_offset}} | The offset number of the first record in the file. Every record streamed has an incrementing offset number. For example, a topic salesorders has 1000 records, offsets 0 through to 999. Note that in the case of a multi-partitioned topic, offset numbers are not unique across partitions. |
      | {{timestamp:yyyy\|MM\|dd\|HH\|mm\|ss\|SSS}} | The timestamp for when the file was created by the Connector. For example, the template {{topic}}{{timestamp:yyyy}}-{{timestamp:MM}} and a timestamp of 2024-01-01 20:24 would create a file named salesorders2024-01. |


    7. Flush size (default: 1000): Number of records to write per file

    8. File size (default: 65536): Minimum size (in bytes) per file. Records are held in memory until this file size is met or the Rotate interval is exceeded

    9. Rotate interval (default: -1 disabled): Maximum time (in milliseconds) to wait before writing records held in memory to file. When this interval is exceeded, records are written regardless of the Flush size and File size settings

    10. Compression type (default: none): The compression algorithm to use when writing each file. Depending on the Format, some compression algorithms may not be available

  3. Click the Save button
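To preview how a Filename template expands, the sketch below substitutes the elements from the table in the Filename template setting. It is an illustration only, not the Connector's implementation: the function name `render_filename` is hypothetical, and the real Connector may format values (for example, offset padding) differently.

```python
import re
from datetime import datetime

# Formatters for each supported {{timestamp:...}} pattern.
_TIMESTAMP_PARTS = {
    "yyyy": lambda t: f"{t.year:04d}",
    "MM": lambda t: f"{t.month:02d}",
    "dd": lambda t: f"{t.day:02d}",
    "HH": lambda t: f"{t.hour:02d}",
    "mm": lambda t: f"{t.minute:02d}",
    "ss": lambda t: f"{t.second:02d}",
    "SSS": lambda t: f"{t.microsecond // 1000:03d}",
}
_PLACEHOLDER = re.compile(r"\{\{(\w+)(?::(\w+))?\}\}")


def render_filename(template: str, topic: str, partition: int,
                    start_offset: int, created_at: datetime) -> str:
    """Expand {{...}} elements in a filename template (illustrative only)."""
    def substitute(match: "re.Match") -> str:
        name, arg = match.group(1), match.group(2)
        if name == "topic":
            return topic
        if name == "partition":
            return str(partition)
        if name == "start_offset":
            return str(start_offset)
        if name == "timestamp" and arg in _TIMESTAMP_PARTS:
            return _TIMESTAMP_PARTS[arg](created_at)
        raise ValueError(f"unsupported element: {match.group(0)}")

    return _PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    created = datetime(2024, 1, 1, 20, 24)
    print(render_filename("{{topic}}--{{partition}}--{{start_offset}}",
                          "salesorders", 0, 0, created))
    print(render_filename("{{topic}}{{timestamp:yyyy}}-{{timestamp:MM}}",
                          "salesorders", 0, 0, created))
```

Text outside the double braces, such as dashes and underscores, passes through unchanged.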