Azure Configuration

Complete guide for configuring Azure Blob Storage and Azure Data Lake Gen2 connectivity with FastTransfer.

Supported Azure Storage Services

Service                 Protocol    Best For
Azure Blob Storage      abs://      General-purpose object storage
Azure Data Lake Gen2    abfss://    Big data analytics and hierarchical namespaces

Authentication Methods

FastTransfer supports multiple Azure authentication methods:

  1. Azure CLI (recommended for local development)
  2. Connection String (simple but less secure)
  3. Storage Account Key (via environment variables)
  4. Managed Identity (recommended for Azure infrastructure)

1. Azure CLI

The simplest method for local development and testing.

Setup:

# Install Azure CLI
# Windows: Download from https://aka.ms/installazurecliwindows
# Linux: curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash

# Login to Azure
az login

# List subscriptions
az account list --output table

# Set active subscription (if you have multiple)
az account set --subscription "My Subscription Name"
# or by ID
az account set --subscription "12345678-1234-1234-1234-123456789abc"

# Verify access to storage account
az storage account show --name mystorageaccount --resource-group myresourcegroup

FastTransfer will automatically use the Azure CLI's authentication credentials.
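
To confirm that the CLI login also grants data-plane access to blobs, you can list a container directly (optional; mystorageaccount and mycontainer are placeholders for your own names):

# Verify data-plane (blob) access with the CLI credentials
az storage blob list \
--account-name mystorageaccount \
--container-name mycontainer \
--auth-mode login \
--output table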

Using with FastTransfer:

# No additional configuration needed - just run FastTransfer
./FastTransfer \
...
--directory "abs://mystorageaccount.blob.core.windows.net/mycontainer/exports" \
--fileoutput "data.parquet" \
...

2. Connection String

Useful for CI/CD pipelines and automated workflows.

Get Connection String:

  1. Navigate to your Storage Account
  2. Go to Security + networking → Access keys
  3. Copy the Connection string for key1 or key2

Set Connection String:

# Set the connection string (PowerShell)
$env:AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;AccountName=mystorageaccount;AccountKey=xxxxx;EndpointSuffix=core.windows.net"

# Run FastTransfer
.\FastTransfer.exe `
...
--directory "abs://mystorageaccount.blob.core.windows.net/container/exports" `
--fileoutput "data.parquet" `
...
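
On Linux agents the same variable can be exported in the shell. This is the Bash equivalent of the PowerShell example above, assuming FastTransfer reads the variable identically on both platforms:

# Bash: set the connection string for the current session
export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;AccountName=mystorageaccount;AccountKey=xxxxx;EndpointSuffix=core.windows.net"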

3. Storage Account Key

Alternative to connection string using separate environment variables.

Get Storage Account Key:

  1. Navigate to your Storage Account
  2. Go to Security + networking → Access keys
  3. Copy key1 or key2

Set Environment Variables:

# Set the storage account name and key (PowerShell).
# Note: variable names may differ by client; AZURE_STORAGE_ACCOUNT and
# AZURE_STORAGE_KEY are the Azure CLI's conventional names.
$env:AZURE_STORAGE_ACCOUNT="mystorageaccount"
$env:AZURE_STORAGE_KEY="<AccountKey>"


# Run FastTransfer
.\FastTransfer.exe `
...
--directory "abs://mystorageaccount.blob.core.windows.net/mycontainer/exports" `
--fileoutput "data.parquet" `
...
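
The Bash equivalents, with the same caveat about variable names, plus an optional CLI check that the key itself is valid:

# Bash: set account name and key for the current session
export AZURE_STORAGE_ACCOUNT="mystorageaccount"
export AZURE_STORAGE_KEY="<AccountKey>"

# Optional: confirm the key works by listing containers
az storage container list \
--account-name "$AZURE_STORAGE_ACCOUNT" \
--account-key "$AZURE_STORAGE_KEY" \
--output table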

4. Managed Identity

When FastTransfer runs on Azure infrastructure, it can use Managed Identity automatically; no credential configuration is needed.

Supported Azure Services:

  • Azure Virtual Machines
  • Azure App Service
  • Azure Functions
  • Azure Container Instances
  • Azure Kubernetes Service (AKS)

Enable System-Assigned Managed Identity:

  1. Navigate to your Azure resource (VM, App Service, etc.)
  2. Go to Identity
  3. Enable System assigned identity
  4. Save changes

Grant Storage Access:

# Get the Managed Identity Object ID
PRINCIPAL_ID=$(az vm show --name myVM --resource-group myResourceGroup --query identity.principalId -o tsv)

# Assign Storage Blob Data Contributor role
az role assignment create \
--assignee $PRINCIPAL_ID \
--role "Storage Blob Data Contributor" \
--scope "/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.Storage/storageAccounts/{storage-account}"
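
Role assignments can take a minute or two to propagate. To confirm the identity received the role (using the same placeholders as above):

# Verify the role assignment
az role assignment list \
--assignee $PRINCIPAL_ID \
--scope "/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.Storage/storageAccounts/{storage-account}" \
--output table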

Azure Blob Storage

URI Format

abs://storageaccount.blob.core.windows.net/container/path/

URI Examples

# Root of container
--directory "abs://mystorageaccount.blob.core.windows.net/mycontainer/"

# Folder in container
--directory "abs://mystorageaccount.blob.core.windows.net/mycontainer/exports"

# Nested folders
--directory "abs://mystorageaccount.blob.core.windows.net/mycontainer/exports/sales/2024"

# Date partitioning
--directory "abs://mystorageaccount.blob.core.windows.net/mycontainer/yearmonth=202401/"
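
If the target container does not exist yet, it can be created up front with the CLI (mystorageaccount and mycontainer are the placeholders used above):

# Create the target container
az storage container create \
--name mycontainer \
--account-name mystorageaccount \
--auth-mode login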

Complete Examples

Basic Export:

.\FastTransfer.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "SalesDB" `
--trusted `
--sourceschema "dbo" `
--sourcetable "Customers" `
--directory "abs://mystorageaccount.blob.core.windows.net/mycontainer/exports/customers" `
--fileoutput "customers.parquet"

Parallel Export with Query:

.\FastTransfer.exe `
--connectiontype "pgsql" `
--server "localhost" `
--port "5432" `
--database "ecommerce" `
--user "postgres" `
--password "postgres" `
--query "SELECT * FROM orders WHERE order_date >= '2024-01-01'" `
--directory "abs://mystorageaccount.blob.core.windows.net/container/exports/orders" `
--fileoutput "orders_2024.parquet" `
--parallelmethod "Random" `
--distributekeycolumn "order_id" `
--paralleldegree 8

Azure Data Lake Gen2

Azure Data Lake Storage Gen2 is recommended for analytics workloads due to hierarchical namespace support and better performance.

URI Format

abfss://storageaccount.dfs.core.windows.net/filesystem/path/

Key Differences from Blob Storage

  • Protocol: abfss:// instead of abs://
  • Endpoint: .dfs.core.windows.net instead of .blob.core.windows.net
  • Features: Hierarchical namespace, directory-level operations, better performance
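
Data Lake Gen2 is a storage account capability rather than a separate service: the account must be created with the hierarchical namespace enabled. For reference, such an account can be created like this (all resource names and the location are placeholders):

# Create a storage account with hierarchical namespace (ADLS Gen2)
az storage account create \
--name mystorageaccount \
--resource-group myresourcegroup \
--location westeurope \
--sku Standard_LRS \
--enable-hierarchical-namespace true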

URI Examples

# Root of filesystem
--directory "abfss://mystorageaccount.dfs.core.windows.net/datalake"

# Folder in filesystem
--directory "abfss://mystorageaccount.dfs.core.windows.net/datalake/raw/sales"

# Hive-style partitioning
--directory "abfss://mystorageaccount.dfs.core.windows.net/datalake/sales/yearmonth=202401"

Complete Examples

Basic Export:

.\FastTransfer.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "DataWarehouse" `
--trusted `
--sourceschema "dbo" `
--sourcetable "FactSales" `
--directory "abfss://mystorageaccount.dfs.core.windows.net/datalake/raw/{sourcetable}/" `
--fileoutput "FactSales.parquet"

Parallel Export with Partitioning:

.\FastTransfer.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "Analytics" `
--trusted `
--query "SELECT * FROM Events" `
--directory "abfss://mystorageaccount.dfs.core.windows.net/mycontainer/{sourcedatabase}/events/" `
--fileoutput "events.parquet" `
--parallelmethod "DataDriven" `
--distributekeycolumn "EventDate" `
--datadrivenquery "SELECT RefDate FROM Calendar WHERE RefDate BETWEEN '2024-01-01' AND '2024-03-01'" `
--paralleldegree 12

Required Permissions

Azure RBAC Roles

Recommended Role:

  • Storage Blob Data Contributor - Full read/write access to blobs

Alternative Roles:

  • Storage Blob Data Owner - Full access including ACL management
  • Storage Blob Data Reader - Read-only access (for verification)

Assign Role via Azure Portal

  1. Navigate to your Storage Account
  2. Go to Access Control (IAM)
  3. Click Add → Add role assignment
  4. Select Storage Blob Data Contributor
  5. Select user, group, or managed identity
  6. Click Save

Assign Role via Azure CLI

# For a user
az role assignment create \
--assignee user@example.com \
--role "Storage Blob Data Contributor" \
--scope "/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.Storage/storageAccounts/{storage-account}"

# For a managed identity
az role assignment create \
--assignee {managed-identity-object-id} \
--role "Storage Blob Data Contributor" \
--scope "/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.Storage/storageAccounts/{storage-account}"
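
As with managed identities, an assignment can be verified after creation (same placeholders as above):

# Verify the role assignment
az role assignment list \
--assignee user@example.com \
--scope "/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.Storage/storageAccounts/{storage-account}" \
--output table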