Cloud Storage Examples
Examples for exporting data directly to cloud storage platforms.
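The examples below assume the credentials for the target bucket or container are already available to FastTransfer. A minimal PowerShell sketch, assuming the standard AWS SDK environment variables are honored for S3 targets (an assumption for illustration; check the documentation of your FastTransfer version for the exact credential mechanism it supports):
# Export AWS credentials before launching FastTransfer (assumed, not a documented contract)
$env:AWS_ACCESS_KEY_ID     = "<your-access-key-id>"
$env:AWS_SECRET_ACCESS_KEY = "<your-secret-access-key>"
$env:AWS_REGION            = "eu-west-1"   # region of the target bucket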
AWS S3
Export data directly to Amazon S3 buckets, organizing output by date.
S3 with Date-Based Paths
Organize files by date using date segments in the target path. The command below writes to a fixed path; the sketch that follows shows one way to derive the date segment at run time:
.\FastTransfer.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "SalesDB" `
--trusted `
--sourceschema "dbo" `
--sourcetable "DailySales" `
--directory "s3://my-bucket/data/sales/full/" `
--fileoutput "sales.parquet" `
--parallelmethod "RangeId" `
--distributekeycolumn "sale_id" `
--paralleldegree 8 `
--runid "sales_to_s3_daily"
S3 with Nested Structure
Create hierarchical folder structures; here --merge "True" combines the parallel workers' output files into a single file:
.\FastTransfer.exe `
--connectiontype "pgsql" `
--server "localhost" `
--port "5432" `
--database "ecommerce" `
--user "postgres" `
--password "postgres" `
--sourceschema "public" `
--sourcetable "orders" `
--directory "s3://analytics-bucket/exports/orders/" `
--fileoutput "orders.parquet" `
--parallelmethod "DataDriven" `
--distributekeycolumn "o_orderdate" `
--paralleldegree 10 `
--merge "True"
S3 with CSV Format
Export CSV files to S3 with manual date partitioning (a loop sketch after the command shows how to backfill several dates):
.\FastTransfer.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "LogsDB" `
--trusted `
--query "SELECT * FROM ApplicationLogs WHERE log_date = '2024-05-01'" `
--directory "s3://logs-bucket/app-logs/appdate=2024-05-01/" `
--fileoutput "app_logs.csv" `
--decimalseparator "." `
--delimiter "|" `
--dateformat "yyyy-MM-dd HH:mm:ss" `
--encoding "UTF-8"
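To backfill several days with the same partition layout, the command can be wrapped in a loop. A minimal sketch reusing the table and bucket from the example above (the date list is illustrative):
# Export one CSV partition per day
foreach ($day in @('2024-05-01', '2024-05-02', '2024-05-03')) {
  .\FastTransfer.exe `
  --connectiontype "mssql" `
  --server "localhost" `
  --database "LogsDB" `
  --trusted `
  --query "SELECT * FROM ApplicationLogs WHERE log_date = '$day'" `
  --directory "s3://logs-bucket/app-logs/appdate=$day/" `
  --fileoutput "app_logs.csv" `
  --decimalseparator "." `
  --delimiter "|" `
  --dateformat "yyyy-MM-dd HH:mm:ss" `
  --encoding "UTF-8"
}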
Azure Blob Storage
Export data to Azure Blob Storage containers. A date segment can be injected into the container path the same way as in the S3 sketches above.
Azure with Date Partitioning
.\FastTransfer.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "AnalyticsDB" `
--trusted `
--sourceschema "dbo" `
--sourcetable "Events" `
--directory "abs://mystorageaccount.blob.core.windows.net/datacontainer/events/full/" `
--fileoutput "events.parquet" `
--parallelmethod "Ntile" `
--distributekeycolumn "event_id" `
--paralleldegree 10
Azure Data Lake Gen2
Export to Azure Data Lake Storage Gen2. The command matches the Blob Storage example above, except the target URI uses the abfss:// scheme and the dfs.core.windows.net endpoint:
.\FastTransfer.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "AnalyticsDB" `
--trusted `
--sourceschema "dbo" `
--sourcetable "Events" `
--directory "abfss://mystorageaccount.dfs.core.windows.net/datacontainer/events/full/" `
--fileoutput "events.parquet" `
--parallelmethod "Ntile" `
--distributekeycolumn "event_id" `
--paralleldegree 10
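The {sourcetable} placeholder (used again in the best-practices example at the end of this page) can be combined with a shell loop to export several tables in one script. A minimal sketch reusing the connection settings above; the table names are hypothetical, and the parallel options are omitted since each table would need its own distribution column:
# Export each table to its own folder; FastTransfer substitutes {sourcetable} in the path
foreach ($table in @('Events', 'Sessions', 'PageViews')) {
  .\FastTransfer.exe `
  --connectiontype "mssql" `
  --server "localhost" `
  --database "AnalyticsDB" `
  --trusted `
  --sourceschema "dbo" `
  --sourcetable "$table" `
  --directory "abfss://mystorageaccount.dfs.core.windows.net/datacontainer/{sourcetable}/full/" `
  --fileoutput "$($table).parquet"
}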
Hourly Exports
Create exports with hourly granularity.
Hourly Data Partition by Sensor with an Incremental Time Range Derived Externally
.\FastTransfer.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "StreamingDB" `
--trusted `
--query "SELECT * FROM SensorData WHERE reading_time >= '2024-02-03 14:00:00' and reading_time < '2024-02-03 15:00:00'" `
--directory "s3://iot-bucket/sensor-data/" `
--fileoutput "sensors_20240203140000_20240203150000.parquet" `
--parallelmethod "DataDriven" `
--distributekeycolumn "sensor_id" `
--datadrivenquery "SELECT sensor_id FROM SensorList" `
--paralleldegree 8 `
--runid "hourly_sensor_export_20240203140000_20240203150000"
Best Practices for Cloud Exports
Performance Tips
Use Parquet for cloud storage: better compression and query performance, and less network traffic.
Enable parallel execution when needed:
- if you need to split the data on a business criterion
- if you have a lot of data to extract
Use a negative value for --paralleldegree to automatically set the thread count to (CPU cores / abs(degree)), leaving resources for other processes.
E.g. --paralleldegree -2 uses half the cores of the machine from which FastTransfer is launched.
Use --datadrivenquery to:
- speed up retrieval of the list of elements to extract
- implement incremental extraction, by putting the WHERE filter in --datadrivenquery instead of the main --query
Keep the degree at a level your source can sustain, to avoid saturating it.
Example with All Best Practices
.\FastTransfer.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "DataWarehouse" `
--trusted `
--sourceschema "dbo" `
--sourcetable "FactSales" `
--directory "s3://analytics-bucket/warehouse/sales/{sourcetable}" `
--fileoutput "FactSales.parquet" `
--parallelmethod "DataDriven" `
--datadrivenquery "SELECT ref_date from calendar where ref_date > cast(getdate()-30 as date)" `
--distributekeycolumn "sale_date" `
--paralleldegree -2 `
--runid "daily_sales_to_s3" `
--loglevel "Information"
Build your command with the Wizard