Advanced Scenarios
Advanced use cases covering DuckDB, ClickHouse, SAP HANA, Teradata, and Netezza integration, advanced SQL Server techniques, and other complex transfer scenarios.
DuckDB Integration
FastTransfer can use DuckDB as a source engine, which brings powerful in-flight data transformation capabilities to a transfer.
DuckDB Memory - Parquet with Nested Fields to MSSQL
Process Parquet files with complex nested (JSON-like) fields using DuckDB's in-memory engine:
.\FastTransfer.exe `
--sourceconnectiontype "duckdb" `
--sourceserver ":memory:" `
--query "select * EXCLUDE(Risques, Interlocuteurs),unnest(interlocuteurs,recursive := true ) from (select day, unnest(contexte_chantier, recursive := true), unnest(detail_chantier, recursive := true) from read_parquet('D:\CLIENTS\ENEDIS\BI4ALL\CHANTIER\**\*.parquet', hive_partitioning=true)) t_row_unest" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targettrusted `
--targetdatabase "ENDSBI4ALL" `
--targetschema "bi4all" `
--targettable "DuckChantiers_hpart" `
--parallelmethod "DataDriven" `
--distributeKeyColumn "day" `
--loadmode "Truncate" `
--degree 8
Use :memory: as the server to process data without persisting a DuckDB database file. Perfect for ETL transformations.
DuckDB Stream - Multiple CSV Files with DataDriven
Process multiple CSV files in parallel, using the filename as the distribution key:
.\FastTransfer.exe `
--sourceconnectiontype "duckdbstream" `
--sourceserver ":memory:" `
--query "SELECT * exclude filename FROM read_csv('D:\temp\FastBCP_lineitem_Extract\lineitem*.CSV', filename=true)" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targettrusted `
--targetdatabase "tpch_test" `
--targetschema "dbo" `
--targettable "lineitem" `
--parallelmethod "DataDriven" `
--distributeKeyColumn "filename" `
--datadrivenquery "select file from glob('D:\temp\FastBCP_lineitem_Extract\*.CSV')" `
--loadmode "Truncate" `
--degree 16
DuckDB Stream - Single Large CSV with Random Method
Process a single large CSV file with parallel Random distribution:
.\FastTransfer.exe `
--sourceconnectiontype "duckdbstream" `
--sourceserver ":memory:" `
--query "SELECT * FROM read_csv('D:\temp\lineitem*.CSV')" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targettrusted `
--targetdatabase "tpch_test" `
--targetschema "dbo" `
--targettable "lineitem" `
--parallelmethod "Random" `
--distributeKeyColumn "l_orderkey" `
--loadmode "Truncate" `
--degree 16
DuckDB Stream - CSV with Column Slicing
Use a calculated slice of a key column as the distribution key for parallel loading:
.\FastTransfer.exe `
--sourceconnectiontype "duckdbstream" `
--sourceserver ":memory:" `
--query "SELECT * FROM read_csv('D:\temp\lineitem*.CSV') rcsv" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targettrusted `
--targetdatabase "tpch_test" `
--targetschema "dbo" `
--targettable "lineitem" `
--parallelmethod "DataDriven" `
--distributeKeyColumn "(l_orderkey/5000000)::int" `
--datadrivenquery "select distinct (l_orderkey/5000000)::int l_orderkey_group from read_csv('D:\temp\lineitem*.CSV') rcsv" `
--loadmode "Truncate" `
--degree 13
DuckDB Stream - CSV with Custom Format
Specify CSV format parameters when auto-detection isn't sufficient:
.\FastTransfer.exe `
--sourceconnectiontype "duckdbstream" `
--sourceserver ":memory:" `
--query "SELECT * FROM read_csv('D:\OpenData\wikidata\vector_database_wikipedia_articles_embedded.csv', auto_detect=false, delim=',', quote='""', escape='""', new_line='\n', skip=0, comment='', header=true, columns={'id': 'BIGINT', 'url': 'VARCHAR', 'title': 'VARCHAR', 'text': 'VARCHAR', 'title_vector': 'VARCHAR', 'content_vector': 'VARCHAR', 'vector_id': 'BIGINT'}) rcsv" `
--targetconnectiontype "pgcopy" `
--targetserver "localhost:35432" `
--targetuser "postgres" `
--targetpassword "YourPassword" `
--targetdatabase "wikipedia" `
--targetschema "public" `
--targettable "articles" `
--parallelmethod "DataDriven" `
--distributeKeyColumn "(id/10000)::int" `
--loadmode "Truncate" `
--degree 12
DuckDB File to PostgreSQL
Transfer from a DuckDB database file to PostgreSQL:
.\FastTransfer.exe `
--sourceconnectiontype "duckdbstream" `
--sourceserver "D:\OpenData\wikidata\dbpedia_14.duckdb" `
--sourceschema "main" `
--sourcetable "dbpedia_14" `
--query "SELECT label, title, content, split, content_word_count, chunk, content_n_chunks, chunk_index, chunk_word_count, chunk_embedding::text FROM dbpedia_14" `
--targetconnectiontype "pgcopy" `
--targetserver "localhost:35432" `
--targetuser "postgres" `
--targetpassword "YourPassword" `
--targetdatabase "wikipedia" `
--targetschema "public" `
--targettable "dbpedia_14" `
--parallelmethod "Ntile" `
--distributeKeyColumn "chunk_index" `
--degree 14 `
--loadmode "Truncate"
MSSQL to DuckDB
Transfer from SQL Server to a DuckDB database file:
.\FastTransfer.exe `
--sourceconnectiontype "mssql" `
--sourceserver "localhost" `
--sourcedatabase "tpch10_collation_bin2" `
--sourceschema "dbo" `
--sourcetable "orders_1M" `
--sourcetrusted `
--targetconnectiontype "duckdb" `
--targetdatabase "tpch10_insert" `
--targetschema "tpch10" `
--targettable "orders" `
--parallelmethod "None" `
--loadmode "Truncate" `
--targetserver "D:\OpenData\TPCH\data\10\duckdb\tpch10_insert.duckdb"
ClickHouse Integration
PostgreSQL to ClickHouse - Single-threaded
Transfer data to ClickHouse without parallelization:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:15432" `
--sourcedatabase "tpch" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourceschema "tpch_10" `
--query "select * from tpch_10.orders limit 100000" `
--targetconnectiontype "clickhousebulk" `
--targetserver "localhost:8123" `
--targetdatabase "default" `
--targetuser "default" `
--targetpassword "YourPassword" `
--targetschema "default" `
--targettable "orders" `
--parallelmethod "None" `
--loadmode "Append" `
--batchsize 10000
PostgreSQL to ClickHouse - Parallel with Ctid
High-performance transfer using PostgreSQL's Ctid method:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:15432" `
--sourcedatabase "tpch" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourceschema "tpch_10" `
--sourcetable "orders" `
--targetconnectiontype "clickhousebulk" `
--targetserver "localhost:8123" `
--targetdatabase "default" `
--targetuser "default" `
--targetpassword "YourPassword" `
--targetschema "default" `
--targettable "orders" `
--parallelmethod "Ctid" `
--degree 15 `
--loadmode "Truncate" `
--batchsize 1000000
ClickHouse to PostgreSQL - Multiple Methods
Transfer from ClickHouse to PostgreSQL using various distribution methods. The three commands below use, in order, the single-threaded (None), RangeId, and DataDriven methods.
Single-threaded:
.\FastTransfer.exe `
--targetconnectiontype "pgcopy" `
--targetserver "localhost:15432" `
--targetdatabase "tpch" `
--targetuser "FastUser" `
--targetpassword "FastPassword" `
--targetschema "tpch_10" `
--targettable "orders_empty" `
--sourceconnectiontype "clickhouse" `
--sourceserver "localhost:8123" `
--sourcedatabase "default" `
--sourceuser "default" `
--sourcepassword "YourPassword" `
--sourceschema "default" `
--query "select * from default.orders limit 100000" `
--parallelmethod "None" `
--loadmode "Append" `
--batchsize 10000
RangeId method:
.\FastTransfer.exe `
--targetconnectiontype "pgcopy" `
--targetserver "localhost:15432" `
--targetdatabase "tpch" `
--targetuser "FastUser" `
--targetpassword "FastPassword" `
--targetschema "tpch_10" `
--targettable "orders_empty" `
--sourceconnectiontype "clickhouse" `
--sourceserver "localhost:8123" `
--sourcedatabase "default" `
--sourceuser "default" `
--sourcepassword "YourPassword" `
--sourceschema "default" `
--sourcetable "orders" `
--parallelmethod "RangeId" `
--distributeKeyColumn "O_ORDERKEY" `
--degree 15 `
--loadmode "Truncate" `
--batchsize 100000
DataDriven method:
.\FastTransfer.exe `
--targetconnectiontype "pgcopy" `
--targetserver "localhost:15432" `
--targetdatabase "tpch" `
--targetuser "FastUser" `
--targetpassword "FastPassword" `
--targetschema "tpch_10" `
--targettable "orders_empty" `
--sourceconnectiontype "clickhouse" `
--sourceserver "localhost:8123" `
--sourcedatabase "default" `
--sourceuser "default" `
--sourcepassword "YourPassword" `
--sourceschema "default" `
--sourcetable "orders" `
--parallelmethod "DataDriven" `
--distributeKeyColumn "year(o_orderdate)" `
--degree 15 `
--loadmode "Truncate" `
--batchsize 100000
SAP HANA Integration
PostgreSQL to SAP HANA
Transfer data to SAP HANA using the HANA bulk loader:
Single-threaded:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:15432" `
--sourcedatabase "tpch" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourceschema "tpch_10" `
--sourcetable "orders" `
--targetconnectiontype "hanabulk" `
--targetserver "hxehost:39015" `
--targetdatabase "" `
--targetuser "SYSTEM" `
--targetpassword "YourPassword" `
--targetschema "TPCH" `
--targettable "ORDERS" `
--parallelmethod "None" `
--loadmode "Append" `
--batchsize 5000 `
--query "select * from tpch_10.orders limit 100000"
Parallel with Ctid:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:15432" `
--sourcedatabase "tpch" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourceschema "tpch_10" `
--sourcetable "orders" `
--targetconnectiontype "hanabulk" `
--targetserver "hxehost:39015" `
--targetdatabase "" `
--targetuser "SYSTEM" `
--targetpassword "YourPassword" `
--targetschema "TPCH" `
--targettable "ORDERS" `
--parallelmethod "Ctid" `
--degree 4 `
--loadmode "Truncate" `
--batchsize 5000
SAP HANA to PostgreSQL
Transfer from SAP HANA to PostgreSQL:
.\FastTransfer.exe `
--targetconnectiontype "pgsql" `
--targetserver "localhost:15432" `
--targetdatabase "tpch_import" `
--targetuser "FastUser" `
--targetpassword "FastPassword" `
--targetschema "tpch_10" `
--targettable "orders" `
--sourceconnectiontype "hana" `
--sourceserver "hxehost:39015" `
--sourcedatabase "" `
--sourceuser "SYSTEM" `
--sourcepassword "YourPassword" `
--sourceschema "TPCH" `
--sourcetable "ORDERS" `
--parallelmethod "Random" `
--distributeKeyColumn "o_orderkey" `
--degree 12 `
--loadmode "Truncate" `
--batchsize 5000
Teradata Integration
PostgreSQL to Teradata
Transfer data to Teradata with parallel Ctid method:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:15432" `
--sourcedatabase "tpch" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourceschema "tpch_10" `
--sourcetable "orders" `
--targetconnectiontype "teradata" `
--targetserver "192.168.239.129:1025" `
--targetdatabase "TPCH" `
--targetuser "UTPCH" `
--targetpassword "UTPCH" `
--targetschema "TPCH" `
--targettable "orders" `
--parallelmethod "Ctid" `
--degree 15 `
--loadmode "Truncate" `
--batchsize 10000
Teradata to MSSQL
Transfer from Teradata to SQL Server:
.\FastTransfer.exe `
--sourceconnectiontype "teradata" `
--sourceserver "192.168.239.129:1025" `
--sourcedatabase "TPCH" `
--sourceuser "UTPCH" `
--sourcepassword "UTPCH" `
--sourceschema "TPCH" `
--sourcetable "orders_15M" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targetdatabase "tpch10_collation_bin2" `
--targetuser "FastUser" `
--targetpassword "FastPassword" `
--targetschema "dbo" `
--targettable "orders_15M_copy" `
--parallelmethod "RangeId" `
--distributeKeyColumn "o_orderkey" `
--degree 12 `
--loadmode "Truncate"
Netezza Integration
Transfer from Netezza to MSSQL using DATASLICEID for optimal parallelism:
.\FastTransfer.exe `
--sourceconnectiontype "nzsql" `
--sourceserver "NZSOURCESYS" `
--sourcedatabase "TPCH" `
--sourceuser "TPCH" `
--sourcepassword "TPCH" `
--sourceschema "tpch_10" `
--sourcetable "orders" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targettrusted `
--targetdatabase "tpch_test" `
--targetschema "dbo" `
--targettable "orders" `
--parallelmethod "DataDriven" `
--distributeKeyColumn "DATASLICEID" `
--degree 6
Use DATASLICEID as the distribution column to leverage Netezza's internal data distribution for optimal parallel extraction.
Advanced SQL Server Techniques
Using Partition Functions
Leverage SQL Server partition functions for efficient parallel transfers:
.\FastTransfer.exe `
--sourceconnectiontype "mssql" `
--sourceserver "localhost" `
--sourcedatabase "tpch10_collation_bin2" `
--sourcetrusted `
--query "select * from orders_part" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targettrusted `
--targetdatabase "tpch10_collation_bin2_copy" `
--targetschema "dbo" `
--targettable "orders" `
--parallelmethod "DataDriven" `
--distributeKeyColumn "`$PARTITION.PF_DATE(o_orderdate)" `
--degree 10 `
--loadmode "Truncate"
Use `$PARTITION.FunctionName(ColumnName) to reference a partition function as the distribution key. The backtick (`) escapes the dollar sign inside a PowerShell double-quoted string so that $PARTITION is passed through literally.
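For reference, a partition function such as PF_DATE could be created on the source database along these lines. This is a hypothetical sketch with illustrative yearly boundaries, not the definition used in the example above, and it assumes the table is already partitioned with a matching partition scheme:
# Hypothetical partition function; boundary dates are illustrative only.
sqlcmd -S localhost -E -d tpch10_collation_bin2 -Q "CREATE PARTITION FUNCTION PF_DATE (date) AS RANGE RIGHT FOR VALUES ('1995-01-01','1996-01-01','1997-01-01','1998-01-01');"
$PARTITION.PF_DATE(o_orderdate) then returns the partition number each row belongs to, which FastTransfer uses as the DataDriven distribution key.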
Using Physloc Method
Use the physical row location for parallel distribution (read-only tables only):
.\FastTransfer.exe `
--sourceconnectiontype "mssql" `
--sourceserver "localhost" `
--sourcetrusted `
--sourcedatabase "tpch_test" `
--sourceschema "dbo" `
--query "select * from dbo.orders_15M" `
--targetconnectiontype "msbulk" `
--targetserver "localhost,11433" `
--targettrusted `
--targetdatabase "tpch_test" `
--targetschema "dbo" `
--targettable "orders_ccsi" `
--loadmode "Truncate" `
--batchsize 120000 `
--parallelmethod "Physloc" `
--degree 10 `
--mapmethod "Name"
The Physloc method should only be used on read-only tables, read-only databases, or database snapshots, as it relies on physical page locations that may change during writes.
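One way to satisfy that requirement is to run the extract against a database snapshot. A hypothetical sketch, assuming a logical data file named tpch_test and a local snapshot path (adjust both to your environment):
# Hypothetical snapshot; NAME must match the source database's logical data file name.
sqlcmd -S localhost -E -Q "CREATE DATABASE tpch_test_snap ON (NAME = tpch_test, FILENAME = 'D:\Snapshots\tpch_test_snap.ss') AS SNAPSHOT OF tpch_test;"
Point --sourcedatabase at the snapshot (tpch_test_snap) and drop it once the transfer completes.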
Performance Tuning
Dynamic Degree with Negative Values
Use negative degree values to automatically calculate parallelism based on CPU cores:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:15432" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourcedatabase "tpch" `
--sourceschema "tpch_10" `
--sourcetable "orders" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targetdatabase "tpch_test" `
--targettrusted `
--targetschema "dbo" `
--targettable "orders" `
--parallelmethod "Ctid" `
--degree -4 `
--loadmode "Truncate"
A negative degree value (e.g., -4) multiplies the absolute value by the number of CPU cores. With 8 cores, -4 results in 32 parallel threads.
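A minimal PowerShell illustration of that arithmetic, assuming the core-multiplication rule described above:
# Illustration only: how a negative --degree value translates into threads.
$degree = -4
$cores = [Environment]::ProcessorCount             # e.g. 8 on an 8-core machine
$effectiveDegree = [Math]::Abs($degree) * $cores   # 4 x 8 = 32 parallel threads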
Batch Size Optimization
Adjust the batch size to the target connection type for optimal performance. Typical starting ranges are listed below, with a sketch for picking a value after the list:
- MSSQL (msbulk): 10,000 - 100,000 rows
- PostgreSQL (pgcopy): 10,000 - 50,000 rows
- Oracle (orabulk): 100,000 - 500,000 rows
- Oracle (oradirect): 1,000 - 10,000 rows
- ClickHouse: 10,000 - 1,000,000 rows
- SAP HANA: 5,000 - 10,000 rows
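As a purely illustrative starting point, the ranges above can be captured in a small lookup that feeds --batchsize. The values below are assumptions to tune for your workload:
# Hypothetical starting values drawn from the ranges above; tune per workload.
$defaultBatchSize = @{
    msbulk         = 50000
    pgcopy         = 25000
    orabulk        = 200000
    oradirect      = 5000
    clickhousebulk = 100000
    hanabulk       = 7500
}
# Pass the looked-up value to FastTransfer, e.g. --batchsize $defaultBatchSize['pgcopy']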
Debug Logging
Enable detailed logging for troubleshooting:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:5432" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourcedatabase "tpch" `
--sourceschema "tpch_10" `
--sourcetable "lineitem" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targetdatabase "tpch_test" `
--targettrusted `
--targetschema "dbo" `
--targettable "lineitem" `
--parallelmethod "Ctid" `
--degree 10 `
--loadmode "Truncate" `
--loglevel "debug"
Use --loglevel "debug" to see detailed execution steps, connection information, and query generation details for troubleshooting.
Using Input Files
Read the source SQL query from an external file with --sourcefile:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:5432" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourcedatabase "tpch" `
--sourcefile "..\..\..\..\FastTransfert\samples\select_orders.sql" `
--sourceschema "tpch_10" `
--sourcetable "orders" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targetdatabase "tpch_test" `
--targettrusted `
--targetschema "dbo" `
--targettable "orders_2" `
--parallelmethod "Ctid" `
--degree 6 `
--loadmode "Truncate"
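The file referenced by --sourcefile contains the SQL query to run against the source. As a hypothetical illustration (not the sample file shipped with FastTransfer), such a file could be created like this:
# Illustrative query only; the real select_orders.sql sample may differ.
@"
select o_orderkey, o_custkey, o_orderstatus, o_totalprice, o_orderdate
from tpch_10.orders
"@ | Set-Content -Path '.\select_orders.sql'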
Column Mapping
Map by Name
Automatically map columns by name (default):
.\FastTransfer.exe `
--sourceconnectiontype "mssql" `
--sourceserver "localhost" `
--sourcetrusted `
--sourcedatabase "tpch_test" `
--sourceschema "dbo" `
--query "select * from dbo.orders_15M" `
--targetconnectiontype "msbulk" `
--targetserver "localhost,11433" `
--targettrusted `
--targetdatabase "tpch_test" `
--targetschema "dbo" `
--targettable "orders_ccsi" `
--loadmode "Truncate" `
--batchsize 120000 `
--parallelmethod "Physloc" `
--degree 10 `
--mapmethod "Name"