Advanced Scenarios
Advanced use cases covering DuckDB, ClickHouse, SAP HANA, Teradata, and Netezza integration, advanced SQL Server techniques, and other complex transfer scenarios.
DuckDB Integration
FastTransfer can use DuckDB as a source engine, which brings powerful in-flight data transformation capabilities to a transfer.
DuckDB Memory - Parquet with Nested Fields to MSSQL
Process Parquet files with complex nested (JSON-like) fields using DuckDB's in-memory engine:
.\FastTransfer.exe `
--sourceconnectiontype "duckdb" `
--sourceserver ":memory:" `
--query "select * EXCLUDE(Risques, Interlocuteurs),unnest(interlocuteurs,recursive := true ) from (select day, unnest(contexte_chantier, recursive := true), unnest(detail_chantier, recursive := true) from read_parquet('D:\CLIENTS\ENEDIS\BI4ALL\CHANTIER\**\*.parquet', hive_partitioning=true)) t_row_unest" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targettrusted `
--targetdatabase "ENDSBI4ALL" `
--targetschema "bi4all" `
--targettable "DuckChantiers_hpart" `
--parallelmethod "DataDriven" `
--distributeKeyColumn "day" `
--loadmode "Truncate" `
--degree 8
Use :memory: as the server to process data without persisting a DuckDB database file. Perfect for ETL transformations.
DuckDB Stream - Multiple CSV Files with DataDriven
Process multiple CSV files in parallel, using the filename as the distribution key:
.\FastTransfer.exe `
--sourceconnectiontype "duckdbstream" `
--sourceserver ":memory:" `
--query "SELECT * exclude filename FROM read_csv('D:\temp\FastBCP_lineitem_Extract\lineitem*.CSV', filename=true)" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targettrusted `
--targetdatabase "tpch_test" `
--targetschema "dbo" `
--targettable "lineitem" `
--parallelmethod "DataDriven" `
--distributeKeyColumn "filename" `
--datadrivenquery "select file from glob('D:\temp\FastBCP_lineitem_Extract\*.CSV')" `
--loadmode "Truncate" `
--degree 16
DuckDB Stream - Single Large CSV with Random Method
Process a single large CSV file with parallel Random distribution:
.\FastTransfer.exe `
--sourceconnectiontype "duckdbstream" `
--sourceserver ":memory:" `
--query "SELECT * FROM read_csv('D:\temp\lineitem*.CSV')" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targettrusted `
--targetdatabase "tpch_test" `
--targetschema "dbo" `
--targettable "lineitem" `
--parallelmethod "Random" `
--distributeKeyColumn "l_orderkey" `
--loadmode "Truncate" `
--degree 16
DuckDB Stream - CSV with Column Slicing
Use a calculated slice of a key column as the distribution key for parallel loading:
.\FastTransfer.exe `
--sourceconnectiontype "duckdbstream" `
--sourceserver ":memory:" `
--query "SELECT * FROM read_csv('D:\temp\lineitem*.CSV') rcsv" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targettrusted `
--targetdatabase "tpch_test" `
--targetschema "dbo" `
--targettable "lineitem" `
--parallelmethod "DataDriven" `
--distributeKeyColumn "(l_orderkey/5000000)::int" `
--datadrivenquery "select distinct (l_orderkey/5000000)::int l_orderkey_group from read_csv('D:\temp\lineitem*.CSV') rcsv" `
--loadmode "Truncate" `
--degree 13
DuckDB Stream - CSV with Custom Format
Specify CSV format parameters when auto-detection isn't sufficient:
.\FastTransfer.exe `
--sourceconnectiontype "duckdbstream" `
--sourceserver ":memory:" `
--query "SELECT * FROM read_csv('D:\OpenData\wikidata\vector_database_wikipedia_articles_embedded.csv', auto_detect=false, delim=',', quote='""', escape='""', new_line='\n', skip=0, comment='', header=true, columns={'id': 'BIGINT', 'url': 'VARCHAR', 'title': 'VARCHAR', 'text': 'VARCHAR', 'title_vector': 'VARCHAR', 'content_vector': 'VARCHAR', 'vector_id': 'BIGINT'}) rcsv" `
--targetconnectiontype "pgcopy" `
--targetserver "localhost:35432" `
--targetuser "postgres" `
--targetpassword "YourPassword" `
--targetdatabase "wikipedia" `
--targetschema "public" `
--targettable "articles" `
--parallelmethod "DataDriven" `
--distributeKeyColumn "(id/10000)::int" `
--loadmode "Truncate" `
--degree 12
DuckDB File to PostgreSQL
Transfer from a DuckDB database file to PostgreSQL:
.\FastTransfer.exe `
--sourceconnectiontype "duckdbstream" `
--sourceserver "D:\OpenData\wikidata\dbpedia_14.duckdb" `
--sourceschema "main" `
--sourcetable "dbpedia_14" `
--query "SELECT label, title, content, split, content_word_count, chunk, content_n_chunks, chunk_index, chunk_word_count, chunk_embedding::text FROM dbpedia_14" `
--targetconnectiontype "pgcopy" `
--targetserver "localhost:35432" `
--targetuser "postgres" `
--targetpassword "YourPassword" `
--targetdatabase "wikipedia" `
--targetschema "public" `
--targettable "dbpedia_14" `
--parallelmethod "Ntile" `
--distributeKeyColumn "chunk_index" `
--degree 14 `
--loadmode "Truncate"
MSSQL to DuckDB
Transfer from SQL Server to a DuckDB database file:
.\FastTransfer.exe `
--sourceconnectiontype "mssql" `
--sourceserver "localhost" `
--sourcedatabase "tpch10_collation_bin2" `
--sourceschema "dbo" `
--sourcetable "orders_1M" `
--sourcetrusted `
--targetconnectiontype "duckdb" `
--targetdatabase "tpch10_insert" `
--targetschema "tpch10" `
--targettable "orders" `
--parallelmethod "None" `
--loadmode "Truncate" `
--targetserver "D:\OpenData\TPCH\data\10\duckdb\tpch10_insert.duckdb"
ClickHouse Integration
PostgreSQL to ClickHouse - Single-threaded
Transfer data to ClickHouse without parallelization:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:15432" `
--sourcedatabase "tpch" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourceschema "tpch_10" `
--query "select * from tpch_10.orders limit 100000" `
--targetconnectiontype "clickhousebulk" `
--targetserver "localhost:8123" `
--targetdatabase "default" `
--targetuser "default" `
--targetpassword "YourPassword" `
--targetschema "default" `
--targettable "orders" `
--parallelmethod "None" `
--loadmode "Append" `
--batchsize 10000
PostgreSQL to ClickHouse - Parallel with Ctid
High-performance transfer using PostgreSQL's Ctid method:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:15432" `
--sourcedatabase "tpch" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourceschema "tpch_10" `
--sourcetable "orders" `
--targetconnectiontype "clickhousebulk" `
--targetserver "localhost:8123" `
--targetdatabase "default" `
--targetuser "default" `
--targetpassword "YourPassword" `
--targetschema "default" `
--targettable "orders" `
--parallelmethod "Ctid" `
--degree 15 `
--loadmode "Truncate" `
--batchsize 1000000
ClickHouse to PostgreSQL - Multiple Methods
Transfer from ClickHouse to PostgreSQL using various distribution methods. The three commands below use, in order, the single-threaded (None), RangeId, and DataDriven methods.
Single-threaded:
.\FastTransfer.exe `
--targetconnectiontype "pgcopy" `
--targetserver "localhost:15432" `
--targetdatabase "tpch" `
--targetuser "FastUser" `
--targetpassword "FastPassword" `
--targetschema "tpch_10" `
--targettable "orders_empty" `
--sourceconnectiontype "clickhouse" `
--sourceserver "localhost:8123" `
--sourcedatabase "default" `
--sourceuser "default" `
--sourcepassword "YourPassword" `
--sourceschema "default" `
--query "select * from default.orders limit 100000" `
--parallelmethod "None" `
--loadmode "Append" `
--batchsize 10000
RangeId method:
.\FastTransfer.exe `
--targetconnectiontype "pgcopy" `
--targetserver "localhost:15432" `
--targetdatabase "tpch" `
--targetuser "FastUser" `
--targetpassword "FastPassword" `
--targetschema "tpch_10" `
--targettable "orders_empty" `
--sourceconnectiontype "clickhouse" `
--sourceserver "localhost:8123" `
--sourcedatabase "default" `
--sourceuser "default" `
--sourcepassword "YourPassword" `
--sourceschema "default" `
--sourcetable "orders" `
--parallelmethod "RangeId" `
--distributeKeyColumn "O_ORDERKEY" `
--degree 15 `
--loadmode "Truncate" `
--batchsize 100000
DataDriven method:
.\FastTransfer.exe `
--targetconnectiontype "pgcopy" `
--targetserver "localhost:15432" `
--targetdatabase "tpch" `
--targetuser "FastUser" `
--targetpassword "FastPassword" `
--targetschema "tpch_10" `
--targettable "orders_empty" `
--sourceconnectiontype "clickhouse" `
--sourceserver "localhost:8123" `
--sourcedatabase "default" `
--sourceuser "default" `
--sourcepassword "YourPassword" `
--sourceschema "default" `
--sourcetable "orders" `
--parallelmethod "DataDriven" `
--distributeKeyColumn "year(o_orderdate)" `
--degree 15 `
--loadmode "Truncate" `
--batchsize 100000
SAP HANA Integration
PostgreSQL to SAP HANA
Transfer data to SAP HANA using the HANA bulk loader:
Single-threaded:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:15432" `
--sourcedatabase "tpch" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourceschema "tpch_10" `
--sourcetable "orders" `
--targetconnectiontype "hanabulk" `
--targetserver "hxehost:39015" `
--targetdatabase "" `
--targetuser "SYSTEM" `
--targetpassword "YourPassword" `
--targetschema "TPCH" `
--targettable "ORDERS" `
--parallelmethod "None" `
--loadmode "Append" `
--batchsize 5000 `
--query "select * from tpch_10.orders limit 100000"
Parallel with Ctid:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:15432" `
--sourcedatabase "tpch" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourceschema "tpch_10" `
--sourcetable "orders" `
--targetconnectiontype "hanabulk" `
--targetserver "hxehost:39015" `
--targetdatabase "" `
--targetuser "SYSTEM" `
--targetpassword "YourPassword" `
--targetschema "TPCH" `
--targettable "ORDERS" `
--parallelmethod "Ctid" `
--degree 4 `
--loadmode "Truncate" `
--batchsize 5000
SAP HANA to PostgreSQL
Transfer from SAP HANA to PostgreSQL:
.\FastTransfer.exe `
--targetconnectiontype "pgsql" `
--targetserver "localhost:15432" `
--targetdatabase "tpch_import" `
--targetuser "FastUser" `
--targetpassword "FastPassword" `
--targetschema "tpch_10" `
--targettable "orders" `
--sourceconnectiontype "hana" `
--sourceserver "hxehost:39015" `
--sourcedatabase "" `
--sourceuser "SYSTEM" `
--sourcepassword "YourPassword" `
--sourceschema "TPCH" `
--sourcetable "ORDERS" `
--parallelmethod "Random" `
--distributeKeyColumn "o_orderkey" `
--degree 12 `
--loadmode "Truncate" `
--batchsize 5000
Teradata Integration
PostgreSQL to Teradata
Transfer data to Teradata with parallel Ctid method:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:15432" `
--sourcedatabase "tpch" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourceschema "tpch_10" `
--sourcetable "orders" `
--targetconnectiontype "teradata" `
--targetserver "192.168.239.129:1025" `
--targetdatabase "TPCH" `
--targetuser "UTPCH" `
--targetpassword "UTPCH" `
--targetschema "TPCH" `
--targettable "orders" `
--parallelmethod "Ctid" `
--degree 15 `
--loadmode "Truncate" `
--batchsize 10000
Teradata to MSSQL
Transfer from Teradata to SQL Server:
.\FastTransfer.exe `
--sourceconnectiontype "teradata" `
--sourceserver "192.168.239.129:1025" `
--sourcedatabase "TPCH" `
--sourceuser "UTPCH" `
--sourcepassword "UTPCH" `
--sourceschema "TPCH" `
--sourcetable "orders_15M" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targetdatabase "tpch10_collation_bin2" `
--targetuser "FastUser" `
--targetpassword "FastPassword" `
--targetschema "dbo" `
--targettable "orders_15M_copy" `
--parallelmethod "RangeId" `
--distributeKeyColumn "o_orderkey" `
--degree 12 `
--loadmode "Truncate"
Netezza Integration
Transfer from Netezza to MSSQL using DATASLICEID for optimal parallelism:
.\FastTransfer.exe `
--sourceconnectiontype "nzsql" `
--sourceserver "NZSOURCESYS" `
--sourcedatabase "TPCH" `
--sourceuser "TPCH" `
--sourcepassword "TPCH" `
--sourceschema "tpch_10" `
--sourcetable "orders" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targettrusted `
--targetdatabase "tpch_test" `
--targetschema "dbo" `
--targettable "orders" `
--parallelmethod "DataDriven" `
--distributeKeyColumn "DATASLICEID" `
--degree 6
Use DATASLICEID as the distribution column to leverage Netezza's internal data distribution for optimal parallel extraction.
Advanced SQL Server Techniques
Using Partition Functions
Leverage SQL Server partition functions for efficient parallel transfers:
.\FastTransfer.exe `
--sourceconnectiontype "mssql" `
--sourceserver "localhost" `
--sourcedatabase "tpch10_collation_bin2" `
--sourcetrusted `
--query "select * from orders_part" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targettrusted `
--targetdatabase "tpch10_collation_bin2_copy" `
--targetschema "dbo" `
--targettable "orders" `
--parallelmethod "DataDriven" `
--distributeKeyColumn "`$PARTITION.PF_DATE(o_orderdate)" `
--degree 10 `
--loadmode "Truncate"
Use `$PARTITION.FunctionName(ColumnName) to reference a partition function as the distribution key. The backtick (`) escapes the dollar sign inside a PowerShell double-quoted string so that $PARTITION is passed through literally.
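For reference, a partition function such as PF_DATE could be created on the source database along these lines. This is a hypothetical sketch with illustrative yearly boundaries, not the definition used in the example above, and it assumes the table is already partitioned with a matching partition scheme:
# Hypothetical partition function; boundary dates are illustrative only.
sqlcmd -S localhost -E -d tpch10_collation_bin2 -Q "CREATE PARTITION FUNCTION PF_DATE (date) AS RANGE RIGHT FOR VALUES ('1995-01-01','1996-01-01','1997-01-01','1998-01-01');"
$PARTITION.PF_DATE(o_orderdate) then returns the partition number each row belongs to, which FastTransfer uses as the DataDriven distribution key.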
Using Physloc Method
Use the physical row location for parallel distribution (read-only tables only):
.\FastTransfer.exe `
--sourceconnectiontype "mssql" `
--sourceserver "localhost" `
--sourcetrusted `
--sourcedatabase "tpch_test" `
--sourceschema "dbo" `
--query "select * from dbo.orders_15M" `
--targetconnectiontype "msbulk" `
--targetserver "localhost,11433" `
--targettrusted `
--targetdatabase "tpch_test" `
--targetschema "dbo" `
--targettable "orders_ccsi" `
--loadmode "Truncate" `
--batchsize 120000 `
--parallelmethod "Physloc" `
--degree 10 `
--mapmethod "Name"
The Physloc method should only be used on read-only tables, read-only databases, or database snapshots, as it relies on physical page locations that may change during writes.
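One way to satisfy that requirement is to run the extract against a database snapshot. A hypothetical sketch, assuming a logical data file named tpch_test and a local snapshot path (adjust both to your environment):
# Hypothetical snapshot; NAME must match the source database's logical data file name.
sqlcmd -S localhost -E -Q "CREATE DATABASE tpch_test_snap ON (NAME = tpch_test, FILENAME = 'D:\Snapshots\tpch_test_snap.ss') AS SNAPSHOT OF tpch_test;"
Point --sourcedatabase at the snapshot (tpch_test_snap) and drop it once the transfer completes.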
Performance Tuning
Dynamic Degree with Negative Values
Use negative degree values to automatically calculate parallelism based on CPU cores:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:15432" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourcedatabase "tpch" `
--sourceschema "tpch_10" `
--sourcetable "orders" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targetdatabase "tpch_test" `
--targettrusted `
--targetschema "dbo" `
--targettable "orders" `
--parallelmethod "Ctid" `
--degree -4 `
--loadmode "Truncate"
A negative degree value (e.g., -4) multiplies the absolute value by the number of CPU cores. With 8 cores, -4 results in 32 parallel threads.
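A minimal PowerShell illustration of that arithmetic, assuming the core-multiplication rule described above:
# Illustration only: how a negative --degree value translates into threads.
$degree = -4
$cores = [Environment]::ProcessorCount             # e.g. 8 on an 8-core machine
$effectiveDegree = [Math]::Abs($degree) * $cores   # 4 x 8 = 32 parallel threads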
Batch Size Optimization
Adjust the batch size to the target connection type for optimal performance. Typical starting ranges are listed below, with a sketch for picking a value after the list:
- MSSQL (msbulk): 10,000 - 100,000 rows
- PostgreSQL (pgcopy): 10,000 - 50,000 rows
- Oracle (orabulk): 100,000 - 500,000 rows
- Oracle (oradirect): 1,000 - 10,000 rows
- ClickHouse: 10,000 - 1,000,000 rows
- SAP HANA: 5,000 - 10,000 rows
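As a purely illustrative starting point, the ranges above can be captured in a small lookup that feeds --batchsize. The values below are assumptions to tune for your workload:
# Hypothetical starting values drawn from the ranges above; tune per workload.
$defaultBatchSize = @{
    msbulk         = 50000
    pgcopy         = 25000
    orabulk        = 200000
    oradirect      = 5000
    clickhousebulk = 100000
    hanabulk       = 7500
}
# Pass the looked-up value to FastTransfer, e.g. --batchsize $defaultBatchSize['pgcopy']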
Debug Logging
Enable detailed logging for troubleshooting:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:5432" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourcedatabase "tpch" `
--sourceschema "tpch_10" `
--sourcetable "lineitem" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targetdatabase "tpch_test" `
--targettrusted `
--targetschema "dbo" `
--targettable "lineitem" `
--parallelmethod "Ctid" `
--degree 10 `
--loadmode "Truncate" `
--loglevel "debug"
Use --loglevel "debug" to see detailed execution steps, connection information, and query generation details for troubleshooting.
Using Input Files
Read the source SQL query from an external file with --sourcefile:
.\FastTransfer.exe `
--sourceconnectiontype "pgsql" `
--sourceserver "localhost:5432" `
--sourceuser "FastUser" `
--sourcepassword "FastPassword" `
--sourcedatabase "tpch" `
--sourcefile "..\..\..\..\FastTransfert\samples\select_orders.sql" `
--sourceschema "tpch_10" `
--sourcetable "orders" `
--targetconnectiontype "msbulk" `
--targetserver "localhost" `
--targetdatabase "tpch_test" `
--targettrusted `
--targetschema "dbo" `
--targettable "orders_2" `
--parallelmethod "Ctid" `
--degree 6 `
--loadmode "Truncate"
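The file referenced by --sourcefile contains the SQL query to run against the source. As a hypothetical illustration (not the sample file shipped with FastTransfer), such a file could be created like this:
# Illustrative query only; the real select_orders.sql sample may differ.
@"
select o_orderkey, o_custkey, o_orderstatus, o_totalprice, o_orderdate
from tpch_10.orders
"@ | Set-Content -Path '.\select_orders.sql'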
Column Mapping
Map by Name
Automatically map columns by name (default):
.\FastTransfer.exe `
--sourceconnectiontype "mssql" `
--sourceserver "localhost" `
--sourcetrusted `
--sourcedatabase "tpch_test" `
--sourceschema "dbo" `
--query "select * from dbo.orders_15M" `
--targetconnectiontype "msbulk" `
--targetserver "localhost,11433" `
--targettrusted `
--targetdatabase "tpch_test" `
--targetschema "dbo" `
--targettable "orders_ccsi" `
--loadmode "Truncate" `
--batchsize 120000 `
--parallelmethod "Physloc" `
--degree 10 `
--mapmethod "Name"