Import CSV Files to CedarDB
Fast, parallel file import using DuckDBStream

Terminal
.\FastTransfer.exe `
--sourceconnectiontype "duckdbstream" `
--sourceserver ":memory:" `
--sourceuser "your-username" `
--sourcepassword "your-password" `
--query "SELECT * FROM read_csv('D:\path\to\files\*.csv', filename=true)" `
--targetconnectiontype "pgcopy" `
--targetserver "your-server" `
--targetuser "your-username" `
--targetpassword "your-password" `
--targetdatabase "your-database" `
--targetschema "your-schema" `
--targettable "your-table" `
--method "DataDriven" `
--distributekeycolumn "filename" `
--datadrivenquery "select file from glob('D:\path\to\files\*.csv')" `
--degree -2 `
--loadmode "Truncate" `
--mapmethod "Name"

Source - CSV (Comma-Separated Values)
CSV is the universal standard for tabular data exchange. FastTransfer uses DuckDB to read CSV files with exceptional speed and automatic schema detection.
Features:
- Automatic delimiter and encoding detection
- Support for large files
- Parallel processing of multiple files
- Smart CSV parsing with DuckDB's read_csv() syntax
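Automatic delimiter detection means the reader inspects a sample of the file and infers the separator instead of requiring you to declare it. The sketch below illustrates the idea with Python's standard-library `csv.Sniffer`; it is a conceptual stand-in, not DuckDB's actual detection logic, and the sample data is invented for illustration.

```python
import csv
import io

# Sample with a non-default delimiter (';'); the sniffer infers it
# from the data, conceptually similar to read_csv() auto-detection.
sample = "id;name;score\n1;alice;9.5\n2;bob;7.2\n"

dialect = csv.Sniffer().sniff(sample)
print(dialect.delimiter)  # → ;

rows = list(csv.reader(io.StringIO(sample), dialect))
print(rows[0])  # → ['id', 'name', 'score']
```

DuckDB's `read_csv()` performs a richer version of this (type inference, encoding, quoting rules), which is why no per-file configuration is needed in the command above.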
Processing - DuckDBStream with DataDriven
DuckDB is a fast and efficient in-process analytical database. FastTransfer uses DuckDBStream to read multiple file formats with exceptional performance.
Parallel Method: DataDriven (Files)
For file sources, FastTransfer uses the filename as the distribution key, so multiple files are processed in parallel.
- ✓ Concurrent processing of multiple files
- ✓ Ideal for batch imports
- ✓ Automatic horizontal scaling
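The DataDriven pattern can be pictured as: enumerate the files (the `--datadrivenquery` glob), then hand each whole file to its own worker, keyed by filename. A minimal Python sketch of that distribution model, using only the standard library (the temp files and row counting are illustrative, not FastTransfer's internals):

```python
import csv
import glob
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Create a few sample CSV files, standing in for the *.csv glob
# pattern used in the command above.
tmpdir = tempfile.mkdtemp()
for i in range(3):
    with open(os.path.join(tmpdir, f"part{i}.csv"), "w", newline="") as f:
        csv.writer(f).writerows([["id", "value"], [i, i * 10]])

def load_one(path):
    # Each worker owns one whole file, mirroring how DataDriven keys
    # the distribution on the 'filename' column.
    with open(path, newline="") as f:
        return path, sum(1 for _ in csv.reader(f)) - 1  # data rows

files = sorted(glob.glob(os.path.join(tmpdir, "*.csv")))
with ThreadPoolExecutor(max_workers=len(files)) as pool:
    results = dict(pool.map(load_one, files))

print(results)
```

Because the unit of work is a file, adding more files (up to the worker count set by `--degree`) scales the import horizontally without any coordination between workers.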
Destination - CedarDB
FastTransfer loads data into CedarDB over PostgreSQL's COPY protocol. When the source is also PostgreSQL-compatible and pgcopy is used for both the source and target connection types, the transfer runs over binary COPY, ensuring maximum compatibility and performance.
Loading method:
Binary COPY Protocol
Advantages:
- Binary COPY for maximum performance (PostgreSQL-compatible source only, with pgcopy on both source and target)
- PostgreSQL protocol compatibility
- Optimized for modern hardware
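For comparison, a pgcopy-to-pgcopy transfer (the case where binary COPY applies) might look like the sketch below. It reuses only the flags documented in the command at the top of this page; the server names and credentials are placeholders, and using `--query` against a pgcopy source is an assumption here, not confirmed by this page.

```
.\FastTransfer.exe `
--sourceconnectiontype "pgcopy" `
--sourceserver "your-source-server" `
--sourceuser "your-username" `
--sourcepassword "your-password" `
--query "SELECT * FROM your_schema.your_table" `
--targetconnectiontype "pgcopy" `
--targetserver "your-server" `
--targetuser "your-username" `
--targetpassword "your-password" `
--targetdatabase "your-database" `
--targetschema "your-schema" `
--targettable "your-table" `
--loadmode "Truncate" `
--mapmethod "Name"
```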
