Import CSV Files to CedarDB
Fast, parallel file import using DuckDBStream

Terminal
.\FastTransfer.exe `
--sourceconnectiontype "duckdbstream" `
--sourceserver ":memory:" `
--sourceuser "your-username" `
--sourcepassword "your-password" `
--query "SELECT * FROM read_csv('D:\path\to\files\*.csv', filename=true)" `
--targetconnectiontype "pgcopy" `
--targetserver "your-server" `
--targetuser "your-username" `
--targetpassword "your-password" `
--targetdatabase "your-database" `
--targetschema "your-schema" `
--targettable "your-table" `
--method "DataDriven" `
--distributekeycolumn "filename" `
--datadrivenquery "select file from glob('D:\path\to\files\*.csv')" `
--degree -2 `
--loadmode "Truncate" `
--mapmethod "Name"

Source - CSV (Comma-Separated Values)
CSV is the universal standard for tabular data exchange. FastTransfer uses DuckDB to read CSV files with exceptional speed and automatic schema detection.
Features:
- Automatic delimiter and encoding detection
- Support for large files
- Parallel processing of multiple files
- Smart CSV parsing with DuckDB's read_csv() syntax
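Automatic delimiter detection means the reader inspects a sample of the file and infers the separator instead of requiring you to declare it. The sketch below illustrates the idea with Python's standard-library `csv.Sniffer`; it is a conceptual stand-in, not DuckDB's actual detection logic, and the sample data is invented for illustration.

```python
import csv
import io

# Sample with a non-default delimiter (';'); the sniffer infers it
# from the data, conceptually similar to read_csv() auto-detection.
sample = "id;name;score\n1;alice;9.5\n2;bob;7.2\n"

dialect = csv.Sniffer().sniff(sample)
print(dialect.delimiter)  # → ;

rows = list(csv.reader(io.StringIO(sample), dialect))
print(rows[0])  # → ['id', 'name', 'score']
```

DuckDB's `read_csv()` performs a richer version of this (type inference, encoding, quoting rules), which is why no per-file configuration is needed in the command above.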
Processing - DuckDBStream with DataDriven
DuckDB is a fast and efficient in-process analytical database. FastTransfer uses DuckDBStream to read multiple file formats with exceptional performance.
Parallel Method: DataDriven (Files)
For file sources, FastTransfer uses the filename as the distribution key, so multiple files are processed in parallel.
- ✓ Concurrent processing of multiple files
- ✓ Ideal for batch imports
- ✓ Automatic horizontal scaling
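The DataDriven pattern can be pictured as: enumerate the files (the `--datadrivenquery` glob), then hand each whole file to its own worker, keyed by filename. A minimal Python sketch of that distribution model, using only the standard library (the temp files and row counting are illustrative, not FastTransfer's internals):

```python
import csv
import glob
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Create a few sample CSV files, standing in for the *.csv glob
# pattern used in the command above.
tmpdir = tempfile.mkdtemp()
for i in range(3):
    with open(os.path.join(tmpdir, f"part{i}.csv"), "w", newline="") as f:
        csv.writer(f).writerows([["id", "value"], [i, i * 10]])

def load_one(path):
    # Each worker owns one whole file, mirroring how DataDriven keys
    # the distribution on the 'filename' column.
    with open(path, newline="") as f:
        return path, sum(1 for _ in csv.reader(f)) - 1  # data rows

files = sorted(glob.glob(os.path.join(tmpdir, "*.csv")))
with ThreadPoolExecutor(max_workers=len(files)) as pool:
    results = dict(pool.map(load_one, files))

print(results)
```

Because the unit of work is a file, adding more files (up to the worker count set by `--degree`) scales the import horizontally without any coordination between workers.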
Destination - CedarDB
FastTransfer loads data into CedarDB over PostgreSQL's COPY protocol. When the source is also PostgreSQL-compatible and pgcopy is used for both the source and target connection types, the transfer runs over binary COPY, ensuring maximum compatibility and performance.
Loading method:
Binary COPY Protocol
Advantages:
- Binary COPY for maximum performance (PostgreSQL-compatible source only, with pgcopy on both source and target)
- PostgreSQL protocol compatibility
- Optimized for modern hardware
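For comparison, a pgcopy-to-pgcopy transfer (the case where binary COPY applies) might look like the sketch below. It reuses only the flags documented in the command at the top of this page; the server names and credentials are placeholders, and using `--query` against a pgcopy source is an assumption here, not confirmed by this page.

```
.\FastTransfer.exe `
--sourceconnectiontype "pgcopy" `
--sourceserver "your-source-server" `
--sourceuser "your-username" `
--sourcepassword "your-password" `
--query "SELECT * FROM your_schema.your_table" `
--targetconnectiontype "pgcopy" `
--targetserver "your-server" `
--targetuser "your-username" `
--targetpassword "your-password" `
--targetdatabase "your-database" `
--targetschema "your-schema" `
--targettable "your-table" `
--loadmode "Truncate" `
--mapmethod "Name"
```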
