Import Parquet Files to MySQL
Fast, parallel file import using DuckDBStream

Terminal
.\FastTransfer.exe `
--sourceconnectiontype "duckdbstream" `
--sourceserver ":memory:" `
--sourceserver "your-server" `
--sourceuser "your-username" `
--sourcepassword "your-password" `
--query "SELECT * FROM read_parquet('D:\path\to\files\*.parquet, filename=true')" `
--targetconnectiontype "mysqlbulk" `
--targetserver "your-server" `
--targetuser "your-username" `
--targetpassword "your-password" `
--targetdatabase "your-database" `
--targetschema "your-schema" `
--targettable "your-table" `
--method "DataDriven" `
--distributekeycolumn "filename" `
--datadrivenquery "select file from glob('D:\path\to\files\*.parquet')" `
--degree -2 `
--loadmode "Truncate" `
--mapmethod "Name"Source - Apache Parquet
Parquet is a columnar file format optimized for analytical processing. FastTransfer reads Parquet via DuckDB with exceptional native performance.
Features:
- Ultra-fast columnar reading
- Integrated native compression
- Data type preservation
- Pushdown filtering for optimal performance (see the sketch after this list)
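A minimal DuckDB sketch of what the source query can take advantage of; the column names (order_id, amount, order_date) and the date filter are illustrative, not taken from the example above:

SQL
-- Projection and filter are pushed down into the Parquet scan:
-- only the referenced columns and matching row groups are read.
SELECT order_id, amount, filename
FROM read_parquet('D:\path\to\files\*.parquet', filename=true)
WHERE order_date >= DATE '2024-01-01';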
Processing - DuckDBStream with DataDriven
DuckDB is a fast and efficient in-process analytical database. FastTransfer uses DuckDBStream to read multiple file formats with exceptional performance.
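The same streaming pattern applies to other formats DuckDB reads natively; only the table function in --query changes. A short sketch with illustrative paths:

SQL
-- CSV with a header row
SELECT * FROM read_csv('D:\path\to\files\*.csv', header=true);
-- JSON with automatic schema detection
SELECT * FROM read_json_auto('D:\path\to\files\*.json');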
Parallel Method: DataDriven (Files)
For file sources, FastTransfer uses the filename as the distribution key, so multiple files can be processed in parallel (see the sketch after the list below).
- ✓ Concurrent processing of multiple files
- ✓ Ideal for batch imports
- ✓ Automatic horizontal scaling
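Conceptually, the --datadrivenquery enumerates the distribution-key values and each parallel worker handles a subset of them. A sketch of the two queries involved; the per-worker predicate is illustrative, not FastTransfer's literal internal SQL:

SQL
-- Enumerate one key value per file (DuckDB's glob() returns a 'file' column):
SELECT file FROM glob('D:\path\to\files\*.parquet');

-- Each worker then effectively reads only its assigned files, e.g.:
SELECT *
FROM read_parquet('D:\path\to\files\*.parquet', filename=true)
WHERE filename = 'D:\path\to\files\part-0001.parquet';  -- illustrative value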
Destination - MySQL
FastTransfer leverages MySQL's bulk insert API for optimized loading. Data is inserted in batches to maximize throughput.
Loading method: Bulk Insert API
Advantages:
- Optimized bulk insert
- Intelligent batching
- Support for indexes and constraints
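Because --mapmethod "Name" pairs source and target columns by name, and a Truncate load empties the existing target table rather than creating it, the table should already exist with matching column names, including the filename column added by filename=true. A hypothetical DDL sketch; all column names and types are illustrative:

SQL
CREATE TABLE your_table (
  order_id   BIGINT,
  amount     DECIMAL(12,2),
  order_date DATE,
  filename   VARCHAR(512)   -- matches the column added by filename=true
);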
