Import XLSX Files to ClickHouse

    Fast, parallel file import using DuckDBStream

    .\FastTransfer.exe `
      --sourceconnectiontype "duckdbstream" `
      --sourceserver ":memory:" `
      --sourceuser "your-username" `
      --sourcepassword "your-password" `
      --query "SELECT * FROM read_xlsx('D:\path\to\files\*.xlsx', filename=true)" `
      --targetconnectiontype "clickhousebulk" `
      --targetserver "your-server" `
      --targetuser "your-username" `
      --targetpassword "your-password" `
      --targetdatabase "your-database" `
      --targetschema "your-schema" `
      --targettable "your-table" `
      --method "DataDriven" `
      --distributekeycolumn "filename" `
      --datadrivenquery "SELECT file FROM glob('D:\path\to\files\*.xlsx')" `
      --degree -2 `
      --loadmode "Truncate" `
      --mapmethod "Name"

    Source - Excel (XLSX)

    The Excel XLSX format is ubiquitous in enterprise environments. FastTransfer can directly read Excel files without prior conversion.

    Features:

    • Direct reading through DuckDB's read_xlsx() function, with no Excel installation required
    • Support for multiple sheets
    • Automatic header detection
    • Data type preservation

    Processing - DuckDBStream with DataDriven

    DuckDB is an in-process analytical database. FastTransfer's DuckDBStream source connection type uses it to read file formats such as XLSX through ordinary SQL queries.

    Parallel Method: DataDriven (Files)

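    For file sources, FastTransfer uses the filename as the distribution key, so several files are processed in parallel. In the command above, --distributekeycolumn names that key column and --datadrivenquery supplies the list of files to distribute.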

    • Concurrent processing of multiple files
    • Ideal for batch imports
    • Automatic horizontal scaling
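
The filename-keyed distribution described above can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not FastTransfer's implementation: `process_file` and `run_by_filename` are hypothetical names, the file names are made up, and the worker body is a placeholder for reading one workbook and loading its rows.

```python
from concurrent.futures import ThreadPoolExecutor


def process_file(path: str) -> str:
    # Hypothetical placeholder: a real worker would read the workbook
    # at `path` and bulk-load its rows into the target table.
    return f"loaded {path}"


def run_by_filename(paths: list[str], workers: int) -> list[str]:
    # Each distinct filename is one independent work item
    # (the "distribution key" in this sketch).
    distinct = sorted(set(paths))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as its input.
        return list(pool.map(process_file, distinct))


# Made-up file names; the duplicate "jan.xlsx" collapses to one work item.
results = run_by_filename(
    ["jan.xlsx", "feb.xlsx", "jan.xlsx", "mar.xlsx"], workers=2
)
print(results)
```

Because every file is an independent unit, adding workers raises throughput only while there are more files than workers; a single large workbook would still be handled by one worker in this sketch.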

    Destination - ClickHouse

    FastTransfer uses ClickHouse's native protocol for bulk inserts into this column-oriented analytics database.

    Loading method:

    Native Protocol Bulk Copy

    Advantages:

    • Native ClickHouse protocol
    • Optimized for columnar format
    • Exceptional analytical performance
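
ClickHouse's native protocol exchanges data as blocks of columns rather than row by row, which is why a bulk loader typically regroups incoming rows before sending them. The following Python sketch shows that regrouping only; it does not talk to ClickHouse, and `to_column_blocks`, the column names, and the block size are hypothetical choices made for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator


def to_column_blocks(
    rows: Iterable[tuple], names: list[str], block_size: int
) -> Iterator[dict[str, list]]:
    """Group row tuples into fixed-size blocks and pivot each block to columns."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, block_size))  # take up to block_size rows
        if not chunk:
            return
        # zip(*chunk) transposes the row tuples into one tuple per column.
        yield {name: list(col) for name, col in zip(names, zip(*chunk))}


# Made-up sample rows with two columns.
rows = [(1, "a"), (2, "b"), (3, "c")]
blocks = list(to_column_blocks(rows, ["id", "label"], block_size=2))
print(blocks)
```

Larger blocks mean fewer round trips at the cost of more memory per batch; the right size depends on row width and available memory.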