# Limitations

FastTransfer handles most common data transfer scenarios, but there are important limitations and restrictions to be aware of before using the tool.
## Target Table Requirements

### Table Must Pre-exist
The target table must already exist before running FastTransfer. The tool does not create tables automatically.
Why this limitation exists:
- Data type mappings between different database systems can be complex and ambiguous
- Business requirements for indexes, constraints, and defaults vary
- Table creation requires elevated privileges that transfer users may not have
Solution: Create the target table manually using appropriate CREATE TABLE statements for your target database, ensuring:
- Column count matches (for Position mapping) or column names match (for Name mapping)
- Data types are compatible with source data
- Any required constraints, indexes, or defaults are defined
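Since FastTransfer will not create the table for you, it can help to script the target DDL from source metadata. The sketch below is purely illustrative: the column list and the PostgreSQL-to-SQL Server type mapping are assumptions to be reviewed against your own schemas, not FastTransfer behavior.

```python
# Sketch: generate a target CREATE TABLE from source column metadata.
# The type mapping is an illustrative assumption (PostgreSQL -> SQL Server);
# verify each mapping for your own data before using it.

# Hypothetical source columns, e.g. read from information_schema.columns
source_columns = [
    ("id", "integer"),
    ("name", "text"),
    ("created_at", "timestamp"),
]

# Illustrative type mapping; extend and verify for real schemas
PG_TO_MSSQL = {
    "integer": "INT",
    "text": "NVARCHAR(MAX)",
    "timestamp": "DATETIME2",
}

def build_create_table(table: str, columns) -> str:
    # One bracketed column definition per source column, in source order,
    # so Position mapping lines up 1st-to-1st, 2nd-to-2nd, etc.
    cols = ",\n  ".join(f"[{name}] {PG_TO_MSSQL[pg_type]}" for name, pg_type in columns)
    return f"CREATE TABLE [{table}] (\n  {cols}\n);"

print(build_create_table("products", source_columns))
```

Run the emitted statement against the target database yourself (FastTransfer issues no DDL), then add any indexes, constraints, or defaults your workload needs.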
### Column Compatibility
Source and target columns must be compatible:
Position Mapping (default):
- Source and target must have the same number of columns
- Columns are matched by ordinal position (1st to 1st, 2nd to 2nd, etc.)
- Column names do not need to match
- Data types must be compatible
Name Mapping:
- Columns are matched by name
- Target can have additional columns (will use defaults or NULLs)
- Source columns without matching target column names will cause an error
- Data types must be compatible
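The two mapping rules above can be checked up front against your own column metadata. This is a pre-flight validation sketch that mirrors the documented rules; it is not FastTransfer's internal implementation, and the sample column lists are assumptions.

```python
# Pre-flight checks mirroring the documented mapping rules.
# Columns are modeled as (name, type) pairs.

def check_position_mapping(source, target):
    """Position mapping: source and target must have the same column count;
    names need not match."""
    if len(source) != len(target):
        raise ValueError(f"column count mismatch: {len(source)} vs {len(target)}")

def check_name_mapping(source, target):
    """Name mapping: every source column must exist in the target by name;
    extra target columns are allowed (filled with defaults or NULLs)."""
    target_names = {name for name, _ in target}
    missing = [name for name, _ in source if name not in target_names]
    if missing:
        raise ValueError(f"source columns missing in target: {missing}")

src = [("id", "int"), ("name", "text")]
tgt = [("id", "int"), ("name", "text"), ("created_at", "timestamp")]

check_name_mapping(src, tgt)        # OK: the extra target column is allowed
try:
    check_position_mapping(src, tgt)
except ValueError as e:
    print(e)                        # column count mismatch: 2 vs 3
```

Running such checks before a long transfer fails fast on the mismatches that would otherwise surface mid-load.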
## Data Type Limitations

### Array Data Types
Array data types have limited cross-database support and may not transfer correctly between all database platforms.
Known Issues:
- PostgreSQL arrays → SQL Server: Not directly supported (SQL Server has no native array type)
- PostgreSQL arrays → MySQL: Not directly supported (MySQL has limited array support via JSON)
- Array element types may not map correctly between systems
Workarounds:

- Convert arrays to delimited strings in the source query:

  ```
  --query "SELECT id, array_to_string(tags, ',') as tags FROM table"
  ```

- Convert arrays to JSON:

  ```
  --query "SELECT id, array_to_json(tags) as tags FROM table"
  ```

- Create separate normalized tables for array elements
### Spatial Data Types
Spatial/geometry data types (PostGIS, SQL Server geography/geometry, Oracle SDO_GEOMETRY) are not supported for cross-database transfers.
Affected Types:
- PostgreSQL: `geometry`, `geography` (PostGIS extension)
- SQL Server: `geometry`, `geography`
- Oracle: `SDO_GEOMETRY`, `SDO_POINT`
- MySQL: `POINT`, `LINESTRING`, `POLYGON`, `GEOMETRY`
Workarounds:

- Convert to Well-Known Text (WKT) format:

  ```
  -- PostgreSQL
  --query "SELECT id, ST_AsText(geom) as geom_wkt FROM spatial_table"

  -- SQL Server
  --query "SELECT id, geom.STAsText() as geom_wkt FROM spatial_table"
  ```

- Convert to Well-Known Binary (WKB) and transfer as binary/`bytea`
- Extract coordinates to separate numeric columns
### Vector Data Types
Vector data types (used for AI/ML embeddings) have limited support and may not transfer correctly between databases.
Affected Types:
- PostgreSQL: `vector` (pgvector extension)
- SQL Server: vector types via specialized columns
- ClickHouse: Array-based vectors
Considerations:
- Vector dimensions must match between source and target
- Some databases store vectors as arrays, others as specialized types
- Serialization format may differ
Workarounds:

- Convert vectors to JSON arrays:

  ```
  --query "SELECT id, embedding::text as embedding_json FROM vectors"
  ```

- Store vectors as delimited strings
- Use binary serialization formats
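The dimension-match requirement and the JSON workaround can be combined into a small pre-transfer step. This sketch is an assumption about how you might prepare embeddings client-side; the expected dimension and sample vector are illustrative.

```python
# Sketch: serialize an embedding to a JSON array string before transfer,
# failing fast if its dimension does not match what the target column expects.
import json

def embedding_to_json(embedding, expected_dim):
    # Vector dimensions must match between source and target
    if len(embedding) != expected_dim:
        raise ValueError(f"dimension mismatch: {len(embedding)} != {expected_dim}")
    return json.dumps(embedding)

vec = [0.1, 0.2, 0.3]
print(embedding_to_json(vec, 3))  # [0.1, 0.2, 0.3]
```

The resulting JSON text lands cleanly in a plain text/JSON column on any target and can be parsed back into the target's native vector type afterwards.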
### Large Object Types
BLOB/CLOB/TEXT handling:
- Very large objects (> 100MB) may impact performance
- Memory constraints may apply to batch processing
- Some databases have specific requirements for LOB handling
Recommendations:
- Consider `--batchsize` adjustments for tables with large objects
- Test with sample data first
- Use appropriate connection types (e.g., `pgcopy` for PostgreSQL LOBs)
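A rough way to pick a batch size for LOB-heavy tables is to cap it so one batch stays within a per-thread memory budget. This heuristic is an assumption for sizing experiments, not a FastTransfer formula; the 5 MB row size and 256 MB budget are illustrative.

```python
# Heuristic (an assumption, not a FastTransfer formula): cap the batch
# size so one batch of wide LOB rows stays under a memory budget.

def suggest_batch_size(avg_row_bytes, budget_bytes, max_batch=10_000):
    # Never below 1 row, never above the normal ceiling
    return max(1, min(max_batch, budget_bytes // avg_row_bytes))

# e.g. ~5 MB average rows (large BLOBs) with a 256 MB per-thread budget
print(suggest_batch_size(5 * 1024**2, 256 * 1024**2))  # 51
```

A value in this range would then be passed via `--batchsize` and validated against a sample transfer before the full run.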
## Performance Considerations

### Network Latency
FastTransfer transfers data through the machine running the tool:
```
Source DB ─(network)→ FastTransfer Machine ─(network)→ Target DB
```
Implications:
- Both source and target network connections affect speed
- High latency to either database will slow transfers
- Bandwidth limitations apply to both connections
Optimization:
- Run FastTransfer close to either source or target (ideally on a machine in the same data center)
- Use parallelization (`--degree`) to maximize throughput
- Consider network compression if supported by database drivers
### Memory Usage
Memory usage scales with:
- Batch size (`--batchsize`)
- Degree of parallelism (`--degree`)
- Row size (width of data)
- Number of large objects (LOBs, TEXT fields)
Guidelines:
- Typical memory per thread: 50-500 MB, depending on row size
- For 4 parallel threads with a 10K batch size: roughly 200 MB to 2 GB of RAM
- Large objects significantly increase memory requirements
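The guideline figures above follow from simple multiplication. The back-of-the-envelope model below is an assumption (real usage adds driver buffers and runtime overhead), with an illustrative 5 KB average row:

```python
# Back-of-the-envelope estimate (an assumption, not an exact model):
# working memory ~= degree x batch_size x average row size, plus overhead.

def estimate_memory_bytes(degree, batch_size, avg_row_bytes):
    return degree * batch_size * avg_row_bytes

# 4 threads, 10K batch, 5 KB rows -> the guideline's ~200 MB lower bound;
# 50 KB rows would land near the 2 GB upper bound.
mb = estimate_memory_bytes(4, 10_000, 5_000) / 1e6
print(f"{mb:.0f} MB")  # 200 MB
```

Plugging in your own average row width (plus a margin for LOB columns) gives a quick sanity check before raising `--degree` or `--batchsize`.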
### Transaction Limits
Some databases have transaction size limits:
- Oracle: Undo tablespace size
- PostgreSQL: No hard limit but large transactions can cause issues
- MySQL: `max_binlog_cache_size` affects transaction logging
- SQL Server: Transaction log size
Mitigation:
- Use smaller batch sizes for very large tables
- Consider `--useworktables` for better transaction management
- Monitor target database transaction logs
## Connection and Authentication

### Windows Authentication

`--targettrusted` (Windows/Kerberos authentication) has limited support:
- Primarily works with SQL Server on Windows
- PostgreSQL requires Kerberos/GSSAPI configuration
- Not supported for MySQL, Oracle, or ClickHouse (these require explicit credentials)
### Certificate-Based Authentication
Client certificate authentication is not directly supported via command-line parameters.
Workaround: Use a connection string with certificate parameters:

```
--targetconnectionstring "Host=server;Port=5432;Database=db;SSL Mode=Require;SSL Cert=/path/to/cert.crt;SSL Key=/path/to/key.key"
```
### Multi-Factor Authentication (MFA)
Interactive MFA is not supported in FastTransfer (it's a command-line tool).
Solutions:
- Use service accounts without MFA
- Use certificate-based auth where supported
- Generate time-limited tokens and use connection strings
## Feature Limitations

### No Schema Translation
FastTransfer does not translate database schemas or SQL dialects:
- Target table must already exist with compatible structure
- Data type conversions are basic (string to string, number to number, etc.)
- Database-specific features (computed columns, triggers) must be recreated manually
### No DDL Operations
FastTransfer only transfers data (DML), not schema objects (DDL):
- Does not create tables, views, indexes, procedures
- Does not transfer constraints, triggers, permissions
- Does not migrate database settings or configurations
Use Cases: FastTransfer is for data migration, not full database migration. For complete database migrations including schema, use database-specific tools:
- SQL Server: SSMS, Azure DMS
- PostgreSQL: pg_dump/pg_restore
- MySQL: mysqldump
- Oracle: Data Pump, GoldenGate
### No Data Transformation
FastTransfer transfers data as-is without transformation:
- No data cleansing or validation
- No type conversions beyond basic compatibility
- No row filtering (except via the source `--query`)
- No column calculations
Solution: Use source queries for basic transformations:

```
--query "SELECT id, UPPER(name) as name, price * 1.1 as price_with_tax FROM products WHERE active = 1"
```

For complex transformations, use dedicated ETL tools (Apache Airflow, Apache NiFi, Talend, etc.).
## Parallel Processing Limitations

### Method Availability
Not all parallelization methods work with all databases:
| Method | SQL Server | PostgreSQL | MySQL | Oracle | Others |
|---|---|---|---|---|---|
| None | ✓ | ✓ | ✓ | ✓ | ✓ |
| Random | ✓ | ✓ | ✓ | ✓ | ✓ |
| RangeId | ✓ | ✓ | ✓ | ✓ | ✓ |
| Ntile | ✓ | ✓ | ✓ | ✓ | ✓ |
| DataDriven | ✓ | ✓ | ✓ | ✓ | Varies |
| Ctid | - | ✓ | - | - | - |
| Rowid | - | - | - | ✓ | - |
| Physloc | ✓ | - | - | - | - |
| NZDataSlice | - | - | - | - | Netezza only |
See Parallel Parameters for method details.
### Ordering Not Guaranteed

When using parallel processing (`--degree` > 1):
- Row order is not guaranteed in target table
- Rows may arrive in different order than source
- Multiple threads insert concurrently
If order matters:
- Use `--degree 1` for a single-threaded transfer
- Add an ORDER BY in the source query (though this may impact performance)
- Add a sequence/timestamp column for ordering
## Platform-Specific Limitations

### DuckDB
- File path must be accessible from FastTransfer machine
- Concurrent access limitations (DuckDB single-writer)
- Memory mode transfers require sufficient RAM
### Netezza
- NZDataSlice method requires specific Netezza configuration
- ODBC driver must be installed separately
### SAP HANA

- The `hanabulk` method requires SAP HANA-specific drivers
- The user must have appropriate SAP HANA bulk load permissions
### Oracle

- `oradirect` (direct path) has restrictions on the target table (no triggers, limited constraints)
- TNS configuration is required for TNS-based connections
- Oracle client libraries may need to be installed
### ClickHouse

- Uses the HTTP(S) protocol, not the native TCP protocol
- Insert performance depends on ClickHouse MergeTree optimization
- Very large inserts may require ClickHouse configuration tuning
## Workarounds Summary
For most limitations, workarounds exist:
- Missing data type support: Use source queries to convert to compatible types
- Array/spatial types: Convert to text/JSON formats
- No table creation: Create target tables manually or via scripts
- Authentication issues: Use connection strings for advanced auth scenarios
- Performance issues: Tune `--batchsize`, `--degree`, and parallelization methods
- Memory constraints: Reduce batch size or parallelism
- Large objects: Adjust batch sizes, consider streaming approaches
## Getting Help
If you encounter limitations not documented here or need assistance with workarounds:
- Check the Examples section for similar scenarios
- Review CLI documentation for parameter options
- Enable debug logging: `--loglevel Debug`
- Contact support: support@arpe.io
## Best Practices
To avoid hitting limitations:
✅ Always test first with a small subset of data
✅ Create target tables with appropriate types before transfer
✅ Validate data types are compatible between source and target
✅ Use source queries for transformations and filtering
✅ Monitor performance and adjust batch size/parallelism
✅ Check logs for warnings and errors
✅ Backup target before large transfers (especially with --loadmode Truncate)