AWS Lambda Docker Wrapper
The AWS Lambda Docker wrapper enables you to run FastTransfer as a serverless function in AWS Lambda, allowing you to execute database transfers on-demand or on a schedule without managing infrastructure.
Overview
This wrapper packages FastTransfer into a Docker container optimized for AWS Lambda.
Benefits:
- Serverless execution: No servers to manage
- Event-driven: Trigger transfers from S3 events, SNS, SQS, EventBridge, etc.
- Scheduled transfers: Use EventBridge rules for cron-like scheduling
- Auto-scaling: Lambda scales automatically with demand
- Cost-effective: Pay only for execution time
- VPC support: Connect to databases in private subnets
Use Cases:
- Scheduled ETL jobs
- Event-driven data synchronization
- On-demand data extracts triggered by API Gateway
- Cloud-to-cloud database migrations
Repository
The Lambda wrapper source code and deployment templates are available on GitHub:
FastTransfer Lambda Wrapper on GitHub: https://github.com/aetperf/FastTransfer-Lambda-Wrapper
Architecture
+-----------------+      +------------------+      +-----------------+
|     Trigger     |      |    AWS Lambda    |      |    Target DB    |
|  (EventBridge,  |----->|  (FastTransfer   |----->|  (RDS, Aurora,  |
|   API Gateway,  |      |   Container)     |      |   External)     |
|   S3, etc.)     |      +------------------+      +-----------------+
+-----------------+               |
                                  |
                                  v
                         +------------------+
                         |    Source DB     |
                         |  (RDS, Aurora,   |
                         |   External)      |
                         +------------------+
Requirements
AWS Account Setup
- AWS Account with appropriate permissions
- AWS CLI configured
- Docker installed locally
- ECR repository for Docker images
Lambda Configuration
- Memory: 1024 MB minimum, 3008 MB recommended for large transfers
- Timeout: 15 minutes maximum (Lambda limit)
- Ephemeral storage: 512 MB minimum, up to 10 GB for large datasets
- VPC: Configure if databases are in private subnets
AWS Lambda has a maximum execution time of 15 minutes. For transfers that take longer, consider using AWS Batch, ECS, or EC2 instead.
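Because of that hard limit, it can help to stop a transfer shortly before Lambda terminates the function, so the wrapper can still log and return a result. The following is a minimal sketch, not the wrapper's actual code: `run_with_deadline`, `transfer_timeout_seconds`, and the 30-second margin are hypothetical names and values chosen for illustration. It relies only on the standard `context.get_remaining_time_in_millis()` method of the Lambda Python runtime and on `subprocess.run`.

```python
import subprocess

SAFETY_MARGIN_S = 30  # assumed buffer for logging and cleanup, not a FastTransfer setting


def transfer_timeout_seconds(context, margin_s=SAFETY_MARGIN_S):
    """Return how long a child process may run before the Lambda deadline."""
    remaining_s = context.get_remaining_time_in_millis() / 1000.0
    return max(remaining_s - margin_s, 0.0)


def run_with_deadline(cmd, context):
    """Run cmd as a child process and stop it shortly before the deadline."""
    timeout_s = transfer_timeout_seconds(context)
    if timeout_s <= 0:
        return {"statusCode": 504, "body": "not enough time left to start"}
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return {"statusCode": 504, "body": f"stopped after {timeout_s:.0f}s"}
    status = 200 if proc.returncode == 0 else 500
    return {"statusCode": status, "body": proc.stdout or proc.stderr}
```

A 504 result from this sketch is a signal that the job is a candidate for AWS Batch, ECS, or EC2 rather than a retry on Lambda.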
Installation
Step 1: Clone the Repository
git clone https://github.com/aetperf/FastTransfer-Lambda-Wrapper.git
cd FastTransfer-Lambda-Wrapper
Step 2: Build Docker Image
The repository includes a Dockerfile optimized for Lambda:
FROM public.ecr.aws/lambda/python:3.11
# Install FastTransfer
COPY FastTransfer ${LAMBDA_TASK_ROOT}/FastTransfer
RUN chmod +x ${LAMBDA_TASK_ROOT}/FastTransfer
# Copy wrapper code
COPY lambda_function.py ${LAMBDA_TASK_ROOT}
COPY requirements.txt ${LAMBDA_TASK_ROOT}
# Install Python dependencies
RUN pip install -r requirements.txt
# Set the CMD to your handler
CMD [ "lambda_function.handler" ]
Build the image:
docker build -t fasttransfer-lambda .
Step 3: Push to Amazon ECR
# Create ECR repository
aws ecr create-repository --repository-name fasttransfer-lambda --region us-east-1
# Login to ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com
# Tag image
docker tag fasttransfer-lambda:latest <account-id>.dkr.ecr.us-east-1.amazonaws.com/fasttransfer-lambda:latest
# Push image
docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/fasttransfer-lambda:latest
Step 4: Create Lambda Function
Using AWS Console:
- Go to Lambda → Create Function
- Choose "Container image"
- Select your ECR image
- Configure memory (1024 MB+) and timeout (5-15 minutes)
- Attach IAM role with necessary permissions
Using AWS CLI:
aws lambda create-function \
--function-name fasttransfer-function \
--package-type Image \
--code ImageUri=<account-id>.dkr.ecr.us-east-1.amazonaws.com/fasttransfer-lambda:latest \
--role arn:aws:iam::<account-id>:role/lambda-execution-role \
--timeout 900 \
--memory-size 2048 \
--ephemeral-storage Size=1024
Step 5: Configure VPC (if needed)
If your databases are in private subnets:
aws lambda update-function-configuration \
--function-name fasttransfer-function \
--vpc-config SubnetIds=subnet-xxxxx,subnet-yyyyy,SecurityGroupIds=sg-zzzzz
Usage
Event Structure
The Lambda function expects a JSON event containing the transfer parameters:
{
  "source": {
    "type": "pgsql",
    "server": "source-db.cluster-xxxxx.us-east-1.rds.amazonaws.com",
    "user": "pguser",
    "password": "pgpass",
    "database": "sales",
    "table": "orders"
  },
  "target": {
    "type": "mssql",
    "server": "target-db.xxxxx.us-east-1.rds.amazonaws.com",
    "user": "sqluser",
    "password": "sqlpass",
    "database": "warehouse",
    "table": "orders"
  },
  "options": {
    "loadMode": "Append",
    "degree": 4,
    "batchSize": 10000
  }
}
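Since a malformed event would otherwise only fail once FastTransfer starts, a small pre-flight check can reject it early. The sketch below is an illustration rather than part of the wrapper: `validate_event` is a hypothetical helper, and the set of required keys is an assumption inferred from the example event above.

```python
# Keys assumed to be mandatory for both endpoints, based on the example event.
REQUIRED_ENDPOINT_KEYS = ("type", "server", "database", "table")


def validate_event(event):
    """Return a list of problems found in a transfer event (empty if none)."""
    problems = []
    for side in ("source", "target"):
        endpoint = event.get(side)
        if not isinstance(endpoint, dict):
            problems.append(f"missing '{side}' object")
            continue
        for key in REQUIRED_ENDPOINT_KEYS:
            if not endpoint.get(key):
                problems.append(f"{side}.{key} is required")
    # "options" is treated as optional, but must be an object when present.
    if not isinstance(event.get("options", {}), dict):
        problems.append("'options' must be an object")
    return problems
```

A handler could return a 400-style response when the list is non-empty instead of launching the transfer.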
Invoke from AWS CLI
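# AWS CLI v2 treats --payload as base64 by default; --cli-binary-format
# lets it read the raw JSON file (omit this flag on AWS CLI v1)
aws lambda invoke \
--function-name fasttransfer-function \
--cli-binary-format raw-in-base64-out \
--payload file://transfer-config.json \
response.json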
cat response.json
Invoke from Python (boto3)
import boto3
import json
lambda_client = boto3.client('lambda', region_name='us-east-1')
payload = {
    "source": {
        "type": "pgsql",
        "server": "source-db.cluster-xxxxx.us-east-1.rds.amazonaws.com",
        "user": "pguser",
        "password": "pgpass",
        "database": "sales",
        "table": "orders"
    },
    "target": {
        "type": "mssql",
        "server": "target-db.xxxxx.us-east-1.rds.amazonaws.com",
        "user": "sqluser",
        "password": "sqlpass",
        "database": "warehouse",
        "table": "orders"
    }
}
response = lambda_client.invoke(
    FunctionName='fasttransfer-function',
    InvocationType='RequestResponse',
    Payload=json.dumps(payload)
)
result = json.loads(response['Payload'].read())
print(f"Status: {result['statusCode']}")
print(f"Result: {result['body']}")
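A synchronous `invoke` call returns HTTP status 200 even when the function itself raised an exception or timed out; in that case boto3 includes a `FunctionError` field in the response. The sketch below shows one way to tell the two cases apart. `parse_invoke_response` is a hypothetical helper, and the `statusCode` check assumes the wrapper returns the shape used in the example above.

```python
import json


def parse_invoke_response(response):
    """Turn a boto3 Lambda invoke() response into an (ok, payload) pair."""
    raw = response["Payload"].read()
    payload = json.loads(raw) if raw else None
    # Lambda sets FunctionError when the function raised or timed out;
    # the top-level StatusCode is still 200 in that case.
    if response.get("FunctionError"):
        return False, payload
    # Assumption: the wrapper reports its own outcome in "statusCode".
    ok = isinstance(payload, dict) and payload.get("statusCode") == 200
    return ok, payload
```

Called as `ok, payload = parse_invoke_response(response)`, it replaces the direct `json.loads(response['Payload'].read())` step when the caller needs to react to failures.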
Scheduled Transfer with EventBridge
Create a rule that triggers the Lambda function daily:
# Create EventBridge rule
aws events put-rule \
--name daily-transfer-2am \
--schedule-expression "cron(0 2 * * ? *)" \
--description "Daily database transfer at 2 AM UTC"
# Add Lambda as target
# (--targets uses JSON syntax here because Input must be a JSON document passed as an escaped string)
aws events put-targets \
--rule daily-transfer-2am \
--targets '[{
"Id": "1",
"Arn": "arn:aws:lambda:us-east-1:<account-id>:function:fasttransfer-function",
"Input": "{\"source\": {\"type\": \"pgsql\", \"server\": \"source-db.us-east-1.rds.amazonaws.com\", \"database\": \"sales\", \"table\": \"orders\"}, \"target\": {\"type\": \"mssql\", \"server\": \"target-db.us-east-1.rds.amazonaws.com\", \"database\": \"warehouse\", \"table\": \"orders\"}}"
}]'
# Grant EventBridge permission to invoke Lambda
aws lambda add-permission \
--function-name fasttransfer-function \
--statement-id EventBridgeInvoke \
--action lambda:InvokeFunction \
--principal events.amazonaws.com \
--source-arn arn:aws:events:us-east-1:<account-id>:rule/daily-transfer-2am
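The same target can be registered from Python, which avoids shell quoting around the embedded event. This is a sketch under assumptions: `build_eventbridge_target` is a hypothetical helper that only builds the dictionary; the resulting entry would be passed to the boto3 `put_targets` call of the `events` client.

```python
import json


def build_eventbridge_target(function_arn, transfer_event, target_id="1"):
    """Build one entry for the Targets list of events.put_targets()."""
    return {
        "Id": target_id,
        "Arn": function_arn,
        # Input must be a string containing valid JSON, not a nested dict.
        "Input": json.dumps(transfer_event),
    }
```

With a configured client, the call would look like `boto3.client("events").put_targets(Rule="daily-transfer-2am", Targets=[target])`.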
API Gateway Integration
Expose FastTransfer as a REST API:
# Create REST API
aws apigateway create-rest-api \
--name fasttransfer-api \
--description "FastTransfer API"
# Configure Lambda integration
# (See AWS documentation for complete API Gateway setup)
Then invoke via HTTP:
curl -X POST https://xxxxxx.execute-api.us-east-1.amazonaws.com/prod/transfer \
-H "Content-Type: application/json" \
-d '{
"source": {"type": "pgsql", "server": "source.rds.amazonaws.com", "database": "db", "table": "table"},
"target": {"type": "mssql", "server": "target.rds.amazonaws.com", "database": "db", "table": "table"}
}'
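When the function sits behind API Gateway, the request body does not arrive as the event itself. Assuming a Lambda proxy integration (the source does not state which integration type the wrapper expects), the JSON arrives as a string in `event["body"]`, optionally base64-encoded. The helpers below are hypothetical and sketch how a handler could accept both direct and proxied invocations.

```python
import base64
import json


def extract_transfer_config(event):
    """Return the transfer parameters from a direct or proxied invocation."""
    if "body" not in event:
        return event  # direct invoke: the event already is the config
    body = event["body"] or "{}"
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    return json.loads(body)


def api_response(status_code, payload):
    """Shape a result the way the Lambda proxy integration expects."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }
```

Keep in mind that API Gateway has its own integration timeout, which is much shorter than Lambda's 15 minutes, so long transfers are usually better started asynchronously.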
Credential Management
AWS Secrets Manager
Store database credentials securely:
import boto3
import json
def get_credentials(secret_name):
    client = boto3.client('secretsmanager', region_name='us-east-1')
    response = client.get_secret_value(SecretId=secret_name)
    return json.loads(response['SecretString'])
# In lambda_function.py
source_creds = get_credentials('fasttransfer/source-db')
target_creds = get_credentials('fasttransfer/target-db')
payload = {
    "source": {
        "type": "pgsql",
        "server": source_creds['host'],
        "user": source_creds['username'],
        "password": source_creds['password'],
        "database": source_creds['database'],
        "table": "orders"
    },
    "target": {
        "type": "mssql",
        "server": target_creds['host'],
        "user": target_creds['username'],
        "password": target_creds['password'],
        "database": target_creds['database'],
        "table": "orders"
    }
}
Create secrets:
aws secretsmanager create-secret \
--name fasttransfer/source-db \
--secret-string '{"host":"source.rds.amazonaws.com","username":"pguser","password":"pgpass","database":"sales"}'
aws secretsmanager create-secret \
--name fasttransfer/target-db \
--secret-string '{"host":"target.rds.amazonaws.com","username":"sqluser","password":"sqlpass","database":"warehouse"}'
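One way to keep passwords out of events entirely is to let the event name a secret and have the function resolve it at run time, caching the result across warm invocations. The sketch below is an assumption-heavy illustration: the `secretName` field, `make_cached_getter`, and `resolve_endpoint` are hypothetical and not part of the wrapper's documented event format. The secret keys (`host`, `username`, `password`, `database`) match the secrets created above.

```python
def make_cached_getter(fetch):
    """Wrap a secret-fetching function so each secret is fetched only once."""
    cache = {}

    def get(secret_name):
        if secret_name not in cache:
            cache[secret_name] = fetch(secret_name)
        return cache[secret_name]

    return get


def resolve_endpoint(endpoint, get_secret):
    """Replace a hypothetical 'secretName' field with real connection fields."""
    if "secretName" not in endpoint:
        return dict(endpoint)
    creds = get_secret(endpoint["secretName"])
    resolved = {k: v for k, v in endpoint.items() if k != "secretName"}
    resolved.update(
        server=creds["host"],
        user=creds["username"],
        password=creds["password"],
        database=creds["database"],
    )
    return resolved
```

In the function module, `get_secret = make_cached_getter(get_credentials)` would be created once at import time so that warm invocations reuse the cached values.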
IAM Roles and Permissions
The Lambda execution role needs the following permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateNetworkInterface",
        "ec2:DescribeNetworkInterfaces",
        "ec2:DeleteNetworkInterface"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "secretsmanager:GetSecretValue"
      ],
      "Resource": "arn:aws:secretsmanager:*:*:secret:fasttransfer/*"
    }
  ]
}
Monitoring and Logging
CloudWatch Logs
Lambda automatically logs to CloudWatch:
import logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)
def handler(event, context):
    logger.info(f"Starting transfer: {event}")
    # Execute FastTransfer
    result = execute_transfer(event)
    logger.info(f"Transfer completed: {result}")
    return result
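Logging the raw event as shown writes any `password` fields it contains into CloudWatch Logs. A small masking step before logging avoids that. The helper below is a hypothetical sketch; the set of sensitive key names is an assumption and may need extending for other event shapes.

```python
SENSITIVE_KEYS = {"password"}  # assumed list; extend as needed


def redact(node):
    """Return a copy of a JSON-like structure with sensitive values masked."""
    if isinstance(node, dict):
        return {
            key: "***" if key.lower() in SENSITIVE_KEYS else redact(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [redact(item) for item in node]
    return node
```

The handler would then log `redact(event)` instead of `event`, leaving the original event untouched for the transfer itself.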
CloudWatch Metrics
Create custom metrics:
import boto3
cloudwatch = boto3.client('cloudwatch')
def publish_metrics(row_count, duration):
    cloudwatch.put_metric_data(
        Namespace='FastTransfer',
        MetricData=[
            {
                'MetricName': 'RowsTransferred',
                'Value': row_count,
                'Unit': 'Count'
            },
            {
                'MetricName': 'TransferDuration',
                'Value': duration,
                'Unit': 'Seconds'
            }
        ]
    )
X-Ray Tracing
Enable AWS X-Ray for detailed tracing:
aws lambda update-function-configuration \
--function-name fasttransfer-function \
--tracing-config Mode=Active
Performance Optimization
Memory Configuration
Lambda allocates CPU in proportion to configured memory, so a higher memory setting also gives the transfer more CPU:
- 1024 MB: Small transfers (under 100K rows)
- 2048 MB: Medium transfers (under 1M rows)
- 3008 MB: Large transfers (1M+ rows)
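The tiers above can be expressed as a small lookup, for example to set the memory size from an estimated row count before a run. `recommend_memory_mb` is a hypothetical helper, and the thresholds are simply the ones listed above; they are starting points rather than measured limits.

```python
def recommend_memory_mb(row_count):
    """Map an estimated row count to the memory tiers listed above."""
    if row_count < 100_000:
        return 1024
    if row_count < 1_000_000:
        return 2048
    return 3008
```

The value could then be applied with the boto3 Lambda client, for example `update_function_configuration(FunctionName="fasttransfer-function", MemorySize=recommend_memory_mb(rows))`.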
Ephemeral Storage
Increase if working with large datasets:
aws lambda update-function-configuration \
--function-name fasttransfer-function \
--ephemeral-storage Size=2048 # Up to 10240 MB
Provisioned Concurrency
For consistent performance:
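# Provisioned concurrency applies to a published version or alias, not $LATEST
aws lambda put-provisioned-concurrency-config \
--function-name fasttransfer-function \
--qualifier <alias-or-version> \
--provisioned-concurrent-executions 5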
Limitations
Lambda Constraints
- Maximum execution time: 15 minutes
- Maximum memory: 10 GB
- Maximum ephemeral storage: 10 GB
- Maximum payload: 6 MB (synchronous), 256 KB (asynchronous)
Workarounds for Large Transfers
For transfers exceeding Lambda limits:
- Split into batches: Transfer in chunks using queries with LIMIT/OFFSET
- Use Step Functions: Orchestrate multiple Lambda invocations
- Use AWS Batch: For longer-running transfers
- Use ECS Fargate: For very large transfers
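The batching workaround can be sketched as a generator of per-chunk events, each small enough to finish within one invocation. This is an illustration under assumptions: the `query` field on the source is hypothetical (the documented event only shows `table`), the `LIMIT ... OFFSET` syntax is PostgreSQL-style, and a stable `ORDER BY` column is required so that chunks neither overlap nor skip rows.

```python
import copy


def chunk_windows(total_rows, chunk_size):
    """Yield (offset, limit) pairs that cover total_rows without overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, total_rows, chunk_size):
        yield offset, min(chunk_size, total_rows - offset)


def chunk_events(base_event, total_rows, chunk_size, order_by):
    """Build one event per chunk, adding a hypothetical source 'query' field."""
    table = base_event["source"]["table"]
    events = []
    for offset, limit in chunk_windows(total_rows, chunk_size):
        event = copy.deepcopy(base_event)
        event["source"]["query"] = (
            f"SELECT * FROM {table} ORDER BY {order_by} "
            f"LIMIT {limit} OFFSET {offset}"
        )
        events.append(event)
    return events
```

Each generated event could be sent as a separate invocation, for instance from a Step Functions Map state. Large offsets become slow on big tables, so key-range predicates are often a better fit than OFFSET when a suitable key exists.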
Cost Considerations
Lambda pricing is based on:
- Invocations: $0.20 per 1M requests
- Duration: $0.0000166667 per GB-second
- Data transfer: Standard AWS data transfer rates
Example calculation:
- 100 daily transfers
- 2 GB memory, 5 minutes each
- Cost: ~$30/month for compute (100 × 30 days × 300 s × 2 GB = 1,800,000 GB-seconds, before any free-tier allowance), plus data transfer
Compare with:
- EC2 t3.medium 24/7: ~$30/month
- RDS proxy: Additional costs
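The example calculation can be checked by multiplying out requests and GB-seconds with the rates listed above. The sketch below is a rough estimator only: `monthly_lambda_cost` is a hypothetical helper, it assumes a 30-day month, and it ignores the free tier and data transfer. With 100 daily transfers at 2 GB for 5 minutes each, compute comes to about $30 per month.

```python
PRICE_PER_MILLION_REQUESTS = 0.20   # USD, from the list above
PRICE_PER_GB_SECOND = 0.0000166667  # USD, from the list above


def monthly_lambda_cost(invocations_per_day, memory_gb, seconds_each, days=30):
    """Estimate monthly Lambda cost in USD, excluding free tier and data transfer."""
    invocations = invocations_per_day * days
    gb_seconds = invocations * memory_gb * seconds_each
    request_cost = invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    compute_cost = gb_seconds * PRICE_PER_GB_SECOND
    return round(request_cost + compute_cost, 2)


# 100 transfers/day, 2 GB, 5 minutes (300 s) each
print(monthly_lambda_cost(100, 2, 300))  # 30.0
```

Request charges are negligible at this volume; duration and memory dominate, so shortening transfers or lowering memory has the largest effect on the bill.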
Further Information
For complete source code, deployment templates, and examples, visit the GitHub repository:
https://github.com/aetperf/FastTransfer-Lambda-Wrapper