Traffic Routing in AWS Lambda: Canary Deployments, Weighted Aliases & Blue/Green

Lambda Versions and Aliases: The Foundation

Before traffic routing makes sense, you need to understand Lambda's versioning model.

Versions

Every time you publish a Lambda function, AWS creates an immutable version — a snapshot of your code and configuration at that point in time.

$LATEST → always points to the latest unpublished code (mutable)
:1 → first published version (immutable)
:2 → second published version (immutable)
:3 → third published version (immutable)

# Publish a new version via boto3
import boto3
lambda_client = boto3.client('lambda')
response = lambda_client.publish_version(
 FunctionName='brand-api',
 Description='v2.1.0 — faster logo lookup with DynamoDB cache'
)
version_arn = response['FunctionArn']
version_number = response['Version']
print(f'Published version {version_number}: {version_arn}')
# → Published version 42: arn:aws:lambda:us-east-1:123:function:brand-api:42

Versions are immutable — you cannot change the code of :42 after it's published. This is the foundation of safe deployments.
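
Because each published version is addressed by its own qualified ARN, deployment scripts often need to pull the version number back out of that string. The following is a minimal sketch, assuming the usual `arn:<partition>:lambda:<region>:<account>:function:<name>[:<qualifier>]` layout; `LambdaArn` and `parse_lambda_arn` are illustrative names, not part of boto3.

```python
from typing import NamedTuple, Optional


class LambdaArn(NamedTuple):
    region: str
    account_id: str
    function_name: str
    qualifier: Optional[str]  # version number or alias name; None if unqualified


def parse_lambda_arn(arn: str) -> LambdaArn:
    """Split a Lambda function ARN into its parts (hypothetical helper)."""
    parts = arn.split(':')
    looks_valid = (
        len(parts) in (7, 8)
        and parts[0] == 'arn'
        and parts[2] == 'lambda'
        and parts[5] == 'function'
    )
    if not looks_valid:
        raise ValueError(f'Not a Lambda function ARN: {arn!r}')
    qualifier = parts[7] if len(parts) == 8 else None
    return LambdaArn(parts[3], parts[4], parts[6], qualifier)
```

For example, `parse_lambda_arn(response['FunctionArn']).qualifier` would give the same value as `response['Version']` for the `publish_version` call above.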

Aliases

An alias is a named pointer to a specific version. Your API Gateway, EventBridge rules, and other triggers should always point to an alias — never to $LATEST or a version number directly.

brand-api:prod → points to :42 (production traffic)
brand-api:staging → points to :43 (staging traffic)
brand-api:canary → points to :42 (95%) + :43 (5%) ← weighted routing

# Create or update an alias
lambda_client.create_alias(
 FunctionName='brand-api',
 Name='prod',
 FunctionVersion='42',
 Description='Production alias'
)
# Update alias to point to new version
lambda_client.update_alias(
 FunctionName='brand-api',
 Name='prod',
 FunctionVersion='43'
)
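
Callers address an alias the same way they address a version: through the `Qualifier` parameter. As a rough sketch, the helper below invokes a function through an alias and reports which version actually served the call, using the `ExecutedVersion` field of the `invoke` response. The helper name `invoke_via_alias` is an assumption for illustration, and `client` is expected to be a `boto3.client('lambda')` (or any object with a compatible `invoke` method).

```python
import json


def invoke_via_alias(client, function_name: str, alias: str, payload: dict) -> dict:
    """Invoke a function through an alias and report the version that ran."""
    response = client.invoke(
        FunctionName=function_name,
        Qualifier=alias,  # alias name such as 'prod', never $LATEST
        Payload=json.dumps(payload).encode('utf-8'),
    )
    return {
        # Which version the alias resolved to for this particular request
        'executed_version': response.get('ExecutedVersion'),
        # The Payload field is a stream; read it once and decode the JSON
        'result': json.loads(response['Payload'].read()),
    }
```

Logging `executed_version` on the caller side is a cheap way to confirm that an alias update has taken effect.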

Traffic Splitting: Canary Deployments with Weighted Aliases

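The most powerful traffic routing feature in Lambda is weighted aliases: a single alias can split traffic between two published versions at any percentage you choose. Both targets must be published versions; $LATEST cannot take part in a weighted split.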

brand-api:prod
├── version :42 → 95% of traffic
└── version :43 → 5% of traffic ← canary

This is Lambda's equivalent of what Knative achieves with Istio VirtualService traffic splitting — but built natively into the Lambda service.
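
To build intuition for what a 95/5 split means per request, here is a small simulation. It is only a model: Lambda's real routing logic is internal to the service, and the sketch simply assumes each request is assigned independently at random according to the weights. `pick_version` and `simulate` are hypothetical names.

```python
import random
from collections import Counter
from typing import Dict


def pick_version(primary: str, additional_weights: Dict[str, float],
                 rng: random.Random) -> str:
    """Choose a version for one request; the primary gets the remaining weight."""
    roll = rng.random()  # uniform in [0, 1)
    cumulative = 0.0
    for version, weight in additional_weights.items():
        cumulative += weight
        if roll < cumulative:
            return version
    return primary


def simulate(requests: int, seed: int = 7) -> Counter:
    """Count how many of `requests` land on each version for a 95/5 split."""
    rng = random.Random(seed)
    return Counter(pick_version('42', {'43': 0.05}, rng) for _ in range(requests))
```

Running `simulate(20000)` should put roughly 5% of requests on version 43. Over a small number of requests the observed split can drift noticeably from the configured weight, which is worth remembering when judging a canary on low-traffic functions.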

Implementing a Canary Deployment

# deploy_canary.py
import boto3

lambda_client = boto3.client('lambda')


def deploy_canary(function_name: str, new_version: str, canary_percent: int = 5):
    """
    Deploy a new Lambda version as a canary.
    Routes canary_percent% of traffic to new version.
    """
    # Get current prod alias
    alias = lambda_client.get_alias(
        FunctionName=function_name,
        Name='prod'
    )
    current_version = alias['FunctionVersion']
    print(f'Current prod version: {current_version}')
    print(f'Deploying canary: version {new_version} at {canary_percent}%')
    # Update alias with weighted routing
    lambda_client.update_alias(
        FunctionName=function_name,
        Name='prod',
        FunctionVersion=current_version,  # stable version gets majority
        RoutingConfig={
            'AdditionalVersionWeights': {
                new_version: canary_percent / 100  # e.g., 0.05 = 5%
            }
        }
    )
    print(f'Canary deployed: {100 - canary_percent}% → v{current_version}, '
          f'{canary_percent}% → v{new_version}')


def promote_canary(function_name: str, new_version: str):
    """Promote canary to 100%: full deployment"""
    lambda_client.update_alias(
        FunctionName=function_name,
        Name='prod',
        FunctionVersion=new_version,
        RoutingConfig={
            'AdditionalVersionWeights': {}  # clear weighted routing
        }
    )
    print(f'Canary promoted: 100% traffic now on version {new_version}')


def rollback_canary(function_name: str, stable_version: str):
    """Roll back: remove canary, restore 100% to stable version"""
    lambda_client.update_alias(
        FunctionName=function_name,
        Name='prod',
        FunctionVersion=stable_version,
        RoutingConfig={
            'AdditionalVersionWeights': {}  # clear canary
        }
    )
    print(f'Rolled back: 100% traffic restored to version {stable_version}')


# Usage
deploy_canary('brand-api', new_version='43', canary_percent=5)
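
Between deploying a canary and promoting it, something has to decide whether the new version is healthy. The sketch below shows one possible decision rule that compares error rates of the two versions. The thresholds (`min_invocations`, `max_rate_increase`) and the names `VersionStats` and `canary_verdict` are assumptions for illustration; the counts themselves could come from the `Invocations` and `Errors` metrics in the `AWS/Lambda` namespace, filtered by the `ExecutedVersion` dimension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionStats:
    invocations: int
    errors: int

    @property
    def error_rate(self) -> float:
        # Treat "no traffic yet" as a 0% error rate rather than dividing by zero
        return self.errors / self.invocations if self.invocations else 0.0


def canary_verdict(stable: VersionStats, canary: VersionStats,
                   min_invocations: int = 100,
                   max_rate_increase: float = 0.01) -> str:
    """Return 'wait', 'promote' or 'rollback' for the current canary."""
    if canary.invocations < min_invocations:
        return 'wait'  # not enough canary traffic to judge yet
    if canary.error_rate > stable.error_rate + max_rate_increase:
        return 'rollback'  # canary is clearly worse than the stable version
    return 'promote'
```

A deployment script could poll this verdict on a timer and call `promote_canary` or `rollback_canary` accordingly. Comparing against the stable version's own error rate, rather than a fixed number, keeps the rule usable for functions that always have some baseline of errors.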

Automated Canary with CloudWatch Alarms (CodeDeploy)

Manually managing canary percentages is error-prone. AWS CodeDeploy integrates with Lambda to automate the shift — and automatically roll back if CloudWatch alarms fire.

# serverless.yml: automated canary deployment
# deploymentSettings is provided by the serverless-plugin-canary-deployments plugin
plugins:
  - serverless-plugin-canary-deployments
provider:
  name: aws
  deploymentMethod: direct
functions:
  brandApi:
    handler: handler.handler
    deploymentSettings:
      type: Canary10Percent5Minutes # shift 10% now, 100% after 5 minutes
      alias: prod
      alarms:
        - BrandApiErrorRateAlarm # rollback if this alarm fires
        - BrandApiLatencyAlarm

# CloudFormation: define the rollback alarms
resources:
  Resources:
    BrandApiErrorRateAlarm:
      Type: AWS::CloudWatch::Alarm
      Properties:
        AlarmName: brand-api-error-rate-canary
        MetricName: Errors
        Namespace: AWS/Lambda
        Dimensions:
          - Name: FunctionName
            Value: brand-api
          - Name: Resource
            Value: brand-api:prod # monitor the alias, not a specific version
        Statistic: Sum
        Period: 60
        EvaluationPeriods: 2
        Threshold: 5 # rollback if >5 errors per minute for 2 consecutive minutes
        ComparisonOperator: GreaterThanThreshold
    BrandApiLatencyAlarm:
      Type: AWS::CloudWatch::Alarm
      Properties:
        AlarmName: brand-api-p99-latency-canary
        MetricName: Duration
        Namespace: AWS/Lambda
        Dimensions:
          - Name: FunctionName
            Value: brand-api
          - Name: Resource
            Value: brand-api:prod
        ExtendedStatistic: p99
        Period: 60
        EvaluationPeriods: 2
        Threshold: 2000 # rollback if P99 > 2000ms for 2 consecutive minutes
        ComparisonOperator: GreaterThanThreshold

CodeDeploy deployment types for Lambda:

| Type | Behavior |
| --- | --- |
| `AllAtOnce` | 100% traffic shifts immediately (no canary) |
| `Canary10Percent5Minutes` | 10% for 5 min, then 100% |
| `Canary10Percent10Minutes` | 10% for 10 min, then 100% |
| `Canary10Percent15Minutes` | 10% for 15 min, then 100% |
| `Linear10PercentEvery1Minute` | +10% every minute until 100% |
| `Linear10PercentEvery2Minutes` | +10% every 2 minutes until 100% |
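
The type names encode their own schedule, which makes it easy to estimate how long a deployment will take. The sketch below expands a type name into a list of `(minute, percent_on_new_version)` steps. It assumes the first increment is applied at the start of the deployment, and `traffic_schedule` is a hypothetical helper, not a CodeDeploy API.

```python
import re
from typing import List, Tuple


def traffic_schedule(deployment_type: str) -> List[Tuple[int, int]]:
    """Expand a deployment type name into (minute, percent on new version) steps."""
    if deployment_type == 'AllAtOnce':
        return [(0, 100)]

    canary = re.fullmatch(r'Canary(\d+)Percent(\d+)Minutes', deployment_type)
    if canary:
        percent, minutes = int(canary.group(1)), int(canary.group(2))
        return [(0, percent), (minutes, 100)]

    linear = re.fullmatch(r'Linear(\d+)PercentEvery(\d+)Minutes?', deployment_type)
    if linear:
        step, interval = int(linear.group(1)), int(linear.group(2))
        if not 0 < step <= 100:
            raise ValueError(f'Invalid step size in {deployment_type!r}')
        schedule = []
        percent, minute = step, 0
        while percent < 100:
            schedule.append((minute, percent))
            percent += step
            minute += interval
        schedule.append((minute, 100))  # final step always lands on 100%
        return schedule

    raise ValueError(f'Unrecognised deployment type: {deployment_type!r}')
```

For instance, `traffic_schedule('Linear10PercentEvery2Minutes')` ends with `(18, 100)`, so under this model a healthy deployment finishes its last shift 18 minutes after it starts.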

How Traffic Flows: Sync vs Async

Traffic routing in Lambda isn't just about version weights — the entire flow differs between synchronous and asynchronous invocations.

Synchronous Traffic Flow (API Gateway)

Client Request
 │
 ▼
API Gateway
 │ (points to alias: brand-api:prod)
 ▼
Lambda Service (weighted routing)
 ├── 95% → Execution Environment running v42
 └── 5% → Execution Environment running v43
 │
 ▼
Response returned to API Gateway → Client

Key characteristics:

Direct path: client waits for the response
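No buffering: if Lambda throttles the invocation, it rejects it with a 429 (TooManyRequestsException) and API Gateway returns an error to the client straight away; nothing is queued or retried on the caller's behalf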
Version routing: Lambda's weighted alias determines which version handles each request
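
Since the synchronous path has no buffer, any retrying has to happen in the caller. A minimal sketch of client-side retry with exponential backoff and jitter follows. `ThrottledError` and `call_with_backoff` are hypothetical names; in a boto3 caller, you would raise or translate from the Lambda client's `TooManyRequestsException` instead.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar('T')


class ThrottledError(Exception):
    """Stand-in for a 429 / throttling response from the backend."""


def call_with_backoff(invoke: Callable[[], T], max_attempts: int = 4,
                      base_delay: float = 0.1,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `invoke`, retrying throttled calls with jittered exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return invoke()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttle to the caller
            # Full jitter: wait a random time up to base_delay * 2^attempt
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise ValueError('max_attempts must be at least 1')
```

The `sleep` parameter is injectable so the function can be exercised in tests without real delays. Jitter matters here because many clients retrying on the same schedule would hit the concurrency limit again in lockstep.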

# handler.py: use context to log which version is handling the request
def handler(event, context):
    # Log version info for canary monitoring
    function_version = context.function_version
    print(f'Handled by version: {function_version}')
    # Your business logic (get_brand is assumed to be defined elsewhere in your code)
    brand_id = event['pathParameters']['brandId']
    return get_brand(brand_id)

Asynchronous Traffic Flow (S3 / EventBridge)

Async traffic introduces a buffer layer between the event source and Lambda execution. This is the key architectural difference.

Event Source (S3 upload / EventBridge rule)
 │
 ▼
Lambda Internal Queue ← traffic is buffered here
 │
 ▼ (Lambda polls the queue)
Lambda Service (weighted routing)
 ├── 95% → Execution Environment running v42
 └── 5% → Execution Environment running v43
 │
 ▼
Result → CloudWatch Logs
 → Success destination (SNS/SQS/EventBridge/Lambda)
 → Failure destination (DLQ) on repeated failures

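The buffer is critical: it decouples the event producer from Lambda's availability. If Lambda is throttled or scaling out, events wait in the queue and are processed when capacity is available. The retention is bounded, though: by default Lambda keeps an async event for up to 6 hours and retries a failed invocation up to 2 times, after which the event is discarded unless a failure destination or DLQ is configured.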

# handler.py: async handler with destination routing
from urllib.parse import unquote_plus


def handler(event, context):
    """
    Async handler: processes S3 upload events.
    On success: result routed to success-destination SQS.
    On failure: after 2 retries, routed to DLQ.
    """
    processed = []
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # S3 event notifications URL-encode object keys
        key = unquote_plus(record['s3']['object']['key'])
        try:
            # process_brand_asset is your own processing function, defined elsewhere
            result = process_brand_asset(bucket, key)
            print(f'Successfully processed: {key}')
            processed.append({'processed': key, 'result': result})
        except Exception as e:
            print(f'Failed to process {key}: {e}')
            raise  # re-raise to trigger Lambda retry + eventual DLQ routing
    # Return after the loop so every record in the event is handled
    return processed

# serverless.yml: configure async destinations
functions:
  processBrandAsset:
    handler: handler.handler
    destinations:
      onSuccess: arn:aws:sqs:us-east-1:123:brand-asset-success
      onFailure: arn:aws:sqs:us-east-1:123:brand-asset-dlq
    maximumRetryAttempts: 2
    events:
      - s3:
          bucket: brand-assets
          event: s3:ObjectCreated:*
Concurrency Control at the Traffic Layer

In Knative's model, the queue-proxy sidecar acts as a per-pod concurrency limiter — it queues excess requests locally before forwarding to the user container, and reports metrics to the autoscaler.

AWS Lambda implements an equivalent mechanism natively, without requiring a sidecar:

Per-Function Concurrency Limiting

# Set maximum concurrency — Lambda queues excess async requests
lambda_client.put_function_concurrency(
 FunctionName='brand-logo-processor',
 ReservedConcurrentExecutions=50 # max 50 simultaneous executions
)

For synchronous invocations: requests beyond the concurrency limit are immediately throttled (429).

For asynchronous invocations: requests beyond the concurrency limit are queued in Lambda's internal event queue (up to 6 hours) and retried as capacity becomes available.
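
The difference between the two paths can be shown with a toy model. This is not how Lambda is implemented internally; it is a simplified sketch in which a single counter stands in for reserved concurrency and a deque stands in for the internal event queue. `ConcurrencyGate` is a hypothetical name.

```python
from collections import deque
from typing import Any, Optional


class ConcurrencyGate:
    """Toy model of a concurrency limit with sync and async entry points."""

    def __init__(self, limit: int):
        self.limit = limit
        self.in_flight = 0
        self.queue: deque = deque()

    def invoke_sync(self, request: Any) -> str:
        if self.in_flight >= self.limit:
            return 'throttled'  # caller sees a 429 and must retry itself
        self.in_flight += 1
        return 'running'

    def invoke_async(self, event: Any) -> str:
        if self.in_flight >= self.limit:
            self.queue.append(event)  # buffered until capacity frees up
            return 'queued'
        self.in_flight += 1
        return 'running'

    def finish_one(self) -> Optional[Any]:
        """One execution completes; start the oldest queued event, if any."""
        self.in_flight -= 1
        if self.queue:
            self.in_flight += 1
            return self.queue.popleft()
        return None
```

With a limit of 2, a third synchronous call is rejected, while a third asynchronous event is held and only starts once `finish_one()` frees a slot.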

Per-Alias Concurrency (Provisioned Concurrency on Aliases)

You can apply Provisioned Concurrency specifically to an alias, ensuring the production alias always has warm environments while the canary alias uses on-demand scaling:

# Apply provisioned concurrency to prod alias only
lambda_client.put_provisioned_concurrency_config(
 FunctionName='brand-api',
 Qualifier='prod', # the alias name
 ProvisionedConcurrentExecutions=20
)
# Canary alias uses on-demand (may cold start, but that's acceptable for 5% traffic)
# No provisioned concurrency set on 'canary' alias
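
Provisioned environments are not ready the instant the call above returns; the configuration reports a `Status` of `IN_PROGRESS` until they are initialised. A deployment script may therefore want to wait before sending traffic. The following is a minimal polling sketch, assuming `client` is a `boto3.client('lambda')`; the helper name and its polling defaults are illustrative.

```python
import time
from typing import Callable


def wait_for_provisioned_concurrency(client, function_name: str, qualifier: str,
                                     poll_seconds: float = 5.0,
                                     max_polls: int = 60,
                                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until provisioned concurrency is READY; False if we give up first."""
    for _ in range(max_polls):
        config = client.get_provisioned_concurrency_config(
            FunctionName=function_name,
            Qualifier=qualifier,
        )
        status = config['Status']
        if status == 'READY':
            return True
        if status == 'FAILED':
            raise RuntimeError(config.get('StatusReason', 'provisioning failed'))
        sleep(poll_seconds)  # still IN_PROGRESS
    return False
```

Returning `False` on timeout, rather than raising, leaves the choice to the caller: proceed with on-demand scaling and accept cold starts, or abort the rollout.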

Blue/Green Deployment Pattern

For changes that are too risky for gradual canary (e.g., breaking schema changes), use a full blue/green deployment:

Before cutover:
Blue (live): brand-api:prod → version :42 (100% traffic)
Green (idle): brand-api:green → version :43 (0% traffic, fully tested)
After validation:
Green (live): brand-api:prod → version :43 (100% traffic, instant cutover)
Blue (standby): brand-api:previous → version :42 (kept for instant rollback)

# blue_green_deploy.py
import boto3

lambda_client = boto3.client('lambda')


def blue_green_cutover(function_name: str, new_version: str):
    """
    Instant traffic cutover from current prod version to new version.
    Previous version kept on 'previous' alias for instant rollback.
    """
    # Get current prod version (this becomes 'blue' / previous)
    current = lambda_client.get_alias(
        FunctionName=function_name,
        Name='prod'
    )
    current_version = current['FunctionVersion']
    # Preserve current version on 'previous' alias for rollback
    try:
        lambda_client.update_alias(
            FunctionName=function_name,
            Name='previous',
            FunctionVersion=current_version
        )
    except lambda_client.exceptions.ResourceNotFoundException:
        # First deployment: the 'previous' alias does not exist yet
        lambda_client.create_alias(
            FunctionName=function_name,
            Name='previous',
            FunctionVersion=current_version
        )
    # Cut over prod to new version (instant, no gradual shift)
    lambda_client.update_alias(
        FunctionName=function_name,
        Name='prod',
        FunctionVersion=new_version,
        RoutingConfig={'AdditionalVersionWeights': {}}
    )
    print(f'Cutover complete: prod now on v{new_version}')
    print(f'Rollback available: run instant_rollback() to restore v{current_version}')


def instant_rollback(function_name: str):
    """Roll back to previous version instantly"""
    previous = lambda_client.get_alias(
        FunctionName=function_name,
        Name='previous'
    )
    previous_version = previous['FunctionVersion']
    lambda_client.update_alias(
        FunctionName=function_name,
        Name='prod',
        FunctionVersion=previous_version,
        RoutingConfig={'AdditionalVersionWeights': {}}
    )
    print(f'Rolled back: prod restored to v{previous_version}')

Deployment Strategy Decision Guide

How risky is this deployment?
│
├── Low risk (config change, minor bug fix)
│ └── AllAtOnce — deploy directly to 100%
│
├── Medium risk (new feature, refactor)
│ └── Canary — start at 5–10%, monitor errors/latency,
│ auto-promote or rollback via CodeDeploy alarms
│
├── High risk (breaking change, new external dependency)
│ └── Blue/Green — full parallel environment,
│ instant cutover after validation, instant rollback
│
└── Schema/data migration (irreversible changes)
 └── Feature flags in code + gradual rollout
 (decouple deployment from feature activation)

Summary

| Concept | AWS Lambda Implementation |
| --- | --- |
| Traffic splitting | Weighted aliases (e.g., 95% v42 / 5% v43) |
| Canary deployment | CodeDeploy + Lambda aliases + CloudWatch alarms |
| Blue/Green | Two aliases pointing to different versions, instant cutover |
| Async traffic buffering | Lambda internal event queue (up to 6 hours) |
| Concurrency control | Reserved concurrency + Provisioned Concurrency per alias |
| Automatic rollback | CodeDeploy monitors alarms, rolls back if threshold breached |

The key insight: Lambda's alias + versioning system is its traffic routing layer. Every production Lambda function should be invoked via an alias — never via $LATEST. This single practice unlocks canary deployments, blue/green releases, and instant rollbacks.

Next in this series: **Part 5 — Event-Driven Automation: Building a Serverless Maintenance Bot with Lambda & EventBridge**