From Manual OAuth Onboarding to Event-Driven Sync: A Privacy-Safe Serverless Case Study

DEV Community

In practice, Cognito-issued tokens represent Cognito sessions. In this architecture they do not replace lifecycle management for long-lived third-party API credentials. So we changed direction:

Use direct OAuth code exchange with the provider.
Persist provider access/refresh tokens in encrypted parameter storage.
Keep DynamoDB focused on non-sensitive token metadata and discovery records.

This decision reduced ambiguity in token ownership, made refresh behavior explicit, and aligned better with least-privilege backend API access patterns.
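A minimal sketch of that split, assuming a boto3-style SSM client and DynamoDB table object are passed in. The `/oauth-poc` prefix, the attribute names, and the `store_tokens` signature are hypothetical and not taken from the production code:

```python
import re
from datetime import datetime, timezone


def to_secret_path_segment(identity: str) -> str:
    """Replace characters that are not safe in a parameter name."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", identity)


def store_tokens(ssm, metadata_table, email, access_token, refresh_token,
                 prefix="/oauth-poc"):
    """Write token values to the secret store and only metadata to DynamoDB.

    `ssm` is expected to expose put_parameter() like a boto3 SSM client, and
    `metadata_table` put_item() like a boto3 DynamoDB Table resource.
    """
    base = f"{prefix}/{to_secret_path_segment(email)}"
    for name, value in (("access_token", access_token),
                        ("refresh_token", refresh_token)):
        ssm.put_parameter(
            Name=f"{base}/{name}",
            Value=value,
            Type="SecureString",  # encrypted at rest by Parameter Store
            Overwrite=True,       # a re-consent replaces the old value
        )
    # The metadata record never contains a token value.
    metadata_table.put_item(Item={
        "user_email": email,
        "token_exists": True,
        "consented_at": datetime.now(timezone.utc).isoformat(),
    })
    return base
```

Passing the clients in as arguments keeps the function easy to exercise with fakes, and makes it obvious which store receives which kind of data.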

Resource discovery pipeline

After the callback exchange, the auth Lambda performs discovery against two provider APIs and stores normalized records.

# pseudo-code
tokens = exchange_code_for_tokens(code, redirect_uri)
email = fetch_user_identity(tokens.access_token)
store_tokens(email, tokens.access_token, tokens.refresh_token)
resources_a = list_resource_type_a(tokens.access_token)
resources_b, skipped = list_resource_type_b(tokens.refresh_token)
upsert_resources_a(email, resources_a)
upsert_resources_b(email, resources_b, skipped)

Notice the mixed token strategy:

API A accepts an access token directly.
API B may be better served from refresh-token-derived sessions.

This small detail matters in real-world provider ecosystems.

New problem: operational analytics lived in Postgres, not DynamoDB

V1 solved onboarding and discovery. Then a second problem appeared.

Downstream consumers (dashboards, joins, historical analysis, role-based reports) relied on relational querying in Postgres. But fresh data now landed in DynamoDB first.

We had to answer:

How do we keep relational tables synced with minimal lag?
How do we remain idempotent under retries and duplicate events?
How do we avoid expensive full-table scans every minute?

Solution v2: event-driven sync with DynamoDB Streams

The best fit was an event-driven projection layer:

DynamoDB table updates emit stream records.
Stream processor Lambda transforms records.
Lambda upserts rows into Postgres.

Updated architecture with streaming sync

Figure: DynamoDB stream for RDS Postgres sync

Why event-driven first, batch second

Primary path (event-driven):

Low latency (seconds)
No frequent scans
Natural fit for change-data-capture style projection

Safety net (nightly reconciliation):

Catches rare drift (missed events, temporary DB outage, mapping regressions)
Supports audit checks and backfills

This is a practical engineering pattern: fast path + correctness path.

Stream processor design details

1. Idempotency via SQL upsert

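DynamoDB Streams writes each change to the stream once, but Lambda processes stream batches with at-least-once semantics: a failed or timed-out batch is retried, so the same record can reach the function more than once. Upsert semantics make those retries safe.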

-- pseudo-SQL
INSERT INTO ext_resource_a (
 user_email,
 resource_id,
 resource_name,
 status,
 updated_at
)
VALUES ($1, $2, $3, $4, $5)
ON CONFLICT (user_email, resource_id)
DO UPDATE SET
 resource_name = EXCLUDED.resource_name,
 status = EXCLUDED.status,
 updated_at = EXCLUDED.updated_at;
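To see why the upsert makes retries safe, here is a small runnable sketch. It uses an in-memory SQLite database as a stand-in for Postgres (SQLite accepts the same `ON CONFLICT ... DO UPDATE` form), so the placeholder style and column types are assumptions for the demo only:

```python
import sqlite3

UPSERT_SQL = """
INSERT INTO ext_resource_a (user_email, resource_id, resource_name, status, updated_at)
VALUES (?, ?, ?, ?, ?)
ON CONFLICT (user_email, resource_id)
DO UPDATE SET
    resource_name = excluded.resource_name,
    status = excluded.status,
    updated_at = excluded.updated_at
"""


def make_demo_db():
    """Create an in-memory table keyed the same way as the upsert target."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ext_resource_a ("
        " user_email TEXT NOT NULL,"
        " resource_id TEXT NOT NULL,"
        " resource_name TEXT,"
        " status TEXT,"
        " updated_at TEXT,"
        " PRIMARY KEY (user_email, resource_id))"
    )
    return conn


def apply_record(conn, row):
    """Apply one projected row; calling it again with the same row changes nothing."""
    conn.execute(UPSERT_SQL, row)
    conn.commit()
```

Applying the same row twice leaves exactly one row in the table, which is the property the stream processor relies on when a batch is redelivered.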

2. Record-level routing

# pseudo-code
for rec in event.records:
    table = detect_source_table(rec)
    if rec.event_name in ["INSERT", "MODIFY"]:
        row = map_new_image_to_row(table, rec.new_image)
        upsert_postgres(table, row)
    elif rec.event_name == "REMOVE":
        soft_delete_or_mark_inactive(table, rec.keys)
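The same routing can be sketched against the event shape Lambda delivers for DynamoDB streams (`Records`, `eventName`, `eventSourceARN`, `dynamodb.NewImage`, `dynamodb.Keys`). The `upsert` and `soft_delete` callables are hypothetical hooks, the attribute converter handles scalar types only, and `NewImage` is present only when the stream view type includes new images:

```python
from decimal import Decimal


def table_name_from_arn(arn: str) -> str:
    """Extract <name> from arn:aws:dynamodb:<region>:<account>:table/<name>/stream/<label>."""
    return arn.split(":table/", 1)[1].split("/", 1)[0]


def plain_value(attr: dict):
    """Convert a DynamoDB attribute value such as {"S": "x"} to a Python value."""
    type_tag, value = next(iter(attr.items()))
    if type_tag == "S":
        return value
    if type_tag == "N":
        return Decimal(value)  # numbers arrive as strings
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    raise ValueError(f"unsupported attribute type: {type_tag}")


def route_records(event, upsert, soft_delete):
    """Dispatch each stream record to the upsert or soft-delete hook."""
    counts = {"upserted": 0, "deleted": 0, "ignored": 0}
    for record in event.get("Records", []):
        table = table_name_from_arn(record["eventSourceARN"])
        change = record["dynamodb"]
        if record["eventName"] in ("INSERT", "MODIFY"):
            row = {k: plain_value(v) for k, v in change["NewImage"].items()}
            upsert(table, row)
            counts["upserted"] += 1
        elif record["eventName"] == "REMOVE":
            keys = {k: plain_value(v) for k, v in change["Keys"].items()}
            soft_delete(table, keys)
            counts["deleted"] += 1
        else:
            counts["ignored"] += 1
    return counts
```

Deriving the table name from the stream ARN lets one function serve several source tables without extra configuration.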

3. Preserve source truth semantics

Not every delete should be a physical delete in Postgres. It is often better to:

Keep row
Mark status = inactive
Track synced_at and source_updated_at

This improves auditability and historical reporting.
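A soft delete along those lines might look like the sketch below. It again uses in-memory SQLite as a stand-in for Postgres, and the table layout and the `mark_inactive` helper are illustrative assumptions:

```python
import sqlite3


def make_soft_delete_db():
    """Create a demo table that carries status and sync bookkeeping columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ext_resource_a ("
        " user_email TEXT NOT NULL,"
        " resource_id TEXT NOT NULL,"
        " status TEXT NOT NULL,"
        " source_updated_at TEXT,"
        " synced_at TEXT,"
        " PRIMARY KEY (user_email, resource_id))"
    )
    return conn


def mark_inactive(conn, user_email, resource_id, synced_at):
    """Keep the row, flip its status, and record when the change was synced.

    Returns the number of rows affected (0 when the row was never projected).
    """
    cursor = conn.execute(
        "UPDATE ext_resource_a SET status = 'inactive', synced_at = ?"
        " WHERE user_email = ? AND resource_id = ?",
        (synced_at, user_email, resource_id),
    )
    conn.commit()
    return cursor.rowcount
```

Because the update targets a fixed end state, replaying the same REMOVE record is harmless.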

4. Backpressure and failure handling

For production, configure:

Batch size tuned for row payload
Retries + DLQ (or failure destination)
Per-table metrics for lag and failure counts

# pseudo-SAM fragment
EventSourceMapping:
  Type: DynamoDB
  Properties:
    StartingPosition: LATEST
    BatchSize: 100
    MaximumRetryAttempts: 3
    BisectBatchOnFunctionError: true
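For the per-table lag metric, one option is to derive it from the `ApproximateCreationDateTime` field (epoch seconds) that each stream record carries in the Lambda event. The sketch below only computes the values; how they are published (CloudWatch, logs, or something else) is left open, and the function name is an assumption:

```python
import time
from collections import defaultdict


def lag_by_table(event, now=None):
    """Return the worst observed lag in seconds for each source table in a batch."""
    now = time.time() if now is None else now
    worst = defaultdict(float)
    for record in event.get("Records", []):
        # arn:aws:dynamodb:<region>:<account>:table/<name>/stream/<label>
        table = record["eventSourceARN"].split(":table/", 1)[1].split("/", 1)[0]
        created = record["dynamodb"]["ApproximateCreationDateTime"]
        worst[table] = max(worst[table], now - created)
    return dict(worst)
```

Tracking the maximum rather than the average makes a stuck shard visible even when most records are fresh.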

Security-by-design decisions (and why)

CSRF-safe OAuth state

A single-use, TTL-bound nonce stored in DynamoDB reduces callback forgery risk.
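One way to get single-use behavior is to delete the nonce on callback and ask DynamoDB to return the old item, so only the caller that actually removed it succeeds. This is a sketch, assuming a boto3-style Table object is passed in; the `state` and `expires_at` attribute names and the 10-minute TTL are hypothetical:

```python
import secrets
import time

STATE_TTL_SECONDS = 600  # assumed lifetime for a pending login


def issue_state(table, now=None):
    """Create a random state value and store it with an expiry timestamp."""
    now = int(time.time() if now is None else now)
    state = secrets.token_urlsafe(32)
    table.put_item(Item={"state": state, "expires_at": now + STATE_TTL_SECONDS})
    return state


def consume_state(table, state, now=None):
    """Return True only the first time an unexpired state is presented."""
    now = int(time.time() if now is None else now)
    # Deleting with ReturnValues="ALL_OLD" hands the item back only to the
    # request that removed it, which is what makes the nonce single-use.
    result = table.delete_item(Key={"state": state}, ReturnValues="ALL_OLD")
    item = result.get("Attributes")
    # DynamoDB TTL deletion is not immediate, so expiry is checked explicitly.
    return item is not None and int(item["expires_at"]) > now
```

An unknown, replayed, or expired state all produce the same `False`, so the callback handler has a single rejection path.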

Token isolation

Only the secret store contains token values. The metadata table stores "token exists" and consent timestamps.

Least privilege IAM

Each Lambda role should have only:

read/write specific DynamoDB tables it uses
limited SSM parameter path access
CloudWatch log permissions
network access only when needed (sync Lambda inside VPC for RDS)
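As an illustration, the policy for the auth Lambda could be generated from a small helper like the one below. The function name, the chosen action list, and the parameter prefix are assumptions for the sketch, and a role that decrypts SecureString values with a customer-managed KMS key would also need access to that key:

```python
def auth_lambda_policy(region, account_id, table_names, parameter_prefix, function_name):
    """Build an IAM policy document scoped to named tables, one SSM path, and one log group."""
    prefix = parameter_prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:PutItem",
                           "dynamodb:UpdateItem", "dynamodb:DeleteItem"],
                # Only the tables this function actually uses.
                "Resource": [f"arn:aws:dynamodb:{region}:{account_id}:table/{name}"
                             for name in table_names],
            },
            {
                "Effect": "Allow",
                "Action": ["ssm:GetParameter", "ssm:PutParameter"],
                # Limited to one parameter path instead of all parameters.
                "Resource": f"arn:aws:ssm:{region}:{account_id}:parameter/{prefix}/*",
            },
            {
                "Effect": "Allow",
                "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
                "Resource": (f"arn:aws:logs:{region}:{account_id}:"
                             f"log-group:/aws/lambda/{function_name}:*"),
            },
        ],
    }
```

Generating the document from explicit inputs makes it harder for a bare `*` resource to slip in during a rushed change.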

Logging hygiene

Never log:

authorization code
access token
refresh token
raw provider error objects that may contain sensitive context

Log instead:

operation outcome
provider endpoint class
masked subject identifiers
correlation ID
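A small helper can enforce both lists at the point where log lines are built. In this sketch the field names, the deny list, and the keyed-hash masking scheme are all assumptions; a keyed hash is used so the masked value cannot be reversed by simply hashing known email addresses:

```python
import hashlib
import hmac
import json

# Field names that must never reach the log, even if a caller passes them.
SENSITIVE_KEYS = {"code", "access_token", "refresh_token",
                  "authorization", "provider_error"}


def mask_subject(email: str, key: bytes) -> str:
    """Return a stable, non-reversible identifier for a user."""
    digest = hmac.new(key, email.strip().lower().encode("utf-8"), hashlib.sha256)
    return "sub_" + digest.hexdigest()[:12]


def safe_log_line(operation, outcome, endpoint_class, email, correlation_id,
                  key, **extra):
    """Build a JSON log line from allowed fields only."""
    payload = {k: v for k, v in extra.items() if k not in SENSITIVE_KEYS}
    # Fixed fields are written last so extras cannot overwrite them.
    payload.update({
        "operation": operation,
        "outcome": outcome,
        "endpoint_class": endpoint_class,
        "subject": mask_subject(email, key),
        "correlation_id": correlation_id,
    })
    return json.dumps(payload, sort_keys=True)
```

The same masked subject appears for every request from one user, so incidents can still be traced without exposing the address.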

Cost-aware architecture choices

The design intentionally kept fixed costs low:

Lambda for bursty orchestration
API Gateway for managed ingress
DynamoDB on-demand for uncertain traffic
SSM Parameter Store SecureString instead of a heavier secret system for this phase
30-day log retention to control CloudWatch growth

Rough POC economics can stay small (single-digit USD/month) when traffic is modest and retention is disciplined.

Sequence walkthrough (problem to resolution)

Figure: sequence walkthrough of the secure, event-driven OAuth onboarding serverless app on AWS

Small implementation snippets you can adapt

Sanitize identity for secret path keys

# pseudo-code
import re

def to_secret_path_segment(identity: str) -> str:
    # Replace anything outside letters, digits, ".", "_" and "-" with "_".
    return re.sub(r"[^A-Za-z0-9._-]", "_", identity)

Build the callback URI dynamically to avoid template coupling

# pseudo-code
def callback_uri_from_event(event):
    # The Lambda proxy event is a dict, so use key lookups, not attribute access.
    context = event["requestContext"]
    return f"https://{context['domainName']}/{context['stage']}/auth/callback"

Separate "active" and "skipped" discovered resources

# pseudo-code
active, skipped = discover_resources()
upsert_active(active)
upsert_skipped(skipped, reason_field="skip_reason")
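One possible shape for the discovery step, where a failure on a single resource is recorded instead of aborting the whole run. The `fetch_details` callable and the `id` field are hypothetical stand-ins for the provider calls:

```python
def discover_resources(raw_items, fetch_details):
    """Split provider items into usable records and skipped ones with a reason."""
    active, skipped = [], []
    for item in raw_items:
        try:
            active.append(fetch_details(item))
        except Exception as exc:
            # Store the exception type only; the message may carry
            # sensitive provider context.
            skipped.append({
                "resource_id": item["id"],
                "skip_reason": type(exc).__name__,
            })
    return active, skipped
```

Keeping the skipped list alongside the active one means a later retry or a support conversation can start from a concrete reason per resource.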

Keep a reconciliation watermark

-- pseudo-SQL
CREATE TABLE sync_checkpoint (
 pipeline_name text primary key,
 last_reconciled_at timestamptz not null
);
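The watermark can then bound what the nightly job has to compare. The sketch below works on already-loaded data; the `find_drift` name, the item fields, and the in-memory `target_index` (a mapping from key to the `updated_at` stored in Postgres) are assumptions for illustration:

```python
from datetime import datetime


def find_drift(source_items, target_index, last_reconciled_at: datetime,
               run_started_at: datetime):
    """Return keys that changed since the watermark but are missing or stale downstream.

    The second return value is the watermark to store after a successful run.
    """
    drift = []
    for item in source_items:
        if item["updated_at"] <= last_reconciled_at:
            continue  # already covered by an earlier reconciliation
        key = (item["user_email"], item["resource_id"])
        if target_index.get(key) != item["updated_at"]:
            drift.append(key)
    # Advance to the run's start time, not its end, so changes made while
    # the job was running are checked again next time.
    return drift, run_started_at
```

Rows reported as drift can be pushed through the same idempotent upsert used by the stream processor, so the repair path needs no separate write logic.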

Design trade-offs and what changed in architecture

What improved from v1 to v2

Onboarding became self-service instead of support-driven.
Token handling became boundary-safe and auditable.
Metadata became immediately queryable in DynamoDB.
Relational consumers received near real-time updates via streams.
Operational resilience improved with nightly reconciliation.

New complexity introduced (and accepted)

Stream processor deployment and monitoring.
VPC networking for Lambda-to-RDS connectivity.
Schema mapping ownership between NoSQL and SQL models.

These are acceptable because they buy reliability, lower manual effort, and a better consumer experience.

What this case study intentionally does not reveal

To protect privacy and commercial implementation details, this post excludes:

Real account names, tenants, domains, and identifiers
Production table names and environment values
End-to-end source code and full function implementations
Internal business workflows, SLAs, and organization-specific goals

That is not a weakness. It is a publishing discipline.

Final architecture summary

Final design in one sentence:

A serverless OAuth ingestion service writes secure secrets and normalized metadata, then projects metadata changes into Postgres through an idempotent event-driven stream processor, with scheduled reconciliation for correctness.

If you are designing a similar platform, the key patterns to remember are:

Keep secrets and metadata in separate trust boundaries.
Use event streams for freshness.
Add periodic reconciliation for confidence.
Design cost and security as first-class constraints, not post-launch patches.

Practical rollout checklist

Ship OAuth nonce validation and one-time consumption first.
Enforce token/metadata split before production traffic.
Add provider discovery with partial-failure handling (active vs skipped).
Enable streams on metadata tables.
Implement the idempotent Postgres upsert projection.
Add nightly reconciliation and drift metrics.
Lock down IAM and log redaction rules.
Track cost and lag dashboards from day one.

Closing note

Architecture maturity usually arrives in stages, not all at once. First, you remove manual pain. Then you harden security boundaries. Then you solve data movement with event-driven design. If you do those steps intentionally, you can stay both secure and cost-effective while your system grows.

Resources
OAuth 2.0 Authorization Framework (RFC 6749): datatracker
OAuth 2.0 Threat Model and Security Considerations (RFC 6819): datatracker
AWS Lambda Developer Guide: docs.aws.amazon.com/lambda
Amazon API Gateway Developer Guide: docs.aws.amazon.com/apigateway
Amazon DynamoDB Developer Guide: docs.aws.amazon.com/amazondynamodb
DynamoDB Streams and Lambda event source mappings: docs.aws.amazon.com/lambda
AWS Systems Manager Parameter Store (SecureString): docs.aws.amazon.com/systems-manager
AWS Well-Architected Framework (Security and Cost pillars): docs.aws.amazon.com/wellarchitected
Amazon RDS for PostgreSQL User Guide: docs.aws.amazon.com/AmazonRDS
PostgreSQL INSERT ... ON CONFLICT (UPSERT): postgresql/sql-insert
Python requests documentation: requests.readthedocs