AWS Lambda vs Cloudflare Workers for a latency-sensitive API gateway. 200M requests/month, P99 target <50ms, currently on ECS Fargate.

accepted_conditional · Pro · 553s · $0.75
7 branches explored · 3 survived · 3 rounds · integrity 75%
Confidence: 80% · Risk: unknown

Decision

Deploy Cloudflare Workers as the API gateway, using an edge-compute + origin-fetch pattern to the existing AWS backends. Workers V8 isolates have no cold starts (sub-1ms spin-up) and edge execution adds <5ms, leaving ~45ms of the P99 budget for the origin fetch; a US-East RTT of 10-25ms fits comfortably. Estimated cost is ~$500/month versus ~$6K/month on Fargate (roughly 92% savings). Lambda is disqualified: Provisioned Concurrency for 770 peak concurrent instances costs $11K-13K/month (40-60% over the $8K ceiling), and without it Lambda's P99 is ~800ms, 16x the target. Migrate via Cloudflare DNS with percentage-based traffic splitting (10% → 50% → 100% over 5 days), keeping Fargate as a fallback. Critical failure mode: if the gateway calls AWS VPC-internal services, a cross-cloud RTT of 15-30ms leaves <20ms for backend processing. Mitigate with Cloudflare Tunnel or public endpoints, and validate that backend processing time fits within the remaining budget before full cutover.
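As a quick sanity check, the latency budget above can be worked through numerically. This is a minimal sketch; the edge overhead and RTT ranges are the figures quoted in this verdict, not measurements:

```javascript
// P99 latency budget for the Workers-as-gateway path.
// All inputs come from the verdict above; replace them with measured values.
const P99_TARGET_MS = 50;
const EDGE_OVERHEAD_MS = 5; // isolate spin-up + edge execution (<5ms claimed)

// RTT ranges from the verdict: edge -> us-east-1 public endpoints,
// and edge -> VPC-internal services over a cross-cloud path.
const ORIGIN_RTT_MS = { best: 10, worst: 25 };
const CROSS_CLOUD_RTT_MS = { best: 15, worst: 30 };

// Time left for backend processing once edge overhead and RTT are spent.
function backendBudget(rttMs) {
  return P99_TARGET_MS - EDGE_OVERHEAD_MS - rttMs;
}

console.log(backendBudget(ORIGIN_RTT_MS.worst));      // 20 ms in the worst same-region case
console.log(backendBudget(CROSS_CLOUD_RTT_MS.worst)); // 15 ms on the worst cross-cloud path
```

The worst-case cross-cloud result is why VPC-internal dependencies are flagged as the critical failure mode: 15ms of backend budget is easy to exhaust.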

Next actions

Build a Cloudflare Workers PoC proxying 3-5 high-traffic API routes to the existing Fargate ALB using fetch() with standard Web API patterns
backend · immediate
Run synthetic load tests at 770 concurrent requests from multiple geographic regions, measuring P99 end-to-end latency including origin fetch to AWS us-east-1
backend · immediate
Audit all API gateway routes for AWS VPC-private service dependencies (DynamoDB, ElastiCache, SQS) and determine if Cloudflare Tunnel or public endpoints are needed
infra · immediate
Configure Cloudflare DNS percentage-based traffic splitting starting at 10%, with automated rollback to Fargate ALB if P99 exceeds 50ms
infra · before_launch
Set up unified observability pipeline (OpenTelemetry) spanning Workers analytics and CloudWatch to track P99 latency, error rates, and cost per request across both platforms during migration
infra · before_launch
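The first action above, a Workers PoC proxying a handful of routes to the Fargate ALB, could start from a sketch like this. The origin hostname and route list are hypothetical placeholders, and error handling is deliberately minimal:

```javascript
// Hypothetical values -- substitute the real ALB hostname and PoC routes.
const ORIGIN_HOST = "fargate-alb.example.com";
const PROXIED_ROUTES = ["/api/v1/users", "/api/v1/orders", "/api/v1/search"];

// Pure helper: rewrite an incoming URL to point at the origin,
// or return null when the route is not part of the PoC.
function toOriginUrl(requestUrl) {
  const url = new URL(requestUrl);
  if (!PROXIED_ROUTES.some((prefix) => url.pathname.startsWith(prefix))) return null;
  url.hostname = ORIGIN_HOST;
  return url.toString();
}

// Workers module handler; in an actual Worker this object would be
// the default export (export default worker).
const worker = {
  async fetch(request) {
    const originUrl = toOriginUrl(request.url);
    if (originUrl === null) {
      return new Response("route not in PoC", { status: 404 });
    }
    // Forward method, headers, and body unchanged by passing the
    // original Request as the init argument.
    return fetch(originUrl, request);
  },
};
```

Routing only a fixed allow-list of paths keeps the blast radius small, and the same routes can then be reused for the load tests and the VPC-dependency audit.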
This verdict stops being true when
Backend processing time for critical routes exceeds 20ms AND cross-cloud RTT is consistently >25ms, making the combined latency exceed 50ms P99 → Optimize the existing Fargate deployment (right-size tasks, enable connection reuse, add caching layer) rather than migrating to edge compute, or co-locate the gateway on Lambda within the same VPC
API gateway requires heavy compute (>50ms CPU time per request for auth/transformation) that exceeds Workers CPU limits or makes Workers Unbound pricing uncompetitive → Lambda with Provisioned Concurrency if budget ceiling can be raised, or optimized Fargate with reduced instance count
Organization mandates single-cloud AWS policy or compliance requirements prohibit request data transiting non-AWS infrastructure → Lambda@Edge + CloudFront for edge latency within AWS ecosystem, accepting higher cost

Council notes

Vulcan
Explore Cloudflare Workers for handling latency-sensitive API workload migration from ECS Fargate. Focus on cold star...
Socrates
Reframe the problem as 'eliminating the API gateway entirely' rather than 'choosing between Workers and Lambda'. Depl...
Daedalus
**Recommendation: Cloudflare Workers** for the API gateway migration, with a specific architectural pattern of edge-c...
Loki
All branches assume edge compute solves latency without considering operational overhead: Cloudflare Workers forces m...

Assumptions

  • The API gateway primarily performs routing, auth, and lightweight transformation — not compute-heavy processing that would exhaust Workers CPU limits
  • Backend services are accessible via public endpoints or can be exposed via Cloudflare Tunnel without prohibitive latency
  • Traffic is predominantly US-centric, making the 10-25ms RTT estimate to us-east-1 representative
  • The $8K/month cost ceiling is a hard constraint that disqualifies Lambda Provisioned Concurrency
  • Current ECS Fargate can remain operational as a fallback during the 5-day migration window

Operational signals to watch

reversal — Backend processing time for critical routes exceeds 20ms AND cross-cloud RTT is consistently >25ms, making the combined latency exceed 50ms P99
reversal — API gateway requires heavy compute (>50ms CPU time per request for auth/transformation) that exceeds Workers CPU limits or makes Workers Unbound pricing uncompetitive
reversal — Organization mandates single-cloud AWS policy or compliance requirements prohibit request data transiting non-AWS infrastructure

Unresolved uncertainty

  • Actual backend processing time for AWS-internal services is unknown — if complex queries take >20ms, the 50ms P99 budget may be violated when combined with 15-30ms cross-cloud RTT
  • Whether the API gateway requires access to AWS VPC-private services (DynamoDB, ElastiCache, SQS) is unstated — this determines whether Cloudflare Tunnel or public endpoints are needed, adding latency and complexity
  • The $500/month Workers cost estimate depends on CPU duration staying low; compute-heavy gateway logic (auth, transformation, validation) could push Workers Unbound costs higher
  • Geographic distribution of users is unknown — the 10-25ms RTT estimate assumes US-centric traffic hitting US edge nodes; global traffic patterns could differ
  • Cost numbers for Lambda Provisioned Concurrency are model-estimated, not sourced from AWS pricing calculator with the specific runtime/memory configuration
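The cost-estimate uncertainty above can be made concrete with a small parameterised model. The rates below are placeholder assumptions, not quoted Cloudflare prices; fill them in from the current pricing page and from measured per-request CPU time before trusting any output:

```javascript
// Parameterised monthly cost model for a request-plus-CPU-time billing
// scheme. Every rate here is a placeholder assumption, not a quoted price.
function workersMonthlyCost({ requestsM, cpuMsPerReq, ratePerMReq, ratePerMCpuMs, baseFee }) {
  const requestCost = requestsM * ratePerMReq;             // per-request component
  const cpuCost = requestsM * cpuMsPerReq * ratePerMCpuMs; // CPU-duration component
  return baseFee + requestCost + cpuCost;
}

// 200M req/month comes from the prompt; everything else is assumed.
const estimate = workersMonthlyCost({
  requestsM: 200,
  cpuMsPerReq: 5,      // assumed light routing/auth workload
  ratePerMReq: 0.3,    // placeholder $/million requests
  ratePerMCpuMs: 0.02, // placeholder $/million CPU-ms
  baseFee: 5,          // placeholder platform fee
});
console.log(estimate);
```

Because the CPU term scales linearly with per-request CPU time, doubling `cpuMsPerReq` doubles that component, which is exactly the sensitivity this uncertainty note flags.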

Branch battle map

Battle timeline (3 rounds)
Round 1 — Initial positions · 3 branches
Branch b001 (Vulcan) eliminated — Branch b001 is structurally hollow — it says 'Explore C...
Socrates proposed branch b004
Loki proposed branch b005
Socrates Reframe the problem as optimizing the entire API gateway architecture for latenc…
Loki All branches assume edge compute solves latency without considering operational …
Round 2 — Adversarial probes · 3 branches
Branch b004 (Socrates) eliminated — auto-pruned: unsupported low-confidence branch
Branch b005 (Loki) eliminated — The proposal for AWS API Gateway + Lambda@Edge + Global A...
Socrates proposed branch b006
Socrates Reframe the problem around geographic distribution of users and backend services…
Round 3 — Final convergence · 3 branches
Branch b002 (Socrates) eliminated — The hybrid Cloudflare Workers + Lambda architecture in b0...
Socrates proposed branch b007
Socrates Reframe the problem as 'eliminating the API gateway entirely' rather than 'choos…