API Gateway: The Complete Guide - AWS vs Azure, Real-World Examples, and What Actually Breaks

Everything you need to know about API Gateway. AWS HTTP API vs REST API vs WebSocket API. Azure API Management policies. JWT authorisers, rate limiting, CORS, versioning, and the mistakes every team makes in production.

SystemForge · April 21, 2026 · 19 min read
Tags: API Gateway, AWS, Azure, REST API, WebSocket, JWT, Rate Limiting, CORS, Lambda, Azure API Management, System Design, Interview Prep

What Problem Does API Gateway Solve?

Without a gateway, every client talks directly to every backend service:

Mobile App ──────────────── Lambda: bookAppointment
Mobile App ──────────────── Lambda: getCallLogs
Mobile App ──────────────── Lambda: verifyInsurance
Web App    ──────────────── Lambda: bookAppointment
Web App    ──────────────── Lambda: getCallLogs
3rd Party  ──────────────── Lambda: getCallLogs

Every service now handles its own authentication, rate limiting, logging, and CORS. You duplicate the same code in 20 Lambda functions. One misconfigured function leaks internal errors. A bad client overwhelms one Lambda and takes down the others. Clients know internal service URLs, so changing them breaks integrations.

With an API Gateway:

Mobile App ─┐
Web App    ─┼──── API Gateway ──── Lambda: bookAppointment
3rd Party  ─┘         │
                      ├────────── Lambda: getCallLogs
                      └────────── Lambda: verifyInsurance

One front door handles everything:
  Auth · Rate Limiting · Logging · CORS · Routing · Versioning · Caching

Real-world analogy: The reception desk at a hospital. Patients (clients) do not wander directly into operating theatres, pharmacies, and wards. They go to reception, show identification (authentication), state their purpose (routing), and are directed accordingly. Reception keeps a log of every visitor (logging) and caps how many people are admitted at once (rate limiting). The clinical staff behind reception focus entirely on patient care, not on checking IDs.


The Pipeline: What Happens to Every Request

Understanding the pipeline is the difference between debugging API Gateway in 5 minutes versus 5 hours. Every request flows through these stages in order:

Client Request
      ↓
① METHOD + ROUTE MATCHING
  "Does POST /appointments/{id}/cancel exist?"
  No → 404 immediately. Lambda never invoked.
      ↓
② AUTHORISER
  Validates JWT or calls Lambda Authoriser
  Invalid token → 401. Lambda never invoked.
      ↓
③ REQUEST VALIDATION  (REST API only)
  Required headers present? Body matches schema?
  Invalid → 400. Lambda never invoked.
      ↓
④ REQUEST TRANSFORMATION  (REST API only)
  Reshape the incoming JSON, inject context variables
      ↓
⑤ LAMBDA INVOCATION
  Lambda receives event, processes, returns response
      ↓
⑥ RESPONSE TRANSFORMATION  (REST API only)
  Reshape Lambda output to client-expected format
      ↓
⑦ RESPONSE TO CLIENT
  + CORS headers added
  + CloudWatch log entry written
  + X-Ray trace segment closed

The key insight: Lambda is only called if the request passes steps ① to ④. Bad tokens, malformed bodies, and unknown routes never reach your code. This is both a cost optimisation and a security property: requests that fail these checks never reach your business logic.
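
To make that ordering concrete, here is a minimal sketch that models the stages as plain Python checks. The route table, `fake_validate_token`, and the handler are hypothetical stand-ins rather than API Gateway internals; the only point is that the handler runs after every earlier stage has passed.

```python
import json

invocations = []  # records every time the "Lambda" actually runs


def book_appointment(body):
    invocations.append(body)
    return {"statusCode": 201, "body": json.dumps({"booked": body["slot"]})}


# Hypothetical route table: (method, path) -> handler
ROUTES = {("POST", "/appointments"): book_appointment}


def fake_validate_token(token):
    # Stand-in for JWT verification (a real gateway checks signature, issuer, expiry)
    return token == "valid-token"


def gateway(method, path, token, raw_body):
    handler = ROUTES.get((method, path))
    if handler is None:                      # stage 1: route matching
        return {"statusCode": 404}
    if not fake_validate_token(token):       # stage 2: authoriser
        return {"statusCode": 401}
    try:                                     # stage 3: request validation
        body = json.loads(raw_body)
        if not isinstance(body, dict) or "slot" not in body:
            raise ValueError("missing slot")
    except ValueError:
        return {"statusCode": 400}
    return handler(body)                     # stage 5: backend invocation
```

Rejected requests return early, so `invocations` only grows when a request clears routing, auth, and validation.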


AWS API Gateway: Three Types You Must Know

AWS has three completely different products with similar names. Choosing the wrong one causes pain:


Type 1: HTTP API - Use This 90% of the Time

Cost: $1.00 per million requests
Latency overhead: ~1ms
What it does: Simple, fast, cheap. Proxies HTTP requests to Lambda or any HTTP endpoint with built-in JWT auth and CORS.

What it does NOT do: Request transformation, response caching, WAF integration, per-client rate limits.

Client → HTTPS → HTTP API → Lambda → Response
                    ↑
        JWT validation, CORS, routing, and nothing more

When to choose HTTP API:

  • Internal APIs where you control all clients
  • Mobile and web app backends
  • Standard CRUD REST endpoints
  • Any scenario where the missing features (caching, WAF, usage plans) are not needed

Real scenario (Healthcare Scheduling API):

POST https://api.mybcat.com/appointments        → book-appointment Lambda
GET  https://api.mybcat.com/appointments/{id}   → get-appointment Lambda
PUT  https://api.mybcat.com/appointments/{id}/cancel → cancel-appointment Lambda
GET  https://api.mybcat.com/practices/{id}/slots    → get-slots Lambda

Cognito JWT authoriser validates tokens on every request. The practice_id claim is passed as context to each Lambda. Full setup takes 30 minutes in Terraform. Total cost at 10 million requests/month: $10.
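
As a rough sketch of what "passed as context" looks like on the Lambda side: with an HTTP API JWT authoriser and payload format 2.0, the validated claims arrive under `requestContext.authorizer.jwt.claims`. The claim name `custom:practice_id` is an assumption here (Cognito prefixes custom attributes with `custom:`); adjust it to whatever your token actually carries.

```python
import json


def get_practice_id(event):
    """Read the practice ID claim that the JWT authoriser attached to the event."""
    claims = (
        event.get("requestContext", {})
        .get("authorizer", {})
        .get("jwt", {})
        .get("claims", {})
    )
    # Assumed claim name; not confirmed by the article
    return claims.get("custom:practice_id")


def handler(event, context):
    practice_id = get_practice_id(event)
    if practice_id is None:
        return {"statusCode": 403, "body": json.dumps({"error": "No practice claim"})}
    return {"statusCode": 200, "body": json.dumps({"practice_id": practice_id})}
```

The handler never touches the raw token; it only trusts claims the gateway has already verified.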


Type 2: REST API - Use This for Enterprise Features

Cost: $3.50 per million requests (3.5× more expensive than HTTP API)
Latency overhead: ~6ms
Extra features: WAF, response caching, usage plans, API keys, request/response transformation, mock responses, custom domain features

When to choose REST API:

  • Public APIs where external developers or third parties connect
  • Per-client rate limiting (each client gets their own quota)
  • Response caching for stable data (reduce Lambda invocations)
  • WAF integration (block SQL injection, XSS, geographic restrictions)
  • Request/response transformation without modifying backend code

Usage Plans: The Feature That Justifies the Cost

Usage plans let you assign different rate limits and quotas to different API keys:

Practice A (Small Practice):
  API Key: key_p001_abc123
  Quota:   1,000 requests/day
  Throttle: 50 requests/second burst

Practice B (Large Practice):
  API Key: key_p002_xyz789
  Quota:   50,000 requests/day
  Throttle: 200 requests/second burst

Practice C (Enterprise):
  API Key: key_p003_enterprise
  Quota:   Unlimited
  Throttle: 500 requests/second burst

Result:
  Practice A exceeds 1,000 → 429 Too Many Requests
  Practice B is completely unaffected
  Your Lambda never sees the excess traffic from Practice A

Without usage plans, one misbehaving integration from one practice exhausts your Lambda concurrency and brings down the entire platform.
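
The isolation property is easier to see in code. The following is a simplified model of per-key daily quotas, not how API Gateway implements usage plans; the `QuotaTracker` class is hypothetical and the quota table simply mirrors the three practices above.

```python
from collections import defaultdict

# Quotas from the example above; None means unlimited
DAILY_QUOTA = {
    "key_p001_abc123": 1000,
    "key_p002_xyz789": 50000,
    "key_p003_enterprise": None,
}


class QuotaTracker:
    def __init__(self, quotas):
        self.quotas = quotas
        self.used = defaultdict(int)  # api_key -> requests counted today

    def check(self, api_key):
        """Return the HTTP status a gateway would answer with for this call."""
        if api_key not in self.quotas:
            return 403  # unknown key
        quota = self.quotas[api_key]
        if quota is not None and self.used[api_key] >= quota:
            return 429  # quota exhausted for this key only
        self.used[api_key] += 1
        return 200
```

Each key has its own counter, so exhausting one practice's quota has no effect on the others.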

Response Caching - Reduce Lambda Costs by 90%

GET /practices/p001/insurance-plans
  → First request in 5 minutes: Lambda called, result cached
  → Next 300 requests in 5 minutes: served from API Gateway cache
  → Lambda invocations: 1 instead of 301
  → 99.7% reduction in Lambda calls for stable reference data

Cache key includes:
  - The route path
  - Any query parameters you specify
  - Any headers you specify (e.g., practice_id)

Insurance plan lists change once a week. Serving them from a 5-minute cache is invisible to users but eliminates unnecessary Lambda invocations entirely.
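
The "1 instead of 301" arithmetic can be reproduced with a small TTL cache. This is an illustrative sketch, assuming a cache keyed on path plus opted-in query parameters and headers; `TtlCache` and `cache_key` are hypothetical names, and the injectable clock exists only so the behaviour can be checked without waiting five minutes.

```python
import time


class TtlCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # cache key -> (expires_at, value)

    def get_or_call(self, key, backend):
        now = self.clock()
        hit = self.store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]              # served from cache, backend not called
        value = backend()              # cache miss or expired entry
        self.store[key] = (now + self.ttl, value)
        return value


def cache_key(path, query=None, headers=None):
    # Path plus whichever query parameters and headers were opted in,
    # sorted so that ordering differences do not create separate entries
    return (
        path,
        tuple(sorted((query or {}).items())),
        tuple(sorted((headers or {}).items())),
    )
```

With a 300-second TTL, 301 requests inside the window call the backend once; the first request after expiry triggers a second call.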


Type 3: WebSocket API - Real-Time Bidirectional Communication

What it does: Maintains a persistent open connection between client and server. The server can push messages to the client at any time, with no polling required.

When to use it:

  • Live call status dashboards: agent picks up a call → dashboard updates instantly
  • Real-time notifications: appointment booked → practice manager sees it immediately
  • Live device telemetry: medical device status changes → monitoring screen reflects it

How It Works

Client browser opens connection:
  wss://realtime.mybcat.com/

API Gateway:
  1. Assigns a unique connectionId: "Abc123XYZ"
  2. Calls your $connect Lambda, which stores connectionId in DynamoDB
  3. Keeps the connection open
  4. When client sends a message → routes to Lambda based on action field
  5. When client disconnects → calls $disconnect Lambda to clean up DynamoDB

When a backend event happens (call ended):
  Lambda fetches all connectionIds for practiceId "p001" from DynamoDB
  Posts message to each connection via API Gateway Management API
  Client receives message instantly; no request was made
Python
import json

import boto3

# Backend pushes a message to all connected dashboards for a practice.
# get_active_connections() and delete_connection() are DynamoDB helpers defined elsewhere.
def notify_practice_dashboards(practice_id: str, event: dict):
    apigw = boto3.client(
        'apigatewaymanagementapi',
        endpoint_url='https://abc123.execute-api.us-east-1.amazonaws.com/prod'
    )

    connections = get_active_connections(practice_id)  # DynamoDB query

    for conn_id in connections:
        try:
            apigw.post_to_connection(
                ConnectionId=conn_id,
                Data=json.dumps(event).encode('utf-8')
            )
        except apigw.exceptions.GoneException:
            # Browser was closed, so remove the stale connection
            delete_connection(conn_id)

The client does not need to refresh or poll. The moment a call ends, every open dashboard for that practice sees the update.
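
The other half of that flow is the bookkeeping done on $connect and $disconnect. The sketch below uses an in-memory dict as a stand-in for the DynamoDB table, and it assumes the client passes `practiceId` as a query string parameter when connecting; both are illustration-only choices, and a real system would more likely derive the practice from the authenticated identity.

```python
# In-memory stand-in for the DynamoDB connections table (assumption for this sketch)
CONNECTIONS = {}  # connectionId -> practiceId


def handler(event, context):
    ctx = event["requestContext"]
    conn_id = ctx["connectionId"]
    route = ctx["routeKey"]

    if route == "$connect":
        params = event.get("queryStringParameters") or {}
        practice_id = params.get("practiceId")  # hypothetical parameter name
        if not practice_id:
            return {"statusCode": 400}
        CONNECTIONS[conn_id] = practice_id
    elif route == "$disconnect":
        CONNECTIONS.pop(conn_id, None)  # tolerate unknown or already-removed IDs
    return {"statusCode": 200}


def get_active_connections(practice_id):
    """Return every connectionId currently registered for a practice."""
    return [c for c, p in CONNECTIONS.items() if p == practice_id]
```

One handler serves both routes here for brevity; separate Lambdas per route work the same way.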


JWT Authoriser: Deep Dive

Authentication is the most important thing an API Gateway does. Understanding JWT authorisers separates senior engineers from junior ones.

Native JWT Authoriser (HTTP API)

The simplest form. You tell API Gateway where to find the token and which Cognito User Pool (or any OIDC provider) issued it. API Gateway validates the signature, issuer, audience, and expiry itself, so no Lambda is needed. REST APIs do not offer this authoriser type; the closest equivalent there is the Cognito user pool authoriser.

HCL
resource "aws_apigatewayv2_authorizer" "cognito" {
  api_id           = aws_apigatewayv2_api.main.id
  authorizer_type  = "JWT"
  identity_sources = ["$request.header.Authorization"]
  name             = "cognito-jwt"

  jwt_configuration {
    audience = [aws_cognito_user_pool_client.app.id]
    issuer   = "https://cognito-idp.${var.region}.amazonaws.com/${aws_cognito_user_pool.main.id}"
  }
}

API Gateway fetches the issuer's public keys and caches them, refreshing the cache periodically. Most requests are therefore verified locally, with no external call and negligible added latency.

Lambda Authoriser - When You Need Custom Logic

Use when JWT validation alone is not enough. Examples: check if a user's subscription is active, validate a custom API key format, enforce business rules before routing.

Python
import jwt  # PyJWT

# Lambda Authoriser: custom auth logic.
# PUBLIC_KEY and is_subscription_active() are assumed to be defined elsewhere.
def handler(event, context):
    token = event['authorizationToken'].replace('Bearer ', '')

    try:
        payload = jwt.decode(token, PUBLIC_KEY, algorithms=['RS256'])
        practice_id = payload['custom:practice_id']
        role = payload['custom:role']
    except (jwt.InvalidTokenError, KeyError):
        # Expired, malformed, badly signed, or missing required claims
        raise Exception('Unauthorized')  # API Gateway returns 401

    # Custom check: is this practice's subscription active?
    # Raising 'Unauthorized' always yields 401; a Deny policy is what yields 403.
    effect = 'Allow' if is_subscription_active(practice_id) else 'Deny'

    # Return the policy with context, which is passed to the downstream Lambda
    return build_policy(
        principal_id=payload['sub'],
        effect=effect,
        resource=event['methodArn'],
        context={
            'practice_id': practice_id,
            'role': role
        }
    )


def build_policy(principal_id, effect, resource, context):
    return {
        'principalId': principal_id,
        'policyDocument': {
            'Version': '2012-10-17',
            'Statement': [{'Action': 'execute-api:Invoke', 'Effect': effect, 'Resource': resource}]
        },
        'context': context
    }

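Authoriser caching: API Gateway caches the returned policy, keyed by token value, for a configurable TTL (300 seconds by default, up to 1 hour). The authoriser Lambda then runs roughly once per distinct token per TTL window rather than once per request, so its invocation count tracks the number of active tokens, not the request rate. Your downstream Lambda receives the context fields (practice_id, role) directly and never needs to parse a JWT.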
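
One consequence of caching by token is worth sketching. If the cached policy names the exact `methodArn` of the first request, a later request with the same token to a different route can be evaluated against a policy that does not cover it. A common workaround is to widen the resource to the whole stage. The helper below is a hypothetical example of that, assuming the usual `arn:aws:execute-api:region:account:apiId/stage/METHOD/path` layout.

```python
def wildcard_resource(method_arn):
    """Turn a single-method ARN into one covering every method and path on the stage."""
    parts = method_arn.split(":", 5)
    if len(parts) != 6:
        raise ValueError(f"unexpected ARN format: {method_arn}")
    # The last ARN field looks like apiId/stage/METHOD/resource/path
    api_id, stage = parts[5].split("/")[:2]
    return ":".join(parts[:5]) + f":{api_id}/{stage}/*/*"
```

Whether a stage-wide Allow is acceptable depends on your authorisation model; if different routes need different permissions, build a policy with one statement per permitted route instead.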


CORS - Why It Breaks and How to Fix It

CORS (Cross-Origin Resource Sharing) is the single most common API Gateway pain point. Understanding it fully makes you the person who fixes it instead of the person who spends 4 hours on Stack Overflow.

What CORS Actually Is

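A browser security rule. JavaScript running on app.mybcat.com is not allowed to read responses from api.mybcat.com by default, because it is a different origin (scheme, host, and port must all match).

For anything beyond a simple GET or form-style POST, such as a request carrying an Authorization header or a JSON body, the browser sends a preflight OPTIONS request first to ask: "Is this cross-origin call allowed?" The real request is not sent until the preflight succeeds.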

Step 1: Browser sends preflight (before the real request)
  OPTIONS https://api.mybcat.com/appointments
  Origin: https://app.mybcat.com
  Access-Control-Request-Method: POST
  Access-Control-Request-Headers: Authorization, Content-Type

Step 2: API Gateway must respond with permission
  HTTP 200
  Access-Control-Allow-Origin: https://app.mybcat.com
  Access-Control-Allow-Methods: GET, POST, PUT, DELETE, OPTIONS
  Access-Control-Allow-Headers: Authorization, Content-Type
  Access-Control-Max-Age: 3600

Step 3: Browser sends the real request
  POST https://api.mybcat.com/appointments
  (with JWT, with body; now allowed)

If step 2 is missing or wrong, the browser blocks the real request. You see a CORS error in the browser console. Postman still works fine (Postman does not enforce CORS; it is a browser-only rule). This is why "it works in Postman but not in my app" is almost always a CORS problem.

HCL
# HTTP API: CORS in Terraform (one block, handles everything)
cors_configuration {
  allow_origins = ["https://app.mybcat.com"]
  allow_methods = ["GET", "POST", "PUT", "DELETE", "OPTIONS"]
  allow_headers = ["Authorization", "Content-Type", "X-Practice-Id"]
  max_age       = 3600
}

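For REST API you must manually add an OPTIONS method with a MOCK integration to every resource (and, with Lambda proxy integrations, return the Access-Control-Allow-Origin header from the Lambda itself). Forgetting one route = CORS error for that specific endpoint only. This is one of the most common sources of REST API CORS bugs.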
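
For REST APIs with Lambda proxy integration, the Lambda itself often ends up producing the CORS headers. The sketch below shows one way to do that with an explicit origin allowlist; the allowlist contents and the choice of 204/403 for preflight are assumptions for illustration, not the only valid behaviour.

```python
ALLOWED_ORIGINS = {"https://app.mybcat.com"}


def cors_headers(origin):
    """Return CORS response headers for an allowed origin, or none at all."""
    if origin not in ALLOWED_ORIGINS:
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET, POST, PUT, DELETE, OPTIONS",
        "Access-Control-Allow-Headers": "Authorization, Content-Type",
        "Access-Control-Max-Age": "3600",
        "Vary": "Origin",  # response differs per origin, so caches must key on it
    }


def handler(event, context):
    headers = event.get("headers") or {}
    # Header names may arrive in any case; normalise before lookup
    origin = {k.lower(): v for k, v in headers.items()}.get("origin")
    cors = cors_headers(origin)
    if event.get("httpMethod") == "OPTIONS":
        # Preflight: answer with permission headers only, no business logic
        return {"statusCode": 204 if cors else 403, "headers": cors, "body": ""}
    return {"statusCode": 200, "headers": cors, "body": "{}"}
```

The same headers go on the real response as well as the preflight; a missing `Access-Control-Allow-Origin` on either one produces the browser error.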


Rate Limiting: Protecting Your Platform

The Problem Without Rate Limiting

9am Monday: all 30 practices open simultaneously
Each practice has 10 agents booking appointments
Each agent clicks rapidly due to slow UI
= 300 agents × 10 clicks = 3,000 Lambda invocations/second

AWS Lambda account limit: 1,000 concurrent executions (default)
Result: about 2,000 requests per second get throttled
        (assuming each invocation runs for roughly one second)
        Error responses flood practice UIs
        Agents book the same slot twice (race condition)
        Practice manager calls complaining

The Three Layers of Rate Limiting

Layer 1: WAF (before API Gateway)

Block at the edge before traffic even reaches API Gateway:
  Rule: IP sends > 1,000 requests in 5 minutes → block for 1 hour
  Rule: Request body contains SQL injection pattern → block + log
  Rule: User-Agent is known bot/scraper → block
  Rule: Request origin is not US → block (MyBCAT serves US-only)

Cost: ~$5/month for WAF + $0.60/million requests evaluated
Benefit: Malicious traffic never consumes Lambda concurrency

Layer 2: API Gateway throttling

Account-level:  10,000 requests/second (default, can be raised)
Stage-level:    500 requests/second per route
Usage plan:     Per-API-key quotas (REST API only)

If a single practice's integration goes into a loop:
  Their API key's quota is exhausted → 429 only for them
  All other practices are unaffected
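
API Gateway documents its throttling as a token bucket, where the rate is how fast tokens refill and the burst is the bucket size. The class below is a generic token-bucket sketch to show how those two numbers interact; it is not API Gateway's implementation, and time is passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    def __init__(self, rate, burst, now=0.0):
        self.rate = rate            # tokens added per second (steady-state limit)
        self.burst = burst          # bucket capacity (largest instantaneous burst)
        self.tokens = float(burst)  # start full
        self.last = now

    def allow(self, now):
        """Return True if a request arriving at time `now` is admitted."""
        # Refill for the elapsed time, capped at capacity
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # a gateway would answer 429 here
```

With `rate=50, burst=100`, a client can fire 100 requests at once, after which it is held to 50 per second until it backs off and the bucket refills.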

Layer 3: Lambda reserved concurrency

Booking Lambda:          reserved = 200
Post-call workflow:      reserved = 100
Analytics/reporting:     reserved = 20
Background jobs:         reserved = 10
                         ─────────────
Total reserved:          330
Remaining for bursts:    670 (shared pool)

Booking Lambda always has 200 slots available regardless of
what other Lambdas are doing. Critical path is protected.
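
The budget above is simple arithmetic, but it is worth checking in code before applying it, because Lambda requires at least 100 units of concurrency to stay unreserved. A small sketch, using the 1,000 default account limit from the example (your account's limit may differ):

```python
ACCOUNT_LIMIT = 1000   # default regional concurrency limit used in the example
MIN_UNRESERVED = 100   # Lambda keeps at least this much in the shared pool

RESERVED = {
    "booking": 200,
    "post_call_workflow": 100,
    "analytics_reporting": 20,
    "background_jobs": 10,
}


def concurrency_budget(reserved, account_limit=ACCOUNT_LIMIT):
    """Return (total_reserved, shared_pool) or raise if the plan over-reserves."""
    total = sum(reserved.values())
    shared = account_limit - total
    if shared < MIN_UNRESERVED:
        raise ValueError(
            f"only {shared} left unreserved; at least {MIN_UNRESERVED} required"
        )
    return total, shared
```

For the allocation above this returns 330 reserved and 670 in the shared pool.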

API Versioning: The Problem That Bites You Later

Every API evolves. The fields you return today are not the fields you return in a year. Handling this without breaking existing integrations is a senior-level concern.

Strategy 1: Version in the URL Path

/v1/appointments → original Lambda (maintained for backward compat)
/v2/appointments → new Lambda (breaking change in response format)

Clients choose when to migrate. You support both until v1 is sunset.

API Gateway routing:
  GET /v1/* → appointment-handler-v1:stable alias
  GET /v2/* → appointment-handler-v2:stable alias

Deprecation process:
  1. Add response header: Deprecation: true, Sunset: 2026-10-01
  2. Log which API keys are still calling v1 (CloudWatch)
  3. Contact practices still on v1 before sunset date
  4. On sunset date: v1 returns 410 Gone with migration guide URL
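
Steps 1 and 4 of that process can live in a small wrapper around the v1 handler. In this sketch the migration guide URL is a placeholder, and the Sunset value is written as an HTTP-date, which is the format RFC 8594 specifies for that header (the shorthand date in the list above would need converting).

```python
import json
from datetime import datetime, timezone
from email.utils import format_datetime

SUNSET = datetime(2026, 10, 1, tzinfo=timezone.utc)
MIGRATION_GUIDE = "https://example.com/docs/migrate-to-v2"  # hypothetical URL


def v1_response(body, now=None):
    """Wrap a v1 payload with deprecation headers, or return 410 after sunset."""
    now = now or datetime.now(timezone.utc)
    if now >= SUNSET:
        return {
            "statusCode": 410,
            "body": json.dumps(
                {"error": "v1 has been retired", "migration_guide": MIGRATION_GUIDE}
            ),
        }
    return {
        "statusCode": 200,
        "headers": {
            "Deprecation": "true",
            "Sunset": format_datetime(SUNSET, usegmt=True),
        },
        "body": json.dumps(body),
    }
```

Passing `now` explicitly makes the cut-over testable ahead of the real sunset date.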

Strategy 2: API Gateway Stage + Lambda Alias

API Gateway Stage: v1 → Lambda alias: v1 (points to version 3 of Lambda)
API Gateway Stage: v2 → Lambda alias: v2 (points to version 7 of Lambda)

Benefit: you can run different versions of the same Lambda
         behind different stages without duplicate code

Strategy 3: Canary Deployments (REST API)

Production: 100% traffic → Lambda:v5
Deploy v6:
  API Gateway canary: 10% → Lambda:v6, 90% → Lambda:v5
  Monitor error rate and latency for 30 minutes
  If healthy: promote v6 to 100%
  If errors spike: rollback by removing canary, with zero downtime
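
The two decisions in a canary rollout, which version serves a request and when to roll back, can be sketched as below. This is a toy model rather than API Gateway's canary mechanism; the `tolerance` multiplier and the comparison against a baseline error rate are assumptions chosen for illustration.

```python
import random


def pick_version(canary_percent, rng):
    """Route roughly canary_percent of requests to the new version."""
    return "v6" if rng.random() * 100 < canary_percent else "v5"


def should_rollback(canary_errors, canary_requests, baseline_error_rate, tolerance=2.0):
    """Roll back when the canary's error rate exceeds tolerance x the baseline."""
    if canary_requests == 0:
        return False  # no evidence yet
    return canary_errors / canary_requests > baseline_error_rate * tolerance
```

In practice the rollback check would run against CloudWatch metrics over the monitoring window rather than raw counters.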

Azure API Management: Where Azure Wins

Azure API Management is more powerful than AWS API Gateway for enterprise scenarios. Understanding the difference is important for teams working across both clouds.

The APIM Architecture

External Client
      ↓
  Gateway  ← handles every request (runtime)
      ↓
Policy Engine ← inbound → backend → outbound → error policies
      ↓
  Backend  ← Azure Function, App Service, Logic App, any HTTP endpoint
      ↑
Developer Portal ← auto-generated docs, interactive testing for external devs
Management API   ← configure APIs, policies, products, subscriptions

Policies: The Superpower Azure Has Over AWS

Policies are XML rules that run on every request and response. They are more powerful and readable than AWS's VTL mapping templates.

Policy: Rate limiting per custom claim

XML
<policies>
  <inbound>
    <validate-jwt header-name="Authorization" failed-validation-httpcode="401">
      <openid-config url="https://login.microsoftonline.com/{tenant}/.well-known/openid-configuration"/>
    </validate-jwt>

    <!-- Rate limit by practice ID extracted from JWT, not by IP -->
    <rate-limit-by-key
      calls="1000"
      renewal-period="60"
      counter-key="@(context.Request.Headers.GetValueOrDefault("Authorization")
                     .AsJwt()?.Claims["practice_id"].FirstOrDefault())" />
  </inbound>
</policies>

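Policy: Remove internal fields from responses

XML
<outbound>
  <!-- Backend returns full record; strip fields clients should not see -->
  <json-to-xml apply="always" consider-accept-header="false" />
  <xsl-transform>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <!-- Identity template: copy everything by default -->
      <xsl:template match="@*|node()">
        <xsl:copy><xsl:apply-templates select="@*|node()" /></xsl:copy>
      </xsl:template>
      <!-- Empty template: drop these elements (JSON fields become elements, not attributes) -->
      <xsl:template match="internalNotes|billingCode|auditTrail" />
    </xsl:stylesheet>
  </xsl-transform>
  <!-- Convert back so clients still receive JSON -->
  <xml-to-json kind="direct" apply="always" consider-accept-header="false" />
</outbound>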

Policy: Retry with backoff

XML
<backend>
  <!-- forward-request belongs in the backend section, not inbound -->
  <retry condition="@(context.Response.StatusCode >= 500)"
         count="3"
         interval="2"
         delta="1"
         max-interval="10"
         first-fast-retry="true">
    <forward-request buffer-request-body="true" />
  </retry>
</backend>

AWS API Gateway has no built-in retry logic. You implement retries in the Lambda itself or use the SQS retry mechanism. APIM policies handle it at the gateway layer, with no Lambda code needed.
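
On the AWS side, "implement retries in the Lambda itself" usually means a small wrapper like the one below. It is a generic exponential-backoff sketch, not a port of the APIM policy: the parameter names are hypothetical, it retries only on 5xx responses, and `sleep` is injectable so the delays can be verified without waiting.

```python
import time


def call_with_retry(call, retries=3, base_delay=2.0, max_delay=10.0, sleep=time.sleep):
    """Call `call()` until it returns a non-5xx status or retries are exhausted."""
    response = call()
    for attempt in range(retries):
        if response["statusCode"] < 500:
            break  # success or a client error; retrying will not help
        # Delay doubles each attempt, capped at max_delay
        sleep(min(base_delay * (2 ** attempt), max_delay))
        response = call()
    return response
```

Keep the total worst-case delay well inside the gateway's integration timeout, or the client will see a timeout before the last retry finishes. Only retry operations that are safe to repeat.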

APIM Developer Portal: The Feature AWS Cannot Match

APIM auto-generates a full developer portal from your OpenAPI specification:

Features:
  Interactive API explorer: try any endpoint in the browser
  Auto-generated code samples (C#, Python, curl, JavaScript)
  API subscription management: developers self-serve API keys
  Changelog: track API version history
  Usage analytics per developer/subscription
  Sandbox environment: separate from production, safe to explore

For MyBCAT: if you expose an API for optometry practice management software vendors to integrate with your scheduling system, APIM's developer portal provides a professional integration experience. AWS API Gateway requires you to build this yourself or use a third-party portal like Readme.io.


AWS API Gateway vs Azure API Management: Decision Guide

Choose AWS HTTP API when:
  ✓ Simple Lambda proxy: internal APIs, mobile/web app backends
  ✓ Cognito JWT auth is sufficient
  ✓ Cost matters: $1/million requests
  ✓ Low latency matters: ~1ms overhead

Choose AWS REST API when:
  ✓ Per-client rate limits (usage plans + API keys)
  ✓ Response caching for stable data
  ✓ WAF integration for public APIs
  ✓ Request/response transformation (if VTL templates are acceptable)

Choose Azure API Management when:
  ✓ Complex policy logic (retry, circuit breaker, response transformation)
  ✓ External developer portal needed
  ✓ Multiple backends with routing rules
  ✓ Built-in API versioning and revision management
  ✓ Already heavily invested in Azure ecosystem

Terraform: Complete Production Setup

This is what a real HTTP API Gateway deployment looks like for a healthcare platform:

HCL
# API Gateway
resource "aws_apigatewayv2_api" "main" {
  name          = "mybcat-${var.env}"
  protocol_type = "HTTP"

  cors_configuration {
    allow_origins = var.env == "prod" ? ["https://app.mybcat.com"] : ["*"]
    allow_methods = ["GET", "POST", "PUT", "DELETE", "OPTIONS"]
    allow_headers = ["Authorization", "Content-Type", "X-Idempotency-Key"]
    max_age       = 3600
  }
}

# Cognito JWT Authoriser
resource "aws_apigatewayv2_authorizer" "cognito" {
  api_id           = aws_apigatewayv2_api.main.id
  authorizer_type  = "JWT"
  identity_sources = ["$request.header.Authorization"]
  name             = "cognito"

  jwt_configuration {
    audience = [aws_cognito_user_pool_client.app.id]
    issuer   = "https://cognito-idp.${var.region}.amazonaws.com/${aws_cognito_user_pool.main.id}"
  }
}

# Stage with access logging
resource "aws_apigatewayv2_stage" "default" {
  api_id      = aws_apigatewayv2_api.main.id
  name        = "$default"
  auto_deploy = true

  access_log_settings {
    destination_arn = aws_cloudwatch_log_group.api_logs.arn
    format = jsonencode({
      requestId      = "$context.requestId"
      routeKey       = "$context.routeKey"
      status         = "$context.status"
      latency        = "$context.integrationLatency"
      practiceId     = "$context.authorizer.claims.practice_id"
      userAgent      = "$context.identity.userAgent"
    })
  }
}

# Reusable module for Lambda routes
module "book_appointment_route" {
  source         = "./modules/api_route"
  api_id         = aws_apigatewayv2_api.main.id
  authorizer_id  = aws_apigatewayv2_authorizer.cognito.id
  route_key      = "POST /appointments"
  lambda_arn     = aws_lambda_function.book_appointment.invoke_arn
}

module "get_slots_route" {
  source         = "./modules/api_route"
  api_id         = aws_apigatewayv2_api.main.id
  authorizer_id  = aws_apigatewayv2_authorizer.cognito.id
  route_key      = "GET /practices/{practiceId}/slots"
  lambda_arn     = aws_lambda_function.get_slots.invoke_arn
}

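# Health check route. HTTP APIs do not support MOCK integrations, so health_mock
# must point at a lightweight Lambda or HTTP proxy integration that returns 200.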
resource "aws_apigatewayv2_route" "health" {
  api_id    = aws_apigatewayv2_api.main.id
  route_key = "GET /health"
  target    = "integrations/${aws_apigatewayv2_integration.health_mock.id}"
}

Common Mistakes: What Breaks in Production

Mistake 1: Lambda timeout longer than API Gateway timeout

API Gateway caps synchronous integrations at about 29 seconds (the REST API default is 29 seconds; HTTP APIs allow at most 30). If your Lambda takes 35 seconds, API Gateway returns a timeout error to the client (504 on REST APIs), but the Lambda keeps running and you are still billed. Set your Lambda timeout to 28 seconds maximum for synchronous API routes. For genuinely long operations: return 202 Accepted immediately, process asynchronously via SQS, expose a polling endpoint.
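
The 202-plus-polling pattern has three moving parts: an endpoint that accepts the job, a worker that runs outside the request path, and a status endpoint. The sketch below uses a `deque` and a dict as stand-ins for SQS and a status table; the function names, the `/jobs/{job_id}` path, and the placeholder result are all hypothetical.

```python
import json
import uuid
from collections import deque

JOB_QUEUE = deque()  # stand-in for SQS
JOB_STATUS = {}      # stand-in for a DynamoDB status table


def start_report(event, context):
    """Accept the job and return immediately, well inside the gateway timeout."""
    job_id = str(uuid.uuid4())
    JOB_STATUS[job_id] = {"status": "PENDING"}
    JOB_QUEUE.append({"job_id": job_id, "params": json.loads(event.get("body") or "{}")})
    return {
        "statusCode": 202,
        "headers": {"Location": f"/jobs/{job_id}"},
        "body": json.dumps({"job_id": job_id}),
    }


def worker():
    """Runs outside the request path (for example, a queue-triggered Lambda)."""
    while JOB_QUEUE:
        job = JOB_QUEUE.popleft()
        # Placeholder for the slow work
        JOB_STATUS[job["job_id"]] = {"status": "DONE", "result": {"rows": 0}}


def get_job(event, context):
    """Polling endpoint: report the current status of a job."""
    job = JOB_STATUS.get(event["pathParameters"]["job_id"])
    if job is None:
        return {"statusCode": 404, "body": json.dumps({"error": "Not found"})}
    return {"statusCode": 200, "body": json.dumps(job)}
```

The client polls the `Location` URL until the status changes from PENDING to DONE.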

Mistake 2: Leaking internal errors to clients

Python
import json
import logging

logger = logging.getLogger()

# db is the data-access layer, defined elsewhere

# BAD: stack trace reaches the client
def handler(event, context):
    patient = db.get_patient(event['patientId'])
    return patient  # KeyError if not found → full traceback in response

# GOOD: structured errors, internal details stay in CloudWatch
def handler(event, context):
    try:
        patient = db.get_patient(event['patientId'])
        if not patient:
            return {'statusCode': 404, 'body': json.dumps({'error': 'Not found'})}
        return {'statusCode': 200, 'body': json.dumps(patient)}
    except Exception:
        logger.exception("Unhandled error in get_patient")
        return {'statusCode': 500, 'body': json.dumps({'error': 'Internal server error'})}

Mistake 3: No throttling on public or webhook endpoints

A webhook endpoint receives callbacks from Amazon Connect. It has no auth (Connect cannot send Cognito JWTs). Without throttling, a misconfigured Connect instance or a bad actor discovering the URL can fire thousands of requests per second, consuming your Lambda concurrency budget and blocking legitimate traffic.

Fix: add an IP allowlist at WAF level (Connect webhooks come from known AWS IP ranges), add a shared secret header validated in the Lambda, and add a WAF rate rule as a backstop.
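
The shared-secret check in that fix is short, but it should use a constant-time comparison so response timing does not leak how much of the secret matched. A minimal sketch, assuming a hypothetical `X-Webhook-Secret` header and a placeholder secret value (in practice the secret would be loaded from Secrets Manager):

```python
import hmac
import json

EXPECTED_SECRET = "replace-me"  # placeholder; load from a secret store in practice


def is_authentic(event, expected=EXPECTED_SECRET):
    """Compare the supplied header against the shared secret in constant time."""
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    supplied = headers.get("x-webhook-secret", "")  # hypothetical header name
    return hmac.compare_digest(supplied.encode(), expected.encode())


def handler(event, context):
    if not is_authentic(event):
        return {"statusCode": 401, "body": json.dumps({"error": "Unauthorized"})}
    return {"statusCode": 200, "body": json.dumps({"ok": True})}
```

This complements the WAF rules rather than replacing them: the secret rejects forged calls, while the rate rule limits how many of them reach the Lambda.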

Mistake 4: Using wildcard CORS in production

HCL
# NEVER in production
allow_origins = ["*"]

# Always scope to your actual domain
allow_origins = ["https://app.mybcat.com"]

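Wildcard CORS lets JavaScript on any website call your API and read the response. Browsers refuse to combine * with Access-Control-Allow-Credentials: true, so teams sometimes work around that by echoing back whatever Origin header arrives. That pattern is the dangerous one: it lets any site make credentialed calls on behalf of a logged-in user.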

Mistake 5: Forgetting that API Gateway is stateless

Two sequential requests from the same user may be handled by different Lambda containers. Lambda has no memory between invocations. All state (session data, in-progress bookings, user preferences) must live in DynamoDB or Redis. This is correct and expected, but developers coming from traditional server frameworks (Express, Django, ASP.NET) frequently assume state persists between requests.


Complete Architecture: Everything Together

                    Users (browsers, mobile apps)
                              ↓
                    CloudFront (CDN + WAF)
                    ├── Static assets served from edge
                    ├── WAF: blocks bots, SQL injection, geo-restrictions
                    └── Forwards API requests to API Gateway
                              ↓
                    API Gateway (HTTP API)
                    ├── JWT Authoriser (Cognito)
                    ├── CORS headers on all routes
                    ├── Access logging → CloudWatch
                    └── Routes:
                        ├── POST /appointments → book-appt Lambda
                        ├── GET  /appointments/{id} → get-appt Lambda
                        ├── PUT  /appointments/{id}/cancel → cancel-appt Lambda
                        ├── GET  /practices/{id}/slots → get-slots Lambda
                        ├── POST /calls/webhook → call-event Lambda (no auth)
                        ├── GET  /health → health-check integration (200 OK)
                        └── WebSocket /realtime (separate WebSocket API) → ws-connect, ws-message, ws-disconnect Lambdas
                              ↓
                    Lambda Functions
                    ├── IAM roles with least privilege
                    ├── No PHI in CloudWatch logs (correlation IDs only)
                    ├── Secrets from Secrets Manager (not env vars)
                    └── DynamoDB + RDS + SQS backends

The One-Line Summary for Every Decision

| Decision | Answer |
|---|---|
| HTTP API vs REST API | HTTP API by default. REST API only when you need WAF, caching, or per-client quotas |
| WebSocket vs polling | WebSocket when you need sub-second updates. Polling with React Query for near-real-time (5s interval) |
| Native JWT vs Lambda Authoriser | Native JWT for standard Cognito auth. Lambda Authoriser when you need custom business logic before routing |
| Authoriser cache TTL | 300 seconds for most cases. 0 only if token content must be re-validated on every request (rare) |
| CORS wildcard in dev/prod | * in dev only. Explicit origin list in staging and production |
| Rate limiting layer | WAF for malicious traffic. API Gateway usage plans for per-client quotas. Lambda reserved concurrency for service protection |
| Versioning strategy | URL path versioning (/v1/, /v2/) for external APIs. Lambda aliases for internal deployments |
| API Gateway timeout | Set Lambda timeout to 28 seconds max. Use async pattern (202 + polling) for longer operations |
