AWS Serverless Full-Stack · Lesson 3 of 7
DynamoDB: Single-Table Design & Queries
Why DynamoDB?
DynamoDB is AWS's fully managed NoSQL database. Unlike SQL databases, it scales to millions of requests per second with single-digit millisecond latency — and you pay only for what you use. It's the backbone of most serverless architectures on AWS.
Key characteristics:
- Schemaless — only primary key attributes are required
- Key-value + document — items can be simple
{id, name}or deeply nested JSON - Provisioned or on-demand capacity — on-demand auto-scales with traffic
- DynamoDB Streams — CDC event stream for every write (great for real-time pipelines)
Core Concepts
Primary Key
Every item in DynamoDB is identified by its primary key. Two options:
Simple primary key — just a partition key (PK):
PK: "CLINIC#CLN-001" → {name: "Sunrise Eye Care", state: "CA"}Composite primary key — partition key (PK) + sort key (SK):
PK: "CLINIC#CLN-001", SK: "APPT#2026-04-16#APT-789"The combination of PK + SK must be unique per item. You can have many items with the same PK but different SKs — this is how you store "one-to-many" relationships in a single table.
Capacity Units
- Read Capacity Unit (RCU): 1 strongly consistent read of up to 4 KB
- Write Capacity Unit (WCU): 1 write of up to 1 KB
Use on-demand billing for variable workloads (you pay per request). Use provisioned with auto-scaling for predictable traffic.
Single-Table Design
In relational databases you normalize data into separate tables and JOIN them. DynamoDB has no JOINs — fetching related data in multiple round-trips is expensive. The solution: store everything in one table, design your keys so a single Query returns all data for a use case.
Naming Conventions
Use generic key names (PK, SK) and encode the entity type in the value:
PK SK Type
──────────────── ───────────────────── ──────────
CLINIC#CLN-001 META Clinic
CLINIC#CLN-001 STAFF#USR-123 Staff member
CLINIC#CLN-001 APPT#2026-04-16#APT-1 Appointment
PATIENT#PAT-456 META Patient
PATIENT#PAT-456 APPT#2026-04-16#APT-1 Patient→Appt linkA single Query(PK="CLINIC#CLN-001") returns the clinic + all its staff + all its appointments.
Defining Access Patterns First
Before writing a single line of code, list every query your application needs:
| Access Pattern | Key Condition | |---------------|---------------| | Get clinic by ID | PK = CLINIC#, SK = META | | Get all appointments for a clinic | PK = CLINIC#, SK begins_with APPT# | | Get appointments for a date | PK = CLINIC#, SK begins_with APPT#2026-04-16 | | Get patient by ID | PK = PATIENT#, SK = META | | Get all appointments for a patient | PK = PATIENT#, SK begins_with APPT# |
Design your keys to satisfy these patterns.
Python SDK (boto3)
Setup
import boto3
import os
from boto3.dynamodb.conditions import Key, Attr
from decimal import Decimal
dynamodb = boto3.resource("dynamodb", region_name=os.environ["AWS_REGION"])
table = dynamodb.Table(os.environ["TABLE_NAME"])PutItem
def create_appointment(appt: dict) -> None:
table.put_item(
Item={
"PK": f"CLINIC#{appt['clinic_id']}",
"SK": f"APPT#{appt['date']}#{appt['id']}",
"GSI1PK": f"PATIENT#{appt['patient_id']}",
"GSI1SK": f"APPT#{appt['date']}#{appt['id']}",
"type": "APPOINTMENT",
"id": appt["id"],
"clinic_id": appt["clinic_id"],
"patient_id": appt["patient_id"],
"datetime": appt["datetime"],
"status": "scheduled",
"created_at": appt["created_at"],
},
ConditionExpression="attribute_not_exists(PK)" # prevent overwrites
)Query — all appointments for a clinic
def get_clinic_appointments(clinic_id: str, date_prefix: str = None) -> list:
key_condition = Key("PK").eq(f"CLINIC#{clinic_id}")
if date_prefix:
key_condition &= Key("SK").begins_with(f"APPT#{date_prefix}")
else:
key_condition &= Key("SK").begins_with("APPT#")
response = table.query(KeyConditionExpression=key_condition)
return response["Items"]
# Get all appointments for clinic today
appointments = get_clinic_appointments("CLN-001", "2026-04-16")Query with pagination
DynamoDB returns up to 1MB per call. Use LastEvaluatedKey to paginate:
def get_all_clinic_appointments(clinic_id: str) -> list:
items = []
kwargs = {
"KeyConditionExpression": Key("PK").eq(f"CLINIC#{clinic_id}")
& Key("SK").begins_with("APPT#")
}
while True:
response = table.query(**kwargs)
items.extend(response["Items"])
if "LastEvaluatedKey" not in response:
break
kwargs["ExclusiveStartKey"] = response["LastEvaluatedKey"]
return itemsUpdateItem — conditional update
def update_appointment_status(clinic_id: str, appt_id: str,
date: str, new_status: str) -> None:
table.update_item(
Key={
"PK": f"CLINIC#{clinic_id}",
"SK": f"APPT#{date}#{appt_id}"
},
UpdateExpression="SET #status = :status, updated_at = :ts",
ConditionExpression="#status = :current_status",
ExpressionAttributeNames={"#status": "status"}, # 'status' is a reserved word
ExpressionAttributeValues={
":status": new_status,
":ts": datetime.utcnow().isoformat(),
":current_status": "scheduled" # only update if still scheduled
}
)DeleteItem
def cancel_appointment(clinic_id: str, appt_id: str, date: str) -> None:
table.delete_item(
Key={
"PK": f"CLINIC#{clinic_id}",
"SK": f"APPT#{date}#{appt_id}"
},
ConditionExpression="attribute_exists(PK)"
)Global Secondary Indexes (GSI)
Your primary key satisfies some access patterns, but you'll often need to query by a different attribute. That's what GSIs are for.
Use case: Get all appointments for a patient (across all clinics)
The main table has PK = CLINIC#{id}. A patient query needs PK = PATIENT#{id}. Define a GSI:
resource "aws_dynamodb_table" "main" {
name = "learnixo-${var.env}"
billing_mode = "PAY_PER_REQUEST"
hash_key = "PK"
range_key = "SK"
attribute { name = "PK"; type = "S" }
attribute { name = "SK"; type = "S" }
attribute { name = "GSI1PK"; type = "S" }
attribute { name = "GSI1SK"; type = "S" }
global_secondary_index {
name = "GSI1"
hash_key = "GSI1PK"
range_key = "GSI1SK"
projection_type = "ALL"
}
}Query the GSI in Python:
def get_patient_appointments(patient_id: str) -> list:
response = table.query(
IndexName="GSI1",
KeyConditionExpression=Key("GSI1PK").eq(f"PATIENT#{patient_id}")
& Key("GSI1SK").begins_with("APPT#")
)
return response["Items"]Handling Decimal Types
DynamoDB returns numbers as Decimal objects in Python. Convert for JSON serialization:
import json
from decimal import Decimal
def decimal_default(obj):
if isinstance(obj, Decimal):
return int(obj) if obj % 1 == 0 else float(obj)
raise TypeError(f"Object of type {type(obj)} is not JSON serializable")
# Usage
body = json.dumps(items, default=decimal_default)DynamoDB Streams
Enable streams to react to every table change — great for real-time analytics, search indexing, or triggering downstream workflows:
resource "aws_dynamodb_table" "main" {
...
stream_enabled = true
stream_view_type = "NEW_AND_OLD_IMAGES"
}
resource "aws_lambda_event_source_mapping" "stream_processor" {
event_source_arn = aws_dynamodb_table.main.stream_arn
function_name = aws_lambda_function.stream_processor.arn
starting_position = "LATEST"
batch_size = 100
bisect_batch_on_function_error = true
}Lambda receives batches of change records:
def handler(event, context):
for record in event["Records"]:
event_name = record["eventName"] # INSERT, MODIFY, REMOVE
if event_name == "INSERT":
new_item = record["dynamodb"]["NewImage"]
item_type = new_item.get("type", {}).get("S")
if item_type == "APPOINTMENT":
appt_id = new_item["id"]["S"]
notify_new_appointment(appt_id)
elif event_name == "MODIFY":
old = record["dynamodb"]["OldImage"]
new = record["dynamodb"]["NewImage"]
old_status = old.get("status", {}).get("S")
new_status = new.get("status", {}).get("S")
if old_status != new_status:
handle_status_change(new["id"]["S"], old_status, new_status)Common Mistakes
| Mistake | Fix |
|---------|-----|
| Hot partition key | Use high-cardinality PK — avoid sequential IDs or status values |
| Scan instead of Query | Always Query with a key condition — Scan reads the entire table |
| Forgetting reserved words | Use ExpressionAttributeNames for words like name, status, type |
| Missing GSI for access pattern | Model all access patterns before building — adding GSIs later requires time |
| Storing large blobs in items | Max item size is 400 KB — store files in S3, store the S3 key in DynamoDB |
| Not handling pagination | Always loop on LastEvaluatedKey — DynamoDB caps results at 1 MB |