DynamoDB: Single-Table Design, Access Patterns & Queries
Master DynamoDB data modeling β partition keys, sort keys, GSIs, single-table design, query patterns, condition expressions, streams, and Python SDK patterns for production serverless apps.
Why DynamoDB?
DynamoDB is AWS's fully managed NoSQL database. Unlike SQL databases, it scales to millions of requests per second with single-digit millisecond latency β and you pay only for what you use. It's the backbone of most serverless architectures on AWS.
Key characteristics:
- Schemaless β only primary key attributes are required
- Key-value + document β items can be simple
{id, name}or deeply nested JSON - Provisioned or on-demand capacity β on-demand auto-scales with traffic
- DynamoDB Streams β CDC event stream for every write (great for real-time pipelines)
Core Concepts
Primary Key
Every item in DynamoDB is identified by its primary key. Two options:
Simple primary key β just a partition key (PK):
PK: "CLINIC#CLN-001" β {name: "Sunrise Eye Care", state: "CA"}Composite primary key β partition key (PK) + sort key (SK):
PK: "CLINIC#CLN-001", SK: "APPT#2026-04-16#APT-789"The combination of PK + SK must be unique per item. You can have many items with the same PK but different SKs β this is how you store "one-to-many" relationships in a single table.
Capacity Units
- Read Capacity Unit (RCU): 1 strongly consistent read of up to 4 KB
- Write Capacity Unit (WCU): 1 write of up to 1 KB
Use on-demand billing for variable workloads (you pay per request). Use provisioned with auto-scaling for predictable traffic.
Single-Table Design
In relational databases you normalize data into separate tables and JOIN them. DynamoDB has no JOINs β fetching related data in multiple round-trips is expensive. The solution: store everything in one table, design your keys so a single Query returns all data for a use case.
Naming Conventions
Use generic key names (PK, SK) and encode the entity type in the value:
PK SK Type
ββββββββββββββββ βββββββββββββββββββββ ββββββββββ
CLINIC#CLN-001 META Clinic
CLINIC#CLN-001 STAFF#USR-123 Staff member
CLINIC#CLN-001 APPT#2026-04-16#APT-1 Appointment
PATIENT#PAT-456 META Patient
PATIENT#PAT-456 APPT#2026-04-16#APT-1 PatientβAppt linkA single Query(PK="CLINIC#CLN-001") returns the clinic + all its staff + all its appointments.
Defining Access Patterns First
Before writing a single line of code, list every query your application needs:
| Access Pattern | Key Condition | |---------------|---------------| | Get clinic by ID | PK = CLINIC#, SK = META | | Get all appointments for a clinic | PK = CLINIC#, SK begins_with APPT# | | Get appointments for a date | PK = CLINIC#, SK begins_with APPT#2026-04-16 | | Get patient by ID | PK = PATIENT#, SK = META | | Get all appointments for a patient | PK = PATIENT#, SK begins_with APPT# |
Design your keys to satisfy these patterns.
Python SDK (boto3)
Setup
import boto3
import os
from boto3.dynamodb.conditions import Key, Attr
from decimal import Decimal
dynamodb = boto3.resource("dynamodb", region_name=os.environ["AWS_REGION"])
table = dynamodb.Table(os.environ["TABLE_NAME"])PutItem
def create_appointment(appt: dict) -> None:
table.put_item(
Item={
"PK": f"CLINIC#{appt['clinic_id']}",
"SK": f"APPT#{appt['date']}#{appt['id']}",
"GSI1PK": f"PATIENT#{appt['patient_id']}",
"GSI1SK": f"APPT#{appt['date']}#{appt['id']}",
"type": "APPOINTMENT",
"id": appt["id"],
"clinic_id": appt["clinic_id"],
"patient_id": appt["patient_id"],
"datetime": appt["datetime"],
"status": "scheduled",
"created_at": appt["created_at"],
},
ConditionExpression="attribute_not_exists(PK)" # prevent overwrites
)Query β all appointments for a clinic
def get_clinic_appointments(clinic_id: str, date_prefix: str = None) -> list:
key_condition = Key("PK").eq(f"CLINIC#{clinic_id}")
if date_prefix:
key_condition &= Key("SK").begins_with(f"APPT#{date_prefix}")
else:
key_condition &= Key("SK").begins_with("APPT#")
response = table.query(KeyConditionExpression=key_condition)
return response["Items"]
# Get all appointments for clinic today
appointments = get_clinic_appointments("CLN-001", "2026-04-16")Query with pagination
DynamoDB returns up to 1MB per call. Use LastEvaluatedKey to paginate:
def get_all_clinic_appointments(clinic_id: str) -> list:
items = []
kwargs = {
"KeyConditionExpression": Key("PK").eq(f"CLINIC#{clinic_id}")
& Key("SK").begins_with("APPT#")
}
while True:
response = table.query(**kwargs)
items.extend(response["Items"])
if "LastEvaluatedKey" not in response:
break
kwargs["ExclusiveStartKey"] = response["LastEvaluatedKey"]
return itemsUpdateItem β conditional update
def update_appointment_status(clinic_id: str, appt_id: str,
date: str, new_status: str) -> None:
table.update_item(
Key={
"PK": f"CLINIC#{clinic_id}",
"SK": f"APPT#{date}#{appt_id}"
},
UpdateExpression="SET #status = :status, updated_at = :ts",
ConditionExpression="#status = :current_status",
ExpressionAttributeNames={"#status": "status"}, # 'status' is a reserved word
ExpressionAttributeValues={
":status": new_status,
":ts": datetime.utcnow().isoformat(),
":current_status": "scheduled" # only update if still scheduled
}
)DeleteItem
def cancel_appointment(clinic_id: str, appt_id: str, date: str) -> None:
table.delete_item(
Key={
"PK": f"CLINIC#{clinic_id}",
"SK": f"APPT#{date}#{appt_id}"
},
ConditionExpression="attribute_exists(PK)"
)Global Secondary Indexes (GSI)
Your primary key satisfies some access patterns, but you'll often need to query by a different attribute. That's what GSIs are for.
Use case: Get all appointments for a patient (across all clinics)
The main table has PK = CLINIC#{id}. A patient query needs PK = PATIENT#{id}. Define a GSI:
resource "aws_dynamodb_table" "main" {
name = "learnixo-${var.env}"
billing_mode = "PAY_PER_REQUEST"
hash_key = "PK"
range_key = "SK"
attribute { name = "PK"; type = "S" }
attribute { name = "SK"; type = "S" }
attribute { name = "GSI1PK"; type = "S" }
attribute { name = "GSI1SK"; type = "S" }
global_secondary_index {
name = "GSI1"
hash_key = "GSI1PK"
range_key = "GSI1SK"
projection_type = "ALL"
}
}Query the GSI in Python:
def get_patient_appointments(patient_id: str) -> list:
response = table.query(
IndexName="GSI1",
KeyConditionExpression=Key("GSI1PK").eq(f"PATIENT#{patient_id}")
& Key("GSI1SK").begins_with("APPT#")
)
return response["Items"]Handling Decimal Types
DynamoDB returns numbers as Decimal objects in Python. Convert for JSON serialization:
import json
from decimal import Decimal
def decimal_default(obj):
if isinstance(obj, Decimal):
return int(obj) if obj % 1 == 0 else float(obj)
raise TypeError(f"Object of type {type(obj)} is not JSON serializable")
# Usage
body = json.dumps(items, default=decimal_default)DynamoDB Streams
Enable streams to react to every table change β great for real-time analytics, search indexing, or triggering downstream workflows:
resource "aws_dynamodb_table" "main" {
...
stream_enabled = true
stream_view_type = "NEW_AND_OLD_IMAGES"
}
resource "aws_lambda_event_source_mapping" "stream_processor" {
event_source_arn = aws_dynamodb_table.main.stream_arn
function_name = aws_lambda_function.stream_processor.arn
starting_position = "LATEST"
batch_size = 100
bisect_batch_on_function_error = true
}Lambda receives batches of change records:
def handler(event, context):
for record in event["Records"]:
event_name = record["eventName"] # INSERT, MODIFY, REMOVE
if event_name == "INSERT":
new_item = record["dynamodb"]["NewImage"]
item_type = new_item.get("type", {}).get("S")
if item_type == "APPOINTMENT":
appt_id = new_item["id"]["S"]
notify_new_appointment(appt_id)
elif event_name == "MODIFY":
old = record["dynamodb"]["OldImage"]
new = record["dynamodb"]["NewImage"]
old_status = old.get("status", {}).get("S")
new_status = new.get("status", {}).get("S")
if old_status != new_status:
handle_status_change(new["id"]["S"], old_status, new_status)Common Mistakes
| Mistake | Fix |
|---------|-----|
| Hot partition key | Use high-cardinality PK β avoid sequential IDs or status values |
| Scan instead of Query | Always Query with a key condition β Scan reads the entire table |
| Forgetting reserved words | Use ExpressionAttributeNames for words like name, status, type |
| Missing GSI for access pattern | Model all access patterns before building β adding GSIs later requires time |
| Storing large blobs in items | Max item size is 400 KB β store files in S3, store the S3 key in DynamoDB |
| Not handling pagination | Always loop on LastEvaluatedKey β DynamoDB caps results at 1 MB |
Enjoyed this article?
Explore the Backend Systems learning path for more.
Found this helpful?
Leave a comment
Have a question, correction, or just found this helpful? Leave a note below.