
Developer Documentation: How to Write, Maintain, and Never Let It Rot

Why documentation fails, how to write docs that survive 10 years of team changes, and the tools, patterns, and workflows that keep documentation alive during active development. ADRs, OpenAPI, Mermaid diagrams, README standards, and CI gates — all with real examples.

SystemForge · April 21, 2026 · 26 min read
Tags: Documentation, ADR, OpenAPI, Developer Experience, Software Architecture, Best Practices, Onboarding, DevOps, Clean Code, Interview Prep

Why Documentation Fails

Every team starts with good intentions. Someone writes a Confluence page. Someone else creates a Lucidchart diagram. A third person writes a README. Six months later, none of it is accurate. A year later, no one reads it. Two years later, a new joiner spends three weeks figuring out what took two hours to build.

This is the documentation death cycle, and almost every team goes through it:

Sprint 1:   Code written → docs written (accurate)
Sprint 3:   Code changed → docs not updated (slightly wrong)
Sprint 6:   Docs ignored because "they're always wrong"
Sprint 12:  New joiner asks questions → senior dev explains verbally
Sprint 18:  Senior dev leaves → knowledge is gone
Sprint 24:  "Why did we use DynamoDB here?" — nobody knows

The problem is not that developers are lazy. The problem is that documentation lives in a different place than the code, so updating one does not remind you to update the other. A PR gets merged, the code changes, and the Confluence page sits at the old version forever.

The solution is treating documentation the same way you treat code: version-controlled, reviewed, and co-located with what it describes.


The Real Cost of Bad Documentation

Before fixing the problem, understand what it actually costs:

New hire onboarding time: Developers commonly take 3–6 months to become fully productive, and poor documentation is a major contributor. At an £80k salary, three months of pay is £20k, so that is roughly what each hire costs if those months produce little useful output.

The bus factor: If one person on your team gets hit by a bus (leaves, gets sick, takes holiday), how many systems grind to a halt? The lower the bus factor, the more your team depends on undocumented tribal knowledge. Most teams have a bus factor of 1 for at least one critical system.

Debugging mystery systems: "Why does this service call that endpoint at 3am?" Without documentation, answering this question requires reading through git history, asking people, and guessing. With an ADR (Architecture Decision Record), it is answered in 30 seconds.

Real example, Knight Capital Group (2012): A deployment reused a flag that had once activated dormant legacy code. That code had never been documented or removed, and on a server that missed the update it began executing live orders. The result was $440 million in losses in 45 minutes. This is an extreme case, but the principle is universal: undocumented code is a liability.
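The onboarding figure above is simple arithmetic, and it helps to see which assumption drives it. A minimal sketch, where the `productivity` parameter (the average fraction of full output during ramp-up) is an assumption introduced here for illustration:

```python
def ramp_up_cost(annual_salary: float, months: float, productivity: float) -> float:
    """Salary paid for output that was not delivered during ramp-up.

    productivity is the average fraction of full output over the period,
    from 0.0 (nothing delivered) to 1.0 (fully productive from day one).
    """
    if not 0.0 <= productivity <= 1.0:
        raise ValueError("productivity must be between 0.0 and 1.0")
    salary_for_period = annual_salary * months / 12
    return salary_for_period * (1.0 - productivity)


# 80,000 a year, 3 months, no useful output: the 20,000 upper bound
print(ramp_up_cost(80_000, 3, 0.0))
# The same period at half productivity
print(ramp_up_cost(80_000, 3, 0.5))
```

The 20k figure is the worst case. The real cost depends on how quickly productivity climbs, which is exactly what good documentation changes.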


The Fundamental Principle: Documentation as Code

The single most important shift in mindset:

Documentation should live in the same repository as the code it describes, be written in plain text (Markdown), be reviewed in pull requests, and be versioned in Git.

When documentation lives in Git:

  • It is reviewed alongside the code change that necessitated it
  • It has a history — you can see what it said last year
  • It is searchable with the same tools you use for code
  • When code is deleted, the documentation can be deleted in the same PR
  • A CI pipeline can enforce that it exists

When documentation lives in Confluence or Notion or a shared Google Drive:

  • Nobody is reminded to update it when code changes
  • It has no connection to which version of the code it describes
  • It goes stale silently
  • A new employee has to know it exists to find it

This does not mean throw away all wikis. Confluence is fine for company-wide processes, meeting notes, and non-technical documentation. But technical documentation — architecture decisions, API contracts, runbooks, data models — belongs in the repository.
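Once documentation lives in the repository, "a CI pipeline can enforce that it exists" can be as small as a path check. A minimal sketch, assuming a hypothetical list of required paths that mirrors the examples in this article (it is a team convention, not a standard):

```python
from pathlib import Path

# Hypothetical layout: these paths follow the examples in this article.
# Adjust the list to your own repository.
REQUIRED_DOCS = ["README.md", "docs/runbook.md", "docs/adr"]


def missing_docs(repo_root: Path, required=REQUIRED_DOCS) -> list[str]:
    """Return the required documentation paths that do not exist under repo_root."""
    return [rel for rel in required if not (repo_root / rel).exists()]


if __name__ == "__main__":
    missing = missing_docs(Path("."))
    for rel in missing:
        print(f"missing documentation: {rel}")
```

A CI job could run this and fail when the returned list is non-empty.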


The Five Types of Documentation Every System Needs

Different documentation serves different purposes. Mixing them up is why docs end up being useless walls of text.

1. README — The Front Door

Every repository needs a README that answers five questions:

1. What is this? (one paragraph, plain English)
2. How do I run it locally? (exact commands, no assumptions)
3. How do I run the tests?
4. What does the folder structure mean?
5. Where do I go for more help?

That is it. The README is not the place for architecture decisions, API documentation, or runbooks. It is a front door — it gets you inside and points you to the right room.

The onboarding test: Give the README to a developer who has never seen the codebase. They should be able to run the service locally within 30 minutes without asking a single question. If they cannot, the README is broken, not the developer.

Example README structure:

MARKDOWN
# Appointment Booking Service

Handles scheduling, slot management, and confirmation notifications
for the MyBCAT healthcare platform.

## Quick Start

Prerequisites: Node 20+, Docker Desktop running

```bash
git clone https://github.com/mybcat/appointment-service
cd appointment-service
cp .env.example .env.local
docker-compose up -d        # starts DynamoDB Local + LocalStack
npm install
npm run dev                 # http://localhost:3001
```

## Running Tests

```bash
npm test                    # unit tests
npm run test:integration    # requires Docker running
npm run test:coverage       # coverage report in /coverage
```

## Project Structure

```
src/
  api/          HTTP handlers (thin: just parse and delegate)
  domain/       Business logic (no framework dependencies)
  infrastructure/ DynamoDB, SQS, external API clients
  shared/       Types, utilities, constants
tests/
  unit/         Pure function tests, no I/O
  integration/  Tests against real DynamoDB Local
```

## Architecture

See [docs/architecture/overview.md](docs/architecture/overview.md)

## Runbook

See [docs/runbook.md](docs/runbook.md) for production operations
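The five README questions can be partly checked by machine. A minimal sketch that looks for the level-2 headings used in the example README; the required section list is an assumption taken from that example, and the check says nothing about whether the content under each heading is accurate:

```python
import re

# Section titles taken from the example README in this article.
# Treat them as a team convention, not a standard.
REQUIRED_SECTIONS = ["Quick Start", "Running Tests", "Project Structure"]

HEADING = re.compile(r"^##[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def missing_sections(readme_text: str, required=REQUIRED_SECTIONS) -> list[str]:
    """Return the required level-2 headings that are absent from the README."""
    found = {title.lower() for title in HEADING.findall(readme_text)}
    return [section for section in required if section.lower() not in found]


if __name__ == "__main__":
    sample = "# Appointment Booking Service\n\n## Quick Start\n\n## Running Tests\n"
    print(missing_sections(sample))
```

The 30-minute onboarding test still needs a human; this only catches a README that has lost a whole section.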

2. ADRs — Architecture Decision Records

This is the single most valuable documentation practice that almost no team does.

An ADR is a short document that records:

  • What decision was made
  • Why it was made (the options considered, the constraints, the tradeoffs)
  • What the consequences are

The reason ADRs are so valuable is that code tells you what a system does. ADRs tell you why it does it that way. And the "why" is what disappears when senior developers leave.

Real scenario without ADRs:

New developer: "Why are we using DynamoDB for this? PostgreSQL 
               would be much simpler."

Senior dev:    "Because two years ago we had a requirement to 
               support 50,000 concurrent users for a US health 
               system rollout. The original engineer chose Dynamo 
               for the write throughput. The rollout never 
               happened but we kept it."

New developer: "So we could migrate to Postgres now?"

Senior dev:    "Maybe. I don't know all the constraints. The 
               engineer who built it left last year."

With an ADR, this question is answered in 30 seconds by reading a file.

ADR format (based on Michael Nygard's original template, the most widely used; Date, Deciders, and Alternatives Considered are common additions to it):

MARKDOWN
# ADR-007: Use DynamoDB for Appointment Slot Storage

**Date:** 2024-03-15  
**Status:** Accepted  
**Deciders:** Sarah Chen (Lead), Marcus Williams (Architect)

## Context

The appointment booking system needs to handle slot availability 
reads and writes for potentially 50,000 concurrent users during 
peak scheduling windows (Monday 9am, post-holiday rushes).

Initial load testing on PostgreSQL showed connection pool exhaustion 
above 5,000 concurrent users. Aurora Serverless v1 was evaluated 
but its resume latency after auto-pause (up to 30s) was unacceptable 
for a booking flow.

## Decision

Use DynamoDB with a single-table design for appointment slots, 
availability windows, and booking records.

Partition key: `PK = PRACTICE#<practiceId>`  
Sort key: `SK = SLOT#<date>#<time>#<providerId>`

## Consequences

**Positive:**
- Handles 50,000+ concurrent writes with consistent sub-10ms latency
- No connection pool management
- Pay-per-request pricing scales to zero outside business hours

**Negative:**
- Query patterns must be known at design time (no ad-hoc queries)
- Reporting queries (analytics, aggregations) need to be done 
  separately via DynamoDB Streams → Lambda → PostgreSQL read model
- New developers unfamiliar with single-table design have a 
  learning curve

## Alternatives Considered

- **PostgreSQL + PgBouncer:** Simpler, but load testing showed 
  connection exhaustion at scale. Kept as the read model for analytics.
- **Aurora Serverless v2:** Evaluated but minimum capacity unit 
  cost was higher than DynamoDB at our usage patterns.
- **Redis:** Considered for slot caching but adds operational 
  complexity without solving the write throughput problem.

Where to store ADRs:

docs/
  adr/
    0001-use-nextjs-for-frontend.md
    0002-jwt-not-sessions.md
    0003-single-table-dynamodb-design.md
    0004-eventbridge-for-post-call-workflow.md
    0007-dynamodb-for-appointment-slots.md

Number them sequentially. Never delete an ADR — if a decision is reversed, add a new ADR that supersedes it and update the old one's status to "Superseded by ADR-012."

Statuses:

  • Proposed — under discussion
  • Accepted — decided and implemented
  • Deprecated — still in place but we want to move away
  • Superseded — replaced by a later decision
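The numbering and status rules above are mechanical enough to check with a script. A minimal sketch, assuming ADR files are named like `0007-short-title.md` and carry a `**Status:**` line as in the example ADR; both function names are hypothetical:

```python
import re
from pathlib import Path

ADR_NAME = re.compile(r"^(\d{4})-[a-z0-9-]+\.md$")
STATUS_LINE = re.compile(r"^\*\*Status:\*\*\s*(\w+)", re.MULTILINE)
VALID_STATUSES = {"Proposed", "Accepted", "Deprecated", "Superseded"}


def next_adr_number(adr_dir: Path) -> int:
    """Return the next free sequential ADR number in adr_dir."""
    numbers = [
        int(m.group(1))
        for p in adr_dir.glob("*.md")
        if (m := ADR_NAME.match(p.name))
    ]
    return max(numbers, default=0) + 1


def adr_problems(adr_dir: Path) -> list[str]:
    """List naming and status problems found in the ADR directory."""
    problems = []
    for path in sorted(adr_dir.glob("*.md")):
        if not ADR_NAME.match(path.name):
            problems.append(f"{path.name}: name should look like 0007-short-title.md")
            continue
        match = STATUS_LINE.search(path.read_text(encoding="utf-8"))
        if match is None:
            problems.append(f"{path.name}: no **Status:** line")
        elif match.group(1) not in VALID_STATUSES:
            problems.append(f"{path.name}: unknown status {match.group(1)!r}")
    return problems
```

A status such as "Superseded by ADR-012" passes because only the first word is compared against the allowed set.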

3. API Documentation — Auto-Generated, Always Accurate

Manually written API documentation is always wrong. The moment you change a field name, add a parameter, or change a response shape, the documentation is stale. The only way to keep API documentation accurate is to generate it from the code.

For REST APIs — OpenAPI (formerly Swagger):

OpenAPI lets you describe your API as a YAML/JSON spec. Many frameworks can generate this automatically from your code annotations.

.NET — from XML comments:

C#
/// <summary>
/// Book an appointment slot for a patient.
/// </summary>
/// <param name="request">The booking request with slot ID, patient ID, and reason.</param>
/// <returns>The confirmed appointment with confirmation number.</returns>
/// <response code="201">Appointment created successfully.</response>
/// <response code="409">Slot is no longer available (race condition — retry).</response>
/// <response code="422">Request validation failed.</response>
[HttpPost("appointments")]
[ProducesResponseType(typeof(AppointmentResponse), StatusCodes.Status201Created)]
[ProducesResponseType(typeof(ProblemDetails), StatusCodes.Status409Conflict)]
[ProducesResponseType(typeof(ValidationProblemDetails), StatusCodes.Status422UnprocessableEntity)]
public async Task<IActionResult> BookAppointment([FromBody] BookAppointmentRequest request)
{
    // ...
}

Add Swagger to your startup:

C#
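// Program.cs
using System.Reflection;
using Microsoft.OpenApi.Models;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddControllers();

builder.Services.AddSwaggerGen(c =>
{
    c.SwaggerDoc("v1", new OpenApiInfo
    {
        Title = "Appointment Booking API",
        Version = "v1",
        Description = "Manages slot availability and appointment booking for MyBCAT"
    });
    
    // Include XML comments. The .xml file is only produced when the .csproj sets
    // <GenerateDocumentationFile>true</GenerateDocumentationFile>
    var xmlFile = $"{Assembly.GetExecutingAssembly().GetName().Name}.xml";
    var xmlPath = Path.Combine(AppContext.BaseDirectory, xmlFile);
    c.IncludeXmlComments(xmlPath);
});

var app = builder.Build();

app.UseSwagger();
app.UseSwaggerUI(c => c.SwaggerEndpoint("/swagger/v1/swagger.json", "Booking API v1"));

app.MapControllers();
app.Run();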

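Now your Swagger UI at /swagger reflects the real routes, parameters, and response types, because it is generated from the actual code. When a developer changes the endpoint, the Swagger UI reflects it on the next build. The text inside the XML comments is still hand-written, so it deserves the same review as the code around it.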

TypeScript/Node — from Zod schemas:

TYPESCRIPT
import { z } from 'zod';
import { extendZodWithOpenApi } from '@asteasolutions/zod-to-openapi';

extendZodWithOpenApi(z);

const BookAppointmentRequest = z.object({
  slotId: z.string().uuid().openapi({ description: 'The slot to book' }),
  patientId: z.string().openapi({ description: 'Patient identifier from Cognito' }),
  reason: z.string().max(500).openapi({ description: 'Reason for visit' }),
  idempotencyKey: z.string().uuid().openapi({ 
    description: 'Prevent duplicate bookings on retry. Generate once per booking attempt.' 
  }),
}).openapi('BookAppointmentRequest');

The schema is your validation AND your documentation. Change one, both update.

Python FastAPI — documentation is automatic:

Python
from fastapi import FastAPI
from pydantic import BaseModel, ConfigDict, Field

class BookAppointmentRequest(BaseModel):
    # Pydantic v2 syntax. On Pydantic v1, use `class Config: schema_extra = {...}` instead.
    model_config = ConfigDict(
        json_schema_extra={
            "example": {
                "slot_id": "slot_2024_03_15_09_00",
                "patient_id": "patient_abc123",
                "reason": "Annual checkup",
                "idempotency_key": "123e4567-e89b-12d3-a456-426614174000",
            }
        }
    )

    slot_id: str = Field(..., description="The slot to book", examples=["slot_2024_03_15_09_00"])
    patient_id: str = Field(..., description="Patient identifier")
    reason: str = Field(..., max_length=500, description="Reason for visit")
    idempotency_key: str = Field(..., description="Generate once per booking attempt")

class AppointmentResponse(BaseModel):
    appointment_id: str
    confirmation_number: str

app = FastAPI(title="Appointment API", version="1.0.0")

@app.post("/appointments", response_model=AppointmentResponse, status_code=201)
async def book_appointment(request: BookAppointmentRequest):
    """
    Book an appointment slot.
    
    Returns 409 if the slot was taken between viewing and booking (retry with a different slot).
    Returns 422 if the idempotency key was already used for a different slot.
    """
    ...  # booking logic omitted

FastAPI generates a complete, interactive Swagger UI at /docs automatically.
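To make "generated from the code" concrete without any framework, here is a toy generator that derives a JSON-Schema-like description from a dataclass. It is a sketch of the principle only, not how FastAPI or Pydantic work internally; `json_schema` and the `JSON_TYPES` mapping are hypothetical helpers that handle just a few primitive types:

```python
from dataclasses import dataclass, field, fields
from typing import get_type_hints

# Only a handful of primitive types are supported in this sketch.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


@dataclass
class BookAppointmentRequest:
    slot_id: str = field(metadata={"description": "The slot to book"})
    patient_id: str = field(metadata={"description": "Patient identifier"})
    reason: str = field(metadata={"description": "Reason for visit"})


def json_schema(model) -> dict:
    """Build a schema dict from the dataclass fields and their type hints."""
    hints = get_type_hints(model)
    properties = {
        f.name: {
            "type": JSON_TYPES[hints[f.name]],
            "description": f.metadata.get("description", ""),
        }
        for f in fields(model)
    }
    return {
        "title": model.__name__,
        "type": "object",
        "properties": properties,
        "required": [f.name for f in fields(model)],
    }


if __name__ == "__main__":
    import json
    print(json.dumps(json_schema(BookAppointmentRequest), indent=2))
```

Renaming a field or changing its type changes the schema on the next run, which is the property that keeps generated documentation from drifting.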


4. Diagrams as Code — Architecture That Doesn't Go Stale

Architecture diagrams created in Lucidchart or draw.io have the same problem as all external documentation: they are not updated when the code changes. Six months after creation, the diagram shows services that no longer exist and missing services that were added.

Diagrams as Code means the diagram is defined in a text file (Markdown, YAML, or a DSL) that lives in the repository and renders to a visual diagram.

Mermaid — renders in GitHub, GitLab, Notion, and most docs tools:

MARKDOWN
## Appointment Booking Flow

```mermaid
sequenceDiagram
    participant U as Patient (Browser)
    participant AG as API Gateway
    participant BS as Booking Service
    participant DB as DynamoDB
    participant SQS as SQS Queue
    participant NS as Notification Service

    U->>AG: POST /appointments {slotId, patientId}
    AG->>AG: Validate JWT
    AG->>BS: Forward request
    BS->>DB: TransactWrite (reserve slot + create appointment)
    
    alt Slot available
        DB-->>BS: Success
        BS->>SQS: send_message(appointment_confirmed)
        BS-->>AG: 201 Created {appointmentId, confirmationNo}
        AG-->>U: 201 Created
        SQS-->>NS: Process async (send SMS + email)
    else Slot taken (race condition)
        DB-->>BS: ConditionalCheckFailedException
        BS-->>AG: 409 Conflict {message: "Slot no longer available"}
        AG-->>U: 409 Conflict
    end
```

This renders as a proper sequence diagram on GitHub. When the booking flow changes (say, you add an SMS check before booking), you update this file in the same PR as the code change. The reviewer sees the diagram change alongside the code change.
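Because a Mermaid diagram is plain text, it can also be produced from data the team already maintains, such as a list of service dependencies. A minimal sketch under that assumption; the `RELATIONSHIPS` list and the `to_mermaid` helper are hypothetical and only emit a simple left-to-right flowchart:

```python
def to_mermaid(relationships: list[tuple[str, str, str]]) -> str:
    """Render (source, target, label) triples as Mermaid flowchart text."""
    lines = ["flowchart LR"]
    for source, target, label in relationships:
        lines.append(f"    {source} -->|{label}| {target}")
    return "\n".join(lines)


# Hypothetical dependency data for illustration.
RELATIONSHIPS = [
    ("patient", "webApp", "Books appointments"),
    ("webApp", "bookingApi", "API calls"),
    ("bookingApi", "messageQueue", "Publish events"),
]

if __name__ == "__main__":
    print(to_mermaid(RELATIONSHIPS))
```

The output can be pasted into a `mermaid` fence in a Markdown file, or regenerated in CI so the committed diagram always matches the data.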

C4 Model with Structurizr — for larger systems:

The C4 model defines four levels of architecture diagrams: Context, Container, Component, Code. Each level zooms in further. Structurizr DSL lets you define all of them in a single text file:

workspace "MyBCAT Platform" {

    model {
        patient = person "Patient" "Books appointments via web or phone"
        agent = person "Agent" "Handles calls, views schedules in dashboard"
        
        platform = softwareSystem "MyBCAT Platform" {
            // Argument order: container <name> [description] [technology]
            webApp = container "Web Application" "Patient-facing booking UI" "Next.js"
            agentDashboard = container "Agent Dashboard" "Internal agent tooling" "React"
            bookingApi = container "Booking API" "Appointment management" "Python Lambda"
            callApi = container "Call API" "Amazon Connect integration" "Python Lambda"
            // Container names must be unique within a software system
            appointmentDb = container "Appointment Table" "Appointments, slots, patients" "DynamoDB"
            callDb = container "Call Log Table" "Call logs, transcriptions" "DynamoDB"
            messageQueue = container "Message Queue" "Post-booking async tasks" "SQS"
        }
        
        patient -> webApp "Books appointments"
        agent -> agentDashboard "Manages schedule, views caller"
        webApp -> bookingApi "API calls"
        agentDashboard -> callApi "API calls"
        bookingApi -> appointmentDb "Read/write"
        bookingApi -> messageQueue "Publish events"
    }
    
    views {
        systemContext platform "Context" {
            include *
            autolayout lr
        }
        container platform "Containers" {
            include *
            autolayout lr
        }
    }
}

This generates Context and Container diagrams that update when the DSL file updates.

PlantUML — for teams already using it:

@startuml
package "Booking Domain" {
  class AppointmentService {
    + bookSlot(request: BookRequest): Appointment
    + cancelAppointment(id: string): void
    + getAvailableSlots(date: Date): Slot[]
  }
  
  class Appointment {
    + id: string
    + slotId: string
    + patientId: string
    + status: AppointmentStatus
    + confirmationNumber: string
  }
  
  enum AppointmentStatus {
    CONFIRMED
    CANCELLED
    NO_SHOW
    COMPLETED
  }
  
  AppointmentService --> Appointment : creates
  Appointment --> AppointmentStatus : uses
}
@enduml

5. Runbooks — Operations Documentation

A runbook answers the question: "Something is broken at 3am, what do I do?"

It is not written for developers who built the system. It is written for an on-call engineer who may never have touched this service.

MARKDOWN
# Appointment Service — Runbook

## Alerts and What They Mean

### HIGH_DLQ_DEPTH — SQS Dead Letter Queue depth > 10

**What it means:** Messages failed processing 3 times and landed in the DLQ.
This means appointments were booked but confirmation emails/SMS were not sent.

**Immediate impact:** Patients booked appointments but did not receive confirmation.
The appointment IS in DynamoDB — data is not lost. Only notifications are affected.

**Steps:**
1. Check CloudWatch Logs for the notification Lambda:
   `aws logs tail /aws/lambda/notification-processor --since 1h`
   
2. Look for the error pattern. Common causes:
   - Twilio API down → check https://status.twilio.com
   - Patient phone number invalid → data issue, fix and redrive
   - Lambda timeout → check if Twilio is slow, increase timeout in Terraform
   
3. If Twilio is down: wait for recovery, then redrive from DLQ:
   `aws sqs start-message-move-task --source-arn <DLQ_ARN>`
   
4. If data issue: fix the data, then redrive. Do NOT redrive without fixing 
   or you will get the same failures again.

### BOOKING_5XX_SPIKE — Booking API error rate > 1% for 5 minutes

**What it means:** The booking Lambda is failing.

**Steps:**
1. Check Lambda error logs:
   `aws logs tail /aws/lambda/booking-handler --since 15m --filter-pattern "ERROR"`
   
2. Check DynamoDB throttling:
   CloudWatch > DynamoDB > Table: appointments > WriteThrottleEvents
   (compare ConsumedWriteCapacityUnits against the table's capacity)
   
3. If DynamoDB throttling: temporarily increase capacity in AWS Console 
   (Terraform update to follow in business hours)
   
4. Escalate to: @sarah-chen (lead) if unresolved after 20 minutes
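A runbook is only useful if every alert that can page someone has an entry in it. A minimal sketch of a coverage check, assuming the runbook uses `### ALERT_NAME` headings and a `**Steps:**` marker as in the example above; the list of alert names would come from your monitoring configuration and is passed in here by hand:

```python
import re

ALERT_HEADING = re.compile(r"^###\s+([A-Z][A-Z0-9_]*)\b")


def alerts_missing_steps(runbook_text: str, alert_names: list[str]) -> list[str]:
    """Return alert names that have no runbook section containing **Steps:**."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in runbook_text.splitlines():
        match = ALERT_HEADING.match(line)
        if match:
            # Start collecting lines for this alert's section
            current = match.group(1)
            sections[current] = []
        elif line.startswith("#"):
            # Any other heading ends the current alert section
            current = None
        elif current is not None:
            sections[current].append(line)
    return [
        name for name in alert_names
        if "**Steps:**" not in "\n".join(sections.get(name, []))
    ]
```

Running this against the configured alert names in CI flags a new alarm that shipped without on-call instructions.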

Documentation Workflow: Keeping It Alive During Development

The documentation is written — now how do you stop it from rotting?

The PR Rule: Code Change = Documentation Check

The most effective practice: add a documentation checklist to your PR template.

MARKDOWN
# Pull Request

## What changed?

## Why?

## Documentation checklist
- [ ] README updated if setup/run steps changed
- [ ] ADR created if an architectural decision was made
- [ ] OpenAPI annotations updated if API contract changed
- [ ] Runbook updated if new alerts/failure modes introduced
- [ ] Diagrams updated if service relationships changed

The checklist does not block the PR (that would slow teams down unnecessarily). It is a forcing function — it makes the author consciously think about documentation before merging, rather than "I'll do it later" (which means never).
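If the team wants a gentle reminder rather than a blocker, a bot or CI step can list the boxes that were left unticked. A minimal sketch that parses the PR description text; how the description is fetched and where the reminder is posted are left out, and `unchecked_items` is a hypothetical helper:

```python
import re

# Matches Markdown task-list items that are still unticked: "- [ ] text"
UNCHECKED = re.compile(r"^[ \t]*[-*][ \t]+\[ \][ \t]+(.+?)[ \t]*$", re.MULTILINE)


def unchecked_items(pr_body: str) -> list[str]:
    """Return the text of every unticked checklist item in the PR description."""
    return UNCHECKED.findall(pr_body)


if __name__ == "__main__":
    body = (
        "## Documentation checklist\n"
        "- [x] README updated if setup/run steps changed\n"
        "- [ ] ADR created if an architectural decision was made\n"
    )
    for item in unchecked_items(body):
        print(f"not ticked: {item}")
```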

CI Gates: Enforce What Matters Most

Some documentation rules are objective enough to be enforced automatically.

Gate 1: Validate OpenAPI spec on every PR

YAML
# .github/workflows/docs.yml
name: Documentation Checks

on: [pull_request]

jobs:
  openapi-validation:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
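      - name: Set up Node
        uses: actions/setup-node@v4
        with:
          node-version: 20
          
      - name: Install Redocly CLI
        run: npm install -g @redocly/cli
      
      - name: Lint OpenAPI spec
        run: redocly lint docs/openapi.yaml
        
      - name: Check for breaking changes
        run: |
          git fetch origin main
          # openapi-diff reads files, so write the base-branch spec to a temp file first
          git show origin/main:docs/openapi.yaml > /tmp/openapi-base.yaml
          # The npm openapi-diff CLI exits non-zero when it finds breaking changes
          npx openapi-diff /tmp/openapi-base.yaml docs/openapi.yaml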

This fails any PR whose spec is not backward compatible with main, so a breaking API change has to be made deliberately, for example under a new API version, instead of slipping in unnoticed.

Gate 2: Check ADR exists for tagged commits

YAML
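  adr-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history so origin/main is available for the diff
      
      - name: Check for ADR if decision tag present
        env:
          # Pass the title through an environment variable instead of interpolating
          # it into the script, so a crafted title cannot inject shell commands
          PR_TITLE: ${{ github.event.pull_request.title }}
        run: |
          # If PR title contains [decision], require a new ADR file
          if echo "$PR_TITLE" | grep -q "\[decision\]"; then
            # --diff-filter=A lists only added files; "|| true" keeps an empty
            # grep result from aborting the step under bash -e
            NEW_ADRS=$(git diff --name-only --diff-filter=A origin/main...HEAD | grep -E "^docs/adr/.*\.md$" || true)
            if [ -z "$NEW_ADRS" ]; then
              echo "PR title contains [decision] but no ADR file was added to docs/adr/"
              echo "Create an ADR before merging."
              exit 1
            fi
          fi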
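Shell inside YAML is awkward to test. The same rule can live in a small function that the workflow calls, which makes the edge cases easy to cover with unit tests. A minimal sketch; `adr_gate` is a hypothetical helper, and the title and the list of added files are assumed to be supplied by the CI job:

```python
def adr_gate(pr_title: str, added_files: list[str]) -> tuple[bool, str]:
    """Decide whether a PR passes the ADR rule.

    Returns (passed, reason). A PR tagged [decision] must add at least
    one Markdown file under docs/adr/.
    """
    if "[decision]" not in pr_title:
        return True, "no [decision] tag, ADR not required"
    new_adrs = [
        path for path in added_files
        if path.startswith("docs/adr/") and path.endswith(".md")
    ]
    if new_adrs:
        return True, f"ADR added: {', '.join(new_adrs)}"
    return False, "PR title contains [decision] but no ADR file was added to docs/adr/"


if __name__ == "__main__":
    passed, reason = adr_gate("[decision] Move slots to DynamoDB", ["src/slots.py"])
    print(passed, reason)
```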

Gate 3: Check README was updated if Dockerfile changed

YAML
  readme-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history so origin/main is available for the diff
      
      - name: Check README updated with Docker changes
        run: |
          CHANGED=$(git diff --name-only origin/main...HEAD)
          # "|| true" keeps an empty grep result from aborting the step under bash -e
          DOCKER_CHANGED=$(echo "$CHANGED" | grep -E "Dockerfile|docker-compose" || true)
          README_CHANGED=$(echo "$CHANGED" | grep -E "README" || true)
          
          if [ -n "$DOCKER_CHANGED" ] && [ -z "$README_CHANGED" ]; then
            echo "::warning::Docker files changed but README not updated."
            echo "If the setup steps changed, update the README."
            # This is a warning, not a failure; add exit 1 here if strict enforcement is wanted
          fi

Docs Review as Part of Code Review

Reviewers should be trained to ask one question: "If I joined this team tomorrow and read this code change, would I understand what it does and why?"

This means:

  • ADRs are reviewed as carefully as code
  • Diagram updates are verified against the actual changed code
  • Runbook updates for new failure modes are checked before merge

This is not overhead — it is investment. The time spent writing an ADR during review is recovered tenfold when the next person investigates the same question.


Tools by Stack

.NET / C#

| Purpose | Tool | How it works |
|---------|------|--------------|
| API docs | Swashbuckle / NSwag | Generates Swagger UI from XML comments + annotations |
| Type docs | XML documentation comments | Generates IntelliSense tooltips, exportable to HTML |
| Diagrams | Mermaid in Markdown | Renders in GitHub, Azure DevOps, Confluence |
| ADRs | Markdown in docs/adr/ | Just files, no tooling needed |
| Changelog | dotnet-releaser / manual CHANGELOG.md | Document what changed per version |

C#
/// <summary>
/// Books an appointment slot atomically.
/// Uses DynamoDB conditional writes to prevent double-booking.
/// See ADR-007 for why DynamoDB was chosen over PostgreSQL.
/// </summary>
/// <exception cref="SlotUnavailableException">
/// Thrown when the slot was taken between viewing and booking.
/// Caller should display "slot no longer available" and prompt re-selection.
/// </exception>
public async Task<Appointment> BookSlotAsync(BookSlotCommand command)

TypeScript / Node

| Purpose | Tool | How it works |
|---------|------|--------------|
| API docs | @asteasolutions/zod-to-openapi | Generates OpenAPI spec from Zod schemas |
| Type docs | TypeDoc | Generates HTML docs from TSDoc comments |
| Component docs | Storybook | Interactive component catalogue with props documentation |
| ADRs | Markdown in docs/adr/ | Same as any stack |
| Changelog | changesets | Structured changelog from changeset files committed in each PR |

TYPESCRIPT
/**
 * Books an appointment slot.
 * 
 * Uses DynamoDB TransactWrite to atomically reserve the slot and create
 * the appointment record. If the slot is taken between viewing and booking,
 * throws SlotUnavailableError — callers should handle this as a 409.
 *
 * @see {@link https://github.com/mybcat/docs/adr/0007-dynamodb-slots.md | ADR-007}
 */
export async function bookSlot(request: BookSlotRequest): Promise<Appointment>

Python

| Purpose | Tool | How it works |
|---------|------|--------------|
| API docs | FastAPI (automatic) | Swagger UI at /docs, ReDoc at /redoc |
| Code docs | Sphinx + autodoc | Generates HTML from docstrings |
| Type hints | Python type annotations | Documentation + static checking (runtime validation when used with Pydantic) |
| ADRs | Markdown in docs/adr/ | Same as any stack |

Python
def book_slot(request: BookSlotRequest) -> Appointment:
    """
    Book an appointment slot atomically.
    
    Uses DynamoDB TransactWrite to prevent double-booking.
    If the slot is taken between viewing and booking, raises SlotUnavailableError.
    
    Args:
        request: Contains slot_id, patient_id, reason, idempotency_key
        
    Returns:
        Confirmed appointment with confirmation_number
        
    Raises:
        SlotUnavailableError: Slot was taken (race condition). Caller should
            show "slot no longer available" and prompt re-selection.
        DuplicateBookingError: idempotency_key was already used for a 
            different slot. Return 422.
    """

React / Frontend

| Purpose | Tool | How it works |
|---------|------|--------------|
| Component docs | Storybook | Stories are both documentation and visual tests |
| Props docs | TypeDoc + TSDoc | Generates from TypeScript interfaces |
| Design system | Storybook Docs addon | MDX documentation alongside component stories |

TYPESCRIPT
interface AppointmentCardProps {
  /** The appointment to display. Must be a confirmed appointment — 
   *  pending/cancelled use different components. */
  appointment: ConfirmedAppointment;
  
  /** Called when the patient clicks "Cancel". 
   *  Cancellation happens asynchronously — component handles loading state. */
  onCancel: (appointmentId: string) => void;
  
  /** Show the provider name. Default: true. 
   *  Set false in contexts where provider is shown in parent (e.g. provider view). */
  showProvider?: boolean;
}

/**
 * Displays a confirmed appointment with options to cancel or reschedule.
 * 
 * @example
 * <AppointmentCard
 *   appointment={appointment}
 *   onCancel={handleCancel}
 * />
 */
export function AppointmentCard({ appointment, onCancel, showProvider = true }: AppointmentCardProps)

Storybook story for documentation:

TYPESCRIPT
// AppointmentCard.stories.tsx
import type { Meta, StoryObj } from '@storybook/react';
import { AppointmentCard } from './AppointmentCard';

const meta: Meta<typeof AppointmentCard> = {
  title: 'Booking/AppointmentCard',
  component: AppointmentCard,
  parameters: {
    docs: {
      description: {
        component: 'Displays a confirmed appointment. Use AppointmentSummaryCard for compact list views.'
      }
    }
  }
};

export const Default: StoryObj<typeof AppointmentCard> = {
  args: {
    appointment: {
      id: 'apt_123',
      confirmationNumber: 'MBC-2024-001234',
      date: '2024-03-15',
      time: '09:00',
      provider: 'Dr. Sarah Chen',
      reason: 'Annual checkup'
    }
  }
};

export const WithoutProvider: StoryObj<typeof AppointmentCard> = {
  args: { ...Default.args, showProvider: false },
  parameters: {
    docs: { description: { story: 'Used in provider dashboard where provider context is already shown.' } }
  }
};

Changelog — Telling Users What Changed

A changelog is documentation for people who use your system (external developers, product teams, operations) — not internal documentation.

Keep a CHANGELOG.md following the Keep a Changelog format:

MARKDOWN
# Changelog

All notable changes to the Appointment API are documented here.
Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/)

## [Unreleased]

### Added
- Patient can now attach a file (e.g. referral letter) to a booking request

## [2.4.0] - 2026-04-15

### Added
- `GET /appointments/{id}/history` — returns full status change history
- `reason` field on cancellation request (previously only captured internally)

### Changed
- Slot availability now returns up to 30 days ahead (previously 14 days)
- Confirmation emails now include ICS calendar attachment

### Fixed
- Race condition where two patients could book the same slot within 50ms
  (fixed with DynamoDB conditional write — see ADR-019)

### Deprecated
- `GET /slots?date=` parameter will be removed in v3.0. 
  Use `GET /slots?from=&to=` instead.

## [2.3.1] - 2026-03-28

### Fixed
- Notification Lambda was not retrying on Twilio 503 responses

Use changesets for automated changelog management in monorepos — developers add a changeset file in each PR describing the change, and the release pipeline assembles the CHANGELOG automatically.
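The assembly step is conceptually simple: collect the per-PR entries, group them by change type, and emit a release section. A minimal sketch of that idea, not a description of how changesets itself is implemented; `render_release` and the `(kind, text)` entry shape are assumptions made for illustration:

```python
from collections import defaultdict

# Section order recommended by Keep a Changelog
SECTION_ORDER = ["Added", "Changed", "Deprecated", "Removed", "Fixed", "Security"]


def render_release(version: str, date: str, changes: list[tuple[str, str]]) -> str:
    """Render one release section from (kind, text) change entries."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for kind, text in changes:
        if kind not in SECTION_ORDER:
            raise ValueError(f"unknown change type: {kind}")
        grouped[kind].append(text)

    lines = [f"## [{version}] - {date}"]
    for kind in SECTION_ORDER:
        if grouped[kind]:
            lines += ["", f"### {kind}"]
            lines += [f"- {text}" for text in grouped[kind]]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_release("2.3.1", "2026-03-28", [
        ("Fixed", "Notification Lambda was not retrying on Twilio 503 responses"),
    ]))
```

Entries keep the order in which they were merged within each section, while the sections themselves always appear in the same order.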


Real Company Examples

Stripe — Documentation as a Competitive Advantage

Stripe's API documentation is widely considered the best in the industry. Their approach:

  • Every API endpoint has a description, documentation for every parameter, and code examples in multiple languages
  • All documentation is generated from internal schemas — the same schemas that validate the actual API requests
  • A dedicated team (Stripe Developer Experience) maintains it
  • Breaking changes require documentation of migration paths before the change ships

Result: Developers integrate Stripe in hours, not days. This is a direct revenue driver — every hour of friction in integration is a potential lost customer.

GitHub — Versioned Deprecation Notices in the API

GitHub uses standardised HTTP response headers for deprecation:

HTTP
HTTP/1.1 200 OK
Deprecation: Sun, 31 Dec 2023 23:59:59 GMT
Sunset: Sun, 30 Jun 2024 23:59:59 GMT
Link: <https://docs.github.com/changes/2023-migration>; rel="deprecation"

Any client or SDK that reads these headers can surface a deprecation warning automatically. This is documentation embedded in the API protocol itself.
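On the client side, surfacing such a warning takes only a few lines. A minimal sketch, assuming the headers arrive as a plain mapping and the `Sunset` value is an HTTP-date as in the example above; `deprecation_notice` is a hypothetical helper, and only the `Sunset` date is parsed because the `Deprecation` value format varies between APIs:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def deprecation_notice(
    headers: Mapping[str, str], now: Optional[datetime] = None
) -> Optional[str]:
    """Return a human-readable warning if the response signals deprecation."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if "deprecation" not in lowered and "sunset" not in lowered:
        return None

    parts = ["endpoint is deprecated"]
    if "sunset" in lowered:
        # HTTP-dates are in GMT, so this yields a timezone-aware datetime
        sunset_at = parsedate_to_datetime(lowered["sunset"])
        now = now or datetime.now(timezone.utc)
        days_left = (sunset_at - now).days
        if days_left >= 0:
            parts.append(f"removal in {days_left} days")
        else:
            parts.append("sunset date has passed")
    if "link" in lowered:
        parts.append(f"see {lowered['link']}")
    return "; ".join(parts)
```

An HTTP wrapper could call this on every response and log the result once per endpoint.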

Amazon — Internal "You Build It, You Document It" Culture

Amazon's two-pizza team model means small teams own their services end-to-end. Each service has an internal wiki page called a "One-Pager" that describes:

  • What the service does (one paragraph)
  • Who owns it (current team, past owners)
  • Key design decisions
  • Known limitations
  • Alarm runbook

When you call an internal Amazon API and it is broken at 3am, the runbook tells you exactly who to call.

Netflix — Code Comments That Explain Scale Constraints

Netflix engineers write comments that explain the non-obvious performance constraints of their systems:

JAVA
// WARNING: This method is called ~40 million times per second across the fleet.
// Do not add any I/O, logging, or object allocation here.
// The 2019 outage (INC-4892) was caused by a debug log statement in this path.
// See https://internal/postmortems/2019-dec-outage
private boolean isEligibleForCaching(ContentId contentId) {

This is the correct use of a comment: explaining a non-obvious constraint (the scale) and a real incident that illustrates the consequence of violating it.


The Documentation Health Check

Run this quarterly on any repository you own:

1. Onboarding test
   └─ Can a developer who has never seen this service run it locally
      within 30 minutes using only the README?
      Pass: README is accurate and complete
      Fail: Update the README

2. Decision test
   └─ Pick any major architectural choice in this service (database, 
      framework, message broker, auth strategy).
      Can you find the written reason for that choice?
      Pass: ADR exists and is linked from the README or inline
      Fail: Create an ADR retroactively (better late than never)

3. API test
   └─ Is there an auto-generated API doc that you can point a 
      new integrator to?
      Pass: Swagger/OpenAPI accessible and accurate
      Fail: Add Swashbuckle/FastAPI docs or OpenAPI annotations

4. Diagram test
   └─ Is there a sequence diagram or architecture diagram for the 
      main flow of this service? Is it in the repository?
      Pass: Mermaid/C4 diagram in docs/ folder
      Fail: Draw or update the diagram in Mermaid and commit it

5. Runbook test
   └─ If this service paged you at 3am, would you know where to 
      start without asking anyone?
      Pass: Runbook exists and covers the top 3 alerts
      Fail: Write the runbook for at least the most common failure
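
Part of this checklist can be automated as a first pass. The sketch below only checks that the artefacts exist; it cannot tell you whether they are accurate, so the human tests above still apply. The file locations (`README.md`, `docs/adr/`, `openapi.yaml`, `.mmd` diagram files, `docs/runbook.md`) are assumptions about repository layout, and `documentation_health` is a hypothetical name.

```python
from pathlib import Path
from typing import Dict

# Assumed locations; adjust to match your repository's conventions.
API_SPEC_NAMES = ("openapi.yaml", "openapi.yml", "openapi.json")


def documentation_health(repo: Path) -> Dict[str, bool]:
    """Report which documentation artefacts are present in a repository."""
    docs = repo / "docs"
    adr_dir = docs / "adr"
    return {
        # 1. Onboarding test: is there a README at all?
        "readme": (repo / "README.md").is_file(),
        # 2. Decision test: is there at least one ADR?
        "adr": adr_dir.is_dir() and any(adr_dir.glob("*.md")),
        # 3. API test: is a spec file checked in?
        "api_spec": any((repo / name).is_file() for name in API_SPEC_NAMES),
        # 4. Diagram test: is there a Mermaid source file under docs/?
        "diagram": docs.is_dir() and any(docs.rglob("*.mmd")),
        # 5. Runbook test: does a runbook exist?
        "runbook": (docs / "runbook.md").is_file(),
    }
```

Running `documentation_health(Path("."))` from the repository root gives a dictionary of pass/fail flags that can be printed in a quarterly review or tracked over time.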

Interview Answers

"How do you keep documentation up to date?"

"The key insight is that documentation placed outside the repository is never updated — it has no reminder mechanism. I keep all technical documentation — ADRs, API specs, diagrams — inside the repository as Markdown files. This means every PR that changes the code is also the PR where docs are reviewed and updated.

For APIs, I use auto-generation: Swashbuckle in .NET or FastAPI's built-in docs generate the reference from the code itself, so the documented routes and request/response shapes stay in sync with what is deployed. For architecture decisions, I use ADRs: a short record of what was decided, why, and what the alternatives were. Six months later, when someone asks 'why did we use DynamoDB?', the answer is in a file.

I add a documentation checklist to the PR template so authors actively consider what needs updating. And for the most critical things — breaking API changes, missing runbooks — I add CI checks that enforce them automatically."
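
One way such a CI check could look is sketched below. It is a deliberately simple rule under stated assumptions: the `src/api/` and `docs/` prefixes are a hypothetical repository layout, and `missing_doc_update` is an illustrative name, not part of any CI product.

```python
from typing import Iterable, List

# Hypothetical layout: API code lives under src/api/, docs under docs/ or README.md.
API_PREFIXES = ("src/api/",)
DOC_PREFIXES = ("docs/", "README.md")


def missing_doc_update(changed_files: Iterable[str]) -> List[str]:
    """Return the changed API files if the change set touches no documentation.

    An empty list means the gate passes: either no API file changed,
    or at least one documentation file changed alongside it.
    """
    changed = list(changed_files)
    api_changes = [path for path in changed if path.startswith(API_PREFIXES)]
    docs_touched = any(path.startswith(DOC_PREFIXES) for path in changed)
    return [] if docs_touched else api_changes
```

In a pipeline, the input could come from `git diff --name-only origin/main...HEAD`, with the job failing whenever the returned list is non-empty. A rule this coarse will produce false positives on pure refactors, so an explicit opt-out label on the PR is probably worth having.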

"What is an ADR and why would you use one?"

"An ADR — Architecture Decision Record — is a short document that captures a technical decision: what was chosen, why, what alternatives were considered, and what the consequences are.

Code tells you what a system does. An ADR tells you why it does it that way. That 'why' is usually the most valuable information in a codebase, and it is also the first thing that disappears when senior engineers leave.

A well-maintained set of ADRs means a new developer can read through them and understand the architectural shape of the system and the constraints that drove it — without needing to ask questions that wake someone up."
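
To make the shape of an ADR concrete, here is a small sketch that renders one as a Markdown file. The section headings, the `docs/adr/NNNN-title.md` naming scheme, and the `ADR` class itself are one possible convention chosen for illustration, not a standard.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class ADR:
    """Hypothetical in-memory form of an Architecture Decision Record."""

    number: int
    title: str
    decision: str
    rationale: str
    alternatives: List[str] = field(default_factory=list)
    consequences: List[str] = field(default_factory=list)
    status: str = "Accepted"

    def filename(self) -> str:
        # Lower-case the title and collapse anything non-alphanumeric into hyphens.
        slug = re.sub(r"[^a-z0-9]+", "-", self.title.lower()).strip("-")
        return f"docs/adr/{self.number:04d}-{slug}.md"

    def to_markdown(self) -> str:
        lines = [
            f"# ADR-{self.number:04d}: {self.title}",
            "",
            f"Status: {self.status}",
            "",
            "## Decision",
            self.decision,
            "",
            "## Why",
            self.rationale,
            "",
            "## Alternatives considered",
            *[f"- {item}" for item in self.alternatives],
            "",
            "## Consequences",
            *[f"- {item}" for item in self.consequences],
        ]
        return "\n".join(lines) + "\n"
```

The four content fields map directly onto the definition above: what was chosen, why, what else was considered, and what follows from the choice.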

"What makes good API documentation?"

"Good API documentation has three properties: it is accurate, it is always up to date, and it includes examples.

Accuracy and freshness are solved by the same approach: generate the documentation from the code. Swashbuckle, FastAPI, and Zod-to-OpenAPI all generate documentation from your actual request/response types. If a type changes, the generated documentation changes with it.

Examples are the most important part and the most often skipped. A developer integrating your API does not want to read a parameter description for 20 minutes — they want to see a working request/response pair and copy from it. Every endpoint should have at least one realistic example, not a lorem ipsum placeholder."
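
The principle behind generating documentation from types can be shown without any framework. The sketch below derives a JSON-Schema-style description from a dataclass; it is a toy version of the idea, not how Swashbuckle, FastAPI, or Zod-to-OpenAPI work internally. `schema_for` and `CreateOrderRequest` are hypothetical names, and only four primitive field types are supported.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Any, Dict, get_type_hints

# Only primitive field types are handled in this sketch.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def schema_for(model: type) -> Dict[str, Any]:
    """Build a JSON-Schema-style object description from a dataclass."""
    hints = get_type_hints(model)
    properties: Dict[str, Any] = {}
    required = []
    for f in fields(model):
        properties[f.name] = {"type": _JSON_TYPES[hints[f.name]]}
        # A field with no default must be supplied by the caller.
        if f.default is MISSING and f.default_factory is MISSING:
            required.append(f.name)
    return {"type": "object", "properties": properties, "required": required}


@dataclass
class CreateOrderRequest:  # hypothetical request type
    customer_id: str
    quantity: int
    express: bool = False
```

Because the schema is computed from `CreateOrderRequest` itself, renaming a field or changing its type changes the output of `schema_for(CreateOrderRequest)` on the next build, with no separate document to remember.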


Summary: What to Do This Week

If you do nothing else, do these three things:

1. Add an ADR for the most recent significant decision your team made. It does not have to be perfect. It should answer: what did we decide, why, and what did we consider instead? File it in docs/adr/. This will be the single most useful thing you write this week.

2. Run the onboarding test on your main service README. Ask a colleague to run the service using only the README. Time them. If it takes more than 30 minutes or requires any verbal help, fix the README before writing anything else.

3. Add a documentation checklist to your PR template. Not a blocking requirement — just checkboxes. The act of reading the checklist before merging changes behaviour more than any policy document.

Documentation is not a separate activity from engineering. It is what separates a codebase that lasts 10 years from one that becomes a rewrite candidate after 3.
