Backend Systems · intermediate

How to Ace a Real-World Fullstack Developer Case Study: A Complete Walkthrough

A senior engineer's complete breakdown of a real fullstack hiring case — MoSCoW prioritization, architecture decisions, AI integration, and what evaluators actually look for.

Learnixo · May 7, 2026 · 20 min read
System Design · Architecture · AI · REST API · PostgreSQL · Docker · React

Most technical hiring processes test whether you can code. The best ones test whether you can think.

This article walks through a real fullstack developer case study given by a Nordic startup — anonymized, translated, and fully deconstructed. We cover every decision: what to build, what to cut, how to structure the architecture, and what the evaluators are actually reading between the lines of your submission.

If you're preparing for senior fullstack interviews, or you want to understand how strong engineers approach scoped product problems, this is the full breakdown.


The Case Study: GarmentLens

What They Asked For

Build an internal intake tool for a garment repair business. When a garment arrives for repair (physically or by post), a staff member takes a photo, AI classifies the garment (type, damage, material, complexity), and a list shows all garments awaiting assessment.

Users:

  • Intake staff — scan garments quickly, trust AI for rough classification, override if wrong
  • Two part-time tailors — see the queue, pick up the next garment to work on
  • Shop manager — wants reporting and statistics eventually

This is NOT:

  • A customer-facing system (no customer login, no payments)
  • A full CRM
  • A production SaaS
  • A replacement for the company's actual platform

You're building a focused MVP of the intake flow — one staff member, one shop, one purpose.


Why This Case Is Different

The brief explicitly says:

"A significant part of the task is to discard requirements, prioritise the remaining ones, and justify why. Scope management is half the job at a startup."

You get a long requirements list. Some are pre-categorised. Some are deliberately wrong for this product. Your job is to recognise them.

This isn't a "build everything" task. It's a "demonstrate judgement" task.


The Full Requirements List

Here's every requirement, as given. Some are pre-categorised by the evaluators. The rest you must categorise yourself.

Format:

  • 🔒 Locked Must — non-negotiable
  • 🟡 Pre-categorised — their current placement, you may argue to move
  • Uncategorised — you assign Must / Should / Nice / Will-not

Core

| ID | Requirement | Given Category |
|----|-------------|----------------|
| F1 | User can upload a photo of a garment | 🔒 Must |
| F2 | Photo stored in S3-compatible object storage (MinIO locally) | 🟡 Must |
| F3 | Multimodal AI classifies the image (garment type, damage, material, complexity) | 🟡 Must |
| F4 | List of all garments with classification visible | 🔒 Must |
| F5 | Image metadata persisted in database | 🟡 Must |
| F6 | README so a colleague can run the app from scratch | 🔒 Must |
| F7 | Classification result appears live in UI — user should not need to refresh | 🟡 Must |
| F8 | Company logo in header and brand colours on primary buttons | 🔒 Must |

Domain and Data Modelling

| ID | Requirement | Given Category |
|----|-------------|----------------|
| D1 | User can manually override AI classification | ⚪ |
| D2 | AI returns confidence per attribute (0–1) | ⚪ |
| D3 | Garment can be marked as "done" and falls out of the active queue | ⚪ |

AI Architecture

| ID | Requirement | Given Category |
|----|-------------|----------------|
| AI1 | AI returns structured JSON, not free text | 🟡 Should |
| AI2 | Graceful degradation when AI fails (timeout, invalid JSON) | ⚪ |
| AI3 | Classification happens async — user doesn't wait | ⚪ |
| AI4 | AI must achieve ≥95% accuracy on garment type | ⚪ |

Security

| ID | Requirement | Given Category |
|----|-------------|----------------|
| S1 | File size and MIME validation on upload | ⚪ |
| S2 | Rate limiting on the upload endpoint | ⚪ |
| S3 | Login — intake staff must authenticate | ⚪ |
| S4 | EXIF GPS data stripped before storage | ⚪ |

UX

| ID | Requirement | Given Category |
|----|-------------|----------------|
| UX1 | Responsive UI targeting mobile devices | 🟡 Should |
| UX2 | Progress indicator during upload and classification | ⚪ |
| UX3 | WCAG 2.1 AA accessibility compliance | ⚪ |

Mobile Strategy

| ID | Requirement | Given Category |
|----|-------------|----------------|
| M1 | PWA — installable, offline upload queuing | ⚪ |
| M2 | Capacitor wrapper | ⚪ |
| M3 | Native iOS (Swift) + Android (Kotlin) apps for intake staff | ⚪ |

Infrastructure

| ID | Requirement | Given Category |
|----|-------------|----------------|
| I1 | Docker Compose starts entire stack with one command | 🟡 Should |
| I2 | At least one meaningful automated test | ⚪ |
| I3 | Structured logging (JSON) | ⚪ |

Scale and Integrations

| ID | Requirement | Given Category |
|----|-------------|----------------|
| X1 | Multi-tenant support (multiple shops sharing one instance) | ⚪ |
| X2 | Webhook API so external systems are notified when garments are classified | ⚪ |
| X3 | Real-time collaboration — two staff editing the same garment simultaneously | ⚪ |
| X4 | Blockchain-based immutable audit log for repair history | ⚪ |
| X5 | Customer can upload photos from their own phone via shared link before drop-off | ⚪ |
| X6 | Price catalogue per repair type visible in UI with fixed prices | ⚪ |
| X7 | Checkout flow where customer pays via mobile payment | 🟡 Nice to have |


The Model Answer: Full MoSCoW with Reasoning

This is where most candidates fail. They categorise without explaining. The evaluators said clearly: we read the reasons more than the categories.

Must Have

F1 — Upload garment photo Non-negotiable. This is the entry point of the entire system. Without it nothing else exists.

F4 — List of garments with classifications The read side of the intake workflow. Tailors use this to see their queue. Without it the tool has no output.

F6 — README The case explicitly locked this. An internal tool that only one person can run is useless the moment they're sick. Minimum viable documentation is a product requirement, not polish.

F3 — AI classification This is the core value proposition — saves 30–60 seconds per garment. Without it you've built an expensive photo album.

F5 — Metadata persistence Photos need context. Without DB persistence, classifications vanish on restart. The tool needs to outlive a single session.

AI1 — Structured JSON output from AI Structured output isn't optional if you're displaying classifications. Free text can't be stored per attribute, can't be overridden cleanly, can't be filtered. A well-designed prompt with enforced response structure (response_format or a tool schema, depending on the provider) costs nothing extra.

AI2 — Graceful degradation AI calls fail. Networks time out. Models return malformed JSON. A tool that crashes on the first AI failure isn't usable by non-technical staff. This is not a nice-to-have — it's the difference between "staff trusts the tool" and "staff avoids the tool". The brief left this uncategorised; it belongs firmly in Must.

S1 — File size and MIME validation Any endpoint that accepts user file uploads without validation is a security hole. A JPEG that's actually a PHP shell is a real attack vector. This takes 10 lines and prevents a whole category of exploits.

D1 — Manual override of AI classification AI is not truth. A wrinkled grey wool coat might be misclassified as "damaged polyester". The whole premise of "trust AI for rough classification, override if wrong" from the brief makes this Must have. Without override, you're asking staff to trust an AI they cannot correct.

Should Have

F2 — S3-compatible object storage (MinIO) Files need to go somewhere. MinIO via Docker is excellent for local development, but storing to local disk is also valid for an MVP. We move this from its given Must down to Should because the abstraction matters more than the specific implementation — a StorageService interface means you can swap MinIO for S3 later. Don't skip the interface.

F7 — Live UI update without refresh Real-time feels premium, but the actual need is: "I want to see the result when it's ready". Polling every 2 seconds saves a couple hundred lines of WebSocket infrastructure and solves the same problem at this scale. Move from the given Must to Should — implement with polling, mention WebSockets as the upgrade path in DECISIONS.md.

I1 — Docker Compose An internal tool that requires 45 minutes of manual setup will be abandoned. One-command startup is close to Must for developer experience. Keep it here but treat it with urgency.

D2 — Confidence score per attribute "AI says it's denim, confidence 0.62" is genuinely useful for intake staff. If confidence is low, they know to look more carefully. This is low implementation cost (just add a field to the JSON schema) and high product value. Good Should.

D3 — Mark garment as done The queue has no exit condition without this. A garment list that grows forever is not a queue — it's a log. A simple boolean is_complete field is 30 minutes of work, so promote this to Must if time allows.

UX1 — Responsive, mobile-first UI Intake staff are on their feet, probably on a phone or tablet. This isn't polish — it's the usage context. However, basic responsive Tailwind/flex layouts are not extra work. Do it while building, not as a separate task.

UX2 — Progress indicator AI classification takes 2–5 seconds. A blank screen that eventually shows results will feel broken. A spinner is 10 lines. High perceived performance gain for minimal effort. Worth it.

I2 — At least one automated test The brief says "at least one meaningful test". This signals whether you have test instincts at all. Test the AI classification parsing logic — the function that takes raw JSON and maps it to a domain object. That's the most likely place bugs hide and the easiest to unit test.
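As a sketch of what that test could look like — `parse_classification` is a hypothetical helper name, not code from the brief — assuming every raw AI response is mapped through one function before it touches the domain model:

```python
import json

def parse_classification(raw: str) -> dict:
    """Map a raw AI JSON string to a dict, rejecting incomplete responses."""
    data = json.loads(raw)
    required = {"garment_type", "damage_description", "material", "complexity"}
    missing = required - data.keys()
    if missing:
        raise ValueError(f"AI response missing attributes: {missing}")
    return data

def test_rejects_partial_response():
    # A response missing 'material' must fail loudly, not be stored half-empty
    raw = '{"garment_type": "jacket", "damage_description": "torn seam", "complexity": "low"}'
    try:
        parse_classification(raw)
        assert False, "expected ValueError"
    except ValueError:
        pass
```

Under pytest this runs as-is; the point is that the test exercises the exact boundary where untrusted external data enters your domain model.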

Nice to Have

F8 — Company logo and brand colours This is locked as Must, which we'd respectfully challenge. This is an internal tool. No customer sees it. No investor demos it. An intake staff member scanning garments doesn't care about hex colours. We'd argue in "Questions to the Evaluators" that this should be Should at most, and raise: what does branding polish cost relative to workflow reliability?

S3 — Authentication / login Internal tool, single shop, known staff. Session tokens, even basic HTTP auth, are trivial to add but not zero cost. For MVP the risk profile is low — this runs on a local network or behind a VPN. Flag in REFLECTION that this is the first security upgrade for any production deployment.

S4 — Strip EXIF GPS data Privacy instinct: good. Intake photos are taken at the shop — the GPS data is "23 High Street, Oslo" which is already public. Risk is minimal for an internal tool. Still worth doing as it signals privacy awareness. 5 lines with sharp or piexifjs. Nice to have.

I3 — Structured JSON logging Good observability instinct, but console.log is fine for MVP. A pino or structlog integration is 30 minutes of work and makes debugging much easier in staging. Nice to have.
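Even without pulling in pino or structlog, a single stdlib helper gets you parseable log lines. A minimal sketch — `log_event` is illustrative, not a library API:

```python
import json
import sys
import time

def log_event(event: str, **fields) -> str:
    """Emit one JSON object per log line; returns the line so it's testable."""
    record = {"ts": time.time(), "event": event, **fields}
    line = json.dumps(record)
    print(line, file=sys.stderr)  # stderr keeps logs out of piped stdout
    return line
```

If every call site goes through the helper, swapping it for structlog later is a one-file change.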

Will Not Have

AI3 — Async classification (queue/worker architecture) At 200 garments/week, you're scanning roughly one garment every 2 minutes during business hours. Synchronous AI calls with a loading spinner are completely appropriate. A Redis queue + worker process adds two services, a message broker, error handling, retry logic, and dead-letter queues — for a tool that processes one item at a time. This is over-engineering by an order of magnitude.

AI4 — ≥95% accuracy on garment type This is a non-measurable requirement with no baseline dataset. You cannot test this, you cannot commit to it, you cannot measure it in 4 hours. The correct response: "AI accuracy is an empirical property that requires a labelled dataset and evaluation harness we don't have. We stub the interface, validate output structure, and accept that accuracy improves with model and prompt iteration."
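For completeness: the metric itself is trivial once a labelled dataset exists — it's the labels that are expensive. A sketch:

```python
def garment_type_accuracy(predictions: list[str], labels: list[str]) -> float:
    """Fraction of garments whose predicted type matches the human label."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("need equal-length, non-empty prediction and label lists")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```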

UX3 — WCAG 2.1 AA accessibility Legitimate requirement for public-facing products. For an internal tool used by three known staff members, formal compliance obligations don't apply in the same way. Will not have for MVP — note that semantic HTML and ARIA labels are good practice regardless.

M3 — Native iOS + Android apps You're building an internal intake tool for a shop with 3 users. Native mobile apps require two separate codebases, App Store/Play Store distribution, and ongoing maintenance. A responsive PWA covers the mobile use case with one codebase. Native apps are the wrong tool.

X1 — Multi-tenant support One shop. The brief says so explicitly. Multi-tenancy now means adding shop_id to every DB query, every API endpoint, every storage path, every session — for a problem that doesn't exist yet. Add it when you have a second customer.

X2 — Webhook API There are no stated external consumers for this event. Building a webhook delivery system (with retry logic, signature verification, subscription management) for an undefined consumer is pure speculation. Will not have.

X3 — Real-time collaboration Two tailors editing the same garment record simultaneously requires operational transforms or CRDTs. The brief says tailors "pick up the next garment" — this is a queue, not a collaborative document. The shape of the workflow doesn't justify the infrastructure.

X4 — Blockchain audit log This is the test requirement. Blockchain is an append-only distributed ledger — solving a Byzantine fault tolerance problem between untrusting parties. Your parties are 3 shop employees on the same LAN. A repair_events table with TIMESTAMP DEFAULT NOW() is immutable enough and costs zero infrastructure. Blockchain here is technology for technology's sake.

X5 — Customer uploads via shared link This changes the user model. Now you have a public-facing upload endpoint, customer sessions, link expiry, link revocation, photo privacy across customers. This is a different product from what was scoped. Will not have.

X6 — Fixed price catalogue AI classifies complexity and damage type. Prices for repairs depend on complexity, material, tailor, and judgement. A fixed price catalogue contradicts the variable nature of AI-assessed repair work. Interesting future feature, but doesn't belong in intake MVP.

X7 — Customer payment checkout This is explicitly called out as questionable by the evaluators themselves — "Is this the same product we defined?" No. This is scope creep from an intake tool to an e-commerce platform. Will not have.


Architecture Decision: How to Build It

Stack Choice

Backend: FastAPI (Python) or Node.js/Express (TypeScript)

FastAPI is the natural choice if you're calling vision AI APIs — the Python AI ecosystem (Anthropic, OpenAI SDKs) is first-class, async is built in, and Pydantic gives you automatic JSON schema validation for both request bodies and AI outputs.

TypeScript + Express works fine too — the Anthropic SDK has full TypeScript support.

Frontend: React + TypeScript + Vite

The evaluator prefers React. Use Vite for fast startup, Tailwind for utility-first responsive layout. No server-side rendering needed — this is an internal tool.

Database: PostgreSQL

The garment record is structured. Classifications map to columns. Status is an enum. SQL is the right tool.

Storage: MinIO via Docker

S3-compatible locally, swappable to S3 in production. Wrap it in a StorageService interface.

Data Model

SQL
CREATE TYPE garment_status AS ENUM ('pending', 'classified', 'in_progress', 'done');

CREATE TABLE garments (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  image_key   TEXT NOT NULL,           -- path in object storage
  image_url   TEXT NOT NULL,           -- presigned URL or public URL
  status      garment_status DEFAULT 'pending',
  uploaded_at TIMESTAMPTZ DEFAULT NOW(),
  
  -- AI classification (nullable until classified)
  garment_type        TEXT,
  damage_description  TEXT,
  material            TEXT,
  complexity          TEXT,           -- low / medium / high
  ai_confidence       JSONB,          -- { "garment_type": 0.92, "material": 0.71 }
  classified_at       TIMESTAMPTZ,
  
  -- Manual override
  override_type       TEXT,
  override_material   TEXT,
  override_complexity TEXT,
  overridden_by       TEXT,
  overridden_at       TIMESTAMPTZ,
  
  notes               TEXT
);

AI Integration

Define the interface first — this is what the brief means by "the architecture around the model is what matters":

Python
from pydantic import BaseModel
from typing import Optional
from abc import ABC, abstractmethod

class ClassificationResult(BaseModel):
    garment_type: str        # "jacket", "trousers", "dress"
    damage_description: str  # "torn seam at left shoulder"
    material: str            # "wool", "denim", "polyester blend"
    complexity: str          # "low" | "medium" | "high"
    confidence: dict[str, float]  # per-attribute confidence scores
    raw_response: Optional[str] = None

class GarmentClassifier(ABC):
    @abstractmethod
    async def classify(self, image_bytes: bytes) -> ClassificationResult:
        pass

Stub implementation (for development without API costs):

Python
import asyncio

class StubClassifier(GarmentClassifier):
    async def classify(self, image_bytes: bytes) -> ClassificationResult:
        await asyncio.sleep(1.5)  # simulate latency
        return ClassificationResult(
            garment_type="jacket",
            damage_description="torn inner lining at left pocket",
            material="wool blend",
            complexity="medium",
            confidence={"garment_type": 0.91, "material": 0.78, "complexity": 0.65}
        )

Claude vision implementation:

Python
import anthropic
import base64
import json
import re

class ClaudeClassifier(GarmentClassifier):
    def __init__(self):
        # AsyncAnthropic, so the API call doesn't block the event loop
        self.client = anthropic.AsyncAnthropic()
    
    async def classify(self, image_bytes: bytes) -> ClassificationResult:
        image_b64 = base64.standard_b64encode(image_bytes).decode("utf-8")
        
        message = await self.client.messages.create(
            model="claude-opus-4-7",  # substitute the current vision-capable model
            max_tokens=512,
            messages=[{
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": "image/jpeg",
                            "data": image_b64,
                        },
                    },
                    {
                        "type": "text",
                        "text": """Classify this garment for a repair shop intake system.

Return ONLY valid JSON with this exact structure:
{
  "garment_type": "one of: jacket, trousers, dress, shirt, skirt, coat, other",
  "damage_description": "brief description of visible damage or wear",
  "material": "primary material if identifiable, or 'unknown'",
  "complexity": "one of: low, medium, high",
  "confidence": {
    "garment_type": 0.0-1.0,
    "damage_description": 0.0-1.0,
    "material": 0.0-1.0,
    "complexity": 0.0-1.0
  }
}"""
                    }
                ],
            }]
        )
        
        raw = message.content[0].text
        
        # Robust JSON extraction: the model may wrap the JSON in explanation text
        match = re.search(r'\{[\s\S]*\}', raw)
        if not match:
            raise ValueError(f"No JSON in AI response: {raw}")
        
        data = json.loads(match.group())
        return ClassificationResult(**data, raw_response=raw)

Graceful degradation (AI2 — moved to Must):

Python
import asyncio

async def classify_with_fallback(
    classifier: GarmentClassifier,
    image_bytes: bytes
) -> tuple[ClassificationResult | None, str]:
    try:
        result = await asyncio.wait_for(
            classifier.classify(image_bytes),
            timeout=15.0
        )
        return result, "classified"
    except asyncio.TimeoutError:
        return None, "classification_timeout"
    except ValueError:  # malformed or missing JSON in the AI response
        return None, "classification_parse_error"
    except Exception:
        return None, "classification_error"

Upload Endpoint with Security

Python
from io import BytesIO
from uuid import uuid4
import asyncio

from fastapi import UploadFile, HTTPException
from PIL import Image
import magic  # python-magic: MIME detection from actual bytes

ALLOWED_MIME_TYPES = {"image/jpeg", "image/png", "image/webp"}
MAX_FILE_SIZE = 10 * 1024 * 1024  # 10MB

@router.post("/garments/upload")
async def upload_garment(file: UploadFile):
    # Size check
    content = await file.read()
    if len(content) > MAX_FILE_SIZE:
        raise HTTPException(413, "File too large (max 10MB)")
    
    # MIME validation: check actual bytes, not the filename extension.
    # A renamed .php file still has PHP magic bytes.
    detected_mime = magic.from_buffer(content, mime=True)
    if detected_mime not in ALLOWED_MIME_TYPES:
        raise HTTPException(415, f"Unsupported file type: {detected_mime}")
    
    # Strip EXIF (including GPS): re-encoding with Pillow drops the metadata
    img = Image.open(BytesIO(content))
    clean_buffer = BytesIO()
    img.save(clean_buffer, format=img.format or "JPEG")
    clean_bytes = clean_buffer.getvalue()
    
    # Store to MinIO, keyed by the detected format rather than a hardcoded .jpg
    ext = (img.format or "jpeg").lower()
    key = f"garments/{uuid4()}.{ext}"
    storage.put_object(key, clean_bytes)
    
    # Save to DB with status=pending
    garment = await db.create_garment(image_key=key)
    
    # Classify in the background: respond immediately, the UI polls for the result
    asyncio.create_task(classify_and_update(garment.id, clean_bytes))
    
    return {"id": garment.id, "status": "pending"}
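The `classify_and_update` task the endpoint spawns isn't specified in the brief. One possible shape, with the classifier and DB helper injected so the task stays testable (`update_garment` is an assumed interface, not a real library call):

```python
import asyncio

async def classify_and_update(garment_id, image_bytes, classify, update_garment):
    """Background task: classify the image, then persist either result or failure.

    `classify` is any async callable returning (result_dict | None, status_str),
    e.g. classify_with_fallback wrapped to emit plain dicts.
    """
    result, status = await classify(image_bytes)
    if result is None:
        # Surface the failure in the UI instead of leaving the row 'pending' forever
        await update_garment(garment_id, status="error", notes=status)
    else:
        await update_garment(garment_id, status="classified", **result)
    return status
```

In the endpoint the last two arguments would be bound up front (e.g. with functools.partial), keeping the call site as classify_and_update(garment.id, clean_bytes).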

Polling for Live Updates (F7)

Rather than WebSockets, use simple polling. 80% of the real-time feel for 5% of the complexity:

TypeScript
// React — poll for classification result
function useGarmentStatus(garmentId: string) {
  const [garment, setGarment] = useState<Garment | null>(null);

  useEffect(() => {
    if (!garmentId) return;
    
    const poll = setInterval(async () => {
      const data = await fetch(`/api/garments/${garmentId}`).then(r => r.json());
      setGarment(data);
      
      if (data.status !== "pending") {
        clearInterval(poll); // stop polling once classified
      }
    }, 2000); // every 2 seconds
    
    return () => clearInterval(poll);
  }, [garmentId]);

  return garment;
}

In DECISIONS.md, note: "Polling chosen over WebSockets for simplicity. At 200 garments/week, WebSocket infrastructure is not justified. Upgrade path: Server-Sent Events (SSE) first, WebSockets only if bidirectional communication is ever needed."


Docker Compose Setup

YAML
version: "3.9"
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_DB: garmentlens
      POSTGRES_USER: app
      POSTGRES_PASSWORD: dev_password
    ports: ["5432:5432"]
    volumes: ["postgres_data:/var/lib/postgresql/data"]

  minio:
    image: minio/minio
    command: server /data --console-address ":9001"
    ports: ["9000:9000", "9001:9001"]
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    volumes: ["minio_data:/data"]

  backend:
    build: ./backend
    ports: ["8000:8000"]
    environment:
      DATABASE_URL: postgresql://app:dev_password@db:5432/garmentlens
      MINIO_ENDPOINT: minio:9000
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
      USE_STUB_CLASSIFIER: ${USE_STUB_CLASSIFIER:-false}
    depends_on: [db, minio]

  frontend:
    build: ./frontend
    ports: ["3000:3000"]
    environment:
      VITE_API_URL: http://localhost:8000

volumes:
  postgres_data:
  minio_data:

The Reflection — What Actually Matters

The brief says this is the single most important document. Here's what a strong reflection looks like:

Plan vs. Reality

"I scoped D3 (mark as done) as Should but ran out of time — it stayed uncoded. In hindsight this should have been Must: a queue with no exit condition is a log, not a queue. I underestimated how long the MinIO bucket creation script would take to get right."

What Surprised You

"MIME validation with actual byte inspection was faster than I expected — python-magic just works. EXIF stripping with Pillow was also trivial. I'd estimated 45 minutes for S1+S4 combined; it took 20."

Biggest Security Weakness

"The upload endpoint accepts files and stores them to MinIO before MIME validation is complete in the async flow. There's a brief window where an unvalidated file exists in storage. Fix: validate before storing, not after."

Where Does It Break First at Scale

"At 50,000 images/month (~12/hour), the synchronous classification creates a bottleneck if AI response time degrades. The fix is exactly AI3 — a background queue. I deliberately didn't build it because 200 garments/week doesn't need it, but I'd add it at 2,000/week."


What Evaluators Actually Read

When they score your submission, they're asking:

In PLAN.md:

  • Did you catch X4 (blockchain) as the decoy? Almost every submission misses at least one.
  • Did you challenge F8 (branding) or just accept it?
  • Are your questions specific ("what confidence threshold triggers the override prompt?") or generic ("what is the tech stack?")?

In the code:

  • Does the AI interface allow swapping implementations? (Dependency injection, not hardcoded)
  • Does the upload endpoint validate MIME from bytes, not filename?
  • Is classification failure handled visibly in the UI, not silently dropped?

In REFLECTION.md:

  • Did you describe the security weakness accurately? (Not "it could be more secure" but "specifically, an attacker could X by doing Y via Z")
  • Is your time estimate honest? Padding is obvious. So is under-reporting.
  • Did you learn something? Or just summarise what you built?

The Meta-Lesson

This case study is testing something deeper than technical skill.

At a startup with 3 engineers and 200 garments a week, every hour you spend on blockchain audit logs or native iOS apps is an hour you didn't spend making intake staff 30 seconds faster per scan. That's the real product value.

The engineers who score highest on this case aren't the ones who implement the most requirements. They're the ones who identify the two or three things that actually move the needle and execute those cleanly while clearly articulating why everything else was deliberately left out.

That judgement — scope management, knowing when "good enough" is genuinely good enough, and saying so with confidence — is harder to teach than any technical skill. It's what the case is measuring.

"Ship over polish — a working core with honest documentation beats half-finished ambition."

That line is the whole brief.
