
MedScribe AI

Local-first AI clinical documentation — speech to structured FHIR notes

2–3 hours to set up locally · 17 technologies · 11 guided steps

About This Project

MedScribe AI is a privacy-first healthcare platform that turns doctor-patient conversations into structured clinical notes using local AI. It transcribes audio with Whisper (running on-premises), generates SOAP/clinical notes with a local LLM via Ollama, and exports in FHIR R4, HL7 v2, and KITH XML formats — all without a single byte of patient data leaving the hospital.
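The end-to-end flow described above can be sketched in a few lines of Python. This is an illustrative outline only, not MedScribe's actual code: the function names and return shapes are hypothetical, and the transcription and LLM calls are stubbed out so the transcribe → summarize flow is visible without any setup.

```python
# Illustrative pipeline sketch. Function names and return shapes are
# hypothetical, not MedScribe's actual API; model calls are stubbed so the
# audio -> transcript -> SOAP-note flow runs without any models installed.

def transcribe(audio_path: str) -> str:
    """Stub for on-prem speech-to-text (the real project uses faster-whisper)."""
    return "Patient reports a mild headache for three days. No fever."

def generate_soap_note(transcript: str) -> dict:
    """Stub for local LLM note generation (the real project uses Ollama)."""
    return {
        "subjective": transcript,
        "objective": "Vitals within normal limits.",
        "assessment": "Tension-type headache, likely.",
        "plan": "Hydration, rest; follow up in one week.",
    }

def run_pipeline(audio_path: str) -> dict:
    transcript = transcribe(audio_path)    # 1. audio -> text, fully on-prem
    return generate_soap_note(transcript)  # 2. text -> structured SOAP note

note = run_pipeline("consult.wav")
print(sorted(note))  # ['assessment', 'objective', 'plan', 'subjective']
```

In the real system the note would then pass through safety checks and clinician review before being exported to FHIR/HL7, but the stages compose in exactly this order.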

What You'll Learn

Understand how to build privacy-first AI systems that run fully on-premises
Learn to use Whisper for production speech-to-text in Python
Build a FastAPI backend with async SQLAlchemy and Pydantic v2
Integrate Ollama to run local LLMs in a web application
Implement multi-format healthcare data export (FHIR, HL7)
Apply human-in-the-loop patterns for high-stakes AI workflows

Key Features

Local speech-to-text with faster-whisper — audio never leaves the server
Structured SOAP note generation with local LLM (Ollama)
5 clinical specialty templates: GP, psychiatry, surgery, emergency, pediatrics
5 AI agents: diagnosis coding, referral drafting, task creation, care planning, patient letters
RAG Q&A over patient history with source citations
Multi-format export: FHIR R4, HL7 v2, KITH XML
Hallucination detection + confidence scoring on every AI output
Human-in-the-loop: all AI actions require clinician approval
GDPR-compliant auto-purge after EPJ transfer
35+ REST/WebSocket API endpoints
Role-based access control (RBAC) with JWT auth
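To make the multi-format export concrete, here is a rough sketch of wrapping a SOAP note in a minimal FHIR R4 Composition resource. The field names follow the public FHIR R4 specification, but the helper itself is hypothetical and far simpler than a production `fhir.py` would be.

```python
import json

def soap_to_fhir_composition(note: dict, patient_id: str, date: str) -> dict:
    """Wrap a SOAP note dict in a minimal FHIR R4 Composition.

    Sketch only: a real exporter would also populate author, encounter,
    identifiers, and richer XHTML narratives, and validate the result.
    """
    return {
        "resourceType": "Composition",
        "status": "final",
        "type": {  # LOINC 11506-3: "Progress note"
            "coding": [{"system": "http://loinc.org", "code": "11506-3"}]
        },
        "subject": {"reference": f"Patient/{patient_id}"},
        "date": date,
        "title": "Clinical consultation note",
        "section": [
            {
                "title": heading.capitalize(),
                "text": {
                    "status": "generated",
                    "div": f'<div xmlns="http://www.w3.org/1999/xhtml">{body}</div>',
                },
            }
            for heading, body in note.items()
        ],
    }

note = {"subjective": "Mild headache for three days.", "plan": "Rest and hydration."}
resource = soap_to_fhir_composition(note, "12345", "2025-01-15")
print(json.dumps(resource, indent=2))
```

The same note dict could feed sibling serializers for HL7 v2 (pipe-delimited segments) and KITH XML, which is why keeping the note as structured data until the last step pays off.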

Project Structure

directory tree
MedScribe-AI/
├── medscribe/              # Python backend (FastAPI)
│   ├── api/                # Route handlers (35+ endpoints)
│   ├── agents/             # AI agents (diagnosis, referral, etc.)
│   ├── models/             # SQLAlchemy database models
│   ├── schemas/            # Pydantic request/response schemas
│   ├── services/           # Business logic (transcription, notes, RAG)
│   │   ├── whisper.py      # Speech-to-text with faster-whisper
│   │   ├── ollama.py       # Local LLM integration
│   │   ├── fhir.py         # FHIR R4 export
│   │   └── safety.py       # Hallucination detection
│   └── main.py             # FastAPI app + startup
├── frontend/               # React 19 + TypeScript + Vite
│   ├── src/
│   │   ├── components/     # UI components
│   │   ├── pages/          # Route pages
│   │   └── api/            # API client
│   └── package.json
├── tests/                  # 38 pytest unit tests
├── k8s/                    # Kubernetes manifests
├── .env.example
└── pyproject.toml
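The tree above names a `safety.py` for hallucination detection. Its internals aren't shown here, but one common approach is to flag note sentences whose content words never appear in the transcript. A deliberately simplified, hypothetical sketch of that idea:

```python
import re

# Tiny stopword list for the sketch; a real system would use a proper one.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "for", "with", "no", "is", "to", "in"}

def content_words(text: str) -> set:
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def grounding_score(note_sentence: str, transcript: str) -> float:
    """Fraction of the sentence's content words found in the transcript.

    Simplified sketch: real detectors use embeddings, NER, and clinical
    ontologies rather than bag-of-words overlap.
    """
    words = content_words(note_sentence)
    if not words:
        return 1.0
    return len(words & content_words(transcript)) / len(words)

transcript = "Patient reports mild headache for three days, no fever or nausea."
ok = grounding_score("Mild headache for three days.", transcript)
bad = grounding_score("Patient has a fractured wrist.", transcript)
print(round(ok, 2), round(bad, 2))  # 1.0 0.25
```

A score below some threshold would surface the sentence to the clinician for review, which is how confidence scoring and the human-in-the-loop requirement fit together.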

Setup Guide

1

Clone the repository

Clone MedScribe AI from GitHub and navigate into the project directory.

bash
git clone https://github.com/asmanasir/MedScribe-AI.git
cd MedScribe-AI
2

Create a Python virtual environment

Always use a venv to keep project dependencies isolated from your system Python.

bash
# Create the virtual environment
python -m venv .venv

# Activate it — Linux/macOS:
source .venv/bin/activate

# Activate it — Windows:
.venv\Scripts\activate
3

Install Python dependencies

Install the project with dev and local AI dependencies. This includes FastAPI, SQLAlchemy, faster-whisper, and all other backend packages.

bash
pip install -e ".[dev,local]"
4

Set up environment variables

Copy the example .env file and review the configuration. For local development the defaults work out of the box.

bash
cp .env.example .env

# Open .env in your editor and review the settings
# Key variables:
# DATABASE_URL=sqlite:///./medscribe.db   (default — SQLite for dev)
# JWT_SECRET_KEY=your-secret-key-here
# OLLAMA_BASE_URL=http://localhost:11434
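The backend presumably reads these values at startup; since the project lists Pydantic v2, pydantic-settings is a likely mechanism. As a rough illustration of what such loading does, here is a stdlib-only sketch that parses simple KEY=VALUE lines from a .env file. This is not MedScribe's actual loader.

```python
import os

def load_dotenv(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines into os.environ (existing vars win).

    Stdlib-only sketch; the real app likely relies on pydantic-settings,
    which also handles type coercion and validation.
    """
    loaded = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            key, value = key.strip(), value.strip()
            loaded[key] = value
            os.environ.setdefault(key, value)  # don't clobber real env vars
    return loaded

# Demo: write and read back a tiny .env file
with open("demo.env", "w") as fh:
    fh.write("# comment\nDATABASE_URL=sqlite:///./medscribe.db\n")
print(load_dotenv("demo.env")["DATABASE_URL"])  # sqlite:///./medscribe.db
```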
5

Pull the local LLM model with Ollama

MedScribe uses llama3.2:3b for clinical note generation. This runs entirely on your machine — no OpenAI key needed.

bash
# Pull the model (~2 GB download)
ollama pull llama3.2:3b

# Verify it's available
ollama list
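Once the model is pulled, the backend talks to Ollama over its local HTTP API (`POST /api/generate`). A minimal sketch of building such a request follows; the prompt wording is an illustrative placeholder, not MedScribe's actual clinical prompt, and the network call is defined but only needed once Ollama is running.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_note_request(transcript: str, model: str = "llama3.2:3b") -> dict:
    """Assemble the JSON body for Ollama's /api/generate endpoint.

    The prompt below is a placeholder, not MedScribe's real prompt.
    """
    return {
        "model": model,
        "prompt": f"Summarize this consultation as a SOAP note:\n\n{transcript}",
        "stream": False,  # return one complete response instead of chunks
    }

def generate_note(transcript: str) -> str:
    """Send the request to a locally running Ollama server (see this step)."""
    body = json.dumps(build_note_request(transcript)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

payload = build_note_request("Patient reports mild headache.")
print(payload["model"])  # llama3.2:3b
```

Because nothing here touches a cloud API, swapping models is just a matter of pulling a different tag and changing the `model` field.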
6

Install frontend dependencies

The React frontend is in the /frontend subdirectory.

bash
cd frontend
npm install
cd ..

Running the Project

1

Start Ollama (if not already running)

Ollama must be running before the backend starts. It listens on port 11434 by default.

bash
ollama serve
2

Start the FastAPI backend

Open a new terminal. The backend starts on port 8000 and auto-reloads on file changes.

bash
# Make sure your venv is activated
source .venv/bin/activate   # or .venv\Scripts\activate on Windows

python -m medscribe
3

Start the React frontend

Open a second terminal. The frontend dev server starts on port 3000.

bash
cd frontend
npm run dev
4

Verify everything is running

Open your browser and check these URLs to confirm the system is healthy.

bash
# API documentation (Swagger UI): http://localhost:8000/docs
# React frontend:                 http://localhost:3000

# Health check
curl http://localhost:8000/health
5

Run the test suite

MedScribe has 38 unit tests. Run them to verify your setup is working correctly.

bash
pytest tests/ -v

Project Info

Category: AI & ML
Difficulty: Advanced
Setup time: 2–3 hours to set up locally
Technologies: 17 tools

Tech Stack

Python 3.10 · FastAPI · Pydantic v2 · SQLAlchemy 2.0 (async) · React 19 · TypeScript · Vite · Tailwind CSS · faster-whisper · Ollama (llama3.2:3b) · PostgreSQL · SQLite · Docker · Kubernetes · JWT · FHIR R4 · HL7 v2

Prerequisites

  • Python 3.10+ installed
  • Node.js 18+ installed
  • Ollama installed — download from ollama.ai
  • Git installed
  • 8 GB RAM minimum (16 GB recommended for GPU inference)
  • Basic familiarity with Python and React

Asma Nasir

Project Author

Built by a senior AI/healthcare systems engineer with deep experience in FHIR, Azure healthcare services, and production AI systems. This project demonstrates how to build privacy-compliant clinical AI that runs entirely on-premises — no cloud required.