Platform Engineering: Developer Environments — Dev Containers, Tilt, Telepresence, and vCluster

The Developer Environment Problem

Every engineer who has joined a new company knows the feeling: you spend your first week setting up a local development environment. You follow a README that's 6 months out of date. Something doesn't work. You post a message in Slack. Someone sends you a different doc. Eventually it works, sort of.

This is not just an onboarding problem. It's a daily friction problem:

"It works on my machine" bugs that take hours to reproduce
Local services don't behave like production (different Postgres version, no Redis, no message queue)
Testing microservice integrations locally is nearly impossible
Every developer has slightly different tool versions

Platform engineering owns this problem. A platform that only solves deployment doesn't reduce cognitive load — it just moves the friction from ops to the developer's local machine.

The Four Tools That Solve It

| Tool | Problem it solves | Where work runs |
|------|------------------|-----------------|
| Dev Containers | Inconsistent local setup, "works on my machine" | Developer's machine (Docker) |
| Tilt | Slow feedback loop when developing for Kubernetes | Developer's machine + local/remote cluster |
| Telepresence | Can't replicate full microservice graph locally | Your service local, rest in real cluster |
| vCluster | PR preview environments with production parity | Cloud cluster (ephemeral K8s namespace) |

These aren't mutually exclusive — a mature platform provides all four.

Dev Containers: Reproducible Local Environments

A devcontainer is a Docker container, described by a devcontainer.json file, that holds your entire development environment: OS, language runtimes, tools, VS Code extensions, startup scripts.

Every developer on the team runs the same container. "Works on my machine" stops being a thing.

`.devcontainer/devcontainer.json`

JSON

{
  "name": "Order Service Dev",
  "image": "mcr.microsoft.com/devcontainers/dotnet:1-8.0",
  "features": {
    "ghcr.io/devcontainers/features/docker-in-docker:2": {},
    "ghcr.io/devcontainers/features/kubectl-helm-minikube:1": {
      "version": "1.29"
    },
    "ghcr.io/devcontainers/features/github-cli:1": {}
  },
  "postCreateCommand": "dotnet restore && docker-compose -f .devcontainer/docker-compose.yml up -d",
  "forwardPorts": [5000, 5432, 6379],
  "customizations": {
    "vscode": {
      "extensions": [
        "ms-dotnettools.csharp",
        "ms-azuretools.vscode-docker",
        "hashicorp.terraform",
        "redhat.vscode-yaml"
      ],
      "settings": {
        "editor.formatOnSave": true,
        "dotnet.defaultSolution": "OrderService.sln"
      }
    }
  },
  "mounts": [
    "source=${localWorkspaceFolder},target=/workspace,type=bind,consistency=cached"
  ],
  "remoteUser": "vscode"
}

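What the postCreateCommand does: runs once, right after the container is created (not on every start). It restores NuGet packages and starts the Docker Compose stack (PostgreSQL, Redis, RabbitMQ) in the background. To bring the stack back up on every container start, move the Compose step to postStartCommand.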

docker-compose.yml for local dependencies

YAML

# .devcontainer/docker-compose.yml — local services the app depends on
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_DB: orders
      POSTGRES_PASSWORD: dev_password
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

  rabbitmq:
    image: rabbitmq:3-management
    ports:
      - "5672:5672"
      - "15672:15672"  # management UI

volumes:
  postgres_data:
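Because the Compose stack starts in the background, the app can come up before its dependencies accept connections. A minimal Python sketch of a readiness check follows. It only tests that the TCP ports published above are open; the function names and the 60-second deadline are assumptions for illustration, not part of the devcontainer tooling.

```python
import socket
import time

# Ports published by the Compose file above (PostgreSQL, Redis, RabbitMQ AMQP).
DEFAULT_PORTS = {"postgres": 5432, "redis": 6379, "rabbitmq": 5672}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_services(ports: dict[str, int], host: str = "127.0.0.1",
                      deadline_s: float = 60.0, interval_s: float = 1.0) -> list[str]:
    """Poll until every port accepts connections; return the names still unreachable."""
    pending = dict(ports)
    stop_at = time.monotonic() + deadline_s
    while pending:
        for name, port in list(pending.items()):
            if port_is_open(host, port):
                del pending[name]
        if not pending or time.monotonic() >= stop_at:
            break
        time.sleep(interval_s)
    return sorted(pending)
```

Calling `wait_for_services(DEFAULT_PORTS)` returns an empty list once all three services are reachable. An open port does not guarantee the service has finished initializing, so treat this as a first gate rather than a full health check.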

The platform's role

The platform team provides:

Base devcontainer images with company standard tools pre-installed
Devcontainer features for company-specific tools (internal CLI, vault CLI, kubectl with cluster configs)
Backstage scaffolder template that generates the .devcontainer/ folder for new services

Every service created via the golden path gets a devcontainer automatically. New engineers open the repo and VS Code offers "Reopen in Container" — done.
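To make the scaffolder step concrete, here is a small Python sketch of what "generating the .devcontainer/ folder" boils down to: rendering a devcontainer.json from a few service parameters. A real Backstage template would use its own skeleton files; `render_devcontainer` and its parameters are hypothetical and only illustrate the shape of the output.

```python
import json


def render_devcontainer(service_name: str, image: str,
                        ports: list[int], extensions: list[str]) -> str:
    """Render a minimal devcontainer.json document for a new service."""
    config = {
        "name": f"{service_name} Dev",
        "image": image,
        # De-duplicate and sort so the generated file is stable across runs.
        "forwardPorts": sorted(set(ports)),
        "customizations": {"vscode": {"extensions": list(extensions)}},
        "remoteUser": "vscode",
    }
    return json.dumps(config, indent=2) + "\n"


example = render_devcontainer(
    "Order Service",
    "mcr.microsoft.com/devcontainers/dotnet:1-8.0",
    [5000, 5432, 6379, 5432],
    ["ms-dotnettools.csharp", "redhat.vscode-yaml"],
)
```

Keeping the generator deterministic (sorted ports, fixed key order) means regenerating the file for an existing service produces a clean diff.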

GitHub Codespaces integration

The same devcontainer.json works with GitHub Codespaces — a cloud-hosted VS Code environment. Zero local setup required.

YAML

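# .github/workflows: optional pre-build of the devcontainer image for a faster start
name: Pre-build Codespace
on:
  push:
    branches: [main]
jobs:
  devcontainer:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write  # needed to push the image to ghcr.io
    steps:
      - uses: actions/checkout@v4
      # Log in to the registry so the pre-built image can be pushed
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: devcontainers/ci@v0.3
        with:
          imageName: ghcr.io/org/order-service-devcontainer
          cacheFrom: ghcr.io/org/order-service-devcontainer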

Pre-building the devcontainer image means engineers pull ready-made layers instead of building the environment on first start, which can bring Codespace startup to under 30 seconds.

Tilt: Hot-Reload Kubernetes Development

Running a microservice locally is fine. Running 5 microservices that depend on each other, with Kubernetes manifests, health checks, and config maps — that's where Tilt shines.

Tilt's value proposition: Change a file → Tilt detects it → syncs the change into the running container, or rebuilds the image when a sync is not enough → your app is updated in seconds, not minutes.

Tiltfile

Python

# Tiltfile (Python-like DSL)

# Load platform helper extensions
load('ext://helm_resource', 'helm_resource', 'helm_repo')
load('ext://dotenv', 'dotenv')

# Load local .env
dotenv()

# ── Dependencies (use Helm for shared services) ──────────────────────────────
helm_repo('bitnami', 'https://charts.bitnami.com/bitnami')
# resource_deps makes each install wait until the chart repo has been added
helm_resource('postgres',  'bitnami/postgresql', namespace='dev', resource_deps=['bitnami'], flags=['--set', 'auth.password=devpass'])
helm_resource('redis',     'bitnami/redis',      namespace='dev', resource_deps=['bitnami'], flags=['--set', 'auth.enabled=false'])

# ── Order Service ─────────────────────────────────────────────────────────────
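# Build: a full image build happens only when a changed file is not covered by a sync() rule
docker_build(
  'ghcr.io/org/order-service',
  '.',
  dockerfile='Dockerfile',
  live_update=[
    # Copy changed files under ./src into the running container (no image rebuild)
    sync('./src', '/app/src'),
    # Recompile inside the container after each sync.
    # Note: this does not restart the running process; use `dotnet watch` as the
    # entrypoint or Tilt's restart_process extension so the new build is picked up.
    run('dotnet build /app/src -o /app/publish'),
  ]
)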

# Deploy: apply K8s manifests
k8s_yaml(kustomize('./k8s/dev'))

# Configure resource in Tilt UI
k8s_resource(
  'order-service',
  port_forwards=['5000:8080', '5001:8081'],  # app + health
  labels=['backend'],
  links=[
    link('http://localhost:5000/swagger', 'Swagger UI'),
    link('http://localhost:5000/health', 'Health Check'),
  ]
)

# ── Notification Service (dependency) ────────────────────────────────────────
docker_build('ghcr.io/org/notification-service', '../notification-service')
# k8s_yaml expects YAML files or rendered YAML, so render the directory with kustomize
k8s_yaml(kustomize('../notification-service/k8s/dev'))
k8s_resource('notification-service', port_forwards=['5002:8080'])

What Tilt gives you:

Tilt UI: web dashboard showing all services, their logs, and build status
Live update: sync code changes into the running container without rebuilding the image (the file sync itself is typically sub-second; any build step you run afterwards adds to that)
Resource grouping: see all your microservices in one place
Dependency graph: understand what needs to start before what
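To make the live update behavior concrete, here is a simplified Python model of the choice Tilt makes for a batch of changed files: if every file is covered by a sync rule, copy the files into the running container; if any file is not covered (or is on a fall-back list), do a full image rebuild. This is a sketch of the documented behavior, not Tilt's code; `plan_update` and its return values are invented for illustration.

```python
from pathlib import PurePosixPath


def _relative_to(path, base):
    """Return path relative to base, or None when path is not under base."""
    try:
        return path.relative_to(base)
    except ValueError:
        return None


def plan_update(changed, sync_rules, fall_back_on=()):
    """Decide between a live update and a full rebuild.

    changed:      changed file paths, relative to the build context
    sync_rules:   (local_dir, container_dir) pairs, like sync() in a Tiltfile
    fall_back_on: files that always force a full rebuild
    """
    fallback = {PurePosixPath(p) for p in fall_back_on}
    copies = []
    for raw in changed:
        path = PurePosixPath(raw)
        if path in fallback:
            return ("full_rebuild", [])
        for local_dir, container_dir in sync_rules:
            rel = _relative_to(path, PurePosixPath(local_dir))
            if rel is not None:
                copies.append((str(path), str(PurePosixPath(container_dir) / rel)))
                break
        else:
            # No sync rule covers this file, so a live update is not safe.
            return ("full_rebuild", [])
    return ("live_update", copies)
```

With the rule `('./src', '/app/src')` from the Tiltfile above, a change to `src/Orders/Handler.cs` maps to `/app/src/Orders/Handler.cs`, while a change to `Dockerfile` falls through to a full rebuild.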

Platform-provided Tilt extensions

Python

# company-tilt-lib/Tiltfile — shared platform helpers
def platform_service(name, port, namespace='dev'):
  """Standard platform wrapper for all services"""
  docker_build(
    'ghcr.io/org/' + name,
    '.',
    live_update=[
      sync('./src', '/app/src'),
      run('dotnet build /app/src -o /app/publish'),
    ]
  )
  k8s_yaml(kustomize('./k8s/dev'))
  k8s_resource(
    name,
    port_forwards=[str(port) + ':8080'],
    labels=['service'],
    links=[link('http://localhost:' + str(port) + '/swagger', 'Swagger UI')]
  )

Teams import this helper so every service gets consistent Tilt integration.

Telepresence: Debug Inside the Real Cluster

Dev containers + Tilt solve the local development problem. But some bugs only appear in production-like environments — with the real database, real message volumes, real service-to-service traffic.

Telepresence lets you run one service locally while it appears to the cluster as if it's deployed there. Your local process intercepts traffic meant for the Kubernetes service.

Normal cluster:           With Telepresence:
[order-service Pod]       [order-service Pod (intercepted)]
        ↓                         ↓
[payment-service]     →   [Your local process on port 5000]
        ↓                         ↓
[notification-service]    [Real payment-service Pod]

Basic usage

Bash

# Connect to the cluster (creates a transparent VPN to cluster DNS and services)
telepresence connect

# You can now reach any cluster service by DNS name:
curl http://payment-service.payments.svc.cluster.local/health

# Intercept traffic for order-service (send it to your local port 5000)
telepresence intercept order-service \
  --port 5000:8080 \
  --env-file .env.telepresence

# .env.telepresence gets all the env vars from the real Pod:
# DB_HOST=postgres.database.svc.cluster.local
# REDIS_URL=redis://redis.cache.svc.cluster.local:6379
# RABBIT_URL=amqp://rabbitmq.messaging.svc.cluster.local

Now start your local process — it gets the real cluster's environment variables, it talks to the real database, and all cluster traffic for order-service hits your local debugger.
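One way to hand those variables to the local process is to parse the env file and merge it over the current environment before launching. The Python sketch below assumes the file uses plain KEY=VALUE lines, as in the comments above; `parse_env_file` and `merged_environment` are hypothetical helpers, not part of Telepresence.

```python
import os


def parse_env_file(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def merged_environment(env_file_text: str, base=None) -> dict[str, str]:
    """Return a copy of the base environment with the env file's values on top."""
    merged = dict(os.environ if base is None else base)
    merged.update(parse_env_file(env_file_text))
    return merged
```

The result can be passed as the `env` argument of `subprocess.run`, for example `subprocess.run(["dotnet", "run"], env=merged_environment(open(".env.telepresence").read()))`, so the cluster's settings apply only to that process and not to your whole shell.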

Use cases:

Step-through debugging against production-like data in a staging environment
Reproduce a bug that only happens with real service-to-service traffic
Performance profiling with realistic load patterns

Personal intercepts for team collaboration

Bash

# Intercept only requests that match a header (doesn't break other developers)
# The flag takes HEADER=REGEX, not "Header: value"
telepresence intercept order-service \
  --port 5000:8080 \
  --http-header "x-developer=alice"

With this, only requests with x-developer: alice go to Alice's local machine. Other team members' requests continue to the real pod.
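The routing rule is simple enough to model in a few lines. The Python sketch below is an illustration of the idea only, not how Telepresence implements it: each personal intercept names a header and a pattern, the first match wins, and everything else goes to the pod in the cluster. `PersonalIntercept` and `route` are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonalIntercept:
    developer: str
    header: str   # header name, compared case-insensitively
    pattern: str  # regular expression matched against the whole header value


def route(headers: dict[str, str], intercepts: list[PersonalIntercept]) -> str:
    """Return 'local:<developer>' for the first matching intercept, else 'cluster'."""
    normalized = {name.lower(): value for name, value in headers.items()}
    for intercept in intercepts:
        value = normalized.get(intercept.header.lower())
        if value is not None and re.fullmatch(intercept.pattern, value):
            return f"local:{intercept.developer}"
    return "cluster"


intercepts = [
    PersonalIntercept("alice", "x-developer", "alice"),
    PersonalIntercept("bob", "x-developer", "bob"),
]
```

Because unmatched requests keep flowing to the real pod, two developers can intercept the same service at once as long as their header values differ.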

vCluster: Ephemeral PR Preview Environments

vCluster creates a virtual Kubernetes cluster inside a namespace of a host cluster. Each vCluster has its own API server, controller manager, and data store, and by default hands pod scheduling to the host cluster's scheduler. Developers get cluster-admin inside the virtual cluster without touching the host.

Use case: PR preview environments

Every pull request gets a fully isolated K8s cluster with the PR's code deployed. QA can test the feature in a real cluster before merge. The cluster is deleted when the PR closes.

GitHub Actions: PR preview with vCluster

YAML

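# .github/workflows/pr-preview.yml
# Assumes the runner already has credentials (kubeconfig) for the host cluster.
name: PR Preview
on:
  pull_request:
    # "closed" must be listed, otherwise cleanup-preview never runs
    types: [opened, synchronize, reopened, closed]

jobs:
  deploy-preview:
    if: github.event.action != 'closed'
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write  # needed to post the preview comment
    steps:
      - uses: actions/checkout@v4

      - name: Install vcluster CLI
        run: |
          # Pin a release that matches your values file schema; "latest" is shown for brevity
          curl -L -o vcluster https://github.com/loft-sh/vcluster/releases/latest/download/vcluster-linux-amd64
          chmod +x vcluster
          sudo mv vcluster /usr/local/bin

      - name: Create vCluster for PR
        run: |
          CLUSTER_NAME="pr-${{ github.event.pull_request.number }}"

          # Create the virtual cluster; --upgrade lets re-runs on new commits succeed
          vcluster create "$CLUSTER_NAME" \
            --namespace "preview-$CLUSTER_NAME" \
            --connect=false \
            --upgrade \
            --values ./platform/vcluster-values.yaml

      - name: Deploy PR code to vCluster
        run: |
          CLUSTER_NAME="pr-${{ github.event.pull_request.number }}"

          # Run kubectl against the virtual cluster
          vcluster connect "$CLUSTER_NAME" --namespace "preview-$CLUSTER_NAME" -- kubectl apply -k k8s/preview

      - name: Comment PR with preview URL
        uses: actions/github-script@v7
        with:
          script: |
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `🚀 Preview deployed at: https://pr-${{ github.event.pull_request.number }}.preview.company.dev`
            })

  cleanup-preview:
    if: github.event.action == 'closed'
    runs-on: ubuntu-latest
    steps:
      # Each job runs on a fresh runner, so the CLI has to be installed again
      - name: Install vcluster CLI
        run: |
          curl -L -o vcluster https://github.com/loft-sh/vcluster/releases/latest/download/vcluster-linux-amd64
          chmod +x vcluster
          sudo mv vcluster /usr/local/bin

      - name: Delete vCluster
        run: |
          CLUSTER_NAME="pr-${{ github.event.pull_request.number }}"
          vcluster delete "$CLUSTER_NAME" --namespace "preview-$CLUSTER_NAME"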

vCluster configuration for developer environments

YAML

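# platform/vcluster-values.yaml
# Note: these keys follow the pre-0.20 vCluster Helm values layout. vCluster 0.20+
# moved them under `sync.toHost` and `controlPlane`, so match the CLI version to this file.
sync:
  ingresses:
    enabled: true
  storageclasses:
    enabled: true

vcluster:
  resources:
    requests:
      cpu: 200m
      memory: 256Mi
    limits:
      cpu: 2000m
      memory: 2Gi

# Keep the virtual control plane's own state on ephemeral storage
storage:
  persistence: false  # no PVC for the control plane; its state is discarded with the pod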

Cost management: Karpenter + spot instances keep ephemeral cluster costs under control. Each vCluster costs ~$0.10/hour on spot. A PR preview that runs for 4 hours costs ~$0.40. Set a maximum lifetime (24h) and auto-delete idle clusters.
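The arithmetic and the lifetime rule are easy to sketch in Python. The hourly rate and 24-hour limit below are the figures quoted above; `preview_cost` and `expired_previews` are hypothetical helpers, and the creation timestamps would come from wherever your platform records them (for example, namespace metadata). Idle detection is left out.

```python
from datetime import datetime, timedelta, timezone

HOURLY_COST_USD = 0.10              # per-vCluster spot estimate quoted in the text
MAX_LIFETIME = timedelta(hours=24)  # maximum lifetime suggested in the text


def preview_cost(hours: float, hourly_usd: float = HOURLY_COST_USD) -> float:
    """Estimated cost in USD of one preview cluster that runs for `hours`."""
    return round(hours * hourly_usd, 2)


def expired_previews(created_at: dict[str, datetime], now: datetime,
                     max_lifetime: timedelta = MAX_LIFETIME) -> list[str]:
    """Names of preview clusters that have reached the maximum lifetime."""
    return sorted(name for name, created in created_at.items()
                  if now - created >= max_lifetime)


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
clusters = {
    "pr-101": datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc),  # 25 hours old
    "pr-102": datetime(2024, 1, 2, 10, 0, tzinfo=timezone.utc),  # 2 hours old
}
```

Here `preview_cost(4)` gives 0.4, matching the figure above, and the 24-hour cap bounds a forgotten preview at `preview_cost(24)`, or 2.4 USD. A scheduled job could pass the result of `expired_previews(clusters, now)` to `vcluster delete`.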

The Developer Environment Stack in Practice

A mature platform provides the full stack:

New hire joins:
1. Clone repo → VS Code detects .devcontainer → "Reopen in Container?"
2. One click → Docker container starts with all tools + dependencies
3. Run `tilt up` → all local services start, hot-reload enabled
4. Open Tilt UI: http://localhost:10350 — see all service logs and status
   
Testing integration bugs:
5. `telepresence connect` → connect to staging cluster
6. `telepresence intercept my-service --port 5000:8080`
7. Set breakpoint in VS Code, send request to staging — it hits local debugger

PR review:
8. Push PR → GitHub Actions creates vCluster → deploys PR code
9. QA tests feature at https://pr-123.preview.company.dev
10. PR merged → vCluster deleted automatically

Onboarding time: Before platform: 3-5 days to first working local setup. After platform: 2-4 hours.

Developer survey question: "How long does it take to set up a new service for local development?" Before: "1-2 days." After: "30 minutes — run the scaffolder, open in container, done."

Platform Team Implementation Priority

| Tool | Effort | Impact | Recommended order |
|------|--------|--------|-------------------|
| Dev Containers | Low | High (onboarding, reproducibility) | 1st |
| Tilt | Medium | High (feedback loop, local K8s dev) | 2nd |
| Telepresence | Low | Medium (debugging, not daily) | 3rd |
| vCluster PR previews | High | Medium-High (team size dependent) | 4th |

Start with dev containers. The Backstage scaffolder template generates the .devcontainer/ folder for every new service. Roll out to existing services via a hackathon day where teams add devcontainers with platform team support.

