Cloud & DevOps · Intermediate

Kubernetes Deep Dive — Production Workloads, Networking & Security

Production Kubernetes guide — core architecture, workload resources (Deployments, StatefulSets, Jobs), networking (Services, Ingress, NetworkPolicy), RBAC, HPA/VPA/KEDA autoscaling, resource management, Helm, secrets management, and production readiness patterns.

SystemForge · April 18, 2026 · 10 min read
Kubernetes · K8s · DevOps · Containers · Docker · Cloud Native · Helm · Security

Kubernetes is the operating system for distributed systems — it schedules containerised workloads across a cluster, manages their lifecycle, handles failures, and provides primitives for networking, storage, configuration, and secrets. Understanding Kubernetes deeply means understanding its control loop: every component watches state and reconciles it toward the desired state you declare.


Architecture: Control Plane + Data Plane

┌──────────────────────────────────────────────────────────────────┐
│                          Control Plane                           │
│                                                                  │
│  kube-apiserver ──── etcd (cluster state, consistent store)      │
│       │                                                          │
│  kube-scheduler           (assigns Pods to Nodes)                │
│  kube-controller-manager  (ReplicaSet, Node, Job controllers)    │
│  cloud-controller-manager (LoadBalancer, PV provisioning)        │
└────────────────────────────────┬─────────────────────────────────┘
                                 │ API
┌────────────────────────────────▼─────────────────────────────────┐
│                        Data Plane (Nodes)                        │
│                                                                  │
│  Node 1                   Node 2                   Node 3        │
│  ┌─────────────────────┐  ┌─────────────────────┐                │
│  │ kubelet             │  │ kubelet             │ (pod lifecycle)│
│  │ kube-proxy          │  │ kube-proxy          │ (iptables/IPVS)│
│  │ container runtime   │  │ container runtime   │ (containerd)   │
│  │ Pod  Pod  Pod       │  │ Pod  Pod            │                │
│  └─────────────────────┘  └─────────────────────┘                │
└──────────────────────────────────────────────────────────────────┘

Key insight: Every resource in Kubernetes is persisted in etcd (serialised as protobuf by default in modern versions). Controllers watch for changes through the API server and reconcile the actual state toward the desired state. This reconciliation loop is the foundation of Kubernetes' self-healing behaviour.
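You can watch the reconciliation loop in action: delete a Pod owned by a Deployment and its ReplicaSet controller recreates it within seconds. A minimal sketch, assuming a Deployment labelled `app=api-service` is already running:

```shell
# List the Deployment's pods and note one of the names
kubectl get pods -l app=api-service

# Delete one pod; the ReplicaSet controller sees actual state (2 pods)
# diverge from desired state (3 replicas) and creates a replacement
kubectl delete pod <one-of-the-pod-names>
kubectl get pods -l app=api-service --watch
```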


Core Workload Resources

Deployment — Stateless Applications

YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-service
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1           # allow 1 extra pod during update
      maxUnavailable: 0     # never go below replica count (zero-downtime)
  template:
    metadata:
      labels:
        app: api-service
        version: "1.5.2"
    spec:
      containers:
        - name: api
          image: myregistry.azurecr.io/api-service:1.5.2
          ports:
            - containerPort: 8080
          resources:
            requests:             # scheduler uses this for placement
              cpu: "250m"
              memory: "256Mi"
            limits:               # hard ceiling: OOMKilled (memory) or throttled (CPU) if exceeded
              cpu: "500m"
              memory: "512Mi"
          readinessProbe:         # gates traffic: pod excluded from Service endpoints until ready
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 3
          livenessProbe:          # restarts pod if unhealthy
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
            failureThreshold: 3
          env:
            - name: ASPNETCORE_ENVIRONMENT
              value: "Production"
            - name: ConnectionStrings__DefaultConnection
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: connection-string
      terminationGracePeriodSeconds: 30   # SIGTERM, wait, then SIGKILL
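With `maxUnavailable: 0`, a rollout only proceeds as new pods pass their readiness probe. A sketch of driving and observing a rolling update of the Deployment above (the `1.5.3` tag is illustrative):

```shell
# Apply the manifest, then roll out a new image version
kubectl apply -f deployment.yaml
kubectl set image deployment/api-service api=myregistry.azurecr.io/api-service:1.5.3 -n production

# Watch the rollout progress; blocks until complete or failed
kubectl rollout status deployment/api-service -n production

# Revision history and one-command rollback
kubectl rollout history deployment/api-service -n production
kubectl rollout undo deployment/api-service -n production
```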

StatefulSet — Stateful Applications

StatefulSets give each Pod a stable network identity (pod-0, pod-1) and stable persistent storage:

YAML
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: "postgres-headless"   # requires a headless Service
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres        # must match the selector above
    spec:
      containers:
        - name: postgres
          image: postgres:16
          ports:
            - containerPort: 5432
          env:
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: postgres-data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:              # each Pod gets its own PVC
    - metadata:
        name: postgres-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: "premium-ssd"
        resources:
          requests:
            storage: 50Gi
---
apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
spec:
  clusterIP: None             # headless: DNS returns individual pod IPs
  selector:
    app: postgres
  ports:
    - port: 5432
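The headless Service gives each replica a stable DNS name of the form `<pod>.<service>.<namespace>.svc.cluster.local`, which is how clients address a specific replica (for example, the primary). A quick way to verify, assuming the `default` namespace:

```shell
# Each pod is individually resolvable:
#   postgres-0.postgres-headless.default.svc.cluster.local
#   postgres-1.postgres-headless.default.svc.cluster.local
kubectl run -it --rm dns-test --image=busybox --restart=Never -- \
    nslookup postgres-0.postgres-headless.default.svc.cluster.local
```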

Jobs and CronJobs

YAML
# One-off database migration job
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migration-v2
spec:
  backoffLimit: 3           # retry up to 3 times on failure
  ttlSecondsAfterFinished: 600
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: migrator
          image: myregistry.azurecr.io/migrator:v2
          command: ["dotnet", "ef", "database", "update"]
---
# Scheduled cleanup job
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cleanup-old-sessions
spec:
  schedule: "0 2 * * *"    # 2 AM daily (UTC)
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: cleanup
              image: myregistry.azurecr.io/cleanup:latest
              command: ["python", "cleanup.py", "--days=30"]

Networking: Services

YAML
# ClusterIP: internal only (default)
apiVersion: v1
kind: Service
metadata:
  name: api-service
spec:
  selector:
    app: api-service
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP

# NodePort: exposed on every node at a static port (dev/testing)
# type: NodePort + nodePort: 30080

# LoadBalancer: cloud-provisioned external LB
apiVersion: v1
kind: Service
metadata:
  name: api-service-lb
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"   # AKS: internal LB
spec:
  selector:
    app: api-service
  ports:
    - port: 443
      targetPort: 8080
  type: LoadBalancer
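A Service only routes to pods whose labels match its selector and whose readiness probe passes, so inspecting the Endpoints object is the quickest way to debug a Service that receives no traffic:

```shell
# Ready pod IP:port pairs behind the Service; an empty list means
# either the selector matches nothing or no pod is Ready
kubectl get endpoints api-service -n production

# Tunnel the Service to localhost for a quick manual test
kubectl port-forward svc/api-service 8080:80 -n production
```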

Ingress — HTTP Routing

Ingress routes external HTTP/S traffic to Services based on host/path rules. Requires an Ingress Controller (NGINX, Traefik, etc.):

YAML
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.example.com
      secretName: api-tls-cert       # cert-manager creates this
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /v1
            pathType: Prefix
            backend:
              service:
                name: api-v1
                port:
                  number: 80
          - path: /v2
            pathType: Prefix
            backend:
              service:
                name: api-v2
                port:
                  number: 80

NetworkPolicy — Zero-Trust Pod Networking

By default, all pods can communicate with all other pods. NetworkPolicy restricts traffic:

YAML
# Default deny all ingress for namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}        # applies to all pods in namespace
  policyTypes: [Ingress]
---
# Allow only specific traffic to api-service
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-service-ingress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-service
  policyTypes: [Ingress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx   # ingress controller namespace
        - podSelector:
            matchLabels:
              app: frontend    # allow frontend pods in same namespace
      ports:
        - protocol: TCP
          port: 8080
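A rough way to verify the policy is to probe the Service from pods with and without the allowed labels (assuming the readiness path from earlier; requires a CNI that enforces NetworkPolicy, e.g. Calico or Cilium):

```shell
# From an unlabelled pod: expect a timeout (denied by default-deny)
kubectl run np-test --rm -it --image=busybox -n production --restart=Never -- \
    wget -qO- --timeout=3 http://api-service/health/ready

# From a pod labelled app=frontend: expect a response (explicitly allowed)
kubectl run np-test --rm -it --image=busybox -n production --restart=Never \
    --labels="app=frontend" -- \
    wget -qO- --timeout=3 http://api-service/health/ready
```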

RBAC — Role-Based Access Control

YAML
# ServiceAccount for an application
apiVersion: v1
kind: ServiceAccount
metadata:
  name: api-service-account
  namespace: production
  annotations:
    # AKS Workload Identity: bind to Azure Managed Identity
    azure.workload.identity/client-id: "00000000-0000-0000-0000-000000000000"
---
# Role: what permissions are allowed in a namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secret-reader
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["db-credentials", "redis-password"]
    verbs: ["get"]
---
# RoleBinding: bind role to service account
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: api-secret-reader
  namespace: production
subjects:
  - kind: ServiceAccount
    name: api-service-account
    namespace: production
roleRef:
  kind: Role
  apiGroup: rbac.authorization.k8s.io
  name: secret-reader

Principle of least privilege: ServiceAccounts should only have the permissions they need. Use Role/RoleBinding for namespace-scoped access, ClusterRole/ClusterRoleBinding only when cluster-wide access is genuinely required.
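`kubectl auth can-i` lets you verify the effective permissions of a ServiceAccount without deploying anything, which is a good habit after every RBAC change:

```shell
# Allowed: "get" on a named secret granted by the Role above
kubectl auth can-i get secrets/db-credentials -n production \
    --as=system:serviceaccount:production:api-service-account

# Denied: "list" was never granted
kubectl auth can-i list secrets -n production \
    --as=system:serviceaccount:production:api-service-account
```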


Autoscaling

Horizontal Pod Autoscaler (HPA)

YAML
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70    # scale out when avg CPU > 70%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 min before scaling down
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Pods
          value: 4                   # add max 4 pods per minute
          periodSeconds: 60
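Once the HPA is live (it requires metrics-server for resource metrics), you can watch its decisions directly; the conditions and events explain why it did or did not scale:

```shell
# TARGETS column shows current vs target utilisation, e.g. 45%/70%
kubectl get hpa api-hpa --watch

# Conditions, last scale time, and scaling events
kubectl describe hpa api-hpa
```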

KEDA — Scale on Custom Metrics

KEDA (Kubernetes Event-Driven Autoscaling) scales on queue depth, Kafka lag, HTTP request rate, and 60+ other sources:

YAML
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor-scaler
spec:
  scaleTargetRef:
    name: order-processor
  minReplicaCount: 0           # scale to zero when idle
  maxReplicaCount: 50
  triggers:
    - type: azure-servicebus
      metadata:
        connectionFromEnv: SERVICEBUS_CONNECTION
        queueName: order-events
        messageCount: "5"      # 1 replica per 5 messages in queue
    - type: prometheus
      metadata:
        serverAddress: http://prometheus:9090
        query: sum(http_requests_in_flight)   # PromQL query to evaluate
        threshold: "100"       # 1 replica per 100 in-flight requests

Vertical Pod Autoscaler (VPA)

VPA adjusts CPU and memory requests/limits automatically. Run in Off mode first to get recommendations without auto-applying:

YAML
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  updatePolicy:
    updateMode: "Off"   # Off = recommendations only, Auto = auto-apply

Resource Management and QoS

Kubernetes uses requests and limits to classify pods into QoS tiers for eviction priority:

QoS Tier    │ When                                     │ Eviction priority
────────────┼──────────────────────────────────────────┼──────────────────────
BestEffort  │ No requests or limits set                │ First (evicted first)
Burstable   │ Requests set, but not equal to limits    │ Second
Guaranteed  │ Requests == limits for all containers    │ Last

For production pods, set requests == limits (Guaranteed QoS) on critical services, or at minimum set requests so the scheduler places pods on nodes with sufficient capacity.
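Kubernetes records the computed QoS class on each pod's status, so you can confirm a critical pod actually landed in the Guaranteed tier:

```shell
# Prints BestEffort, Burstable, or Guaranteed
kubectl get pod <pod-name> -n production -o jsonpath='{.status.qosClass}'
```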

YAML
# LimitRange: default limits if not specified by pods
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
    - type: Container
      default:
        cpu: "500m"
        memory: "512Mi"
      defaultRequest:
        cpu: "100m"
        memory: "128Mi"
      max:
        cpu: "4"
        memory: "8Gi"
---
# ResourceQuota: cap total namespace resource usage
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
spec:
  hard:
    requests.cpu: "50"
    requests.memory: "100Gi"
    limits.cpu: "100"
    limits.memory: "200Gi"
    pods: "200"
    persistentvolumeclaims: "50"

Secrets Management

Kubernetes Secrets are base64-encoded (not encrypted) by default. For production, use external secret managers:

YAML
# External Secrets Operator: sync from Azure Key Vault
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: azure-keyvault
  namespace: production
spec:
  provider:
    azurekv:
      tenantId: "your-tenant-id"
      vaultUrl: "https://myvault.vault.azure.net"
      authType: WorkloadIdentity    # uses Pod's WorkloadIdentity
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: azure-keyvault
    kind: SecretStore
  target:
    name: db-credentials            # creates this K8s Secret
  data:
    - secretKey: connection-string  # K8s Secret key
      remoteRef:
        key: prod-db-connection-string   # Key Vault secret name

Helm — Package Management

Helm packages Kubernetes manifests as reusable charts with values-based templating:

Bash
# Add a chart repository
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

# Install with custom values
helm install nginx-ingress ingress-nginx/ingress-nginx \
    --namespace ingress-nginx \
    --create-namespace \
    --values values-prod.yaml

# Upgrade (rolling update of the chart)
# --atomic rolls back automatically if the upgrade fails
helm upgrade nginx-ingress ingress-nginx/ingress-nginx \
    --namespace ingress-nginx \
    --values values-prod.yaml \
    --atomic \
    --timeout 5m

# View release history
helm history nginx-ingress -n ingress-nginx

# Roll back to previous release
helm rollback nginx-ingress 3 -n ingress-nginx

Creating a Helm Chart

Bash
helm create my-api      # generates chart scaffold
YAML
# my-api/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "my-api.fullname" . }}
  labels:
    {{- include "my-api.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
YAML
# my-api/values.yaml
replicaCount: 2
image:
  repository: myregistry.azurecr.io/my-api
  tag: "1.0.0"
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
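Before installing a chart, it is worth rendering and validating it locally; these commands catch template errors and schema problems without touching the cluster:

```shell
# Static checks on chart structure and templates
helm lint ./my-api

# Render the manifests locally to inspect the generated YAML
helm template my-api ./my-api --values my-api/values.yaml

# Server-side dry run: validates against the API server without deploying
helm install my-api ./my-api --dry-run --debug
```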

Production Readiness Checklist

Pod Disruption Budgets

YAML
# Ensure at least 2 replicas available during voluntary disruptions (node drains, upgrades)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2          # or: maxUnavailable: 1
  selector:
    matchLabels:
      app: api-service

Priority Classes

YAML
# High-priority class for critical workloads: less likely to be preempted,
# and may preempt lower-priority pods when the cluster is full
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-workload
value: 1000000
globalDefault: false
description: "Critical production workloads"
---
# Use in pod spec:
# priorityClassName: critical-workload

Production Readiness Summary

| Concern | Implementation |
|---------|----------------|
| Health checks | readinessProbe + livenessProbe + optional startupProbe |
| Graceful shutdown | terminationGracePeriodSeconds ≥ request timeout + drain period |
| Resource limits | Always set requests and limits for every container |
| Disruption budget | PodDisruptionBudget for every production Deployment |
| Zero-trust networking | Default-deny NetworkPolicy + explicit allow rules |
| Secrets | External Secrets Operator from Key Vault/Secrets Manager |
| RBAC | Dedicated ServiceAccount per workload, least-privilege roles |
| Autoscaling | HPA for CPU/memory, KEDA for queue-driven, Cluster Autoscaler for nodes |
| Multi-AZ | topologySpreadConstraints across zones |
| Image security | Pin image digests, scan with Trivy in CI, private registry only |

Topology Spread Constraints — Multi-AZ Distribution

YAML
spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: api-service
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: api-service

Observability

YAML
# Prometheus scrape annotations (annotation-based discovery; the Prometheus
# Operator instead uses ServiceMonitor/PodMonitor CRDs)
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
Bash
# kubectl top (requires metrics-server)
kubectl top nodes
kubectl top pods -n production --sort-by=memory

# Describe a pod for troubleshooting (events, conditions, probe failures)
kubectl describe pod <pod-name> -n production

# Follow logs across all replicas
kubectl logs -l app=api-service -n production -f --tail=100

# Execute into a running container
kubectl exec -it <pod-name> -n production -- /bin/sh

Related: Azure AKS Production Guide — AKS-specific configuration
Related: Azure Cloud Architecture — Well-Architected Framework
Related: Event-Driven Architecture — Kafka and async patterns
