Platform Engineering: eBPF and Cilium — Sidecar-Free Networking, L7 Policies, and Hubble Observability

Deep guide to eBPF-based Kubernetes networking with Cilium — how eBPF replaces kube-proxy and service mesh sidecars, L7 network policies for per-path authorization, Cilium Hubble for real-time traffic flow observability, and migrating from Calico or flannel.

Learnixo · June 11, 2026 · 9 min read
Tags: Platform Engineering, eBPF, Cilium, Kubernetes, Networking, Service Mesh, Zero Trust, DevOps

Why eBPF Changes Everything in Kubernetes Networking

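Traditional Kubernetes networking runs on iptables, the rule engine of the decades-old Linux netfilter framework. In kube-proxy's iptables mode, every Service you add creates more rules, and a new connection is matched against them in a linear scan. On clusters with thousands of Services this adds measurable latency to connection setup and makes rule updates slow; the exact cost depends on the kernel version and the number of rules.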

eBPF (extended Berkeley Packet Filter) runs programs inside the Linux kernel in a sandboxed virtual machine. These programs intercept system calls and network events with near-zero overhead. No kernel modules, no recompilation — programs are verified by the kernel before loading and can be hot-swapped.

Cilium uses eBPF to implement:

  • kube-proxy replacement — service load balancing in-kernel, O(1) instead of O(n) iptables
  • Network policies at L3/L4/L7 — without iptables
  • Service mesh without sidecars — mTLS and observability via eBPF, no Envoy sidecar injected
  • Hubble — real-time network flow visibility from eBPF hooks
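To make the O(n) versus O(1) contrast concrete, here is a minimal Python sketch. It is not Cilium code: it models an iptables-style chain as a list that is scanned rule by rule and an eBPF-style service map as a dictionary keyed by (IP, port). The service addresses, backend addresses, and function names are invented for illustration, and real iptables chains are structured differently from a flat list.

```python
from typing import Optional, Tuple

# Hypothetical service table: (cluster IP, port) -> backend address.
SERVICES = {
    (f"10.96.{i // 256}.{i % 256}", 80): f"10.0.{i // 256}.{i % 256}:8080"
    for i in range(5000)
}

# iptables-style model: an ordered list of rules, checked one after another.
RULE_CHAIN = list(SERVICES.items())


def lookup_linear(ip: str, port: int) -> Tuple[Optional[str], int]:
    """Scan rules in order; return (backend, number_of_rules_checked)."""
    for checked, (key, backend) in enumerate(RULE_CHAIN, start=1):
        if key == (ip, port):
            return backend, checked
    return None, len(RULE_CHAIN)


def lookup_hashed(ip: str, port: int) -> Optional[str]:
    """One hash lookup, independent of how many services exist."""
    return SERVICES.get((ip, port))


# The last service added is the worst case for the linear scan.
backend, checked = lookup_linear("10.96.19.135", 80)
print(backend, checked)
print(lookup_hashed("10.96.19.135", 80))
```

The linear version has to walk the whole chain to reach the last service, while the dictionary lookup does the same amount of work at 5 services or 5,000.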

Cilium vs Traditional CNIs

| Feature | Calico / Flannel | Cilium |
|---------|-----------------|--------|
| Implementation | iptables / IPVS | eBPF |
| kube-proxy | Required | Replaced |
| Service lookup | O(n) iptables | O(1) eBPF hash map |
| NetworkPolicy | L3/L4 | L3/L4/L7 (HTTP, Kafka, gRPC) |
| Service mesh | Requires Istio/Linkerd | Built-in (eBPF-based) |
| Observability | None built-in | Hubble (built-in flow logs) |
| Kernel requirement | Any | 5.10+ (recommended) |
| Complexity | Low | Medium |


Installing Cilium with kube-proxy Replacement

Bash
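# Add the Cilium Helm repository
helm repo add cilium https://helm.cilium.io/
helm repo update

# Install with kube-proxy completely disabled.
# k8sServiceHost must be your API server IP or hostname (10.0.0.1 is a placeholder).
# Do not put comments after a trailing backslash: they break the line continuation.
helm install cilium cilium/cilium \
  --version 1.15.0 \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=10.0.0.1 \
  --set k8sServicePort=6443 \
  --set hubble.enabled=true \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set hubble.metrics.enabled="{dns,drop,tcp,flow,port-distribution,icmp,http}"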

Verify kube-proxy is gone and Cilium handles services:

Bash
cilium status
# KubeProxyReplacement: True
# Masquerading: IPTables (iptables-based, minimal rules)
# Services: 1,247 services via eBPF

kubectl -n kube-system get pods | grep kube-proxy
# (no output: kube-proxy is not running)

Standard Kubernetes NetworkPolicy vs Cilium Network Policy

Standard NetworkPolicy (L3/L4 only)

YAML
# Allow ingress to payment-service only from checkout-service
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payment-service-ingress
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-service
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: checkout
          podSelector:
            matchLabels:
              app: checkout-service
      ports:
        - protocol: TCP
          port: 8080

This allows any HTTP method, any path on port 8080. A compromised checkout-service could call DELETE /api/v1/payments/all.

CiliumNetworkPolicy (L7 — per HTTP path)

YAML
# Allow checkout-service to call only GET /api/v1/payments/* on payment-service
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: payment-service-l7
  namespace: payments
spec:
  endpointSelector:
    matchLabels:
      app: payment-service
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: checkout-service
            k8s:io.kubernetes.pod.namespace: checkout
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: "/api/v1/payments/.*"
              - method: POST
                path: "/api/v1/payments$"
              # Anything else: DENIED (checkout cannot DELETE or call /admin/*)

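Now checkout-service can only GET individual payments and POST new ones. Any other request, such as DELETE /api/v1/payments/123, is rejected before it reaches the application: Cilium redirects traffic on port 8080 through its node-local Envoy proxy, which answers a disallowed request with HTTP 403 instead of forwarding it.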
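The matching logic behind those two http rules can be sketched in a few lines of Python. This is only a model under stated assumptions: it treats `method` and `path` as regular expressions that must match the whole value, and it uses Python's `re` module, which is not the same regex dialect Cilium uses. The `HttpRule` class and `is_allowed` function are hypothetical names, not part of any Cilium API.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HttpRule:
    method: str  # regex for the HTTP method
    path: str    # regex for the request path


# Mirrors the two http rules in the CiliumNetworkPolicy above.
RULES = [
    HttpRule(method="GET", path="/api/v1/payments/.*"),
    HttpRule(method="POST", path="/api/v1/payments$"),
]


def is_allowed(method: str, path: str, rules: List[HttpRule] = RULES) -> bool:
    """Allow the request if any single rule matches both method and path in full."""
    return any(
        re.fullmatch(rule.method, method) is not None
        and re.fullmatch(rule.path, path) is not None
        for rule in rules
    )


for request in [
    ("GET", "/api/v1/payments/P123"),
    ("POST", "/api/v1/payments"),
    ("DELETE", "/api/v1/payments/123"),
    ("POST", "/api/v1/payments/123"),
    ("GET", "/api/v1/payments"),
]:
    print(request, "allowed" if is_allowed(*request) else "denied")
```

One detail the sketch makes visible: under this full-match model, `GET /api/v1/payments` (no trailing slash) is denied, because the GET rule's path requires a `/` after `payments`. If listing payments is a legitimate call, it needs its own rule.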

Kafka topic-level policy

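Cilium can parse the Kafka protocol, so a policy can be scoped to individual topics and roles (check the maturity of Kafka policy support in your Cilium version before relying on it):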

YAML
spec:
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: order-processor
      toPorts:
        - ports:
            - port: "9092"
              protocol: TCP
          rules:
            kafka:
              - role: consume
                topic: "orders.created"
              - role: produce
                topic: "orders.processed"
              # order-processor cannot produce to payment-events topic

DNS-based egress policy

Allow egress only to specific external DNS names:

YAML
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: payment-service-egress
spec:
  endpointSelector:
    matchLabels:
      app: payment-service
  egress:
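    # Allow internal cluster DNS. The dns rule routes lookups through
    # Cilium's DNS proxy, which toFQDNs needs to learn the IPs behind a name.
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:
              - matchPattern: "*"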
    # Allow only Stripe API
    - toFQDNs:
        - matchName: "api.stripe.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
    # Block everything else by default (default-deny handled by CiliumClusterwideNetworkPolicy)
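How a name-based rule turns into an IP-level decision can be modelled roughly as follows. This Python sketch assumes the mechanism Cilium documents for toFQDNs: DNS responses are observed, the IPs returned for an allowed name are remembered, and egress is permitted only to those IPs. The `FqdnEgressModel` class, its methods, and the documentation-range IP addresses are all made up for illustration; TTL expiry and wildcard patterns are left out.

```python
from typing import Iterable, Set


class FqdnEgressModel:
    """Toy model of toFQDNs: allowed IPs are learned from observed DNS answers."""

    def __init__(self, allowed_names: Iterable[str], port: int = 443) -> None:
        # matchName is an exact match; normalise case and any trailing dot.
        self.allowed_names: Set[str] = {self._norm(n) for n in allowed_names}
        self.allowed_ips: Set[str] = set()
        self.port = port

    @staticmethod
    def _norm(name: str) -> str:
        return name.lower().rstrip(".")

    def observe_dns_answer(self, name: str, ips: Iterable[str]) -> None:
        """Remember the answer IPs only if the queried name is on the allow list."""
        if self._norm(name) in self.allowed_names:
            self.allowed_ips.update(ips)

    def egress_allowed(self, dest_ip: str, dest_port: int) -> bool:
        """Permit only the configured port to IPs learned from allowed names."""
        return dest_port == self.port and dest_ip in self.allowed_ips


model = FqdnEgressModel(["api.stripe.com"])
model.observe_dns_answer("api.stripe.com.", ["198.51.100.10"])   # allowed name
model.observe_dns_answer("evil.example.com", ["203.0.113.7"])    # ignored

print(model.egress_allowed("198.51.100.10", 443))  # learned IP, right port
print(model.egress_allowed("203.0.113.7", 443))    # IP of a name not on the list
print(model.egress_allowed("198.51.100.10", 80))   # right IP, wrong port
```

The practical consequence is that a pod has to resolve the name through the cluster DNS before it can connect: a hard-coded IP for api.stripe.com would never enter the allowed set.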

Cluster-Wide Default Deny

Standard NetworkPolicy is namespace-scoped. Cilium's CiliumClusterwideNetworkPolicy applies globally:

YAML
# Default deny all ingress and egress cluster-wide
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: default-deny-all
spec:
  endpointSelector: {}    # matches ALL endpoints in ALL namespaces
  ingress:
    - {}                  # empty rule = deny all ingress
  egress:
    - {}                  # empty rule = deny all egress
---
# Allow pods to reach kube-dns (without this, DNS breaks)
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: allow-kube-dns
spec:
  endpointSelector: {}
  egress:
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
---
# Allow pods to reach the API server (required for service accounts)
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: allow-apiserver
spec:
  endpointSelector: {}
  egress:
    - toEntities:
        - kube-apiserver

With this in place, every namespace starts with zero connectivity. Teams add CiliumNetworkPolicy for the specific traffic they need.
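The way these policies combine can be reasoned about with a small model. The Python sketch below assumes the semantics Cilium documents for its default enforcement mode: an endpoint that no policy selects is not restricted, an endpoint selected by at least one policy is default-deny for that direction, and allow rules from all selecting policies are additive. `EgressPolicy`, `egress_verdict`, and the destination names are hypothetical and reduce real selectors to plain strings.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class EgressPolicy:
    name: str
    allows: FrozenSet[str] = frozenset()  # destinations this policy permits


def egress_verdict(dest: str, selecting_policies: List[EgressPolicy]) -> str:
    """Return the verdict for traffic from one endpoint to `dest`."""
    if not selecting_policies:
        return "FORWARDED"  # no policy selects the endpoint, nothing is enforced
    allowed = set()
    for policy in selecting_policies:
        allowed |= policy.allows  # allow rules are additive across policies
    return "FORWARDED" if dest in allowed else "DROPPED"


# The three cluster-wide policies above, reduced to what they allow.
default_deny = EgressPolicy("default-deny-all")
allow_dns = EgressPolicy("allow-kube-dns", frozenset({"kube-dns"}))
allow_api = EgressPolicy("allow-apiserver", frozenset({"kube-apiserver"}))
baseline = [default_deny, allow_dns, allow_api]

# A team-owned policy that opens one extra path.
team_policy = EgressPolicy("checkout-to-payments", frozenset({"payment-service"}))

print(egress_verdict("kube-dns", baseline))
print(egress_verdict("payment-service", baseline))
print(egress_verdict("payment-service", baseline + [team_policy]))
```

Because allows are additive, the default-deny policy never has to be edited: a team's policy only widens what its own endpoints may reach.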


Mutual TLS Without Sidecars

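Cilium's WireGuard encryption transparently encrypts pod-to-pod traffic between nodes, without sidecars. This is node-level encryption rather than mutual TLS: it protects traffic in transit but does not authenticate individual workloads.

Bash
helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --reuse-values \
  --set encryption.enabled=true \
  --set encryption.type=wireguard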

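For per-workload mutual authentication with identity verification, Cilium integrates with SPIFFE/SPIRE. The feature is enabled through the authentication.mutual.spire Helm values; check its maturity for your Cilium version. Both peers must be Cilium-managed endpoints, so the rule selects workloads rather than external traffic:

YAML
# Cilium policy requiring mutual authentication via SPIFFE workload identity
spec:
  endpointSelector:
    matchLabels:
      app: payment-service
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: checkout-service
      authentication:
        mode: "required"    # peers must complete mutual authentication

With SPIFFE integration, Cilium verifies the cryptographic workload identity (SVID) of both peers before allowing traffic, not just the source IP or label. A Pod that spoofs labels cannot fake a SPIFFE identity.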

This gives you zero-trust networking at the kernel level without Istio's Envoy sidecars.


Hubble: Real-Time Network Flow Observability

Hubble reads eBPF hook data and provides a log of network flows across the cluster, including the policy verdict for each flow.

CLI: Live flow inspection

Bash
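# Install hubble CLI (download the release tarball, then unpack it into /usr/local/bin)
export HUBBLE_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/hubble/master/stable.txt)
curl -L --fail --remote-name-all https://github.com/cilium/hubble/releases/download/$HUBBLE_VERSION/hubble-linux-amd64.tar.gz
sudo tar xzvfC hubble-linux-amd64.tar.gz /usr/local/bin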

# Port-forward to Hubble relay
cilium hubble port-forward &

# Watch all flows in the payments namespace
hubble observe \
  --namespace payments \
  --follow

# Example output (simplified):
# checkout/checkout-service -> payments/payment-service:8080 (HTTP GET /api/v1/payments/P123, 200, 23ms)
# checkout/checkout-service -> payments/payment-service:8080 (HTTP POST /api/v1/payments, 201, 45ms)
# kube-system/monitoring -> payments/payment-service:9090 (TCP SYN, FORWARDED)

# Filter: show only dropped flows (NetworkPolicy violations)
hubble observe \
  --verdict DROPPED \
  --follow

# Output:
# payments/rogue-service -> payments/payment-service:8080 (HTTP DELETE /api/v1/payments, DROPPED, L7 policy)

This is the most powerful debugging tool in the platform. Any network issue — "why can't service A reach service B?" — is answered in seconds:

Bash
# Check if payment-service can reach Stripe
hubble observe \
  --from-label app=payment-service \
  --to-fqdn api.stripe.com \
  --verdict DROPPED \
  --verdict FORWARDED

# If dropped: which policy is blocking it?
# If forwarded: the network is fine, check the app
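When there are too many dropped flows to read by eye, the JSON output of `hubble observe -o json` can be summarised with a short script. The Python sketch below is a starting point under assumptions: the field names (`verdict`, `source`, `destination`, `namespace`, `pod_name`, `drop_reason_desc`) and the optional `flow` wrapper reflect the flow JSON as I understand it and should be checked against the output of your Hubble version. The sample lines are hand-written, not captured from a cluster.

```python
import json
from collections import Counter
from typing import Iterable


def summarize_drops(lines: Iterable[str]) -> Counter:
    """Count DROPPED flows per (source, destination, reason)."""
    drops: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumption: newer output wraps each flow in a "flow" key.
        flow = record.get("flow", record)
        if flow.get("verdict") != "DROPPED":
            continue
        src = flow.get("source", {})
        dst = flow.get("destination", {})
        key = (
            f'{src.get("namespace", "?")}/{src.get("pod_name", "?")}',
            f'{dst.get("namespace", "?")}/{dst.get("pod_name", "?")}',
            flow.get("drop_reason_desc", "UNKNOWN"),
        )
        drops[key] += 1
    return drops


# Hand-written sample records standing in for real Hubble output.
SAMPLE = [
    json.dumps({"flow": {
        "verdict": "DROPPED",
        "source": {"namespace": "payments", "pod_name": "rogue-service"},
        "destination": {"namespace": "payments", "pod_name": "payment-service"},
        "drop_reason_desc": "POLICY_DENIED",
    }}),
    json.dumps({"flow": {
        "verdict": "FORWARDED",
        "source": {"namespace": "checkout", "pod_name": "checkout-service"},
        "destination": {"namespace": "payments", "pod_name": "payment-service"},
    }}),
]
SAMPLE.append(SAMPLE[0])  # the same drop seen twice

summary = summarize_drops(SAMPLE)
for (src, dst, reason), count in summary.most_common():
    print(f"{count:>4}  {src} -> {dst}  ({reason})")
```

To use it against a live cluster, pass `sys.stdin` to `summarize_drops` and pipe `hubble observe --verdict DROPPED -o json` into the script.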

Hubble UI: Service Map

Bash
# Port-forward to Hubble UI
kubectl port-forward -n kube-system svc/hubble-ui 12000:80
open http://localhost:12000

Hubble UI shows a live service dependency map — every service as a node, every active connection as an edge, with request rate, error rate, and latency per connection.

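For a compliance audit question such as "Show me all services that communicated with the payments namespace in the last 7 days", keep in mind that Hubble stores flows in a fixed-size in-memory ring buffer on each node. Answering it requires exporting flows to long-term storage (for example a log pipeline or SIEM) and querying them there.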

Hubble metrics in Prometheus

YAML
# Cilium Helm values: enable Hubble metrics
hubble:
  metrics:
    enabled:
      - dns:query;ignoreAAAA
      - drop
      - tcp
      - flow
      - icmp
      - http
    serviceMonitor:
      enabled: true    # auto-register Prometheus ServiceMonitor

Grafana dashboards from these metrics:

  • HTTP error rate per source/destination pair: which service is causing 5xx to payment-service?
  • DNS failure rate: which pods are failing DNS resolution?
  • Drop rate: how many packets are being dropped by network policies?
  • Connection latency per service pair: is checkout→payment slow due to network or app?

Migrating from Calico to Cilium

Swapping CNIs is not a simple in-place change: Cilium and Calico use different data planes, and pods must be recreated to move from one to the other. Options:

Option 1: Blue-Green node groups (recommended for production)

1. Create a new node group with Cilium CNI configured
2. Validate that existing network policies have equivalents in CiliumNetworkPolicy
3. Taint old nodes: kubectl taint nodes -l cni=calico migration=true:NoSchedule
4. Drain the old nodes (kubectl drain) so workloads are evicted and rescheduled onto the Cilium nodes
5. Remove old Calico nodes

Option 2: Cluster rebuild (for EKS/AKS managed clusters)

Create a new cluster with Cilium, migrate workloads via GitOps (ArgoCD sync to new cluster), redirect traffic via DNS. Clean migration with no hybrid state.

Policy conversion

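Standard Kubernetes NetworkPolicy (networking.k8s.io/v1) manifests generally work unchanged on Cilium, which implements the upstream API. Calico-specific resources (projectcalico.org NetworkPolicy and GlobalNetworkPolicy) are not read by Cilium and must be rewritten as CiliumNetworkPolicy or CiliumClusterwideNetworkPolicy. You only need CiliumNetworkPolicy beyond that for the L7 features. Start by applying existing standard NetworkPolicies, then add Cilium-specific L7 policies incrementally.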
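Before a migration it helps to know which policy manifests can be applied as they are and which need rewriting. The Python sketch below sorts already-parsed manifests (plain dictionaries, for example loaded with a YAML library of your choice) by API group. The `classify_policy` function and its category labels are hypothetical; the only facts it relies on are that standard policies live in the `networking.k8s.io` group, Calico's own policy resources live under `projectcalico.org`, and Cilium's under `cilium.io`.

```python
from collections import defaultdict
from typing import Dict, List


def classify_policy(manifest: dict) -> str:
    """Sort one manifest into a migration bucket based on its API group."""
    api_version = manifest.get("apiVersion", "")
    kind = manifest.get("kind", "")
    group = api_version.split("/")[0]
    if group == "networking.k8s.io" and kind == "NetworkPolicy":
        return "apply-as-is"
    if group.endswith("projectcalico.org"):
        return "rewrite-for-cilium"
    if group == "cilium.io":
        return "already-cilium"
    return "not-a-network-policy"


def plan_migration(manifests: List[dict]) -> Dict[str, List[str]]:
    """Group manifest names by migration bucket."""
    plan: Dict[str, List[str]] = defaultdict(list)
    for manifest in manifests:
        name = manifest.get("metadata", {}).get("name", "<unnamed>")
        plan[classify_policy(manifest)].append(name)
    return dict(plan)


manifests = [
    {"apiVersion": "networking.k8s.io/v1", "kind": "NetworkPolicy",
     "metadata": {"name": "payment-service-ingress"}},
    {"apiVersion": "projectcalico.org/v3", "kind": "GlobalNetworkPolicy",
     "metadata": {"name": "calico-default-deny"}},
    {"apiVersion": "cilium.io/v2", "kind": "CiliumNetworkPolicy",
     "metadata": {"name": "payment-service-l7"}},
    {"apiVersion": "apps/v1", "kind": "Deployment",
     "metadata": {"name": "payment-service"}},
]

for bucket, names in plan_migration(manifests).items():
    print(f"{bucket}: {', '.join(names)}")
```

Everything in the "rewrite-for-cilium" bucket is work to schedule before the old nodes are drained.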


Cilium on AKS and EKS

AKS (Azure Kubernetes Service):

Bash
# Create AKS cluster with Azure CNI Powered by Cilium (overlay mode)
az aks create \
  --resource-group myRG \
  --name myCluster \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --network-dataplane cilium

EKS (Amazon EKS) — Cilium as a replacement for the AWS VPC CNI:

Bash
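# Disable kube-proxy before installing Cilium
kubectl -n kube-system patch daemonset kube-proxy -p '{"spec":{"template":{"spec":{"nodeSelector":{"non-existing":"true"}}}}}'

# eni.enabled + ipam.mode=eni: use AWS ENI for pod IPs.
# The aws-node DaemonSet (AWS VPC CNI) must also be disabled or removed.
# With kubeProxyReplacement=true, also set k8sServiceHost/k8sServicePort
# to your cluster's API server endpoint.
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set eni.enabled=true \
  --set ipam.mode=eni \
  --set kubeProxyReplacement=true \
  --set hubble.enabled=true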

Platform Security Architecture with Cilium

Combining all Cilium features gives a layered zero-trust architecture:

Layer 1: Node-to-node encryption (WireGuard)
  → All pod traffic encrypted in transit between nodes, automatically

Layer 2: Default-deny cluster-wide policy (CiliumClusterwideNetworkPolicy)
  → No service can reach any other service unless explicitly allowed

Layer 3: Namespace-level policies
  → Namespaces are isolated; ingress/egress to other namespaces must be declared

Layer 4: L7 policies (CiliumNetworkPolicy with HTTP/Kafka rules)
  → Services can only call specific paths/methods on other services

Layer 5: SPIFFE workload identity + authentication mode
  → Connections verified by cryptographic identity, not just IP addresses or labels

Layer 6: Hubble audit log
  → Every flow logged; dropped packets show which policy blocked them
  → Flows exported to long-term storage (for example with 30-day retention) as compliance audit evidence

This is zero trust as a platform feature — teams don't configure it, they declare the traffic they need via CiliumNetworkPolicy. The platform enforces everything else.


Quick Reference: Key Cilium Commands

Bash
# Check overall Cilium health
cilium status --verbose

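# Run Cilium's end-to-end connectivity test suite
cilium connectivity test

# List all policies loaded by the Cilium agent on a node
kubectl -n kube-system exec ds/cilium -- cilium-dbg policy get

# List endpoints on a node, then inspect one by its endpoint ID
kubectl -n kube-system exec ds/cilium -- cilium-dbg endpoint list
kubectl -n kube-system exec ds/cilium -- cilium-dbg endpoint get <endpoint-id>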

# Watch flows in real-time
hubble observe --namespace <ns> --follow

# Show dropped flows (policy violations)
hubble observe --verdict DROPPED

# Get flows between two services
hubble observe \
  --from-label app=checkout \
  --to-label app=payment-service

# Check Hubble status (requires the relay port-forward)
hubble status

# Open the Hubble UI
cilium hubble ui
