Platform Engineering: eBPF and Cilium — Sidecar-Free Networking, L7 Policies, and Hubble Observability
Deep guide to eBPF-based Kubernetes networking with Cilium — how eBPF replaces kube-proxy and service mesh sidecars, L7 network policies for per-path authorization, Cilium Hubble for real-time traffic flow observability, and migrating from Calico or flannel.
Why eBPF Changes Everything in Kubernetes Networking
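Traditional Kubernetes networking runs on iptables, the rule-based interface to the long-standing Linux netfilter framework. In iptables mode, kube-proxy adds rules for every Service and endpoint, and those rules are evaluated sequentially. The cost of matching the first packet of a connection, and of reprogramming the ruleset on every change, therefore grows with the number of Services. On clusters with thousands of Services this shows up as measurable connection-setup latency and slow rule updates.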
eBPF (extended Berkeley Packet Filter) runs programs inside the Linux kernel in a sandboxed virtual machine. These programs intercept system calls and network events with near-zero overhead. No kernel modules, no recompilation — programs are verified by the kernel before loading and can be hot-swapped.
Cilium uses eBPF to implement:
- kube-proxy replacement — service load balancing in-kernel, O(1) instead of O(n) iptables
- Network policies at L3/L4/L7 — without iptables
- Service mesh without sidecars — mTLS and observability via eBPF, no Envoy sidecar injected
- Hubble — real-time network flow visibility from eBPF hooks
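The O(n) versus O(1) distinction in the list above can be illustrated with a toy model. This is a sketch, not a benchmark: the hypothetical `lookup_cost` helper uses awk to count how many comparisons a sequential rule list needs to find one service, compared with a single keyed map access. Real iptables and eBPF behaviour involves connection tracking, hashing, and other details this ignores.

```shell
# Toy model: count comparisons needed to find one service among N.
# lookup_cost is a hypothetical helper written for this illustration.
lookup_cost() {
  n=$1 target=$2
  awk -v n="$n" -v target="$target" 'BEGIN {
    # Build a sequential rule list and a keyed map with the same entries.
    for (i = 1; i <= n; i++) { rule[i] = "svc-" i; map["svc-" i] = i }
    # Sequential scan: compare against each rule until one matches.
    scan = 0
    for (i = 1; i <= n; i++) { scan++; if (rule[i] == target) break }
    # Keyed lookup: one map access regardless of n.
    keyed = (target in map) ? 1 : 0
    printf "services=%d sequential_comparisons=%d keyed_lookups=%d\n", n, scan, keyed
  }'
}

lookup_cost 10 svc-10
lookup_cost 5000 svc-5000
```

The sequential count grows with the position of the matching rule, while the keyed lookup stays constant.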
Cilium vs Traditional CNIs
| Feature | Calico / Flannel | Cilium |
|---------|------------------|--------|
| Implementation | iptables / IPVS | eBPF |
| kube-proxy | Required | Replaced |
| Service lookup | O(n) iptables | O(1) eBPF hash map |
| NetworkPolicy | L3/L4 | L3/L4/L7 (HTTP, Kafka, gRPC) |
| Service mesh | Requires Istio/Linkerd | Built-in (eBPF-based) |
| Observability | None built-in | Hubble (built-in flow logs) |
| Kernel requirement | Any | 5.10+ (recommended) |
| Complexity | Low | Medium |
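Since the table recommends kernel 5.10 or newer for Cilium, a quick check on a node can save a failed rollout. The sketch below assumes versions of the form `major.minor[...]` as printed by `uname -r`; `kernel_at_least` is a hypothetical helper name, not part of any Cilium tooling.

```shell
# Hypothetical helper: succeed if kernel version $1 (e.g. "5.15.0-91-generic")
# is at least $2 (e.g. "5.10"). Only major and minor numbers are compared.
kernel_at_least() {
  have=$1 want=$2
  have_major=${have%%.*}; rest=${have#*.}; have_minor=${rest%%[!0-9]*}
  want_major=${want%%.*}; rest=${want#*.}; want_minor=${rest%%[!0-9]*}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Check the local machine against the recommended minimum from the table.
if kernel_at_least "$(uname -r)" 5.10; then
  echo "kernel $(uname -r): meets the recommended minimum"
else
  echo "kernel $(uname -r): older than the recommended 5.10"
fi
```

For a whole cluster, each node reports its kernel in `.status.nodeInfo.kernelVersion`, which `kubectl get nodes -o wide` also displays.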
Installing Cilium with kube-proxy Replacement
```shell
# Add the Cilium Helm repository
helm repo add cilium https://helm.cilium.io/
helm repo update

# Install with kube-proxy replacement enabled.
# k8sServiceHost / k8sServicePort: replace with your API server address and port.
helm install cilium cilium/cilium \
  --version 1.15.0 \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=10.0.0.1 \
  --set k8sServicePort=6443 \
  --set hubble.enabled=true \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set hubble.metrics.enabled="{dns,drop,tcp,flow,port-distribution,icmp,http}"
```

Verify kube-proxy is gone and Cilium handles services:
```shell
cilium status
# KubeProxyReplacement: True
# Masquerading: IPTables (iptables-based, minimal rules)
# Services: 1,247 services via eBPF

kubectl -n kube-system get pods | grep kube-proxy
# (no output: kube-proxy is not running)
```

Standard Kubernetes NetworkPolicy vs Cilium Network Policy
Standard NetworkPolicy (L3/L4 only)
```yaml
# Allow ingress to payment-service only from checkout-service
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payment-service-ingress
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-service
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: checkout
          podSelector:
            matchLabels:
              app: checkout-service
      ports:
        - protocol: TCP
          port: 8080
```

This allows any HTTP method and any path on port 8080. A compromised checkout-service could call DELETE /api/v1/payments/all.
CiliumNetworkPolicy (L7 — per HTTP path)
```yaml
# Allow checkout-service to call only GET /api/v1/payments/* and POST /api/v1/payments on payment-service
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: payment-service-l7
  namespace: payments
spec:
  endpointSelector:
    matchLabels:
      app: payment-service
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: checkout-service
            k8s:io.kubernetes.pod.namespace: checkout
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: "/api/v1/payments/.*"
              - method: POST
                path: "/api/v1/payments$"
              # Anything else: DENIED (checkout cannot DELETE or call /admin/*)
```

Now checkout-service can only GET existing payments and POST new ones. Cilium redirects traffic on port 8080 through its node-local Envoy proxy, which rejects any other request, such as DELETE /api/v1/payments/123, with an HTTP 403 before it reaches the application.
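To reason about which requests the two HTTP rules in payment-service-l7 admit, it can help to try the method and path combinations locally. The sketch below is only an approximation of the matching logic, not Cilium's implementation: `l7_allowed` is a hypothetical helper, and it assumes each path regex must match the whole request path.

```shell
# Hypothetical helper: succeed if METHOD + PATH matches one of the two HTTP
# rules from the payment-service-l7 policy.
# Assumption: the regex must match the entire path, so it is wrapped in ^(...)$.
l7_allowed() {
  method=$1 path=$2
  case $method in
    GET)  regex='/api/v1/payments/.*' ;;
    POST) regex='/api/v1/payments$' ;;
    *)    return 1 ;;   # no rule exists for this method
  esac
  printf '%s\n' "$path" | grep -Eq "^(${regex})\$"
}

# Try a few requests against the rules.
for req in "GET /api/v1/payments/P123" "POST /api/v1/payments" \
           "DELETE /api/v1/payments/123" "GET /admin/users"; do
  set -- $req
  if l7_allowed "$1" "$2"; then echo "$req -> allowed"; else echo "$req -> denied"; fi
done
```

In a live cluster, the authoritative answer comes from the verdicts Hubble reports for real traffic.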
Kafka topic-level policy
Cilium understands Kafka protocol natively:
```yaml
spec:
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: order-processor
      toPorts:
        - ports:
            - port: "9092"
              protocol: TCP
          rules:
            kafka:
              - role: consume
                topic: "orders.created"
              - role: produce
                topic: "orders.processed"
              # order-processor cannot produce to payment-events topic
```

DNS-based egress policy
Allow egress only to specific external DNS names:
```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: payment-service-egress
spec:
  endpointSelector:
    matchLabels:
      app: payment-service
  egress:
    # Allow internal cluster DNS. The dns rule routes lookups through Cilium's
    # DNS proxy, which toFQDNs needs in order to map names to IPs.
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:
              - matchPattern: "*"
    # Allow only Stripe API
    - toFQDNs:
        - matchName: "api.stripe.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
    # Block everything else by default (default-deny handled by CiliumClusterwideNetworkPolicy)
```

Cluster-Wide Default Deny
Standard NetworkPolicy is namespace-scoped. Cilium's CiliumClusterwideNetworkPolicy applies globally:
```yaml
# Default deny all ingress and egress cluster-wide
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: default-deny-all
spec:
  endpointSelector: {}   # matches ALL endpoints in ALL namespaces
  ingress:
    - {}   # empty rule = deny all ingress
  egress:
    - {}   # empty rule = deny all egress
---
# Allow pods to reach kube-dns (without this, DNS breaks)
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: allow-kube-dns
spec:
  endpointSelector: {}
  egress:
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
---
# Allow pods to reach the API server (required for service accounts)
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: allow-apiserver
spec:
  endpointSelector: {}
  egress:
    - toEntities:
        - kube-apiserver
```

With this in place, every namespace starts with zero connectivity. Teams add CiliumNetworkPolicy for the specific traffic they need.
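As a sketch of what "declaring the traffic you need" might look like for a team, the hypothetical helper below prints a namespaced CiliumNetworkPolicy that allows ingress to one app from another app in the same namespace. The function name, the `shop` namespace, and the `frontend` and `backend` labels are all illustrative assumptions.

```shell
# Hypothetical helper: print a CiliumNetworkPolicy that allows ingress to
# TO_APP from FROM_APP (same namespace) on one TCP port.
# Usage: gen_allow_policy NAMESPACE FROM_APP TO_APP PORT
gen_allow_policy() {
  ns=$1 from=$2 to=$3 port=$4
  cat <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-${from}-to-${to}
  namespace: ${ns}
spec:
  endpointSelector:
    matchLabels:
      app: ${to}
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: ${from}
      toPorts:
        - ports:
            - port: "${port}"
              protocol: TCP
EOF
}

# Print the manifest; pipe it to "kubectl apply -f -" to apply it.
gen_allow_policy shop frontend backend 8080
```

Under a cluster-wide default deny that also covers egress, the calling side needs a matching egress rule as well; this sketch only generates the ingress half.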
Mutual TLS Without Sidecars
Cilium's WireGuard encryption transparently encrypts pod-to-pod traffic between nodes, with no sidecars. It protects traffic on the wire but does not authenticate individual workloads:

```shell
helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --reuse-values \
  --set encryption.enabled=true \
  --set encryption.type=wireguard
```

For per-workload mutual authentication with identity verification, Cilium integrates with SPIFFE/SPIRE:
```yaml
# Cilium policy using SPIFFE workload identity
spec:
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: checkout-service
      authentication:
        mode: "required"   # peers must complete mutual authentication
```

With SPIFFE integration, Cilium checks the peer's cryptographic workload identity (SVID) before allowing traffic, rather than trusting the source IP or labels alone. A Pod that spoofs labels cannot fake a SPIFFE identity.
This gives you zero-trust networking at the kernel level without Istio's Envoy sidecars.
Hubble: Real-Time Network Flow Observability
Hubble reads eBPF hook data and provides a flow log of every packet in the cluster.
CLI: Live flow inspection
```shell
# Install hubble CLI
export HUBBLE_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/hubble/master/stable.txt)
curl -L --remote-name-all https://github.com/cilium/hubble/releases/download/$HUBBLE_VERSION/hubble-linux-amd64.tar.gz
sudo tar xzvfC hubble-linux-amd64.tar.gz /usr/local/bin

# Port-forward to Hubble relay
cilium hubble port-forward &

# Watch all flows in the payments namespace
hubble observe \
  --namespace payments \
  --follow
# Output:
# payments/checkout-service -> payments/payment-service:8080 (HTTP GET /api/v1/payments/P123, 200, 23ms)
# payments/checkout-service -> payments/payment-service:8080 (HTTP POST /api/v1/payments, 201, 45ms)
# kube-system/monitoring -> payments/payment-service:9090 (TCP SYN, FORWARDED)

# Filter: show only dropped flows (NetworkPolicy violations)
hubble observe \
  --verdict DROPPED \
  --follow
# Output:
# payments/rogue-service -> payments/payment-service:8080 (HTTP DELETE /api/v1/payments, DROPPED, L7 policy)
```

This is the most powerful debugging tool in the platform. Most network questions of the form "why can't service A reach service B?" can be answered in seconds:
```shell
# Check if payment-service can reach Stripe
hubble observe \
  --from-label app=payment-service \
  --to-fqdn api.stripe.com \
  --verdict DROPPED \
  --verdict FORWARDED
# If dropped: which policy is blocking it?
# If forwarded: the network is fine, check the app
```

Hubble UI: Service Map
```shell
# Port-forward to Hubble UI
kubectl port-forward -n kube-system svc/hubble-ui 12000:80
open http://localhost:12000
```

Hubble UI shows a live service dependency map: every service as a node, every active connection as an edge, with request rate, error rate, and latency per connection.
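For a compliance audit question such as "show me all services that communicated with the payments namespace in the last 7 days", Hubble can answer quickly, provided flows have been exported to long-term storage. Hubble itself keeps only recent flows in a bounded in-memory buffer on each node.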
Hubble metrics in Prometheus
```yaml
# Cilium Helm values: enable Hubble metrics
hubble:
  metrics:
    enabled:
      - dns:query;ignoreAAAA
      - drop
      - tcp
      - flow
      - icmp
      - http
    serviceMonitor:
      enabled: true   # auto-register Prometheus ServiceMonitor
```

Grafana dashboards from these metrics:
- HTTP error rate per source/destination pair: which service is causing 5xx to payment-service?
- DNS failure rate: which pods are failing DNS resolution?
- Drop rate: how many packets are being dropped by network policies?
- Connection latency per service pair: is checkout→payment slow due to network or app?
Migrating from Calico to Cilium
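Treat the migration as a node-level or cluster-level replacement rather than an in-place swap: Cilium and Calico use different data planes, and existing pods keep the networking they were created with. Options: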
Option 1: Blue-Green node groups (recommended for production)
1. Create new node group with Cilium CNI configured
2. Taint old nodes: kubectl taint nodes -l cni=calico migration=true:NoSchedule
3. Workloads migrate to new Cilium nodes via eviction
4. Validate network policies have equivalents in CiliumNetworkPolicy
5. Remove old Calico nodes

Option 2: Cluster rebuild (for EKS/AKS managed clusters)
Create a new cluster with Cilium, migrate workloads via GitOps (ArgoCD sync to new cluster), redirect traffic via DNS. Clean migration with no hybrid state.
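For the blue-green route (Option 1), the eviction step can be previewed before anything is touched. The sketch below only prints the commands it would run; `plan_node_drain` and the node names are hypothetical, while `kubectl cordon` and `kubectl drain` with the flags shown are standard kubectl.

```shell
# Hypothetical helper: print (not run) the kubectl commands that would move
# workloads off the given old-CNI nodes, one node at a time.
plan_node_drain() {
  for node in "$@"; do
    echo "kubectl cordon ${node}"
    echo "kubectl drain ${node} --ignore-daemonsets --delete-emptydir-data"
  done
}

# Example with two placeholder node names. In a real cluster the list could come
# from: kubectl get nodes -l cni=calico -o jsonpath='{.items[*].metadata.name}'
plan_node_drain calico-node-1 calico-node-2
```

Printing the plan first makes it easy to review the order and pace of the drain, which matters when PodDisruptionBudgets limit how many replicas may be evicted at once.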
Policy conversion
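Standard Kubernetes NetworkPolicy YAML works unchanged on Cilium, which implements the networking.k8s.io/v1 API. Calico-specific resources (the projectcalico.org NetworkPolicy and GlobalNetworkPolicy kinds) are not read by Cilium and must be rewritten as CiliumNetworkPolicy or CiliumClusterwideNetworkPolicy. Beyond that, you only need CiliumNetworkPolicy for the L7 features. Start by applying existing NetworkPolicies, then add Cilium-specific L7 policies incrementally.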
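One practical first step in policy conversion is finding manifests that use Calico's own API groups (`projectcalico.org` or `crd.projectcalico.org`), since Cilium does not read those and they need rewriting. The sketch below scans a directory of YAML files; `find_calico_policies` and the demo files are hypothetical, and it assumes manifests live on disk as `.yaml` or `.yml` files.

```shell
# Hypothetical helper: list manifest files under DIR whose apiVersion belongs to
# a Calico API group, so they can be rewritten as Cilium policies.
find_calico_policies() {
  dir=$1
  find "$dir" -type f \( -name '*.yaml' -o -name '*.yml' \) \
    -exec grep -lE '^[[:space:]]*apiVersion:[[:space:]]*(crd\.)?projectcalico\.org/' {} + | sort
}

# Demo on a throwaway directory with one standard and one Calico-specific manifest.
demo=$(mktemp -d)
printf 'apiVersion: networking.k8s.io/v1\nkind: NetworkPolicy\n' > "$demo/standard.yaml"
printf 'apiVersion: projectcalico.org/v3\nkind: GlobalNetworkPolicy\n' > "$demo/calico.yaml"
find_calico_policies "$demo"
rm -rf "$demo"
```

Only the Calico-specific file is listed; the standard NetworkPolicy needs no conversion.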
Cilium on AKS and EKS
AKS (Azure Kubernetes Service):
```shell
# Create AKS cluster with Azure CNI Powered by Cilium
az aks create \
  --resource-group myRG \
  --name myCluster \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --network-dataplane cilium
```

EKS (Amazon EKS), with Cilium as a replacement for the AWS VPC CNI:
```shell
# Disable kube-proxy before installing Cilium
kubectl -n kube-system patch daemonset kube-proxy -p '{"spec":{"template":{"spec":{"nodeSelector":{"non-existing":"true"}}}}}'

# eni.enabled and ipam.mode=eni: use AWS ENIs for pod IPs.
# With kube-proxy disabled, also set k8sServiceHost and k8sServicePort
# to the cluster's API server endpoint.
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set eni.enabled=true \
  --set ipam.mode=eni \
  --set kubeProxyReplacement=true \
  --set hubble.enabled=true
```

Platform Security Architecture with Cilium
Combining all Cilium features gives a layered zero-trust architecture:
```text
Layer 1: Node-to-node encryption (WireGuard)
  → All pod traffic encrypted in transit between nodes, automatically
Layer 2: Default-deny cluster-wide policy (CiliumClusterwideNetworkPolicy)
  → No service can reach any other service unless explicitly allowed
Layer 3: Namespace-level policies
  → Namespaces are isolated; ingress/egress to other namespaces must be declared
Layer 4: L7 policies (CiliumNetworkPolicy with HTTP/Kafka rules)
  → Services can only call specific paths/methods on other services
Layer 5: SPIFFE workload identity + authentication mode
  → Connections verified by cryptographic identity, not just IPs or labels
Layer 6: Hubble audit log
  → Every flow logged; dropped packets show which policy blocked them
  → Flows exported to long-term storage (e.g. 30-day retention) for compliance audit evidence
```

This is zero trust as a platform feature: teams don't configure it, they declare the traffic they need via CiliumNetworkPolicy. The platform enforces everything else.
Quick Reference: Key Cilium Commands
```shell
# Check overall Cilium health
cilium status --verbose

# Run the end-to-end connectivity test suite (deploys test workloads)
cilium connectivity test

# List the network policies loaded on a node (runs inside the Cilium agent pod)
kubectl -n kube-system exec ds/cilium -- cilium policy get

# Inspect Cilium endpoint (pod) details; find the ID with "cilium endpoint list"
kubectl -n kube-system exec ds/cilium -- cilium endpoint get <endpoint-id>

# Watch flows in real-time
hubble observe --namespace <ns> --follow

# Show dropped flows (policy violations)
hubble observe --verdict DROPPED

# Get flows between two services
hubble observe \
  --from-label app=checkout-service \
  --to-label app=payment-service

# Check Hubble relay status
hubble status

# Open the Hubble UI
cilium hubble ui &
```