Platform Engineering: FinOps and Kubernetes Cost Optimization — Karpenter, Kubecost, VPA, and Spot Instances

Why FinOps Belongs in the Platform

Without intentional cost engineering, Kubernetes clusters become expensive fast. Most teams default to over-requesting resources (to be safe), never cleaning up dev workloads, and having no idea what anything costs.

Platform engineers are uniquely positioned to fix this structurally, because they control:

How nodes are provisioned (Karpenter)
What resource requests and limits teams set (VPA recommendations + Kyverno enforcement)
How costs are attributed (Kubecost labels enforced by policy)
What developers see (Backstage cost widget)

Cost optimization without platform ownership is a one-time cleanup. Cost optimization with platform ownership is a self-sustaining flywheel.

The Kubernetes Cost Model

Before optimizing, understand what drives K8s costs:

Total cluster cost = Node costs + Managed control plane costs + Storage + Network egress

Node cost breakdown:
├── Compute cost: priced per CPU/hour and RAM/hour
├── Spot discount: 60-80% cheaper, can be reclaimed by cloud provider
└── Reserved vs on-demand: 30-50% savings for stable baseline workloads

What your workloads actually pay for:
├── Requested CPU and memory (even if unused — you pay for reserved capacity)
├── NOT actual usage — if you request 4 CPU and use 0.2, you paid for 4
└── Node overhead: ~10-15% of each node is consumed by the OS and K8s agents

The rightsizing opportunity: Most Kubernetes workloads use 20-40% of their requested resources. This is the biggest cost reduction lever, and it's purely about correct resource requests.
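To make the gap between requested and used capacity concrete, here is a minimal TypeScript sketch. The helper names (`monthlyRequestCost`, `wastedShare`) and the unit prices in the usage note are illustrative assumptions, not real cloud rates or any tool's API.

```typescript
// Unit prices are supplied by the caller; any values used with this sketch are assumptions.
interface UnitPrices {
  cpuPerHour: number;    // USD per requested vCPU per hour
  ramGiBPerHour: number; // USD per requested GiB of RAM per hour
}

interface Workload {
  cpuRequested: number;    // vCPU requested
  cpuUsed: number;         // vCPU actually consumed on average
  ramRequestedGiB: number; // GiB requested
  ramUsedGiB: number;      // GiB actually consumed on average
}

const HOURS_PER_MONTH = 720; // 30-day month

// Cost follows requests, because requests are what reserve node capacity.
function monthlyRequestCost(w: Workload, p: UnitPrices): number {
  const hourly = w.cpuRequested * p.cpuPerHour + w.ramRequestedGiB * p.ramGiBPerHour;
  return hourly * HOURS_PER_MONTH;
}

// Share of the bill that pays for capacity the workload never uses.
function wastedShare(w: Workload, p: UnitPrices): number {
  const requested = w.cpuRequested * p.cpuPerHour + w.ramRequestedGiB * p.ramGiBPerHour;
  const used = w.cpuUsed * p.cpuPerHour + w.ramUsedGiB * p.ramGiBPerHour;
  return requested === 0 ? 0 : (requested - used) / requested;
}
```

With assumed prices of $0.03 per vCPU-hour and $0.004 per GiB-hour, a workload that requests 4 CPU and 8 GiB but uses 0.2 CPU and 2 GiB costs about $109 per month, and roughly 91% of that pays for unused capacity.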

Karpenter: Right-Sizing Nodes

The Cluster Autoscaler can only scale within pre-defined node groups. If your groups contain c5.2xlarge (8 CPU, 16GB), that's what every scale-up gets — whether your workload needs 1 CPU or 8 CPU.

Karpenter looks at the resource requests of unschedulable Pods and provisions exactly the right instance type:

Pod needs 1.5 CPU + 3GB RAM → Karpenter picks t3.medium or c6i.large (cheapest that fits)
Pod needs 12 CPU + 24GB RAM → Karpenter picks c6i.4xlarge (16 vCPU, 32GB, minus node overhead)
Pod needs GPU → Karpenter picks g5.xlarge

No node groups. No pre-defined pools. Just constraints.
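The "cheapest that fits" idea can be sketched in a few lines of TypeScript. This is not Karpenter's actual scheduling algorithm, only an illustration of the selection principle. The `NodeOption` type, the `cheapestFit` helper, the flat 10% overhead, and the prices in the catalog are all assumptions made for the example.

```typescript
interface NodeOption {
  name: string;
  vcpu: number;
  memoryGiB: number;
  hourlyUsd: number; // illustrative prices, assumed for this sketch
}

// Assumed fraction of each node consumed by the OS and Kubernetes agents.
const OVERHEAD = 0.1;

// Return the cheapest option whose usable capacity covers the Pod's requests,
// or undefined when nothing in the catalog is large enough.
function cheapestFit(cpu: number, memoryGiB: number, options: NodeOption[]): NodeOption | undefined {
  return options
    .filter(o => o.vcpu * (1 - OVERHEAD) >= cpu && o.memoryGiB * (1 - OVERHEAD) >= memoryGiB)
    .sort((a, b) => a.hourlyUsd - b.hourlyUsd)[0];
}

const catalog: NodeOption[] = [
  { name: "c6i.large", vcpu: 2, memoryGiB: 4, hourlyUsd: 0.085 },
  { name: "c6i.xlarge", vcpu: 4, memoryGiB: 8, hourlyUsd: 0.17 },
  { name: "c6i.2xlarge", vcpu: 8, memoryGiB: 16, hourlyUsd: 0.34 },
];
```

Calling `cheapestFit(1.5, 3, catalog)` selects `c6i.large`, while a request of 3.8 CPU skips `c6i.xlarge` because overhead leaves it with only 3.6 usable vCPU.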

NodePool configuration

YAML

# NodePool — defines what Karpenter can provision
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    metadata:
      labels:
        node-type: general
    spec:
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
      requirements:
        # Allow any architecture
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]
        # Prefer spot, fall back to on-demand
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        # Broad instance family selection — let Karpenter choose
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["c6i", "c6a", "c7i", "m6i", "m6a", "r6i"]
        # Minimum node size: skip tiny instances where node overhead dominates
        - key: karpenter.k8s.aws/instance-size
          operator: NotIn
          values: ["nano", "micro", "small"]
  limits:
    cpu: 1000            # max 1000 CPU across all Karpenter nodes
    memory: 4000Gi
  disruption:
    consolidationPolicy: WhenUnderutilized  # key for cost savings
    # consolidateAfter is only valid with WhenEmpty in v1beta1, so it is omitted here

EC2NodeClass

YAML

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2        # Amazon Linux 2
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "my-cluster"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "my-cluster"
  instanceProfile: KarpenterNodeInstanceProfile
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 50Gi
        volumeType: gp3
        encrypted: true

Spot instance handling

The critical consideration: spot instances can be reclaimed by AWS with 2 minutes notice.

YAML

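# Make workloads spot-tolerant
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service
spec:
  replicas: 3                       # more than one replica, so a single reclaim is survivable
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service             # must match the selector and the spread constraint
    spec:
      topologySpreadConstraints:
        # Spread across availability zones: if one AZ has spot reclamation, not all pods die
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: my-service
      # Graceful shutdown: must finish well inside the 2-minute spot notice
      terminationGracePeriodSeconds: 30
      containers:
        - name: my-service
          image: my-service:latest  # placeholder image
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 5"]  # gives LB time to drain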

Spot fallback strategy in NodePool:

YAML

requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["spot", "on-demand"]

Karpenter tries spot first and automatically falls back to on-demand if spot capacity is unavailable.

Consolidation: The Hidden Cost Saver

Karpenter consolidation continuously looks for opportunities to move workloads to fewer, cheaper nodes and terminate the rest:

Before consolidation:
  Node A (c6i.2xlarge, $0.34/h): 30% CPU used, 3 pods
  Node B (c6i.2xlarge, $0.34/h): 20% CPU used, 2 pods
  Node C (c6i.xlarge,  $0.17/h): 15% CPU used, 1 pod

After consolidation:
  Node A (c6i.2xlarge, $0.34/h): 58% CPU used, 6 pods
  Nodes B and C: terminated → save $0.51/h ≈ $367/month

With consolidationPolicy: WhenUnderutilized, this happens automatically. Organizations typically see 20-40% cost reduction from consolidation alone.
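The arithmetic behind a consolidation decision can be checked with a small TypeScript sketch. It is a simplification under stated assumptions: it considers CPU only, ignores memory and node overhead, and treats the percentages as the share of vCPU requested by Pods. `NodeState` and `consolidationSaving` are hypothetical names, not Karpenter APIs.

```typescript
interface NodeState {
  id: string;
  vcpu: number;
  hourlyUsd: number;
  cpuShare: number; // 0..1, share of the node's vCPU taken by its Pods
}

// Total vCPU needed by all Pods across the given nodes.
function totalCpu(nodes: NodeState[]): number {
  return nodes.reduce((sum, n) => sum + n.vcpu * n.cpuShare, 0);
}

// Hourly saving if every Pod is packed onto `keep` and all other nodes are terminated.
// Returns undefined when the combined load does not fit on `keep`.
function consolidationSaving(nodes: NodeState[], keep: NodeState): number | undefined {
  if (totalCpu(nodes) > keep.vcpu) return undefined;
  return nodes
    .filter(n => n.id !== keep.id)
    .reduce((sum, n) => sum + n.hourlyUsd, 0);
}
```

For three nodes carrying 2.4, 1.6, and 0.6 vCPU of load (4.6 vCPU in total), the Pods fit on an 8-vCPU node but not on a 4-vCPU one, so the saving is the combined price of the two nodes that are removed.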

Kubecost: Team-Level Cost Visibility

Kubecost (built on the open-source OpenCost project) allocates cluster costs to namespaces, workloads, and teams using Kubernetes labels.

Installation

Bash

helm install kubecost cost-analyzer \
  --repo https://kubecost.github.io/cost-analyzer \
  --namespace kubecost \
  --create-namespace \
  --set kubecostToken="" \
  --set prometheus.server.persistentVolume.size=32Gi

Cost attribution via labels (enforced by Kyverno)

For Kubecost to allocate costs correctly, every resource needs labels. The platform enforces this:

YAML

# Kyverno policy: require cost-attribution labels
# The rule matches Pods because Kubecost aggregates on Pod labels; Kyverno
# auto-generates equivalent rules for Deployments, StatefulSets, DaemonSets, Jobs, and CronJobs
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-cost-labels
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-labels
      match:
        any:
          - resources:
              kinds: [Pod]
      validate:
        message: "Pods must have 'team', 'env', and 'cost-center' labels."
        pattern:
          metadata:
            labels:
              team: "?*"
              env: "?*"
              cost-center: "?*"

Every deployment without these labels is rejected at admission.
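Teams may want to catch missing labels before admission, for example in CI. The following TypeScript sketch mirrors the `"?*"` pattern (at least one character) for the three required labels. `missingCostLabels` is a hypothetical helper, not part of Kyverno.

```typescript
const REQUIRED_LABELS = ["team", "env", "cost-center"] as const;

// Return the required labels that are absent or empty, mirroring the "?*" pattern.
function missingCostLabels(labels: Record<string, string> | undefined): string[] {
  return REQUIRED_LABELS.filter(key => (labels?.[key] ?? "").length === 0);
}
```

An empty result means the manifest would pass the policy's label check; a non-empty result lists exactly which labels to add.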

Kubecost allocation API

Bash

# Get per-team costs for the last 30 days
curl "http://kubecost.internal/model/allocation?window=30d&aggregate=label:team&accumulate=true"

# Response (abridged):
{
  "data": {
    "team-payments": {
      "cpuCost": 245.32,
      "ramCost": 87.14,
      "pvCost": 34.20,
      "networkCost": 12.50,
      "totalCost": 379.16
    },
    "team-orders": {
      "totalCost": 521.44
    }
  }
}
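Once the allocation data is available, ranking teams by spend is a small transformation. The TypeScript sketch below assumes the simplified response shape shown above; the real payload may be wrapped differently and carries more fields. `AllocationResponse` and `rankTeamsByCost` are hypothetical names.

```typescript
interface Allocation {
  totalCost: number;
  cpuCost?: number;
  ramCost?: number;
  pvCost?: number;
  networkCost?: number;
}

// Assumed shape: one entry per team label value.
interface AllocationResponse {
  data: Record<string, Allocation>;
}

interface TeamCost {
  team: string;
  totalCost: number;
}

// Sort teams from most to least expensive.
function rankTeamsByCost(resp: AllocationResponse): TeamCost[] {
  return Object.keys(resp.data)
    .map(team => ({ team, totalCost: resp.data[team]?.totalCost ?? 0 }))
    .sort((a, b) => b.totalCost - a.totalCost);
}
```

A ranking like this is a natural input for the weekly cost report or for a leaderboard in the developer portal.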

Backstage plugin integration

Wire Kubecost into Backstage so each team sees their costs:

TYPESCRIPT

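// backstage-kubecost-plugin/src/components/CostWidget.tsx
import React from 'react';
import { Entity } from '@backstage/catalog-model';
import { InfoCard } from '@backstage/core-components';
import { Typography } from '@material-ui/core';
// Plugin-local modules, not shown here; the paths are assumptions
import { useKubecostAllocation } from '../hooks/useKubecostAllocation';
import { CostBreakdownChart } from './CostBreakdownChart';

export const CostWidget = ({ entity }: { entity: Entity }) => {
  // Team name read from a custom annotation on the catalog entity
  const teamLabel = entity.metadata.annotations?.['backstage.io/team'];
  const { data } = useKubecostAllocation({ team: teamLabel, window: '30d' });

  return (
    <InfoCard title="Cloud Cost (30 days)">
      <Typography variant="h4">${data?.totalCost.toFixed(2)}</Typography>
      <CostBreakdownChart data={data} />
      <Typography variant="body2" color="textSecondary">
        CPU: ${data?.cpuCost.toFixed(2)} | Memory: ${data?.ramCost.toFixed(2)}
      </Typography>
    </InfoCard>
  );
};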

When teams see their own cloud spend in the developer portal, behavior changes. Teams that were requesting 4 CPU and using 0.3 CPU start right-sizing voluntarily once they see the dollar amount.

Cost alerting

YAML

# Alert when a team's weekly cost increases >20%
groups:
  - name: finops
    rules:
      - alert: TeamCostSpike
        expr: |
          (
            kubecost_cluster_costs_total{label_team!=""}
            - kubecost_cluster_costs_total{label_team!=""} offset 7d
          ) / kubecost_cluster_costs_total{label_team!=""} offset 7d > 0.20
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Team {{ $labels.label_team }} costs up >20% vs last week"
          description: "Cost is {{ $value | humanizePercentage }} above last week's baseline"
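The alert expression is a relative week-over-week change compared against a threshold. A minimal TypeScript sketch of the same calculation follows; `weekOverWeekIncrease` and `isCostSpike` are hypothetical helpers, useful for unit-testing the rule's logic outside Prometheus.

```typescript
// Relative change versus last week, e.g. 0.25 for a 25% increase.
// Returns undefined when there is no positive baseline to compare against.
function weekOverWeekIncrease(current: number, lastWeek: number): number | undefined {
  if (lastWeek <= 0) return undefined;
  return (current - lastWeek) / lastWeek;
}

// Mirrors the alert condition: fire when the increase exceeds the threshold.
function isCostSpike(current: number, lastWeek: number, threshold: number = 0.2): boolean {
  const change = weekOverWeekIncrease(current, lastWeek);
  return change !== undefined && change > threshold;
}
```

Note that the value produced is a ratio, not a dollar amount, which is why it should be rendered as a percentage in the alert text.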

Vertical Pod Autoscaler (VPA): Resource Rightsizing

VPA analyzes historical resource usage and recommends correct CPU/memory requests. It can also apply recommendations automatically.

Installation

Bash

git clone https://github.com/kubernetes/autoscaler
./autoscaler/vertical-pod-autoscaler/hack/vpa-up.sh

VPA in recommendation mode (safest start)

YAML

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: order-service-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  updatePolicy:
    updateMode: "Off"      # recommendation only, don't touch the pods
  resourcePolicy:
    containerPolicies:
      - containerName: order-service
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 4
          memory: 4Gi

Check recommendations:

Bash

kubectl describe vpa order-service-vpa
# Status:
#   Recommendation:
#     Container Recommendations:
#       Container Name: order-service
#       Target:
#         Cpu:     250m         ← suggested request
#         Memory:  512Mi
#       Lower Bound:
#         Cpu:     100m
#         Memory:  256Mi
#       Upper Bound:
#         Cpu:     1000m
#         Memory:  1Gi

The team requested 2 CPU. VPA says 250m is enough. That's 8x oversizing: this Pod is paying for 1.75 CPU it never uses.
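The comparison between a request and a VPA target is easy to automate. This TypeScript sketch parses Kubernetes CPU quantities in the two forms used here (whole cores such as `2` and millicores such as `250m`) and reports the oversizing factor. `parseCpu` and `oversizing` are hypothetical helpers, and other quantity formats are not handled.

```typescript
// Parse a CPU quantity such as "250m" or "2" into cores.
function parseCpu(quantity: string): number {
  const isMilli = quantity.charAt(quantity.length - 1) === "m";
  return isMilli ? Number(quantity.slice(0, -1)) / 1000 : Number(quantity);
}

// Compare the current request with the VPA target.
function oversizing(requested: string, target: string): { factor: number; excessCores: number } {
  const req = parseCpu(requested);
  const tgt = parseCpu(target);
  return { factor: req / tgt, excessCores: req - tgt };
}
```

For a request of `2` against a target of `250m`, this yields a factor of 8 and 1.75 excess cores.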

VPA Auto mode (apply recommendations automatically)

YAML

updatePolicy:
  updateMode: "Auto"    # apply recommendations, evict and restart pods as needed

Warning: VPA Auto mode evicts Pods to apply recommendations, which causes brief service disruption. Run at least two replicas and define a PodDisruptionBudget to prevent a complete outage.

Combining VPA and HPA

VPA adjusts resource requests per pod. HPA adjusts replica count. They conflict when both manage CPU.

Best practice: Use VPA for memory rightsizing (HPA doesn't scale on memory well). Use HPA for CPU-based scaling. Exclude CPU from VPA when HPA is active:

YAML

containerPolicies:
  - containerName: order-service
    controlledResources:
      - memory      # VPA manages memory only
    # CPU managed by HPA

Waste Detection: Finding Hidden Costs

Idle workloads

Workloads running with no traffic — old dev deployments, forgotten test services:

Bash

# List running deployments (replicas > 0) as idle candidates
kubectl get deployments -A -o json | jq -r '
  .items[] |
  select(.spec.replicas > 0) |
  [.metadata.namespace, .metadata.name, .spec.replicas] |
  @csv
'
# Then cross-reference with Prometheus (http_requests_total by service) to find the ones with no traffic

Automate: a daily job queries Prometheus for services with zero HTTP traffic in 30 days, creates a Jira ticket for the owning team, and auto-scales to 0 after 7 days without response.
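The decision logic of that daily job can be sketched as a pure function in TypeScript. The thresholds come from the description above; `IdleAction` and `idleAction` are hypothetical names, and the Prometheus query, ticket creation, and scaling calls are deliberately left out.

```typescript
type IdleAction = "none" | "open-ticket" | "scale-to-zero";

const IDLE_DAYS = 30; // zero HTTP traffic for this long marks a service as idle
const GRACE_DAYS = 7; // days the owning team has to respond to the ticket

// Decide what the daily job should do for one service.
// daysSinceTicket is null when no ticket has been opened yet.
function idleAction(
  daysSinceLastRequest: number,
  daysSinceTicket: number | null,
  teamResponded: boolean,
): IdleAction {
  if (daysSinceLastRequest < IDLE_DAYS) return "none";
  if (daysSinceTicket === null) return "open-ticket";
  if (!teamResponded && daysSinceTicket >= GRACE_DAYS) return "scale-to-zero";
  return "none";
}
```

Keeping the policy in one side-effect-free function makes it straightforward to test before wiring it to anything that can scale workloads down.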

Oversized PVCs

Persistent volume claims that are < 20% utilized:

Bash

# kubectl has no built-in "top" for volumes; the kubelet exposes PVC usage
# as Prometheus metrics, and Kubecost surfaces the same data

# Prometheus query for underutilized PVCs:
# (kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes) < 0.2
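The same 20% threshold can be applied to exported metric samples in code. In this TypeScript sketch, `PvcUsage` and `underutilizedPvcs` are hypothetical names, and the input is assumed to have been collected from the two kubelet metrics already.

```typescript
interface PvcUsage {
  namespace: string;
  claim: string;
  usedBytes: number;     // from kubelet_volume_stats_used_bytes
  capacityBytes: number; // from kubelet_volume_stats_capacity_bytes
}

// Keep only the claims whose utilization is below the threshold.
function underutilizedPvcs(pvcs: PvcUsage[], threshold: number = 0.2): PvcUsage[] {
  return pvcs.filter(p => p.capacityBytes > 0 && p.usedBytes / p.capacityBytes < threshold);
}
```

Claims with zero reported capacity are skipped rather than flagged, since their utilization is undefined.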

Orphaned resources

Resources with no owner (PVCs from deleted StatefulSets, old ConfigMaps, stale Secrets):

Bash

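# Find Bound PVCs not mounted by any Pod
# Step 1: collect namespace/name of every PVC referenced by a Pod
kubectl get pods -A -o json | jq -r '
  .items[] |
  .metadata.namespace as $ns |
  .spec.volumes[]? |
  select(.persistentVolumeClaim != null) |
  "\($ns)/\(.persistentVolumeClaim.claimName)"
' | sort -u > /tmp/mounted-pvcs.txt
# Step 2: list Bound PVCs and keep those missing from the mounted set
kubectl get pvc -A -o json | jq -r '
  .items[] |
  select(.status.phase == "Bound") |
  "\(.metadata.namespace)/\(.metadata.name)"
' | sort -u | comm -23 - /tmp/mounted-pvcs.txt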

The FinOps Flywheel

FinOps on a platform isn't a one-time project — it's a flywheel:

1. Enforce labels (Kyverno) → cost attribution possible
         ↓
2. Show teams their costs (Kubecost + Backstage) → awareness
         ↓
3. Teams see they're over-provisioned → motivation to rightsize
         ↓
4. VPA recommendations available → easy to fix
         ↓
5. Karpenter consolidates the savings → costs drop
         ↓
6. Weekly cost report to team leads → accountability
         ↓
7. Quarterly FinOps review → celebrate savings, identify next wave
         ↓
        back to 3

Typical results after 6 months of systematic FinOps:

30-45% reduction in cluster compute costs
20-30% reduction from spot instance adoption
15-25% reduction from rightsizing
10-20% reduction from consolidation

Combined: organizations routinely save 40-60% on Kubernetes infrastructure with these tools.
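The combined figure is lower than the sum of the individual levers because each reduction applies to spend that the previous one has already shrunk. A short TypeScript sketch of that compounding follows; it assumes the levers are independent and apply one after another, which is a simplification, and `combinedReduction` is a hypothetical helper.

```typescript
// Combine fractional reductions that apply one after another to the remaining spend.
function combinedReduction(reductions: number[]): number {
  const remaining = reductions.reduce((left, r) => left * (1 - r), 1);
  return 1 - remaining;
}

// Low and high ends of the three levers listed above: spot, rightsizing, consolidation.
const lowEnd = combinedReduction([0.20, 0.15, 0.10]);
const highEnd = combinedReduction([0.30, 0.25, 0.20]);
```

Under these assumptions the low end works out to about 39% and the high end to 58%, which is close to the 40-60% range quoted above.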

Quick Reference: FinOps Tooling

| Tool | Function | Install |
|------|----------|---------|
| Karpenter | Node provisioning + consolidation | Helm chart |
| Kubecost | Cost allocation and chargeback | Helm chart |
| OpenCost | Open-source cost allocation (Kubecost OSS) | Helm chart |
| VPA | Resource request recommendations | Shell script |
| Goldilocks | VPA recommendations in Backstage/UI | Helm chart |
| kube-resource-report | HTML cost report per namespace | Docker |
| Popeye | Cluster sanitizer — find waste and config issues | CLI / CronJob |

Recommended stack: Karpenter + Kubecost + VPA (recommendation mode) + Kyverno label enforcement. Add Goldilocks for a developer-friendly VPA UI.
