Metrics and Prometheus in .NET

Logs tell you what happened. Traces tell you where time was spent. Metrics tell you the health of your system right now — and over time.

A good metrics setup answers questions like: How many requests per second? What's the p99 latency? How many orders failed in the last 5 minutes? Is the database connection pool exhausted?

The Three Metric Types You Need to Know

Counter — monotonically increasing. Use for "how many times did X happen?"

http_requests_total{method="POST", route="/api/orders", status="200"} 12847

Gauge — current value, goes up and down. Use for "what is the current state?"

connection_pool_active_connections 42
queue_depth 156

Histogram — distribution of values. Use for "how long did X take?" — gives you p50, p95, p99 latency.

http_request_duration_seconds_bucket{le="0.1"} 9823
http_request_duration_seconds_bucket{le="0.5"} 12100
http_request_duration_seconds_bucket{le="1.0"} 12847
http_request_duration_seconds_bucket{le="+Inf"} 12847
http_request_duration_seconds_sum 3421.4
http_request_duration_seconds_count 12847
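
Each bucket counts every observation at or below its le bound, so the counts are cumulative. To see how they turn into a percentile, here is a rough C# sketch of the linear interpolation that PromQL's histogram_quantile() applies. It is a simplification and not Prometheus's actual code: HistogramQuantile and Estimate are names invented for this example, and the +Inf bucket and other edge cases are ignored.

```csharp
using System.Collections.Generic;

public static class HistogramQuantile
{
    // buckets: cumulative counts ordered by ascending upper bound ("le").
    // Returns an estimate of the q-quantile (0 <= q <= 1) by interpolating
    // linearly inside the bucket that contains the target rank.
    public static double Estimate(
        double q,
        IReadOnlyList<(double UpperBound, double CumulativeCount)> buckets)
    {
        if (buckets.Count == 0 || q < 0 || q > 1) return double.NaN;

        double total = buckets[buckets.Count - 1].CumulativeCount;
        double rank = q * total;

        double lowerBound = 0;   // the first bucket is assumed to start at 0
        double lowerCount = 0;
        foreach (var (upperBound, cumulativeCount) in buckets)
        {
            if (cumulativeCount >= rank)
            {
                double inBucket = cumulativeCount - lowerCount;
                if (inBucket == 0) return upperBound;
                return lowerBound
                    + (upperBound - lowerBound) * (rank - lowerCount) / inBucket;
            }
            lowerBound = upperBound;
            lowerCount = cumulativeCount;
        }
        return lowerBound;
    }
}
```

Fed the three finite buckets from the sample above, Estimate(0.99, ...) lands at roughly 0.914 seconds, because the 99th-percentile rank falls inside the 0.5 to 1.0 bucket. The estimate is only as precise as the bucket boundaries allow.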

Automatic Metrics with OpenTelemetry

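ASP.NET Core (from .NET 8) emits built-in metrics through System.Diagnostics.Metrics; OpenTelemetry collects and exports them. Add the packages below and the metrics appear without any instrumentation code of your own: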

Bash

dotnet add package OpenTelemetry.Extensions.Hosting
dotnet add package OpenTelemetry.Instrumentation.AspNetCore
dotnet add package OpenTelemetry.Instrumentation.Runtime
# The Prometheus exporter has only shipped as a prerelease package, hence the flag
dotnet add package OpenTelemetry.Exporter.Prometheus.AspNetCore --prerelease

// Program.cs
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddOpenTelemetry()
    .WithMetrics(metrics =>
    {
        metrics
            .SetResourceBuilder(ResourceBuilder.CreateDefault()
                .AddService("OrderService"))
            .AddAspNetCoreInstrumentation()     // HTTP request metrics
            .AddRuntimeInstrumentation()         // GC, thread pool, memory
            .AddPrometheusExporter();            // Prometheus scrape exporter
    });

var app = builder.Build();

// Expose the /metrics endpoint
app.MapPrometheusScrapingEndpoint();

app.Run();

Visit http://localhost:5000/metrics — you'll see Prometheus-format text output immediately.

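Built-in ASP.NET Core and .NET runtime metrics include (the dotnet.* names apply on .NET 9 and later; on earlier runtimes the instrumentation package reports the same data under process.runtime.dotnet.* names):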

http.server.request.duration — latency histogram per route/method/status
http.server.active_requests — in-flight requests gauge
dotnet.gc.heap.total_allocated — GC allocation rate
dotnet.thread_pool.queue.length — thread pool pressure

Custom Business Metrics

The built-in metrics cover infrastructure. You need custom metrics for business events.

using System.Diagnostics.Metrics;

public class OrderMetrics
{
    private readonly Counter<long> _ordersPlaced;
    private readonly Counter<long> _ordersFailed;
    private readonly Histogram<double> _checkoutDuration;
    private readonly ObservableGauge<int> _pendingOrders;

    private readonly IOrderRepository _repository;

    public OrderMetrics(IMeterFactory meterFactory, IOrderRepository repository)
    {
        _repository = repository;
        var meter = meterFactory.Create("OrderService");

        _ordersPlaced = meter.CreateCounter<long>(
            "orders.placed",
            unit: "{orders}",
            description: "Total orders successfully placed");

        _ordersFailed = meter.CreateCounter<long>(
            "orders.failed",
            unit: "{orders}",
            description: "Total orders that failed during checkout");

        _checkoutDuration = meter.CreateHistogram<double>(
            "orders.checkout.duration",
            unit: "ms",
            description: "End-to-end checkout duration in milliseconds");

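        // ObservableGauge: the callback runs synchronously on every scrape, so the
        // blocking repository call below stalls the scrape; keep it fast or serve a cached value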
        _pendingOrders = meter.CreateObservableGauge<int>(
            "orders.pending",
            () => _repository.GetPendingCountAsync().GetAwaiter().GetResult(),
            unit: "{orders}",
            description: "Current count of pending orders");
    }

    public void RecordOrderPlaced(string tier, string currency)
    {
        _ordersPlaced.Add(1,
            new KeyValuePair<string, object?>("tier", tier),
            new KeyValuePair<string, object?>("currency", currency));
    }

    public void RecordOrderFailed(string reason)
    {
        _ordersFailed.Add(1,
            new KeyValuePair<string, object?>("reason", reason));
    }

    public void RecordCheckoutDuration(double milliseconds, string tier)
    {
        _checkoutDuration.Record(milliseconds,
            new KeyValuePair<string, object?>("tier", tier));
    }
}
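
The observable gauge's callback runs on every scrape, so a slow repository call slows the scrape with it. One possible alternative, sketched below under the assumption that some background job already knows the pending count, is to have the callback read a cached value. PendingOrdersGauge and Update are hypothetical names for this example, not part of the class above.

```csharp
using System.Diagnostics.Metrics;
using System.Threading;

public sealed class PendingOrdersGauge
{
    private int _latest;

    public PendingOrdersGauge(Meter meter)
    {
        // The callback only reads a field, so a scrape never waits on I/O.
        meter.CreateObservableGauge<int>(
            "orders.pending",
            () => Volatile.Read(ref _latest),
            unit: "{orders}",
            description: "Current count of pending orders");
    }

    // Called by whatever refreshes the count (assumed: a timer or hosted service).
    public void Update(int pendingCount) => Volatile.Write(ref _latest, pendingCount);
}
```

The trade-off is staleness: the gauge is only as fresh as the last Update call, which is usually acceptable when the scrape interval is 15 seconds.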

// Program.cs
builder.Services.AddSingleton<OrderMetrics>();

// In the OpenTelemetry setup, chained inside WithMetrics(...)
.AddMeter("OrderService")  // must match the name passed to meterFactory.Create

// In OrderService (requires: using System.Diagnostics;)
public class OrderService
{
    private readonly OrderMetrics _metrics;

    // OrderMetrics is injected; it is registered as a singleton in Program.cs
    public OrderService(OrderMetrics metrics) => _metrics = metrics;

    public async Task<Order> PlaceOrderAsync(PlaceOrderRequest request)
    {
        var sw = Stopwatch.StartNew();
        try
        {
            var order = await ProcessOrderAsync(request);
            _metrics.RecordOrderPlaced(request.Tier, request.Currency);
            // Elapsed.TotalMilliseconds keeps sub-millisecond precision
            _metrics.RecordCheckoutDuration(sw.Elapsed.TotalMilliseconds, request.Tier);
            return order;
        }
        catch (PaymentException)
        {
            _metrics.RecordOrderFailed("payment_declined");
            throw;
        }
        catch (InventoryException)
        {
            _metrics.RecordOrderFailed("out_of_stock");
            throw;
        }
    }
}
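
If you want to check what an instrument emits before Prometheus is in the picture, System.Diagnostics.Metrics includes a MeterListener that can subscribe in-process. The sketch below is a standalone illustration, assuming a throwaway meter named OrderService.Demo and a hypothetical CounterDemo class; it records three orders and sums them per tier tag.

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class CounterDemo
{
    // Records three orders and returns the totals observed per "tier" tag.
    public static Dictionary<string, long> Run()
    {
        var totals = new Dictionary<string, long>();

        using var meter = new Meter("OrderService.Demo"); // throwaway meter name
        var ordersPlaced = meter.CreateCounter<long>("orders.placed", unit: "{orders}");

        using var listener = new MeterListener();
        listener.InstrumentPublished = (instrument, meterListener) =>
        {
            // Subscribe only to instruments created by the demo meter.
            if (instrument.Meter.Name == "OrderService.Demo")
                meterListener.EnableMeasurementEvents(instrument);
        };
        listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
        {
            // Each Add call arrives here synchronously with its tags.
            foreach (var tag in tags)
            {
                if (tag.Key == "tier" && tag.Value is string tier)
                    totals[tier] = totals.GetValueOrDefault(tier) + value;
            }
        });
        listener.Start();

        ordersPlaced.Add(1, new KeyValuePair<string, object?>("tier", "gold"));
        ordersPlaced.Add(1, new KeyValuePair<string, object?>("tier", "gold"));
        ordersPlaced.Add(1, new KeyValuePair<string, object?>("tier", "silver"));

        return totals;
    }
}
```

Every distinct combination of tag values becomes its own time series, so keep tag values to a small, bounded set (tiers, currencies, failure reasons) rather than order IDs or user IDs.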

Running Prometheus Locally

Prometheus scrapes your /metrics endpoint on an interval and stores the time-series data.

YAML

# prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "order-service"
    static_configs:
      - targets: ["host.docker.internal:5000"]
    metrics_path: "/metrics"

Bash

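# --add-host makes host.docker.internal resolve on Linux; Docker Desktop provides it already.
# The volume path is absolute because older Docker versions reject relative paths in -v.
docker run -d --name prometheus \
  -p 9090:9090 \
  --add-host=host.docker.internal:host-gateway \
  -v "$(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml" \
  prom/prometheus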

Open http://localhost:9090 → Graph tab and run the query rate(orders_placed_total[5m]). It returns the per-second rate of placed orders, averaged over the last 5 minutes, with one series per tier/currency combination.
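
The query says orders_placed_total even though the instrument was created as orders.placed. The exporter rewrites names to fit Prometheus conventions. The sketch below approximates that mapping as far as it matters for this article; it is not the exporter's code, PrometheusNaming and ToPrometheusName are invented names, and the exact output can differ between exporter versions, so treat your own /metrics output as the source of truth.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PrometheusNaming
{
    // A few unit abbreviations and the words they are assumed to expand to.
    private static readonly Dictionary<string, string> UnitWords = new()
    {
        ["s"] = "seconds",
        ["ms"] = "milliseconds",
        ["By"] = "bytes",
    };

    public static string ToPrometheusName(string instrumentName, string? unit, bool isCounter)
    {
        // Dots and other disallowed characters become underscores.
        string name = Regex.Replace(instrumentName, "[^a-zA-Z0-9_:]", "_");

        // Annotation units such as "{orders}" are dropped entirely.
        string cleanedUnit = Regex.Replace(unit ?? "", @"\{[^}]*\}", "");
        if (cleanedUnit.Length > 0)
        {
            string word = UnitWords.TryGetValue(cleanedUnit, out var mapped) ? mapped : cleanedUnit;
            if (!name.EndsWith("_" + word, StringComparison.Ordinal))
                name += "_" + word;
        }

        // Counters get the conventional _total suffix.
        if (isCounter && !name.EndsWith("_total", StringComparison.Ordinal))
            name += "_total";

        return name;
    }
}
```

Under these assumptions orders.checkout.duration with unit ms becomes orders_checkout_duration_milliseconds, and as a histogram it is exposed as three series families with the suffixes _bucket, _sum and _count.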

Essential PromQL Queries

Request rate (per second):

sum(rate(http_server_request_duration_seconds_count{job="order-service"}[5m]))

Error rate:

sum(rate(http_server_request_duration_seconds_count{http_response_status_code=~"5.."}[5m]))
  /
sum(rate(http_server_request_duration_seconds_count[5m]))

p99 latency:

histogram_quantile(0.99,
  sum by (le) (rate(http_server_request_duration_seconds_bucket[5m]))
)

Orders per second:

sum(rate(orders_placed_total[5m]))

Order failure rate:

sum(rate(orders_failed_total[5m]))
  /
(sum(rate(orders_failed_total[5m])) + sum(rate(orders_placed_total[5m])))
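
All of these queries lean on rate(), which turns an ever-growing counter into a per-second figure. The C# sketch below shows the core idea under simplifying assumptions: it sums the increases between consecutive samples, treats a drop as a counter reset, and divides by the elapsed time. The real rate() also extrapolates to the edges of the window, which this version skips. CounterRate and PerSecond are names made up for the example.

```csharp
using System.Collections.Generic;

public static class CounterRate
{
    // samples: (timestamp in seconds, counter value), ordered by time.
    public static double PerSecond(
        IReadOnlyList<(double TimestampSeconds, double Value)> samples)
    {
        if (samples.Count < 2) return double.NaN;

        double increase = 0;
        for (int i = 1; i < samples.Count; i++)
        {
            double delta = samples[i].Value - samples[i - 1].Value;
            // A drop means the counter restarted from zero (e.g. a redeploy),
            // so the whole new value counts as increase.
            increase += delta >= 0 ? delta : samples[i].Value;
        }

        double elapsed = samples[samples.Count - 1].TimestampSeconds
                       - samples[0].TimestampSeconds;
        return elapsed > 0 ? increase / elapsed : double.NaN;
    }
}
```

Reset handling is the reason to query rate(orders_placed_total[5m]) rather than subtracting raw counter values yourself: a service restart would otherwise show up as a large negative spike.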

Grafana Dashboards

Prometheus stores data; Grafana visualises it.

Bash

docker run -d --name grafana \
  -p 3001:3000 \
  grafana/grafana-oss

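Open http://localhost:3001 and log in with admin/admin
Add data source → Prometheus → http://host.docker.internal:9090 on Docker Desktop. The URL http://prometheus:9090 resolves only if both containers are attached to the same user-defined Docker network.
Create dashboard → Add panel → use the PromQL queries above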

A minimal API dashboard should show:

Request rate (time series)
Error rate % (gauge + time series)
p50/p95/p99 latency (time series)
Active requests (gauge)

For the order service, add:

Orders per second
Order failure rate by reason
Checkout duration p99

The USE Method for Diagnosing Performance

For any resource (CPU, thread pool, connection pool), check three metrics:

Utilization — how busy is it? (CPU: 80%, thread pool: 200/400 threads)
Saturation — how much is waiting? (thread pool queue depth, connection pool wait time)
Errors — is it failing? (connection pool exhausted errors)

High utilization + high saturation = bottleneck. The USE method tells you where to look first.

# Thread pool saturation
dotnet_thread_pool_queue_length > 100  → alert

# Connection pool exhaustion (illustrative metric names; substitute whatever your driver or pool exports)
sqlclient_connection_pool_active_connections / sqlclient_connection_pool_max_connections > 0.9  → alert
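
For the thread pool, the utilization and saturation figures can also be read directly in-process. The sketch below uses the ThreadPool APIs from System.Threading; ThreadPoolUse and Snapshot are hypothetical names, and the utilization figure is a rough proxy (threads that currently exist divided by the worker-thread maximum), not a measure of how many threads are busy.

```csharp
using System.Threading;

public static class ThreadPoolUse
{
    public readonly record struct Snapshot(double Utilization, long Saturation);

    public static Snapshot Read()
    {
        ThreadPool.GetMaxThreads(out int maxWorkers, out _);

        // Utilization: how much of the allowed thread budget is in use.
        double utilization = (double)ThreadPool.ThreadCount / maxWorkers;

        // Saturation: work items queued and waiting for a thread.
        long saturation = ThreadPool.PendingWorkItemCount;

        return new Snapshot(utilization, saturation);
    }
}
```

A queue depth that keeps growing while utilization stays high is the bottleneck pattern described above.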

Summary

Counters for totals, Gauges for current state, Histograms for latency distributions
OpenTelemetry + the Prometheus exporter give automatic HTTP and runtime metrics in about 10 lines
Add custom business metrics (IMeterFactory) for domain events — orders, payments, jobs
Use labels/tags to slice metrics by route, status, tier, reason
Run Prometheus + Grafana locally with Docker for development dashboards
PromQL rate() for rates, histogram_quantile() for percentiles
