.NET & C# Development · Lesson 38 of 92
Rate Limit Your API Before It Gets Hammered
Why Rate Limiting Belongs in Your API
Without it:
- A single bad actor (or a bug in a client) can exhaust your database connections
- Scrapers can pull your entire catalogue in seconds
- A thundering herd during a traffic spike can take the service down for every user
.NET 7 added Microsoft.AspNetCore.RateLimiting — no third-party package needed.
The Four Built-In Limiters
Fixed Window
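A fixed quota resets at the end of each window. Simple, but it allows a burst right at the boundary: a client can spend the full quota at the end of window N and again at the start of window N+1, so up to twice PermitLimit requests can arrive within a short span.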
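To make the boundary burst concrete, here is a minimal sketch that simulates a fixed-window counter with explicit timestamps instead of a real clock. The `TryAcquire` helper and the millisecond timestamps are hypothetical and exist only for illustration; this is a simplified model, not how the built-in limiter is implemented.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical simplified fixed-window counter: one request counter per window index.
const long WindowMs = 60_000;   // 1-minute window
const int PermitLimit = 60;     // 60 requests per window
var usedPerWindow = new Dictionary<long, int>();

bool TryAcquire(long timestampMs)
{
    long window = timestampMs / WindowMs;            // which window this instant falls in
    usedPerWindow.TryGetValue(window, out int used); // used stays 0 for a new window
    if (used >= PermitLimit) return false;           // quota for this window is spent
    usedPerWindow[window] = used + 1;
    return true;
}

int accepted = 0;
// 60 requests in the last second of window 0 (t = 59.00s .. 59.59s)
for (int i = 0; i < 60; i++) if (TryAcquire(59_000 + i * 10)) accepted++;
// 60 more in the first second of window 1 (t = 60.00s .. 60.59s)
for (int i = 0; i < 60; i++) if (TryAcquire(60_000 + i * 10)) accepted++;

Console.WriteLine($"Accepted {accepted} requests in under 2 seconds");
// Accepted 120 requests in under 2 seconds
```

All 120 requests pass even though the nominal limit is 60 per minute, because each half of the burst is charged to a different window.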
// Program.cs
using System.Threading.RateLimiting;     // QueueProcessingOrder
using Microsoft.AspNetCore.RateLimiting; // AddFixedWindowLimiter and the other Add*Limiter extensions
builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("fixed", limiter =>
    {
        limiter.Window = TimeSpan.FromMinutes(1);
        limiter.PermitLimit = 60; // 60 requests per minute
        limiter.QueueLimit = 0;   // reject excess immediately
        limiter.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    });
});
Sliding Window
Subdivides the window into segments and slides the quota forward, smoothing out boundary bursts.
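As a rough illustration of the smoothing, the sketch below replays a burst of 60 requests just before the one-minute mark and 60 just after it against a segment-based counter. The `TryAcquire` helper and the timestamps are hypothetical, and the model (summing the current segment plus the five before it) is a simplification written for this example, not the built-in limiter's actual code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical simplified sliding window: requests are counted per 10-second segment,
// and the quota applies to the current segment plus the 5 segments before it.
const long SegmentMs = 10_000;
const int SegmentsPerWindow = 6;   // 6 x 10-second segments = 1-minute window
const int PermitLimit = 60;
var usedPerSegment = new Dictionary<long, int>();

bool TryAcquire(long timestampMs)
{
    long current = timestampMs / SegmentMs;
    int inWindow = 0;
    for (long s = current - SegmentsPerWindow + 1; s <= current; s++)
    {
        usedPerSegment.TryGetValue(s, out int used);
        inWindow += used;
    }
    if (inWindow >= PermitLimit) return false;   // the last minute already holds 60 requests
    usedPerSegment.TryGetValue(current, out int usedNow);
    usedPerSegment[current] = usedNow + 1;
    return true;
}

int before = 0, after = 0;
for (int i = 0; i < 60; i++) if (TryAcquire(59_000 + i * 10)) before++;  // t = 59.00s .. 59.59s
for (int i = 0; i < 60; i++) if (TryAcquire(60_000 + i * 10)) after++;   // t = 60.00s .. 60.59s
bool later = TryAcquire(110_000);  // t = 110s: the busy segment has slid out of the window

Console.WriteLine($"{before} accepted before the boundary, {after} after, retry at 110s: {later}");
// 60 accepted before the boundary, 0 after, retry at 110s: True
```

The second half of the burst is rejected because the requests made at 59 seconds still count against the window that covers the 60-second mark.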
options.AddSlidingWindowLimiter("sliding", limiter =>
{
    limiter.Window = TimeSpan.FromMinutes(1);
    limiter.SegmentsPerWindow = 6; // 6 x 10-second segments
    limiter.PermitLimit = 60;
    limiter.QueueLimit = 5;
});
Token Bucket
Tokens accumulate at a steady rate up to a maximum. Good for bursty workloads where short spikes are acceptable but sustained throughput is capped.
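The burst cap can be observed outside ASP.NET Core by driving `TokenBucketRateLimiter` directly. This is a minimal sketch, assuming the `System.Threading.RateLimiting` package is referenced (it comes with the ASP.NET Core shared framework, while a plain console project is assumed to need the NuGet package). `AutoReplenishment` is switched off here only so that no tokens are refilled mid-demo.

```csharp
using System;
using System.Threading.RateLimiting;

using var limiter = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions
{
    TokenLimit = 100,                               // bucket capacity (max burst)
    TokensPerPeriod = 20,                           // refill amount per period
    ReplenishmentPeriod = TimeSpan.FromSeconds(10),
    QueueLimit = 0,                                 // reject instead of queueing
    AutoReplenishment = false                       // demo only: no background refill timer
});

int accepted = 0, rejected = 0;
for (int i = 0; i < 120; i++)
{
    using RateLimitLease lease = limiter.AttemptAcquire(1);
    if (lease.IsAcquired) accepted++; else rejected++;
}

Console.WriteLine($"accepted={accepted}, rejected={rejected}");
// accepted=100, rejected=20
```

The bucket starts full, so the first 100 requests drain it and the remaining 20 are rejected until tokens are replenished.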
options.AddTokenBucketLimiter("token-bucket", limiter =>
{
    limiter.TokenLimit = 100; // max burst
    limiter.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
    limiter.TokensPerPeriod = 20; // refill 20 tokens every 10s
    limiter.QueueLimit = 0;
});
Concurrency Limiter
Caps the number of simultaneous requests in flight — not the rate, but the parallelism. Useful for CPU-bound or database-bound endpoints.
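The difference from the time-based limiters is that a permit comes back as soon as the work holding it finishes. A minimal sketch using `ConcurrencyLimiter` directly, assuming the `System.Threading.RateLimiting` package is referenced; the limit of 2 is chosen only to keep the demo short.

```csharp
using System;
using System.Threading.RateLimiting;

using var limiter = new ConcurrencyLimiter(new ConcurrencyLimiterOptions
{
    PermitLimit = 2,   // at most 2 operations in flight
    QueueLimit = 0     // reject instead of queueing
});

RateLimitLease first = limiter.AttemptAcquire();
RateLimitLease second = limiter.AttemptAcquire();
RateLimitLease third = limiter.AttemptAcquire();   // both permits are held, so this is rejected
Console.WriteLine($"{first.IsAcquired} {second.IsAcquired} {third.IsAcquired}");
// True True False

first.Dispose();                                   // finishing the work hands the permit back
RateLimitLease fourth = limiter.AttemptAcquire();
Console.WriteLine(fourth.IsAcquired);
// True

second.Dispose();
third.Dispose();
fourth.Dispose();
```

No time has to pass for the fourth request to succeed; it only needed an earlier lease to be disposed.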
options.AddConcurrencyLimiter("concurrency", limiter =>
{
    limiter.PermitLimit = 10; // max 10 concurrent requests
    limiter.QueueLimit = 5;   // queue up to 5 more
});
Returning 429 With Retry-After
The default rejection returns 503 Service Unavailable. Change it globally:
builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
    options.OnRejected = async (context, cancellationToken) =>
    {
        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
        {
            context.HttpContext.Response.Headers.RetryAfter =
                ((int)retryAfter.TotalSeconds).ToString();
        }
        await context.HttpContext.Response.WriteAsync(
            "Too many requests. Please slow down.", cancellationToken);
    };
});
Applying Limiters
Global — Every Endpoint
// Adds the rate limiting middleware; on its own it applies no policy to any endpoint
app.UseRateLimiter();
Set options.GlobalLimiter to a partition-based limiter (shown below) to rate limit everything.
Per Endpoint With the Attribute
app.UseRateLimiter(); // must be registered in the pipeline
// Controller action
[EnableRateLimiting("sliding")]
[HttpPost("search")]
public IActionResult Search([FromBody] SearchRequest req) { ... }
// Opt a specific action out of a global limiter
[DisableRateLimiting]
[HttpGet("health")]
public IActionResult Health() => Ok();
Minimal API:
app.MapPost("/search", SearchHandler)
    .RequireRateLimiting("sliding");
app.MapGet("/health", () => Results.Ok())
    .DisableRateLimiting();
Rate Limiting by User ID
Partition a limiter so each authenticated user gets their own quota:
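using System.Security.Claims;        // ClaimTypes, FindFirstValue
using System.Threading.RateLimiting; // RateLimitPartition, FixedWindowRateLimiterOptions
builder.Services.AddRateLimiter(options =>
{
    options.AddPolicy("per-user", httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.User.FindFirstValue(ClaimTypes.NameIdentifier)
                ?? httpContext.Connection.RemoteIpAddress?.ToString()
                ?? "anonymous",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                Window = TimeSpan.FromMinutes(1),
                PermitLimit = 100,
                QueueLimit = 0
            }));
});
Authenticated users get 100 requests per minute each. Unauthenticated requests fall back to a 100 requests per minute quota shared per IP address. This relies on app.UseRateLimiter() running after app.UseAuthentication(); otherwise httpContext.User is not populated yet and every request is keyed by IP.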
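Partitioning can be tried without a web host by keying a `PartitionedRateLimiter` on a plain string. In this sketch the user ID string is a hypothetical stand-in for the `HttpContext`, the limit of 2 is chosen only to keep the output short, and the `System.Threading.RateLimiting` package is assumed to be referenced.

```csharp
using System;
using System.Threading.RateLimiting;

// Hypothetical stand-in: the "resource" is a user ID string instead of an HttpContext.
using PartitionedRateLimiter<string> limiter = PartitionedRateLimiter.Create<string, string>(userId =>
    RateLimitPartition.GetFixedWindowLimiter(
        partitionKey: userId,
        factory: key => new FixedWindowRateLimiterOptions
        {
            Window = TimeSpan.FromMinutes(1),
            PermitLimit = 2,
            QueueLimit = 0
        }));

bool Allowed(string userId)
{
    using RateLimitLease lease = limiter.AttemptAcquire(userId);
    return lease.IsAcquired;
}

bool a1 = Allowed("alice"), a2 = Allowed("alice"), a3 = Allowed("alice");
bool b1 = Allowed("bob");

Console.WriteLine($"alice: {a1} {a2} {a3}, bob: {b1}");
// alice: True True False, bob: True
```

Each distinct key gets its own limiter instance, so alice exhausting her quota has no effect on bob.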
Rate Limiting by IP Address
options.AddPolicy("per-ip", httpContext =>
    RateLimitPartition.GetSlidingWindowLimiter(
        partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
        factory: _ => new SlidingWindowRateLimiterOptions
        {
            Window = TimeSpan.FromSeconds(30),
            SegmentsPerWindow = 3,
            PermitLimit = 30,
            QueueLimit = 0
        }));
Chaining Limiters (Per-IP + Global)
Use PartitionedRateLimiter.CreateChained when you want multiple independent limits (e.g., per-IP AND global):
using System.Threading.RateLimiting;
var perIpLimiter = PartitionedRateLimiter.Create<HttpContext, string>(ctx =>
    RateLimitPartition.GetFixedWindowLimiter(
        partitionKey: ctx.Connection.RemoteIpAddress?.ToString() ?? "unknown",
        factory: _ => new FixedWindowRateLimiterOptions
        {
            Window = TimeSpan.FromSeconds(10),
            PermitLimit = 10
        }));
var globalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(_ =>
    RateLimitPartition.GetTokenBucketLimiter(
        partitionKey: "global",
        factory: _ => new TokenBucketRateLimiterOptions
        {
            TokenLimit = 1000,
            ReplenishmentPeriod = TimeSpan.FromSeconds(1),
            TokensPerPeriod = 100
        }));
builder.Services.AddRateLimiter(options =>
{
    options.GlobalLimiter = PartitionedRateLimiter.CreateChained(perIpLimiter, globalLimiter);
    options.RejectionStatusCode = 429;
});
A request must satisfy both limiters to proceed.
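The "both limiters" behaviour can be checked in isolation. The sketch below chains a per-IP limit of 2 with a global limit of 3, using plain IP strings as a hypothetical stand-in for the `HttpContext`. The numbers are deliberately tiny, fixed windows are used on both sides for simplicity, and the `System.Threading.RateLimiting` package is assumed to be referenced.

```csharp
using System;
using System.Threading.RateLimiting;

// Hypothetical stand-in: the "resource" is a client IP string rather than an HttpContext.
var perIpLimiter = PartitionedRateLimiter.Create<string, string>(ip =>
    RateLimitPartition.GetFixedWindowLimiter(ip,
        key => new FixedWindowRateLimiterOptions { Window = TimeSpan.FromMinutes(1), PermitLimit = 2 }));

var globalLimiter = PartitionedRateLimiter.Create<string, string>(ip =>
    RateLimitPartition.GetFixedWindowLimiter("global",
        key => new FixedWindowRateLimiterOptions { Window = TimeSpan.FromMinutes(1), PermitLimit = 3 }));

using PartitionedRateLimiter<string> chained =
    PartitionedRateLimiter.CreateChained(perIpLimiter, globalLimiter);

bool Allowed(string ip)
{
    using RateLimitLease lease = chained.AttemptAcquire(ip);
    return lease.IsAcquired;
}

bool a1 = Allowed("10.0.0.1"); // allowed  (per-IP 1/2, global 1/3)
bool a2 = Allowed("10.0.0.1"); // allowed  (per-IP 2/2, global 2/3)
bool a3 = Allowed("10.0.0.1"); // rejected: this IP's quota is spent
bool b1 = Allowed("10.0.0.2"); // allowed  (global 3/3)
bool b2 = Allowed("10.0.0.2"); // rejected: global quota is spent, although this IP has one permit left

Console.WriteLine($"{a1} {a2} {a3} {b1} {b2}");
// True True False True False
```

The chained limiter tries its inner limiters in the order given and rejects as soon as one of them refuses, which is why the third request from the first IP never touches the global quota.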
Full Setup Example
// Program.cs
using System.Security.Claims;        // ClaimTypes, FindFirstValue
using System.Threading.RateLimiting; // RateLimitPartition, MetadataName
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddControllers();
builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
    options.OnRejected = async (ctx, ct) =>
    {
        if (ctx.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retry))
            ctx.HttpContext.Response.Headers.RetryAfter = ((int)retry.TotalSeconds).ToString();
        ctx.HttpContext.Response.ContentType = "application/json";
        await ctx.HttpContext.Response.WriteAsync(
            """{"error":"rate_limit_exceeded","message":"Slow down, friend."}""", ct);
    };
    // Authenticated: 200/min per user. Anonymous: 20/min per IP.
    options.AddPolicy("adaptive", httpContext =>
    {
        var userId = httpContext.User.FindFirstValue(ClaimTypes.NameIdentifier);
        if (userId is not null)
            return RateLimitPartition.GetFixedWindowLimiter(userId,
                _ => new FixedWindowRateLimiterOptions { Window = TimeSpan.FromMinutes(1), PermitLimit = 200 });
        var ip = httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";
        return RateLimitPartition.GetFixedWindowLimiter($"anon:{ip}",
            _ => new FixedWindowRateLimiterOptions { Window = TimeSpan.FromMinutes(1), PermitLimit = 20 });
    });
});
var app = builder.Build();
// Named policies are read from endpoint metadata, so if you call UseRouting() explicitly,
// UseRateLimiter() must come after it. It must also come after UseAuthentication(),
// otherwise httpContext.User is not populated when the policy runs.
app.UseRateLimiter();
app.MapControllers().RequireRateLimiting("adaptive"); // apply the policy to every controller action
app.Run();
Key Takeaways
- Fixed window is simplest; sliding window is smoother; token bucket handles bursts gracefully
- Concurrency limiter caps parallelism, not throughput — great for downstream bottlenecks
- Always return 429, not 503, so clients can distinguish "slow down" from "server broken"
- Retry-After tells clients exactly how long to wait, so implement it
- Partition by user ID when authenticated, fall back to IP for anonymous traffic