Rate Limiting in .NET
The Problem
As traffic grows, APIs face three common failure modes:
- noisy clients consume disproportionate capacity
- bursts overwhelm downstream dependencies
- well-behaved clients see added latency or failed requests because a few clients dominate throughput
Without rate limiting, even a healthy service can degrade quickly under spikes, retries, scraping, or accidental tight loops.
The Solution
Apply rate limiting as a first-class control in your .NET architecture.
In practice, rate limiting can be applied in several ways:
- globally across the whole app
- per endpoint or endpoint group
- per partition key (IP, user, API key, tenant, client app)
- with different algorithms (fixed window, sliding window, token bucket, concurrency)
- at reverse proxy or gateway layers
- for outbound calls from your service to third-party APIs
- in distributed form across multiple instances when required
Use built-in ASP.NET Core middleware for most APIs, then add gateway/distributed controls where scale or multi-node fairness demands it.
Description
1. Apply a global limiter for the entire API
A global limiter protects baseline capacity regardless of endpoint behavior.
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // A single shared partition ("global") means every request draws from the same budget.
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: "global",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 500,
                Window = TimeSpan.FromMinutes(1),
                QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
                QueueLimit = 100,
                AutoReplenishment = true
            }));

    // The default rejection status is 503; 429 tells clients they were throttled.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
});

var app = builder.Build();

app.UseRateLimiter();
This is the broadest and simplest guardrail.
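A rejected caller is easier to work with when it knows how long to back off. The following is a minimal sketch, assuming the same single-partition fixed window as above, of using the middleware's OnRejected callback to send a Retry-After header when the limiter reports one. The response text and the "/" endpoint are placeholders chosen for illustration.

```csharp
using System.Globalization;
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetFixedWindowLimiter("global", _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = 500,
            Window = TimeSpan.FromMinutes(1)
        }));

    // Invoked for each request the limiter rejects.
    options.OnRejected = async (context, cancellationToken) =>
    {
        // Not every limiter supplies this metadata, so the header is optional.
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out TimeSpan retryAfter))
        {
            context.HttpContext.Response.Headers.RetryAfter =
                ((int)retryAfter.TotalSeconds).ToString(NumberFormatInfo.InvariantInfo);
        }

        // Placeholder body text; adapt to your API's error format.
        await context.HttpContext.Response.WriteAsync(
            "Too many requests. Please retry later.", cancellationToken);
    };
});

var app = builder.Build();

app.UseRateLimiter();
app.MapGet("/", () => "ok"); // placeholder endpoint

app.Run();
```

Clients that honor Retry-After tend to spread their retries out instead of hammering the service again immediately, which is one of the spike sources described earlier.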
2. Apply named policies per endpoint
Different endpoints often need different budgets. Login and search endpoints usually require stricter limits than health checks.
using Microsoft.AspNetCore.RateLimiting; // AddFixedWindowLimiter, AddTokenBucketLimiter
using System.Threading.RateLimiting;

builder.Services.AddRateLimiter(options =>
{
    // Tight budget for sensitive endpoints such as login.
    options.AddFixedWindowLimiter("strict", limiterOptions =>
    {
        limiterOptions.PermitLimit = 10;
        limiterOptions.Window = TimeSpan.FromMinutes(1);
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        limiterOptions.QueueLimit = 2;
        limiterOptions.AutoReplenishment = true;
    });

    // Allows bursts of up to 100 requests, refilled at 20 tokens every 10 seconds.
    options.AddTokenBucketLimiter("burst", limiterOptions =>
    {
        limiterOptions.TokenLimit = 100;
        limiterOptions.TokensPerPeriod = 20;
        limiterOptions.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        limiterOptions.QueueLimit = 20;
        limiterOptions.AutoReplenishment = true;
    });
});

app.MapPost("/auth/login", () => Results.Ok())
    .RequireRateLimiting("strict");

app.MapGet("/catalog/search", () => Results.Ok())
    .RequireRateLimiting("burst");
3. Apply per-user, per-IP, or per-tenant partitioning
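Partitioned limits improve fairness by giving each caller its own quota, so one noisy client cannot exhaust the budget of others. To partition on the authenticated user, call app.UseRateLimiter() after the authentication middleware; otherwise context.User is not yet populated and every request falls back to the IP address.
builder.Services.AddRateLimiter(options =>
{
    options.AddPolicy("per-user", context =>
    {
        // Prefer the authenticated user name; fall back to the caller's IP address.
        string? userName = context.User.Identity?.IsAuthenticated == true
            ? context.User.Identity.Name
            : null;
        string ip = context.Connection.RemoteIpAddress?.ToString() ?? "anonymous";

        // Prefixes keep user names and IP addresses from colliding in one key space.
        string partitionKey = userName is not null ? $"user:{userName}" : $"ip:{ip}";

        return RateLimitPartition.GetSlidingWindowLimiter(
            partitionKey: partitionKey,
            factory: _ => new SlidingWindowRateLimiterOptions
            {
                PermitLimit = 60,
                Window = TimeSpan.FromMinutes(1),
                SegmentsPerWindow = 6,
                QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
                QueueLimit = 5,
                AutoReplenishment = true
            });
    });
});

// Every endpoint mapped on the group inherits the policy.
app.MapGroup("/v1")
    .RequireRateLimiting("per-user")
    .MapGet("/orders", () => Results.Ok());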
You can partition by:
- authenticated user id
- source IP
- API key/client id
- tenant id
- route group or operation key
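As one illustration of the options in this list, the sketch below partitions by API key and gives callers without a key a much smaller per-IP budget. The header name "X-Api-Key", the policy name "per-api-key", the "/v1/items" route, and the specific limits are all assumptions made for the example, not values from any standard.

```csharp
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.AddPolicy<string>("per-api-key", context =>
    {
        // "X-Api-Key" is a hypothetical header name; use whatever your clients send.
        string apiKey = context.Request.Headers["X-Api-Key"].ToString();

        if (string.IsNullOrWhiteSpace(apiKey))
        {
            // Callers without a key get a small budget per source IP.
            string ip = context.Connection.RemoteIpAddress?.ToString() ?? "unknown";
            return RateLimitPartition.GetFixedWindowLimiter($"ip:{ip}", _ =>
                new FixedWindowRateLimiterOptions
                {
                    PermitLimit = 10,
                    Window = TimeSpan.FromMinutes(1)
                });
        }

        // Each key gets its own, larger budget.
        return RateLimitPartition.GetFixedWindowLimiter($"key:{apiKey}", _ =>
            new FixedWindowRateLimiterOptions
            {
                PermitLimit = 120,
                Window = TimeSpan.FromMinutes(1)
            });
    });
});

var app = builder.Build();

app.UseRateLimiter();
app.MapGet("/v1/items", () => Results.Ok()) // placeholder endpoint
    .RequireRateLimiting("per-api-key");

app.Run();
```

Because the key comes straight from client input, a caller could send many made-up keys and create a new partition for each one. In practice it is safer to validate the key first and derive the partition from the validated identity.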
4. Choose the right algorithm
ASP.NET Core supports multiple algorithms, each with trade-offs:
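- fixed window: simple and efficient, but a client can spend a full quota at the end of one window and another at the start of the next, so up to twice the limit can arrive in a short span
- sliding window: divides the window into segments and carries usage across them, which smooths out those boundary bursts at the cost of a little more bookkeeping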
- token bucket: ideal when short bursts are acceptable within a refill budget
- concurrency limiter: caps in-flight requests instead of request rate over time
Concurrency is useful for expensive endpoints:
// Requires: using Microsoft.AspNetCore.RateLimiting; (AddConcurrencyLimiter)
builder.Services.AddRateLimiter(options =>
{
    options.AddConcurrencyLimiter("expensive-op", limiterOptions =>
    {
        // At most 20 requests run at once; up to 50 more wait in the queue.
        limiterOptions.PermitLimit = 20;
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        limiterOptions.QueueLimit = 50;
    });
});

app.MapPost("/reports/generate", () => Results.Ok())
    .RequireRateLimiting("expensive-op");
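To make the difference between time-based and concurrency-based limiting concrete, here is a small console sketch that uses the limiter types from System.Threading.RateLimiting directly, outside ASP.NET Core. The limits are deliberately tiny, and AutoReplenishment is turned off so that no background timer refills tokens while the demo runs; both choices are assumptions made to keep the behavior deterministic.

```csharp
using System.Threading.RateLimiting;

// Token bucket: a burst may spend the whole bucket at once, then must wait for a refill.
using var bucket = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions
{
    TokenLimit = 3,
    TokensPerPeriod = 1,
    ReplenishmentPeriod = TimeSpan.FromSeconds(10),
    QueueLimit = 0,            // reject instead of queueing
    AutoReplenishment = false  // no timer-driven refill during the demo
});

int granted = 0;
for (int i = 0; i < 5; i++)
{
    // Disposing a token bucket lease does not return the token.
    using RateLimitLease lease = bucket.AttemptAcquire(1);
    if (lease.IsAcquired)
    {
        granted++;
    }
}
Console.WriteLine($"token bucket granted {granted} of 5"); // token bucket granted 3 of 5

// Concurrency limiter: a permit returns when its lease is disposed, not when time passes.
using var gate = new ConcurrencyLimiter(new ConcurrencyLimiterOptions
{
    PermitLimit = 1,
    QueueLimit = 0
});

RateLimitLease first = gate.AttemptAcquire(1);
bool firstOk = first.IsAcquired;       // true: the only permit is now in use

RateLimitLease second = gate.AttemptAcquire(1);
bool secondOk = second.IsAcquired;     // false: one operation is already in flight
second.Dispose();

first.Dispose();                       // the in-flight operation finishes

using RateLimitLease third = gate.AttemptAcquire(1);
bool thirdOk = third.IsAcquired;       // true: the permit was released

Console.WriteLine($"{firstOk} {secondOk} {thirdOk}"); // True False True
```

The token bucket never recovers during the demo because nothing replenishes it, while the concurrency limiter recovers as soon as work completes. That is why a concurrency limiter suits slow, expensive operations whose cost depends on how many run at once rather than on how often they start.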
5. Apply limits via attributes in MVC/controllers
For controller-based APIs, you can decorate specific actions:
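using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.RateLimiting; // EnableRateLimiting, DisableRateLimiting

[ApiController]
[Route("api/[controller]")]
public class PaymentsController : ControllerBase
{
    // Uses the "strict" named policy registered in AddRateLimiter.
    [HttpPost]
    [EnableRateLimiting("strict")]
    public IActionResult CreatePayment() => Ok();

    // Opts this action out of rate limiting entirely, including the global limiter.
    [HttpGet("health")]
    [DisableRateLimiting]
    public IActionResult Health() => Ok();
}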
6. Apply rate limiting at proxy or gateway boundaries
Rate limiting can also be enforced before requests hit your app:
- reverse proxy (for example, YARP in .NET)
- API gateway (for example, managed gateway products)
- edge/CDN layer for basic abuse protection
Gateway limits reduce load earlier in the path and can enforce cross-service policies.
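For the YARP case, one possible shape is to register a policy with the regular ASP.NET Core rate limiter and attach it to a proxy route by name. The sketch below assumes the Yarp.ReverseProxy package with in-memory configuration; the route id, cluster id, destination address, policy name "gateway", and limits are hypothetical values chosen for illustration.

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;
using Yarp.ReverseProxy.Configuration;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
    options.AddFixedWindowLimiter("gateway", limiterOptions =>
    {
        limiterOptions.PermitLimit = 100;
        limiterOptions.Window = TimeSpan.FromSeconds(10);
    });
});

// Hypothetical route: everything under /orders is proxied and throttled by the "gateway" policy.
var routes = new[]
{
    new RouteConfig
    {
        RouteId = "orders",
        ClusterId = "orders-cluster",
        Match = new RouteMatch { Path = "/orders/{**catch-all}" },
        RateLimiterPolicy = "gateway"
    }
};

// Hypothetical backend address.
var clusters = new[]
{
    new ClusterConfig
    {
        ClusterId = "orders-cluster",
        Destinations = new Dictionary<string, DestinationConfig>
        {
            ["primary"] = new DestinationConfig { Address = "https://orders.internal.example/" }
        }
    }
};

builder.Services.AddReverseProxy().LoadFromMemory(routes, clusters);

var app = builder.Build();

app.UseRateLimiter();
app.MapReverseProxy();

app.Run();
```

Requests rejected here never reach the backend service, which is the "earlier in the path" benefit described above. The same policy name could be reused on several routes to apply one budget definition across services.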
7. Apply outbound rate limiting for external APIs
Rate limiting is not only inbound. Many systems need to throttle outgoing calls to partner APIs.
using System.Threading.RateLimiting;

var limiter = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions
{
    TokenLimit = 50,
    TokensPerPeriod = 10,
    ReplenishmentPeriod = TimeSpan.FromSeconds(1),
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
    QueueLimit = 100,
    AutoReplenishment = true
});

// Acquire a permit before invoking the external API.
// If no token is available, the request waits in the queue until one is replenished.
using var lease = await limiter.AcquireAsync(1);
if (!lease.IsAcquired)
{
    // The queue was full, so the call is rejected rather than delayed further.
    throw new InvalidOperationException("Outbound rate limit exceeded.");
}
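Acquiring a lease by hand before every call is easy to forget. A common alternative is to put the limiter inside the HttpClient pipeline so every request passes through it. The following is a sketch of that idea, not a framework feature: RateLimitedHandler is a hypothetical class name, the partner URL is a placeholder, and returning a synthetic 429 response on rejection is one design choice among several.

```csharp
using System.Net;
using System.Net.Http.Headers;
using System.Threading.RateLimiting;

// One limiter shared by every call made through this HttpClient.
using var limiter = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions
{
    TokenLimit = 50,
    TokensPerPeriod = 10,
    ReplenishmentPeriod = TimeSpan.FromSeconds(1),
    QueueLimit = 100
});

using var client = new HttpClient(new RateLimitedHandler(limiter, new HttpClientHandler()));

// Hypothetical partner URL, for illustration only.
using HttpResponseMessage response = await client.GetAsync("https://partner.example.com/v1/status");
Console.WriteLine((int)response.StatusCode);

// Hypothetical handler; wraps any RateLimiter around outgoing requests.
public sealed class RateLimitedHandler : DelegatingHandler
{
    private readonly RateLimiter _limiter;

    public RateLimitedHandler(RateLimiter limiter, HttpMessageHandler innerHandler)
        : base(innerHandler)
    {
        _limiter = limiter;
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Waits in the limiter's queue if no permit is free; fails if the queue is full.
        using RateLimitLease lease = await _limiter.AcquireAsync(1, cancellationToken);
        if (lease.IsAcquired)
        {
            return await base.SendAsync(request, cancellationToken);
        }

        // Rejected locally: return a synthetic 429 without calling the partner at all.
        var rejected = new HttpResponseMessage(HttpStatusCode.TooManyRequests)
        {
            RequestMessage = request
        };
        if (lease.TryGetMetadata(MetadataName.RetryAfter, out TimeSpan retryAfter))
        {
            rejected.Headers.RetryAfter = new RetryConditionHeaderValue(retryAfter);
        }
        return rejected;
    }
}
```

Returning a 429 keeps local throttling indistinguishable from a partner-side throttle, so existing retry logic can treat both the same way. Throwing an exception, as in the snippet above, is equally valid if callers should handle the two cases differently.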
8. Apply distributed limits for multi-instance fairness
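Built-in middleware limits are held in memory per instance, so each instance enforces its own copy of every limit and N instances can together admit up to N times the configured quota. In horizontally scaled environments, consider distributed coordination (for example, Redis-backed counters or gateway-native distributed throttling) when strict global quotas are required.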
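As a rough illustration of a Redis-backed counter, the sketch below implements a simple fixed-window limit shared by all instances. It assumes the StackExchange.Redis client package and a reachable Redis server; the class name RedisFixedWindowLimiter, the "ratelimit:" key prefix, the connection string, and the "tenant-42" partition are hypothetical. It is not part of ASP.NET Core and leaves out concerns a production limiter would need to handle.

```csharp
using StackExchange.Redis;

// Hypothetical connection string; point this at your own Redis endpoint.
var redis = await ConnectionMultiplexer.ConnectAsync("localhost:6379");
var limiter = new RedisFixedWindowLimiter(
    redis.GetDatabase(), permitLimit: 100, window: TimeSpan.FromMinutes(1));

bool allowed = await limiter.TryAcquireAsync("tenant-42");
Console.WriteLine(allowed ? "allowed" : "rejected");

// Hypothetical helper: one shared counter per partition and time window.
public sealed class RedisFixedWindowLimiter
{
    private readonly IDatabase _db;
    private readonly long _permitLimit;
    private readonly TimeSpan _window;

    public RedisFixedWindowLimiter(IDatabase db, long permitLimit, TimeSpan window)
    {
        if (window < TimeSpan.FromSeconds(1))
        {
            throw new ArgumentOutOfRangeException(nameof(window), "Window must be at least one second.");
        }

        _db = db;
        _permitLimit = permitLimit;
        _window = window;
    }

    public async Task<bool> TryAcquireAsync(string partitionKey)
    {
        // Every instance derives the same window number from the clock,
        // so all of them increment the same Redis key.
        long windowNumber = DateTimeOffset.UtcNow.ToUnixTimeSeconds() / (long)_window.TotalSeconds;
        string key = $"ratelimit:{partitionKey}:{windowNumber}";

        long count = await _db.StringIncrementAsync(key);
        if (count == 1)
        {
            // First hit in this window: let Redis remove the counter once the window has passed.
            await _db.KeyExpireAsync(key, _window);
        }

        return count <= _permitLimit;
    }
}
```

Two caveats apply to this sketch. The increment and the expiry are separate commands, so a crash between them can leave a key without an expiry; a Lua script or transaction would close that gap. It also relies on instance clocks being roughly in sync and inherits the boundary-burst behavior of any fixed window.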
NuGet Packages and Versions
As of April 17, 2026, current stable packages relevant to rate limiting in .NET:
- System.Threading.RateLimiting version 10.0.6
- Microsoft.AspNetCore.App.Ref version 10.0.6 (reference pack containing ASP.NET Core rate limiting APIs in modern projects)
- AspNetCoreRateLimit version 5.0.0 (popular legacy/community package)
- Yarp.ReverseProxy version 2.3.0 (when implementing proxy-layer controls in .NET)
dotnet add package System.Threading.RateLimiting --version 10.0.6
dotnet add package AspNetCoreRateLimit --version 5.0.0
dotnet add package Yarp.ReverseProxy --version 2.3.0
Note: for standard ASP.NET Core applications targeting modern .NET, built-in rate limiting middleware is available through the shared framework and typically does not require an extra package install.
Summary
Rate limiting in .NET is most effective when applied at multiple layers:
- globally for baseline protection
- per endpoint and partition key for fairness
- with the right algorithm for each workload
- at gateway/proxy boundaries for early protection
- outbound to protect dependencies
- distributed when multi-instance consistency is required
Start with built-in ASP.NET Core middleware, then evolve toward gateway and distributed patterns as traffic and architecture complexity grow.
References
- Rate limiting middleware in ASP.NET Core: https://learn.microsoft.com/aspnet/core/performance/rate-limit
- System.Threading.RateLimiting namespace: https://learn.microsoft.com/dotnet/api/system.threading.ratelimiting
- PartitionedRateLimiter API: https://learn.microsoft.com/dotnet/api/system.threading.ratelimiting.partitionedratelimiter-2
- YARP Reverse Proxy documentation: https://learn.microsoft.com/aspnet/core/fundamentals/servers/yarp/getting-started
- AspNetCoreRateLimit package: https://www.nuget.org/packages/AspNetCoreRateLimit