As APIs become increasingly exposed to public consumers, mobile applications, and third-party integrations, protecting them from abuse and excessive traffic becomes critical. One of the most effective ways to achieve this is through Rate Limiting.
Rate limiting controls how many requests a client can make within a specified period. It helps:
- Mitigate denial-of-service (DoS) attacks
- Protect backend resources
- Improve API reliability
- Ensure fair usage among consumers
- Reduce infrastructure costs
Starting with .NET 7, ASP.NET Core provides built-in support for rate limiting through the middleware in the Microsoft.AspNetCore.RateLimiting namespace. It supports four strategies:
- Fixed Window
- Sliding Window
- Token Bucket
- Concurrency Limiter
Before diving into implementation, it helps to understand why different rate-limiting strategies exist and how each one behaves.
ASP.NET Core Rate Limiting Strategies
Rate limiting is not a one-size-fits-all solution. Different applications have different traffic patterns, and each strategy is designed to solve a specific problem, so it is worth knowing how each one works and when it should be used.
Why Are There Multiple Strategies?
Consider the following scenarios:
- A public API should allow occasional traffic spikes but prevent abuse.
- A reporting endpoint should limit the number of concurrent requests because generating reports is resource-intensive.
- An internal API may only need a simple request limit per minute.
Using the same rate-limiting algorithm for all these scenarios may result in poor user experience or inefficient resource utilization. This is why ASP.NET Core provides multiple rate-limiting strategies.
Let's explore each strategy conceptually before implementing them.
Fixed Window Strategy
The Fixed Window strategy divides time into fixed intervals, known as windows. For example, if the limit is 100 requests per minute, a client can make up to 100 requests during a one-minute window. Once the limit is reached, additional requests are rejected until the next window begins.
00:00 ───────────────── 00:59 → 100 Requests Allowed
01:00 ───────────────── 01:59 → Counter Reset
The main advantage of this strategy is its simplicity. However, it can allow traffic spikes around window boundaries. A client could send 100 requests at the end of one window and another 100 immediately after the next window starts.
Advantages
- Easy to understand
- Simple implementation
- Low memory usage
Disadvantages
- Traffic spikes can occur at window boundaries
- Users may effectively double their request rate around reset times
Best suited for
- Internal APIs
- Administrative APIs
- Low-traffic applications
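The boundary spike is easy to reproduce with a small counter. The following is a minimal sketch, not the ASP.NET Core implementation: `FixedWindowCounter` is a hypothetical type written for illustration, and it takes the timestamp as a parameter so the result does not depend on the real clock. It also aligns windows to the clock, which the built-in limiter does not necessarily do.

```csharp
using System;

// Two bursts that straddle a window boundary: 100 requests at 00:59 and 100 at 01:00.
var limiter = new FixedWindowCounter(permitLimit: 100, window: TimeSpan.FromMinutes(1));
var start = new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc);

int accepted = 0;
for (int i = 0; i < 100; i++)
    if (limiter.TryAcquire(start.AddSeconds(59))) accepted++;
for (int i = 0; i < 100; i++)
    if (limiter.TryAcquire(start.AddSeconds(60))) accepted++;

// All 200 requests pass even though they arrived within about one second.
Console.WriteLine($"Accepted: {accepted}"); // Accepted: 200

// Hypothetical illustration type, not part of ASP.NET Core.
public sealed class FixedWindowCounter
{
    private readonly int _permitLimit;
    private readonly long _windowTicks;
    private long _currentWindow = -1;
    private int _count;

    public FixedWindowCounter(int permitLimit, TimeSpan window)
    {
        _permitLimit = permitLimit;
        _windowTicks = window.Ticks;
    }

    // The caller supplies the timestamp so the behavior is deterministic.
    public bool TryAcquire(DateTime nowUtc)
    {
        long windowIndex = nowUtc.Ticks / _windowTicks;
        if (windowIndex != _currentWindow)
        {
            // A new window has started: reset the counter.
            _currentWindow = windowIndex;
            _count = 0;
        }

        if (_count >= _permitLimit) return false;
        _count++;
        return true;
    }
}
```

Because the counter only knows about the current window, it has no memory of the burst that ended one second earlier.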
Sliding Window Strategy
The Sliding Window strategy addresses the traffic spike issue found in Fixed Window. Instead of resetting the counter at fixed intervals, it continuously evaluates requests over a rolling time period.
Current Time
│
▼
[ Last 60 Seconds ]
As older requests leave the window, new requests can enter. This creates smoother traffic distribution and prevents sudden bursts at window boundaries.
Sliding Window is commonly used in public-facing APIs where fairness and consistent traffic control are important.
Advantages
- Smoother traffic distribution
- Reduces burst behavior
- More accurate request control
Disadvantages
- Slightly more complex
- Higher memory overhead
Best suited for
- Public APIs
- SaaS platforms
- High-volume services
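To see how a rolling window treats a burst at a window boundary, here is a minimal sketch that splits a one-minute window into six 10-second segments and sums the counts of the segments still inside the window. `SlidingWindowCounter` is a hypothetical illustration type, timestamps are passed in explicitly, and they are assumed to be non-decreasing.

```csharp
using System;
using System.Linq;

var limiter = new SlidingWindowCounter(permitLimit: 100, window: TimeSpan.FromMinutes(1), segments: 6);
var start = new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc);

int accepted = 0;
for (int i = 0; i < 100; i++)
    if (limiter.TryAcquire(start.AddSeconds(59))) accepted++;
for (int i = 0; i < 100; i++)
    if (limiter.TryAcquire(start.AddSeconds(60))) accepted++;

// The second burst is rejected because the first one is still inside the rolling window.
Console.WriteLine($"Accepted: {accepted}"); // Accepted: 100

// Hypothetical illustration type, not part of ASP.NET Core.
public sealed class SlidingWindowCounter
{
    private readonly int _permitLimit;
    private readonly long _segmentTicks;
    private readonly int[] _counts;      // ring buffer: one counter per segment
    private long _latestSegment = -1;

    public SlidingWindowCounter(int permitLimit, TimeSpan window, int segments)
    {
        _permitLimit = permitLimit;
        _segmentTicks = window.Ticks / segments;
        _counts = new int[segments];
    }

    public bool TryAcquire(DateTime nowUtc)
    {
        long segment = nowUtc.Ticks / _segmentTicks;
        ExpireOldSegments(segment);

        // Requests in all segments still inside the window count against the limit.
        if (_counts.Sum() >= _permitLimit) return false;
        _counts[(int)(segment % _counts.Length)]++;
        return true;
    }

    private void ExpireOldSegments(long segment)
    {
        if (_latestSegment >= 0)
        {
            // Clear the slot of every segment that started since the previous call.
            long steps = Math.Min(segment - _latestSegment, _counts.Length);
            for (long s = 1; s <= steps; s++)
                _counts[(int)((_latestSegment + s) % _counts.Length)] = 0;
        }
        _latestSegment = Math.Max(_latestSegment, segment);
    }
}
```

The extra memory cost mentioned above is visible here: the limiter stores one counter per segment instead of a single counter.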
Token Bucket Strategy
The Token Bucket strategy uses a bucket that contains a limited number of tokens. Each incoming request consumes one token. Tokens are replenished at a fixed rate over time.
Bucket Capacity: 100 Tokens
Request → Consume Token → Success
No Token Available → Request Rejected
Unlike Fixed Window and Sliding Window, Token Bucket allows temporary bursts of traffic as long as enough tokens are available.
For example, if a client has not made requests for some time, tokens accumulate in the bucket, up to its capacity. The client can then use those tokens to handle a sudden spike in traffic.
This strategy is widely used for public APIs because it balances flexibility and protection.
Advantages
- Supports traffic bursts
- Excellent for real-world workloads
- Flexible traffic shaping
Disadvantages
- More configuration parameters
- Slightly harder to tune
Best suited for
- Public REST APIs
- Microservices
- Mobile application backends
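The burst-then-refill behavior can be sketched in a few lines. This is an illustrative model only: `TokenBucket` is a hypothetical type, the capacity of 100 and the refill of 20 tokens every 10 seconds are example values, and timestamps are passed in so the outcome is deterministic.

```csharp
using System;

var start = new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc);
var bucket = new TokenBucket(capacity: 100, tokensPerPeriod: 20,
    period: TimeSpan.FromSeconds(10), startUtc: start);

// An idle client has a full bucket, so a burst of 150 requests gets 100 through.
int burst = 0;
for (int i = 0; i < 150; i++)
    if (bucket.TryAcquire(start)) burst++;

// Ten seconds later one refill has happened, so only 20 of the next 30 pass.
int afterRefill = 0;
for (int i = 0; i < 30; i++)
    if (bucket.TryAcquire(start.AddSeconds(10))) afterRefill++;

Console.WriteLine($"Burst accepted: {burst}, after one refill: {afterRefill}");
// Burst accepted: 100, after one refill: 20

// Hypothetical illustration type, not part of ASP.NET Core.
public sealed class TokenBucket
{
    private readonly int _capacity;
    private readonly int _tokensPerPeriod;
    private readonly TimeSpan _period;
    private long _tokens;
    private DateTime _lastRefillUtc;

    public TokenBucket(int capacity, int tokensPerPeriod, TimeSpan period, DateTime startUtc)
    {
        _capacity = capacity;
        _tokensPerPeriod = tokensPerPeriod;
        _period = period;
        _tokens = capacity;          // the bucket starts full
        _lastRefillUtc = startUtc;
    }

    public bool TryAcquire(DateTime nowUtc)
    {
        Refill(nowUtc);
        if (_tokens == 0) return false;
        _tokens--;
        return true;
    }

    private void Refill(DateTime nowUtc)
    {
        // Add tokens for every whole period that has elapsed, capped at capacity.
        long periods = (nowUtc - _lastRefillUtc).Ticks / _period.Ticks;
        if (periods <= 0) return;
        _tokens = Math.Min(_capacity, _tokens + periods * _tokensPerPeriod);
        _lastRefillUtc = _lastRefillUtc.AddTicks(periods * _period.Ticks);
    }
}
```

The two tuning knobs are independent: capacity bounds the size of a burst, while tokens per period bounds the sustained rate.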
Concurrency Limiter Strategy
The Concurrency Limiter works differently from the previous strategies. Instead of limiting requests over time, it limits how many requests can be processed simultaneously.
Maximum Concurrent Requests = 10
Request 1 → Processing
Request 2 → Processing
...
Request 10 → Processing
Request 11 → Rejected or Queued
This strategy is particularly useful for resource-intensive operations such as:
- File uploads
- Report generation
- Database-heavy queries
- Long-running background tasks
Even if the total request count is low, too many concurrent operations can overwhelm the server. The Concurrency Limiter helps prevent this problem.
Advantages
- Prevents server overload
- Protects expensive operations
- Controls resource consumption
Disadvantages
- Does not limit total request volume
- Not suitable as the only protection mechanism
Best suited for
- File uploads
- Report generation
- Database-intensive operations
- Long-running requests
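The permit model can be observed without a web server by using the `ConcurrencyLimiter` class from System.Threading.RateLimiting directly. The sketch below assumes a limit of 2 and no queue, both chosen only to keep the example short; in a console project outside ASP.NET Core it also assumes the System.Threading.RateLimiting NuGet package is referenced.

```csharp
using System;
using System.Threading.RateLimiting;

using var limiter = new ConcurrencyLimiter(new ConcurrencyLimiterOptions
{
    PermitLimit = 2,   // at most two operations in flight
    QueueLimit = 0,    // reject immediately instead of queueing
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst
});

// Each lease represents one in-flight request.
RateLimitLease first = limiter.AttemptAcquire();
RateLimitLease second = limiter.AttemptAcquire();
RateLimitLease third = limiter.AttemptAcquire();
Console.WriteLine($"{first.IsAcquired} {second.IsAcquired} {third.IsAcquired}"); // True True False

// Disposing a lease models a request finishing, which frees its permit.
first.Dispose();
RateLimitLease fourth = limiter.AttemptAcquire();
Console.WriteLine(fourth.IsAcquired); // True

second.Dispose();
third.Dispose();
fourth.Dispose();
```

No clock is involved anywhere: a permit becomes available only when earlier work completes, which is why this limiter does not cap total request volume.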
Setting Up Rate Limiting in ASP.NET Core
Now that we understand the different rate-limiting strategies and their use cases, it's time to implement them in an ASP.NET Core Web API.
ASP.NET Core provides built-in support for rate limiting through the middleware in the Microsoft.AspNetCore.RateLimiting namespace, which builds on the limiter algorithms in System.Threading.RateLimiting. Both ship with the ASP.NET Core shared framework in .NET 7 and later, so a web project that targets net7.0 or newer does not need an additional NuGet package.
- Import the required namespaces.
- Register and configure the rate-limiting middleware.
Import the Required Namespaces
using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;
Microsoft.AspNetCore.RateLimiting contains the middleware and the extension methods for registering named policies. System.Threading.RateLimiting contains the built-in limiter implementations and their options, including Fixed Window, Sliding Window, Token Bucket, and Concurrency Limiter.
Register the Rate Limiter
With the namespaces in place, the next step is to register the rate-limiting services in the dependency injection container.
In the Program.cs file, call AddRateLimiter() to configure the available rate-limiting policies:
builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
});
The RejectionStatusCode property specifies the HTTP status code returned when a request exceeds the configured rate limit. The default is 503 (Service Unavailable), so it is usually set explicitly to 429 (Too Many Requests), the standard response code for rate-limited requests.
Enable the Middleware
After registering the services, enable the rate-limiting middleware in the request processing pipeline:
app.UseRateLimiter();
The middleware intercepts incoming requests and evaluates them against the configured rate-limiting policies. If a request exceeds the allowed limit, the middleware automatically rejects it and returns the configured response.
Complete Setup
The basic setup should look similar to the following:
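using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

// Required because the app maps controllers below.
builder.Services.AddControllers();
builder.Services.AddRateLimiter(options =>
{
    // Return 429 instead of the default 503 when a request is rejected.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
});

var app = builder.Build();
app.UseRateLimiter();
app.MapControllers();
app.Run();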
With the middleware configured, we can now define individual rate-limiting policies and apply them to specific endpoints or groups of endpoints. In the following sections, we'll explore how to implement each strategy: Fixed Window, Sliding Window, Token Bucket, and Concurrency Limiter.
Implementing Fixed Window Rate Limiting
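The Fixed Window strategy divides time into fixed intervals (windows) and allows a specific number of requests within each window. A named policy registered with AddFixedWindowLimiter keeps a single counter that is shared by every caller of the endpoints it is applied to; limiting each client separately requires a partitioned policy keyed by something such as the user ID or IP address.
Example:
- Window Duration: 1 minute
- Permit Limit: 100 requests
Up to 100 requests are served during the minute. Once the limit is reached, additional requests wait in the queue (if QueueLimit is greater than zero) or are rejected until the next window begins.
Visualization
Minute 1: [100 Requests Allowed]
Minute 2: [Counter Reset, 100 Requests Allowed]
Minute 3: [Counter Reset, 100 Requests Allowed]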
Implementation
builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("fixed", limiterOptions =>
    {
        limiterOptions.PermitLimit = 100;
        limiterOptions.Window = TimeSpan.FromMinutes(1);
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        limiterOptions.QueueLimit = 10;
    });
});
Apply to an endpoint:
app.MapGet("/fixed", () => "Fixed Window API")
    .RequireRateLimiting("fixed");
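If the limit should apply to each client rather than to all callers combined, the policy can be partitioned. The following is a sketch under a few assumptions: the policy name "fixed-per-client" and the route are made up for illustration, the remote IP address is used as the client key (behind a proxy or load balancer, an API key or user ID is usually a better choice), and the project uses the default implicit usings of the web SDK.

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // "fixed-per-client" is a hypothetical policy name.
    options.AddPolicy("fixed-per-client", httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            // Assumption: the remote IP address identifies a client.
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1),
                QueueLimit = 0
            }));
});

var app = builder.Build();

app.UseRateLimiter();

// Each partition key gets its own counter of 100 requests per minute.
app.MapGet("/fixed-per-client", () => "Fixed Window API (per client)")
    .RequireRateLimiting("fixed-per-client");

app.Run();
```

The same partitioning approach works for the other algorithms through `RateLimitPartition.GetSlidingWindowLimiter`, `GetTokenBucketLimiter`, and `GetConcurrencyLimiter`.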
Implementing Sliding Window Rate Limiting
The Sliding Window algorithm improves upon Fixed Window by dividing the time window into smaller segments and calculating limits based on recent activity.
Example:
- Window: 1 minute
- Segments: 6
- Permit Limit: 100
Instead of resetting all requests at once, the limiter continuously evaluates activity from the previous minute.
Visualization
|----10s----|----10s----|----10s----|
  Segment 1   Segment 2   Segment 3   ... (6 segments per window)
As old segments expire, capacity becomes available gradually.
Implementation
builder.Services.AddRateLimiter(options =>
{
    options.AddSlidingWindowLimiter("sliding", limiterOptions =>
    {
        limiterOptions.PermitLimit = 100;
        limiterOptions.Window = TimeSpan.FromMinutes(1);
        limiterOptions.SegmentsPerWindow = 6;
        limiterOptions.QueueLimit = 10;
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    });
});
Apply to an endpoint:
app.MapGet("/sliding", () => "Sliding Window API")
    .RequireRateLimiting("sliding");
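Since the basic setup maps controllers, it may help to see the controller equivalent of RequireRateLimiting. The sketch below uses the `[EnableRateLimiting]` and `[DisableRateLimiting]` attributes; `ReportsController` and its routes are hypothetical, and it assumes a policy named "sliding" has been registered as shown above and that the app calls AddControllers() and MapControllers().

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.RateLimiting;

// Hypothetical controller used only to show where the attributes go.
[ApiController]
[Route("api/[controller]")]
[EnableRateLimiting("sliding")] // applies the "sliding" policy to every action
public class ReportsController : ControllerBase
{
    [HttpGet]
    public IActionResult Get() => Ok("Sliding Window API");

    // Opts this single action out of rate limiting.
    [HttpGet("health")]
    [DisableRateLimiting]
    public IActionResult Health() => Ok("healthy");
}
```

An attribute on an action takes precedence over the one on the controller, so a single expensive action can also be given its own policy name.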
Implementing Token Bucket Rate Limiting
The Token Bucket algorithm maintains a bucket containing tokens. Each request consumes one token. Tokens are replenished at a configured rate.
Example:
Bucket Capacity = 100 tokens
Refill Rate = 20 tokens every 10 seconds
If tokens are available:
Request → Token Consumed → Success
If no tokens are available:
Request → Queued (while the queue has room) or Rejected (429)
This strategy allows occasional bursts while maintaining long-term control.
Implementation
builder.Services.AddRateLimiter(options =>
{
    options.AddTokenBucketLimiter("token", limiterOptions =>
    {
        limiterOptions.TokenLimit = 100;
        limiterOptions.QueueLimit = 20;
        limiterOptions.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
        limiterOptions.TokensPerPeriod = 20;
        limiterOptions.AutoReplenishment = true;
    });
});
Apply to an endpoint:
app.MapGet("/token", () => "Token Bucket API")
    .RequireRateLimiting("token");
Implementing the Concurrency Limiter
Unlike the previous strategies, the Concurrency Limiter does not focus on requests per time period.
Instead, it limits the number of concurrent requests being processed.
Example:
Max Concurrent Requests = 10
If 10 requests are already being processed:
11th Request → Queued or Rejected
This protects resource-intensive operations from overwhelming the server.
Implementation
builder.Services.AddRateLimiter(options =>
{
    options.AddConcurrencyLimiter("concurrency", limiterOptions =>
    {
        limiterOptions.PermitLimit = 10;
        limiterOptions.QueueLimit = 20;
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    });
});
Apply to an endpoint:
app.MapGet("/concurrency", async () =>
{
    // Simulate a slow operation so that requests overlap.
    await Task.Delay(5000);
    return "Concurrency Limited API";
}).RequireRateLimiting("concurrency");
Combining Multiple Rate Limiters
In production environments, a single strategy is often insufficient.
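A common approach is:
- Token Bucket → Controls overall traffic
- Concurrency Limiter → Protects expensive operations
The following registers both policies under separate names so that each can be attached to the endpoints it is meant to protect: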
builder.Services.AddRateLimiter(options =>
{
    options.AddTokenBucketLimiter("api", opt =>
    {
        opt.TokenLimit = 100;
        opt.TokensPerPeriod = 20;
        opt.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
        opt.AutoReplenishment = true;
    });
    options.AddConcurrencyLimiter("heavy-work", opt =>
    {
        opt.PermitLimit = 5;
        opt.QueueLimit = 10;
    });
});
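One way to make both limits apply to the same request is to put the token bucket in `GlobalLimiter`, which the middleware evaluates for every request before any endpoint-specific policy, and keep the concurrency limiter as a named policy. This is a sketch rather than a recommended configuration: the routes are hypothetical, the remote IP address is assumed to identify a client, and the limits are the example values used above.

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Global token bucket: checked for every request, one bucket per client key.
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetTokenBucketLimiter(
            // Assumption: the remote IP address identifies a client.
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new TokenBucketRateLimiterOptions
            {
                TokenLimit = 100,
                TokensPerPeriod = 20,
                ReplenishmentPeriod = TimeSpan.FromSeconds(10),
                AutoReplenishment = true,
                QueueLimit = 0
            }));

    // Named policy: checked only on endpoints that opt in.
    options.AddConcurrencyLimiter("heavy-work", opt =>
    {
        opt.PermitLimit = 5;
        opt.QueueLimit = 10;
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    });
});

var app = builder.Build();

app.UseRateLimiter();

// Limited by the global token bucket only.
app.MapGet("/products", () => "Product list");

// Limited by the global token bucket and by the concurrency policy.
app.MapGet("/reports", async () =>
{
    await Task.Delay(2000); // stands in for expensive work
    return "Report generated";
}).RequireRateLimiting("heavy-work");

app.Run();
```

With this layout, a request to /reports must pass the token bucket first and then obtain one of the five concurrency permits.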
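Handling Rate Limit Responses
By default, a rejected request receives only a status code with an empty body. The OnRejected callback lets you customize the response, for example to return a short message:
builder.Services.AddRateLimiter(options =>
{
    options.OnRejected = async (context, token) =>
    {
        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        await context.HttpContext.Response.WriteAsync(
            "Too many requests. Please try again later.", token);
    };
});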
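Clients can back off more intelligently when the response says how long to wait. The sketch below extends the callback to set a Retry-After header from the lease metadata when the limiter provides it. Time-based limiters such as Fixed Window can supply this value, while the Concurrency Limiter generally cannot, which is why the code checks the result of `TryGetMetadata`. The "fixed" policy and the /fixed route are included only to make the example complete.

```csharp
using System.Globalization;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("fixed", limiterOptions =>
    {
        limiterOptions.PermitLimit = 100;
        limiterOptions.Window = TimeSpan.FromMinutes(1);
        limiterOptions.QueueLimit = 0;
    });

    options.OnRejected = async (context, cancellationToken) =>
    {
        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;

        // Only some limiters report when a retry is likely to succeed.
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out TimeSpan retryAfter))
        {
            context.HttpContext.Response.Headers.RetryAfter =
                ((int)Math.Ceiling(retryAfter.TotalSeconds)).ToString(NumberFormatInfo.InvariantInfo);
        }

        await context.HttpContext.Response.WriteAsync(
            "Too many requests. Please try again later.", cancellationToken);
    };
});

var app = builder.Build();

app.UseRateLimiter();

app.MapGet("/fixed", () => "Fixed Window API")
    .RequireRateLimiting("fixed");

app.Run();
```

Headers have to be set before the body is written, which is why the Retry-After assignment comes before the call to WriteAsync.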
Choosing the Right Strategy
| Strategy | Controls | Burst Handling | Complexity | Recommended For |
|---|---|---|---|---|
| Fixed Window | Requests per time window | Poor | Low | Internal APIs |
| Sliding Window | Requests over rolling period | Good | Medium | Public APIs |
| Token Bucket | Requests with refill mechanism | Excellent | Medium | High-traffic APIs |
| Concurrency Limiter | Simultaneous requests | N/A | Low | Resource-heavy operations |
In many production systems, multiple strategies are combined. For example, a Token Bucket limiter may control overall API traffic while a Concurrency Limiter protects expensive operations from consuming too many server resources.
Summary
Rate limiting is an essential mechanism for building resilient and secure APIs. ASP.NET Core's built-in rate limiting middleware provides multiple strategies to address different traffic patterns and workloads.
- Use Fixed Window for simple and predictable scenarios.
- Use Sliding Window for smoother traffic control.
- Use Token Bucket when burst traffic is expected.
- Use Concurrency Limiter to protect resource-intensive endpoints.
For most production APIs, combining Token Bucket with Concurrency Limiter provides a balanced solution that controls traffic volume while protecting backend resources from overload.
Thanks