Implementing Rate Limiting in ASP.NET Core Web APIs

As APIs become increasingly exposed to public consumers, mobile applications, and third-party integrations, protecting them from abuse and excessive traffic becomes critical. One of the most effective ways to achieve this is rate limiting.

Rate limiting controls how many requests a client can make within a specified period. It helps:

Mitigate denial-of-service (DoS) attacks
Protect backend resources
Improve API reliability
Ensure fair usage among consumers
Reduce infrastructure costs

Starting with .NET 7, ASP.NET Core provides built-in support for rate limiting through the middleware in the Microsoft.AspNetCore.RateLimiting namespace.

This article explores four commonly used rate-limiting strategies:

Fixed Window
Sliding Window
Token Bucket
Concurrency Limiter

Before diving into implementation, it helps to understand why different rate-limiting strategies exist and how each one behaves.

ASP.NET Core Rate Limiting Strategies

Rate limiting is not a one-size-fits-all solution. Different applications have different traffic patterns, and each rate-limiting strategy is designed to solve specific challenges.

Before implementing rate limiting in ASP.NET Core, it's important to understand how these strategies work and when they should be used.

Why Are There Multiple Strategies?

Consider the following scenarios:

A public API should allow occasional traffic spikes but prevent abuse.
A reporting endpoint should limit the number of concurrent requests because generating reports is resource-intensive.
An internal API may only need a simple request limit per minute.

Using the same rate-limiting algorithm for all these scenarios may result in poor user experience or inefficient resource utilization. This is why ASP.NET Core provides multiple rate-limiting strategies.

Let's explore each strategy conceptually before implementing them.

Fixed Window Strategy

The Fixed Window strategy divides time into fixed intervals, known as windows.

For example, if the limit is 100 requests per minute, a client can make up to 100 requests during a one-minute window. Once the limit is reached, additional requests are rejected until the next window begins.


00:00 ───────────────── 00:59 → 100 Requests Allowed
01:00 ───────────────── 01:59 → Counter Reset, 100 Requests Allowed Again

The main advantage of this strategy is its simplicity. However, it can allow traffic spikes around window boundaries. A client could send 100 requests at the end of one window and another 100 immediately after the next window starts.
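The counting behavior can be tried outside of ASP.NET Core with the FixedWindowRateLimiter class from System.Threading.RateLimiting. The following is a minimal console sketch, assuming a deliberately small limit of 3 permits per minute (not the 100 used in the text) so the result is easy to read:

```csharp
using System;
using System.Threading.RateLimiting;

// Illustrative values: 3 permits per one-minute window.
using var limiter = new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions
{
    PermitLimit = 3,
    Window = TimeSpan.FromMinutes(1),
    QueueLimit = 0,           // reject immediately instead of queueing
    AutoReplenishment = true  // the limiter resets its counter when the window ends
});

var results = new bool[4];
for (var i = 0; i < results.Length; i++)
{
    // AttemptAcquire never waits; the lease reports whether a permit was granted.
    using RateLimitLease lease = limiter.AttemptAcquire(permitCount: 1);
    results[i] = lease.IsAcquired;
    Console.WriteLine($"Request {i + 1}: {(lease.IsAcquired ? "allowed" : "rejected")}");
}
```

As long as the loop finishes inside one window, the first three requests are allowed and the fourth is rejected.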

Advantages

Easy to understand
Simple implementation
Low memory usage

Disadvantages

Traffic spikes can occur at window boundaries
Users may effectively double their request rate around reset times

Best Use Cases

Internal APIs
Administrative APIs
Low-traffic applications

Sliding Window Strategy

The Sliding Window strategy addresses the traffic spike issue found in Fixed Window.

Instead of resetting the counter at fixed intervals, it continuously evaluates requests over a rolling time period.


   Current Time
         │
         ▼
[ Last 60 Seconds ]

As older requests leave the window, new requests can enter. This creates smoother traffic distribution and prevents sudden bursts at window boundaries.

Sliding Window is commonly used in public-facing APIs where fairness and consistent traffic control are important.
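To make the rolling evaluation concrete, here is a small hand-written sketch. RollingWindowSketch is a hypothetical helper, not an ASP.NET Core type, and it keeps a timestamp per accepted request, whereas the built-in SlidingWindowRateLimiter approximates the same idea with segments. Time is passed in explicitly so the run is deterministic:

```csharp
using System;
using System.Collections.Generic;

var limiter = new RollingWindowSketch(limit: 2, window: TimeSpan.FromSeconds(60));

// Times are offsets from an arbitrary starting point.
bool a = limiter.TryAcquire(TimeSpan.FromSeconds(50));   // allowed (1 of 2)
bool b = limiter.TryAcquire(TimeSpan.FromSeconds(55));   // allowed (2 of 2)
bool c = limiter.TryAcquire(TimeSpan.FromSeconds(65));   // rejected: both requests are still within the last 60 s
bool d = limiter.TryAcquire(TimeSpan.FromSeconds(110));  // allowed: the request at 50 s has aged out
Console.WriteLine($"{a} {b} {c} {d}");                   // True True False True

// Hypothetical illustration of a rolling window; not the library implementation.
public sealed class RollingWindowSketch
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private readonly Queue<TimeSpan> _accepted = new();

    public RollingWindowSketch(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    public bool TryAcquire(TimeSpan now)
    {
        // Forget requests that have left the rolling window.
        while (_accepted.Count > 0 && now - _accepted.Peek() >= _window)
        {
            _accepted.Dequeue();
        }

        if (_accepted.Count >= _limit)
        {
            return false;
        }

        _accepted.Enqueue(now);
        return true;
    }
}
```

A fixed window that reset at the 60-second mark would have accepted the request at 65 seconds, which is exactly the boundary burst this strategy avoids.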

Advantages

Smoother traffic distribution
Reduces burst behavior
More accurate request control

Disadvantages

Slightly more complex
Higher memory overhead

Best Use Cases

Public APIs
SaaS platforms
High-volume services

Token Bucket Strategy

The Token Bucket strategy uses a bucket that contains a limited number of tokens. Each incoming request consumes one token. Tokens are replenished at a fixed rate over time.


Bucket Capacity: 100 Tokens

Request → Consume Token → Success
No Token Available → Request Rejected

Unlike Fixed Window and Sliding Window, Token Bucket allows temporary bursts of traffic as long as enough tokens are available.

For example, if a client has not made requests for some time, tokens accumulate in the bucket. The client can then use those tokens to handle a sudden spike in traffic.

This strategy is widely used for public APIs because it balances flexibility and protection.
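The burst behavior can be observed directly with the TokenBucketRateLimiter class from System.Threading.RateLimiting. This is a minimal console sketch under stated assumptions: the bucket holds only 5 tokens, and replenishment is effectively switched off (a one-hour period with AutoReplenishment disabled) so that nothing refills while the loop runs:

```csharp
using System;
using System.Threading.RateLimiting;

using var bucket = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions
{
    TokenLimit = 5,                               // bucket capacity; the bucket starts full
    TokensPerPeriod = 1,                          // tokens added per period
    ReplenishmentPeriod = TimeSpan.FromHours(1),  // deliberately long for the demo
    QueueLimit = 0,                               // reject instead of queueing
    AutoReplenishment = false                     // no background refill timer
});

var allowed = 0;
var rejected = 0;
for (var i = 0; i < 6; i++)
{
    // Each successful lease consumes one token.
    using RateLimitLease lease = bucket.AttemptAcquire();
    if (lease.IsAcquired) allowed++; else rejected++;
}

Console.WriteLine($"allowed={allowed}, rejected={rejected}"); // allowed=5, rejected=1
```

The full bucket absorbs a burst of five back-to-back requests, and the sixth is rejected because no token is left.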

Advantages

Supports traffic bursts
Excellent for real-world workloads
Flexible traffic shaping

Disadvantages

More configuration parameters
Slightly harder to tune

Best Use Cases

Public REST APIs
Microservices
Mobile application backends

Concurrency Limiter Strategy

The Concurrency Limiter works differently from the previous strategies. Instead of limiting requests over time, it limits how many requests can be processed simultaneously.


Maximum Concurrent Requests = 10

Request 1 → Processing 
Request 2 → Processing 
... 
Request 10 → Processing 
Request 11 → Rejected or Queued

This strategy is particularly useful for resource-intensive operations such as:

File uploads
Report generation
Database-heavy queries
Long-running background tasks

Even if the total request count is low, too many concurrent operations can overwhelm the server. The Concurrency Limiter helps prevent this problem.
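The difference from the time-based strategies is that a permit is returned when work finishes, not when a clock ticks. The following console sketch uses the ConcurrencyLimiter class from System.Threading.RateLimiting with an assumed limit of 2 so the effect is visible with a handful of leases:

```csharp
using System;
using System.Threading.RateLimiting;

using var limiter = new ConcurrencyLimiter(new ConcurrencyLimiterOptions
{
    PermitLimit = 2,  // at most two operations in flight
    QueueLimit = 0    // reject instead of queueing
});

// Two operations start and hold their permits.
RateLimitLease first = limiter.AttemptAcquire();
RateLimitLease second = limiter.AttemptAcquire();

// A third operation cannot start while both permits are held.
RateLimitLease third = limiter.AttemptAcquire();
Console.WriteLine($"Third while two are running: {third.IsAcquired}");   // False

// Disposing a lease signals that the operation finished and frees its permit.
first.Dispose();
RateLimitLease fourth = limiter.AttemptAcquire();
Console.WriteLine($"Fourth after one finished: {fourth.IsAcquired}");    // True

second.Dispose();
third.Dispose();
fourth.Dispose();
```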

Advantages

Prevents server overload
Protects expensive operations
Controls resource consumption

Disadvantages

Does not limit total request volume
Not suitable as the only protection mechanism

Best Use Cases

File uploads
Report generation
Database-intensive operations
Long-running requests

Setting Up Rate Limiting in ASP.NET Core

Now that we understand the different rate-limiting strategies and their use cases, it's time to implement them in an ASP.NET Core Web API.

On .NET 7 and later, rate limiting ships as part of the ASP.NET Core shared framework. The middleware lives in the Microsoft.AspNetCore.RateLimiting namespace, and the underlying algorithms come from System.Threading.RateLimiting.

Before implementing any specific strategy, we need to complete two setup steps:

Confirm the project targets .NET 7 or later.
Register and configure the rate-limiting middleware.

Install the Required Package

A web project that targets net7.0 or later does not need an additional NuGet package, because the middleware is already included in the shared framework. You can check which SDKs are installed with the .NET CLI:


dotnet --list-sdks

The framework provides the middleware and the built-in rate-limiting implementations, including Fixed Window, Sliding Window, Token Bucket, and Concurrency Limiter.

Register the Rate Limiter

Once the package is installed, the next step is to register the rate-limiting services in the dependency injection container.

In the Program.cs file, call AddRateLimiter() to configure the available rate-limiting policies:


builder.Services.AddRateLimiter(options => 
{ 
	options.RejectionStatusCode = StatusCodes.Status429TooManyRequests; 
});

The RejectionStatusCode property specifies the HTTP status code returned when a request exceeds the configured rate limit. The middleware defaults to 503 (Service Unavailable), so set it explicitly to 429 (Too Many Requests), the standard response code for rate-limited requests.

Enable the Middleware

After registering the services, enable the rate-limiting middleware in the request processing pipeline:


app.UseRateLimiter();

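The middleware intercepts incoming requests and evaluates them against the configured rate-limiting policies. If a request exceeds the allowed limit, the middleware rejects it and returns the configured response. If the application calls UseRouting() explicitly, place UseRateLimiter() after it so that endpoint-specific policies can be resolved.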

Complete Setup

The basic setup should look similar to the following:


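using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

// MapControllers() below requires the controller services to be registered.
builder.Services.AddControllers();
builder.Services.AddRateLimiter(options =>
{
  // Return 429 instead of the default 503 when a request is rejected.
  options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
});

var app = builder.Build();

app.UseRateLimiter();
app.MapControllers();
app.Run();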

With the middleware configured, we can now define individual rate-limiting policies and apply them to specific endpoints or groups of endpoints. In the following sections, we'll explore how to implement each strategy: Fixed Window, Sliding Window, Token Bucket, and Concurrency Limiter.
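Because the setup above maps controllers, a policy can also be attached with attributes instead of RequireRateLimiting. The sketch below assumes a policy named "fixed" has been registered (as in the next section); OrdersController and its actions are hypothetical names used only for illustration:

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.RateLimiting;

// Hypothetical controller: every action uses the "fixed" policy unless it opts out.
[ApiController]
[Route("api/[controller]")]
[EnableRateLimiting("fixed")]
public class OrdersController : ControllerBase
{
    // Rate limited by the "fixed" policy declared on the controller.
    [HttpGet]
    public IActionResult GetAll() => Ok(new[] { "order-1", "order-2" });

    // Opts out of rate limiting, for example for a lightweight health probe.
    [HttpGet("health")]
    [DisableRateLimiting]
    public IActionResult Health() => Ok("healthy");
}
```

For minimal APIs, the same effect for a group of endpoints can be achieved by calling RequireRateLimiting on the result of app.MapGroup(...).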

Implementing Fixed Window Rate Limiting

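The Fixed Window strategy divides time into fixed intervals (windows) and allows a set number of requests within each window.

Example:

Window Duration: 1 minute
Permit Limit: 100 requests

Up to 100 requests are accepted during the minute. A policy registered with AddFixedWindowLimiter keeps a single counter for every request that hits the policy, so this budget is shared by all callers rather than tracked per client. Once the limit is reached, up to QueueLimit additional requests wait for the next window, and any further requests are rejected.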

Visualization


Minute 1: [100 Requests Allowed]
Minute 2: [Counter Reset → 100 Requests Allowed]
Minute 3: [Counter Reset → 100 Requests Allowed]

Implementation


builder.Services.AddRateLimiter(options =>
{
  options.AddFixedWindowLimiter("fixed", limiterOptions =>
  {
    limiterOptions.PermitLimit = 100;                 // requests allowed per window
    limiterOptions.Window = TimeSpan.FromMinutes(1);  // window length
    // Up to 10 extra requests wait for the next window; the oldest is served first.
    limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    limiterOptions.QueueLimit = 10;
  });
});

Apply to an endpoint:


app.MapGet("/fixed", () => "Fixed Window API")
   .RequireRateLimiting("fixed");
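A policy added with AddFixedWindowLimiter uses one limiter for all callers of the endpoint. If each client should get its own budget, the policy can be partitioned instead. The following is a minimal sketch, assuming the remote IP address is an acceptable client identifier; the policy name "fixed-per-client" and the route are hypothetical:

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Hypothetical policy name; one fixed-window limiter is created per partition key.
    options.AddPolicy("fixed-per-client", httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            // Assumption: the remote IP identifies a client. Behind a proxy, or for
            // authenticated APIs, a user ID or API key is usually a better key.
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1),
                QueueLimit = 0
            }));
});

var app = builder.Build();

app.UseRateLimiter();

app.MapGet("/fixed-per-client", () => "Fixed Window API (per client)")
   .RequireRateLimiting("fixed-per-client");

app.Run();
```

The same partitioning approach works for the other algorithms through RateLimitPartition.GetSlidingWindowLimiter, GetTokenBucketLimiter, and GetConcurrencyLimiter.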

Implementing Sliding Window Rate Limiting

The Sliding Window algorithm improves upon Fixed Window by dividing the time window into smaller segments and calculating limits based on recent activity.

Example:

Window: 1 minute
Segments: 6
Permit Limit: 100

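Instead of resetting all requests at once, the limiter advances the window one segment at a time (every 10 seconds in this example). When a segment falls out of the window, the requests counted in it are returned to the available budget.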

Visualization


|----10s----|----10s----|----10s----| ... (6 segments per window)
  Segment 1   Segment 2   Segment 3

As old segments expire, capacity becomes available gradually.

Implementation


builder.Services.AddRateLimiter(options =>
{
  options.AddSlidingWindowLimiter("sliding", limiterOptions =>
  {
    limiterOptions.PermitLimit = 100;
    limiterOptions.Window = TimeSpan.FromMinutes(1);
    limiterOptions.SegmentsPerWindow = 6;
    limiterOptions.QueueLimit = 10;
    limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
  });
});

Apply to an endpoint:


app.MapGet("/sliding", () => "Sliding Window API")
   .RequireRateLimiting("sliding");

Implementing Token Bucket Rate Limiting

The Token Bucket algorithm maintains a bucket containing tokens. Each request consumes one token. Tokens are replenished at a configured rate.

Example:


Bucket Capacity = 100 tokens
Refill Rate = 20 tokens every 10 seconds (2 tokens/sec on average)

If tokens are available:
Request → Token Consumed → Success

If no tokens are available:
Request → Rejected (429)

This strategy allows occasional bursts while maintaining long-term control.

Implementation


builder.Services.AddRateLimiter(options =>
{
  options.AddTokenBucketLimiter("token", limiterOptions =>
  {
    limiterOptions.TokenLimit = 100;
    limiterOptions.QueueLimit = 20;
    limiterOptions.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
    limiterOptions.TokensPerPeriod = 20;
    limiterOptions.AutoReplenishment = true;
  });
});

Apply to an endpoint:


app.MapGet("/token", () => "Token Bucket API")
   .RequireRateLimiting("token");
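When tuning a token bucket, it helps to work out what the three numbers imply. The short calculation below uses the values from the "token" policy above; the variable names are illustrative only:

```csharp
using System;

// Values taken from the "token" policy.
const int tokenLimit = 100;
const int tokensPerPeriod = 20;
TimeSpan replenishmentPeriod = TimeSpan.FromSeconds(10);

// Long-run average rate once the initial burst has been spent.
double sustainedPerSecond = tokensPerPeriod / replenishmentPeriod.TotalSeconds;

// Whole periods needed to refill a completely empty bucket.
int periodsToRefill = (int)Math.Ceiling((double)tokenLimit / tokensPerPeriod);
TimeSpan timeToRefill = replenishmentPeriod * periodsToRefill;

Console.WriteLine($"Sustained rate: {sustainedPerSecond} requests/second");       // 2
Console.WriteLine($"Empty-to-full refill: {timeToRefill.TotalSeconds} seconds");  // 50
```

In other words, this configuration tolerates a burst of up to 100 requests, sustains about 2 requests per second afterwards, and needs 50 idle seconds to regain its full burst capacity.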

Concurrency Limiter

Unlike the previous strategies, the Concurrency Limiter does not focus on requests per time period.

Instead, it limits the number of concurrent requests being processed.

Example:

Max Concurrent Requests = 10

If 10 requests are already being processed:

11th Request → Queued or Rejected

This protects resource-intensive operations from overwhelming the server.

Implementation


builder.Services.AddRateLimiter(options =>
{
  options.AddConcurrencyLimiter("concurrency", limiterOptions =>
  {
    limiterOptions.PermitLimit = 10;
    limiterOptions.QueueLimit = 20;
    limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
  });
});

Apply to an endpoint:


app.MapGet("/concurrency", async () =>
{
  // Simulate slow work so that requests overlap.
  await Task.Delay(5000);
  return "Concurrency Limited API";
})
.RequireRateLimiting("concurrency");

Combining Multiple Rate Limiters

In production environments, a single strategy is often insufficient.

A common approach is:

Token Bucket → Controls overall traffic
Concurrency Limiter → Protects expensive operations

Example:


builder.Services.AddRateLimiter(options =>
{
  options.AddTokenBucketLimiter("api", opt =>
  {
    opt.TokenLimit = 100;
    opt.TokensPerPeriod = 20;
    opt.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
    opt.AutoReplenishment = true; 
  });
  options.AddConcurrencyLimiter("heavy-work", opt =>
  {
    opt.PermitLimit = 5;
    opt.QueueLimit = 10;
  });
});
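Registering two named policies does not stack them on one endpoint, since an endpoint uses a single named policy. One way to layer the two limits, sketched here under assumptions, is to move the token bucket to options.GlobalLimiter, which the middleware evaluates for every request, and keep the concurrency limiter as a named policy on the expensive endpoint. The partition key "all-requests" and the routes are hypothetical:

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Global token bucket: checked for every request, in addition to any endpoint policy.
    // A constant key means one shared bucket for overall traffic.
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(_ =>
        RateLimitPartition.GetTokenBucketLimiter(
            partitionKey: "all-requests",
            factory: _ => new TokenBucketRateLimiterOptions
            {
                TokenLimit = 100,
                TokensPerPeriod = 20,
                ReplenishmentPeriod = TimeSpan.FromSeconds(10),
                AutoReplenishment = true
            }));

    // Named policy for expensive endpoints only.
    options.AddConcurrencyLimiter("heavy-work", opt =>
    {
        opt.PermitLimit = 5;
        opt.QueueLimit = 10;
    });
});

var app = builder.Build();

app.UseRateLimiter();

// Covered by the global token bucket only.
app.MapGet("/products", () => "Product list");

// Covered by the global token bucket and the concurrency limiter.
app.MapGet("/reports", async () =>
{
    await Task.Delay(2000); // stand-in for expensive work
    return "Report ready";
})
.RequireRateLimiting("heavy-work");

app.Run();
```

A request to /reports must pass both limiters, while every other endpoint is only subject to the global bucket.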

Handling Rate Limit Responses


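The OnRejected callback runs whenever the middleware rejects a request, which makes it the place to customize the response sent to the client:


builder.Services.AddRateLimiter(options =>
{
  options.OnRejected = async (context, token) =>
  {
    context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
    await context.HttpContext.Response.WriteAsync(
      "Too many requests. Please try again later.", token);
  };
});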
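Clients can back off more intelligently when the response tells them how long to wait. The fragment below is a sketch that extends the callback with a Retry-After header; it assumes the builder variable from the setup section and relies on the limiter attaching RetryAfter metadata to the rejected lease, which not every limiter does:

```csharp
using System.Globalization;
using System.Threading.RateLimiting;

builder.Services.AddRateLimiter(options =>
{
    options.OnRejected = async (context, cancellationToken) =>
    {
        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;

        // The metadata is optional, so the header is only written when it is present.
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out TimeSpan retryAfter))
        {
            context.HttpContext.Response.Headers.RetryAfter =
                ((int)Math.Ceiling(retryAfter.TotalSeconds)).ToString(NumberFormatInfo.InvariantInfo);
        }

        await context.HttpContext.Response.WriteAsync(
            "Too many requests. Please try again later.", cancellationToken);
    };
});
```

Rounding up avoids telling a client to retry slightly before a permit is actually available.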

Choosing the Right Strategy

Strategy	Controls	Burst Handling	Complexity	Recommended For
Fixed Window	Requests per time window	Poor	Low	Internal APIs
Sliding Window	Requests over rolling period	Good	Medium	Public APIs
Token Bucket	Requests with refill mechanism	Excellent	Medium	High-traffic APIs
Concurrency Limiter	Simultaneous requests	N/A	Low	Resource-heavy operations

In many production systems, multiple strategies are combined. For example, a Token Bucket limiter may control overall API traffic while a Concurrency Limiter protects expensive operations from consuming too many server resources.

Summary

Rate limiting is an essential mechanism for building resilient and secure APIs. ASP.NET Core's built-in rate limiting middleware provides multiple strategies to address different traffic patterns and workloads.

Use Fixed Window for simple and predictable scenarios.
Use Sliding Window for smoother traffic control.
Use Token Bucket when burst traffic is expected.
Use Concurrency Limiter to protect resource-intensive endpoints.

For most production APIs, combining Token Bucket with Concurrency Limiter provides a balanced solution that controls traffic volume while protecting backend resources from overload.

Thanks
