Rate limiting in gateways is a crucial configuration to manage traffic and ensure service availability. By implementing a rate limiter, we can control the number of requests a client can make to the server within a specified time frame. This protects the underlying services from being overwhelmed by excessive traffic, whether malicious or accidental.
The configuration typically involves:
Replenish Rate: The rate at which tokens are added to the bucket. For example, if the replenish rate is 2 tokens per second, two tokens are added to the bucket every second.
Burst Capacity: The maximum number of tokens that the bucket can hold. This allows for short bursts of traffic.
KeyResolver: An interface used to determine the key for rate-limiting purposes.
NOTE: We currently provide two options for keyResolver. If neither is specified, Spring Cloud Gateway falls back to its default, PrincipalNameKeyResolver, which retrieves the Principal from the ServerWebExchange and calls Principal.getName().
ipKeyResolver: Resolves the key based on the IP address of the request.
userKeyResolver: Resolves the key based on the user UUID of the request.
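To make the role of the key concrete, here is a minimal plain-Java sketch of how the choice of resolver changes which requests share a rate-limit bucket. It is not the Spring Cloud Gateway interface (whose KeyResolver is reactive and works on a ServerWebExchange); SampleRequest, IP_KEY, USER_KEY, and requestsPerKey are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for the request attributes a resolver might read. */
final class SampleRequest {
    final String remoteIp;
    final String userUuid;

    SampleRequest(String remoteIp, String userUuid) {
        this.remoteIp = remoteIp;
        this.userUuid = userUuid;
    }
}

final class KeyResolverSketch {
    // Simplified analogues of the two resolver options named above (assumed behaviour).
    static final Function<SampleRequest, String> IP_KEY = request -> request.remoteIp;
    static final Function<SampleRequest, String> USER_KEY = request -> request.userUuid;

    /** Groups requests by resolved key; each distinct key would get its own token bucket. */
    static Map<String, Integer> requestsPerKey(List<SampleRequest> requests,
                                               Function<SampleRequest, String> resolver) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (SampleRequest request : requests) {
            counts.merge(resolver.apply(request), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<SampleRequest> requests = Arrays.asList(
                new SampleRequest("10.0.0.1", "user-a"),
                new SampleRequest("10.0.0.1", "user-b"),
                new SampleRequest("10.0.0.2", "user-a"));
        System.out.println(requestsPerKey(requests, IP_KEY));   // {10.0.0.1=2, 10.0.0.2=1}
        System.out.println(requestsPerKey(requests, USER_KEY)); // {user-a=2, user-b=1}
    }
}
```

With the IP-based key, two different users behind the same address draw from one bucket; with the user-based key, one user is limited consistently across addresses.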
Let's say we have a rate limiter configured with:
replenishRate: 2 tokens per second
burstCapacity: 5 tokens
This means:
2 tokens are added to the bucket every second.
The bucket can hold a maximum of 5 tokens.
Scenario: A user makes requests at different intervals.
Initial State: The bucket has 5 tokens (full capacity).
First Request: The user makes a request and consumes 1 token. 4 tokens remain.
Second Request: The user makes another request and consumes 1 more token. 3 tokens remain.
Third Request: The user waits 1 second (2 tokens added) and then makes a request. The bucket has 4 tokens (3 remaining + 2 added - 1 consumed).
Let's consider a scenario where a user makes multiple requests in quick succession.
Configuration:
replenishRate: 1 token per second
burstCapacity: 3 tokens
Scenario: A user makes 4 requests in rapid succession.
Initial State: The bucket has 3 tokens (full capacity).
First Request: Consumes 1 token. 2 tokens remain.
Second Request: Consumes 1 token. 1 token remains.
Third Request: Consumes 1 token. 0 tokens remain.
Fourth Request: There are no tokens left, so the request is denied. The user must wait for more tokens to be added.
After 1 second, 1 token is added to the bucket. The user can make another request.
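Both walkthroughs above can be reproduced with a small simulation. The following is a minimal in-memory sketch of a token bucket, assuming one token per request and caller-supplied timestamps in seconds; it is not the Redis-backed limiter the gateway uses, and TokenBucket and its methods are illustrative names.

```java
/** Minimal token bucket: tokens refill at a fixed rate, capped at a fixed capacity. */
final class TokenBucket {
    private final double replenishRate;  // tokens added per second
    private final double burstCapacity;  // maximum tokens the bucket can hold
    private double tokens;               // tokens currently available
    private double lastRefillSeconds;    // time of the most recent refill

    TokenBucket(double replenishRate, double burstCapacity) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity;     // the bucket starts full
        this.lastRefillSeconds = 0.0;
    }

    /** Refills for the elapsed time, then tries to consume one token. */
    boolean tryConsume(double nowSeconds) {
        double elapsed = nowSeconds - lastRefillSeconds;
        tokens = Math.min(burstCapacity, tokens + elapsed * replenishRate);
        lastRefillSeconds = nowSeconds;
        if (tokens < 1.0) {
            return false;                // no token left: the request is denied
        }
        tokens -= 1.0;
        return true;
    }

    double remaining() {
        return tokens;
    }
}

class TokenBucketDemo {
    public static void main(String[] args) {
        // First walkthrough: replenishRate = 2, burstCapacity = 5.
        TokenBucket first = new TokenBucket(2, 5);
        first.tryConsume(0.0);           // 4 tokens remain
        first.tryConsume(0.0);           // 3 tokens remain
        first.tryConsume(1.0);           // 3 + 2 added - 1 consumed
        System.out.println("tokens left: " + first.remaining()); // tokens left: 4.0

        // Second walkthrough: replenishRate = 1, burstCapacity = 3.
        TokenBucket second = new TokenBucket(1, 3);
        for (int i = 1; i <= 4; i++) {
            System.out.println("request " + i + " allowed: " + second.tryConsume(0.0));
        }
        System.out.println("after 1 s allowed: " + second.tryConsume(1.0));
    }
}
```

In the second run the first three requests are allowed, the fourth is denied, and the request made one second later succeeds because a single token has been replenished.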
Here’s a practical example using Spring Cloud Gateway with Redis Rate Limiting.
Configuration: Rate limiting for a route is set in Routes.properties.
Explanation:
Replenish Rate: 5 tokens per second.
Burst Capacity: 10 tokens.
Behavior:
A user can make up to 10 requests instantly (burst capacity).
After consuming the burst capacity, the user can make 5 requests per second (replenish rate).
API Service Rate Limiting
An API service wants to ensure clients do not overwhelm the server with too many requests. They set up rate limits as follows:
Replenish Rate: 100 tokens per minute.
Burst Capacity: 200 tokens.
Scenario:
A client can make 200 requests instantly.
After the burst capacity is exhausted, the client can make 100 requests per minute.
If a client makes more requests than allowed, it receives a response indicating that it is being rate-limited (typically HTTP 429 Too Many Requests).
Prevents Abuse: Limits the number of requests to prevent abuse or malicious attacks (e.g., DDoS attacks).
Fair Usage: Ensures fair usage among all users by preventing a single user from consuming all the resources.
Load Management: Helps manage server load and maintain performance by controlling the rate of incoming requests.
Improved User Experience: Prevents server overload, ensuring a smoother and more reliable experience for all users.
Rate limiting is crucial for traffic management, fair usage, and server resource protection. By setting parameters like replenishRate and burstCapacity, you can regulate request flow and manage traffic spikes efficiently. In Spring Cloud Gateway, the Redis Rate Limiter filter offers a robust solution for implementing rate limiting on your routes.
Deployment With Spring Cloud
We are updating our core services to remove outdated dependencies and ensure long-term support for the DIGIT platform. The current API gateway uses Netflix Zuul, which has dependencies that will soon be obsolete. To address this, we are building a new gateway using Spring Cloud Gateway.
What is Spring Cloud Gateway and how is it different from Zuul?
Spring Cloud Gateway and Zuul both function as API gateways but differ in architecture and design. Spring Cloud Gateway is ideal for modern, reactive applications, while Zuul is better suited for traditional, blocking I/O environments. The choice between them depends on your specific needs.
Navigating the new Gateway codebase
The new Gateway codebase is well-organized, with each module containing similar files and names. This makes it easy to understand the tasks each part performs. Below is a snapshot of the current directory structure.
Config: Contains the configuration-related files, for example the application properties.
Constants: Contains the constants referenced frequently within the codebase. Add any constant string literal here and then access it via this file.
Filters: This folder is the heart of the API gateway, since it contains all the PRE, POST, and ERROR filters. For those new to filters: filters intercept incoming requests and outgoing responses, allowing developers to apply functionality such as authentication, authorisation, rate limiting, logging, and transformation.
Model: Contains the POJOs required in the gateway.
Producer: Contains code related to pushing data onto Kafka.
Rate limiters: Contains the files that initialise the relevant beans for custom rate limiting.
Utils: Contains helper functions that can be reused across the project.
The above paragraphs provide a basic overview of the gateway's functionality and project structure.
When a request is received, the gateway checks if it matches any predefined routes. If a match is found, the request goes through a series of filters, each performing specific validation or enrichment tasks. The exact order of these filters is discussed later.
The gateway also ensures that restricted routes have proper authentication and authorization. Some APIs can be whitelisted as open or mixed-mode endpoints, allowing them to bypass authentication or authorization.
Once a matching route definition is found, the gateway executes the pre-filters in the order listed below.
Pre-Filter
RequestStartTimerFilter: Sets the request start time.
CorrelationIdFilter: Generates a correlationId and sets it on each request to help track it in the downstream services.
AuthPreCheckFilter: Checks whether authentication has to be performed.
PreHookFilter: Sends a pre-hook request.
RbacPreFilter: Checks whether authorisation has to be performed.
AuthFilter: Authenticates the request.
RbacFilter: Authorises the request.
RequestEnrichmentFilter: Enriches the request with userInfo and correlationId.
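The ordering matters because a filter can stop the request before later filters run. The sketch below illustrates that idea in plain Java with a subset of the filter names from the list, kept in the same relative order. It is not the gateway's code: the context map, its keys (openEndpoint, authToken, authRequired), and the filter bodies are assumptions, including the assumption that AuthPreCheckFilter decides whether authentication is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.function.Predicate;

/** A named step; the predicate returns false to reject the request and stop the chain. */
final class NamedFilter {
    final String name;
    final Predicate<Map<String, Object>> step;

    NamedFilter(String name, Predicate<Map<String, Object>> step) {
        this.name = name;
        this.step = step;
    }
}

final class PreFilterChainSketch {
    /** A subset of the pre-filters, in the same relative order as the list above. */
    static List<NamedFilter> chain() {
        return Arrays.asList(
                new NamedFilter("RequestStartTimerFilter", ctx -> {
                    ctx.put("startTime", System.currentTimeMillis());
                    return true;
                }),
                new NamedFilter("CorrelationIdFilter", ctx -> {
                    ctx.put("correlationId", UUID.randomUUID().toString());
                    return true;
                }),
                new NamedFilter("AuthPreCheckFilter", ctx -> {
                    // Assumption: open endpoints skip authentication.
                    ctx.put("authRequired", !Boolean.TRUE.equals(ctx.get("openEndpoint")));
                    return true;
                }),
                new NamedFilter("AuthFilter", ctx ->
                        !Boolean.TRUE.equals(ctx.get("authRequired")) || ctx.get("authToken") != null),
                new NamedFilter("RequestEnrichmentFilter", ctx -> {
                    ctx.put("enriched", true);
                    return true;
                }));
    }

    /** Runs the filters in order and returns the names of those that executed. */
    static List<String> run(Map<String, Object> ctx) {
        List<String> executed = new ArrayList<>();
        for (NamedFilter filter : chain()) {
            executed.add(filter.name);
            if (!filter.step.test(ctx)) {
                break; // rejected: later filters never run
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        Map<String, Object> noToken = new HashMap<>();
        System.out.println(run(noToken));      // stops at AuthFilter

        Map<String, Object> openEndpoint = new HashMap<>();
        openEndpoint.put("openEndpoint", true);
        System.out.println(run(openEndpoint)); // all five filters run
    }
}
```

A request to a restricted route without a token is rejected at AuthFilter and never reaches enrichment, while a request to an open endpoint passes through every step.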
Error-Filter
This filter handles all errors thrown either during request processing or by the downstream service.
There are two ways to configure rate limits in the Gateway:
Default Rate Limiting
Service Level Rate Limiting
Default rate limiting sets a standard limit on the number of requests that can be made to the gateway within a specified time frame. This limit applies to all services unless specific rate limits are configured at the service level.
Add these properties to the Values.YAML of the Gateway Helm chart, then configure the values as per the use case. Read Configuring Gateway Rate Limiting for more information about these properties.
Service level rate limiting allows you to set specific rate limits for individual services. Each service can have its own request limits, tailored to its usage patterns, which gives more granular control over traffic management.
To define rate limiting differently for each service, define these properties in the Values.YAML of the respective service. Read Configuring Gateway Rate Limiting for more information about these properties.
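The relationship between the two levels can be pictured as a lookup with a fallback: a service-level limit wins when one is defined, otherwise the default applies. The sketch below is only a model of that rule, not the generated gateway configuration; the class names, the service names, and the numeric limits are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** A pair of rate-limit settings. */
final class RateLimit {
    final int replenishRate;
    final int burstCapacity;

    RateLimit(int replenishRate, int burstCapacity) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
    }
}

final class RateLimitLookupSketch {
    private final RateLimit defaultLimit;
    private final Map<String, RateLimit> serviceLimits = new HashMap<>();

    RateLimitLookupSketch(RateLimit defaultLimit) {
        this.defaultLimit = defaultLimit;
    }

    /** Registers a service-level limit that overrides the default for one service. */
    void override(String serviceName, RateLimit limit) {
        serviceLimits.put(serviceName, limit);
    }

    /** Returns the service-level limit if present, otherwise the default. */
    RateLimit limitFor(String serviceName) {
        return serviceLimits.getOrDefault(serviceName, defaultLimit);
    }

    public static void main(String[] args) {
        RateLimitLookupSketch limits = new RateLimitLookupSketch(new RateLimit(5, 10));
        limits.override("billing-service", new RateLimit(2, 5)); // hypothetical service

        System.out.println(limits.limitFor("billing-service").burstCapacity); // 5
        System.out.println(limits.limitFor("other-service").burstCapacity);   // 10
    }
}
```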
NOTE: We currently provide two options for keyResolver. If neither is specified, Spring Cloud Gateway falls back to its default, PrincipalNameKeyResolver, which retrieves the Principal from the ServerWebExchange and calls Principal.getName().
ipKeyResolver: Resolves the key based on the IP address of the request.
userKeyResolver: Resolves the key based on the user UUID of the request.
To enable gateway routes, a service must activate the gateway flag in the Helm chart. Based on this flag, a Go script runs in the Init container, automatically generating the necessary properties for all services using the gateway.
NOTE: Restart the Gateway after making changes to a service's Values.YAML so that it picks up the changes.