What is the difference between the Rate Limit and Spike Arrest interceptors?

Both Rate Limit and Spike Arrest are options for controlling the number of requests that will be accepted by a given API (or resource or operation). But the way they control traffic is different:

Rate Limit vs. Spike Arrest

Rate Limit works based on time intervals (second, minute, hour, day, month). It checks if the number of calls within the configured interval is within the expected limit, regardless of the time between calls. When the interval ends, a new one starts and the call count is also reset.

Spike Arrest enforces a minimum time between two consecutive calls. If a request arrives before that minimum time has elapsed, it is not accepted and the HTTP status code returned is 429. This prevents traffic spikes and protects the server by keeping the incoming flow at a level it can handle.
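The two rules described above can be sketched in a few lines of Python. This is only an illustration, not the platform's implementation: the class names are hypothetical, the Rate Limit windows are assumed to be aligned to multiples of the interval starting at time zero, and Spike Arrest is assumed to measure the minimum time from the last accepted call.

```python
class FixedWindowRateLimit:
    """Accept at most `limit` calls inside each fixed window (assumed aligned to t=0)."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self.current_window = None
        self.count = 0

    def allow(self, timestamp):
        window = timestamp // self.window_seconds
        if window != self.current_window:
            # A new interval has started: the call count is reset.
            self.current_window = window
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False  # the caller would receive HTTP 429


class SpikeArrest:
    """Accept a call only if enough time has passed since the last accepted call (assumption)."""

    def __init__(self, limit, window_seconds):
        # The limit is only used to derive the minimum spacing between calls.
        self.min_interval = window_seconds / limit
        self.last_accepted = None

    def allow(self, timestamp):
        if self.last_accepted is None or timestamp - self.last_accepted >= self.min_interval:
            self.last_accepted = timestamp
            return True
        return False  # the caller would receive HTTP 429
```

For instance, `FixedWindowRateLimit(limit=2, window_seconds=10)` accepts calls at 0 s and 1 s, denies a call at 2 s, and accepts again at 10 s, while `SpikeArrest(limit=6, window_seconds=60)` denies any call that arrives less than 10 seconds after the last accepted one.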

With this difference in mind, how do you choose which one to use, and how do you configure it?

To enforce general limits, Rate Limit is the most suitable option. With a time frame in seconds it can also reduce traffic spikes, but only when traffic is relatively spaced out. If your goal is to protect the server against call overload and the expected traffic is very intense, Spike Arrest gives you better control. The examples below illustrate these points.

To configure the interceptors so that they achieve the desired goal, it’s important to understand how each of them operates. Because Spike Arrest works based on the time between calls, configuring it with a limit of 1 request per second, 60 per minute or 3600 per hour produces exactly the same behaviour: the minimum time between calls is 1 second in all three cases. Rate Limit requires closer attention to the settings. Because it checks the total number of calls in a given time interval, a limit of 1 request per second is different from 60 per minute: in the first case the count is reset every second, and in the second case every minute.
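The arithmetic behind this paragraph can be checked with a short Python sketch. The function names and the sample burst are hypothetical, and the Rate Limit windows are assumed to be aligned to multiples of the interval starting at time zero.

```python
from collections import Counter
from fractions import Fraction

UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def spike_arrest_min_interval(limit, unit):
    """Minimum time between calls, in seconds, implied by a Spike Arrest limit."""
    return Fraction(UNIT_SECONDS[unit], limit)


def rate_limit_accepted(timestamps, limit, unit):
    """Count the calls a fixed-window Rate Limit would accept (windows assumed aligned to t=0)."""
    window = UNIT_SECONDS[unit]
    calls_per_window = Counter()
    accepted = 0
    for t in timestamps:
        index = int(t // window)
        if calls_per_window[index] < limit:
            calls_per_window[index] += 1
            accepted += 1
    return accepted


# Illustrative burst: 60 calls within the first 0.6 seconds.
burst = [i / 100 for i in range(60)]

# The three Spike Arrest settings give the same spacing of 1 second.
print(spike_arrest_min_interval(1, "second"))
print(spike_arrest_min_interval(60, "minute"))
print(spike_arrest_min_interval(3600, "hour"))

# The two Rate Limit settings treat the same burst very differently.
print(rate_limit_accepted(burst, 1, "second"))   # only the first call fits in its second
print(rate_limit_accepted(burst, 60, "minute"))  # the whole burst fits in the minute
```

Under these assumptions, the burst yields 1 accepted call with a limit of 1 per second and 60 accepted calls with a limit of 60 per minute, even though both limits average out to the same rate.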

If you still can’t tell one interceptor from the other very well, the examples below will help to clarify their differences.

Examples of use: Spike Arrest

In the first example below, we show the behaviour of a Spike Arrest interceptor configured with a limit of 6 requests per minute. Therefore, the minimum time between requests is 10 seconds. Accepted calls are shown in purple and denied calls are in orange:

[Figure: Spike Arrest, limit of 6 requests per minute]

Analysing the result, we can see that all calls are answered only when they are made at regular intervals, spaced by at least the minimum time.
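A small simulation makes the same point with numbers. The sketch below is an assumption-laden illustration: the function name and the timestamps are made up, and the minimum time is assumed to be measured from the last accepted call.

```python
def spike_arrest_decisions(timestamps, limit, window_seconds):
    """Return True/False per call for a Spike Arrest of `limit` calls per `window_seconds`."""
    min_interval = window_seconds / limit  # 6 per 60 s gives 10 s
    last_accepted = None
    decisions = []
    for t in timestamps:
        if last_accepted is None or t - last_accepted >= min_interval:
            last_accepted = t
            decisions.append(True)
        else:
            decisions.append(False)
    return decisions


# Six calls spaced exactly 10 seconds apart: all of them are accepted.
regular = [0, 10, 20, 30, 40, 50]
print(spike_arrest_decisions(regular, limit=6, window_seconds=60))

# Six calls spaced 3 seconds apart: only the calls at 0 s and 12 s are accepted.
bunched = [0, 3, 6, 9, 12, 15]
print(spike_arrest_decisions(bunched, limit=6, window_seconds=60))
```

Both sequences contain six calls, which is within the nominal limit of 6 per minute, yet only the regularly spaced one is accepted in full.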

In the next example, we show the behaviour of a Spike Arrest interceptor configured with a limit of 1 request per second (which is equivalent to 60 per minute). For the graph, we are considering 100 requests made in an interval of 1 minute (the divisions represent each second). Accepted calls are shown in purple and denied calls are in orange:

[Figure: Spike Arrest, limit of 1 request per second, 100 requests in 1 minute]

In total, 45 of the 100 calls respected the minimum time between calls and were answered successfully. Note that traffic spikes are prevented and the accepted calls are evenly distributed over the time frame.

Examples of use: Rate Limit

In the first example below, we show the behaviour of a Rate Limit interceptor configured with a limit of 1 request per second. Accepted calls are shown in purple and denied calls are in orange:

[Figure: Rate Limit, limit of 1 request per second]

If more calls are made than the established limit within the configured interval, Rate Limit still accepts the requests that fall within the limit. The same configuration (limit of 1 request per second) is shown below, but now the graph represents a full minute in which 100 requests were made. Accepted calls are shown in purple and denied calls are in orange (the divisions represent each second of the interval):

[Figure: Rate Limit, limit of 1 request per second, 100 requests in 1 minute]

By guaranteeing a limit of 1 call per second, Rate Limit answered 55 of the 100 calls. Comparing this number with Spike Arrest (with a limit of 1 request per second as shown above), we see that Rate Limit accepted more calls. This is because very close calls are allowed, as long as one is at the end of one interval and the other is at the beginning of the next interval.
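The boundary effect can be reproduced with two calls that are only 100 ms apart but fall on either side of a window boundary. The following Python sketch uses hypothetical function names and timestamps, assumes Rate Limit windows aligned to multiples of the interval starting at time zero, and assumes Spike Arrest measures from the last accepted call.

```python
def rate_limit_decisions(timestamps_ms, limit, window_ms):
    """Fixed-window Rate Limit: count calls per window (windows assumed aligned to t=0)."""
    counts = {}
    decisions = []
    for t in timestamps_ms:
        window = t // window_ms
        if counts.get(window, 0) < limit:
            counts[window] = counts.get(window, 0) + 1
            decisions.append(True)
        else:
            decisions.append(False)
    return decisions


def spike_arrest_decisions(timestamps_ms, min_interval_ms):
    """Spike Arrest: require `min_interval_ms` since the last accepted call (assumption)."""
    last_accepted = None
    decisions = []
    for t in timestamps_ms:
        if last_accepted is None or t - last_accepted >= min_interval_ms:
            last_accepted = t
            decisions.append(True)
        else:
            decisions.append(False)
    return decisions


# One call at the end of the first second, one at the start of the next.
calls_ms = [900, 1000]

print(rate_limit_decisions(calls_ms, limit=1, window_ms=1000))   # [True, True]
print(spike_arrest_decisions(calls_ms, min_interval_ms=1000))    # [True, False]
```

Rate Limit accepts both calls because each one is the first in its own one-second window, while Spike Arrest denies the second call because it arrives less than one second after the first.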

In the following example, we set Rate Limit to accept 60 requests per minute. As the call count is reset only at the end of the period, the behaviour is different from the previous configuration (1 request per second). Again, we represent 100 calls made in 1 minute. As expected, Rate Limit accepted 60 requests:

[Figure: Rate Limit, limit of 60 requests per minute, 100 requests in 1 minute]

In this last example, traffic spikes are allowed, as long as the total number of calls over the configured period is within the configured limit. If the limit is exceeded, the API (or resource or operation, depending on where the interceptor was inserted) will be unavailable until the next time interval starts. In the example above, as the 60th request happened just before the 33rd second, the unavailability would last for the remaining 27 seconds of the interval.
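The length of the unavailable period follows directly from the moment the limit is reached. As a rough sketch in Python, with a hypothetical function name and made-up timestamps (60 evenly spaced calls, the last one at 33 seconds), and assuming the interval starts at time zero:

```python
def lockout_seconds(timestamps, limit, window_seconds):
    """Time left in the first window after the `limit`-th call, or 0.0 if the limit is not reached."""
    in_window = sorted(t for t in timestamps if 0 <= t < window_seconds)
    if len(in_window) < limit:
        return 0.0
    reached_at = in_window[limit - 1]  # timestamp of the call that uses up the limit
    return window_seconds - reached_at


# Illustrative traffic: 60 calls spread evenly over the first 33 seconds.
calls = [33 * (i + 1) / 60 for i in range(60)]

print(lockout_seconds(calls, limit=60, window_seconds=60))  # 27.0
```

Any further call in that window is denied, so the earlier the limit is used up, the longer the API stays unavailable before the count is reset.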
