V9 Rate Limits
This page defines the default rate limits for the backend services. The limits are grouped by context so that each application area can be controlled independently.
What are rate limits?
Rate limiting is a mechanism used to protect an application from excessive traffic by controlling how many requests can be processed over a period of time. It helps prevent system overload, abuse, and performance degradation by limiting request rates, controlling traffic bursts, or restricting the number of concurrent operations. When a limit is reached, requests may either be placed in a queue and processed later or be rejected with an HTTP 429 (Too Many Requests) response.
Different rate limiting strategies are available depending on the scenario:
- Fixed Window limits requests within a fixed time period.
- Sliding Window provides smoother traffic control using a moving time window.
- Token Bucket allows short bursts while enforcing a sustainable average rate.
- Concurrency limits the number of requests that can execute at the same time.
All queued requests in this implementation are processed using an OldestFirst approach.
Rate Limiting Configuration
Global Behaviour
- Rejected requests return HTTP 429 (Too Many Requests).
- Response trailer includes:
  - error_detail: too many requests
- Queue processing order for all policies:
  - OldestFirst
Built-in Rate Limiter Policies
| Policy Name | Type | Purpose | Configuration |
|---|---|---|---|
| fixed | Fixed Window | Allows a fixed number of requests within a time period. Counter resets at the end of each window. | PermitLimit=100, Window=20s, QueueLimit=50 |
| sliding | Sliding Window | Similar to Fixed Window but uses a moving time window for smoother traffic control. | PermitLimit=25, Window=9s, SegmentsPerWindow=3, QueueLimit=10 |
| token | Token Bucket | Uses tokens that are consumed by requests and replenished over time. Allows short bursts of traffic. | TokenLimit=50, TokensPerPeriod=1, ReplenishmentPeriod=5s, AutoReplenishment=true, QueueLimit=10 |
| concurrency | Concurrency | Limits the number of requests that can execute simultaneously. | PermitLimit=2, QueueLimit=3 |
Limiter Type Comparison
| Limiter Type | What It Limits | Example |
|---|---|---|
| Fixed Window | Number of requests in a fixed time period | Allow 100 requests every 20 seconds |
| Sliding Window | Number of requests in a continuously moving time period | Allow 25 requests within any rolling 9-second period |
| Token Bucket | Requests based on available tokens that refill over time | Allow a burst of up to 50 requests, then refill 1 token every 5 seconds |
| Concurrency | Number of requests running simultaneously | Allow only 2 imports to execute at the same time |
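
The rolling-window behaviour of the sliding policy can be sketched as follows. This is a minimal, illustrative Python model that assumes a segmented counter: the window is split into SegmentsPerWindow equal segments, and a segment's permits become available again once that segment slides out of the window. The class and method names are hypothetical, time is passed in explicitly, queueing is left out, and the sketch is not the limiter the backend actually runs.

```python
from collections import deque


class SlidingWindowLimiter:
    """Segmented sliding-window counter (illustrative sketch only)."""

    def __init__(self, permit_limit=25, window_s=9.0, segments=3):
        self.permit_limit = permit_limit
        self.segments = segments
        self.segment_s = window_s / segments  # 3 seconds with the defaults
        # [segment_index, count] pairs for segments still inside the window.
        self.counts = deque()

    def try_acquire(self, now_s):
        """Return True if a request arriving at now_s (seconds) is permitted."""
        index = int(now_s // self.segment_s)
        # Drop segments that have slid out of the window.
        while self.counts and self.counts[0][0] <= index - self.segments:
            self.counts.popleft()
        used = sum(count for _, count in self.counts)
        if used >= self.permit_limit:
            return False
        if self.counts and self.counts[-1][0] == index:
            self.counts[-1][1] += 1
        else:
            self.counts.append([index, 1])
        return True


limiter = SlidingWindowLimiter()
burst = [limiter.try_acquire(0.0) for _ in range(26)]
print(burst.count(True), "permitted,", burst.count(False), "refused")
```

With the values from the sliding policy, 25 requests at time 0 are permitted and the 26th is refused. A further request is still refused at 8.9 seconds and permitted again at 9 seconds, when the first segment leaves the window. Because the window moves in 3-second steps, the rolling period is approximated at segment granularity rather than tracked continuously.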
Token Bucket Business Policies
Configuration values are sourced from application settings, with the defaults shown below.
| Policy Name | Configuration Section | Purpose | Token Limit | Tokens / Period | Replenishment Period | Queue Limit |
|---|---|---|---|---|---|---|
| api-policy | ApiRateLimitPolicy:* | General API request throttling. Allows small bursts while protecting the API from excessive traffic. | 60 | 10 | 10s | 10 |
| import-policy | ImportRateLimitPolicy:* | Supports high-volume import operations while preventing imports from overwhelming the system. | 1,000 | 200 | 10s | 20 |
| signify-signing-policy | SigningRateLimitPolicy:* | Designed for high-throughput document signing workloads. | 4,000 | 1,000 | 5s | 10 |
| signify-email-policy | EmailRateLimitPolicy:* | Controls email sending throughput and protects downstream email providers. | 6,000 | 1,000 | 10s | 50 |
| signify-sms-policy | SMSRateLimitPolicy:* | Limits SMS traffic to avoid overwhelming SMS gateways and third-party providers. | 1,000 | 100 | 10s | 10 |
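
The defaults in the table above translate into a sustained request rate and a recovery time for each policy. The following Python sketch works through that arithmetic; the helper names are hypothetical, and it assumes tokens are added in whole batches once per replenishment period.

```python
# Defaults copied from the table above:
# (token limit, tokens per period, replenishment period in seconds)
POLICIES = {
    "api-policy": (60, 10, 10),
    "import-policy": (1000, 200, 10),
    "signify-signing-policy": (4000, 1000, 5),
    "signify-email-policy": (6000, 1000, 10),
    "signify-sms-policy": (1000, 100, 10),
}


def sustained_rate(tokens_per_period, period_s):
    """Average requests per second once the initial burst is spent."""
    return tokens_per_period / period_s


def refill_time(token_limit, tokens_per_period, period_s):
    """Seconds for an empty bucket to become full again with no traffic."""
    periods = -(-token_limit // tokens_per_period)  # ceiling division
    return periods * period_s


for name, (limit, per_period, period_s) in POLICIES.items():
    rate = sustained_rate(per_period, period_s)
    refill = refill_time(limit, per_period, period_s)
    print(f"{name}: burst {limit}, about {rate:g} req/s sustained, full refill in {refill}s")
```

For example, api-policy sustains 1 request per second and an empty bucket refills in 60 seconds, while signify-signing-policy sustains 200 requests per second and refills in 20 seconds.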
What This Means in Practice
For example, with its default settings the api-policy will:
- Handle a burst of up to 60 requests immediately.
- Recover at a rate of 10 requests every 10 seconds (approximately 1 request per second on average).
- Queue up to 10 additional requests while waiting for tokens to become available.
- Reject further requests with HTTP 429 when both the bucket and queue are full.
This configuration is useful because it allows short traffic spikes while still protecting the API from sustained high request volumes. The other policies defined above use the same token bucket mechanism, each with its own limits.
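
The api-policy behaviour described above can be modelled with a small token bucket that has a bounded queue. This is a simplified Python sketch under stated assumptions: the class and method names are hypothetical, replenishment is triggered manually once per period instead of by a timer, and it is not the backend's actual limiter.

```python
from collections import deque


class TokenBucket:
    """Token bucket with a bounded oldest-first queue (illustrative sketch)."""

    def __init__(self, token_limit=60, tokens_per_period=10, queue_limit=10):
        self.token_limit = token_limit
        self.tokens_per_period = tokens_per_period
        self.queue_limit = queue_limit
        self.tokens = token_limit  # the bucket starts full
        self.queue = deque()

    def request(self, request_id):
        """Return 'run', 'queued', or 'rejected' (the HTTP 429 case)."""
        # Queued requests go first, so only run immediately if nothing waits.
        if self.tokens > 0 and not self.queue:
            self.tokens -= 1
            return "run"
        if len(self.queue) < self.queue_limit:
            self.queue.append(request_id)
            return "queued"
        return "rejected"

    def replenish(self):
        """Call once per replenishment period; returns the requests released."""
        self.tokens = min(self.token_limit, self.tokens + self.tokens_per_period)
        released = []
        while self.queue and self.tokens > 0:
            self.tokens -= 1
            released.append(self.queue.popleft())  # oldest first
        return released


bucket = TokenBucket()
outcomes = [bucket.request(i) for i in range(80)]
print({state: outcomes.count(state) for state in ("run", "queued", "rejected")})
```

Sending 80 requests at once against the api-policy defaults runs 60 immediately, queues 10, and rejects 10. The first replenishment after 10 seconds adds 10 tokens, which are consumed by the 10 queued requests in arrival order.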
Queue Behaviour
| Scenario | Result |
|---|---|
| Permit available | Request executes immediately |
| Permit unavailable, queue has space | Request waits in queue |
| Permit unavailable, queue full | Request rejected with HTTP 429 |
| Multiple requests queued | Processed in OldestFirst order |
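
The four scenarios in the table can be traced with a small model of the concurrency policy (PermitLimit=2, QueueLimit=3). The Python sketch below is illustrative only: the class and method names are hypothetical, and request completion is signalled by an explicit call.

```python
from collections import deque


class ConcurrencyLimiter:
    """Concurrency limit with a bounded oldest-first queue (illustrative sketch)."""

    def __init__(self, permit_limit=2, queue_limit=3):
        self.permit_limit = permit_limit
        self.queue_limit = queue_limit
        self.running = set()
        self.queue = deque()

    def start(self, request_id):
        """Return 'executing', 'queued', or 'rejected' (the HTTP 429 case)."""
        if len(self.running) < self.permit_limit:
            self.running.add(request_id)
            return "executing"
        if len(self.queue) < self.queue_limit:
            self.queue.append(request_id)
            return "queued"
        return "rejected"

    def finish(self, request_id):
        """Release a permit and hand it to the oldest queued request, if any."""
        self.running.discard(request_id)
        if self.queue:
            next_id = self.queue.popleft()
            self.running.add(next_id)
            return next_id
        return None


limiter = ConcurrencyLimiter()
print([limiter.start(request_id) for request_id in "ABCDEF"])
print("next to run:", limiter.finish("A"))
```

Requests A and B execute immediately, C, D, and E wait in the queue, and F is rejected. When A finishes, C is released first because it has waited longest.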
Summary
All rate limiters:
- Return HTTP 429 when requests are rejected.
- Include the trailer error_detail: too many requests.
- Process queued requests using OldestFirst ordering.
Business-specific policies use the Token Bucket algorithm and are configurable at the application level.