Setting Up Rate Limiting in Production
Rate limiting is your first line of defense against API abuse, DDoS attacks, and resource exhaustion. This guide walks through implementing production-grade rate limiting with real-world examples.
Why Rate Limiting Matters
Real Incident: The $72,000 Bill
In March 2024, a startup without rate limiting was hit with a DDoS attack. Their auto-scaling infrastructure spun up 300+ servers trying to handle the traffic. The attack lasted 6 hours. Their cloud bill: $72,000.
With proper rate limiting, most of the attack traffic would have been rejected at the gateway before it reached the backend, so auto-scaling would not have spun up hundreds of extra servers.
Rate Limiting Algorithms Compared
1. Token Bucket (Best for Bursts)
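Tokens are added to a bucket at a fixed rate, up to a maximum capacity. Each request consumes a token, and requests are rejected while the bucket is empty. The capacity determines how large a burst is allowed.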
✓ Pros:
- Allows traffic bursts
- Smooth over time
- Memory efficient
✗ Cons:
- Complex to configure
- Can allow sudden spikes
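To make the refill-and-consume cycle concrete, here is a minimal in-memory sketch of a token bucket. The class name `TokenBucket` and its parameters (`rate`, `capacity`, `clock`) are illustrative assumptions, not part of any particular product's API; a production limiter would also need locking and shared storage.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the behavior can be tested
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.last_refill = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `rate=1` and `capacity=3`, a client can send three requests back to back, then is held to roughly one request per second until the bucket refills. Tuning those two numbers together is what makes the algorithm harder to configure than a simple counter.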
2. Fixed Window (Simplest)
Count requests in fixed time windows (e.g., per minute) and reset the counter at each window boundary.
✓ Pros:
- Very simple to implement
- Low memory usage
- Easy to understand
✗ Cons:
- Bursts can straddle a window boundary
- Allows up to 2x the limit in a short span around a boundary
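The counter-and-reset behavior, and the boundary weakness, can be sketched in a few lines. This is a single-process illustration; `FixedWindowLimiter` and its parameters are hypothetical names chosen for the example.

```python
import time


class FixedWindowLimiter:
    """Fixed window: at most `limit` requests per key in each window."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.state = {}  # key -> (window index, request count)

    def allow(self, key):
        window_index = int(self.clock() // self.window)
        index, count = self.state.get(key, (window_index, 0))
        if index != window_index:
            # A new window has started: reset the counter.
            index, count = window_index, 0
        if count >= self.limit:
            return False
        self.state[key] = (index, count + 1)
        return True
```

With a limit of 5 per 60 seconds, a client that sends 5 requests at second 59 and 5 more at second 60 has all 10 accepted within about one second, because the two batches fall into different windows.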
3. Sliding Window (Recommended)
G8KEPR uses this algorithm. It combines fixed-window simplicity with smooth rate enforcement and has no boundary issues.
✓ Pros:
- Accurate rate limiting
- No boundary exploits
- Smooth enforcement
~ Cons:
- Slightly more memory
- More computation
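One common way to get a sliding window is to keep a log of recent request timestamps per key. The sketch below uses that log variant because it is the easiest to reason about; it is an illustration only, not a description of how G8KEPR implements its limiter, and the name `SlidingWindowLog` is an assumption.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLog:
    """Sliding window log: at most `limit` requests in any trailing window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key):
        now = self.clock()
        hits = self.hits[key]
        # Drop timestamps that have aged out of the trailing window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

With a limit of 5 per 60 seconds, 5 requests at second 59 cause a request at second 60 to be rejected, since all 5 are still inside the trailing window. The cost is one stored timestamp per accepted request, which is where the extra memory and computation come from.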
Multi-Level Rate Limiting Strategy
Production systems need rate limiting at multiple levels. Here is the recommended approach:
| Level | Limit | Purpose |
|---|---|---|
| Per IP Address | 100/min | Prevent DDoS, scraping |
| Per User | 1000/min | Fair usage across users |
| Per API Key | 5000/min | Tier-based limits |
| Per Endpoint | Varies | Protect expensive operations |
| Global | 50000/min | Infrastructure capacity |
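As a rough sketch of how these levels combine, the function below checks one request against the per-IP, per-user, per-API-key, and global limits from the table and only counts it if every level accepts. The function name `check_request` and the in-memory store are assumptions for illustration; per-endpoint limits are left out because the table lists them as varying.

```python
import time
from collections import defaultdict, deque

# Requests per minute, taken from the table above.
LIMITS = {"ip": 100, "user": 1000, "api_key": 5000, "global": 50000}
WINDOW_SECONDS = 60

_hits = defaultdict(deque)  # (level, identifier) -> accepted timestamps


def check_request(ip, user, api_key, now=None):
    """Return the name of the first level that rejects, or None if allowed."""
    now = time.monotonic() if now is None else now
    keys = [("ip", ip), ("user", user), ("api_key", api_key), ("global", "*")]
    # Pass 1: evict expired entries and look for any exhausted level.
    for level, ident in keys:
        hits = _hits[(level, ident)]
        while hits and hits[0] <= now - WINDOW_SECONDS:
            hits.popleft()
        if len(hits) >= LIMITS[level]:
            return level
    # Pass 2: count the request only after every level has accepted it.
    for key in keys:
        _hits[key].append(now)
    return None
```

Checking all levels before recording anything means a request rejected at one level does not use up quota at the others.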
Implementation with G8KEPR
G8KEPR provides production-ready rate limiting out of the box. Here is a sample configuration:
    # docker-compose.yml
    services:
      gatekeeper:
        image: g8kepr/gateway:latest
        environment:
          # Sliding window rate limits
          RATE_LIMIT_PER_IP: "100/1m"
          RATE_LIMIT_PER_USER: "1000/1m"
          RATE_LIMIT_PER_API_KEY: "5000/1m"
          # Endpoint-specific limits
          RATE_LIMIT_LOGIN: "5/5m"    # Prevent brute force
          RATE_LIMIT_SIGNUP: "3/1h"   # Prevent spam accounts
          RATE_LIMIT_EXPORT: "10/1h"  # Expensive operation
          # Redis for distributed counting
          REDIS_URL: "redis://redis:6379"
          # Response headers
          RATE_LIMIT_HEADERS: "true"  # X-RateLimit-* headers
        ports:
          - "8080:8080"
      redis:
        image: redis:7-alpine
        volumes:
          - redis-data:/data
    volumes:
      redis-data:
Monitoring and Alerts
Rate limiting is only effective if you monitor it. Set up alerts for:
- Unusual spike in rate limit rejections (possible attack)
- Specific IPs hitting limits repeatedly (block them)
- Legitimate users hitting limits (increase their tier)
- Global limit approaching capacity (scale infrastructure)
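The first two alert conditions can be sketched as a small in-process monitor that counts recent rejections. The class `RejectionMonitor` and its thresholds are hypothetical values picked for the example; in practice these counts would more likely come from your metrics system than from application memory.

```python
import time
from collections import Counter, deque


class RejectionMonitor:
    """Tracks recent rate-limit rejections and reports alert conditions."""

    def __init__(self, window_seconds=60, spike_threshold=500, repeat_threshold=20):
        self.window = window_seconds
        self.spike_threshold = spike_threshold    # total rejections per window
        self.repeat_threshold = repeat_threshold  # rejections per IP per window
        self.events = deque()  # (timestamp, ip), appended in time order

    def record(self, ip, now=None):
        self.events.append((time.monotonic() if now is None else now, ip))

    def alerts(self, now=None):
        now = time.monotonic() if now is None else now
        # Forget rejections that fall outside the monitoring window.
        while self.events and self.events[0][0] <= now - self.window:
            self.events.popleft()
        per_ip = Counter(ip for _, ip in self.events)
        found = []
        if len(self.events) >= self.spike_threshold:
            found.append("spike: %d rejections in window" % len(self.events))
        for ip, n in sorted(per_ip.items()):
            if n >= self.repeat_threshold:
                found.append("repeat offender: %s (%d rejections)" % (ip, n))
        return found
```

Calling `record(ip)` each time the limiter rejects a request and polling `alerts()` periodically gives a starting point for the spike and repeat-offender alerts; the tier and capacity alerts need per-user and global counters in the same style.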
Get Production-Ready Rate Limiting
G8KEPR includes sliding window rate limiting, Redis integration, and real-time monitoring dashboards.