Reserved Capacity
Reserved Capacity for the Traffic That Can't Wait
When batch jobs and user-facing requests share the same API quota, batch jobs win by volume. They fire in bulk, consume the budget, and your interactive traffic gets queued behind 500 enrichment jobs. Reserved capacity lets you hold back a portion of the resource exclusively for critical traffic.
The Noisy Neighbor Problem
You have 100 units of capacity on an API. Background jobs consume 95 of them. A user request comes in. It waits. The user sees latency they shouldn't see, caused by work that could wait. This is the noisy neighbor problem — low-priority traffic crowding out high-priority traffic on a shared resource.
Priority helps, but isn't enough on its own. Even with priority ordering, if 100 batch jobs were already running when the user request arrives, the user still waits for the next available slot. Reserved capacity is the hard guarantee.
How Reserved Capacity Works
A reservation holds back N units of capacity that only designated traffic can access. Non-reserved traffic cannot consume those units, regardless of priority or queue depth.
Example: 100 total capacity, 20 reserved for user-facing, 20 reserved for API tier, 60 general pool. User-facing requests always have 20 units available, even when 80 units of the general pool are busy.
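The arithmetic above can be sketched as a toy admission check. This is illustrative only, not the RateQueue SDK: the `admit` function, lane names, and unit counts are assumptions chosen to match the example.

```python
# Toy model: each lane first tries its reserved block, then the general pool.
TOTAL = 100
RESERVED = {"user-facing": 20, "api": 20}   # hard-reserved units per lane
GENERAL = TOTAL - sum(RESERVED.values())    # 60 units shared by everyone

used_reserved = {lane: 0 for lane in RESERVED}
used_general = 0

def admit(lane: str) -> bool:
    """Grant one unit to `lane`, preferring its reserved block."""
    global used_general
    if lane in RESERVED and used_reserved[lane] < RESERVED[lane]:
        used_reserved[lane] += 1
        return True
    if used_general < GENERAL:
        used_general += 1
        return True
    return False   # no reserved headroom for this lane, general pool full

# Batch traffic can exhaust the general pool...
assert sum(admit("batch") for _ in range(60)) == 60
assert admit("batch") is False            # the 61st batch request is refused
# ...but user-facing traffic still has its 20 reserved units.
assert admit("user-facing") is True
```

The key property is the fall-through order: reserved units are checked first and are invisible to lanes without a reservation, so no amount of batch volume can touch them.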
import ratequeue.aio as rq

# User-facing: uses reserved capacity block
async with rq.acquire(
    "openai-gpt4",
    priority=100,
    lane="user-facing",
    api_key=RATEQUEUE_API_KEY,
):
    await generate_user_response(prompt)

# Background job: draws from general pool only
async with rq.acquire(
    "openai-gpt4",
    priority=1,
    lane="batch",
    api_key=RATEQUEUE_API_KEY,
):
    await enrich_record(record_id)
When Priority Isn't Enough
Priority ordering works well when the resource has free capacity. When the resource is fully loaded, high-priority requests skip the queue — but only relative to other queued requests. Reserved capacity ensures that even under full load, your critical traffic has guaranteed access.
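The full-load guarantee can be modeled with two semaphores — a sketch under assumptions, not RateQueue's implementation. Batch work may only take general slots, while user-facing work tries the reserved block first; the 5-unit resource, function names, and 2/3 split are invented for illustration.

```python
import asyncio

async def run_user_request(reserved: asyncio.Semaphore,
                           general: asyncio.Semaphore) -> str:
    # Try the reserved block first; fall back to the general pool.
    if not reserved.locked():
        async with reserved:
            return "reserved"
    async with general:
        return "general"

async def main() -> str:
    # Model a 5-unit resource: 2 units reserved for user-facing, 3 general.
    reserved = asyncio.Semaphore(2)
    general = asyncio.Semaphore(3)
    # Saturate the general pool, as a burst of batch jobs would...
    for _ in range(3):
        await general.acquire()
    # ...the user request is still admitted immediately via the reserved block.
    return await asyncio.wait_for(run_user_request(reserved, general), timeout=1)

print(asyncio.run(main()))
```

With priority ordering alone, the user request in this scenario would sit in a queue until a batch slot freed up; with the reserved semaphore it is admitted without waiting.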
SLA-bound user requests
Guarantee headroom for interactive traffic regardless of how deep the batch backlog runs.
Premium vs. free tier
Reserve capacity for paying customers so free-tier volume never degrades their experience.
Real-time inference vs. batch
Keep inference capacity available for live requests even when batch training jobs are running.
Webhooks vs. scheduled reports
Time-sensitive webhook delivery always has headroom, even during report generation bursts.
Configure on the Dashboard
Reserved capacity is set at the resource level in the RateQueue dashboard — no code changes needed. Define what percentage or unit count to reserve per traffic class, and it takes effect immediately across all workers and services using that resource.
Stop letting batch jobs starve your critical traffic
Start free — reserved capacity is available on the paid plan.