Real-Time Feedback
Know the Instant Your Request Is Active — No Polling
When a request is queued, the natural question is: when is it my turn? The naive solution is polling — check every N milliseconds. But polling burns CPU, adds latency, and fills your logs with noise. RateQueue's real-time feedback pushes activation to your worker the moment the slot opens.
The Polling Problem
The typical pattern developers end up writing without real-time feedback:
# What you end up building without real-time feedback
while True:
    status = check_queue_status(request_id)
    if status == "active":
        break
    await asyncio.sleep(0.2)  # poll every 200ms

await make_api_call()

Problems with this approach: wasted CPU on every check, up to 200ms of additional latency between a slot becoming available and your worker noticing, flooded logs, and catastrophic behavior under load when hundreds of requests are all polling simultaneously, creating a thundering herd of status checks.
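The usual mitigation is exponential backoff with jitter, but that only trades load for latency: the longer the backoff, the later you notice your slot. A minimal sketch of that trade-off, assuming a hypothetical `check_status` callable like the one above:

```python
import asyncio
import random

async def wait_until_active(check_status, request_id, base=0.2, cap=5.0):
    """Poll with exponential backoff and full jitter.

    Spreads out the herd of status checks, but each backoff step
    increases the worst-case delay before the worker notices its slot.
    """
    delay = base
    while True:
        if await check_status(request_id) == "active":
            return
        # Full jitter: sleep a random fraction of the current delay
        await asyncio.sleep(random.uniform(0, delay))
        delay = min(delay * 2, cap)
```

Even tuned carefully, this loop still wakes up, checks, and goes back to sleep; push-based activation removes the loop entirely.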
Real-Time Activation with the SDK
The SDK handles this transparently. When you call acquire, your code blocks efficiently until the slot is granted — no polling loop on your side. Under the hood, RateQueue uses real-time feedback to wake up your coroutine the instant the slot opens.
import ratequeue.aio as rq

# Awaits efficiently — no busy polling on your side
async with rq.acquire("openai-gpt4", load=2000, api_key=RATEQUEUE_API_KEY):
    # Runs only once the slot is active
    response = await openai_client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
# Slot released — next queued request is notified immediately

No while True: check_status() in your code. The async with suspends your coroutine and resumes it when RateQueue grants the activation.
Why This Matters Under Load
At scale, polling creates a thundering herd: 500 queued requests all polling every 200ms generates 2,500 status requests per second — none of which do useful work. Real-time feedback eliminates this. The coordination happens server-side — each worker is notified exactly once when it's time.
Polling
500 requests × 5 checks/second = 2,500 wasted status calls per second. Latency up to 200ms after slot opens.
Real-time
Zero status calls. Each worker receives exactly one notification at the exact moment its slot opens.
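The suspend-and-resume behavior is the same mechanism asyncio's own synchronization primitives use. The sketch below is a simplified local model of push-based activation, not the RateQueue wire protocol; `SlotCoordinator` and `worker` are illustrative names:

```python
import asyncio

class SlotCoordinator:
    """Local model of push-based slot activation: waiting coroutines
    are suspended by the event loop and each is resumed exactly once."""

    def __init__(self, concurrency):
        self._sem = asyncio.Semaphore(concurrency)

    async def acquire(self):
        # Suspends the coroutine; the event loop resumes it the
        # instant release() frees a slot. No status checks in between.
        await self._sem.acquire()

    def release(self):
        self._sem.release()

async def worker(coord, i, log):
    await coord.acquire()
    try:
        log.append(i)           # "API call" runs while the slot is held
        await asyncio.sleep(0)
    finally:
        coord.release()         # next queued worker wakes immediately

async def run_workers(n, concurrency):
    coord = SlotCoordinator(concurrency)
    log = []
    await asyncio.gather(*(worker(coord, i, log) for i in range(n)))
    return log
```

Between a `release()` and the next worker's wake-up there are zero intermediate calls, which is the property the comparison above is measuring.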
Polling Is Still Supported
If your environment doesn't support persistent connections, RateQueue's polling mode works fine. Switch between real-time and polling per resource in the dashboard. Both modes are handled transparently by the SDK — you don't change your code, just your plan.
Eliminate polling — upgrade for real-time feedback
Real-time feedback is available on the paid plan ($6/month). The free plan uses polling-based activation — both are transparent to your code.