Real-Time Feedback

Know the Instant Your Request Is Active — No Polling

When a request is queued, the natural question is: when is it my turn? The naive solution is polling — check every N milliseconds. But polling burns CPU, adds latency, and fills your logs with noise. RateQueue's real-time feedback pushes activation to your worker the moment the slot opens.

The Polling Problem

The typical pattern developers end up writing without real-time feedback:

# What you end up building without real-time feedback
import asyncio

while True:
    status = check_queue_status(request_id)
    if status == "active":
        break
    await asyncio.sleep(0.2)  # poll every 200ms

await make_api_call()

Problems with this approach:

- Wasted CPU on every check.
- Up to 200ms of added latency between a slot becoming available and your worker noticing.
- Flooded logs.
- Catastrophic behavior under load, when hundreds of requests all poll simultaneously and create a thundering herd of status checks.
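The push-based alternative can be sketched in plain asyncio: instead of each waiter repeatedly checking a status flag, the grantor sets an event and every waiter wakes immediately. The names below are illustrative only, not RateQueue internals.

```python
import asyncio

async def worker(name: str, activated: asyncio.Event) -> str:
    # Suspends with zero CPU until the event is set -- no polling loop
    await activated.wait()
    return f"{name}: active"

async def main() -> list[str]:
    activated = asyncio.Event()
    tasks = [asyncio.create_task(worker(f"w{i}", activated)) for i in range(3)]
    await asyncio.sleep(0.05)  # the slot opens some time later
    activated.set()            # push: every waiter resumes immediately
    return await asyncio.gather(*tasks)

results = asyncio.run(main())
print(results)  # ['w0: active', 'w1: active', 'w2: active']
```

Between `wait()` and `set()` the waiters consume no CPU and make no calls; they resume on the same event-loop tick the event is set.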

Real-Time Activation with the SDK

The SDK handles this transparently. When you call acquire, your code blocks efficiently until the slot is granted — no polling loop on your side. Under the hood, RateQueue uses real-time feedback to wake up your coroutine the instant the slot opens.

import ratequeue.aio as rq

# Awaits efficiently — no busy polling on your side
async with rq.acquire("openai-gpt4", load=2000, api_key=RATEQUEUE_API_KEY):
    # Runs only once the slot is active
    response = await openai_client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
# Slot released — next queued request is notified immediately

No while True: check_status() in your code. The async with suspends your coroutine and resumes it when RateQueue grants the activation.

Why This Matters Under Load

At scale, polling creates a thundering herd: 500 queued requests all polling every 200ms generates 2,500 status requests per second — none of which do useful work. Real-time feedback eliminates this. The coordination happens server-side — each worker is notified exactly once when it's time.

Polling

500 requests × 5 checks/second = 2,500 wasted status calls per second, plus up to 200ms of added latency after a slot opens.

Real-time

Zero status calls. Each worker receives exactly one notification at the exact moment its slot opens.
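The comparison above is simple arithmetic, easy to verify for your own queue depth and poll interval:

```python
queued_requests = 500
poll_interval_s = 0.2  # 200ms between status checks

# Every queued request polls 1/interval times per second
wasted_calls_per_second = queued_requests * (1 / poll_interval_s)

# Worst case, a slot opens just after a check, so the poller
# notices one full interval later
worst_case_added_latency_ms = poll_interval_s * 1000

print(wasted_calls_per_second)      # 2500.0
print(worst_case_added_latency_ms)  # 200.0
```

With push-based activation both numbers drop to zero status calls and near-zero notification latency, independent of queue depth.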

Polling Is Still Supported

If your environment doesn't support persistent connections, RateQueue's polling mode works fine. You can switch between real-time and polling per resource in the dashboard. The SDK handles both modes transparently, so you change your plan or a dashboard setting, not your code.

Eliminate polling — upgrade for real-time feedback

Real-time feedback is available on the paid plan ($6/month). The free plan uses polling-based activation — both are transparent to your code.