Rate Limit Errors

Stop Getting 429 Errors

Hitting rate limits isn't a sign you're doing something wrong — it's what happens when multiple workers share the same API quota without coordination. Here's how to actually fix it.

Why You Keep Hitting 429s

The problem isn't that you're making too many requests — it's that your workers don't know what the others are doing. When you run 4 services all calling OpenAI, each one sees its own request count. None of them know when the shared limit is almost full. So they all keep firing until the API pushes back with a 429.
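To make the race concrete, here's a toy simulation (no real API calls; all names and numbers are illustrative): four workers each fire five requests into a window whose shared cap is ten. Each worker only sees its own count, so half the traffic gets rejected.

```python
SHARED_LIMIT = 10  # server-side cap per window, shared by every worker

def serve(window_count):
    """Toy server: accept until the shared window cap is reached."""
    return 200 if window_count < SHARED_LIMIT else 429

def run_window(workers=4, requests_each=5):
    """Workers fire independently; none can see the shared count."""
    window_count = 0
    statuses = []
    for _ in range(requests_each):
        for _ in range(workers):  # all four fire in each round
            statuses.append(serve(window_count))
            window_count += 1
    return statuses

statuses = run_window()
print(statuses.count(429))  # 10 of the 20 requests are rejected
```

Twenty requests land in a window that admits ten, so ten come back as 429 no matter how each individual worker paces itself.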

Retries make this worse, not better. Exponential backoff with jitter spreads the retries out and adds latency, but it doesn't stop the 429s from happening in the first place. You're reacting to a failure that could have been avoided entirely.
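For contrast, here's roughly what the reactive approach amounts to: a minimal sketch of the standard "full jitter" backoff schedule, with illustrative defaults. Every delay it produces is paid after a 429 has already occurred.

```python
import random

def full_jitter_delays(attempts=5, base=0.5, cap=30.0):
    """Classic 'full jitter' backoff: each retry sleeps a random amount
    inside an exponentially growing window. This spreads retries out,
    but each delay here is the price of a 429 that already happened."""
    delays = []
    for attempt in range(attempts):
        window = min(cap, base * (2 ** attempt))  # 0.5s, 1s, 2s, ...
        delays.append(random.uniform(0, window))
    return delays

print(full_jitter_delays())
```

Note that nothing in this schedule consults the shared limit; it only tunes how politely you fail.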

The Right Fix: Coordinate Before Sending

Instead of reacting to 429s after they happen, stop the request before it leaves your system when capacity isn't available. RateQueue sits in front of your outbound calls and coordinates access across all your workers from a single control plane. When a worker calls acquire, it either proceeds immediately or waits — no racing, no rejection.

Python

import ratequeue.aio as rq

# Wait for shared capacity on the "openai" resource before sending
async with rq.acquire("openai", load=1, api_key=RATEQUEUE_API_KEY):
    response = await openai_client.chat.completions.create(...)

TypeScript

import { ratequeue } from "@ratequeue/sdk";

await ratequeue.acquire("openai", { apiKey: process.env.RATEQUEUE_API_KEY! }, async () => {
  // Runs only once capacity is available on the shared limit
  const response = await openai.chat.completions.create(...);
});

What This Actually Solves

429s eliminated

Requests wait for capacity instead of racing and failing. The API never sees more traffic than its limit allows.

No retries needed

Prevention beats reaction every time. When you don't get 429s, you don't need retry logic, backoff strategies, or jitter tuning.

Works across workers

All services share the same view of the limit, whether they run in the same process, on the same host, or in different regions.

Distributed by default

Works whether you have 2 or 200 workers. No per-worker configuration, no coordination infrastructure to manage.

No more 429s

Sign up free and wrap your first API call in minutes. One resource, one limit, no infrastructure required.