FAQ

Is RateQueue for protecting my public API endpoints?
No. RateQueue is for coordinating outbound access to shared constrained resources. It is not an API gateway, WAF, or inbound endpoint rate limiter.

How is RateQueue different from retry logic?
Retries react after a failure has already happened. RateQueue coordinates access before a request is sent, so contention is prevented rather than retried.

How is RateQueue different from a normal queue?
A normal queue only stores work. RateQueue controls when queued work is actually allowed to run, based on rate limits, concurrency limits, priority, lanes, and reserved capacity.
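
The distinction can be simulated locally. The sketch below is a single-process stand-in, not RateQueue's API: the `pending` deque is the "normal queue" that merely stores work, while `poll` only releases a job when a per-second rate budget allows it. All names are illustrative.

```python
import collections
import time

class RateGatedQueue:
    """Local sketch: a queue that releases work only when a rate budget allows.

    Illustrative only; the real product coordinates this across processes
    and machines, and these names are not its SDK.
    """

    def __init__(self, max_per_second):
        self.max_per_second = max_per_second
        self.pending = collections.deque()        # a plain queue: just stores work
        self.release_times = collections.deque()  # timestamps of recent releases

    def submit(self, job):
        self.pending.append(job)

    def poll(self, now=None):
        """Return the next job if the rate budget permits, else None."""
        now = time.monotonic() if now is None else now
        # Forget releases older than the one-second window.
        while self.release_times and now - self.release_times[0] >= 1.0:
            self.release_times.popleft()
        if self.pending and len(self.release_times) < self.max_per_second:
            self.release_times.append(now)
            return self.pending.popleft()
        return None

q = RateGatedQueue(max_per_second=2)
for job in ("a", "b", "c"):
    q.submit(job)

# Two jobs fit in the current window; the third stays queued until time passes.
released = [q.poll(now=0.0), q.poll(now=0.0), q.poll(now=0.0)]
```

Submitting never blocks; only release is governed, which is the "controls when work runs" behavior described above.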

Can multiple workers share one LLM quota through RateQueue?
Yes. RateQueue is a strong fit for shared LLM quotas where multiple workers need to coordinate request, token, and concurrency budgets without routing through a third-party gateway.
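
A token budget is the least obvious of the three, so here is a minimal single-process sketch of the idea, assuming a fixed per-window token allowance. The class and method names are illustrative, not RateQueue's API; the real product enforces this across many workers.

```python
class TokenBudget:
    """Local sketch of a per-window token budget for a shared LLM quota."""

    def __init__(self, tokens_per_window):
        self.tokens_per_window = tokens_per_window
        self.used = 0

    def try_acquire(self, estimated_tokens):
        """Admit a request only if its token estimate fits the remaining budget."""
        if self.used + estimated_tokens <= self.tokens_per_window:
            self.used += estimated_tokens
            return True
        return False  # caller stays queued until the window resets

    def reset_window(self):
        self.used = 0

budget = TokenBudget(tokens_per_window=1000)
first = budget.try_acquire(600)    # fits: 600 <= 1000
second = budget.try_acquire(600)   # rejected: 1200 would exceed the quota
budget.reset_window()
third = budget.try_acquire(600)    # fits again after the window resets
```

Request and concurrency budgets follow the same admit-or-queue pattern, just counting requests or in-flight calls instead of tokens.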

Will low-priority traffic starve my critical work?
Not if you configure lanes or reserved capacity. Those features separate traffic classes and protect critical work from lower-priority traffic.
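
The reserved-capacity idea can be sketched as a small admission rule, assuming two lanes named "critical" and "bulk" (these names, and the function itself, are illustrative, not RateQueue configuration):

```python
def admit(lane, in_flight, total_capacity, reserved_for_critical):
    """Sketch of reserved capacity.

    'critical' work may use any free slot, including the reserved ones.
    'bulk' work may only use the unreserved share, so some slots are
    always held open for critical traffic.
    """
    used = sum(in_flight.values())
    if lane == "critical":
        return used < total_capacity
    return used < total_capacity - reserved_for_critical

# With 8 of 10 slots taken and 2 reserved, bulk is held back
# while critical is still admitted.
load = {"bulk": 8, "critical": 0}
bulk_ok = admit("bulk", load, total_capacity=10, reserved_for_critical=2)
critical_ok = admit("critical", load, total_capacity=10, reserved_for_critical=2)
```

Because bulk traffic can never consume the reserved slots, a burst of low-priority work cannot starve the critical lane.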

Can one resource enforce more than one limit?
Yes. A single resource can enforce multiple limits at the same time, including rate-based and concurrency-based rules.
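
A request runs only when every rule on the resource allows it. The following local sketch combines a concurrency cap with a per-window start cap; all names are illustrative, not the RateQueue API.

```python
class MultiLimitResource:
    """Sketch: one resource enforcing two rules at once.

    A request starts only if BOTH the concurrency rule and the
    per-window rate rule allow it.
    """

    def __init__(self, max_concurrent, max_starts_per_window):
        self.max_concurrent = max_concurrent
        self.max_starts_per_window = max_starts_per_window
        self.in_flight = 0
        self.starts_this_window = 0

    def try_start(self):
        if self.in_flight >= self.max_concurrent:
            return False  # blocked by the concurrency rule
        if self.starts_this_window >= self.max_starts_per_window:
            return False  # blocked by the rate rule
        self.in_flight += 1
        self.starts_this_window += 1
        return True

    def finish(self):
        self.in_flight -= 1  # concurrency capacity frees immediately

resource = MultiLimitResource(max_concurrent=2, max_starts_per_window=3)
a = resource.try_start()   # admitted
b = resource.try_start()   # admitted
c = resource.try_start()   # blocked: two already in flight
resource.finish()
d = resource.try_start()   # admitted: third start of this window
resource.finish()
e = resource.try_start()   # blocked: rate rule, all 3 starts used
```

Note the two rules recover differently: concurrency capacity frees as soon as a request finishes, while the rate budget only refreshes with the next window.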

Do I have to use real-time feedback?
No. Polling is supported, and real-time feedback is available when you want immediate activation updates.

Can RateQueue act as a distributed lock?
Yes. A capacity-1, request-scoped resource behaves like a queued distributed lock, which makes it useful for preventing race conditions on shared dependencies.
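
The semantics of a capacity-1 resource can be shown locally with a semaphore: one holder at a time, and waiters are admitted as the slot frees up. This is a single-process stand-in for illustration only; the real product provides these semantics across processes and machines.

```python
import threading

# Capacity 1 means at most one holder; everyone else queues.
capacity_one = threading.BoundedSemaphore(1)
completed = []

def worker(name):
    with capacity_one:          # wait here until the single slot is free
        completed.append(name)  # critical section: never runs concurrently

threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b", "c")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All three workers complete, but the guarded section only ever runs one at a time, which is exactly the race-condition protection described above.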

Do I need to run any infrastructure myself?
No. RateQueue is fully managed. You define the resource and its limits, then integrate through the API or SDK.

Can I evaluate RateQueue for free?
Yes. The free plan includes one resource and one limit per resource, so you can validate a real production use case before upgrading.

Reach out for additional questions or support.