Node.js SDK

Node.js Rate Limiting That Scales With Your Workers

In-process rate limiters in Node.js — bottleneck, p-throttle, rate-limiter-flexible — work fine for a single service. The moment you run multiple Lambda functions, worker threads, or separate Node.js services against the same API key, their counters diverge. @ratequeue/sdk coordinates centrally.
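The divergence is easy to demonstrate. A minimal sketch (an illustration of the problem, not how any of these libraries is implemented): two processes each enforcing a local limit of 10 requests per window together let 20 requests through to the upstream API.

```typescript
// Each process runs its own token bucket, so the effective limit is
// (number of processes) × (per-process limit), not the API's real limit.
class TokenBucket {
  private tokens: number;
  constructor(capacity: number) {
    this.tokens = capacity;
  }
  tryAcquire(): boolean {
    if (this.tokens > 0) {
      this.tokens--;
      return true;
    }
    return false;
  }
}

const perProcessLimit = 10;
const workers = [new TokenBucket(perProcessLimit), new TokenBucket(perProcessLimit)];

// Each worker attempts 15 requests in the same window.
const served = workers
  .flatMap((w) => Array.from({ length: 15 }, () => w.tryAcquire()))
  .filter(Boolean).length;

console.log(served); // 20, twice what either worker believes the limit is
```

Add a third worker and the upstream API sees 30. A central coordinator removes the multiplier.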

Install

npm install @ratequeue/sdk

Basic Usage

import { ratequeue } from "@ratequeue/sdk";
import OpenAI from "openai";

const openai = new OpenAI();
const messages = [{ role: "user" as const, content: "Summarize this document." }];
const estimatedTokens = 2000; // rough upper bound on tokens for this call

await ratequeue.acquire(
  "openai-gpt4",
  {
    load: estimatedTokens,
    priority: 100,
    apiKey: process.env.RATEQUEUE_API_KEY!,
  },
  async () => {
    const response = await openai.chat.completions.create({
      model: "gpt-4o",
      messages,
    });
    return response;
  }
);

resource

Identifies which rate-limited resource to acquire capacity on. Must match the resource name configured in your dashboard.

options.load

Weight of this request against load-based limits — tokens, bytes, or any numeric unit.

options.priority

Higher values are served first when capacity is constrained.

options.lane

Isolate traffic into separate queue segments to prevent starvation.
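Taken together, priority and lane determine serving order. As a hypothetical illustration of the scheduling model (not the SDK's actual internals): within a lane, the highest priority wins, and each lane is drained independently so a batch backlog cannot starve user-facing requests.

```typescript
type Pending = { id: string; priority: number; lane: string };

// Within one lane, the highest-priority request is served first;
// Array.prototype.sort is stable, so equal priorities stay FIFO.
function nextToServe(queue: Pending[]): Pending | undefined {
  return [...queue].sort((a, b) => b.priority - a.priority)[0];
}

// Lanes are separate queues: a flood of "batch" work cannot delay
// anything queued in "user-facing".
const byLane = new Map<string, Pending[]>();
function enqueue(req: Pending): void {
  const queue = byLane.get(req.lane) ?? [];
  queue.push(req);
  byLane.set(req.lane, queue);
}

enqueue({ id: "batch-1", priority: 10, lane: "batch" });
enqueue({ id: "chat-1", priority: 100, lane: "user-facing" });
```

In practice this means you can enqueue thousands of low-priority batch requests without affecting latency for requests in a separate lane.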

Lambda / Serverless Integration

Every Lambda invocation is an isolated process with no shared memory. Per-process rate limiters are useless here. RateQueue coordinates across all invocations because the limit is enforced server-side.

import { ratequeue } from "@ratequeue/sdk";
import sgMail from "@sendgrid/mail";

type LambdaEvent = { recipient: string; subject: string; body: string };

export const handler = async (event: LambdaEvent) => {
  return await ratequeue.acquire(
    "sendgrid-email",
    { apiKey: process.env.RATEQUEUE_API_KEY! },
    async () => {
      await sgMail.send({
        to: event.recipient,
        from: "noreply@example.com",
        subject: event.subject,
        text: event.body,
      });
      return { statusCode: 200 };
    }
  );
};

100 Lambda invocations triggered simultaneously — they all queue against the same resource. SendGrid's limit is respected regardless of how many functions are running.
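The contrast with the in-process case is that the counter lives on the server, so the number of invocations doesn't matter. A toy sketch of the idea (RateQueue queues excess requests rather than rejecting them; this simplification just rejects):

```typescript
// One shared, server-side budget for the whole fleet.
let remaining = 10; // e.g., the provider allows 10 sends per window

function serverSideAcquire(): boolean {
  if (remaining > 0) {
    remaining--;
    return true;
  }
  return false;
}

// 100 concurrent invocations, one shared counter.
const granted = Array.from({ length: 100 }, () => serverSideAcquire()).filter(Boolean).length;

console.log(granted); // 10; with RateQueue the other 90 would wait, not fail
```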

Next.js API Route Integration

// pages/api/generate.ts
import { ratequeue } from "@ratequeue/sdk";
import OpenAI from "openai";
import type { NextApiRequest, NextApiResponse } from "next";

const openai = new OpenAI();

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  const result = await ratequeue.acquire(
    "openai-gpt4",
    {
      load: 2000,
      priority: 100,
      lane: "user-facing",
      apiKey: process.env.RATEQUEUE_API_KEY!,
    },
    async () => {
      const response = await openai.chat.completions.create({
        model: "gpt-4o",
        messages: [{ role: "user", content: req.body.prompt }],
      });
      return response.choices[0].message.content;
    }
  );

  res.json({ result });
}

Works Anywhere Node.js Runs

Lambda, ECS tasks, Kubernetes pods, Vercel functions, Express, Fastify, Next.js API routes, background workers — any Node.js environment with outbound HTTP access. For Cloudflare Workers, use the HTTP API directly.

Install @ratequeue/sdk and start coordinating

Create a free resource, wrap your first API call, and coordinate rate limits across all your Node.js workers.