Scaling to 1 Million Tasks: A Technical Deep Dive
Sarah Jones
Head of Engineering
Running one AI task is easy. Running one million scheduled tasks reliably is an engineering challenge.
The Scheduling Problem
We use Trigger.dev with a distributed queue system to manage task execution. This allows us to throttle execution rates per organization, per model, and per provider dynamically.
```typescript
// Task scheduling with rate limits
const task = await scheduler.create({
  prompt: userPrompt,
  schedule: "0 8 * * *", // Daily at 8am
  modelId: "gpt-4o",
  organizationId: org.id,
});
```
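The per-organization, per-model, and per-provider throttling mentioned above can be sketched as a keyed token bucket. This is a minimal in-process illustration, not the production mechanism (the real limits live behind Trigger.dev's distributed queue); the key format and the example limits are assumptions.

```typescript
// Minimal token-bucket sketch (hypothetical; illustrates the throttling
// idea, not the actual queue-backed implementation).
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(private capacity: number, private refillPerSec: number) {
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  // Refill based on elapsed time, then try to spend one token.
  tryAcquire(now: number = Date.now()): boolean {
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// One bucket per (organization, model, provider) key lets each
// dimension carry its own limit.
const buckets = new Map<string, TokenBucket>();

function allowRun(orgId: string, modelId: string, provider: string): boolean {
  const key = `${orgId}:${modelId}:${provider}`;
  let bucket = buckets.get(key);
  if (!bucket) {
    bucket = new TokenBucket(5, 1); // assumed example: burst of 5, refill 1/sec
    buckets.set(key, bucket);
  }
  return bucket.tryAcquire();
}
```

Keying the bucket on the full `(org, model, provider)` tuple is what allows one noisy organization to be slowed down without affecting anyone else on the same model.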
Model Gateway
The secret sauce is our model gateway, built on OpenRouter. It abstracts away provider-specific rate limits, retries, and fallbacks. If one model is overloaded, we automatically route requests to an equivalent alternative.
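A simplified view of that fallback behavior: try the requested model first, and on failure walk an equivalence list until a call succeeds. The model names, the equivalence map, and the `callModel` signature here are hypothetical, for illustration only.

```typescript
// Hypothetical fallback router. The fallback map below is an assumption,
// not the gateway's actual configuration.
const fallbacks: Record<string, string[]> = {
  "gpt-4o": ["claude-3.5-sonnet", "gemini-1.5-pro"], // assumed equivalents
};

async function routeWithFallback(
  modelId: string,
  prompt: string,
  callModel: (modelId: string, prompt: string) => Promise<string>,
): Promise<string> {
  const candidates = [modelId, ...(fallbacks[modelId] ?? [])];
  let lastError: unknown;
  for (const candidate of candidates) {
    try {
      return await callModel(candidate, prompt);
    } catch (err) {
      lastError = err; // e.g. overloaded / 429: fall through to the next model
    }
  }
  throw lastError; // every candidate failed
}
```

Because `callModel` is injected, the same routing logic works whether the call goes to OpenRouter or to a test double.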
Database Architecture
To store millions of tasks and billions of runs, we optimized our Postgres schema with careful indexing on nextRunAt timestamps. This ensures that the scheduler can efficiently find and execute due tasks every minute.
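As a sketch of what the per-minute polling step looks like, assuming a `nextRunAt` column and an `active` status flag (both names are illustrative): with a B-tree index on `nextRunAt`, the due-task lookup is an index range scan rather than a full table scan.

```typescript
// The equivalent SQL, assuming illustrative table/column names:
//   SELECT * FROM tasks
//   WHERE next_run_at <= now() AND status = 'active'
//   ORDER BY next_run_at
//   LIMIT 1000;
//
// In-memory equivalent of that query, for illustration:
interface ScheduledTask {
  id: string;
  nextRunAt: Date;
  status: "active" | "paused";
}

function findDueTasks(tasks: ScheduledTask[], now: Date, limit = 1000): ScheduledTask[] {
  return tasks
    .filter((t) => t.status === "active" && t.nextRunAt.getTime() <= now.getTime())
    .sort((a, b) => a.nextRunAt.getTime() - b.nextRunAt.getTime())
    .slice(0, limit);
}
```

The `LIMIT` matters at this scale: each scheduler tick claims a bounded batch of due tasks, so a backlog degrades gracefully instead of producing one enormous query.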