Architecture

Talos Auditor is a multi-runtime product — a local agent, two Cloudflare Workers, and a Next.js dashboard — deliberately kept simple so one engineer can operate it. This page maps every piece to what it does and where it lives.

Components at a glance

  • Agent (developer laptop, Node 22): tails JSONL files, redacts per privacy mode, optionally calls Ollama, batch-uploads.
  • api Worker (Cloudflare Workers): public HTTP surface; auth, ingest validation, rate limiting, queue push.
  • consumer Worker (Cloudflare Workers): queue consumer, cron workflows, and the TeamRateLimiter Durable Object.
  • D1 (Cloudflare D1, SQLite): aggregates, members, sessions, insights, alerts. 13 tables today.
  • R2 (Cloudflare R2): raw JSONL archive for teams in full mode.
  • Queues (Cloudflare Queues): events queue + alerts queue. Explicit ack, DLQ after 3 retries.
  • Dashboard (Next.js 15, on Cloudflare Pages once DNS is live): RSC + TanStack Query, Clerk for auth.

Ingest pipeline

POST /v1/ingest              (agent → api Worker, Bearer token)
  │
  ├── verify token (D1 agent_tokens, SHA-256 lookup)
  ├── zod-parse IngestBatch
  ├── TeamRateLimiter DO   (per-team token bucket)
  │
  ▼
EVENTS queue
  │   batch_size=100, wait=5s, max_retries=3
  │
  ▼
consumer Worker.queue()
  │
  ├── look up teams.privacy_mode + categorizer (authoritative)
  ├── upsert sessions row (idempotent ON CONFLICT)
  ├── upsert hourly_usage + daily_usage
  │
  ├── [mode=full] append to R2 staging/team=X/date=Y/
  ├── [mode=categorize]
  │     ├── precomputedCategory present (local mode)? → insert event_categories
  │     └── else call Haiku → insert event_categories
  │
  └── anomaly threshold crossed? → enqueue ALERTS

The consumer is the single enforcement point for privacy mode: even if a misbehaving agent sends prompt text it shouldn't, the consumer refuses to write it anywhere.
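
A minimal sketch of that enforcement. The mode and field names here are illustrative, not the real schema (which lives in packages/shared); the point is that the server-side mode lookup, not the client, decides what gets written.

```typescript
// Hypothetical sketch: mode and field names are illustrative, not the real schema.
type PrivacyMode = "metadata" | "categorize" | "full";

interface IncomingEvent {
  sessionId: string;
  costUsdMicros: number;
  promptText?: string;          // should only ever arrive in full mode
  precomputedCategory?: string; // set by the agent in local mode
}

// The consumer reads the team's authoritative privacy_mode from D1 and drops
// prompt text for any non-full mode, regardless of what the agent sent.
function sanitizeForWrite(event: IncomingEvent, mode: PrivacyMode): IncomingEvent {
  if (mode === "full") return event;
  const { promptText, ...rest } = event; // strip before any R2 append or D1 upsert
  return rest;
}
```

Running this before the R2 append and the D1 upserts means nothing downstream of the consumer can observe a field the team's mode forbids.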

AI reports

Cron triggers (set in apps/consumer/wrangler.toml)
  0 * * * *      → runHourlyAnomalyScan   (z-score on hourly_usage)
  15 2 * * *     → runDailySummary        (Claude Sonnet on daily aggregates)
  15 3 * * 1     → runWeeklyReport        (Claude Sonnet on weekly aggregates)
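
In wrangler.toml these schedules sit under the [triggers] table; a minimal fragment matching the schedule above:

```toml
# apps/consumer/wrangler.toml (fragment)
[triggers]
crons = ["0 * * * *", "15 2 * * *", "15 3 * * 1"]
```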

Each workflow:
  1. Load aggregate snapshot from D1 (no prompt text — ever).
  2. Render to a prompt, call Sonnet.
  3. Insert into insights table.
  4. Post to Slack if the integration is set up and the toggle is on.
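
Step 2 is the only place aggregates become model input. A sketch of that rendering, with an illustrative snapshot shape (not the real daily_usage columns) — note there is no field that could carry user prompt text:

```typescript
// Illustrative snapshot shape; the real D1 columns may differ.
interface DailySnapshot {
  date: string;
  totalCostUsdMicros: number;
  activeMembers: number;
  topModels: { model: string; costUsdMicros: number }[];
}

// Step 2 of the workflow: turn the D1 aggregate snapshot into prompt text.
function renderDailyPrompt(s: DailySnapshot): string {
  const usd = (m: number) => `$${(m / 1_000_000).toFixed(2)}`;
  return [
    `Summarize this team's AI usage for ${s.date}.`,
    `Total spend: ${usd(s.totalCostUsdMicros)} across ${s.activeMembers} active members.`,
    ...s.topModels.map((t) => `- ${t.model}: ${usd(t.costUsdMicros)}`),
  ].join("\n");
}
```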

Data model

The core flow of cost through the schema — micro-USD integers from ingest to display:

events (ephemeral, in queue)
  │
  └→ sessions.cost_usd_micros          (idempotent upsert per event)
  └→ hourly_usage.cost_usd_micros      (team, member, project, source, model, hour)
  └→ daily_usage.cost_usd_micros       (same grain but per-day)

     │
     └→ /v1/metrics/team reads daily_usage
     └→ /v1/metrics/members reads daily_usage
     └→ weekly-report workflow reads daily_usage + event_categories

Cost is stored as an integer count of µUSD (1 USD = 1,000,000 µUSD). We never use floats for money: summing 100k sessions with float dollars drifts materially.
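
A quick illustration of the drift, summing a tenth of a cent 100,000 times both ways:

```typescript
// Integer µUSD vs float dollars: bigint sums are exact, float sums drift.
const PER_EVENT_USD = 0.001;      // $0.001 as a binary float (not exactly representable)
const PER_EVENT_MICROS = 1_000n;  // the same amount in µUSD (exact)

let floatSum = 0;
let microSum = 0n;
for (let i = 0; i < 100_000; i++) {
  floatSum += PER_EVENT_USD;
  microSum += PER_EVENT_MICROS;
}
// floatSum is close to, but not exactly, 100 — rounding error accumulates
// on every addition. microSum is exactly 100_000_000 µUSD ($100.00).
```

The absolute error per sum is tiny, but equality comparisons, invoice diffs, and repeated re-aggregation all surface it; integers avoid the problem entirely.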

Authentication

  • Agent → API: Authorization: Bearer tal_agt_<32 random bytes base62>. Server stores SHA-256(token + HASH_PEPPER) only.
  • Dashboard → API: Clerk session JWT, verified at the edge by @clerk/backend's authenticateRequest.
  • CLI login: OAuth 2.0 device authorization flow, same UX as gh auth login.
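
A sketch of the agent-token scheme under those rules. The encoding and helper names are illustrative, and it uses node:crypto for brevity (the Worker itself would use Web Crypto's crypto.subtle.digest):

```typescript
import { createHash, randomBytes } from "node:crypto";

const BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

// Illustrative mint: one base62 character per random byte. A real implementation
// would base62-encode the full 256 bits of entropy without modulo bias.
function newAgentToken(): string {
  let body = "";
  for (const b of randomBytes(32)) body += BASE62[b % 62];
  return `tal_agt_${body}`;
}

// Only this digest is stored in the agent_tokens table; the raw token is shown
// once at creation and never persisted. HASH_PEPPER is a Worker secret, so a
// leaked D1 snapshot alone is not enough to recompute valid lookups.
function tokenDigest(token: string, pepper: string): string {
  return createHash("sha256").update(token + pepper).digest("hex");
}
```

Verification on ingest recomputes SHA-256(token + pepper) for the presented Bearer token and compares it against the stored hex digest.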

Repo layout (monorepo)

talos-auditor/
├── apps/
│   ├── agent/        CLI (@talos-foundrix/auditor-agent)
│   ├── api/          Cloudflare Worker (Hono)
│   ├── consumer/     Cloudflare Worker (queue + workflows)
│   └── dashboard/    Next.js 15 (this app you're reading)
├── packages/
│   ├── shared/       zod schemas, pricing, privacy modes
│   ├── db/           D1 schema + migrations
│   └── ui/           design tokens + TalosLogo
├── infra/
│   └── install.sh    one-line installer
└── docs/
    ├── PLAN.md
    ├── CLERK_SETUP.md
    └── SLACK_SETUP.md

Deployment

Both Workers deploy via wrangler deploy --env prod on merge to main. D1 migrations auto-apply via wrangler d1 migrations apply auditor-prod --remote. The dashboard ships to Cloudflare Pages via OpenNext.

Scaling ceilings

  • D1: 10 GB per database. We only store aggregates here, so 10 GB is hundreds of thousands of developer-months.
  • R2: effectively unlimited. full mode at 10 devs × 20MB/day ≈ 200MB/day ≈ 6GB/month. Pennies.
  • Queues: 5k msgs/sec per queue. We pack 50 events into each message, so that's 250k events/sec of raw throughput before the per-team rate limit kicks in.
  • Workers: 128MB memory, 50ms bundled CPU on api. Consumer gets 5 min CPU because it's on unbound.

See the Trust page for data-flow and subprocessor details.