Architecture
Talos Auditor is a multi-runtime product — a local agent, two Cloudflare Workers, and a Next.js dashboard — deliberately kept simple so one engineer can operate it. This page maps every piece to what it does and where it lives.
Components at a glance
| Piece | Runs on | Responsibility |
|---|---|---|
| Agent | Developer laptop (Node 22) | Tail JSONL files, redact per mode, optionally call Ollama, batch-upload. |
| api Worker | Cloudflare Workers | Public HTTP surface — auth, ingest validation, rate limit, queue push. |
| consumer Worker | Cloudflare Workers | Queue consumer + cron workflows + TeamRateLimiter Durable Object. |
| D1 | Cloudflare D1 (SQLite) | Aggregates, members, sessions, insights, alerts. 13 tables today. |
| R2 | Cloudflare R2 | Raw JSONL archive for teams in full mode. |
| Queues | Cloudflare Queues | Events queue + alerts queue. Explicit ack, DLQ on 3 retries. |
| Dashboard | Next.js 15 (Cloudflare Pages once DNS is live) | RSC + TanStack Query. Clerk for auth. |
Ingest pipeline
POST /v1/ingest (agent → api Worker, Bearer token)
│
├── verify token (D1 agent_tokens, SHA-256 lookup)
├── zod-parse IngestBatch
├── TeamRateLimiter DO (per-team token bucket)
│
▼
EVENTS queue
│ batch_size=100, wait=5s, max_retries=3
│
▼
consumer Worker.queue()
│
├── look up teams.privacy_mode + categorizer (authoritative)
├── upsert sessions row (idempotent ON CONFLICT)
├── upsert hourly_usage + daily_usage
│
├── [mode=full] append to R2 staging/team=X/date=Y/
├── [mode=categorize]
│ ├── precomputedCategory present (local mode)? → insert event_categories
│ └── else call Haiku → insert event_categories
│
└── anomaly threshold crossed? → enqueue ALERTS

The consumer is the single enforcement point for privacy mode — even if a misbehaving agent sends prompt text when it shouldn't, the consumer refuses to write it anywhere.
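A minimal sketch of that enforcement point, assuming a simplified event shape (the `metadata` mode name and the `enforcePrivacyMode` helper are illustrative, not the actual implementation):

```typescript
// Sketch of the consumer's privacy gate. The team row from D1 is
// authoritative, so whatever mode the agent claimed is ignored.
type PrivacyMode = "metadata" | "categorize" | "full";

interface IngestEvent {
  sessionId: string;
  costUsdMicros: number;
  promptText?: string; // only legitimate when the team runs in full mode
}

// Strip prompt text unless the team's authoritative mode allows it.
function enforcePrivacyMode(event: IngestEvent, mode: PrivacyMode): IngestEvent {
  if (mode === "full") return event; // raw text may be archived to R2
  const { promptText, ...rest } = event;
  return rest; // prompt text never reaches D1 or R2 in any other mode
}
```

Because this runs server-side on every queue batch, a compromised or buggy agent cannot widen its own privacy mode.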
AI reports
Cron triggers (set in apps/consumer/wrangler.toml)
0 * * * * → runHourlyAnomalyScan (z-score on hourly_usage)
15 2 * * * → runDailySummary (Claude Sonnet on daily aggregates)
15 3 * * 1 → runWeeklyReport (Claude Sonnet on weekly aggregates)
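The z-score test behind the hourly anomaly scan can be sketched as follows (a sketch only; the real workflow's windowing and threshold tuning aren't shown, and the 3σ cutoff is an assumption):

```typescript
// Flag the latest hour as anomalous when its cost sits far outside the
// distribution of recent hourly totals (in micro-USD).
function isAnomalous(history: number[], latest: number, zThreshold = 3): boolean {
  if (history.length < 2) return false; // not enough data to judge
  const mean = history.reduce((a, b) => a + b, 0) / history.length;
  const variance =
    history.reduce((acc, x) => acc + (x - mean) ** 2, 0) / history.length;
  const stdDev = Math.sqrt(variance);
  if (stdDev === 0) return latest !== mean; // flat history: any change is news
  return Math.abs(latest - mean) / stdDev > zThreshold;
}
```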
Each workflow:
1. Load aggregate snapshot from D1 (no prompt text — ever).
2. Render to a prompt, call Sonnet.
3. Insert into insights table.
4. Post to Slack if the integration is set up and the toggle is on.

Data model
The core flow of cost through the schema — micro-USD integers from ingest to display:
events (ephemeral, in queue)
│
└→ sessions.cost_usd_micros (idempotent upsert per event)
└→ hourly_usage.cost_usd_micros (team, member, project, source, model, hour)
└→ daily_usage.cost_usd_micros (same grain but per-day)
│
└→ /v1/metrics/team reads daily_usage
└→ /v1/metrics/members reads daily_usage
└→ weekly-report workflow reads daily_usage + event_categories

Cost is stored as an integer µUSD (1 USD = 1,000,000 µUSD). We never use floats for money — summing 100k sessions with float dollars drifts materially.
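The micro-USD convention in code, as a sketch (helper names here are illustrative, not the shared package's API):

```typescript
// Costs travel the whole pipeline as integer micro-USD; floats appear
// only at the display edge. 1 USD = 1,000,000 µUSD.
const MICROS_PER_USD = 1_000_000;

// Convert a float dollar amount to integer micro-USD exactly once, at ingest.
function toMicros(usd: number): number {
  return Math.round(usd * MICROS_PER_USD);
}

// Summing integers is exact; summing float dollars accumulates error.
function sumCostMicros(costs: number[]): number {
  return costs.reduce((total, c) => total + c, 0);
}

// Format for display only — the float never flows back into storage.
function formatUsd(micros: number): string {
  return `$${(micros / MICROS_PER_USD).toFixed(2)}`;
}
```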
Authentication
- Agent → API: Authorization: Bearer tal_agt_<32 random bytes base62>. The server stores SHA-256(token + HASH_PEPPER) only.
- Dashboard → API: Clerk session JWT, verified at the edge by @clerk/backend's authenticateRequest.
- CLI login: OAuth 2.0 device authorization flow, same UX as gh auth login.
Repo layout (monorepo)
talos-auditor/
├── apps/
│ ├── agent/ CLI (@talos-foundrix/auditor-agent)
│ ├── api/ Cloudflare Worker (Hono)
│ ├── consumer/ Cloudflare Worker (queue + workflows)
│ └── dashboard/ Next.js 15 (this app you're reading)
├── packages/
│ ├── shared/ zod schemas, pricing, privacy modes
│ ├── db/ D1 schema + migrations
│ └── ui/ design tokens + TalosLogo
├── infra/
│ └── install.sh one-line installer
└── docs/
├── PLAN.md
├── CLERK_SETUP.md
└── SLACK_SETUP.md

Deployment
Two Workers deploy via wrangler deploy --env prod on merge to main. D1 migrations auto-apply via wrangler d1 migrations apply auditor-prod --remote. Dashboard ships to Cloudflare Pages via OpenNext.
Scaling ceilings
- D1: 10 GB per database. We only store aggregates here, so 10 GB is hundreds of thousands of developer-months.
- R2: effectively unlimited. full mode at 10 devs × 20 MB/day ≈ 200 MB/day ≈ 6 GB/month. Pennies.
- Queues: 5k msgs/sec per queue. We batch 50 events per message, so that's 250k events/sec of raw throughput before the rate limit kicks in.
- Workers: 128 MB memory; the api Worker runs with 50 ms of CPU on the Bundled plan. The consumer gets 5 min of CPU because it's on Unbound.
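The headroom math above, spelled out (figures are the ones quoted on this page; the per-queue message rate is Cloudflare's published limit):

```typescript
// Back-of-envelope ceilings from the numbers above.
const devs = 10;
const mbPerDevPerDay = 20;
const r2MbPerDay = devs * mbPerDevPerDay;      // 200 MB/day in full mode
const r2GbPerMonth = (r2MbPerDay * 30) / 1000; // ≈ 6 GB/month

const msgsPerSec = 5_000;  // Queues limit per queue
const eventsPerMsg = 50;   // agent batching
const eventsPerSec = msgsPerSec * eventsPerMsg; // 250,000 events/sec raw
```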
See the Trust page for data-flow and subprocessor details.