
Architecture deep-dives

How the one agent is actually built: routing, decomposition, evals, completion gates. The machinery behind the position.

Field notes
Your group chat is dying. I built a member that won't let it.
Most bots are on-call: summoned, they answer, they vanish. The harder half is ambient: present without being summoned, and choosing almost always not to speak. letmecheckbot does both, and the ambient half has one job that matters most to me. When a group chat goes quiet, it reaches into the room's own past, finds the day worth remembering, and hands it back. Field notes on the mechanism, and on why a member that remembers is the only kind that can keep a room alive.
Read · 6 min
Architecture
How my agent earns the word done: evals and certification gates
My own agent made my daughter's first-birthday book and misspelled her name on the cover, the kind of error a spellchecker passes because it is a correctly spelled name, just not hers. The eval layer and per-type certification gates that came out of it, with the production numbers behind them.
Read · 8 min
Architecture
LetMeCheckThatBot: a group-chat agent that remembers
LetMeCheckThatBot is a second production agent I run, a different shape from Penny: multi-user, conversational, real-time. It sits in a Telegram thread, wakes on a keyword, and does real work through a tool loop, a single cheap chat model with no fallback, and a dedicated vision model. The part that took the most work was the part nobody expects: a memory that silently ingests every artifact dropped in the chat and recalls it by meaning weeks later.
Read · 8 min
Architecture
Auto-decomposition, multi-model review, and quality gates
A real architecture pattern for handing a large task to one agent: let it decompose the work into a dependency graph, route the pieces across model families, and gate the output with a review pass run by a different model. The pattern that made a 2,102-record corpus job ship in one afternoon.
Read · 7 min
Architecture
Your agent needs routines, not just skills
Model, agent, orchestrator, framework, skill, workflow engine. Six words the field uses as one. They are not the same thing, and the confusion costs you architecture.
Read · 1 min
Architecture
Inside a Claude Code setup running 6,442 jobs: the completion gate
How I wired Claude Code to run as a job queue instead of a chat: a thin router, forked workers, injectable skills, and a completion gate that refuses to call work done after a single pass.
Read · 10 min
Architecture
Inside one production AI agent: routing and the failure log
What one production AI agent actually looks like after 5,252 jobs: multi-model routing, an explicit fallback chain, a durable SQLite job engine, and the five failure classes that account for the breakage. Part one of two.
Read · 8 min

Other topics

The harness thesis
Tooling reviews
Failure postmortems
Context and memory engineering
Start here