the lab
Systems I run
in production.
Two production agents, running every day on my own infrastructure, plus the experiments I keep alive. The same engineering my private work turns on, in the open: I break things on purpose, publish what holds up, and the numbers are there for anyone to check. Each system here is the current build of a line of agents I have run and rebuilt over four years.
Production agents
Both run daily
An autonomous operator on a home server. One agent in a loop, calling many tools, with a workflow engine as the biggest tool. The hard part is the architecture: single-agent over a fleet, model routing by name not cascade, retrieval instead of a context manager, and the eval gate that decides when work is actually done.
single agent, Claude, n8n, Postgres, LanceDB
How it works
Running daily since 2022
LetMeCheckThatBot
A multi-user agent that turns a Telegram group chat into the interface. Say “robot” and it fact-checks, researches, reads the link and the voice note, and answers in the thread, so nobody leaves the room. Nineteen tools, one inexpensive model with no fallback, memory that recalls by meaning. Lifetime model spend: $4.35. The hard part was not the features. It was keeping the model in character, in budget, and online.
multi-user, Telegram, OpenRouter, Whisper, SQLite
How it works
Experiments
Smaller systems I run
The smaller things I keep running, each one a single idea pushed far enough to use in production.
Sunset projects
No longer running
Standalone products I designed, built, and shipped end to end, on my own. Each was a bet taken far enough to ship, learn from, and retire.
An AI that watches short-form video (Shorts, TikTok, Reels) frame by frame and returns a growth and virality report.
video AI, multimodal
Write-up
ScooScoo.homes
AI home-portrait e-commerce: an end-to-end pipeline that turns a photo of any house into stylized art, ready to print and ship.
SDXL, Shopify
Write-up
vLLM Video Intelligence
A video engine that tiles frames into contact sheets so a vision model reads eight at once. It cut the cost of reading a video by about 80 percent.
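The contact-sheet idea can be sketched as a layout step: group sampled frames eight to a sheet and compute where each tile gets pasted. This is a minimal illustration, not the engine's actual code. The 4x2 grid, the 320x180 tile size, and the names `Tile` and `contact_sheets` are assumptions; the description above only says eight frames per sheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    frame_index: int  # position of the frame in the sampled sequence
    x: int            # left edge of the tile on the sheet, in pixels
    y: int            # top edge of the tile on the sheet, in pixels


def contact_sheets(n_frames, cols=4, rows=2, tile_w=320, tile_h=180):
    """Group frame indices into sheets and compute each tile's paste offset.

    Grid shape and tile size are hypothetical defaults (4 x 2 = eight per sheet).
    """
    per_sheet = cols * rows
    sheets = []
    for start in range(0, n_frames, per_sheet):
        end = min(start + per_sheet, n_frames)
        tiles = []
        for slot, idx in enumerate(range(start, end)):
            # Fill the grid left to right, top to bottom.
            row, col = divmod(slot, cols)
            tiles.append(Tile(idx, col * tile_w, row * tile_h))
        sheets.append(tiles)
    return sheets
```

With these defaults, 20 sampled frames become three sheets (eight, eight, and four tiles), so the vision model receives three images rather than twenty.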
video AI, contact sheets
Write-up
Everything-Bot
A Telegram fact-checker that ran entirely on AWS serverless, so it cost nothing when the chat was quiet. The ancestor of LetMeCheckThatBot.
serverless, Telegram
Write-up
AgentsThatEmail
Forward an email thread and a serverless bot replies to everyone with the summary. No app, no account: the inbox is the interface.
serverless, email
Write-up
aiPDF Engine
Say "clean this scanned packet" and a planner turns it into a chain of retryable PDF jobs across specialized workers.
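A chain of retryable jobs can be sketched roughly as below. This is an illustration under stated assumptions, not the engine's implementation: `run_chain`, the retry limit, and the linear backoff are hypothetical, and the planner and the distributed workers are left out entirely.

```python
import time


def run_chain(jobs, payload, max_attempts=3, backoff_s=0.0):
    """Run (name, step) jobs in order, retrying each failed step.

    Each step takes the previous step's output. Returns the final payload
    and a log of (job name, attempts used). All names are hypothetical.
    """
    log = []
    for name, step in jobs:
        for attempt in range(1, max_attempts + 1):
            try:
                payload = step(payload)
                log.append((name, attempt))
                break
            except Exception:
                # Give up on the whole chain once a step exhausts its attempts.
                if attempt == max_attempts:
                    raise
                time.sleep(backoff_s * attempt)
    return payload, log
```

A planner would emit the `jobs` list, for example a deskew step followed by an OCR step. Because each step is retried on its own, a transient failure in one job does not restart the jobs before it.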
serverless, FastAPI
Write-up
PagePurger
AI record review for medical-legal cases: it pulls the duplicate and irrelevant pages out of eight-thousand-page stacks before anyone bills to read them.
document AI, AWS Batch
Write-up
WatchPost
A computer-vision layer over a doorbell feed that logs who came by, so you stop scrubbing hours of footage to find one visit.
computer vision, YOLOv8
Write-up
Running a system in front of real consequences is the test that matters. Everything in the writing started as something that broke in here.