Our approach — Le Petit Renard

The autoresearch loop

We don't hand-write the groundrule. An agent runs an autoresearch loop: read what's been tried, propose a rule, test it on every night, score it, keep it only if it beats the best — then go again, on its own. It's the agentic setup Andrej Karpathy points to as where AI is heading.

Two roles, split clean. The agent owns the loop and runs it autonomously. We supervise from outside: set the levers — patience, model, library — curate the inspiration, and decide when to pivot or start a fresh run.

Each round keeps only what beats the best so far, and builds the next groundrule on it.

Watching the search

A search you can't see is one you can't steer. Three live views show what the agent is doing — so you know when to let a run breathe and when to pivot, swap models, or change the library.

The live dashboard

The control room. Every candidate the agent proposes lands here the moment it's scored — newest on top — beside its total wait and whether it beat the best so far. The chart tells the same story: scores falling round by round, the running best stepping toward the target.

It's where you watch a run breathe — and catch the moment it stalls.

Open the dashboard →

The candidate grid

One service night, every candidate's schedule laid side by side — then stacked into a single overlay. Where the agent keeps landing on the same move, the bars pile into a column and the cluster jumps out; where it's still casting about, they scatter.

A read on what the search has settled on, at a glance.

Open the grid →

The experiment lineage

No candidate appears from nowhere. Each one builds on a parent — and the lineage draws the whole search as a family tree: the lines that refined step by step, the pivots that struck out in a new direction, and the single path that reached the champion.

The search's entire history, in one picture.

Open the lineage →

Nifty research strategies

Pivoting on patience

Give the agent a patience — how many rounds it may go without a new best. Spend it, and it must pivot: drop the current line for a genuinely different groundrule. Our main guard against rabbit-holes.

Runs that build on runs

A search isn't one fixed run. Each new run inherits the champion — the best groundrule so far — and climbs from there, while you change the model, the patience, the library around it.

The library — inspiration, by hand

The library is an optional, human-written input — notes, framings, even a personality handed to the agent before it thinks. It doesn't fix the answer; it shapes the frame the search starts from.

Run it on your own hardware

The loop talks to any OpenAI-compatible endpoint, so the proposing model can run locally — LM Studio, or anything serving the same API. Point it at your own hardware and the search runs unmetered: no rate limits, no quota walls.

The training & evaluation scenarios

Each candidate faces a battery of service nights, every one leaning on a different station — so a winning groundrule has to be good all round, not just on the easy nights.

The House ServiceGrill

The doors open on a quiet Tuesday and the room fills gently. Two tables order the moment they sit — a burger and fries, a steak and salad — and two more drift in over the next ten minutes. Easy tickets on their own, except two burgers and a steak are all bound for Priya's single grill.

The Early RushFryer & cold

The first Seaport crowd piles in off the bridge wanting something quick before the show — baskets of fries, bowls of the day's soup, a salad to share. Light on the grill, but the fryer never stops and Marco's range is three pots deep, every ticket on a short fuse.

The Grill JamGrill

A small table that hits the grill hard. Two steaks and two burgers land almost on top of each other — four cuts of meat queued at a single fire while the salad and fries wait on the cold side. Mis-order Priya's grill and every plate lands late.

The Range RunRange

A cold, wet night and the whole room wants comfort: pasta after pasta, the day's soup by the bowl. Marco's single range is the whole game — five pots deep with barely a burner to spare — while the odd salad and basket of fries slip by on the side.

The Late SeatingBoth walls

The last tables of the night sit down late and still want to eat fast. Grill and range are both buried — steaks and burgers stacked on one side, pasta and soup on the other — and every clock is short. Both bottlenecks bite at once; there's no slack left to hide a bad call.

The Bar RushFryer

A wave of bar tables off the late train, all wanting something quick with a drink — basket after basket of fries, the odd snack to share. The grill barely fires, but the fryer never rests and every plate funnels through the single pass.

Three more nights stay sealed. The kitchen never trains on them, so they're the exam that proves a groundrule generalised rather than memorised.

The Unseen CoverMixed

A held-out Thursday the kitchen never rehearsed on: five varied tables, a mix of grill, range and cold orders trickling in over ten minutes, with one heavy three-dish ticket buried in the middle. The true test of whether a method generalises or merely memorised the nights it trained on.

The Saturday CrushGrill

Saturday at full tilt, the long five-till-midnight window at its peak. A packed room all wanting red meat — steaks and burgers stacked on nearly every ticket with the tightest deadlines of the week. The grill is swamped, and any cook who ignores the bottleneck buries the whole service.

The Sunday GravyRange

A held-out Sunday the kitchen never rehearsed on: a slow, full room that all wants comfort — pasta and soup, ticket after ticket. Marco's range is buried while the grill stays cold, the mirror image of Saturday's crush, and the test of whether a method protects whichever wall is the bottleneck.

Start on the research dashboard to watch the next groundrule get found, or revisit the problem it's solving.