Hacker News story: 6 Practices that turned AI from prototyper to workhorse (106 PRs in 14 days)

1. Specs and plans are source code: Specs and plans live in git alongside source code, not in chat history. A new agent reads arch.md for the big picture, then its specific spec. You always know why something was built.

2. Three models review every phase: Claude, Gemini, and Codex catch almost entirely different bugs. No single model found more than 55% of issues. If you only review with the model that wrote the code, you're missing half the bugs. 20 bugs were caught before shipping: Claude Code found 5, and Gemini and Codex caught another 15, including a severe security issue Claude missed.

3. Enforce the process, don't suggest it: A state machine forces Spec → Plan → Implement → Review → PR. The AI can't skip steps. Tests must pass before advancing. AIs don't stick to the plan by themselves; you need rails.

4. Annotate, don't edit: Most of the work is writing specs and reviews that guide the code, not hacking at files in an open-ended chat.

5. Agents coordinate agents: An architect agent spawns builder agents into isolated git worktrees. You direct the architect; it directs the builders. They message each other asynchronously.

6. Manage the whole lifecycle: Most AI tools help you write code faster, which is maybe 30% of the job. The other 70% is planning, reviewing, integrating, writing deployment scripts, and managing staging vs. prod. Have AI run the whole pipeline from spec to PR and beyond.

Overall result: One engineer can produce what a team of 3-4 would usually do. Code quality measured 1.2 points better on a 10-point scale vs. Claude Code. Downsides: it takes a lot longer and uses many more tokens, but the cost is still reasonable at $1.60 per PR.

We open sourced it: https://ift.tt/FitRSKW
More details and raw results: https://ift.tt/8WLapm3

1 comment on Hacker News.
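
To make practice 3 concrete, here is a minimal sketch of a phase gate that only moves forward one step at a time and refuses to advance when tests fail. It is an illustration under assumptions, not the open-sourced tool's implementation: the names Phase, Workflow, and advance are hypothetical, and the real project may model its phases and checks differently.

```python
from enum import Enum


class Phase(Enum):
    # Phases in the order the post lists them.
    SPEC = 1
    PLAN = 2
    IMPLEMENT = 3
    REVIEW = 4
    PR = 5


class Workflow:
    """Hypothetical gate: one step forward at a time, only if tests pass."""

    def __init__(self) -> None:
        self.phase = Phase.SPEC

    def advance(self, tests_passed: bool) -> Phase:
        # There is no way to name a target phase, so steps cannot be skipped.
        if self.phase is Phase.PR:
            raise RuntimeError("already at the final phase")
        if not tests_passed:
            raise RuntimeError(
                f"tests must pass before leaving {self.phase.name}"
            )
        self.phase = Phase(self.phase.value + 1)
        return self.phase
```

The design choice worth noting is that the caller never chooses the next phase; the only available transition is "next", and it is guarded by the test result.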
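
The "no single model found more than 55% of issues" figure in practice 2 is a per-reviewer share of the combined findings. A rough sketch of that calculation follows, assuming each reviewer's findings can be reduced to a set of issue identifiers. The function name review_coverage and the identifiers used in the usage note are made up for illustration; they are not the post's raw results.

```python
def review_coverage(findings: dict[str, set[str]]) -> dict[str, float]:
    """Return each reviewer's share of the union of all reported issues."""
    all_issues: set[str] = set().union(*findings.values())
    if not all_issues:
        # Nothing was reported by anyone, so every share is zero.
        return {name: 0.0 for name in findings}
    return {
        name: len(found) / len(all_issues)
        for name, found in findings.items()
    }
```

With invented findings such as {"claude": {"A", "B"}, "gemini": {"B", "C"}, "codex": {"D"}}, the union has four issues, so the shares are 0.5, 0.5, and 0.25. Any reviewer used alone would miss at least half of what the three found together.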
