Refactor in place vs rewrite from scratch: AI-generated codebases

TL;DR. Rewrites from scratch fail more often than they succeed, and refactoring an AI-generated codebase in place is almost always the right call. The exceptions are narrow: when a single load-bearing wrong decision is doing all the damage, replacing that one thing is cheaper than untangling everything around it. Even then, replace in place rather than starting a new repository.

You inherited or shipped a codebase generated heavily with Cursor, Claude Code, Lovable or Bolt. It works in the happy path but breaks on the edges. Every new feature breaks something unrelated. Your team's velocity is dropping. The internal debate is "do we keep ploughing through this" versus "let's just rewrite the whole thing from scratch." The instinct is rewrite. The data says refactor.

Side by side

Criterion	Refactor in place	Rewrite from scratch
Time to "back to shipping features"	Two to six weeks once the audit is done.	Three to twelve months. The team ships nothing during that window.
Risk of reproducing the same bugs	Low — the bugs are visible in the current code; you can fix them as you refactor.	High — same domain assumptions, often the same prompt patterns, often the same engineers under deadline pressure.
Business continuity	Maintained. Feature delivery slows during the rescue but does not stop.	Frozen. New features wait until the new codebase reaches feature parity, which is by definition months away.
Cost surface	Bounded by the audit estimate, with re-baselining if scope is wrong.	Open-ended. Rewrites famously take two to three times their original estimate.
Knowledge of edge cases	Preserved — it lives in the existing code's branches, tests and bug-fix history.	Lost. Every edge case must be rediscovered, usually by your users.
Team morale	Improves visibly as the code stops fighting back week by week.	Initial euphoria, then "new codebase, same problems" cynicism by month four.
Outcome dependence on senior staffing	Moderate. A senior lead is needed; the rest of the team can be a normal mix.	Critical. One wrong architectural call early in a rewrite is fatal and only senior engineers catch it.
Recoverable if it fails	Yes — you can resume work mid-refactor, fall back to the previous state, or change strategy.	No. A failed rewrite means the original is six months staler and the rewrite is unfinished. Both are now liabilities.
Industry track record	Boring, succeeds. The default approach across the industry for two decades.	Cinematic, fails. Joel Spolsky wrote the canonical "things you should never do" essay about exactly this in 2000, and the data has not improved since.

When refactoring wins

Refactor in place wins when the architecture has the right shape but the wrong execution. Most AI-generated codebases look like this: the chosen framework is sensible, the data model approximately matches the domain, the file layout is recognisable — but the implementation has duplicated logic in three slightly different shapes, stringly-typed interfaces between layers, side effects leaking into pure functions, and tests that are coupled to implementation details rather than behaviour. None of that requires a new codebase. It requires focused refactor work.
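As a minimal sketch of what that focused refactor work can look like, the example below takes two of the problems named above (a stringly-typed interface and side effects mixed into logic) and separates them. Everything in it is hypothetical and for illustration only: the `Plan` values, the `Invoice` shape, the ten percent discount and the `send_to_gateway` callback are assumptions, not taken from any real codebase.

```python
from dataclasses import dataclass
from enum import Enum


class Plan(Enum):
    """Replaces a free-form 'plan' string with a closed set of values."""
    FREE = "free"
    PRO = "pro"


@dataclass(frozen=True)
class Invoice:
    plan: Plan
    subtotal_cents: int


def discount_cents(invoice: Invoice) -> int:
    """Pure logic: no I/O, so tests can assert on behaviour alone."""
    if invoice.plan is Plan.PRO:
        return invoice.subtotal_cents * 10 // 100  # hypothetical 10% discount
    return 0


def total_cents(invoice: Invoice) -> int:
    """Pure logic: subtotal minus discount."""
    return invoice.subtotal_cents - discount_cents(invoice)


def charge(invoice: Invoice, send_to_gateway) -> int:
    """Side-effect boundary: the only function that touches the outside world."""
    amount = total_cents(invoice)
    send_to_gateway(amount)  # hypothetical callback standing in for a payment client
    return amount
```

With this split, a misspelled plan fails at the boundary (`Plan("enterprise")` raises `ValueError`) instead of silently falling through a string comparison, and the pure functions can be tested without mocking a payment client.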

Refactor wins when the business cannot freeze feature delivery for the months a rewrite needs. A working-but-fragile codebase shipping at half-speed still ships. A rewrite ships nothing until it reaches parity with what the previous codebase already does, and "parity" is a moving target if features keep being added in the meantime.
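The moving-target problem can be made concrete with a back-of-the-envelope model. This is a deliberately simple sketch under stated assumptions: features are treated as equal-sized units, both rates are constant, and all three parameter names are hypothetical.

```python
import math


def months_to_parity(existing_features: int,
                     rewrite_rate: float,
                     new_feature_rate: float) -> float:
    """Estimate months until a rewrite catches up with the original.

    existing_features: features the original already has on day one.
    rewrite_rate: features the rewrite team rebuilds per month.
    new_feature_rate: features still being added to the original per month.
    """
    closing_rate = rewrite_rate - new_feature_rate
    if closing_rate <= 0:
        # The target moves at least as fast as the rewrite: parity never arrives.
        return math.inf
    return existing_features / closing_rate
```

Under these assumptions, 60 existing features at 10 rebuilt per month is 6 months with a full freeze, 10 months if the original keeps gaining 4 features a month, and never if the two rates are equal.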

Refactor wins on cost. The audit puts a defensible upper bound on the work; sprint-level milestones make slippage visible early; if the plan changes, you absorb that change against work you have already gained, not work you would still have to do from scratch.

And refactor wins on edge cases. Every bug fix in the original repository encoded knowledge — a regulation, a customer requirement, a corner of the data — that nobody currently remembers but the code still respects. A rewrite throws that corpus away and rediscovers it via production incidents over the following year.
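One common way to keep that knowledge during a refactor is a characterization test: record what the code does today, then require every refactored version to match. The sketch below assumes a hypothetical `legacy_shipping_fee` function; its branches and the reasons in the comments are invented for illustration.

```python
def legacy_shipping_fee(country: str, weight_kg: float) -> int:
    """Stand-in for inherited code; each branch is assumed to encode a past bug fix."""
    if country == "NO":      # hypothetical: flat fee covering customs handling
        return 1500
    if weight_kg == 0:       # hypothetical: digital goods ship for free
        return 0
    return 500 + int(weight_kg * 100)


# Characterization cases: recorded from what the code does today,
# not from what anyone believes it should do.
PINNED_CASES = [
    (("DE", 2.0), 700),
    (("DE", 0), 0),
    (("NO", 0), 1500),
]


def check_pinned(fee_fn) -> list:
    """Return (args, expected, actual) for every case where fee_fn disagrees."""
    mismatches = []
    for args, expected in PINNED_CASES:
        actual = fee_fn(*args)
        if actual != expected:
            mismatches.append((args, expected, actual))
    return mismatches
```

A from-scratch version that only implements the obvious formula, such as `lambda country, weight_kg: 500 + int(weight_kg * 100)`, fails two of the three pinned cases. Those are exactly the edge cases a rewrite would otherwise rediscover in production.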

When rewrite wins (rarely)

Rewrite is the right call in a narrow set of cases. The cleanest signal: a single load-bearing wrong decision is doing all the damage, and replacing it in place would require rewriting most of the calling code anyway. The wrong database choice for the workload. The wrong rendering model for the user pattern. The wrong language for the team you can hire. In those cases, the cost of replacing the one thing approximates the cost of starting over — and starting over removes some of the dead weight that built up around the wrong call.

Rewrite is on the table when the codebase is small. Under roughly twenty thousand lines of code, where most of the bulk is AI-generated boilerplate rather than earned domain logic, the math for a rewrite is less catastrophic. Above that line, every extra thousand lines makes the rewrite economics worse.

Rewrite is feasible when the user-facing behaviour is well documented separately — in specs, mocks, screenshots, customer support tickets — so the rediscovery cost is bounded. Without that documentation, the rewrite team will recreate everything from the existing code anyway, slower.

And rewrite is feasible when you have six or more months of business runway you can afford to spend with no shipped features. That is a real cost; teams routinely underestimate it. If your runway does not include it, you do not have the option you think you have.

The honest middle: surgical replacement

The pattern that beats both pure refactor and pure rewrite: identify the one or two load-bearing wrong decisions, replace those specific modules in place (behind feature flags), and refactor the rest of the codebase around them. The repository never restarts. The business keeps shipping. Risk is bounded by sprint-level milestones. Cost is bounded by the audit estimate.
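Replacing a module in place behind a feature flag can be sketched roughly as follows. This is an illustration, not a description of any particular flag service: the in-memory `FLAGS` dict, the `use_new_search` flag name and both search functions are hypothetical, and falling back to the old path on error is one possible design choice among several.

```python
from typing import Callable, Dict

# Hypothetical in-memory flag store; a real project would read this
# from its own configuration or flag service.
FLAGS: Dict[str, bool] = {"use_new_search": False}


def legacy_search(query: str) -> list:
    """Stand-in for the existing module that stays live during the rescue."""
    return [f"legacy:{query}"]


def new_search(query: str) -> list:
    """Stand-in for the replacement module, built in the same repository."""
    return [f"new:{query}"]


def flagged(flag: str, new_impl: Callable, old_impl: Callable) -> Callable:
    """Route calls to new_impl when the flag is on, else to old_impl."""
    def dispatch(*args, **kwargs):
        if FLAGS.get(flag, False):
            try:
                return new_impl(*args, **kwargs)
            except Exception:
                # Assumed policy: fall back to the known path rather than fail.
                return old_impl(*args, **kwargs)
        return old_impl(*args, **kwargs)
    return dispatch


# Callers import `search` and never learn which implementation ran.
search = flagged("use_new_search", new_search, legacy_search)
```

Because the flag is read on every call, setting `FLAGS["use_new_search"] = True` switches traffic to the new module without a deploy, and setting it back to `False` is the rollback.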

This is what a Bitnoise vibe-code-rescue engagement looks like in practice. The audit identifies the twenty percent of the codebase doing eighty percent of the damage; that twenty percent gets surgically replaced in production behind flags; the rest of the codebase gets refactored where it touches the new modules. Feature delivery continues on a hardened path that we set up in the first week of the engagement.

The version that does not work: declaring "we're rewriting the whole thing" without an audit, without the senior staffing the rewrite needs, and without a feature freeze the business agreed to. That is the failure mode that the Spolsky essay warned about in 2000 and that mid-stage companies are still walking into in 2026.

Decision helper

Architecture has the right shape but the wrong execution. — Refactor in place. This is the majority case for AI-generated codebases.
Codebase is over twenty thousand lines. — Refactor regardless. Rewrite economics get worse with size, not better.
Business cannot freeze feature delivery for months. — Refactor. A rewrite needs that freeze whether you admit it or not.
One critical module is doing all the damage. — Surgical replacement of that module in place, not a full rewrite.
Codebase is under twenty thousand lines and most of it is generated boilerplate. — Rewrite is on the table, but still investigate the surgical alternative first.
You already started a rewrite and it is stalling. — Stop the rewrite. Resurrect the original and refactor that. The rewrite write-off is a sunk cost.
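
The rules above can be written down as a small function, which makes the precedence between them explicit. This is a sketch: the field names are hypothetical, and the order in which the rules are checked is an assumption, since the list itself does not rank them.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    lines_of_code: int
    mostly_boilerplate: bool
    can_freeze_features: bool
    single_module_doing_damage: bool
    rewrite_already_stalling: bool = False


def recommend(s: Situation) -> str:
    """Apply the decision-helper rules in an assumed order of precedence."""
    if s.rewrite_already_stalling:
        return "stop the rewrite; refactor the original"
    if s.single_module_doing_damage:
        return "surgical replacement in place"
    if s.lines_of_code > 20_000 or not s.can_freeze_features:
        return "refactor in place"
    if s.mostly_boilerplate:
        return "rewrite is on the table; check surgical replacement first"
    return "refactor in place"  # the majority case
```

For example, `recommend(Situation(50_000, False, False, False))` returns `"refactor in place"`, while an 8,000-line codebase that is mostly boilerplate, with a business that can freeze features, is the only input that puts a rewrite on the table.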

Frequently asked questions

What is the industry rewrite failure rate, really?
Hard to measure precisely because failed rewrites rarely get blogged about, but the directional answer is grim: case studies from Netscape, Borland, Lotus, MySpace, Digg, FriendFeed and a long tail of mid-stage companies all describe rewrites that took two to three times longer than projected, reproduced the original bugs, lost edge-case behaviour and damaged the business in the process. The Joel Spolsky essay "Things You Should Never Do, Part I" from 2000 named the pattern; nothing in the twenty-five years since has changed the fundamentals. Refactor in place wins on the math.
Can you do an audit before we commit to either path?
Yes. The first phase of every vibe-code-rescue engagement is a fixed-price AI Technical Audit — usually three to seven business days — covering the architecture, code quality, infrastructure, scalability bottlenecks, security posture and cost surface. The deliverable is a written assessment plus a specific recommendation: refactor in place, surgical replacement of named modules, or (rarely) a structured rewrite. You can stop at the end of the audit with no commitment to further work; the audit is useful even if you take a different agency forward.
What if we already started a rewrite and it is failing?
Common situation, and the right move is almost always to stop the rewrite and resurrect the original. The sunk cost on the rewrite is gone either way; the question is whether you keep burning months on it or cut it loose. We will audit both codebases (the current production original and the in-flight rewrite), identify what was good in the rewrite that can be back-ported to the original, and propose a refactor plan on the original. Most teams find this conversation uncomfortable for a week and validating in retrospect.
Does the refactor approach actually fix the issues that caused the original code to fail?
Yes, when done deliberately. The problems with AI-generated codebases are usually structural in a specific way: missing type discipline, duplicated logic in slightly different shapes, no clear boundary between side effects and pure logic, sparse or coupled tests, and an architectural pattern that was never explicitly chosen. None of those require a rewrite — they require focused refactor work with a senior lead who recognises the patterns. The output is a codebase your team can extend without each new feature breaking three others.
How do you decide which modules to surgically replace?
Three signals point to a module being a candidate for replacement rather than refactor: the module sits on a load-bearing wrong decision (database choice, rendering model, language), most of the bugs in the issue tracker trace back to it, and refactoring it touches so much surrounding code that replacement is cheaper. Modules that show two or three of these signals get scoped for replacement in the audit. Modules that show only one are almost always cheaper to refactor. The audit produces the named list; you decide which ones get budget.
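
The two-of-three rule in the last answer is simple enough to sketch in code. The class and field names below are hypothetical, and in practice each flag would come from the audit's judgment rather than from an automated check.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ModuleSignals:
    name: str
    on_wrong_load_bearing_decision: bool
    most_bugs_trace_here: bool
    refactor_costlier_than_replace: bool

    def count(self) -> int:
        """Number of replacement signals this module shows (0 to 3)."""
        return sum([
            self.on_wrong_load_bearing_decision,
            self.most_bugs_trace_here,
            self.refactor_costlier_than_replace,
        ])


def replacement_candidates(modules: Iterable[ModuleSignals]) -> List[str]:
    """Names of modules showing two or three signals; the rest get refactored."""
    return [m.name for m in modules if m.count() >= 2]
```

Given a hypothetical `billing` module showing all three signals and a `search` module showing only one, `replacement_candidates` returns `["billing"]`; `search` stays on the refactor list.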
