Opinion

When to Rewrite Software: The Honest Answer

When to rewrite software: the specific technical and business conditions that justify a full rewrite, and the warning signs that a rewrite is the wrong choice.

[Figure: checklist of signals that indicate when a full software rewrite may be justified]

Whether to rewrite software is one of the most consequential decisions an engineering leader makes. The honest answer to "when?" is: rarely, and almost never in the way teams initially propose. This article covers the specific conditions that actually justify a rewrite, the signals teams mistake for justifications, and what a responsible rewrite process requires even when the decision is correct.

Why the Question Is Harder Than It Looks

The question “should we rewrite this?” usually arrives after a period of frustration. Releases are slow. Bugs are frequent. Engineers complain that the code is impossible to work with. The intuitive response is to start fresh, because starting fresh feels like an escape from the accumulated problems.

The problem is that the frustration is real but the diagnosis is often wrong. Slow releases, frequent bugs, and difficult code are symptoms. They can have several causes. The cause might be architectural: the system’s fundamental design cannot support the required capabilities. It might be code quality: the architecture is fine but the implementation is poor. It might be process: the deployment pipeline is broken, testing is inadequate, or team coordination is the bottleneck. Or it might be organizational: the team does not have the skills or capacity to maintain the system effectively.

A rewrite addresses only the first cause. It does not fix poor implementation habits, a broken process, or a team that lacks the skills or capacity to maintain the system. Teams that rewrite without addressing the underlying cause tend to discover that the new system, built by the same team with the same practices, accumulates the same problems, often within 18 months.

The software due diligence process is designed to distinguish between these causes before a decision is made. The correct intervention depends on the correct diagnosis.

The Signals That Actually Justify a Rewrite

There are specific, verifiable conditions under which a rewrite is genuinely the right call.

The runtime is end-of-life with no upgrade path. The language version or framework is no longer receiving security patches, and upgrading would require changes to the entire codebase anyway. If the upgrade cost equals or exceeds the rewrite cost, the rewrite offers additional benefits: a cleaner architecture, a modern toolchain, and no legacy constraints.

The architectural constraint is fundamental and cannot be addressed incrementally. The system is single-tenant and must become multi-tenant. The data model encodes assumptions that are incompatible with required new behavior, and changing those assumptions would require modifying every module. Incremental migration cannot bridge this gap because there is no valid intermediate state.

The system is genuinely small and the behavioral specification is complete. A system with 20,000 lines of code that handles a well-defined problem with complete documentation and test coverage can be rewritten safely. A system with 500,000 lines of code with undocumented behavior and no tests cannot.

The codebase is genuinely unmaintainable due to technology constraints, not code quality. The language has no static analysis, no type system, no modern tooling. Every change is a manual investigation across hundreds of files with no automated safety net. Refactoring is not possible because the tools required to do it safely do not exist for this technology.

These conditions are specific and verifiable. They are not “the code is messy” or “engineers hate working with it” or “we could build it better now.” Those are real problems, but they do not justify a rewrite.

The Signals That Do Not Justify a Rewrite

The most common false justifications for rewrites, and what they actually indicate:

“The code is impossible to understand.” This indicates missing documentation, poor naming, and inadequate code review standards. It does not indicate that the architecture is fundamentally broken. Refactoring with improved documentation and naming standards addresses this.

“Every change breaks something else.” This indicates tight coupling and inadequate test coverage. These are code quality problems, not architectural incompatibilities. Introducing seams, writing characterization tests, and refactoring the most coupled modules address this.
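As a minimal sketch of what a seam can look like in practice, the Python example below passes a dependency in as a parameter so that a test can substitute it. The function `invoice_total`, the `tax_lookup` parameter, and the tax rates are all hypothetical names and values invented for illustration; they do not come from any particular system.

```python
from typing import Callable


def invoice_total(
    line_items: list[tuple[str, int, int]],  # (sku, quantity, unit price in cents)
    region: str,
    tax_lookup: Callable[[str], float],  # the seam: previously a hard-wired service call
) -> int:
    """Hypothetical legacy-style calculation, returning a total in cents."""
    subtotal = sum(qty * unit_cents for _sku, qty, unit_cents in line_items)
    tax = subtotal * tax_lookup(region)
    return subtotal + int(tax)  # int() truncates fractional cents


def fake_tax_lookup(region: str) -> float:
    # Test double injected through the seam; the rates are made up.
    return {"EU": 0.20, "US": 0.07}.get(region, 0.0)


def test_characterize_invoice_total() -> None:
    items = [("A-1", 3, 333), ("B-2", 1, 50)]
    # Expected values are recorded from the current code, not derived from a spec.
    assert invoice_total(items, "EU", fake_tax_lookup) == 1258
    assert invoice_total(items, "US", fake_tax_lookup) == 1122
    assert invoice_total(items, "XX", fake_tax_lookup) == 1049
```

Once the dependency can be swapped out, the module can be tested in isolation, and refactoring the coupled code behind it becomes far less risky.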

“We could build it better now.” This is almost certainly true. It is also true that the new system, built with current knowledge, will have its own problems that future engineers will complain about. The question is not whether a rewrite would produce a better system initially. It is whether the rewrite is worth the cost and risk.

“The technology is old.” Technology age is not a quality indicator. A Rails 4 application with good test coverage and clean architecture is easier to work with than a Go microservices system with poor design. Modernizing the technology stack is valuable, but it can usually be done incrementally and does not by itself require a full rewrite.

“Morale is low because the codebase is bad.” Morale problems have organizational causes, not just technical ones. A rewrite undertaken to improve morale will improve morale during the rewrite (new code is fun to write) and often lower morale post-rewrite when the new system has its own problems and the team realizes the fundamental issues were not technical.

Why Software Rewrites Fail and How to Reduce the Risk

When a rewrite is genuinely justified, the risk is still substantial. The failure modes are predictable.

Scope inflation happens when the rewrite is used as an opportunity to redesign the system. Resist this. The goal of a rewrite is to reproduce the current system’s behavior in a better implementation, not to build a better system. Feature additions and architectural improvements that go beyond what is strictly required extend the timeline and increase the risk that the rewrite never reaches parity with the system it replaces.

The specification problem is unavoidable. The legacy system’s behavior is more complex than it appears. Plan for this. Before writing a line of new code, write characterization tests on the legacy system that define its behavior. These tests are the acceptance criteria for the new system. Any behavior not covered by characterization tests is at risk of being reproduced incorrectly.
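One way to turn recorded behavior into acceptance criteria is a "golden" file captured from the legacy system and replayed against the new one. The sketch below is illustrative only: `record_golden`, `check_against_golden`, and the two `parse_quantity` functions are hypothetical names, and it assumes inputs and outputs are JSON-serializable.

```python
import json
from typing import Any, Callable


def _observe(func: Callable[[Any], Any], value: Any) -> dict[str, Any]:
    """Capture either the return value or the type of exception raised."""
    try:
        return {"result": func(value)}
    except Exception as exc:  # error behavior is part of the behavior to preserve
        return {"raises": type(exc).__name__}


def record_golden(legacy: Callable[[Any], Any], inputs: list[Any]) -> str:
    """Run the legacy implementation and serialize what it actually does."""
    cases = [{"input": v, "outcome": _observe(legacy, v)} for v in inputs]
    return json.dumps(cases, sort_keys=True)


def check_against_golden(candidate: Callable[[Any], Any], golden: str) -> list[Any]:
    """Return the inputs on which the candidate deviates from the recording."""
    return [
        case["input"]
        for case in json.loads(golden)
        if _observe(candidate, case["input"]) != case["outcome"]
    ]


def legacy_parse_quantity(text: str) -> int:
    text = text.strip()
    if text == "":
        return 0  # undocumented quirk: blank means zero
    return int(text)


def new_parse_quantity(text: str) -> int:
    return int(text)  # looks equivalent, but raises ValueError on blank input


golden = record_golden(legacy_parse_quantity, ["3", " 7 ", "", "x"])
print(check_against_golden(new_parse_quantity, golden))  # prints ['']
```

The blank-input quirk is exactly the kind of behavior that no specification mentions and that a rewrite silently drops unless a recorded case pins it down.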

The parallel operation requirement is non-negotiable. The new system should run in shadow mode alongside the legacy system before any traffic is switched. Compare outputs. Identify differences. Fix them. The shadow period should run until the error rate in the new system matches or beats the legacy system’s error rate for the same inputs.
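A shadow comparison can be sketched in a few lines of Python. The `ShadowRunner` class below is a hypothetical illustration, not a reference implementation. It assumes both systems can be called as plain functions, that their results can be compared with `!=`, and that the new system has no side effects on shared state (or writes only to an isolated store).

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ShadowRunner:
    legacy: Callable[[Any], Any]
    candidate: Callable[[Any], Any]
    total: int = 0
    mismatches: list[tuple[Any, Any, Any]] = field(default_factory=list)

    def handle(self, request: Any) -> Any:
        legacy_result = self.legacy(request)  # legacy remains the source of truth
        self.total += 1
        try:
            candidate_result = self.candidate(request)
        except Exception as exc:  # a crash in the new system must not reach the caller
            candidate_result = exc
        if candidate_result != legacy_result:
            self.mismatches.append((request, legacy_result, candidate_result))
        return legacy_result  # callers only ever see the legacy answer

    @property
    def mismatch_rate(self) -> float:
        return len(self.mismatches) / self.total if self.total else 0.0


# Two implementations that agree except on negative odd numbers.
runner = ShadowRunner(legacy=lambda n: n // 2, candidate=lambda n: int(n / 2))
for request in [4, 7, -3, 10]:
    runner.handle(request)
print(runner.mismatch_rate)  # prints 0.25
print(runner.mismatches)     # prints [(-3, -2, -1)]
```

In a real deployment the candidate call would more likely run asynchronously so that it adds no latency to the legacy path. The synchronous version here only keeps the sketch short.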

Plan for incremental cutover, not a big-bang switch. Even if the rewrite is justified, the cutover should follow the strangler fig pattern: put a routing layer in front of both systems, send a small percentage of traffic to the new system, monitor, and expand. This applies even when the rewrite covers the entire system. The routing layer is temporary overhead, and it eliminates the highest-risk failure mode: the all-at-once switch.
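The routing decision itself can be very small. The following Python sketch shows one possible approach, assuming requests carry a stable identifier such as a user ID; the function name `routes_to_new_system` and its parameters are hypothetical. Hashing the identifier keeps each user on the same side of the split between requests, and raising the percentage only ever adds users to the new system.

```python
import hashlib


def routes_to_new_system(user_id: str, rollout_percent: int) -> bool:
    """Deterministically decide whether this user is served by the new system."""
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket in 0..99
    return bucket < rollout_percent


def handle(user_id: str, rollout_percent: int) -> str:
    # Stand-in for the routing layer: returns which backend would serve the request.
    return "new" if routes_to_new_system(user_id, rollout_percent) else "legacy"
```

Setting `rollout_percent` to 0 sends everyone back to the legacy system, which is what makes the rollback cheap if monitoring shows a problem.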

The Pre-Rewrite Checklist

Before committing to a rewrite, answer each of these questions specifically:

  1. What specific technical constraint makes refactoring impossible or more expensive than rewriting?
  2. Do you have characterization tests that cover the legacy system’s behavior in the areas being rewritten?
  3. Does the team have the capacity to maintain the legacy system while building the new one?
  4. Is there a plan for incremental cutover, or is this a big-bang replacement?
  5. Has the business accepted that feature velocity will drop during the rewrite?
  6. What is the plan if the rewrite takes twice as long as estimated?

If any of these questions does not have a specific answer, the rewrite is not ready to start. The questions are not obstacles. They are the minimum due diligence for a decision with potentially irreversible consequences.

Conclusion

When to rewrite software: when specific, verifiable technical constraints make incremental improvement impossible or more expensive than replacement. Not when the code is messy, not when engineers are frustrated, not when a new framework looks appealing. The rewrite checklist is not optional. Characterization tests, parallel operation, incremental cutover, and scope discipline are what separate successful rewrites from the many that fail to deliver their promised benefits.
