How to evaluate refactoring decisions? The 4W framework

Sun Jan 01 2023 • software-engineering

As an engineering or product manager, you've probably received your fair share of requests from software engineers to prioritize a "critical refactor". Depending on your level of experience, it might be trivial or very hard to evaluate a refactor proposal. The goal of this article is to give you a set of tools to help you evaluate such proposals without spending too much time on the technical details.

But first, some context

The problem with software engineering advice on the internet is that it often comes without context. What worked for a series A startup will not work for Google. The reverse is also often true: What works for Google might not work for the startup.

The context for this article is the environment in which I have spent the past decade building software: series A to series C companies with engineering teams of 5 to 200 people, mostly following some form of agile development, with multiple releases per month in the worst case.

There are two important properties of these types of companies that impact the urgency of refactors:

Not default alive: Essentially, the company has a non-zero probability of not existing in the next 2 years. This implies you should discount the potential benefits of a change if the payback is only expected years from now.
Small engineering teams: The implication being that in the average case, a refactor for the sake of productivity can improve the lives of only a handful of engineers (if the change is local to the team's code).
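
To make the first point concrete, here is a minimal sketch of such a discount. It assumes the company has the same chance of surviving each year, which is an illustrative simplification. The function names and all the numbers are made up for the example.

```python
def survival_weight(years_until_payoff: float, yearly_survival: float) -> float:
    """Probability that the company still exists when the payoff arrives.

    Assumes the same survival probability every year (a simplification).
    """
    if not 0.0 <= yearly_survival <= 1.0:
        raise ValueError("yearly_survival must be between 0 and 1")
    if years_until_payoff < 0:
        raise ValueError("years_until_payoff must not be negative")
    return yearly_survival ** years_until_payoff


def discounted_benefit(benefit_days: float, years_until_payoff: float,
                       yearly_survival: float) -> float:
    """Expected benefit in person-days, weighted by the survival probability."""
    return benefit_days * survival_weight(years_until_payoff, yearly_survival)


# Hypothetical numbers: 100 person-days saved, realised 2 years from now,
# with an assumed 80% chance of surviving each year.
print(round(discounted_benefit(100, 2, 0.8), 1))
```

With these made-up inputs, a promised saving of 100 person-days is worth roughly 64 once you account for the chance that the company is no longer around to collect it.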

The psychology of the refactor

Most developers, despite our appearances, are human beings. As such, we have psychological biases in our decision-making process. Like other human beings out there, we seek dopamine. The great thing about software development is that it's a reliable source: build a feature, get some dopamine. Fix a bug? More dopamine. This is a bit of an exaggeration, but I believe it to be not too far from the truth. At the end of the day we want to have accomplished something.

From time to time a novice practitioner of the dark arts of software engineering will look at some piece of code and think... This was implemented in a bad way. There is a better way of implementing this. "I propose we refactor this" - he will say.

There's one important aspect to notice about refactors: often they require no domain knowledge, no product knowledge, and no understanding of users' needs. If a developer is constantly proposing such changes, it might be a sign that they are not interested in the product or in solving real problems for real users. More generally, it might be a sign that they are struggling to provide value and are resorting to refactors as a way of accomplishing something.

Enough talk, give me the framework

Here are 4 questions you can use to evaluate the importance of a refactor. I call it the 4W framework (gotta make it catchy ;)

Who does the change benefit?
What does the change improve?
When are we expected to see the benefit?
What is the expected cost?

Let's unpack these questions.

Who does the change benefit?

The goal of the first question is to understand whether the change is expected to benefit customers or only internal users (e.g. programmers). Many useless refactors improve only the lives of programmers, and as mentioned before, this can often be as few as 2-5 people.

What does the change improve?

Whether it's the time to interactive (TTI) of the signup page or the readability of the checkout logic, a useful refactor should have a meaningful impact on some valuable metric. Try to get an answer that is as precise as possible, and be skeptical of generic answers like "readability" or "maintainability". It's not that these metrics are unimportant; it's that they are subjective and hard to connect with customer concerns.

When does the change deliver its payoff?

Good refactors have an immediate impact on useful metrics. Be particularly skeptical of refactors that have an unknown payoff date or that improve internal metrics only in the very long term.

For refactors that attempt to reduce tail risks, I like to ask "What will happen if we never implement this change?". Understanding the magnitude of the risk will help you prioritise accordingly.

What is the expected cost?

This is an oversimplification, but it serves to illustrate the point: if a change takes 10 days to implement and produces a 10% daily productivity improvement for one person, it will take 100 days to recover the 10 days of investment, or 50 days if the improvement affects 2 team members, and so on. For example, spending a week to reduce build times by 20% is a perfectly reasonable investment if your team has >10 developers.

These quantities are not always easy to estimate, but play around with the numbers to start forming intuitions.
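
If you want to play with the numbers, the arithmetic above fits in a few lines. This is only a sketch of the same oversimplified model: the function name and parameters are illustrative, and it assumes the productivity gain is constant and identical for everyone affected.

```python
def break_even_days(cost_days: float, daily_gain: float, people: int = 1) -> float:
    """Working days until the time saved equals the time invested.

    cost_days:  person-days spent implementing the change
    daily_gain: fraction of a working day saved per person (0.10 means 10%)
    people:     number of people who benefit from the change
    """
    if cost_days < 0:
        raise ValueError("cost_days must not be negative")
    saved_per_day = daily_gain * people
    if saved_per_day <= 0:
        # Nobody saves any time, so the investment is never recovered.
        return float("inf")
    return cost_days / saved_per_day


# The example from the text: 10 days of work for a 10% improvement.
print(round(break_even_days(10, 0.10)))            # one person benefits
print(round(break_even_days(10, 0.10, people=2)))  # two people benefit
```

The two calls reproduce the 100-day and 50-day figures from the paragraph above. Try changing the team size or the gain to see how quickly, or how slowly, a proposal pays for itself.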

A quick template for identifying bad refactors

If you start applying this framework you will notice a pattern emerges. Many changes take this form:

Who does the change benefit? Developers.
What does the change improve? Readability, code quality or productivity.
When are we expected to see the benefit? Mid to long term.
What is the expected magnitude of the benefit? Small.
What will happen if we never implement the change? Nothing impactful.

These are, in my experience, the changes you will almost always want to avoid. They're also a very common category of proposals from novice engineers. They often indicate that the developer is focusing only on the code and not enough on solving real problems for real users. This can be a sign of a deeper problem, either with the individual contributor or with the overall development process.
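
If it helps to see the template as a checklist, here is a rough sketch. The field names, the allowed values, and the list of vague answers are all assumptions made for this example. Treat it as a prompt for a conversation, not as an automated verdict.

```python
from dataclasses import dataclass

# Answers that are too generic to connect to a customer concern (illustrative list).
VAGUE_IMPROVEMENTS = {"readability", "code quality", "productivity", "maintainability"}


@dataclass
class RefactorProposal:
    beneficiaries: str     # "customers" or "developers"
    improves: str          # e.g. "signup page TTI" or "readability"
    payoff: str            # "immediate", "mid-term" or "long-term"
    magnitude: str         # "small", "medium" or "large"
    risk_if_skipped: str   # "none", "minor" or "major"


def matches_bad_refactor_template(p: RefactorProposal) -> bool:
    """True only when the proposal matches every line of the template above."""
    return (
        p.beneficiaries == "developers"
        and p.improves.lower() in VAGUE_IMPROVEMENTS
        and p.payoff in {"mid-term", "long-term"}
        and p.magnitude == "small"
        and p.risk_if_skipped == "none"
    )


# Two hypothetical proposals.
cleanup = RefactorProposal("developers", "Readability", "long-term", "small", "none")
faster_signup = RefactorProposal("customers", "signup page TTI", "immediate", "large", "minor")

print(matches_bad_refactor_template(cleanup))
print(matches_bad_refactor_template(faster_signup))
```

The first proposal matches every line of the template and is flagged. The second one benefits customers and names a concrete metric, so it is not.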

Parting thoughts

Our job as software engineers is not to write software. Our job is to solve real problems for real users. Many refactors improve the code, but don't solve any real problems. This framework is a first step in helping you identify bad refactors, and making sure your team focuses on solving the right problems.

If you found this article useful, consider following me on Twitter. I write about once a month about software engineering practices and programming in general.

I'm also developing SynthQL, a type-safe HTTP client for PostgreSQL, and I would love to get your feedback.