Change is the root of all (evil) bugs
As I look back on a decade of writing software, building systems, and chasing bugs, a simple truth has gradually crystallized for me. It’s not something I learned in a book, nor did I hear it in a conference keynote. Instead, it revealed itself slowly, through countless debugging sessions, system failures, and sleepless nights spent wondering what went wrong. That truth is this: change is the root of all bugs.
When I first started writing software, I thought bugs came mostly from bad code: mistakes in logic, typos, or misunderstood requirements. And yes, these are all very real causes of bugs. But quite often, you've tested a system, pushed it to production, monitored it for a while, everything seems to be working fine, and then suddenly, a bug hits. Why?
As I've matured as an engineer, I've realized that there's one cause that's common to many of these bugs: change.
Take dependencies, for example. Early in my career, I eagerly grabbed the latest versions of libraries, thinking I was staying ahead of the curve. And then came the subtle incompatibilities: an API change here, a performance regression there. Dependencies are powerful, but they’re also ticking time bombs. They change, often in ways we don’t control or even notice until it’s too late.
Or consider distributed systems. I remember deploying my first one, thinking I had accounted for everything. But distributed systems live in a world of constant flux: network latencies, machine failures, and shifting loads. The moment you assume stability, the system finds a way to remind you otherwise. Bugs in such systems are rarely about the code itself; they’re about the assumptions you made about the environment.
Then there’s the issue of configuration. How often have you seen a system break because someone changed a setting, or because two environments weren’t perfectly aligned? Configuration drift can create bugs that are devilishly hard to diagnose. And what about the infamous "works on my machine" problem? That too is often a tale of changes: subtle differences in runtime environments, OS versions, or even hardware setups.
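One way to make configuration drift visible is to compare two environments key by key. The following is only a minimal sketch: the `config_drift` helper, the `MISSING` marker, and the sample `staging` and `production` settings are all hypothetical names I made up for illustration, and real configurations are usually nested and loaded from files rather than written inline.

```python
# Hypothetical marker for a key that exists in only one environment.
MISSING = "<missing>"


def config_drift(expected: dict, actual: dict) -> dict:
    """Return {key: (expected_value, actual_value)} for every key that differs."""
    drift = {}
    # Walk the union of keys so settings present on only one side are reported too.
    for key in sorted(expected.keys() | actual.keys()):
        expected_value = expected.get(key, MISSING)
        actual_value = actual.get(key, MISSING)
        if expected_value != actual_value:
            drift[key] = (expected_value, actual_value)
    return drift


# Made-up example environments, not taken from any real system.
staging = {"timeout": 30, "region": "eu", "debug": False}
production = {"timeout": 10, "region": "eu"}

drift = config_drift(staging, production)
for key, (want, got) in drift.items():
    print(f"{key}: staging={want!r} production={got!r}")
```

Running a check like this on a schedule, or before a deploy, turns a silent difference between environments into something you can see before it becomes a bug.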
Most software engineering practices, I’ve realized, are about managing or limiting change. Immutable data structures ensure that once something is created, it can never be altered, removing an entire class of bugs. Pure functions, which depend only on their inputs and produce the same outputs without side effects, limit how changes ripple through a system. Modularization confines the scope of change to well-defined boundaries, reducing the risk of unintended consequences. Stable APIs act as contracts, shielding users from internal changes. Even version control systems such as Git are fundamentally about tracking and managing change. And a large part of Docker's appeal is that it gives you something close to immutability over your OS-level dependencies.
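To make the first two ideas concrete, here is a small sketch in Python. The `Settings` class and the `with_timeout` function are hypothetical names chosen for illustration; the point is only that a frozen dataclass rejects in-place modification, and that a pure function hands back a new value instead of altering its input.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Settings:
    """Immutable value: fields cannot be reassigned after construction."""
    timeout_seconds: int
    retries: int


def with_timeout(settings: Settings, timeout_seconds: int) -> Settings:
    """Pure function: returns a new Settings and leaves the input untouched."""
    return replace(settings, timeout_seconds=timeout_seconds)


base = Settings(timeout_seconds=30, retries=3)
tuned = with_timeout(base, 60)

# Attempting to change the original in place raises instead of silently mutating.
try:
    base.retries = 5
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True

print(base, tuned, mutation_blocked)
```

Any code still holding `base` keeps seeing exactly the value it was given, so a change made elsewhere cannot reach it by surprise.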
It's clear: many of the technological advances and techniques that we have built over the years are about managing or limiting change.
Think about your own systems. Have you ever traced a bug back to something unexpected, like a time zone setting, a leap year, or an unanticipated scale of data? These technical surprises are almost always rooted in change. In something behaving differently than it did before, or differently than you assumed it would.
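The leap year case is a good example of an assumption that holds for years and then stops holding. In Python, asking for "the same day next year" works for almost every date and then raises on February 29. The sketch below uses a hypothetical helper, `same_day_next_year`, and the fallback to February 28 is an assumed policy, not the only reasonable one.

```python
from datetime import date


def same_day_next_year(d: date) -> date:
    """Return the same calendar day one year later.

    Assumed policy: February 29 maps to February 28 when the
    following year is not a leap year.
    """
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        # Only reachable for February 29, which has no counterpart next year.
        return d.replace(year=d.year + 1, day=28)


ordinary = same_day_next_year(date(2023, 3, 15))
leap_day = same_day_next_year(date(2024, 2, 29))
print(ordinary, leap_day)
```

Without the `except` branch, the naive `replace` call would run cleanly on every other day of the four-year cycle and fail only on that one date.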
Now, let’s zoom out. What about the bigger picture? Beyond the technical, change can come in many forms: organizational restructures, shifting product requirements, or evolving business priorities. These changes ripple through systems, introducing complexity and uncertainty. But before we dive into that, let me ask: do you know where change is coming from in your systems? Are you aware of the dependencies, configurations, and environments that might shift under your feet?
One thing I’ve learned to appreciate is that change isn’t inherently bad. It’s what drives progress. But unmanaged, unexpected, or misunderstood change is the real enemy. This understanding has reshaped how I approach software engineering. Now, when I debug a problem, I'm constantly asking myself: "What changed?", "How did I not see this coming?", and "Can I put a mechanism in place to prevent these kinds of changes from happening in the first place?"
So here’s my invitation to you: take a moment to reflect on the systems you manage. Where does most change come from? Is it in your dependencies? Your infrastructure? Your configuration? The better you understand these sources of change, the better you’ll be at anticipating and mitigating their impact.
Because if you don’t understand where change comes from, it will surprise and bite you.
© Fernando Hurtado Cardenas.