Test-driven Goodharting

23 September 2020

Fairly early on in my time at Zoopla, I brought in Michael C. Feathers’ book, “Working Effectively With Legacy Code”, and somewhat provocatively left it prominently on my desk.

Feathers’ definition of legacy code is one that I like: any code that isn’t covered by unit tests.

He writes:

Code without tests is bad code. It doesn’t matter how well written it is; it doesn’t matter how pretty or object-oriented or well-encapsulated it is. With tests, we can change the behaviour of our code quickly and verifiably. Without them, we really don’t know if our code is getting better or worse.

But I think I would amend that definition slightly. I’ve seen code that is, technically speaking, under tests, but which is still a pain to work with, and which – crucially – cannot be changed quickly or verifiably.

So how does it happen that code under tests can’t be changed quickly or verifiably? I’ve seen a few anti-patterns. For example, high-level tests that run through a complex flow and assert on details that belong in tests of the lower-level units; or tests that tell a long story, so that it’s difficult to change anything in the middle without rewriting all the tests at the end.
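To make the "long story" anti-pattern concrete, here is a minimal sketch. The `Basket` class and every test name are hypothetical, invented purely for illustration, and the tests are plain functions with bare `assert` statements of the kind a runner such as pytest would collect.

```python
# Hypothetical code under test, invented for this example.
class Basket:
    def __init__(self):
        self.items = []
        self.discount = 0.0

    def add(self, name, price):
        self.items.append((name, price))

    def apply_discount(self, fraction):
        if not 0 <= fraction <= 1:
            raise ValueError("fraction must be between 0 and 1")
        self.discount = fraction

    def total(self):
        subtotal = sum(price for _, price in self.items)
        return round(subtotal * (1 - self.discount), 2)


# Anti-pattern: one test that tells a long story. Each assertion depends on
# every step before it, so changing a step in the middle means recalculating
# all the expectations that follow.
def test_basket_story():
    basket = Basket()
    basket.add("book", 10.00)
    assert basket.total() == 10.00
    basket.add("pen", 2.50)
    assert basket.total() == 12.50
    basket.apply_discount(0.2)
    assert basket.total() == 10.00
    basket.add("bag", 5.00)
    assert basket.total() == 14.00


# Alternative: small tests, each with its own setup and one reason to fail.
def test_total_sums_item_prices():
    basket = Basket()
    basket.add("book", 10.00)
    basket.add("pen", 2.50)
    assert basket.total() == 12.50


def test_discount_reduces_total():
    basket = Basket()
    basket.add("book", 10.00)
    basket.apply_discount(0.2)
    assert basket.total() == 8.00


def test_invalid_discount_is_rejected():
    basket = Basket()
    try:
        basket.apply_discount(1.5)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

Both styles pass today. The difference only shows up when the behaviour changes: if the discount rule changes, the story test needs every later expectation reworked, while the focused tests need a change in one place.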

I’ve always been a big fan of unit testing, and I tend to work in a test-first way, so I’ve been happy to see greater uptake of testing, and the expectation that code will come with tests. But Goodhart’s Law (when a measure becomes a target, it ceases to be a good measure) can always creep in. I’ve seen teams in which writing unit tests is a checkbox to be ticked, without any real appreciation of the value of tests.

It’s entirely possible that unit tests in that environment are a net negative. The ideal process is: write tests, change code, run tests to learn whether the change was successful, repeat. But in that environment the process becomes: change code, test manually, run tests to see which have broken, alter tests so that they pass. The tests are nothing but wasted effort.
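As a rough illustration of the ideal loop, here is a sketch in which a test pins down behaviour rather than implementation, so the same check can tell you whether a change worked. The `slugify` functions and the `check_slugify` helper are hypothetical names made up for this example.

```python
# Hypothetical original implementation.
def slugify(title):
    return "-".join(title.lower().split())


# A refactoring that is meant to preserve behaviour (for space-separated input).
def slugify_refactored(title):
    words = [word for word in title.lower().split(" ") if word]
    return "-".join(words)


# A "simplification" that quietly changes behaviour around repeated spaces.
def slugify_broken(title):
    return title.lower().replace(" ", "-")


# The test describes what callers rely on, not how the result is computed,
# so it can be run unchanged against any candidate implementation.
def check_slugify(impl):
    assert impl("Test Driven Goodharting") == "test-driven-goodharting"
    assert impl("  Legacy   Code ") == "legacy-code"
```

Running `check_slugify` against `slugify` and `slugify_refactored` passes, while `slugify_broken` fails the second assertion. That failure is the useful signal: the test stays as it is and the code gets fixed, rather than the other way round.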

It seems bizarre, but there are developers who aren’t using the test suite as an opportunity to learn about and probe the functioning of the code. And if you don’t see that as the value of automated tests, you won’t write them with it in mind.

So perhaps my definition of legacy code, then, would be code that can’t be changed quickly and verifiably. Yes, it’s subjective and ambiguous. But there’s some value in not getting Goodharted by your dogmas.