
Not all testing is equal, or useful, and some is even damaging

Let’s explore how some of the popular beliefs about testing can hurt our testing effectiveness, and what we can do about it.

Vlad Lyga
Published in Dev Genius
5 min read · Jun 2, 2022


If you are in the business of writing code, there is a good chance you also write tests for it. Most of these are probably unit tests. Unit tests are small, relatively easy, and can even be fun to write. They run quickly and give an immediate sense of satisfaction. Oh, and I love seeing that “green bar”!

It’s hard to overestimate the importance of testing in achieving quality. But, as the title says, not all testing is equal, or useful, and some can even be damaging. In this article, we’ll explore how to shift the balance in our favor.

Finding the balance (Image by Stefan Keller)

Unit tests give the confidence to change the code

Being white-box, unit tests can serve as change detectors.
This is beneficial when the change was made inadvertently.
But what happens when you change a piece of code on purpose, and dozens of tests fail?

Ask any developer which code is easier to change:

  • code that has little other code that depends on it
  • code that has a lot of other code that depends on it

Anyone with experience will recognize that code with few dependents is easier to change, since there is a lot less that can break. But tests are code too, and every test depends on the code it exercises. Each unit test we add is one more dependent, so we get precisely the opposite of what we tried to achieve.
Writing unit tests is similar to pouring cement on the code base, because every code change requires a coordinated change to the tests.
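To make the “cement” effect concrete, here is a minimal sketch in Python. The `PriceCalculator` classes, the `white_box_test` function, and the `passes` helper are all hypothetical names made up for illustration. The test pins down how `total()` works internally, so a refactoring that keeps the behavior identical still turns it red.

```python
from unittest import mock


class PriceCalculator:
    """Hypothetical class, used only for illustration."""

    def total(self, prices):
        # Implementation detail: the work is delegated to a private helper.
        return self._sum(prices)

    def _sum(self, prices):
        return sum(prices)


class RefactoredPriceCalculator(PriceCalculator):
    """Same observable behavior, but total() no longer calls _sum()."""

    def total(self, prices):
        return sum(prices)


def white_box_test(calculator_cls):
    """Passes only if total() calls _sum() exactly once."""
    calc = calculator_cls()
    with mock.patch.object(calc, "_sum", return_value=6) as helper:
        assert calc.total([1, 2, 3]) == 6
        helper.assert_called_once_with([1, 2, 3])


def passes(test, calculator_cls):
    """Report whether a test function passes for the given class."""
    try:
        test(calculator_cls)
        return True
    except AssertionError:
        return False
```

Here `passes(white_box_test, PriceCalculator)` is true, while `passes(white_box_test, RefactoredPriceCalculator)` is false, even though both classes return the same totals. The test detected a change, not a bug.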

Some claim that tests can be written so that they won’t need changing when the underlying code changes. I don’t know what such tests are testing, but it’s not the underlying functionality, so they carry little value.

Unit tests speed development

Tests are not free. Tests are code. We arrange them in packages and modules, and that requires design and maintenance. Some research suggests that writing unit tests can add up to 30% to development time for some teams.
There had better be a fantastic ROI here, right?

Keep on reading for tips on maximizing unit-testing ROI.

Tests are easier to understand than the code

This ties to the common misconception that tests can be simpler than the code they test. A test cannot be less complex than the code it tests: if it were, it would carry less information than the code, and such a test cannot possibly verify the code underneath. Add all the ‘wiring’ needed to run a test (mocks, fixtures, expectations, etc.) and we end up with much more test code than code under test.

See the next section to understand why having a lot of test code is not good.

Tests are more reliable (contain fewer bugs) than the code

Sometimes we see claims that you can take more care when writing tests to make them bug-free. Whether we want it or not, we introduce a certain number of bugs for any amount of code written.
Published statistics vary wildly, from 3 to 50 bugs per 1 KLOC.
In any case, your test code will contain roughly the same proportion of bugs as the production code. Since we usually have more test code than production code, the test code will contain more bugs in absolute terms. This means that at any given time, some tests that should fail are passing, and some tests are not testing what we think they test.
Doesn’t that make you think differently about what the green bar means?
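As a hedged illustration of a bug in a test keeping the bar green, consider the sketch below. The function `apply_discount` and its check are hypothetical. The production code adds the discount instead of subtracting it, and the test derives its expected value from the same copied formula, so the two bugs cancel out.

```python
def apply_discount(price, percent):
    """Hypothetical production code. Bug: the discount is added, not subtracted."""
    return price + price * percent / 100


def check_apply_discount():
    """Hypothetical unit test that stays green despite the bug above."""
    price, percent = 100, 10
    # Bug in the test: the expected value is computed with the same (wrong)
    # formula copied from the production code, instead of being stated
    # independently.
    expected = price + price * percent / 100
    assert apply_discount(price, percent) == expected
```

Calling `check_apply_discount()` raises nothing, yet a 10% discount on 100 comes out as 110 rather than 90.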

My advice to anyone who claims they can concentrate harder and write tests with fewer bugs than the code: use the same technique on the production code itself.

“If I change the code, run the tests, and the bar turns green, what does that mean?”


We write unit tests because it helps us to find bugs

We have to recognize that most of the time we do only partial and random unit testing. Ask yourself: how do you know when to stop writing tests?
When you cannot think of more tests to write, or maybe when it’s time to move on to the next feature? Eventually, we all settle on some (often different and arbitrary) notion of “completeness”, like “every line is reached at least once”.
The issue is that executing each line says nothing about whether the code does what it should.
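A small sketch of that gap, using a hypothetical `is_leap_year` function: the check below executes every line of the function, so line coverage is 100%, yet the function is wrong for century years such as 1900, which was not a leap year.

```python
def is_leap_year(year):
    """Hypothetical implementation that ignores the century rules."""
    return year % 4 == 0


def check_is_leap_year():
    """Reaches every line of is_leap_year, so line coverage is complete."""
    assert is_leap_year(2024)
    assert not is_leap_year(2023)
```

The check passes and the coverage report is perfect, but `is_leap_year(1900)` still returns `True`.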

But how do we know what it should do?

The key to the solution lies in recognizing that we cannot both write the code and serve as the oracle for its correct behavior.
Without an external, independent oracle of correct behavior, we are simply randomly guessing about which tests to write.

If you are an OO developer, your unit is not a class

Most of us write in one of the varieties of “OO” languages, and we fall into a trap when deciding what the unit in our ‘unit test’ actually is.
There is an impedance mismatch between the unit of composition of our software (the class) and the actual unit of code we can test, which is a method or function.
We cannot execute a class, but we can execute a function.
Yet the most interesting and valuable errors happen between functions, where a call graph of functions is invoked at run time to accomplish a task.
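Here is a sketch of such a between-functions error, with hypothetical function names. Each function does exactly what its docstring says and would pass its own unit tests, but they disagree about units, and the bug only exists in the composition.

```python
def parse_price(text):
    """Parse a string like "12.50" into an integer number of cents (1250)."""
    whole, fraction = text.split(".")
    return int(whole) * 100 + int(fraction)


def shipping_fee(order_total_dollars):
    """Orders of 50 dollars or more ship free; otherwise the fee is 5 dollars."""
    return 0 if order_total_dollars >= 50 else 5


def fee_for(text):
    """Compose the two functions above."""
    # Integration bug: cents are passed where dollars are expected.
    return shipping_fee(parse_price(text))
```

Unit tests such as `parse_price("12.50") == 1250` and `shipping_fee(12) == 5` are green, yet `fee_for("12.50")` returns 0: a 12.50 order ships free because 1250 is read as dollars.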

Summary: Unfortunately most unit tests are not that useful

  • Chasing coverage is counterproductive
  • Tests cannot be simpler than the code itself
  • Unit tests can make it hard to change underlying code
  • Tests cannot promise that something works
  • Testing is sampling at best

Tips to maximize that unit test ROI

  • Concentrate on writing unit tests for key algorithms for which there is a third-party oracle of correctness. These parts of the system are usually the core and thus change rarely, so the cost of changing the tests is amortized over time.
  • Delete any test that you cannot link to a business value (ask what it means for the business if this test fails)
  • Keep regression tests, where a regression is defined as ‘a bug that was found in production and reproduced with a test’
  • Delete tests that fail often
  • Delete tests that don’t fail at all
  • If something can be tested with either a system test or a unit test, prefer the system test
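The first tip can be sketched as follows. I assume a hypothetical in-house algorithm, `integer_sqrt`, and use Python’s standard `math.isqrt` (available since Python 3.8) as the independent oracle. The expected values come from the oracle, not from whoever wrote the code under test.

```python
import math


def integer_sqrt(n):
    """Hypothetical in-house algorithm: floor of the square root by binary search."""
    if n < 0:
        raise ValueError("n must be non-negative")
    low, high = 0, n + 1
    # Invariant: low * low <= n < high * high
    while high - low > 1:
        mid = (low + high) // 2
        if mid * mid <= n:
            low = mid
        else:
            high = mid
    return low


def check_against_oracle(limit=10_000):
    """Compare every result below `limit` with the standard library oracle."""
    for n in range(limit):
        assert integer_sqrt(n) == math.isqrt(n), n
```

Because the oracle is independent, the check does not need to change when `integer_sqrt` is rewritten, as long as the function still computes the same thing.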


Motorcyclist, Software architect. Currently at Microsoft. All thoughts are mine.