Incremental mutation testing in the Bitcoin Core \ stacker news

by bruno

Mutation testing is a software testing technique used to evaluate the effectiveness of a test suite. It intentionally introduces small, systematic faults, called mutants, into the program’s source code (for example, by changing operators, constants, or conditional logic) and then runs the existing tests to see whether they detect these changes. If a test fails, the mutant is considered killed, indicating the tests are sensitive to that kind of fault; if every test passes, the mutant survives, revealing a potential weakness or gap in test coverage.
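
To make the killed/survived distinction concrete, here is a toy sketch in Python. It is not taken from Bitcoin Core: the function is_adult, its mutant, and both test suites are hypothetical names invented for illustration.

```python
from typing import Callable

Predicate = Callable[[int], bool]


def is_adult(age: int) -> bool:
    """Original code under test (hypothetical example)."""
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    """Mutant: the relational operator `>=` was changed to `>`."""
    return age > 18


def weak_suite(fn: Predicate) -> None:
    # Only checks values far away from the boundary.
    assert fn(30) is True
    assert fn(5) is False


def strong_suite(fn: Predicate) -> None:
    weak_suite(fn)
    # Boundary check: the only assertion that distinguishes `>=` from `>`.
    assert fn(18) is True


def is_killed(suite: Callable[[Predicate], None], candidate: Predicate) -> bool:
    """A mutant is killed when at least one assertion in the suite fails."""
    try:
        suite(candidate)
    except AssertionError:
        return True
    return False


print(is_killed(weak_suite, is_adult_mutant))    # False: the mutant survives
print(is_killed(strong_suite, is_adult_mutant))  # True: the mutant is killed
```

The surviving mutant in the first case is the useful signal: it shows that the weak suite never exercises the boundary value.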
What do we currently do in Bitcoin Core?
Currently, we have a weekly mutation testing run based on the master branch. Every Friday, we fetch the code from the master branch, generate the mutants, analyze them by running the unit, functional, and fuzz tests, and publish a report at https://corecheck.dev/mutation.

Mutation testing is expensive because, for every mutant, we have to (1) compile the code and (2) run the tests. To be honest, compiling the code is not a major problem: we can use a cache, and there is a way to apply the mutations directly in the binary. The main issue is the time it takes to run the tests, especially the functional ones. Currently, some functional tests take over 1 minute to run. If we have 20 mutants to analyze with such a test, analyzing them one after another takes at least 20 minutes. This is why I did not start by running it on the whole codebase, only on a set of files that we judged most important to start with.
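
The lower bound above is simple arithmetic, sketched here in Python. The function name and parameters are illustrative assumptions, and the model assumes mutants are analyzed sequentially with a fixed per-mutant cost.

```python
def min_analysis_minutes(
    num_mutants: int,
    test_minutes: float,
    build_minutes: float = 0.0,
) -> float:
    """Lower bound on sequential analysis time.

    Assumes every mutant pays the same build and test cost; build_minutes
    defaults to 0 to model a warm compilation cache.
    """
    return num_mutants * (build_minutes + test_minutes)


# 20 mutants, each needing a functional test that takes 1 minute.
print(min_analysis_minutes(20, 1.0))  # 20.0
```

Because the cost scales linearly with the number of mutants, dropping mutants that cannot be killed or do not matter reduces the run time directly.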

In general, these runs have been great. The report at corecheck gives us a good overview of what we should improve in our tests and, thankfully, some PRs have been opened to address some of the reported mutants. My main goal now is to make these runs more efficient, which means faster, since a run takes over 30 hours on our current setup. This includes:
Generating highly productive mutants, so that we avoid spending time on what does not matter (e.g., equivalent or redundant mutants).
Improving some functional tests so that they are faster and run exactly what is required to test a given mutant.
Skipping the weekly mutation analysis for a file if no changes were made to that file or to the tests that we use to analyze it.
Skipping mutant generation for lines that do not have any test coverage.
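
The last two items can be sketched as two small filters. This is a minimal illustration and not the actual tooling: the Target type, the function names, and the file paths are hypothetical. The set of changed files is assumed to come from something like git diff --name-only between the previous run's commit and the current master, and the covered lines from a coverage report.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class Target:
    """A source file under mutation plus the tests used to analyze it."""
    source: str
    tests: FrozenSet[str]


def needs_analysis(target: Target, changed: Set[str]) -> bool:
    """Re-run only if the file itself or one of its tests changed."""
    return target.source in changed or not target.tests.isdisjoint(changed)


def lines_to_mutate(candidates: Iterable[int], covered: Set[int]) -> List[int]:
    """Keep only candidate lines that some test actually executes."""
    return sorted(set(candidates) & covered)


# Illustrative data only; these paths are placeholders.
targets = [
    Target("src/a.cpp", frozenset({"src/test/a_tests.cpp", "test/functional/a.py"})),
    Target("src/b.cpp", frozenset({"src/test/b_tests.cpp"})),
]
changed = {"test/functional/a.py", "doc/notes.md"}

selected = [t.source for t in targets if needs_analysis(t, changed)]
print(selected)  # ['src/a.cpp']
print(lines_to_mutate(range(10, 16), {10, 11, 14, 40}))  # [10, 11, 14]
```

Note that a change to a test file alone is enough to select the target, since a modified test may now kill mutants that previously survived.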
Read more at delvingbitcoin.org