2021/01/31 - mutation: review & rework of mutmut

A coworker told me about mutation testing, and I was immediately interested because testing is interesting. Testing became even more interesting when I read how FoundationDB was made. I recommend you watch the video about testing with deterministic simulation.

I started looking into mutmut and a fork that adds support for parallel runs... eventually I thought: how hard can it be?

Yeah, you might think that is my thing, and that is the thing of many (most?) developers starting on a new project: "the code is evil, let's rewrite!". And that is somewhat the idea behind this series of logs, so I will not say I am not into "rewrites" myself (more on that later).

I went swiftly through all the projects that popped up on the first page of Google results, and I was still interested in a rewrite. Let's dive deeper into those projects.

Instead of an introduction to mutation testing, let our imagination play with the following nice track from Disiz Peter Punk called Mutation:

Disiz Peter Punk Intro Mutation

Standing on the shoulders of giants

mutmut

After my initial review, mutmut seemed like the most straightforward, except for the fact that you cannot run tests in parallel, but there is a fork that does. The code is another story. I will go through the fork because it is the code that I am most interested in.

cache.py

In no particular order:

import hashlib

def hash_of(filename):
    # Reads the whole file, then returns its SHA-256 hex digest.
    with open(filename, 'rb') as f:
        m = hashlib.sha256()
        m.update(f.read())
        return m.hexdigest()

Because it relies on a side effect (open), the good abstraction, if one is necessary, is a sugar (!) that takes bytes as its argument and computes the digest:

def sha256sum(data):
    # data is a bytes object; the name avoids shadowing the builtin bytes.
    m = hashlib.sha256()
    m.update(data)
    out = m.hexdigest()
    return out

The function sha256sum is very easy to test, with no need to fiddle with on-disk files during testing.
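To illustrate the point, here is a minimal sketch of how the split could look, with the file reading kept in a thin wrapper and the digest computed by a pure function. The test function name is my own; the two expected digests are the well-known SHA-256 values for the empty input and for b"abc".

```python
import hashlib


def sha256sum(data):
    """Return the SHA-256 hex digest of a bytes object (no I/O involved)."""
    return hashlib.sha256(data).hexdigest()


def hash_of(filename):
    """Thin wrapper: the only part that touches the disk."""
    with open(filename, "rb") as f:
        return sha256sum(f.read())


def test_sha256sum():
    # Known digests, so the test needs no temporary file.
    assert sha256sum(b"") == (
        "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
    )
    assert sha256sum(b"abc") == (
        "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
    )
```

Only hash_of would need a temporary file to be tested, and it is short enough that it hardly needs a test at all.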

I like the idea of cache.py, but the name is not well chosen. I prefer db.py, or something like dal.py for data access layer.

__init__.py

I am starting to think that having code in __init__.py is not as evil as I thought. I might just be biased because of my experience with Django in the early days.

The following pattern:

try:
    something()
except SomethingException:
    print("something that wants to be useful")
    raise

The above is useless. Instead of the print, it is better to add a comment to the code and avoid the try / except.

def mutate_code(node, context):
    context.stack.append(node)
    try:
        maybe_mutate_code(node, context)
    finally:
        # Always pop, even when maybe_mutate_code raises.
        context.stack.pop()

An even better pattern is to use a context manager.
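A minimal sketch of that idea, using contextlib from the standard library. The Context class, the pushed helper and the body of maybe_mutate_code are hypothetical stand-ins I made up for the example; they are not mutmut's code.

```python
from contextlib import contextmanager


class Context:
    """Hypothetical stand-in for the context object: a stack and a log."""

    def __init__(self):
        self.stack = []
        self.visited = []


@contextmanager
def pushed(context, node):
    """Push node on the stack for the duration of the with block."""
    context.stack.append(node)
    try:
        yield node
    finally:
        # Runs on normal exit and on exceptions alike.
        context.stack.pop()


def maybe_mutate_code(node, context):
    # Placeholder body: record the node and the stack depth when visited.
    context.visited.append((node, len(context.stack)))


def mutate_code(node, context):
    with pushed(context, node):
        maybe_mutate_code(node, context)
```

The push and the pop now live in one place, so every caller gets the cleanup for free.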

__main__.py

That is the CLI definition, written with the library called click, which is not my favorite library. I believe it is better to keep things simple, hence I rely on docopt, which does less magic (!) and gives you more control (also, docopt does display all options). Interfaces are a complex topic, and I have no definitive answer regarding CLIs.

loader.py

install creates a class object on the fly without directly relying on type with top-level functions passed as arguments. That is a performance optimization, but when the execution time is several days, most milliseconds matter.
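For reference, building a class by calling type directly with top-level functions can be sketched as follows. The names greet, build_class and Loader are hypothetical and chosen only for the example; this is not the code of loader.py.

```python
def greet(self):
    """A plain top-level function that will become a method."""
    return "hello from " + type(self).__name__


def build_class(name):
    # type(name, bases, namespace) creates a new class object at runtime.
    return type(name, (object,), {"greet": greet})


Loader = build_class("Loader")
```

Calling Loader().greet() then behaves like a method defined in a regular class statement.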

Relative imports are difficult to read.

cosmic-ray

Next I looked at cosmic-ray, mostly because there was an exchange between cosmic-ray's maintainer and mutmut's maintainer and I wanted to see for myself what the problem was. I am doing my review a few months or years after that exchange happened, so the situation is different.

Spoiler: I find the code of cosmic-ray better. I disagree that the mutations are not easy to extract and use independently (except that it requires diving into OpenStack libraries, but that ought to be a good thing, right?).

The only thing I disagree with is the fact that it relies on Celery (how hard can it be?). Celery is not necessary in that case, because it is easier to rely on multiprocessing. Even more so nowadays, when it is easier to set up and configure a single machine with 20, 40 or even 128 hardware threads than the equivalent infrastructure with multiple machines. It is also less costly.
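A rough sketch of what relying on multiprocessing could look like on a single machine. Everything here is a toy assumption: SOURCE, MUTATIONS, is_killed and run_all are names I made up, the "test suite" is two asserts, and the mutations are plain string replacements. It is not how cosmic-ray or mutmut distribute work.

```python
import multiprocessing

# Toy code under test, kept as a string so that it can be mutated.
SOURCE = "def add(a, b):\n    return a + b\n"

# Hypothetical mutations: (description, mutated source) pairs.
MUTATIONS = [
    ("+ becomes -", SOURCE.replace("+", "-")),
    ("+ becomes *", SOURCE.replace("+", "*")),
]


def is_killed(mutated_source):
    """Return True when the toy test suite fails against the mutated code."""
    namespace = {}
    exec(mutated_source, namespace)
    try:
        assert namespace["add"](2, 2) == 4
        assert namespace["add"](2, 3) == 5
    except AssertionError:
        return True
    return False


def run_all(sources, workers=None):
    """Check every mutated source in a pool of worker processes."""
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(is_killed, sources)


if __name__ == "__main__":
    results = run_all([source for _, source in MUTATIONS])
    for (description, _), killed in zip(MUTATIONS, results):
        print(description, "killed" if killed else "survived")
```

With workers left to None, the pool uses as many processes as the machine reports CPUs, which is the whole point of the big single machine.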

On the subject of server costs, it is a perfect time to share the following blog post:

Cerebralab Blog: "It's somewhat anecdotal, but in my work, I often encounter projects that seem to use highly inefficient infrastructure providers, from a cost perspective."
https://cerebralab.com/Is_a_billion-dollar_worth_of_server_lying_on_the_ground

There are some interesting libraries in the requirements, like yattag, which is not my favorite in-Python HTML templating library but still an interesting take. Also, stevedore should be the subject of a follow-up review!

The code is rather short, with 2196 lines of Python. The code looks visually nice and is well commented. It uses log as the name of the variable that holds the Python logger, hence I am not the only one to do that.

There are a few mistakes here and there, but the overall code is good!

I recommend reading the cosmic-ray code if you are getting started with Python!

Others

I did not have time to review the following projects:

Rework

Overall I am happy with the result, except the following:

Last but not least, I need to replace parso with the Python 3.9 ast module because it produces less noisy mutations.
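To give an idea of what a mutation built on the standard ast module could look like, here is a minimal sketch that swaps every + for -. It assumes Python 3.9 or later for ast.unparse; the AddToSub and mutate names are hypothetical, and a real tool would produce one mutation per location rather than changing all of them at once.

```python
import ast


class AddToSub(ast.NodeTransformer):
    """Replace every binary + with a binary -."""

    def visit_BinOp(self, node):
        # Visit children first so nested expressions are mutated too.
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node


def mutate(source):
    """Return the source code with the AddToSub mutation applied."""
    tree = ast.parse(source)
    tree = AddToSub().visit(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # ast.unparse requires Python 3.9+
```

Since ast.unparse regenerates the code from the tree, comments and the original formatting are lost in the output, which does not matter when the mutated code is only executed by the tests.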

forge at ~amirouche/mutation.