Generating unit tests from broken stateful invariant tests
In this post, we analyze different solutions for generating unit tests from broken stateful invariant tests.
Introduction
Stateful invariant tests are useful for finding sequences of transactions with fuzzed input values that break system properties. One common problem with this testing methodology, however, is that when the fuzzer breaks a property, it can be difficult to understand the root cause just by looking at the call sequence. Oftentimes, debugging requires looking into the system's state variables and how their values change.
The debugging approach recommended for some common fuzzers (Echidna and Medusa) is to emit events within the call sequence to view state values. This can be an inefficient and time-consuming process, as it requires a continuous loop of emitting events, re-running the fuzzer, and checking the call sequence to see which values caused the property to break.
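For illustration, here is a minimal sketch of what that event-based debugging loop might look like (the contract and names below are purely illustrative and not part of the example used later in this post):

contract DebugExample {
    uint256 internal counterValue;

    // Emitted inside the handler so the fuzzer prints the state value alongside the failing call sequence
    event DebugCounter(uint256 value);

    function bump() public {
        counterValue += 1;
        emit DebugCounter(counterValue);
    }

    // Echidna-style boolean property using the default echidna_ prefix
    function echidna_counter_below_limit() public view returns (bool) {
        return counterValue < 10;
    }
}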
An often more efficient approach consists of using a test converter contract (usually called CryticToFoundry), in which the call sequence that breaks the property is used to create a special kind of unit test (a regression test) that can be integrated directly into the test suite, facilitating the debugging of broken invariants.
One issue is that, by default, converting failing property tests to regression tests is not straightforward, as they often require some test suite setup to function properly. When using Echidna or Medusa as a fuzzer, the fuzzer's test contract interface differs from Foundry's, so creating a unit test usually requires writing a whole new Foundry test contract whose setup mirrors the existing one in order to recreate the failing property.
Recently, two new tools have emerged that make it easier to create such regression tests and take full advantage of their benefits. These tools are the focus of this article, where we'll look at how they work and when using one might be more beneficial than the other.
Tools
fuzz-utils
Trail of Bits recently released their fuzz-utils tool, which can generate Foundry unit tests from a failing test in an Echidna/Medusa run. It uses the corpus from the run to generate a Foundry unit test directly in your repository with a single command on the command line. The created unit test can be used to quickly reproduce the assertion/property violation during development.
Recon
Recon also recently released an in-app tool for generating Foundry unit tests from a failing test in an Echidna/Medusa run. Unlike fuzz-utils, Recon's tool generates the Foundry test from the fuzzer's output logs instead of the corpus and requires using the Recon web app instead of the command line.
A comparative analysis
Now we’ll look at how to use each and some of their benefits and drawbacks.
We'll be testing both tools on the following Counter contract that comes as part of the default template for Foundry projects created using forge init.
contract Counter {
    // number is set to a nonzero value here so that Echidna tests don't fail without producing a corpus
    uint256 public number = 5;

    function setNumber(uint256 newNumber) public {
        number = newNumber;
    }

    function increment() public {
        number++;
    }
}
Using an external testing setup, we define the following simple invariant on the above contract, which states that the number value in the contract can never be 0:
contract FuzzCounter {
    Counter counter;

    constructor() {
        counter = new Counter();
    }

    function setNumber(uint256 newNumber) public {
        counter.setNumber(newNumber);
    }

    function increment() public {
        counter.increment();
    }

    function invariant_neverZero() public view returns (bool) {
        return counter.number() != 0;
    }
}
Note that we've modified the Echidna and Medusa configuration files to use invariant_ as the prefix for property tests instead of the default echidna_.
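For reference, the relevant entry in echidna.yaml looks roughly like this (Medusa exposes an equivalent test-prefix setting in its JSON configuration):

# echidna.yaml: use invariant_ instead of the default echidna_ for property tests
prefix: "invariant_"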
Just by looking at the Counter contract we can easily see that this invariant won't hold, and as expected, running Echidna produces a failing call sequence for invariant_neverZero.
If we replicate the same property in the Properties contract after adding Recon's test scaffolding to the repo (sketched below), the call sequence will be slightly different because our target contract is CryticTester instead of FuzzCounter.
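As a rough sketch (not the exact scaffolding code, and assuming the generated Setup contract deploys and exposes counter), the replicated property might look like this:

abstract contract Properties is Setup {
    function invariant_neverZero() public view returns (bool) {
        return counter.number() != 0;
    }
}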
Now we'll see how each of the tools generates a Foundry unit test that allows us to reproduce this violation on demand.
Recon test generator
The Recon test generator works via the Recon website: you paste the Echidna logs into a form box and it gives you back a working Foundry unit test. Pasting in the logs from the above invariant violation produces a unit test for the failing call sequence.
Since this is designed to work with projects that use Recon's scaffolding architecture, there's no need to define a setup for this Foundry test: the Setup contract is automatically inherited by the CryticToFoundry contract to which you would add this test (for more on the Recon scaffolding structure, see here).
Adding an assertion that mirrors the boolean expression from our broken property completes the process and gives us a unit test that reproduces the broken property from the call sequence:
function test_prefix_setNumber_0() public {
    counter_setNumber(0);
    t(counter.number() != 0, "property failed: number != 0");
}
Here we use the function t(), a wrapper around Foundry's assertTrue function provided by Recon, which allows us to log a message to the console when the assertion fails.
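For intuition, such a helper could be implemented along these lines (a minimal sketch; the actual FoundryAsserts implementation in Recon's scaffolding may differ):

import {Test} from "forge-std/Test.sol";

abstract contract FoundryAssertsSketch is Test {
    // Assert a boolean and surface a human-readable reason when the assertion fails
    function t(bool success, string memory reason) internal {
        assertTrue(success, reason);
    }
}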
Additionally, even though we can see from our corpus that the caller is one of the default Echidna sender addresses (0x10000), there is no need to prank this address in our unit test: in an external testing setup msg.sender isn't preserved, so the sender that calls the Counter contract will be the CryticTester contract.
fuzz-utils test generator
The fuzz-utils test generator works directly from the command line, so after running Echidna to generate a corpus, all we need to do is run the following command from our Foundry project directory:
fuzz-utils generate ./test/FuzzCounter.sol --corpus-dir ./echidna --contract "FuzzCounter" --test-directory "./test/" --inheritance-path "." --fuzzer echidna
and it will generate the following Foundry unit test for us:
contract FuzzCounter_Echidna_Test is Test {
    FuzzCounter target;

    function setUp() public {
        target = new FuzzCounter();
    }

    // Reproduced from: ./echidna/reproducers/2834239240519329869.txt
    function test_auto_setNumber_0() public {
        vm.prank(0x0000000000000000000000000000000000010000);
        target.setNumber(0);
    }
}
Since this was run on a repository not using the Recon scaffolding, the inclusion of the setUp function is useful for including the same setup that was used in the Echidna test without having to copy and paste from the Echidna target contract. In this case the setUp function is trivial; however, production codebases often require much more involved setups, so this is certainly a nice feature.
As we did for the previously generated test, we can add an assertion that replicates the boolean expression in our property so that the unit test fails under the same conditions as our property test:
function test_auto_setNumber_0() public {
    vm.prank(0x0000000000000000000000000000000000010000);
    target.setNumber(0);
    assert(target.number() != 0);
}
Since the generated Foundry test contract FuzzCounter_Echidna_Test doesn't inherit the FoundryAsserts contract by default, it can only use the standard assertion function built into Solidity.
As mentioned above, since an external testing setup doesn't preserve msg.sender, the actual sender seen by the Counter contract will be the address at which FuzzCounter is deployed, even though the unit test pranks the 0x10000 address.
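As a quick illustration of this behavior (using hypothetical contract names unrelated to the tools above), vm.prank only spoofs msg.sender for the next call made from the test, so the inner call still originates from the handler contract:

import {Test} from "forge-std/Test.sol";

contract SenderRecorder {
    address public lastSender;

    function poke() public {
        lastSender = msg.sender;
    }
}

contract Handler {
    SenderRecorder public recorder = new SenderRecorder();

    function poke() public {
        recorder.poke();
    }
}

contract PrankScopeTest is Test {
    Handler internal handler;

    function setUp() public {
        handler = new Handler();
    }

    function test_prank_does_not_reach_inner_call() public {
        vm.prank(address(0x10000));
        handler.poke();
        // The inner contract saw the handler, not the pranked address
        assertEq(handler.recorder().lastSender(), address(handler));
    }
}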
Conclusion
We’ve seen how each of these two tools can greatly speed up workflows and make debugging broken properties easier.
For internal testing setups, the fuzz-utils tool offers a slight advantage because it automatically uses the Foundry prank cheatcode, which allows replicating the caller from the corpus. Additionally, for projects not implementing Recon's scaffolding, which provides a shared setup function used by Echidna, Medusa and Foundry, the auto-generation of a Foundry setUp function helps go from failing property to unit test faster.
However, given that many fuzzing/invariant setups use external testing, the benefit of pranking callers in fuzz-utils-generated tests is essentially nullified in these setups; the tool will still include the pranks, which may lead to less clear code in complicated unit tests. The Recon tool, on the other hand, takes into account that an external testing setup is used with the Recon scaffolding, but it doesn't provide all the setup necessary to create a standalone unit test in repositories that don't use Recon's scaffolding.
Additionally, Recon's CryticToFoundry contract inherits from the FoundryAsserts contract provided by Recon's scaffolding, allowing it to more clearly indicate in which test and under what condition an assertion failed when multiple tests are run at the same time.
Given that both tools are still quite new, it should be expected that their functionality will expand in the future, closing the gap that currently exists between their capabilities.