Debugging Echidna Coverage

A real world example of how to debug Echidna's coverage report to optimize the fuzzer performance.

Mar 09, 2024

When creating invariant tests with Echidna or Medusa for a complex system, you’ll likely run into issues where your coverage stalls at a given percentage, and you subsequently don’t know whether or not the invariant you’ve defined in your system is broken.

This is where the coverage report that gets created by Echidna using the coverage and corpusDir configurations in your echidna.yaml file becomes important in helping you to understand where the tool may be getting stuck in a local maximum and be unable to expand coverage to the remainder of your contract. Solving this issue is key because if you ignore coverage, you can let the fuzzer run for an indefinite amount of time, but it may never break out of this local maxima and so give you the false confidence that your invariant holds but, in fact, the paths that may actually break your invariant are just never evaluated.

This was my case while trying to break an invariant I knew did not hold in the ElasticSwap protocol, which allowed this exploit to happen.

Fuzzing ElasticSwap

While setting up the project on Recon to bootstrap an invariant testing suite, I defined a simple invariant to catch the exploit. The invariant was defined as the user can’t gain value after adding/removing liquidity, which, as an Echidna boolean property in Solidity, looks like:

 function invariant_user_cant_gain_value() public returns (bool) {
        return _getBalanceSum() <= initialUserBalance;
 }

Where the _getBalanceSum function returns the sum of the two tokens that the user has provided as liquidity and the initialUserBalance is the result of the call to this function before the user adds liquidity to the pool.

Since it was already confirmed that the invariant didn’t hold in the system given that the attack was possible, after running the fuzzer for over an hour with the coverage never increasing after reaching a given value, this indicated that a local maxima was being reached that the fuzzer could not break out of. In other words, there were specific paths that when executed by the fuzzer always led to reverts and so the remainder of the code that should’ve been evaluated in the invariant test was never reached. This meant that even though the invariant was known to not hold on the system, it appeared that it did and would likely continue to even with more runs of the fuzzer.

To simplify the debugging process, I reduced the number of functions in my TargetFunctions contract (provided by the Recon bootstrapping), going through each to discover where they were reverting, starting with removeLiquidity in the Exchange contract which acts as a liquidity pool in the system:

function exchange_removeAllUserLiquidity(address _tokenRecipient) public {
    uint256 liquidityTokenBalanceOfCaller = exchange.balanceOf(msg.sender);
    if(liquidityTokenBalanceOfCaller > 0 && _tokenRecipient != address(0)) { 
      exchange.removeLiquidity(liquidityTokenBalanceOfCaller, 1, 1, _tokenRecipient, block.timestamp + 1);
    }
  }

When running Echidna in the default property testMode it tries using random sequences of function calls with random inputs to try and break the property. With this in mind, we can take a look at the output of the initial coverage report to start to understand the problem better:

From this, we can see that the second require statement is blocking further execution of the function. We’ll discuss more about how to resolve this below, but first, we’ll see why we’re likely to get stuck with these sorts of issues in the first place.

Given that using Echidna’s property testMode creates a randomized call sequence to try and break the invariant with random values, there is an explosion of the number of possible paths that the fuzzer could take. Without guidance, it can cause one of the calls in the sequence to revert. As a toy example, we can look at the following with the two functions of primary interest in our invariant:

mock simplified call sequence where letters (A,B, etc. indicate possible input values), r indicates a revert:

(1) 
removeLiquidity(A) 
addLiquidity(B) r

(2)
addLiquidity(C) r

(3)
addLiquidity(D) 
addLiquidity(E) 
removeLiquidity(F) r

For the above, if our input values are typed as uint256 there are 2^256 possible input values that could be tried for each function. Since we have two functions, the possibilities for each call are (2^256) * 2, and taking into account Echidna’s default seqLen of 100, we get 2^567 * 2^100 = 2^667 possible combinations. Clearly, this is a very large number, and luckily, Echidna has some logic under the hood with Slither that helps it use inputs that aren’t completely random, but we can see why our fuzzer might sometimes need further guidance.

Expanding The Coverage

When looking at the above coverage report two possible actions could influence the fuzzer’s ability to expand coverage and reach the lines that were showing up in red. The first was to clamp input values that the fuzzer would use so it wouldn’t get stuck trying values that would always revert. The second action is modifying the setup function used by our tests to create a new Exchange contract (where the removeLiquidity function is called) as the values that the Exchange get seeded, which influences the calculations in our function of interest. We’ll apply the first approach to our tests as it suits the issue identified.

Since the baseTokenQtyToReturn and quoteTokenQtyToReturn are calculated using the following formulas:

uint256 baseTokenQtyToReturn =
            (_liquidityTokenQty * baseTokenReserveQty) /
                totalSupplyOfLiquidityTokens;
uint256 quoteTokenQtyToReturn =
            (_liquidityTokenQty * quoteTokenReserveQty) /
                totalSupplyOfLiquidityTokens;

It is likely that the values passed in during transactions are causing these values to evaluate to 0, and since there’s an earlier require statement preventing _baseTokenQtyMin and _quoteTokenQtyMin from ever being 0 these values would be guaranteed to trigger the last two require statements highlighted in green. We can take a look at one of the calls in the corpus generated by echidna to confirm that this is the case:

From this corpus, we can create a simple unit test with the called values, using the convenient CryticToFoundry contract that Recon creates for us by default. Using the above calldata the foundry unit test we create looks like:

function test_removeAllLiquidity() public {
        console.log("liquidity token balance: ", exchange.balanceOf(address(0x10000)));
    
        vm.prank(address(0x10000));
        exchange.removeLiquidity(exchange.balanceOf(msg.sender), 1, 1, address(0x00000000000000000000000000000002fFffFffD), block.timestamp + 1);
    }

Running the test, we see that it fails with the following output and logging statements that we’ve added to our Exchange contract helps us see why

We see from the logs that the issue is due to a calculation resulting in baseTokenQtyToReturn = 0 that causes the previously mentioned revert to be triggered. On further inspection, I noticed that many of the calls in the corpus had the same issue because one of the addresses that Echidna calls our contract with hasn’t had any liquidity added (in this case, 0x10000), and so doesn’t have any liquidity tokens, causing this revert path to always be triggered.

To work around this and ensure that our contract is only called by accounts that have nonzero liquidity in the Exchange I added the following conditions to the function of interest in the TargetFunctions contract:

 function exchange_removeAllUserLiquidity() public {
    uint256 liquidityTokenBalanceOfCaller = exchange.balanceOf(msg.sender);
    if(liquidityTokenBalanceOfCaller == 0) {
      vm.prank(address(0x20000));
      exchange.removeLiquidity(exchange.balanceOf(address(0x20000)), 1, 1, address(0x20000), block.timestamp + 1);
    } else {
      exchange.removeLiquidity(liquidityTokenBalanceOfCaller, 1, 1, msg.sender, block.timestamp + 1);
    }
  }

This ensures that we don’t waste fuzz calls to the contract of interest that would always revert because the precondition checks if the liquidity token balance of the caller is 0. We use the 0x20000 address that was used for seeding liquidity in the Exchange contract in the initial setup as the caller instead and is guaranteed to have a nonzero liquidity token balance. This requires the use of the vm.prank cheatcode, you can learn more about using HEVM cheatcodes here.

Wrapping Up

To subsequently expand the coverage for the entire function call, we can rinse and repeat one of the two suggestions given above, ensuring that all paths are easily reachable by the fuzzer.

A word of caution: when clamping input values, there is always the possibility of biasing the fuzzer to avoiding certain real calls that would break an invariant, so this technique should be used sparingly.

It should also be noted that another way to assess which paths revert is to run Echidna with testMode set to assertion and introduce assertions into the paths in function calls to identify at which point they revert. If the assertion is reached and causes a test violation in Echidna, it means that the revert that’s blocking coverage is lower down in the call stack. This technique, along with the suggested methods for resolving low coverage, should help you expand your test coverage and give you greater confidence that all possible paths in your system have been evaluated.

A guest post by

nican0r

Web3 security researcher focused on fuzzing techniques @getreconxyz

Debugging Echidna Coverage

A real world example of how to debug Echidna's coverage report to optimize the fuzzer performance.

Fuzzing ElasticSwap

Expanding The Coverage

Wrapping Up

Discussion about this post