Learnings from 6 weeks of fuzzing Badger DAO's eBTC protocol
Key Takeaways and best practices from a comprehensive Formal Specification & Invariant Testing Audit
Note: this post was originally posted here.
Overview
Over a period of six weeks, I had the opportunity to work in collaboration with Badger DAO to enhance the invariant tests of their eBTC protocol. During this time, we established and implemented over 40 properties, validated findings from Spearbit's manual review, and uncovered previously undisclosed bugs. These issues were addressed by the team, which then set up a continuous fuzzing integration to catch broken invariants as the code changes.
In this article, I will share our key takeaways and best practices from this comprehensive formal specification & invariant testing audit.
A brief introduction to eBTC
eBTC is a collateralized crypto asset soft pegged to the price of Bitcoin and built on the Ethereum network. It is backed exclusively by Lido Staked ETH (stETH) and built on top of Liquity's LUSD model to maintain its peg stability. Users create Collateralized Debt Positions (CDPs) by depositing stETH into the protocol, which allows them to mint eBTC that can later be used in DeFi and redeemed back for stETH.
Key Takeaways
1. Echidna is still the best tool for the job
When starting a fuzzing engagement, a common question we ask ourselves is which tool we should pick. Although, in theory, you could build a fuzzer-agnostic contract, in reality, the assertion methods and overall structure of the target contract are somewhat program-dependent, which unfortunately makes it easier to just pick one tool and stick with it.
Throughout the engagement, we evaluated Echidna, Foundry, and Medusa as the three most promising open-source fuzzers for Solidity smart contracts.
Despite some benchmarks showing that Foundry and Medusa can be faster than Echidna in some cases, Foundry's lack of coverage reports for invariant tests and Medusa's incipient sequence shrinking are huge steps back for stateful fuzzing.
Indeed, some of the most impactful refactoring applied to the existing codebase was only made possible after coverage reports revealed that some code paths were not even being hit by the fuzzer.
Only after reaching 100% line coverage were we able to finally start breaking invariants and finding bugs, and only with short, reproducible tests could we properly understand the root cause of the problems. Until these two key features land in Foundry and Medusa, we recommend using them alongside Echidna rather than as a complete replacement.
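As a minimal sketch of how we drove coverage-guided runs, an Echidna configuration along these lines persists a corpus (from which Echidna generates its coverage reports) and keeps shrinking enabled; the values are illustrative and should be tuned per campaign:

```yaml
# echidna.yaml -- illustrative configuration, tune limits to your campaign
testMode: assertion        # run assertion-based properties
testLimit: 1000000         # number of transactions to attempt
seqLen: 100                # max length of a transaction sequence
shrinkLimit: 5000          # effort spent shrinking failing sequences
corpusDir: echidna-corpus  # persist the corpus; coverage reports are written here
```

Persisting the corpus between runs also means subsequent campaigns start from inputs that already reach deep code paths instead of rediscovering them from scratch.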
2. Speed up your debugging process with test converters
One of the biggest challenges of fuzz testing is understanding what has happened.
After the fuzzer throws different values at your contract and you apply multiple `bound` or `clampBetween` calls to restrict the random input in your target function, it becomes even harder to understand how exactly the initial input relates to the final state of your contracts.
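For context, a target function typically clamps the fuzzer's raw inputs before forwarding them to the protocol. A hedged sketch, where `clampBetween` is the crytic/properties helper and the `openCdp` call, `MIN_COLL`, and `_maxDebtFor` are illustrative names rather than eBTC's actual API:

```solidity
// Sketch of an Echidna target function; protocol calls are illustrative.
function openCdp(uint256 _coll, uint256 _debt) public {
    // Restrict the fuzzer's raw 256-bit inputs to a meaningful range.
    _coll = clampBetween(_coll, MIN_COLL, stETH.balanceOf(address(this)));
    _debt = clampBetween(_debt, 1, _maxDebtFor(_coll)); // hypothetical helper
    borrowerOperations.openCdp(_debt, bytes32(0), bytes32(0), _coll);
}
```

After two or three such clamps, the value that actually reached the protocol bears little resemblance to the value Echidna printed in its sequence, which is precisely what makes failed runs hard to read.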
Instead of emitting events (which is the officially recommended approach for debugging failed properties) or using call summaries (an alternative debugging approach commonly used in Foundry invariant tests), we propose a novel methodology: creating test converters.
A test converter is an auxiliary test contract, `EchidnaToFoundry.t.sol`, created with the sole purpose of converting "Echidna sequences" into "Foundry unit tests".
This file contains a contract with the same target functions as Echidna, which are used to convert "random inputs" back into hardcoded inputs. As a result, we get very small unit tests, which allow us to extend them, integrate them into the test suite, and debug more easily by inspecting other state variables. Most importantly, it avoids the process of adding events to Echidna's target contract, which usually means running the tool over and over again until the bug is found.
With proper code compartmentalization, this test converter can be extracted directly from the Echidna contract. Therefore, think ahead and plan accordingly when you start your fuzzing campaign. An important note is that the test setup should be consistent between Foundry and Echidna, including the initial balances, so that you can easily compare the final state of the Echidna run with that of Foundry.
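As a sketch of the idea (contract names, function names, and values are illustrative, not taken from the actual engagement), the converter shares the Echidna contract's setup and target functions and replays a shrunk sequence with its inputs hardcoded:

```solidity
// EchidnaToFoundry.t.sol -- replays a shrunk Echidna sequence as a unit test.
// TargetFunctions is assumed to hold the same targets Echidna fuzzes.
contract EchidnaToFoundry is Test, TargetFunctions {
    function setUp() public {
        _setUp(); // same deployment and initial balances as Echidna's constructor
    }

    // Echidna sequence: openCdp(...); setEthPerShare(...); liquidateCdps(...)
    function testBrokenLiquidationInvariant() public {
        openCdp(2 ether, 0.05e18); // hardcoded inputs taken from the sequence
        setEthPerShare(0.9e18);    // rebase down to make the CDP unsafe
        liquidateCdps(1);          // the property assertion fires inside
    }
}
```

Because the setup is identical, the unit test ends in the same broken state as the Echidna run, but now under a debugger with full access to traces and state variables.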
3. Leverage existing tooling for a faster setup time
Despite the clear benefits of stateful invariant testing, the friction in setting up the environment is significantly higher than with simple Foundry fuzz tests. This is why you should leverage existing tooling to speed up this process.
One great tool that helps in the creation of Echidna tests is Trail of Bits' @crytic/properties repository, which comes with useful helper functions and pre-built properties for common ERC standards.
During the course of this project, we have extended the repository with additional helper contracts, Dockerfiles, and other boilerplate code that will hopefully make it easier for new projects to bootstrap their invariant tests.
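As an illustration of the kind of reuse this enables, a harness can inherit the repository's assertion helpers instead of reimplementing them; the import path reflects the repository layout at the time of writing, and the getters in the property are hypothetical stand-ins:

```solidity
// Illustrative harness built on crytic/properties helpers.
import {PropertiesAsserts} from "properties/util/PropertiesHelper.sol";

contract EBTCProperties is PropertiesAsserts {
    function depositAndCheck(uint256 amount) public {
        // clampBetween and assertGte come from PropertiesAsserts.
        amount = clampBetween(amount, 1, 1e27);
        deposit(amount); // hypothetical target function
        assertGte(systemCollateralValue(), systemDebtValue(),
            "collateral must always cover debt"); // hypothetical getters
    }
}
```

Helpers like `assertGte` emit the compared values on failure, which removes much of the manual event plumbing described earlier.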
Badger DAO will soon host a Code4rena competition, after which people will be invited to fork its invariants and break the code, so watch out for the launch of a Bug Bounty program which will grant access to their invariant testing setup.
4. Beware of Implicit Bias in Simplifying Assumptions
Many times during our engagement, we faced test cases that were difficult to understand. To better grasp why they were failing, we adopted some simplifying assumptions. However, this did not always prove to be the best strategy, as it sometimes hid the root cause behind false negatives:
4.1. Be careful with simplified values
While "beautiful values" make it easier to reproduce bugs, they can mask real-world issues. For instance, when analyzing an accounting invariant related to collateral and debt values, we initially ignored stETH rebasing and treated $1 worth of collateral as equal to $1 worth of debt, aiming for an exact calculation of the dollar value of these quantities.
This approach allowed for simple calculations and easy-to-understand values when analyzed on a spreadsheet. However, it would instantly fall apart when the fuzzer began testing again.
After extensive debugging, we realized that our simplifying assumptions caused our formulas to fail, either due to a missed rebasing or price conversion, or because of a missing division or multiplication by a constant factor. In the end, we spent a significant amount of time debugging our simplistic equations, which made us realize that this strategy wasn't the most effective after all.
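The more robust formulation keeps every conversion explicit instead of assuming it away. A hedged sketch of what that looks like, where the getter names are illustrative rather than eBTC's actual interface:

```solidity
// Compare collateral and debt in the same unit, keeping every conversion
// explicit: shares -> stETH via the rebasing index, stETH -> BTC via the oracle.
function echidna_collateral_covers_debt() public returns (bool) {
    uint256 collShares = cdpManager.getSystemCollShares();           // illustrative getter
    uint256 collStEth = collateral.getPooledEthByShares(collShares); // apply rebasing index
    uint256 collInBtc = (collStEth * priceFeed.fetchPrice()) / 1e18; // stETH/BTC oracle price
    return collInBtc >= cdpManager.getSystemDebt();                  // debt denominated in eBTC
}
```

The values are uglier to eyeball on a spreadsheet, but the property no longer breaks for reasons that have nothing to do with the protocol.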
4.2. Beware when excluding features from your tests
When the system has multiple public/external functions as entry points, it might be tempting to reduce the number of target functions, especially if they offer similar functionalities. However, depending on the internals of these functions, relying on just one instead of the other can conceal critical bugs from the fuzzer.
A case in point was a bug related to the liquidation of CDPs, which was only discovered in a function that would liquidate a random CDP but would not trigger in a similar one designed to liquidate CDPs in order of their individual collateral ratio (ICR).
4.3 Avoid false negatives for invariants that must always hold
Using `require` checks (or `vm.assume`) can also introduce false negatives, especially when you are targeting invariants that must always hold.
For instance, if your invariant expects a particular transaction to never fail under specific conditions, you also need to ensure that no failure case exists. By testing only the happy path, you might overlook potential denial-of-service issues that could otherwise breach your invariant.
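Concretely, instead of `require`-ing the call to succeed, which silently discards the reverting path, the property can assert that it succeeds. In this sketch, `assertWithMsg` and `clampBetween` are the crytic/properties helpers, while the repay call and `cdpDebtOf` are illustrative:

```solidity
// Anti-pattern: require(success) throws away the reverting path, so a
// denial-of-service on this call would never be reported.
// Better: assert that the call can never fail under the stated preconditions.
function repayDebt(uint256 amount) public {
    amount = clampBetween(amount, 1, cdpDebtOf(msg.sender)); // hypothetical bound
    (bool success, ) = address(borrowerOperations).call(
        abi.encodeWithSignature("repayDebt(uint256)", amount) // illustrative call
    );
    assertWithMsg(success, "repaying within the CDP's debt must never revert");
}
```

With this shape, an unexpected revert surfaces as a broken property instead of being filtered out of the corpus.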
5. Mathematically prove your invariants whenever possible
Sometimes, an invariant might keep breaking because you believe it to be true, but you haven't thoroughly proven its veracity. In such instances, it can be beneficial to jot down the mathematical equation representing that invariant and attempt to prove it for all scenarios.
One such case was the "TCR must increase after liquidations" invariant. We believed it to hold true in all situations, but it was mathematically demonstrated to be invalid during redistributions. Specifically, during a liquidation that involves redistribution of bad debt, a portion of the CDP's debt remains in the system after the liquidator claims the collateral, which can further lower the TCR.
This is an intended behavior and a non-issue, but it illustrates that the invariant wasn't precisely what we had imagined when we started.
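The algebra can be written down directly. With total collateral $C$ and total debt $D$ (expressed in the same unit), a liquidated CDP holding collateral $c$ and debt $d$, and only $d_b \le d$ of that debt actually burned (the remainder redistributed):

$$TCR = \frac{C}{D}, \qquad TCR' = \frac{C - c}{D - d_b}$$

In a full liquidation ($d_b = d$), expanding $TCR' > TCR$ gives $Cd > Dc$, i.e. the TCR increases exactly when the CDP's ICR $c/d$ is below the TCR, which matched our intuition. With redistribution, $d_b < d$ shrinks the denominator by less, so $TCR'$ can fall below $TCR$ even when liquidating an underwater CDP.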
Conclusion
Working alongside Badger DAO's team, especially @GalloDaSballo and @dapp_whisperer, was an experience full of learning. Their emphasis on security was unmatched, and their patience in reviewing, debugging, and triaging false positives contributed greatly to the success of this engagement.
Security reviews, especially with protocols as critical as financial ones, are essential in the blockchain world. This engagement not only reinforced the strength of this methodology but also provided valuable insights into optimizing testing processes.
If you want to discuss invariant testing strategies, don't hesitate to reach out.