Formal Verification In Practice: Halmos, Hevm, Certora, and Ityfuzz

As a follow-up to An Introduction to Formal Verification Techniques and Tools, we will analyze 4 different examples in order to learn how to use formal verification and symbolic execution tools

Dec 19, 2023

Formal Verification In Practice: Halmos, Hevm, Certora, and Ityfuzz

As a follow-up to An Introduction to Formal Verification Techniques and Tools, we are going to analyze 4 different examples in order to learn how to use formal verification and symbolic execution tools. You can check out this repository and take a look at the examples from this post.

1. Contrat Vault

1.1 Explanation of the contract

It is a contract that manages assets and user shares.

State Variables:
- totalAssets: Total sum of assets in the vault.
- totalShares: Total sum of issued units.
Functions:
- deposit: Allows to add assets and obtain proportional shares.
- mint: Issues new units and allocates equivalent assets.

1.2 What is wrong with this contract

Attempts to calculate the number of assets equivalent to a certain number of new units, based on the current value per unit (which is totalAssets / totalShares).
However, if totalShares is 0 (which could happen if the vault is newly created and has no issued units yet), this operation would result in a division by zero, which is an invalid operation and would cause the transaction to be reversed.

1.3 TESTS

1.3.1 Explanation

We will now proceed with the test in order to demonstrate the problem.

check_deposit(uint assets):

The assert(A1 * S2 <= A2 * S1) verifies that the ratio of assets to equity does not decrease after a deposit.

check_mint(uint shares):

The assert(A1 * S2 <= A2 * S1) checks a similar property to check_deposit, but for the issue of units, ensuring that the ratio of assets to units does not decrease after the issue.

1.3.2 Test result

Halmos

halmos --contract VaultTest --solver-timeout-assertion 0

check_deposit:

Result: SUCCESS (WARNING)
The test passes, but a warning is issued because it is unknown whether there is a counterexample that would violate the assertion (Counterexample: unknown).

check_mint:

Result: FAILED
The test fails because a specific counterexample was found that violates the assertion in the contract.

The counterexample indicates that, with these particular values, the assertion that is expected to hold true fails.

This could be indicative of a division by zero or a calculation that does not comply with the expected business rules for the mint function.

Hevm

hevm test --match VaultHevmTest

The results indicate that there are problems in the implementation of the deposit and mint functions.

prove_mint(uint256)

Result: Failure.
Provided Counterexample: prove_mint(925195992671786116861126631025179451216601210845844914936195265033395958736) causes a revert.
Interpretation: This suggests that for the specified value of shares, the mint function does not behave as expected and results in an error, possibly due to division by zero or arithmetic overflow.

prove_deposit(uint256)

Result: Failure.
Additional Details: Not provided in the result.
Interpretation: Similar to prove_mint, the failure here indicates that under certain conditions.

Certora

certoraRun src/EXAMPLE_1/Vault.sol --verify Vault:test/EXAMPLE_1/Certora/Vault.spec

Rule depositIncreasesAssetsAndShares:

Result: FAILED
Description: The verification failed with the message "The relationship between totalAssets and totalShares must be kept constant." This suggests that the deposit function does not maintain a constant and proportional relationship between totalAssets and totalShares as expected.

Rule mintIncreasesAssetsAndShares:

Result: FAILED
Description: Similar to the previous rule, the verification failed indicating a problem with the mint function in maintaining the correct proportion between totalAssets and totalShares.

Summary:

Both the deposit and mint functions do not maintain a constant proportion between total assets and total shares, which could indicate a vulnerability in the contract.

Ityfuzz

The bug() keyword acts as a marker to point out possible critical points in the contract that could be susceptible to bugs or exploits.

solc *.sol -o . --bin --abi --overwrite --base-path ../../../

ityfuzz evm -t "*"

Results:

Reversions Detected:
- Test transactions indicate that both deposit and mint functions have reverted under certain conditions.
- This suggests that under specific input scenarios, the functions do not behave as expected.

SUMMARY

Halmos

✅

Hevm

✅

Certora

✅

Ityfuzz

✅

2. Contrat PostExample

2.1 Explanation of the contract

The specific condition in backdoor illustrates a scenario where standard security tools might fail to detect potentially dangerous or unexpected behavior.

2.2 What is wrong with this contract

Demonstrate how fuzzing tools and invariance tests can face difficulties with certain types of logical and mathematical conditions.

2.3 TESTS

2.3.1 Explanation

The main purpose of this test is to verify if the backdoor function behaves as expected with different input values x.

2.3.2 Test result

Halmos

halmos --contract PostExampleTest -vv

Test Result: FAILED.
Counterexample:
- Input Value: p_x_uint256 = 6912213124124532.
- Trace: A call followed by a static call leads to a REVERT, with the error code 0x4e487b71....

Failure Analysis: This result indicates that halmos correctly identified the path where the backdoor function fails when passed the specific value 6912213124124532, leading to an assertion violation and, consequently, the transaction's reversal.

Hevm

hevm test --match HevmPostExampleTest

hevm test --match HevmPostExampleTest --verbose 2 --debug --trace

Test Result: FAILED
Counterexample:
- Calldata: prove_Backdoor(6912213124124532).
- Result: Reverted (Revert: 0x4e487b71...), indicating that the transaction was reverted.

Failure Analysis: The failure occurs because passing 6912213124124532 to the backdoor function results in z becoming 6912213124124531, thus triggering the specific condition that causes the assert to fail and the transaction to revert.

Certora

certoraRun src/EXAMPLE_2/PostExample.sol --verify PostExample:test/EXAMPLE_2/Certora/backdor.spec

Rule 1: backdoorRevertsCorrectly:

Result: The rule SUCCESS.
This means that the backdoor function behaves as expected and reverts for this specific value.
This rule checks that the backdoor function correctly reverts when passed the value 6912213124124532 as argument x.

Rule 2: backdoorDoesNotRevertNormally:

Result: The rule FAILED.
This rule checks that the backdoor function does not revert for x values different from 6912213124124532.
This indicates that the backdoor function also reverts for at least one x value different from 6912213124124532, specifically for x=5. This result is unexpected and suggests that there is something in the code causing a revert even when it shouldn't.

Result Analysis:

The backdoor function should revert only for x=6912213124124532 according to its encoded logic.
The failure of the backdoorDoesNotRevertNormally rule suggests that there are other values for which the function reverts, which is an unexpected behavior.

Ityfuzz

The keyword "bug()" is added to seek or simulate potential vulnerabilities in this part of the contract.

solc *.sol -o . --bin --abi --overwrite --base-path ../../../

ityfuzz evm -t "*"

Result: Found violations!⇒
- ityfuzz has detected one or more vulnerabilities.
Transaction Details:
- The transaction involves calling the backdoor function with an extremely large argument (25696212022634510588554197811190732209697767440223.3793 Ether).
Vulnerability Analysis:
- The substantial amount of Ether used as an argument in the backdoor call suggests that the vulnerability might be related to improper handling of extremely large or unusual values.

SUMMARY

Halmos

✅

Hevm

✅

Certora

✅

Ityfuzz

✅

3. Contrat Counter

3.1 Explanation of the Contract

Public Variable:
- It has a public variable number that stores a numerical value.
SetNumber Function:
- This function allows updating the value of number.
- However, it includes a unique condition where if the newNumber is equal to a specific value (0xC0FFEE) and the inLuck boolean is true, the function will not update number and instead will revert the transaction. This reversion is signaled by a custom error CoffeeBreak.
Increment Function:
- This function increments the number by one each time it is called.

3.2 What is Wrong with This Contract

The Counter contract, though straightforward in its main functionality, introduces a potential problem in the setNumber function due to its special reversion condition:

Reversion Condition:
- The specific condition (newNumber == 0xC0FFEE && inLuck == true) that triggers the reversion is very particular and might not be immediately apparent or easily detectable in routine testing scenarios.
Potential Oversight in Testing:
- Standard unit testing might miss this conditional reversion as it would require the exact triggering values to be tested.
- Unless these specific values are included in the test cases, the reversion scenario might remain undetected.

3.3 TESTS

3.3.1 Explanation

Formal verification, unlike unit testing, can systematically analyze all possible inputs and code paths. This includes testing edge cases and specific conditions that unit tests might overlook.

3.3.2 Test result

Halmos

halmos --contract CounterHalmosTest --solver-timeout-assertion 0

check_testIncrement:
- Result: PASS
- Details: The increment() function worked as expected, increasing the number by one.
check_testSetNumber(uint256,bool):
- Result: PASS
- Details: Successfully tested various inputs for setNumber(). Asserted the correct setting of number.
check_testSetNumberRevertsOnCondition:
- Result: FAIL
- Details: The test expected a revert with a specific error ('CoffeeBreak') for a set input, but the revert didn’t occur as expected. This indicates a potential flaw or missed condition in the setNumber() function logic.
check_testFSetNumber:
- Result: PASS
- Details: Indicates that setNumber correctly updates number to x under the tested conditions.

Hevm

hevm test --match CounterHevmTest --verbose 2

prove_testIncrement():
- Result: PASSED
- Validates the increment function increases number by 1.
prove_testSetNumber(uint256, bool):
- Result: PASSED
- Confirms setNumber updates number correctly for varied inputs.
prove_Number(uint256, bool):
- Result: PASSED
- Ensures setNumber behaves correctly for special case inputs and normal cases.

Certora

setNumberBehavior(uint256 newNumber, bool inLuck):
- Objective: To verify the setNumber function's behavior under different conditions.
- Test:
  - When newNumber equals 0xC0FFEE and inLuck is true, it's expected that the function reverts (simulating a CoffeeBreak scenario).
  - In other cases, number should be updated to newNumber, indicating normal behavior.
incrementIncreasesNumber():
- Objective: To confirm that the increment function correctly increases the number by 1.
- Test:
  - It checks the value of number before and after the increment call, ensuring the increase is exactly by 1.

certoraRun src/EXAMPLE_3/Counter.sol --verify Counter:test/EXAMPLE_3/Certora/Counter.spec

setNumberBehavior:
- Result: VIOLATED.
  - Reason: The assertion "Number should be updated to newNumber if reversion conditions are not met" failed.
  - Implication: In at least one case, the number was not correctly updated to newNumber when it should have been. This suggests a flaw in the setNumber logic under certain conditions.
incrementIncreasesNumber:
- Result: SUCCESS.
  - Implication: The increment function behaves as expected by correctly incrementing the number.

Ityfuzz

The keyword "bug()" is added to seek or simulate potential vulnerabilities in this part of the contract.

solc *.sol -o . --bin --abi --overwrite --base-path ../../../

ityfuzz evm -t "*"

First Testcase:
- Result: SUCCESS
- The test involves calling the increment() function.
- It does not revert (Reverted? false), indicating that the execution was successful.
- The code coverage is 68.81% for instructions and 54.55% for branches.
Second Testcase:
- Result: FAILED
- Involves a series of calls to increment() followed by setNumber(4844365416753630588109915963476993855667914754127.7665 Ether, true).
- A reversal is detected (Reverted? true), suggesting unexpected behavior or a potential vulnerability.
Vulnerability Detection:
- ityfuzz indicates that vulnerabilities have been found (😊😊 Found violations!).
- A "Trace" is provided showing a specific transaction leading to the violation: setNumber(590.2958 Ether, false).

SUMMARY

Halmos

❌ ✅

Hevm

❌ ✅

Certora

✅

Ityfuzz

✅

4. Contrat Token

4.1 Explanation of the contract

Mint Function:
- Allows the owner to create (mint) new tokens and allocate them to any user address. This function is restricted to be callable only by the owner.
Transfer Function:
- Enables users to transfer their tokens to another address. It deducts the specified amount from the sender's balance and adds it to the recipient's balance.

4.2 What is wrong with this contract

Try out different ways of writing tests using formal verification tools.

4.3 TESTS

4.3.1 Explanation

Same Sender and Receiver:
- If the from and to addresses are the same, the test verifies that their token balances remain unchanged.
- This scenario checks for internal transfers where users send tokens to themselves.
Different Sender and Receiver:
- If the from and to addresses are different, the test verifies that:
  - The sender's (from) balance decreases by the amount transferred.
  - The receiver's (to) balance increases by the amount transferred.

4.3.2 Test result

Halmos

We are going to use the Halmos cheatcodes to create address and give them a balance.

SetUp Method:

We create alice, bob, Eve, owner accounts with the cheatcode "createAddress".
We give you a balance with the cheatcode "createUint256".
The owner address is used to mint the tokens, simulating ownership control.
- Mints tokens to alice, bob, and Eve.

Test Logic:

Assumes the from address has a sufficient token balance for the transfer.
Records token balances of from and to addresses before executing the transfer.
Conducts the transfer and checks post-transfer balances:
- If from and to are the same, balances should remain unchanged.
- Otherwise, the balance of from should decrease by amount, and to should increase by amount.

halmos --contract HalmosTokenTest --solver-timeout-assertion 0 -v

Result check_Transfer:

Outcome: PASS.
Interpretation:
- The successful execution of the check_Transfer method implies that the transfer function behaves as expected under various conditions.
- The test explores 9 potential paths, with 2 paths involving possible assertion violations. However, these paths are deemed invalid and ignored, indicating no actual violations were found in those scenarios.

Hevm

SetUp :

Initializes the Token contract and mints tokens to alice, bob, and Eve with 10, 20, and 30 ether respectively.

Test Logic:

Assumes sufficient balance for the from address before making a transfer.
Records the balances of from and to addresses before the transfer.
Executes the transfer of amount from from to to.
Verifies if the balances are updated correctly after the transfer:
- If from and to are the same, balances should remain unchanged.
- Otherwise, the balance of from should decrease by amount, and the balance of to should increase by amount.

hevm test --match hevmTokenTest --verbose 2

Result prove_Transfer:

Outcome: FAIL.
Interpretation:
- The failure of the prove_Transfer method indicates an issue with the transfer functionality in the Token contract.

Certora

certoraRun src/EXAMPLE_4/Token.sol --verify Token:test/EXAMPLE_4/Certora/TokenCertora.spec

Verified Rules Successfully

mintIncreasesBalance
- Verification: Not Violated Time: 5 seconds
- Description: Verifies that mint correctly increases the balance

Failed Rules

transferIncreasesRecipientBalance
- Verification: Violated
- Description: Should verify that transfer increases the recipient's balance
- Failure Message: "Transfer should increase recipient balance"
  - This rule failed, indicating a violation case was found, meaning a counterexample where transfer did not increase the recipient's balance as expected.
transferDecreasesSenderBalance
- Verification: Violated
- Description: Should verify that transfer decreases the sender's balance
- Failure Message: "Transfer should decrease sender balance”
  - This rule failed, indicating a violation case was found, meaning a counterexample where transfer did not decrease the sender's balance as expected.

Ityfuzz

solc *.sol -o . --bin --abi --overwrite --base-path ../../../

ityfuzz evm -t "*"

mint Function:

Sender: 0x68Dd4F5AC792eAaa5e36f4f4e0474E0625dc9024
Value: 12088324826350109117091568176438021967062.6661 Ether
Result: Reverted with "Reverted? true"
Details: An attempt to call mint on address 0x0000000000000000000000000000000000000000 with an excessively large amount.
Violation: Violated the rule that verifies the mint function's correct balance increase.

transfer Function:

Sender: 0xe1A425f1AC34A8a441566f93c82dD730639c8510
Value: 328319314958874220607.2403 Ether
Result: Reverted with "Reverted? true"
Details: An attempt to call transfer with an excessively large value.
Violation: Violated the rule that verifies the transfer function's correct sender balance decrease.

Conclusion

We have analyzed 4 different contracts and compared different formal verification and symbolic execution tools in terms of user experience and bug-finding capabilities. In future research, we aim to extend this analysis and include other examples and use cases.

Was it useful? Help us to improve!

With your feedback, we can improve the letter. Click on a link to vote:

A guest post by

secoalba

I have a strong interest in blockchain development and security, particularly in the development and utilization of tools for smart contract security, such as formal verification and fuzz testing.

Bogdan

Mar 6, 2024

Hi Secoalba,

Great article!

I just wanted to point out to the possible issue with a first example (the one with the code similar to the ERC-4626).

In case of `deposit` function, if `totalAssets` has a value of 0, is should also revert, therefore the issue is not only with `mint` function. Any of the other tools you were testing, where during setup the values are not explicitly 0 for `totalSupply` and `totalAssets` should revert (every tool except `hevm`, which as explicitly set these to non-zero value).

Discussion about this post

Ready for more?