Hachi

We do security research on Cardano


overzZzealous: the Peculiar Semantics of || and && in Plutus Core

Contents

TL;DR: We identified an issue in the semantics of Plutus Core, which we refer to as overzZzealous. This issue may slow down and raise the cost of evaluating code on the Cardano blockchain, or even cause validators to reject transactions and thus block assets. Since developers rarely engage with Plutus Core directly, most are oblivious to overzZzealous. Furthermore, as the problem comes from the design of Plutus Core, it is challenging to address. It has been reported to the Plutus team.

Introduction

One never-ending struggle in computing is to guarantee that code behaves as intended. Developers want to write programs whose behaviour aligns with their intentions and want interpreters/compilers to yield predictable results. In a well-designed and well-documented language, it is harder to make mistakes: either because the behaviour of the language is predictable and consistent with our expectations or because the type system prevents some classes of bugs from happening. If one of these points does not hold and the code does not behave as intended, we find ourselves stuck with a bug whose severity can range from mild to catastrophic.

In the Cardano environment, the language designers pay attention to the above points, which is extremely important as the pieces of code that are being written are meant to run on the blockchain, where they can manipulate all sorts of data, including that with monetary value. For instance, the Plutus language for writing off- and on-chain code is based on Haskell, a mature, strongly-typed, lazy, functional language with a rich ecosystem. However, contrary to Haskell and Plutus, the on-chain representation in Plutus Core – which Plutus compiles to – has strict semantics with the aim of making it more predictable for programmers.

Unfortunately, the choice of strict semantics does not come without problems. These inevitably find their way into the Cardano codebase in various places. This article shows that combining Haskell with strict semantics can have unexpected and potentially dangerous consequences. We discovered that || and && have strict semantics in Plutus Core, even though these operators typically have lazy semantics even in programming languages that are otherwise strict. We name this peculiar behaviour overzZzealous.

(See also our report of this behaviour to the Plutus team.)

The rest of this blog post consists of three parts. In the first part, we discuss strict and lazy semantics of || and && in several programming languages, bring the design of Plutus Core into context, and finally show how overzZzealous manifests itself. In the second part, we guide the reader step-by-step into reproducing and exhibiting this peculiar behaviour. Finally, in the third part, we discuss whether this is a problem or not (spoiler: in our opinion, it is).

About (Non-)Laziness of Boolean Operators

Lazy Operators and Strict Function Application

Programming languages usually include boolean operators “or” and “and”, often written as || and && respectively. For instance, we can express that a number n is either smaller than 10 or bigger than 100 and divisible by 3 (ie. equal to 0 modulo 3) with:

(n < 10 || n > 100) && (n mod 3 == 0)

In most programming languages (eg. C, Java, OCaml), these operators are lazy. This means that they first evaluate their left-hand side and then evaluate their right-hand side only if it is necessary. For instance, in n < 10 || n > 100, the sub-expression n > 100 is evaluated only if n < 10 evaluated to false. In other words, a || b behaves exactly like if a then true else b. Meanwhile, a && b behaves exactly like if a then b else false where b is evaluated only if a evaluates to true.

This behaviour differs from strict function application, where we evaluate all the arguments first. This is known as the strict semantics of function application. For instance, in OCaml, if we were to define our own “or” and “and” functions and use them to rewrite the above expression, such as

let my_or a b = if a then true else b
and my_and a b = if a then b else false

in my_and (my_or (n < 10) (n > 100)) (n mod 3 == 0)

then all the arguments (n < 10, n > 100 and n mod 3 == 0) are always evaluated. In general, strict languages differentiate between normal functions and operators such as || and &&, whose semantics cannot be reproduced easily with functions.

Haskell and Off-chain Plutus’ Pervasive Laziness

Unlike OCaml, Haskell features lazy semantics of function application. Arguments of a function call are only evaluated if the body of the function uses them. Consider the following Haskell code:

let my_or a b = if a then True else b
    my_and a b = if a then b else False

in my_and (my_or (n < 10) (n > 100)) (n `mod` 3 == 0)

In the program above, we only evaluate n > 100 if n < 10 evaluates to False. In such a context, lazy || and && are not special in any way: they are just normal functions in a world where function application is always lazy.

When off-chain (that is, when running anywhere else other than on the Cardano blockchain), Plutus is simply Haskell with a Cardano-specific standard library. Therefore, it shares the lazy semantics of function application. In fact, || and && are defined as ordinary functions in Plutus’ standard library. For example, || is defined as:

(||) :: Bool -> Bool -> Bool
(||) l r = if l then True else r

On-chain Plutus’ Peculiar Semantics

Although on-chain Plutus shares the same syntax as Plutus and Haskell, its compilation pipeline, design, and semantics are entirely different from off-chain code. Two essential design choices matter for us in relation to overzZzealous:

  1. On-chain Plutus is built on top of Haskell, allowing it to enjoy Haskell’s mature compiler and rich ecosystem, particularly Haskell’s syntax and type system, which have been carefully refined over the years.
  2. On-chain Plutus is a language with strict function application. This is motivated by the fact that it allows for better predictability of what is going to be computed and how, an essential property for on-chain code.

However, together, these two reasonable design choices combine into an unfortunate mixture: || and &&, as in off-chain Plutus, are functions and function application is strict, meaning that || and && will always evaluate both their arguments. This behaviour can come as a confusing feature as the usual understanding of these two operators (or functions depending on the language) is that they are lazy.

This behaviour surprised us, and we believe this will surprise many other developers. Introducing unexpected behaviour in their programs has these two critical effects:

  1. Since programmers assume that || and && are lazy, they will not expect that their program becomes overzealous and runs as much code by constantly evaluating the right-hand sides of boolean operators. For on-chain code that runs in an environment with limited resources and where processing costs gas/fee/money, this can become expensive.
  2. When the right-hand side includes side effects, such as throwing an error in the relatively common conditionMet || error () pattern, this can completely change the behaviour of a piece of code. For example, it may cause a validator to reject supposedly-valid transactions and potentially block assets.

Demonstrating overzZzealous

We want to demonstrate that overzZzealous can occur in a trustworthy environment. For that, we chose to reproduce the issue within the Plutus Starter project.

Cloning the Repository

Clone our fork of the Plutus Starter project. It takes the main branch of the upstream project, as of commit 48ab4d5 (October 12, 2021), and adds one commit updating the README and adding two Shell scripts to help reproduce the issue.

Playing the Game

As a first step, let us follow the Plutus Application Backend (PAB) example described in its README. It describes an example application with which “we can serve and interact with contracts over a web API”. By default, the PAB is configured with a guessing game contract named Game, located at ./examples/src/Plutus/Contracts/Game.hs.

Let us build and run the PAB executable with cabal run plutus-starter-pab. Once this is done, we can play the game with the Shell script ./play-game.sh. As it provides a wrong guess by default, the last line of the PAB executable’s output should look like this:

[WARNING] W8e8fdb9: Validation error: Phase2 4f530b2a2d96fa34e6b1393fca5b661eb36b6a83c5ec271d4d1316f1a329586b: ScriptFailure (EvaluationError ["PT5"] "CekEvaluationFailure")

Now let us restart the PAB executable: we quit the previous execution with <enter> and run cabal run plutus-starter-pab again. Once this is done, we can play the game again with the correct answer this time. This can be done by running ./play-game.sh with the --guess-right argument. This time, the last four lines of the PAB executable’s output should look like the following:

[INFO] W162f384: Finished balancing. fd03c93cec942edbc2f2547d33a7a3dc45551230638bcb1fe9fb27082d087411
[INFO] W162f384: Submitting tx: fd03c93cec942edbc2f2547d33a7a3dc45551230638bcb1fe9fb27082d087411
[INFO] 25456589-c00f-4f3a-8b14-d8ce089e7d5f: "Waiting for guess or lock endpoint..."
[INFO] Slot 26: TxnValidate fd03c93cec942edbc2f2547d33a7a3dc45551230638bcb1fe9fb27082d087411

If everything works smoothly, we are ready to change the code of the on-chain validator to include || and && and exhibit overzZzealous.

Changing the Semantics?

To expose this issue, let us consider the following lines from the Game contract in ./examples/src/Plutus/Contracts/Game.hs:

gameInstance :: Scripts.TypedValidator Game
gameInstance = Scripts.mkTypedValidator @Game
    $$(PlutusTx.compile [|| validateGuess ||])
    $$(PlutusTx.compile [|| wrap ||]) where
        wrap = Scripts.wrapValidator @HashedString @ClearString

-- [...] --

-- | The validation function (Datum -> Redeemer -> ScriptContext -> Bool)
validateGuess :: HashedString -> ClearString -> ScriptContext -> Bool
validateGuess hs cs _ = isGoodGuess hs cs

isGoodGuess :: HashedString -> ClearString -> Bool
isGoodGuess (HashedString actual) (ClearString guess') = actual == sha2_256 guess'

In particular, we can see that validateGuess is meant to be run on-chain because it shows up as an argument to PlutusTx.compile. For now, it only calls the auxiliary function isGoodGuess.

Let us change that slightly to something that should have the same semantics but uses the || operator. This can be achieved by calling the Shell script ./modify-code.sh with the argument --with-or. It makes the following modification to the source code:

diff --git a/examples/src/Plutus/Contracts/Game.hs b/examples/src/Plutus/Contracts/Game.hs
index 74625d2..b639356 100644
--- a/examples/src/Plutus/Contracts/Game.hs
+++ b/examples/src/Plutus/Contracts/Game.hs
@@ -94,7 +94,7 @@ clearString = ClearString . toBuiltin . C.pack

 -- | The validation function (Datum -> Redeemer -> ScriptContext -> Bool)
 validateGuess :: HashedString -> ClearString -> ScriptContext -> Bool
-validateGuess hs cs _ = isGoodGuess hs cs
+validateGuess hs cs _ = if True || error () then isGoodGuess hs cs else True

 isGoodGuess :: HashedString -> ClearString -> Bool
 isGoodGuess (HashedString actual) (ClearString guess') = actual == sha2_256 guess'

If the semantics of || were lazy, the expression True || error () should evaluate to True without ever evaluating the error () on the right-hand side. Therefore, this whole expression should always call isGoodGuess hs cs in the same way as before.

Playing Again and— Wait What?

Let us then play the game again. We restart the PAB executable, and we run ./play-game.sh again. This time, regardless of whether we provide the right or wrong answer, the transaction cannot be validated! We always get a validation error like the one below:

[WARNING] W8e8fdb9: Validation error: Phase2 4f530b2a2d96fa34e6b1393fca5b661eb36b6a83c5ec271d4d1316f1a329586b: ScriptFailure (EvaluationError ["PT5"] "CekEvaluationFailure")

What happened here? Contrary to our expectation, the || is not evaluated lazily: both sides are always evaluated, no matter what. Since the right-hand side calls error (), the validator fails systematically instead of checking whether the guess is right or wrong.

Is This an Issue?

We firmly believe that it is. Let us explore why in more detail.

Semantic Side-Effects

Firstly, overzZzealous may introduce bugs when in the presence of side effects. In pure code, there is no issue (other than a performance issue) since both lazy and strict evaluations produce the same results. However, code with side-effects such as trueCondition || error () would invalidate transactions that are meant to be valid. In reality, the bug may not be as apparent as in our example, as the side effect may be hidden deep inside a complex source code. A non-exhaustive test suite might fail to detect it, leading to forever-locked assets once the code gets deployed on-chain.

Performance

Secondly, on a blockchain such as Cardano, where evaluating code costs gas/fee/money, this issue would lead to users spending more fees and time evaluating unnecessary code. Most of the code we have seen deployed on the chain makes use of explicit case-expressions, but the cheapCheck || expensiveChecks pattern is still quite popular.

We have found an instance of this pattern in plutus-use-cases where the following innocent-looking code requires more resources than an unnatural-looking if-then-else.

-- || is surprisingly way more expensive!
isUpdateValid = (not isCurrentValueSigned) ||
    (fromMaybe False $ validateGameStatusChanges <$>
    (osmGameStatus <$> extractSigendMessage (ovSignedMessage oracleData)) <*>
    (osmGameStatus <$> extractSigendMessage outputSignedMessage))

-- if-then-else is way cheaper!
isUpdateValid = if not isCurrentValueSigned then True else
    (fromMaybe False $ validateGameStatusChanges <$>
    (osmGameStatus <$> extractSigendMessage (ovSignedMessage oracleData)) <*>
    (osmGameStatus <$> extractSigendMessage outputSignedMessage))

We calculated the required budget using writeScriptsTo as documented in the official how-to.

writeScriptsTo
  (ScriptsConfig "." (Scripts UnappliedValidators))
  "updateOracleTrace"
  Spec.Oracle.updateOracleTrace
  def

The difference is rather significant, as shown in the following table. “Size” represents the size of the script in bytes; “CPU” and “Memory” are made-up units whose actual value is up to the ledger to decide.

WithSizeCPUMemory
||9,5391,124,168,9563,152,700
if-then-else3,531698,345,7672,069,978

When the cheap check is satisfied, the strict || still evaluates the expensive right-hand side, which yields a much larger script and requires over 1.5 times more resources in terms of both, CPU and memory consumption, than a short-circuit solution with if-then-else. This overzealous behaviour might be extremely costly and wasteful if a frequently updated oracle falls into this case. To reproduce our findings, one can run the experiment in this branch to find out more.

Confusing Behaviour

The two previous issues might seem unproblematic considering that programmers “just” have to know about overzZzealous to avoid it. However, we believe that the strict semantics will confuse programmers. While we do not necessarily think that short-circuit operators are superior, || and && simply behave so in most popular languages. The most popular linter for Haskell, hlint, even suggests to replace if-then-else with || when possible:

src/Contracts/Oracle/OnChain.hs:(136,21)-(139,69): Warning: Redundant if
Found:
  if not isCurrentValueSigned then
      True
  else
      (fromMaybe False
         $ validateGameStatusChanges
             <$>
               (osmGameStatus
                  <$> extractSigendMessage (ovSignedMessage oracleData))
             <*> (osmGameStatus <$> extractSigendMessage outputSignedMessage))
Perhaps:
  not isCurrentValueSigned
    ||
      (fromMaybe False
         $ validateGameStatusChanges
             <$>
               (osmGameStatus
                  <$> extractSigendMessage (ovSignedMessage oracleData))
             <*> (osmGameStatus <$> extractSigendMessage outputSignedMessage))

A contract may have on-chain and off-chain code located close to each other. It is worrying how two occurences of || or && just a few lines apart in the same module can have different semantics. This is a disaster waiting to happen. && is even more common than || with patterns such as con1 && con2 && con3. Even though most APIs would invalidate and refuse to submit the transaction off-chain for an early False, they are still wasteful calculations that could be saved. One can never save enough resource in blockchain tech.

At Hachi, we only discovered this surprising behaviour while developing a concolic execution engine for Plutus Core. The community’s responses on our report to the Plutus team suggest that others are also surprised. We also received several private complaints from other developers in the ecosystem. We genuinely empathise with the people who fell into this trap and had to rewrite their code less idiomatically.

Solutions

We observed two kinds of mitigations to this issue which dApp developers currently employ: some rewrite their usage of || and && as if-then-else to recover the intended semantics. Many others write their own ad-hoc boolean functions and lazy lists in order to emulate the behaviour which they expected from these operators. It is unpleasant for the people who know of this behaviour. It is probably worse for people who do not as they may only notice the issues once their scripts have already been deployed to the blockchain.

Since the problem comes from the design of Plutus Core, it is not easy to fix. We have the following proposals to remedy this issue:

  • Introduce a semantic change to add lazy || and && to Plutus Core. A simple fix is to compile them to if-then-else. Of course, this implies breaking backwards compatibility.
  • Keep the current semantics for || and &&, but add new operators such as | and & with lazy semantics. Document the differences between the two variants; it should hopefully catch more attention with two variants.
  • Remind dApp developers to write tests to catch this issue.

As security researchers who cheer for both sides, we feel it is better to implement ad-hoc changes on the compiler (such as making two particular symbols operators instead of handling them in a generic way as functions) for better UX and performance for dApp developers. The compilation pipeline is non-trivial, but the language itself is not vast. We can write more documentation and refactor code more freely while deployed scripts stick. A small compiler optimisation might translate to a considerable amount of ADA for long-lived and high-traffic dApps! Most dApp developers would not work close enough to the Plutus Core level to face this complexity.

Whatever the resolution to this issue may be, we believe that this peculiar behaviour should be documented and communicated better to dApp developers.

Conclusions

This blog post presents overzZzealous, a peculiarity in the semantics of Plutus Core, the language of the Cardano blockchain. overzZzealous is simply the fact that || and && are not lazy, which can lead to slower and more expensive (in terms of gas/fee/money) evaluation for on-chain code. In some cases, it can also lead to validators rejecting transactions that are supposedly correct, potentially blocking assets.

overzZzealous came as a surprise to us and surprised several other developers. We believe it will keep surprising people for as long as the Cardano blockchain exists. At Hachi, we will continue studying the raw on-chain scripts that use ||, && and, more generally, how strict semantics affect side-effect usage in scripts from the chain index.

Hachi is a research effort in security, programming language theory, and formal methods on the Cardano blockchain. Follow our blog at blog.hachi.one for frequently updated content in this exciting field.