Hachi

We do security research on Cardano

An Introduction to Plutus Core

Michael B. Gale
Nov 25, 2021 19 min read
plutus

Contents

Plutus Core (PLC) is the programming language that “runs” on the Cardano Blockchain. A blockchain is just a distributed data structure though, so programs do not literally run on it. What we mean is that Plutus Core programs are stored on the blockchain in binary form and can be referenced by transactions. Plutus Core programs are later retrieved from the blockchain and executed by Cardano Nodes when required by other transactions that reference them.

In this blog post, we give a high-level overview of the role that Plutus Core plays in the Cardano ecosystem, what programs written in Plutus Core look like, and how those programs work.

Plutus Core’s role

Blockchains are most commonly used for distributed ledgers and form the foundation of cryptocurrencies. A ledger keeps a chronological record of transactions that move currency between different entities. A distributed ledger is a ledger, except that its contents are not maintained by one central entity, such as a bank, but agreed upon by consensus among possibly many entities. So why do we need to store programs in a ledger?

On the Cardano blockchain, the ledger is more than just a record of currency movements. By referencing programs in transactions, we can implement smart contracts or, as known in the Cardano ecosystem, validator scripts. Transactions have inputs, which consume outputs from previous transactions, and their own outputs, which future transactions can later consume. Validator scripts validate whether another transaction may consume a given transaction’s output(s). In other words, a transaction does not need to have a fixed recipient when it is created but can instead implement arbitrary logic by which a subsequent transaction may be accepted as the recipient. IOG have a blog post with more details about this process.

Now that we have established the context in which Plutus Core is used, we focus on the language itself. There are two variants of Plutus Core: Typed Plutus Core and Untyped Plutus Core. The former is based heavily on System F, a polymorphic lambda calculus. This makes Plutus Core a functional programming language. Typed Plutus Core serves as an intermediate form which allows a compiler to statically check that the programs written in it make sense. Meanwhile, Untyped Plutus Core is Typed Plutus Core with its types erased and is what gets stored on the Cardano blockchain.

Practically speaking, most developers will never write a Plutus Core program by hand. Instead, Plutus Core serves as one common target for other, higher-level programming languages. At the time of writing, that is primarily Plutus (as opposed to Plutus Core), and there is a series of transformations which turn Plutus programs into Plutus Core programs. Thanks to this design, it is possible for other programming languages to compile to Plutus Core in the future as well.

At Hachi, we are most interested in Untyped Plutus Core. Since programs written in it are stored on the blockchain and may not necessarily have been generated from Plutus programs, building tools for Untyped Plutus Core allows us to analyse the behaviour of all programs on the Cardano blockchain rather than just those for which the original source code is available.

Nonetheless, you may be asking: if most people will never write Plutus Core programs, why should we care about it? As developers in the Cardano ecosystem, understanding Plutus Core is similar to understanding how a computer works for ordinary developers. Even though you will probably be writing your validator scripts (i.e. smart contracts) in Plutus, they will be compiled to Plutus Core and executed according to its rules. This may not always be what you expect. Therefore, you should understand how things work under the hood.

Untyped Plutus Core

Both variants of Plutus Core have a lot in common. Therefore, we begin by exploring the simpler of the two, Untyped Plutus Core. This choice will allow us to understand how to write programs in it and what syntactic constructs exist. Once we have familiarised ourselves with this variant, we then move on to Typed Plutus Core in the latter half of this article and discuss its differences compared to Untyped Plutus Core. It is possible to convert any given Typed Plutus Core program into an Untyped Plutus Core program, and we describe this process as well, but it is worth noting that the inverse is not possible in general: we cannot go from any Untyped Plutus Core program to a typed one.

Both variants of Plutus Core are normally stored in binary representations, and the binary representation of Untyped Plutus Core is stored on the blockchain. However, both variants also have textual representations with LISP-like syntax that we, as humans, can use to study Plutus Core programs. Let us consider an elementary Untyped Plutus Core program in its textual representation:

(program 1.0.0
    (con integer 4)
)

As we might expect, evaluating this program results in the integer 4. Every Plutus Core program begins with the program keyword, followed by a version number and the body of the program. Parentheses are used to group terms and are always required. Indentation does not matter.

In the above example, the body of the program is a constant. The con keyword begins a constant, followed by the name of the constant type, and finally its value. Despite its name, Untyped Plutus Core still contains type annotations for constants. Both variants of Plutus Core have the following types of constants:

integer for arbitrary-precision integers
bool for booleans
string for UTF-8 strings
bytestring for byte arrays
() for the unit type
lists
pairs
Plutus data

Note that the last three cannot currently be expressed in the textual representation of Plutus Core. The last type of constant, Plutus data, serves as a generic representation of algebraic data types.

Functions and function application

Since Plutus Core is a functional programming language, we compute results by defining functions and applying those functions to arguments. Functions are always anonymous and have a single parameter. They are also higher-order: they can take other functions as arguments and return functions as results. The lam keyword introduces a new function:

(program 1.0.0
    (lam x (con integer 4))
)

It is followed by the name of the parameter and the body of the function. The parameter is named x in the example above, and the body is the constant 4. Evaluating this example yields no meaningful result since there is nothing to evaluate. We have to apply the function to an argument to get a result:

(program 1.0.0
    [(lam x (con integer 4)) (con bool False)]
)

We use square brackets to denote function applications. In the example, the function (lam x (con integer 4)) gets applied to the argument (con bool False). This example is rather silly since we do not do anything with the argument and thus the result is the integer 4. For an example which does make use of the argument, consider the following:

(program 1.0.0
    [(lam x x) (con bool False)]
)

This is perhaps the simplest possible function: the identity function. It returns whatever it is given as argument. In other words, the result of the above program is False. If we need more than one parameter, we can return another function – a process known as currying:

(program 1.0.0
    [[(lam x (lam y x)) (con bool False)] (con string "Hello")]
)

In this example, we have a function (lam x (lam y x)) where the body is another function, (lam y x). We apply it to (con bool False) as its argument. Conceptually, this evaluates to the function’s body where the parameter x is substituted by the argument we provided: (lam y (con bool False)). This is the result of the first (inner) function application, which is then used as the function in the second (outer) function application where we apply it to (con string "Hello"). The result is the body (con bool False), or in other words, False. We can visualise this process by writing down a trace of the evaluation steps where we use => to mean “reduces to”:

   [[(lam x (lam y x)) (con bool False)] (con string "Hello")]
=> [(lam y (con bool False)) (con string "Hello")]
=> (con bool False)

For convenience, Plutus Core provides some syntactic sugar to group multiple function applications into one. The above program could also be expressed as:

(program 1.0.0
    [(lam x (lam y x)) (con bool False) (con string "Hello")]
)

This syntax does not change how the program works, but it saves us from writing a few more brackets. In other words, the trace of evaluating this program is exactly the same as the trace shown above for the same program without the syntactic sugar.

Built-in functions

The Plutus Core language has a number of built-in functions, which we can view as its standard library. These built-in functions range from arithmetic operations to cryptographic functions. Like functions we define ourselves, built-in functions are curried and are therefore applied to one argument at a time. For example, to add up two numbers, we would write the following:

(program 1.0.0
    [[(builtin addInteger) (con integer 4)] (con integer 8)]
)

As expected, this evaluates to 12. To reference a built-in function, we have to use the builtin keyword followed by the name of the built-in function. We can then use ordinary function application to apply the built-in function to arguments. In the example above, we made both function applications explicit, but we equally could have used the syntactic sugar for function application as well. Here is another example where we calculate the SHA256 hash of an array of bytes which is expressed in hexadecimal notation:

(program 1.0.0
    [(builtin sha2_256) (con bytestring #54686543616B654973414C6965)]
)

The result is the hash as an array of bytes. Unfortunately, IOG do not provide complete documentation for all available built-ins, but a list of all implemented functions may be found in the Plutus sources where their meaning is given by corresponding Haskell functions.

Delaying and forcing execution

Plutus Core is a strict programming language. That is, we evaluate expressions as soon as possible. In a functional language, that corresponds to evaluating arguments to a function before calling the function. For example, in the following program, [(builtin addInteger) (con integer 4) (con integer 8)] is evaluated before (lam x (con bool False)) gets applied to the result:

(program 1.0.0
    [(lam x (con bool False)) [(builtin addInteger) (con integer 4) (con integer 8)]]
)

This computation is wasteful since the function (lam x (con bool False)) never uses its argument. This behaviour, which we also refer to as call-by-value, is what most programming languages use since it is evident what gets executed and when. In contrast, some other programming languages employ call-by-need. In such languages, nothing gets evaluated until the value is needed. In Plutus Core, we can delay the evaluation of an expression and force it elsewhere using the delay and force keywords, respectively. To not immediately evaluate an expression, as we might want to do in the above example, we can write:

(program 1.0.0
    [
        (lam x (con bool False))
        (delay [[(builtin addInteger) (con integer 4)] (con integer 8)])
    ]
)

Here, [[(builtin addInteger) (con integer 4)] (con integer 8)] will never get evaluated. If we did want to use the result of the delayed addition, we would have to force it. The following program evaluates to 12:

(program 1.0.0
    [
        (lam x (force x))
        (delay [[(builtin addInteger) (con integer 4)] (con integer 8)])
    ]
)

Whenever we need the result of a delayed computation, we must force its result. To see what happens without it, let us modify the example to just have x without the force:

(program 1.0.0
    [
        (lam x x)
        (delay [[(builtin addInteger) (con integer 4)] (con integer 8)])
    ]
)

The result of this program is the entire delayed expression:

   [
       (lam x x)
       (delay [[(builtin addInteger) (con integer 4)] (con integer 8)])
   ]
=> (delay [[(builtin addInteger) (con integer 4)] (con integer 8)])

Most functional programming languages have control flow structures such as if condition then oneThing else anotherThing for conditional execution, which first evaluate condition and then return either oneThing if condition evaluated to true or anotherThing if not. Plutus Core has no control flow structures, but some built-in functions such as ifThenElse allow us to choose between several values. For example:

(program 1.0.0
    [(force (builtin ifThenElse)) (con bool True) (con integer 23) (con integer 42)]
)

This program behaves as expected and results in 23, the value that gets returned since the condition is True. If we provided False as the first argument or an expression that evaluates to False, we would end up with 42. However, this program is a bit odd: why are we forcing builtin ifThenElse? The built-in function is not an expression that we did not want to evaluate just yet. The answer is that this is a quirk of Untyped Plutus Core’s relationship with Typed Plutus Core, which is our cue to move on to talk about Typed Plutus Core.

Typed Plutus Core

As previously mentioned, Plutus Core is based on System F – a polymorphic lambda calculus. In System F, terms may be of a polymorphic type, represented by type variables, to say that we do not care what the type is as long as it lines up with all other uses of the same type variable. In Typed Plutus Core, all functions are annotated with the type of their parameter. For example, to say that we have a function which has a parameter of type integer:

(program 1.0.0
    [(lam x (con integer) x) (con integer 4)]
)

This program is well typed and will result in the integer 4 as before. However, if we were to give it an argument of a different type, such as in the following program, then we would receive a type error telling us that the function is applied to an argument of a wrong type:

(program 1.0.0
    [(lam x (con integer) x) (con bool False)]
)

Generally speaking, this is desirable because we have expectations about how an argument is used, which only make sense if the argument is of a particular type.

Polymorphism

At other times, we may not care what the type of an argument is. For example, the identity function above, which just returns its argument, does not really need to care about the argument’s type. In other words, we want it to be polymorphic in its argument type. In Typed Plutus Core, we can express this as follows:

(program 1.0.0
    (abs a (type) (lam x a x))
)

The abs keyword signals a type abstraction which introduces a new type variable – named a in this example. The name of the type variable is followed by its kind, (type). Kinds are outside the scope of this article, so you may just ignore it and assume that we always write (type) here. We may then use a in place of a type within the scope of the type abstraction. Specifically, in the example, we have our identity function (lam x a x) where the parameter type is now a. We can view these type variables as placeholders for other types that we must fill in before we can use the expression. For example, to do just that, we might write the following:

(program 1.0.0
    [{(abs a (type) (lam x a x)) (con integer)} (con integer 4)]
)

The curly braces denote type application, which has the general form of {tyAbs type} where tyAbs is expected to be a type abstraction and type is the type we are plugging in for the type variable tyAbs abstracts over. Therefore, in the example above, we substitute a for (con integer) to obtain (lam x (con integer) x), which we can then subsequently apply to an integer argument. To better understand this process, let us visualise how this program is evaluated:

   [{(abs a (type) (lam x a x)) (con integer)} (con integer 4)]
=> [(lam x (con integer) x)) (con integer 4)]
=> (con integer 4)

Higher-order functions

We mentioned that Plutus Core supports higher-order functions, which either take other functions as arguments or return them. We have seen examples of functions which return other functions when we discussed currying. What about functions which take other functions as arguments? Let us first consider the following, contrived program in Untyped Plutus Core:

(program 1.0.0
    [(lam f [f (con integer 5)]) (lam x x)]
)

In a language like Haskell, this same program would be expressed as (\f -> f 5) (\x -> x). In other words, we have a function which binds a parameter named f and, in the body of that function, we apply f to the integer 5 as its argument. This function is then applied to the identity function, which just returns whatever it is given as argument. For clarity, let us trace this program’s evaluation:

   [(lam f [f (con integer 5)]) (lam x x)]
=> [(lam x x) (con integer 5)]
=> (con integer 5)

We already know that a good way to express the identity function in Typed Plutus Core is (abs a (type) (lam x a x)) so that we can use it with any type.

What should the type of f be? Since we provide our identity function as the argument, it must have a type corresponding to that. In Typed Plutus Core, that type is expressed as (all a (type) (fun a a)) where (fun a a) denotes the type of a function that takes something of type a as input and returns something of type a. The (all a (type) ...) part of the type says that it is a function for any given type a. In other words, if we want to use a function of this type, we must first apply it to a type as an argument so that a can be instantiated with a concrete type. With this new knowledge, let us now look at a version of the same program as above, but in Typed Plutus Core:

(program 1.0.0
    [
        (lam f (all b (type) (fun b b)) [{f (con integer)} (con integer 5)])
        (abs a (type) (lam x a x))
    ]
)

As we can see, the type of f is (all b (type) (fun b b)), which is the same as the type of the identity function (abs a (type) (lam x a x)) except that we have named the type variable b instead of a to show that they are different type variables. Technically, they could both have the same name since their scopes unambiguously distinguish them. Before we can use f, we must apply it to a type. Since we want to apply it to an integer value as the argument in the example, we apply it to the integer type. The trace for the evaluation of this program is therefore as follows:

   [
       (lam f (all b (type) (fun b b)) [{f (con integer)} (con integer 5)])
       (abs a (type) (lam x a x))
   ]
=> [{(abs a (type) (lam x a x)) (con integer)} (con integer 5)]
=> [(lam x (con integer) x) (con integer 5)]
=> (con integer 5)

Type erasure

We can go from a Typed Plutus Core program to an Untyped Plutus Core program by erasing its types. For the most part, this just involves removing the type annotations from functions. In other words, (lam x (con integer) e) becomes (lam x e).

Everything else mostly stays the same, with the notable exception of type abstractions and type applications. Since there are no types in Untyped Plutus Core, it no longer makes sense to have type abstractions and applications. Nevertheless, we need to ensure that the resulting Untyped Plutus Core program has the same behaviour as the input Typed Plutus Core program. To accomplish this, anything of the form (abs ty k expr) becomes (delay expr) and {tyAbs type} becomes (force tyAbs). For example, let us apply these three rules to the program with a higher-order function we showed above:

(program 1.0.0
    [
        (lam f [(force f) (con integer 5)])
        (delay (lam x x))
    ]
)

It may seem odd and redundant to delay and force a function, and, indeed, it is. However, in general, we could have a term such as (abs a (type) (error)) where (error) terminates the program as soon as it is evaluated. Since a type abstraction surrounds it in Typed Plutus Core, it will not get evaluated until after that term has been applied to type in a type application. To ensure we get the same behaviour in Untyped Plutus Core, delay and force are needed to delay and force evaluation accordingly.

The `ifThenElse` mystery resolved

Now that we understand type erasure, we can return to our strange example involving ifThenElse from earlier. For convenience, let us recall the code we had for it:

(program 1.0.0
    [(force (builtin ifThenElse)) (con bool True) (con integer 23) (con integer 42)]
)

In Typed Plutus Core, ifThenElse is a polymorphic function. We can think of its type being (all a (type) (fun (con bool) (fun a (fun a a)))). For those familiar with a language like Haskell, it is the equivalent of forall a . Bool -> a -> a -> a. That is, ifThenElse is a function which takes in a boolean, something of some type a, another thing of type a, and returns something that is also of type a. In other words, its first argument is the boolean condition while the true and false branches are of some arbitrary type a, one of which is returned and therefore the result is also of type a. The type abstraction that binds a is implicit in the definition of ifThenElse.

The fact that ifThenElse is polymorphic means that there is an implicit type abstraction in its definition that must be applied to a type argument. What follows is our earlier example involving ifThenElse, but in Typed Plutus Core:

(program 1.0.0
    [{(builtin ifThenElse) (con integer)} (con bool True) (con integer 23) (con integer 42)]
)

As we can see, we must apply (builtin ifThenElse) to the type of the true and false branches. If we now apply type erasure to this program, the type application turns into a force term by the rules outlined above and we end up with the Untyped Plutus Core program shown earlier, explaining the quirk. Since the type application is implicit in the definition of ifThenElse, the corresponding delay in Untyped Plutus Core is also implicit in it.

This is necessary for every polymorphic, built-in function in Untyped Plutus Core. Some, such as fstPair and sndPair, even require multiple force terms, since they involve more than one type variable! Ideally, it would not be necessary to force every polymorphic built-in function. Indeed, there is already a discussion around this topic.

Conclusions

We have given a high-level overview of Plutus Core, its role in the Cardano ecosystem, its two flavours, and how to write basic programs in them. A full, formal specification of Plutus Core is available from IOG. The specification also includes some additional information related to Typed Plutus Core, specifically its support for recursive types, that we have omitted from this article for simplicity.

Hachi is a research effort in security, programming language theory, and formal methods on the Cardano blockchain. Follow our blog at blog.hachi.one for frequently updated content in this exciting field.