I often experience this when Iām learning more advanced features of programming
languages and Iāve found that I personally find it easier to grasp language
features when I have a clear understanding of what kinds of problems theyāre
meant to solve. One way to achieve this is start out by solving a specific
problem **without** the language feature and then show how, when you add the
feature, the solution becomes more elegant. I havenāt found any posts that do
this with GADTs thus I set out to write this post. As I work through the example
I will also try to point out symptoms in your code that might mean youāre better
off modeling your problem using a GADT; this will hopefully help you find
use-cases for GADTs in your own code-base.

In case you havenāt stumbled upon GADTs before reading this here is a very brief description of what they do; this is the most succinct description Iāve found and it was written by Ptival on reddit. Iāve modified it slightly, but it goes as follows:

In essence GADTs makes it possible to describe a richer relation between your data constructors and the types they inhabit. By doing so you give the compiler more information about your program and thus itās able to type your programs more precisely.

Let that description dwell in the back of your mind as you read through the rest of the post.

The example that weāre going to use is similar to the canonical example that you can also find in the OCaml Users Guide section on GADTs. I later hope to write a blog post with lots of examples of GADTs to help solidify the concept - but thatās for another time.

## The canonical example

The task is to write a small evaluator for a simple programming language. Now I know this might be quite different from the problems you would normally deal with at work but this is the canonical example for a good reason so bear with me.

The language is simple and only contains `if`

-expressions, two operators
(`=`

, `<`

) and two primitive types, `int`

and `boolean`

.

As I alluded to in the introduction weāll first try to model this using
a regular ADT^{2}; Once weāve seen the ADT
implementation itās easier to understand the benefits of using a GADT.

### Using ADTs

```
type value =
| Bool of bool
| Int of int
type expr =
| Value of value
| If of expr * expr * expr
| Eq of expr * expr
| Lt of expr * expr
```

Iāve chosen to represent the abstract syntax
tree using two
variants: `value`

which models the primitive values and `expr`

that
model the expressions.

Lets try and write a function that evaluates an `expr`

into a `value`

;
this can be achieved with a straightforward recursive implementation.

```
let rec eval: expr -> value = function
| Value v -> v
| Lt (x, y) -> begin match eval x, eval y with
| Int x, Int y -> Bool (x < y)
| Int _, Bool _
| Bool _, Int _
| Bool _, Bool _ -> failwith "Invalid AST"
end
| If (b, l, r) -> begin match eval b with
| Bool true -> eval l
| Bool false -> eval r
| Int _ -> failwith "Invalid AST"
end
| Eq (a, b) -> begin match eval a, eval b with
| Int x, Int y -> Bool (x = y)
| Bool _, Bool _
| Bool _, Int _
| Int _, Bool _ -> failwith "Invalid AST"
end
```

An example of how to invoke this function is shown below.

```
eval (If ((Lt ((Value (Int 2)), (Value (Int 4)))),
(Value (Int 42)),
(Value (Int 0))))
```

```
- : value = VInt 42
```

An hereās what happens if we try to `eval`

an invalid `expr`

```
eval (Eq ((Value (Int 42)), (Value (Bool false))))
```

```
Exception: Failure "Invalid AST".
```

Even though this implementation is correct (at least I hope it is) it
leaves a lot to be desired. The first thing that springs to mind is that
itās possible to construct `expr`

values that we canāt evaluate. This
means we still need to cover these cases in our pattern matches in order
for them to be exhaustive; we could use wildcard matches but then the
compiler wonāt be able to warn us about missing cases if we decide to
add new data constructs later. Exhaustiveness checking is extremely
helpful when working with larger code-bases so it shouldnāt be thrown
away lightly. This gives us the first symptom you can look for when
youāre considering use-cases for GADTs

**symptom #1**: When you need to to add extra cases for invalid states to make
your pattern matches exhaustive

Now lets say we wanted to write an `eval`

function that returned a
proper OCaml `int`

or `bool`

rather than the wrapped `value`

values.

To do this we would need to create a function for each primitive type, like so

```
let eval_int: value -> int = function
| Int x -> x
| Bool _ -> failwith "Got Bool, expected Int"
let eval_bool: value -> bool = function
| Bool b -> b
| Int _ -> failwith "Got Int, expected Bool"
```

Again, this solution works but it is rather unsatisfying to have boilerplate like that. It would be preferable if we could have a single function where its return type would depend on the input. This leads us to symptom #2:

**symptom #2**: You want the result of a function to depend on the data
constructor used to create the data

With these two symptoms in mind lets see what the GADT implementation would look like.

### Using GADT

Before we dive into the GADT implementation lets do a quick review of
how to define GADTs in OCaml. Remember that we previously defined the
`value`

type like this

```
type value =
| Bool of bool
| Int of int
```

To define a GADT we need to use a slightly different syntax. The following syntax definition is taken from the OCaml Users Guide.

```
constr-decl ::= ...
ā£ constr-name : [ typexpr { * typexpr } -> ] typexpr
```

For the `value`

type the GADT definition would look like this

```
type value =
| Bool : bool -> value
| Int : int -> value
```

Notice that we explicitly type the constructors. On its own this isnāt that exciting but it comes in handy when we introduce type parameters to the GADT as you will see shortly. Now lets give the full GADT implementation a go.

```
type _ value =
| Bool : bool -> bool value
| Int : int -> int value
type _ expr =
| Value : 'a value -> 'a expr
| If : bool expr * 'a expr * 'a expr -> 'a expr
| Eq : 'a expr * 'a expr -> bool expr
| Lt : int expr * int expr -> bool expr
```

We define `value`

and `expr`

GADTs which are parameterized with an
anonymous type (notice the `_`

) and each data constructor is explicitly
typed with a type for this parameter, e.g. the `Bool`

constructor takes
a `bool`

and gives you a `bool`

typed `value`

ā.

The type parameter allows us to do two things:

- We can associate a specific type with each data constructor, e.g.
`Bool`

is of type`bool expr`

ā. - We can restrict the input to functions using the type parameter of
`expr`

ā, e.g.`If`

requires that the first argument needs to be of type`bool expr`

ā.

Now that weāve told the compiler about these restrictions lets see how
the `eval`

ā function turns out.

```
let rec eval : type a. a expr -> a = function
| Value (Bool b) -> b
| Value (Int i) -> i
| If (b, l, r) -> if eval b then eval l else eval r
| Eq (a, b) -> (eval a) = (eval b)
| Lt (a,b) -> (eval a) < (eval b)
```

This is so wonderfully concise that the previously solution now looks
horrific. Notice that this match is exhaustive **and** the return type is
an actual ocaml primitive type. This is possible as OCaml now associates
a specific type for the type parameter of each data constructor.

Lets give the `eval`

function as go with an example

```
eval (If ((Eq ((Value (Int 2)), (Value (Int 2)))),
(Value (Int 42)),
(Value (Int 12))))
```

```
42
```

And an example where we return a `bool`

rather than `int`

```
eval (If ((Eq ((Value (Int 2)), (Value (Int 2)))),
(Value (Bool true)),
(Value (Bool false))))
```

```
- : bool = true
```

Now if we give an invalid example as try, letās see what happens

```
eval (If ((Value (Int 42)), (Value (Int 42)), (Value (Int 42))))
```

```
eval (If ((Value (Int 42)), (Value (Int 42)), (Value (Int 42))))
^^^^^^
Error: This expression has type int value
but an expression was expected of type bool value
Type int is not compatible with type bool
```

Though it isnāt obvious from the output this is actually a compile-time error rather than the runtime error we got in the ADT example, that is, it is no longer possible to construct an invalid AST.

Thatās it for this time. If you have any feedback catch me on twitter at @mads_hartmann or send an email my way a mads379@gmail.com.

**Jan 20, 2018** Based on the feedback Iāve gotten
over the years Iāve updated the code examples to make them more clear.
Take that into consideration when youāre reading the comments below ;)

## Footnotes

- Generalized Algebraic Datatypes
- Algebraic Datatypes