Understanding OCaml's restrictions on recursive types
2023-09-11
This post is a summary of a discussion I started on discuss.ocaml.org: Which kind of recursive types are allowed?
If you ever programmed in a language with algebraic data types (ADTs), you’ll have seen types like:
type tree = Leaf of int | Node of tree * tree
where tree
has a recursive structure, since
each Node
contains other tree
s.
In theory, recursive types are not restricted to ADTs. Consider for instance the recursive function type:
type s = unit -> (int * s) (* warning: not valid OCaml *)
which represents a lazy stream of integers:
val make_counter : int -> s
let rec make_counter n () = n, make_counter (n + 1)
let c = make_counter 0 in
let x, c = c () in (* x = 0 *)
let x, c = c () in (* x = 1 *)
let x, c = c () in (* x = 2 *)
...
But OCaml doesn’t allow this. Why? Often programs end up with recursive arrow types by accident, so rejecting them means catching more errors. You can always use an ADT if you really need recursion:
type s = S of (unit -> (int * s)) (* valid OCaml *)
That’s fair. But by this point, I was asking myself…
What exactly is allowed to be recursive?
The short answer is:
type declarations permit recursion through the use of the name that is being defined, as we saw above with
tree
;type expressions only allow recursion if it goes through a polymorphic variant or object.
Ok, what is the difference between a type declaration and a type expression then?
Declarations, e.g.
type t = A | B
, create a new nominal type. Whenever you use a constructor (likeA
), the typechecker refers to the declaration that created such constructor, and assigns its corresponding type (here,t
) to the expression. This is the reason why, if you have multiple types that define constructors with the same name, you need to add type annotations.On the other hand, expressions, like
int -> bool
or[`A | `B]
describe a structural type.
But wait, if recursion is allowed in type declarations, why did the
type checker reject the stream type s
earlier? Here’s the
catch: type s = unit -> (int * s)
is not a type declaration. It merely defines an alias for the
type expression unit -> (int * 's) as 's
1.
Same goes for polymorphic variants and objects: a usage of type
is a type
declaration only if the right hand side describes a variant or
record.
Putting things into order
The difference in behaviour simply stems from a design decision: variants and records are typed nominally, while polymorphic variants, objects and functions structurally.
Let’s check our understanding so far:
type t = T of t
: ok, recursion in type declaration;type t = int -> t
: not ok, recursion in type expression;type t = [`T of t]
: ok, recursion in type expression but broken by a polymorphic variant.
We still have a handful of whys to answer:
why is recursion allowed in type declarations?
why isn’t it allowed in type expressions?
why is there and exception for polymorphic variants and objects?
And here come the becauses:
If you declare a recursive type, it’s pretty clear you intend for it to be recursive. Remember that the point of the restrictions is to catch unintentional recursion.
During inference, there’s no declaration to refer to for structural types (aka type expressions), so there’s no way for the type checker to know if you really meant to have a recursive type.
Much of their utility depends on them being recursive. For instance, without this exception you wouldn’t be able to make a tree with polymorphic variants.
One last thing…
You can pass the -rectypes
flag to ocamlc
to lift the restrictions outlined above. But it’s worth it only if you
make heavy use of recursive arrow types, to make the code less verbose,
otherwise breaking the recursion with variants will do just fine.
And that’s it. Happy cameling 🐫
Comments? Send them to
blog@alsd.eu!
Back to the article list.