Abstract

There are many ways to abstract functionality in programming languages. One important idea of abstraction is the notion of interfaces defining functionality for abstract types. Modern languages offer different approaches to such interfaces, some focus on data abstraction, while others focus on supporting ad-hoc polymorphism. We will see the differences between parametric and ad-hoc polymorphism. Furthermore we will look at three approaches to the concept of interfaces: Haskell type classes, the type class-inspired Rust traits and OCaml's ML-inherited module system, and find them to have many similarities.

Introduction

Programmers are lazy. That is why they tend to spend a lot of time coming up with and developing programming language concepts that allow them to write less code for the same results, and then let the compiler do the rest of the work. Few of those convey that idea better than the concept of polymorphism, a term describing the reusing of functions and methods for varying data types.

There are many different forms of polymorphism, and in this paper we will briefly look at two of them: Parametric and ad-hoc polymorphism. Ad-hoc polymorphism is primarily used by object oriented programming languages, whereas parametric polymorphism is more well known by its use in functional languages. However, as languages have matured and gained features, most modern languages have their own concepts for both.

Afterwards we will look at one particular implementation of ad-hoc polymorphism: Type Classes. We will find that the concept of type classes also offers a way to define an interface on custom data types and thus compare them to other language interface concepts, in particular Rust Traits and OCaml Modules.

Types of Polymorphism

As defined by Benjamin C. Pierce¹, a polymorphic type system refers to a type system that allows a single piece of code to be used with multiple types. There are several varieties of polymorphism, of which we will cover two.

Parametric Polymorphism

Parametric polymorphism is the idea that a function's input types can be parameters, where the exact type inserted in these parameters is irrelevant for the function's semantics. That means that any type can be inserted¹.

This is best understood by a common example for parametric polymorphism, the length function for lists in functional languages. For this, we will look at how such a function could look like in OCaml:

let rec length l =
  match l with
  | [] -> 0
  | x::xs -> 1 + length xs

The inferred type of length would then be

val length : 'a list -> int

where 'a is a type parameter. The important thing to understand about parametric polymorphism is that the actual value of 'a is irrelevant to how this function operates. This means that there must not be more than one implementation of length for different concrete instantiations of 'a. Instead there is exactly one implementation and it can be called for all types of lists with the same semantics.

The reason this works is that we don't ever infer what the type of x inside the function should be. To determine the length of a list, the type of the list's elements is irrelevant.

The compiler might still be free to compile different versions of this function, depending on what types the function is called with. Depending on how lists are implemented, small parts of the list might be laid out in memory as arrays², which means that the act of getting the next value in the list requires the compiler to know the size of 'a. However, this is hidden from the programmer and not relevant for the language definition.

The problem with parametric polymorphism is that it is inherently useless for functions that should have different implementations for different types, and potentially no implementation for some types. The obvious example for this case would be operator overloading, such as a (+) function that does addition.

In OCaml, this is solved by having a different operator for different data types, for example (+) for ints and (+.) for floats. The equality check function (=) is handled special in OCaml, in that every type automatically implements it. Thus the function is parametrically typed as

val (=) : 'a -> 'a -> bool

and the implementation of this function is handled by the compiler and runtime system. Because this is the type of the (=) function, OCaml will not give you a compile time error if you happen to try to check two functions for equality, even though functions can not generally be checked for equality. OCaml can only throw an exception at runtime in such cases.

Ad-hoc Polymorphism

Ad-hoc polymorphism is the idea of overloading functions for different types¹. This means one can have one implementation for a specific set of types, but potentially a different implementation, including none, for a different set of types. This fixes the aforementioned problem of parametric polymorphism. Operator overloading in particular is a common use case for ad-hoc polymorphism. It allows to have different implementations of (+), both for int -> int -> int and float -> float -> float.

However, the exact implementation for ad-hoc polymorphism in a language is not necessarily trivial. If one were to write (+) x y inside their code, the compiler must now be able to determine which (+) should be called. It needs to figure this out from the types of x and y, whereas before it was always unambiguous what function (+) refers to. While this is still reasonably simple when x and y are variables or constants with a defined type, once x and y are themselves parameters with polymorphic types, the amount of functions that need to be created is potentially exponential³.

See for example an add2 function that is defined like so:

let add2 (x1, x2) (y1, y2) = (x1 + y1, x2 + y2)

Assuming that (+) is defined for both int and float, then there are four possible types for add2:

val add2 : int * int -> int * int -> int * int
val add2 : int * float -> int * float -> int * float
val add2 : float * int -> float * int -> float * int
val add2 : float * float -> float * float -> float * float

This can grow exponentially and becomes inefficient. Thus, implementations of ad-hoc polymorphism typically have clever ways of circumventing this kind of blowup.

One way is to use a system of dynamic dispatch as seen in Java, another is to make heavy use of higher order functions as seen in Haskell³. Both approaches work on a similar idea: One must first define a sort of standard for the (+) function we can group the implemented types under. For the sake of example, we will call types that implement (+) addable. I.e. int and float are addables. Then we can built a system to tell the compiler that add2 is defined for parameters whose types are addables. Whenever we define a type as addable, i.e. implement the (+) function, we also create a dictionary with information about where to find the implemented function. Since we standardized the (+) function under the addable interface, these dictionaries will look the same for every addable type. When calling add2, all we need to do is also pass the appropriate dictionary to add2. add2 then uses that dictionary to find the appropriate (+) function and call it for the arguments. That way, only one implementation for add2 is needed to cover all addable types and there is no exponential blowup. However, depending on what solution is used, there may be a runtime overhead involved.

"Types and programming languages" by Benjamin C. Pierce; Chapter 23.2

"Fast Functional Lists" by Phil Bagwell

"How to make ad-hoc polymorphism less ad hoc" by Philip Wadler and Stephen Blott

Haskell

Haskell supports both parametric and ad-hoc polymorphism.

Parametric Polymorphism

Parametric polymorphism is achieved the same way it is in OCaml, that is, when writing a function that does not rely on its parameters' types, then one can define the function's type with type parameters.

For example, the list length function:

length :: [a] -> Int
length [] = 0
length (x:xs) = 1 + length xs

where a is the type parameter for the type of the list's elements.

Ad-hoc Polymorphism: Type Classes

According to the Haskell committee, there was no standard solution available for ad-hoc polymorphism when they designed the language, so type classes were developed and judged successful enough to be included in the Haskell design¹.

In its simplest form, a type class is not much more than an interface as seen in object oriented languages, meaning it simply defines a named set of operations. For example, to define an interface for checking values for equality, one can define an Eq type class with a class parameter a, for which a (==) function should exist:

class Eq a where
  (==) :: a -> a -> bool

Mind that it's easy to misunderstand the a class parameter as being similar to the a type parameter used for parametric polymorphism, however this is not the case. Where in the parametric polymorphism case before, a stood for any type, in this case a stands for a class or set of specific types for which instances of Eq exist.

The type class itself so far provides no functionality. We need to create an instance of the type class for a specific type, for example for Int:

instance Eq Int where
  (==) x y = eqInt x y

Similarly, an instance of the Eq type class can be created for the specific type Float:

instance Eq Float where
  (==) x y = eqFloat x y

When we now try to use the (==) function with parameters of types other than Int or Float, we will get a compile time error.

Type constraints

So far, we have seen how we can use type classes to overload functions defined within the type class, however we can also use them to define standalone functions with types implementing specific type classes.

Assume we want to define some kind of cost to our data types. We can use a type class to convey this idea:

class Cost a where
  cost :: a -> Int

A possible instance for some custom data type could then look like this:

data Custom = Cheap | Expensive

instance Cost Custom where
  cost Cheap = 0
  cost Expensive = 1

Now say we want to define a function cheapest that takes a list of elements of a type that defines a cost, and then return the cheapest element of that list. Such a function could look like this:

cheapest :: Cost a => [a] -> a
cheapest [x] = x
cheapest (x:xs)
  | cost x <= cost y = x
  | otherwise = y
  where y = cheapest xs

Note the function's type Cost a => [a] -> a. It means that cheapest is a function of type [a] -> a for any type a that is an instance of Cost.

The similarities to parametric polymorphism are a lot more fitting in this case: a for the cheapest function is also a type parameter where the exact type does not matter for the function. The difference is that, whereas in the case of length, any type can be inserted into a, for cheapest only types that implement a cost can be used. This is necessary as the cost function is used in the implementation of cheapest.

Creating instances from other instances

One particular strength of type classes is that they allow one to create instances based on other existing type class instances. We extend the example from the "Type constraints" section to illustrate this: We are given a list of Customs and want to determine the overall sum of the list's costs. We could write an instance for Cost [Custom], however this seems rather verbose, considering we might want to have an implementation of this function for other Cost instances, too. Instead, we can define an instance of the Cost type class for any list of type [a] where a has an instance for Cost:

instance Cost a => Cost [a] where
  cost = sum . map cost

Now, Haskell will automatically create an instance for Cost [Custom] the moment we create the instance for Cost Custom.

This looks quite similar to type constraints and this is no coincidence. We are essentially doing a type constraint, only for the entire instance of a type class. We can see this as applying the type constraint to every function declared within the type class. As such, we can again use a's cost function inside the Cost [a] instance's functions.

Multiple Parameters

Standard Haskell does not allow for type classes to have more than one class parameter. However, modern compilers like ghc support the -XMultiParamTypeClasses compiler option that allows one to create type classes with multiple parameters.

The most obvious limitation this solves is the fact that, with single parameter type classes and the perspective of operator overloading, we can only define an add function to add values of the same type. With multiple parameters, one could define an Add type class that allows adding two values of different types.

class Add a b where
  add :: a -> b -> a

This is not how the add function is implemented in standard Haskell, but this is how it could be done.

However this is still quite ugly as the return type of the addition is simply the type of the first argument, which need not always be what we want. An obvious fix to this would be adding a third parameter to the Add type class:

class Add a b c where
  add :: a -> b -> c

While this works, it poses a strange issue: One can define multiple additions for the same type that have a different output. Let's say we define the following instances for the Add type class with three parameters:

instance Add Int Float Int where
  add x y = x + roundFloatInt y

instance Add Int Float Float where
  add x y = intToFloat x + y

At first this might seem fine, however when one were to write add 1 3.2, the actual return type of such an expression is not well defined anymore and needs to be annotated explicitly for every call, e.g. add 1 3.2 :: Float. This is not very ergonomic, thus a different more elegant solution exists.

Associated types

Associated Types are a way to define more class parameters, that are defined inside the instance of a type class and not in its signature. That means we can define Add with three class parameters, where the third of those is fixed once the other two are.

In our Add example, a solution using an associated type could look like this:

class Add a b where
  type AddOutput a b
  add :: a -> b -> AddOutput a b

We define that with any instance of Add a b for specific types a and b, there shall also be a third type that's given the name AddOutput a b. This type is then the output of our add function. This concept ensures that there shall only be one well defined output type for an addition of two specific types a and b.

To define an addition for Int and Float, this means we have to decide what output we would even want in this case. Let's assume the choice was made to output a Float in this case, the instance for the Add type class would then look like this:

instance Add Int Float where
  type AddOutput Int Float = Float
  add x y = intToFloat x + y

One unfortunate limitation is that the AddOutput name is not namespaced, i.e. we will need to have a different name for each associated type we have inside of different type classes.

A matrix example

Let's assume we have a data type called Matrix a which describes a matrix whose elements are of type a. We can then use our previously defined Add type class to define a matrix addition for any matrices whose elements' types are addable:

instance Add a b => Add (Matrix a) (Matrix b) where
  type AddOutput (Matrix a) (Matrix b) =
    Matrix (AddOutput a b)
  add x y = -- ...

The first line can be read as "implement Add for two matrices whose elements' types implement Add". The second and third lines mean "the output of such an addition shall be another matrix whose elements' type is that of the output of adding the elements of the original two matrices together".

Note that this is not how standard Haskell implements overloaded addition. Instead, Haskell has a Num type class that defines multiple operators, such as (+) and the unary negate, and only defines those operations on two elements of the same type. We introduced our Add type class merely to showcase multiple parameters and associated types.

"How to make ad-hoc polymorphism less ad hoc" by Philip Wadler and Stephen Blott

Rust

Rust is known to be a language that combines different features of different programming languages into one language, with the goal of only keeping whatever works best and eliminating whatever doesn't. When it comes to polymorphism, Rust combines the concept of type classes with generics, the latter of which is usually seen in object oriented languages like Java, C# and C++.

Generics: Rust's Parametric Polymorphism

Generics, in their simplest form, are a form of parametric polymorphism. As such, the length function on slices can be defined with a generic which we will name T.

Because Rust only implements many ideas from functional languages, but is in itself a C-like imperative language, the terminology used for similar things will be slightly different, so will the syntax. In this case, we use a slice in place of a list. A slice in Rust is a reference to a specific area inside an array (or similar), or to the entire array itself. The type usize is one of Rust's unsigned integer types. Also, pay no mind to ampersands inside the Rust code, as it is not of importance for this topic.

#![allow(unused)]
fn main() {
fn len<T>(slice: &[T]) -> usize {
    // calculate and return length
}
}

We will leave out the implementation of the length function here, because it requires knowledge of the underlying layout of slices. The function is given by Rust's STL.

We will also slightly alter the declaration of this function. As it stands, len is declared as a standalone function, but in the real Rust STL it is actually defined as a member function on the slice primitive. To implement member functions on types, Rust uses the impl keyword. Note that we also must declare the generic T with the impl keyword, as the type we are implementing on itself uses the generic:

#![allow(unused)]
fn main() {
impl<T> [T] {
    fn len(&self) -> usize {
        // calculate and return length
    }
}
}

self is a keyword used in Rust to allow a call with the dot operator, e.g. so one can write my_slice.len() instead of len(my_slice). It implicitly is of the type the impl block is defined on.

Traits: Rust's Type Classes

In their simplest form, traits are the exact same concept as type classes, they simply define a named set of operations. This is because Rust's traits are heavily inspired by type classes¹. The main difference between the two being a slight change in nomenclature. Whereas in Haskell we explicitly gave the type we are later instancing on a name, i.e. the a in class Eq a, in Rust this type is implicitly called Self. Self is not the same as self, self is a value and Self is self's type.

To see how traits work, we will implement the same equality type class from "Haskell | Ad-hoc Polymorphism: Type Classes":

#![allow(unused)]
fn main() {
trait Eq {
    fn eq(&self, other: &Self) -> bool;
}
}

Rust's STL also defines an Eq trait, however this is actually a marker trait without any function declarations. Instead, what we are doing here is more similar to the STL's PartialEq trait (although that one also has a default implementation for not equals).

What we called a type class instance in Haskell we call a trait implementation in Rust, because doing so extends the previously used impl keyword:

#![allow(unused)]
fn main() {
impl Eq for usize {
    fn eq(&self, other: &Self) -> bool {
        // ...
    }
}
}

We can of course also implement the trait on other types, such as Rust's floating point primitive f32:

#![allow(unused)]
fn main() {
impl Eq for f32 {
    fn eq(&self, other: &Self) -> bool {
        // ...
    }
}
}

Like Haskell, Rust is also statically typed, and therefore calling eq on either usize or f32 types will automatically pick the correct function at compile time, and calling it on any other type will give us a compile time error.

Trait bounds: Rust's type constraints

Trait bounds allow us to tell the Rust compiler that a generic function only makes sense, if those generic types implement a specific trait. This is the exact same as Haskell's type constraints discussed in "Haskell | Type constraints". We will also use the same Cost example and thus first define a Cost trait:

#![allow(unused)]
fn main() {
trait Cost {
    fn cost(&self) -> usize;
}
}

Then we will implement the trait for some custom data type, which will be an enum with two variants with the exact same semantics as the custom data type we used for the Haskell example:

#![allow(unused)]
fn main() {
enum Custom {
    Cheap,
    Expensive,
}

impl Cost for Custom {
    fn cost(&self) -> usize {
        match self {
            Custom::Cheap => 0,
            Custom::Expensive => 1,
        }
    }
}
}

And now we again define a cheapest function, however this time we define it on slices instead of lists:

#![allow(unused)]
fn main() {
fn cheapest<T>(elements: &[T]) -> &T
where
    T: Cost,
{
    // the exact implementation is not relevant
    elements
        .iter()
        .min_by(|x, y| x.cost().cmp(&y.cost()))
        .unwrap()
}
}

Inside the cheapest function, we use the cost method on elements of the slice. To do so, we must tell the compiler that elements of the slice implement such function, which is done by giving the cheapest function a where clause. Inside a where clause, we can define as many trait bounds as we want which are of the form type: bound. In this case, we are adding a bound Cost for the generic T inside the where clause, meaning that the function cheapest shall only exist for Ts that have a Cost implementation.

Creating implementations from other implementations

As with Haskell, we can create implementations for generic types, where the generics are constrained, i.e. have trait bounds. For this, we will again implement Cost for slices where the slices' elements implement Cost:

#![allow(unused)]
fn main() {
impl<T> Cost for [T]
where
    T: Cost,
{
    fn cost(&self) -> usize {
        self.iter().map(|x| x.cost()).sum()
    }
}
}

Generic traits: Rust's multi-parameter type classes

So far we faced a similar problem we did in Haskell: If we wanted to define an Add trait with an add function, that function would have to take the same types for both parameters. This limitation can be removed in Rust by using generics inside the trait's definition:

#![allow(unused)]
fn main() {
trait Add<T> {
    fn add(self, rhs: T) -> Self;
}
}

Unlike in Haskell, we did not have to enable a language extension first, as Rust supports this out of the box.

As in "Haskell | Multiple Parameters", the output is set to be of the same type as the left hand side, which we could fix by adding a second generic:

#![allow(unused)]
fn main() {
trait Add<T, U> {
    fn add(self, rhs: T) -> U;
}
}

Which again poses the issue that an addition between two types does not have a well defined output type. We can implement addition between integers and floating point numbers with differing outputs:

#![allow(unused)]
fn main() {
impl Add<f32, usize> for usize {
    fn add(self, rhs: f32) -> usize {
        // ...
    }
}

impl Add<f32, f32> for usize {
    fn add(self, rhs: f32) -> f32 {
        // ...
    }
}
}

If we were to call 1_usize.add(3.2_f32), we would have to specify which add function we want to call:

#![allow(unused)]
fn main() {
<usize as Add<f32, f32>>::add(1_usize, 3.2_f32)
}

This uses Rust's Fully Qualified Syntax for Disambiguation and it looks rather complex. However, by simply binding the result to a variable, it would be enough to give said variable a type: let k: f32 = 1_usize.add(3.2_f32);.

Associated types

Once again, Rust also defines the notion of associated types, which also work very similar to Haskell. Just as in Haskell, an associated type is another type parameter that is fixed for a specific configuration of types inserted in the generics. And as before, we will use an associated type to define the output type of an addition.

First we must edit the Add trait's definition. We will add an associated type called Output and we will also change the name of the generic T to Rhs and default it to Self:

#![allow(unused)]
fn main() {
trait Add<Rhs = Self> {
    type Output;

    fn add(self, rhs: Rhs) -> Self::Output;
}
}

This is actually how the real world Add trait in Rust's STL is defined, with the exception of access modifiers.

The type Output is now fixed for a given Self and Rhs type. Notice how we didn't call the associated type AddOutput like we did in the Haskell example. The reason being that, in Rust, the associated type is in the namespace of the surrounding trait. That is why, to use the associated type as the output type of the add function, we need to qualify it as Self::Output which implicitly translates to <Self as Add<Rhs>>::Output inside this trait. That way, we can reuse the Output name for all operations.

Implementing the Add trait now means one also has to define the Output associated type, which we will set to be a float like we did with Haskell:

#![allow(unused)]
fn main() {
impl Add<f32> for usize {
    type Output = f32;

    fn add(self, rhs: f32) -> Self::Output {
        // ...
    }
}
}

If we were to add another implementation with type Output = usize, the Rust compiler would detect the ambiguity and throw an error.

A matrix example

Rust generics can also be used for structs, which are Rust's custom data types. To mirror the matrix example from "Haskell | A matrix example", let us assume we have a generic matrix struct Matrix<T> where the matrix's elements are of type T.

First, we will define the addition over matrices with the same T:

#![allow(unused)]
fn main() {
impl<T> Add for Matrix<T>
where
    T: Add<Output = T>,
{
    type Output = Matrix<T>;

    fn add(self, rhs: Matrix<T>) -> Self::Output {
        // ...
    }
}
}

Note that we do not have to specify the type of the right hand side, as it defaults to Self, which is Matrix<T> in this case. Note also that we had to bound T to be addable with itself where the result is again a T. The where clause can thus be read as "implement only for types T that are addable with itself and where such an addition returns a T".

We extend this implementation for additions where the element type of the right hand side matrix (which we will call Trhs) is different to the element type of the left hand side matrix (Tlhs). We also make the Output type of the addition of a Tlhs and Trhs variable, by giving the output type the name Tout:

#![allow(unused)]
fn main() {
impl<Tlhs, Trhs, Tout> Add<Matrix<Trhs>> for Matrix<Tlhs>
where
    Tlhs: Add<Trhs, Output = Tout>,
{
    type Output = Matrix<Tout>;

    fn add(self, rhs: Matrix<Trhs>) -> Self::Output {
        // ...
    }
}
}

This is essentially the exact same as Haskell's matrix example.

Making the matrix's size known at compile time

As one last extension to our matrix example, we can make use of Rust's system for const generics. These offer a limited form of dependent types in Rust, limited in the sense that the values for each generic must be known at compile time and cannot be changed at runtime.

What this allows us to do is define matrix multiplication only for matrices of correct sizes. To demonstrate, let us first look at how the appropriate matrix struct would be defined:

#![allow(unused)]
fn main() {
struct Matrix<T, const M: usize, const N: usize> {
    // ...
}
}

We can view this as a definition of matrices of type $T^{M \times N}$ .

To see how one would implement a matrix multiplication, we will first have to define a Mul trait, which will look very similar to the Add trait:

#![allow(unused)]
fn main() {
trait Mul<Rhs = Self> {
    type Output;

    fn mul(self, rhs: Rhs) -> Self::Output;
}
}

Reminder: Matrix multiplication is defined by the following function: $Tlhs^{M \times N} \times Trhs^{N \times L} \to Tout^{M \times L} : (A, B) \mapsto C where \forall i, j : c_{i, j} = k = 1 \sum N a_{i, k} \cdot b_{k, j}$

With this, we can construct a trait implementation for the Matrix struct:

#![allow(unused)]
fn main() {
impl<Tlhs, Trhs, Tout, const M: usize, const N: usize, const L: usize> Mul<Matrix<Trhs, N, L>>
    for Matrix<Tlhs, M, N>
where
    Tlhs: Mul<Trhs, Output = Tout>,
    Tout: Sum,
{
    type Output = Matrix<Tout, M, L>;

    fn mul(self, rhs: Matrix<Trhs, N, L>) -> Self::Output { /* ... */ }
}
}

Technically we would also need to add Copy or Clone trait bounds for an implementation to work. However, for the sake of simplicity, we will leave those out.

At first we define all generics used, which are Tlhs, Trhs, Tout as well as all three const generics M, N and L. Afterwards we can read the rest of the line as "implement multiplication between matrices of type $Tlhs^{M \times N}$ and $Trhs^{N \times L}$ ". The trait bound for Tlhs is as expected, however note how we also have to define that Tout needs to be sum-able in some way, with Sum being another trait defining such functionality.

What's impressive is that we define multiplication only on matrices that have the correct size, and as that size also must be known at compile time, that means that we can also check if the matrices' sizes are correct at compile time. We can also deduce the resulting size of such an operation, all at compile time. We did do so completely generically and it will work with matrices of any size.

"The Rust Reference"; Chapter 20.2

"The Rust Programming Language" by Steve Klabnik and Carol Nichols

OCaml

As already noted before, OCaml supports parametric polymorphism. However the language has no concept of ad-hoc polymorhism, i.e. there is no way to overload functions. Nontheless, OCaml inherits ML's powerful module system which still allows the language to express and convey complicated relations between types.

At first glance, a module system might not seem related to interface systems. However these concepts aren't all that different. Interfaces define a set of functions over one or, in some languages, multiple types. Modules are quite similar, they are a collection of types and functions. For example, to define a module for working with matrices of integers, in OCaml one could write this:

module DenseMatrix =
  struct
    type elem = int
    type t = elem list list

    let add a b = (* ... *)
  end

This matrix module could be further extended with other functions, but for our example this minimal version suffices.

OCaml modules also have an associated signature. The signature includes information about the function names and types, the names of the defined types and what those types are defined to. The latter part is optional to have the ability to hide implementation details from the user of the module. OCaml can infer the signature of our DenseMatrix module, but let's say we want to hide the fact that our matrix uses an elem list list, so we'll define the signature ourselves:

module type Matrix =
  sig
    type elem = int
    type t

    val add : t -> t -> t
  end

module DenseMatrix : Matrix =
  struct
    type elem = int
    type t = elem list list

    let add a b = (* ... *)
  end

This also allows us to create a separate SparseMatrix module with the same signature:

module SparseMatrix : Matrix =
  struct
    type elem = int
    type t = (* ... *)

    let add a b = (* ... *)
  end

Thus, from the perspective of interfaces, module signatures can be seen as type classes and modules as instances¹.

The main difference to type classes (and by extension the reason why OCaml modules are not ad-hoc polymorphic) is that the add function will be namespaced inside each module we defined. If we wanted to use the DenseMatrix's add function, we would have to specify it as DenseMatrix.add. There is no logic behind automatically choosing the correct add function for a given context, which is what ad-hoc polymorphism and overloading is about. Instead, the caller must specify an exact function they want to call. In contrast to this, for Haskell, the implemented functions in a type class were available independently from specific instances, i.e. in OCaml's terms, the module signature's defined functions could be used without specifying the exact module.

Functors

Extending our matrices for more element types

Let's take one step back and slightly alter our Matrix module signature.

module type Matrix =
  sig
    type elem
    type t

    val add : t -> t -> t
  end

In particular, we are not defining the type of elem anymore. This allows us to implement multiple matrices for different types:

module DenseIntMatrix : (Matrix with type elem = int) =
  struct
    type elem = int
    type t = elem list list

    let add a b = (* ... *)
  end

module SparseIntMatrix : (Matrix with type elem = int) =
  struct
    type elem = int
    type t = (* ... *)

    let add a b = (* ... *)
  end

module DenseFloatMatrix : (Matrix with type elem = float) =
  struct
    type elem = float
    type t = elem list list

    let add a b = ...
  end

module SparseFloatMatrix : (Matrix with type elem = float) =
  struct
    type elem = float
    type t = (* ... *)

    let add a b = (* ... *)
  end

Matrix with type elem = int simply means to replace type elem with type elem = int inside the Matrix module signature, effectively exporting the type of elem. This allows an outside user of DenseIntMatrix to know that elem is an int. This can be important if, let's say, we added a get function that returns an elem to get certain elements out of the matrix. If we didn't specify that elem was of type int, the user of our module would then not know that get returns ints, but rather some unknown and thus unusable type elem.

Because we left out the implementations of these functions, it might not be immediately obvious, but the actual implementation of a dense matrix with float values and a dense matrix with int values will most likely look very similar. The same is true for the sparse matrix. This is not exactly something we want, and this is where functors come in handy.

Defining addition as a module

Functors, as the name would suggest, are something similar to functions. In a sense, they can be seen as compile-time functions that take modules as arguments, and return other modules. They can also be seen as a sort of macro for module definitions.

A functor in our case could define the implementation of an entire DenseMatrix for any type elem that implements an addition. Thus we must first define what "implements an addition" means, which we do by defining a module signature:

module type Addable =
  sig
    type t

    val add : t -> t -> t
  end

Because our Matrix module signature defines this same add function, we can also define the Matrix signature by including Addable:

module type Matrix =
  sig
    type elem
    type t

    include Addable with type t := t
  end

The include keyword in OCaml copies all definitions of another module/signature into the current module/signature, making our matrices addable in this case. The reason why we don't use Addable's type t but instead our own is because we might want to add signatures for other operations (like multiplication) later on which should all share the same t.

Functors in action

Now we can define a DenseMatrix functor, that takes in an Addable module as an argument called D, and gives back a Matrix module by implementing a dense matrix with elements of type D.t:

module DenseMatrix (D : Addable) : (Matrix with type elem = D.t) =
  struct
    type elem = D.t
    type t = elem list list

    let add a b = (* implementation of Matrix addition using D.add *)
  end

We can do the same for sparse matrices, but we will skip this here.

So far we can't use the DenseMatrix functor, as we have yet to create a module with the Addable signature, so let's do so for ints:

module IntAddable : (Addable with type t = int) =
  struct
    type t = int

    let add a b = a + b
  end

Now we can create a DenseIntMatrix module by calling the DenseMatrix functor on our IntAddable module:

module DenseIntMatrix = DenseMatrix IntAddable

We can now also easily create a DenseMatrix for floating point values, simply by creating an Addable module and calling the DenseMatrix functor on it:

module FloatAddable : (Addable with type t = float) =
  struct
    type t = float

    let add a b = a +. b
  end

module DenseFloatMatrix = DenseMatrix FloatAddable

"Modular type classes" by Derek Dreyer, Robert Harper and Manuel M.T. Chakravarty

Conclusion

Rust considers itself as not a particularly original language and mentions Haskell's type classes as one of its influences¹. So it's no surprise that Rust's traits mirror type classes so closely. However, there are still slight differences between the two concepts. As mentioned in "Associated types", associated types in Rust are namespaced inside the trait, whereas associated types in Haskell are globally namespaced. Also, Rust ensures that using generics and traits will not add runtime cost, by recompiling functions for each used combination of types individually². Haskell does not have such guarantee. Still, type classes and traits are two very related concepts.

Moreover, OCaml's module system is increadibly powerful and functors are a unique way of writing generic code. However, it does not offer ad-hoc polymorphism. The user will always have to specify which function from which module they want exactly. Another issue is that juggling functors, modules and module signatures can get very overwhelming and confusing. With type classes, we were able to create addition for matrices with varying outputs depending on the outputs of the additions of the matrices' element types. In OCaml, building a generic matrix implementation even without the ability to have varying element types is already similarly complicated.

It is also rather verbose having to apply functors by hand. It is not possible for OCaml to automatically apply functors and create unnamed modules out of them (although there are ideas on how to do so in limited cases³), and thus, to define a library for generic matrices, the library will only be able to offer a functor that the user must then apply for every type they want to use inside a matrix. It is not as fluent as other solutions, not to mention one has to come up with a unique name for every one of the resulting modules.

On the other hand, neither Haskell nor Rust can convey the generic idea of a matrix. In both languages, we are able to be generic over the elements of the matrix, not the implementation of it. In OCaml, thanks to the fact that we can define a generic Matrix module signature, wherein we can keep the format of the matrix undefined, we were also able to create functors for both dense and sparse matrices. In Rust for example, defining a sparse matrix would mean to define a whole new separate struct with no correlation to the dense matrix struct. There is no way to define the structure of a struct from within a trait.

It shall be mentioned that OCaml modules aren't a direct equivalent to the notion of "OCaml's type classes". Instead, they sit at a higher level. As the name suggests, modules in OCaml can include various types and define how they interact with one another, whereas type classes and traits are essentially interface definitions for singular types. It's just with the existence of abstract module signatures as well as functors, that OCaml modules are also found in the realm of abstract interfaces.

"The Rust Reference"; Chapter 20.2

"The Rust Programming Language" by Steve Klabnik and Carol Nichols; Chapter 10.1

"Modular type classes" by Derek Dreyer, Robert Harper and Manuel M.T. Chakravarty

Comparing Haskell's Type Classes to Rust's Traits and OCaml's Modules