From Type-Driven Development with Idris by Edwin Brady

In this article, you will learn about defining dependent data types and defining vectors with Idris.


Save 37% on Type-Driven Development with Idris. Just enter code fccbrady into the discount code box at checkout at manning.com.


Defining dependent data types

A data type is a type which is computed from a dependent other value. You’ve probably seen a dependent type, Vect, where the exact type is calculated from the vector’s length:

 

  Vect : Nat -> Type -> Type  

The type of a Vect depends on its length. This gives us additional precision in the type, which helps direct our programming via the process of type, define, refine. In this article, you’ll see how to define dependent types such as Vect. The core of the idea is that, because there’s no syntactic distinction between types and expressions, types can be computed from any expression.

We’ll begin with a simple example to illustrate how this works, by defining a type for representing vehicles and their properties, depending on their power source; and you’ll see how you can use this to constrain the valid inputs to a function to those that make sense. You’ll then see how Vect itself is defined, along with some useful operations.

A first example: classifying vehicles by power source

Dependent types allow you to give more precise information about the data constructors of a type – by adding more arguments to the type constructor. For example, you might have a data type to represent vehicles (for example, bicycles, cars, and buses), but some operations don’t make sense on all values in the type (for example, refueling a bicycle wouldn’t work because there’s no fuel tank). We’ll therefore classify vehicles into those powered by pedal and those powered by petrol, and express this in the type.

The following listing shows how you could express this in Idris.


Listing 1 Defining a dependent type for vehicles, with their power source in the type (vehicle.idr)

  data PowerSource = Petrol | Pedal                           data Vehicle : PowerSource -> Type where                       Bicycle : Vehicle Pedal                                   Car : (fuel : Nat) -> Vehicle Petrol                      Bus : (fuel : Nat) -> Vehicle Petrol                    

 An enumeration type describing possible power sources for a vehicle

A Vehicle’s type is annotated with its power source

A vehicle powered by pedal

A vehicle powered by petrol, with a field for current fuel stocks


You can write functions that’ll work on all vehicles by using a type variable to stand for the power source. For example, all vehicles have wheels. Not all vehicles carry fuel, and it only makes sense to refuel a vehicle whose type indicates it’s powered by Petrol. These concepts are illustrated in the following listing.


Listing 2 Reading and updating properties of Vehicle

  wheels : Vehicle power -> Nat                   wheels Bicycle = 2 wheels (Car fuel) = 4 wheels (Bus fuel) = 4   refuel : Vehicle Petrol -> Vehicle Petrol       refuel (Car fuel) = Car 100 refuel (Bus fuel) = Bus 200    

Use a type variable, power, because this function works for all possible vehicle types.

❷   Refueling only makes sense for vehicles that carry fuel. Restrict the input and output type to Vehicle Petrol.

Asserting that inputs are impossible

If you try adding a case for refueling, Idris Bicycle will report a type error, because the input type is restricted to vehicles powered by petrol. If you use the interactive tools, Idris won’t even give a case for Bicycle after a case split with Ctrl-Alt-C. Nevertheless, it can sometimes aid readability to make it explicit that you know the Bicycle case is impossible. You can write this:

  refuel : Vehicle Petrol -> Vehicle Petrol refuel (Car fuel) = Car 100 refuel (Bus fuel) = Bus 200 refuel Bicycle impossible  

If you do this, Idris will check that the case you’ve marked as impossible would produce a type error. Similarly, if you assert a case is impossible but Idris believes it’s valid, it’ll report an error:

  refuel : Vehicle Petrol -> Vehicle Petrol refuel (Car fuel) = Car 100 refuel (Bus fuel) impossible  

Here, Idris will report the following:

  vehicle.idr:15:8:refuel (Bus fuel) is a valid case  

In general, you should define dependent types by giving the type constructor and the data constructors directly. This gives you a lot of flexibility in the form that the constructors take. Here it allows you to define types where the data constructors can take different arguments. You can either write functions that work on all vehicles (like wheels) or functions that only work on some subset of vehicles (like refuel).

Note: Defining families of types For Vehicle, you’ve defined two types in one declaration (specifically, Vehicle Pedal and Vehicle Petrol). Dependent data types like Vehicle are therefore sometimes referred to as families of types, because you’re defining multiple related types at the same time. The power source is an index of the Vehicle family. The index tells you exactly which Vehicle type you mean.

Defining vectors

The length information in the type helps drive development of functions on vectors. In this article we’ll look at how Vect is defined, along with some of the operations.

It’s defined in the Data.Vect module, as shown in listing 3. The type constructor, Vect, takes a length and an element type as arguments, and when you define the data constructors, you state explicitly in their types what their lengths are.


Listing 3 Defining vectors (Vect.idr)

  data Vect : Nat -> Type -> Type where                         Nil : Vect Z a                                            (::) : (x : a) -> (xs : Vect k a) -> Vect (S k) a       %name Vect xs, ys, zs                                        

The type constructor Vect states that a vector constructor takes two arguments: a Nat, which is its length, and a Type, which is its element type.

The data constructor Nil explicitly states that an empty vector has length Z.

The data constructor (::) explicitly states that adding an element x to a vector of length k results in a vector of length S k (1 + k).

Gives some default names to use in case splits.


The Data.Vect library includes several utility functions on Vect, including concatenation, looking up values by their position in the vector, and various higher-order functions, such as map. Instead of importing this, though, we’ll use our own definition of Vect and try writing some functions by hand. To begin, create a Vect.idr file containing only the definition of Vect in listing 3.

Because Vect includes the length explicitly in its type, any function that uses some instance of Vect will describe its length properties explicitly in its type. For example, if you define an append function on Vect, its type will express how the lengths of the inputs and output are related:

  append : Vect n elem -> Vect m elem -> Vect (n + m) elem  

NOTE: Type-level expressions – the expression n + m in the return type here is an ordinary expression with type Nat, using the ordinary + operator. Because Vect’s first argument is of type Nat, expect to be able to use any expression of type Nat. Remember, types are first class, and types and expressions are all part of the same language.

Having written the type first, as always, you can define append by case-splitting on the first argument. You do this interactively as follows:

  1. (Define) Begin by creating a skeleton definition and then case-splitting on the first argument xs:

      append : Vect n elem -> Vect m elem -> Vect (n + m) elem append [] ys = ?append_rhs_1 append (x :: xs) ys = ?append_rhs_2  
  2. (Refine) Idris has enough information in the type to complete this definition by an expression search on each of the holes, resulting in this:

      append : Vect n elem -> Vect m elem -> Vect (n + m) elem append [] ys = ys append (x :: xs) ys = x :: append xs ys  

 

Terminology: parameters and indices

Vect defines a family of types, and we say that a Vect is indexed by its length and parameterized by an element type. The distinction between parameters and indices is as follows:

  • A parameter is unchanged across the entire structure. In this case, every element of the vector has the same type.

  • An index may change across a structure. In this case, every subvector has a different length.

The distinction is most useful when looking at a function’s type: you can be certain that the specific value of a parameter can play no part in a function’s definition. However, the index might – as when defining length for vectors by looking at the length index, and when defining createEmpties for building a vector of empty vectors.

Another common operation on vectors is zip, which pairs corresponding elements in two vectors, as illustrated in figure 1.


Figure 1 Pairing corresponding elements of [1,2,3,4] and [“one”,”two”,”three”,”four”] using zip


The name zip is intended to suggest the workings of a zip fastener, bringing two sides of a bag or jacket together. Because the length of each input Vect is in the type, you need to think about how the lengths of the inputs and output will correspond. A reasonable choice for this would be to require the lengths of both inputs to be the same:

 

  zip : Vect n a -> Vect n b -> Vect n (a, b)    

Having a more precise type for Vect, capturing the length in the type, means that you need to decide in advance how the lengths of the inputs to zip relate, and express this decision in the type. Also, it means you can safely assume that zip will only be called with equal length lists, because if this assumption is violated, Idris will report a type error.

You can define zip, as usual, step by step.

  1. (Define) Again, you can begin to define this by a case split on the first argument:

      zip : Vect n a -> Vect n b -> Vect n (a, b) zip [] ys = ?zip_rhs_1 zip (x :: xs) ys = ?zip_rhs_2  
  2. (Refine) You can fill in ?zip_rhs_1 with an expression search, because the only well-typed result is an empty vector:

      zip : Vect n a -> Vect n b -> Vect n (a, b) zip [] ys = [] zip (x :: xs) ys = ?zip_rhs_2  
  3. (Refine) For the second case, ?zip_rhs_2, take a look at the type of the hole and see if that gives you further information about what to do.

        b : Type   a : Type   x : a   k : Nat   xs : Vect k a   ys : Vect (S k) b -------------------------------------- zip_rhs_2 : Vect (S k) (a, b)  

    Notice that ys has length S k, meaning that there must be at least one element. If you case-split on ys, Idris won’t give you a pattern for the empty list, because it wouldn’t be a well-typed value:

      zip : Vect n a -> Vect n b -> Vect n (a, b) zip [] ys = [] zip (x :: xs) (y :: ys) = ?zip_rhs_1    

After the case split, Idris has created a new hole. Let’s take a look at the types of the local variables.

 

    b : Type   a : Type   x : a   k : Nat   xs : Vect k a   y : b   ys : Vect k b -------------------------------------- zip_rhs_1 : Vect (S k) (a, b)    

Again, there’s enough information to complete the definition with an expression search:

 

  zip : Vect n a -> Vect n b -> Vect n (a, b) zip [] ys = [] zip (x :: xs) (y :: ys) = (x, y) :: zip xs ys    

Idris has noticed that it needs to build a vector of length S k, so that it can create the appropriate vector of length k with a recursive call, and so that it can create the appropriate first element by pairing x and y.

Deconstructing expression searches

To understand what expression search has done, it can be instructive to remove some part of the result and replace it with a hole, to see what type expression search was working with at that point. For example, you can remove the (x, y):

  zip : Vect n a -> Vect n b -> Vect n (a, b) zip [] ys = [] zip (x :: xs) (y :: ys) = ?element :: zip xs ys    

Then, checking the type of the ?element hole, you’ll see that at this point, Idris was looking for a pair of a and b.

    b : Type   a : Type   x : a   k : Nat   xs : Vect k a   y : b   ys : Vect k b -------------------------------------- element : (a, b)    

The only way to make a pair of a and b at this point is to use x and y, and this is what Idris used to construct the pair.

That’s all for this article.


For more, check out the whole book on liveBook here.