![]() |
From Type-Driven Development with Idris by Edwin Brady
In this article, you will learn about defining dependent data types and defining vectors with Idris. |
Save 37% on Type-Driven Development with Idris. Just enter code fccbrady into the discount code box at checkout at manning.com.
Defining dependent data types
A data type is a type which is computed from a dependent other value. You’ve probably seen a dependent type, Vect
, where the exact type is calculated from the vector’s length:
Vect : Nat -> Type -> Type
The type of a Vect
depends on its length. This gives us additional precision in the type, which helps direct our programming via the process of type, define, refine. In this article, you’ll see how to define dependent types such as Vect
. The core of the idea is that, because there’s no syntactic distinction between types and expressions, types can be computed from any expression.
We’ll begin with a simple example to illustrate how this works, by defining a type for representing vehicles and their properties, depending on their power source; and you’ll see how you can use this to constrain the valid inputs to a function to those that make sense. You’ll then see how Vect
itself is defined, along with some useful operations.
A first example: classifying vehicles by power source
Dependent types allow you to give more precise information about the data constructors of a type – by adding more arguments to the type constructor. For example, you might have a data type to represent vehicles (for example, bicycles, cars, and buses), but some operations don’t make sense on all values in the type (for example, refueling a bicycle wouldn’t work because there’s no fuel tank). We’ll therefore classify vehicles into those powered by pedal and those powered by petrol, and express this in the type.
The following listing shows how you could express this in Idris.
Listing 1 Defining a dependent type for vehicles, with their power source in the type (vehicle.idr)
data PowerSource = Petrol | Pedal ❶
data Vehicle : PowerSource -> Type where ❷
Bicycle : Vehicle Pedal ❸
Car : (fuel : Nat) -> Vehicle Petrol ❹
Bus : (fuel : Nat) -> Vehicle Petrol ❹
❶ An enumeration type describing possible power sources for a vehicle
❷ A Vehicle’s type is annotated with its power source
❸ A vehicle powered by pedal
❹ A vehicle powered by petrol, with a field for current fuel stocks
You can write functions that’ll work on all vehicles by using a type variable to stand for the power source. For example, all vehicles have wheels. Not all vehicles carry fuel, and it only makes sense to refuel a vehicle whose type indicates it’s powered by Petrol.
These concepts are illustrated in the following listing.
Listing 2 Reading and updating properties of Vehicle
wheels : Vehicle power -> Nat ❶
wheels Bicycle = 2
wheels (Car fuel) = 4
wheels (Bus fuel) = 4
refuel : Vehicle Petrol -> Vehicle Petrol ❷
refuel (Car fuel) = Car 100
refuel (Bus fuel) = Bus 200
❶ Use a type variable, power, because this function works for all possible vehicle types.
❷ Refueling only makes sense for vehicles that carry fuel. Restrict the input and output type to Vehicle Petrol.
refuel : Vehicle Petrol -> Vehicle Petrol
refuel (Car fuel) = Car 100
refuel (Bus fuel) = Bus 200
refuel Bicycle impossible
refuel : Vehicle Petrol -> Vehicle Petrol
refuel (Car fuel) = Car 100
refuel (Bus fuel) impossible
vehicle.idr:15:8:refuel (Bus fuel) is a valid case
In general, you should define dependent types by giving the type constructor and the data constructors directly. This gives you a lot of flexibility in the form that the constructors take. Here it allows you to define types where the data constructors can take different arguments. You can either write functions that work on all vehicles (like wheels
) or functions that only work on some subset of vehicles (like refuel
).
Note: Defining families of types For Vehicle
, you’ve defined two types in one declaration (specifically, Vehicle Pedal
and Vehicle Petrol
). Dependent data types like Vehicle
are therefore sometimes referred to as families of types, because you’re defining multiple related types at the same time. The power source is an index of the Vehicle
family. The index tells you exactly which Vehicle
type you mean.
Defining vectors
The length information in the type helps drive development of functions on vectors. In this article we’ll look at how Vect
is defined, along with some of the operations.
It’s defined in the Data.Vect
module, as shown in listing 3. The type constructor, Vect
, takes a length and an element type as arguments, and when you define the data constructors, you state explicitly in their types what their lengths are.
Listing 3 Defining vectors (Vect.idr)
data Vect : Nat -> Type -> Type where ❶
Nil : Vect Z a ❷
(::) : (x : a) -> (xs : Vect k a) -> Vect (S k) a ❸
%name Vect xs, ys, zs ❹
❶ The type constructor Vect states that a vector constructor takes two arguments: a Nat, which is its length, and a Type, which is its element type.
❷ The data constructor Nil explicitly states that an empty vector has length Z.
❸ The data constructor (::) explicitly states that adding an element x to a vector of length k results in a vector of length S k (1 + k).
❹ Gives some default names to use in case splits.
The Data.Vect
library includes several utility functions on Vect
, including concatenation, looking up values by their position in the vector, and various higher-order functions, such as map
. Instead of importing this, though, we’ll use our own definition of Vect
and try writing some functions by hand. To begin, create a Vect.idr
file containing only the definition of Vect
in listing 3.
Because Vect
includes the length explicitly in its type, any function that uses some instance of Vect
will describe its length properties explicitly in its type. For example, if you define an append function on Vect
, its type will express
how the lengths of the inputs and output are related:
append : Vect n elem -> Vect m elem -> Vect (n + m) elem
NOTE: Type-level expressions – the expression n + m
in the return type here is an ordinary expression with type Nat
, using the ordinary +
operator. Because Vect
’s first argument is of type Nat
, expect to be able to use any expression of type Nat
. Remember, types are first class, and types and expressions are all part of the same language.
Having written the type first, as always, you can define append
by case-splitting on the first argument. You do this interactively as follows:
-
(Define) Begin by creating a skeleton definition and then case-splitting on the first argument
xs
:append : Vect n elem -> Vect m elem -> Vect (n + m) elem
append [] ys = ?append_rhs_1
append (x :: xs) ys = ?append_rhs_2
-
(Refine) Idris has enough information in the type to complete this definition by an expression search on each of the holes, resulting in this:
append : Vect n elem -> Vect m elem -> Vect (n + m) elem
append [] ys = ys
append (x :: xs) ys = x :: append xs ys
Another common operation on vectors is zip
, which pairs corresponding elements in two vectors, as illustrated in figure 1.
Figure 1 Pairing corresponding elements of [1,2,3,4] and [“one”,”two”,”three”,”four”] using zip
The name zip
is intended to suggest the workings of a zip fastener, bringing two sides of a bag or jacket together. Because the length of each input Vect
is in the type, you need to think about how the lengths of the inputs and output will correspond. A reasonable choice for this would be to require the lengths of both inputs to be the same:
zip : Vect n a -> Vect n b -> Vect n (a, b)
Having a more precise type for Vect
, capturing the length in the type, means that you need to decide in advance how the lengths of the inputs to zip
relate, and express this decision in the type. Also, it means you can safely assume that zip
will only be called with equal length lists, because if this assumption is violated, Idris will report a type error.
You can define zip
, as usual, step by step.
-
(Define) Again, you can begin to define this by a case split on the first argument:
zip : Vect n a -> Vect n b -> Vect n (a, b)
zip [] ys = ?zip_rhs_1
zip (x :: xs) ys = ?zip_rhs_2
-
(Refine) You can fill in
?zip_rhs_1
with an expression search, because the only well-typed result is an empty vector:zip : Vect n a -> Vect n b -> Vect n (a, b)
zip [] ys = []
zip (x :: xs) ys = ?zip_rhs_2
-
(Refine) For the second case,
?zip_rhs_2
, take a look at the type of the hole and see if that gives you further information about what to do.b : Type
a : Type
x : a
k : Nat
xs : Vect k a
ys : Vect (S k) b
--------------------------------------
zip_rhs_2 : Vect (S k) (a, b)
Notice that
ys
has lengthS k
, meaning that there must be at least one element. If you case-split onys
, Idris won’t give you a pattern for the empty list, because it wouldn’t be a well-typed value:zip : Vect n a -> Vect n b -> Vect n (a, b)
zip [] ys = []
zip (x :: xs) (y :: ys) = ?zip_rhs_1
After the case split, Idris has created a new hole. Let’s take a look at the types of the local variables.
b : Type
a : Type
x : a
k : Nat
xs : Vect k a
y : b
ys : Vect k b
--------------------------------------
zip_rhs_1 : Vect (S k) (a, b)
Again, there’s enough information to complete the definition with an expression search:
zip : Vect n a -> Vect n b -> Vect n (a, b)
zip [] ys = []
zip (x :: xs) (y :: ys) = (x, y) :: zip xs ys
Idris has noticed that it needs to build a vector of length S k
, so that it can create the appropriate vector of length k
with a recursive call, and so that it can create the appropriate first element by pairing x
and y
.
zip : Vect n a -> Vect n b -> Vect n (a, b)
zip [] ys = []
zip (x :: xs) (y :: ys) = ?element :: zip xs ys
b : Type
a : Type
x : a
k : Nat
xs : Vect k a
y : b
ys : Vect k b
--------------------------------------
element : (a, b)
That’s all for this article.
For more, check out the whole book on liveBook here.