Kurt_1stclassF_00

By Will Kurt

This article was excerpted from the book Learn Haskell.

The concept of First Class Functions is that functions are no different than any other data used in a program. The means that functions can be used both as arguments and well as returned as values from other functions. This is a deceptively powerful feature for a programming language to have. It allows you to abstract out any repetitive computation from your code, and ultimately allows you to write functions that effectively write other functions.

Suppose we have a function incEven that takes a number n if it is even increments it, otherwise it returns the number unchanged.


Listing 1 ifEvenInc

 ifEvenInc n = if even n
               then n + 1
               else n

Later we find out that we need two more functions: doubleEven and squareEven, that double and square even numbers respectively. These would be easy functions to write given that we know how to write incEven:


Listing 2 ifEvenDouble and ifEvenSquare

 ifEvenDouble n = if even n
                  then n * 2
                  else n
  
  
 ifEvenSquare n = if even n
                      then n^2
                      else n

While these functions were easy to write all three of them are nearly identical. The only difference is the behavior of incrementing, doubling and squaring. What we have discovered here is a general pattern of computation that we can abstract away. The key thing we need to do this is the ability to pass a function as an argument to perform the desired behavior.

We’ll demonstrate this with the function ifEven which takes a function and a number as arguments. If that number is even applies a function to that number:


Listing 3 ifEven

 ifEven myFunction x = if even x
                       then myFunction x
                       else x

We can also abstract out our incrementing, doubling and squaring behavior into three seperate functions:

 inc n = n + 1
 double n = n*2
 square n = n^2

Let’s recreate our previous definitions using the power of first class functions:

 ifEvenInc n = ifEven inc n
 ifEvenDouble n = ifEven double n
 ifEvenSquare n = ifEven square n

Now we can easily handle adding new functions such as ifEvenCube, ifEvenNegate, etc.


Lambda Functions as arguments

Naming functions is generally a good idea, but we can also use lambda functions to quickly add some code to pass into a function. If we wanted to just double the value, we can quickly put together a lambda function for this:

*> ifEven (\x -> x*2) 6
> 12

While named functions are preferred, there are many times were you want to pass in some very simple functionality.

Spot Quiz: write a lambda function for cubing x and pass it in to ifEven


Example – custom sorting

A very practical use of passing functions into other functions is for sorting. Suppose we have a list of first and last names. In this example each name is represented as a Tuple. A Tuple is a type that is very much like a List only it can contain multiple types in it and is of a fixed size. Here is an example of a name in a Tuple:

 author = ("Will","Kurt")

Tuples have two very useful functions that are used with them fst and snd, which access the first and second elements of the Tuple.

 *> fst author > "Will"
 *> snd author > "Kurt"

Now suppose we have a list of names we want to sort. Here is a set of names represented as a list of Tuples:


Listing 4 names

names = [("Ian", "Curtis"),
          ("Bernard","Sumner"),
          ("Peter", "Hook"),
          ("Stephen","Morris")]

We want to sort names. Thankfully Haskell does have a built-in sort function. To use it we’ll first need to import the Data.List module. To do this is fairly straightforward, we just need to add the following declaration to the top of whatever file we’re working in:

import Data.List

Alternatively, you can just type the import into GHCi. If we load a file with names and our import, we can see that Haskell’s sort takes a pretty good guess at how to sort these tuples:

*> sort names
 > [("Bernard","Sumner"),("Ian", "Curtis"),("Peter", "Hook"),("Stephen","Morris")]

Not bad given Haskell has no idea what we’re trying to do! Unfortunately, we usually don’t want to sort by first name. To solve this, we can use Haskell’s sortBy function, which is included in the Data.List module. We need to supply sortBy with another function that will compare two of our Tuple names. Once we explain how to compare two elements then the rest is taken care of. For this we’ll write a function compareLastNames. This function will take two arguments, name1 and name2, and will return GT, LT or EQ. GT, LT and EQ are special types representing “Greater Than”, “Less Than” and “Equal”.  In many programming languages we would just return True or False, or 1,-1 and 0.


Listing 5 compareLastNames

 compareLastNames name1 name2 = if lastName1 > lastName2
                                then GT
                                else if lastName1 < lastName2
                                     then LT
                                     else EQ
   where lastName1 = (snd name1)
         lastName2 = (snd name2)

Now we can go back to GHCi and use sortBy with our custom sorting:

*> sortBy compareLastNames names
 > [("Ian", "Curtis"),("Peter", "Hook"),("Stephen","Morris),("Bernard","Sumner")]

Much better! JavaScript, Ruby and Python all support a similar use of first class functions for custom sorting, so this technique is likely familiar to many programmers.

Spot quiz:    In compareLastNames we forgot to handle the case where two last names are the same but the first names are different, write a function to compare first names as use it to fix compareLastNames.


Returning Functions

We’ve talked a fair bit about passing functions as arguments, but this is only half of what it means to have first class functions as values. Functions also return values, and so for truly first class functions it makes sense that functions must sometimes return other functions. As always the question should be “why would I ever want to return a function?” One very good reason is that we want to dispatch certain functions based on other parameters.

Suppose we have created a Secret Society of Contemporary Alchemists and we need to send out newsletters to members at various regional post office boxes. There are offices in three cities: San Francisco, Reno and New York. Here are the office addresses:

  • PO Box 1234 – San Francisco, CA, 94111
  • PO Box 789 – New York, NY, 10013
  • PO Box 456 – Reno, NV 89523

We need to build a function that will take a name Tuple, like we used before in our sorting example, and an office location and then puts together the mailing address for us. A first pass at this function might look like this. The only other thing we need to introduce that we haven’t seen yet is the ++ operator used to concatenate strings (and lists).


Listing 6 addressLetter v.1

addressLetter name location = nameText ++" - "++location
   where nameText = (fst name) ++" "++ (snd name)

To use this function, we have to pass a name Tuple and the full address:

*> addressLetter ("Bob","Smith") "PO Box 1234 - San Francisco, CA, 94111"
 > "Bob Smith - PO Box 1234 - San Francisco, CA, 94111"

This is a fine solution, we could easily just use variables to keep track of the addresses and that would make use much less likely to have errors (and save us typing). We’re all set to send out our newsletters!

After the first round of newsletters we get some complaints and requests from the regional offices:

  • San Francisco added a new address for members with last names starting with the letter “L” or later in the alphabet: PO Box 1010 – San Francisco, CA, 94109
  • New York wants the name followed by an “:” rather than a “-“, for mystical reasons they won’t share.
  • Reno wants only last names to be used for greater secrecy.
  • It’s clear that now we’ll need a different function for each office:

Listing 7 sfOffice, nyOffice, renoOffice

 sfOffice name = if lastName < "L"
                 then nameText 
                      ++ " - PO Box 1234 - San Francisco, CA, 94111"
                 else nameText
                      ++ " - PO Box 1010 - San Francisco, CA, 94109"
   where lastName = (snd name)
         nameText = (fst name) ++" "++ lastName
        
         
 nyOffice name = nameText ++ ": PO Box 789 - New York, NY, 10013"
   where nameText = (fst name) ++" "++ (snd name)
  
 renoOffice name = nameText ++ " - PO Box 456 - Reno, NV 89523"
   where nameText = (snd name)

The question now is how should we use these three functions with addressLetter? We could simply rewrite addressLetter to take a function rather than a location as an argument. The trouble with this is that our addressLetter function is going to be part of a larger web application and we’d like to just pass in a string parameter for the location. What we’d really like is another function that will take a location string and dispatch the right function for us. We’ll build a new function called getLocationFunction, that will take a single string and dispatch the correct function. Rather than a bunch of nested if then else statements we’ll make use of Haskell’s case expresssion:


Listing 8 getLocationFunction

 getLocationFunction location = case location of ❶ 
   "ny" -> nyOffice ❷ 
   "sf" -> sfOffice ❸ 
   "reno" -> renoOffice ❹ 
   _ -> (\name -> (fst name) ++" "++ (snd name))❺ 

❶   case looks at the value of location

❷   if location is “ny” return nyOffice

❸   if location is “sf” return sfOffice

❹   if location is “reno” return

❺   If it is anything else (_ is a wildcard) return a generic solution


Our case expression should be pretty straight forward, except that _ at the end. We want to capture the situation where a string other than one of our official locations is passed in. In Haskell _ is used very frequently as a wild card. This will be covered in much more depth in the next chapter. In this case if the user of our code passes in an invalid location we just put together a quick lambda function that will just make the name Tuple into a string. Now we have a single function that will return the function we need when we need it. Finally, we can rewrite addressLetter:


Listing 9 addressLetter v.2

addressLetter name location = locationFunction name
   where locationFunction = getLocationFunction location

In GHCi we can test out that our function performs as expected:

  
 *> addressLetter ("Bob","Smith") "ny"
 > "Bob Smith: PO Box 789 - New York, NY, 10013"
  
 *> addressLetter ("Bob","Jones") "ny"
 > "Bob Jones: PO Box 789 - New York, NY, 10013"
  
 *> addressLetter ("Samantha","Smith") "sf"
 > "Samantha Smith - PO Box 1010 - San Francisco, CA, 94109"
  
 *> addressLetter ("Bob","Smith") "reno"
 > "Smith - PO Box 456 - Reno, NV 89523"
  
 *> addressLetter ("Bob","Smith") "la"
 > "Bob Smith"

Now that we’ve separated each of the functions needed for generating addresses we can easily add new rules as they come in from each office. In this example returning functions as values helped us tremendously to make our code easier to understand and extend. This is a pretty simple use of returning functions as values; all we have done is automated how functions can move around.


Exercises

  • For anything that can be compared in Haskell, for example [Char]which we use for the actual names in our name Tuples, can be compared with a function called compare. The compare function returns either GT, LT or EQ. Rewrite compareLastNames using compare.
  • Define a new location function for Washington DC and add it to getLocationFunction. In the DC function everyone’s names must be followed by “Esq.”

Question: “How do you use functions as values?”

Answer: Functions can be used as arguments to other functions. This is frequently done to abstract out computation. Functions can also be returned as values from other functions, both to dispatch function based on input as to create new functions.