by Bayle Shanks
some code samples are from __Yet Another Haskell Tutorial__ which is by Hal Daume III
This (currently unfinished) tutorial aims to be short yet to cover topics through basic monads.
Much of the bulk of the text consists of short code examples and interpreter traces.
The intended reader has programming experience, but only in imperative languages, and wants a quick introductory tour of Haskell. The tutorial is not intended to be difficult, however in the interest of brevity many things are demonstrated without much explanation.
It does not aim to be comprehensive, so after reading it, interested parties may want to read a longer tutorial to get the details that are skipped here. Hopefully, after reading this tutorial, you will be able to breeze through a longer one.
Copyright 2008 Bayle Shanks. You may copy this document under the terms of either:
whichever you prefer (but not in any other circumstances without prior permission).
(ignore this if you are reading the PDF). I'm using EasyLatex? to compile the wiki source of this page into a PDF. So don't mind the occasional LaTeX? commands. For example:
\usepackage{times} \usepackage[margin=3.7cm]{geometry}
\newpage
As of this writing, the most popular implementation of Haskell seems to be ghc
, the Glasgow Haskell Compiler. The GHC project also produces ghci
, an interpreter. In this tutorial, we'll use ghci
(although of course most of the things that we'll talk about are general features of the language that will be in any implementation). So go install ghci
now.
When you start ghci
, you get an intro banner and then a prompt:
Prelude>
To quit, type Cntl-D.
Start ghci
again. You can enter Haskell directly in the interpreter, for example:
Prelude> print "Hello World" "Hello World" Prelude>
If you want to put this in a file, here's how. Each file is a module. The module name should be the same as the file name, except that the file name should have an .hs at the end and the module name should not.
So, create a file called Test.hs
and put this into it:
module Test where helloWorldFunction = print "Hello world"
To load a module in ghci
, you say ":l modulename
". So, to load module Test and then evaluate helloWorldFunction
, you do:
Prelude> :l Test [1 of 1] Compiling Test ( Test.hs, interpreted ) Ok, modules loaded: Test. *Test> helloWorldFunction "Hello world" *Test>
Note that prompt changes from "Prelude" to "*Test". If you later change the sourcecode file and want to reload it, you can use :r
, which reloads the current module.
In the previous example, the reason that there are quotes around "Hello world" is that the "print" command prints a representation of a value that is useful for debugging. If you just have a string that you want to send to the console directly, you use putStrLn
. For example:
Prelude> print "Hello world" "Hello world" Prelude> putStrLn "Hello world" Hello world Prelude>
If you want to compile a standalone executable, you need to make a Main module. Put the following into Main.hs:
module Main where main = putStrLn "Hello world"
Now, at your operating system's command line, use ghc
to compile:
$ ghc --make Main.hs -o main $ ./main Hello world
\newpage
In addition,
In Haskell, every expression has a type2. Various functions and operators have various restrictions on what sort of types they can operate on. If you try to use incompatible types, you'll get a compile time error.
If you are curious about the type of some expression in Haskell, you can use ghci
's :t
command to find out. For example:
Prelude> :t 'c' 'c' :: Char Prelude> :t "a string" "a string" :: [Char]
Later on we'll talk about how to read the notation for expressing the types of things in Haskell. Until then, don't worry about it.
At this point you might be worried that you'll spend a lot of time declaring the types of things, so I should mention that Haskell doesn't usually require you to declare the types of your variables and functions. Haskell has a powerful system of type inference that guesses the types of almost everything automatically.
A disadvantage of using a powerful type inference system is that it makes type error messages harder to interpret. For example, let's say that you have three expressions, call them A,B,C. Let's say that you give expression A the wrong type. Let's say that you construct expression B out of expression A, and then in a very different part of the program, you refer to expression B in expression C. Because you made a mistake with expression A, Haskell might infer the wrong type for expressions B and C. Perhaps the error will only surface when it gets to expression C. In this case, the error message will refer only to expression C, and you'll have to figure out that the root of the problem was really with expression A.
To define a function in a source code file, write something like:
functionName argument1 argument2 = expression
For example:
plusFive a = a+5
Inside ghci
, you have to put the word "let
" at the beginning of the function definition:
Prelude> let plusFive a = a+5 Prelude> plusFive(10) 15
Functions you define are prefix by default. We'll talk about how to define infix functions later.
To call a prefix function, just write the function, and then a space, and then the argument. If there are multiple arguments, just separate them by spaces. For example:
Prelude> let addition a b = a+b Prelude> addition 2 3 5
In Haskell, if you want to execute a sequence of instructions, you can't just put them one after another, unless they are within a do
. For example, try putting this incorrect code into Test.hs
:
module Test where test = print "First line." print "Second line."
It won't compile in ghci
(don't worry about trying to understand this error message3):
Prelude> :l Test [1 of 1] Compiling Test ( Test.hs, interpreted ) Test.hs:2:7: Couldn't match expected type `(a -> IO ()) -> [Char] -> t' against inferred type `IO ()' In the expression: print "First line." print "Second line." In the definition of `test': test = print "First line." print "Second line." Failed, modules loaded: none.
The problem is that in Haskell, a function is just __a single expression__. So in a sense, a whole Haskell function is analogous to just a single line of code in other languages.
To execute a sequence of instructions, you have to wrap them in a do
block. Example in ghci
:
Prelude> do {putStrLn "First line."; putStrLn "Second line."} First line. Second line.
Example as a source code file (to be placed in Test.hs
):
module Test where test = do {putStrLn "First line."; putStrLn "Second line."}
A complete do
block is itself a single expression that returns a single value (which is why you can have a whole do
block inside a function, even though functions must be single expressions).
Under the hood, do
blocks are actually syntactic sugar for something pretty complicated involving monads. We'll talk more about the deeper meaning of do
blocks later. For now, one more thing to note is that a do
block that contains I/O has a return value of type 'IO
something' (where the 'something' varies).
You don't have mutable variables in Haskell.
Put the following incorrect code into Test.hs
:
module Test where x = 1 x = x+1
Now try loading it in ghci
:
Prelude> :l Test [1 of 1] Compiling Test ( Test.hs, interpreted ) Test.hs:3:0: Multiple declarations of `Test.x' Declared at: Test.hs:2:0 Test.hs:3:0 Failed, modules loaded: none.
To accomplish the same effect, you have to have a different name for the so-called variable at each stage. Put this into Test.hs:
module Test where x = 1 x2 = x+1
Now it works:
Prelude> :l Test [1 of 1] Compiling Test ( Test.hs, interpreted ) Ok, modules loaded: Test. *Test> x2 2
You can see that what you are really doing here is not declaring "variables", but defining functions, just like the "addition" function in the example above. There are still variables in Haskell, in the form of function arguments; but these are just like the variables found in mathematics -- their value doesn't change over the course of the computation.
However, for me, the word "variable" in the context of a programming language is so bound up the idea of mutable variables that I like to tell myself this: in Haskell, there are no "variables", because nothing "varies"; there are just functions, each with a fixed4 definition.
I'll say that again. In the above code, x
and x2
do not correspond to memory locations which you can read and write. x
and x2
are names of functions that you have defined.
Now, you're thinking, how are we going to program without mutable variables? Well, as far as I know (todo), there are 3 ways in which variables are used in typical imperative programming languages:
I'll treat each case in turn.
In the first case, these variables don't really have to be mutable -- you could always5 give the variable a new name on each line of code that assigns to it6.
An example of the first case, in Python:
import time toOutput = "The time is " toOutput = toOutput + time.strftime('%H:%M') toOutput = toOutput + " right now" print toOutput
In this situation, toOutput can be replaced by three different variables:
import time toOutput1 = "The time is " toOutput2 = toOutput + time.strftime('%H:%M') toOutput3 = toOutput + " right now" print toOutput3
The second case is function arguments. Function arguments aren't usually considered "mutable variables", but they do take on different values each time the function is called. Well, we do have function arguments in Haskell, so there's no problem here.
The third case are things that change as you progress through loop iterations. Well, that's easily taken care of. In Haskell, you're not allowed to loop. Feel better?
That's a half-truth. In Haskell, you replace loop constructs with recursion. Each iteration is replaced with a new call to the function. So, you can still go through the same section of code over and over again while incrementing the equivalent of a loop counter -- it's just that the loop counter is a function argument, rather than a local variable, and that each iteration is a separate function call. I'll give an example later.
By the way, you can put some nonalphanumeric characters into your function names. Some people like to use the apostrophe (for example, "x'" -- read "x prime") to indicate to the reader that a bunch of functions should be considered to form sequence. Example:
module Test where x = 1 x' = x+1
To reassign a name within a do
block, use something like "let {x=1}
". Example:
Prelude> do {let {s="Line 1"}; putStrLn s; let {s="Line 2"}; putStrLn s;} Line 1 Line 2
However, this doesn't mean that we have mutable variables. The following code enters an infinite loop when Haskell tries to evaluate "x=x+1":
Prelude> do {let {x=1}; print x; let {x=x+1}; print x;}
You can also reassign things within the ghci interpreter using let
. You might think of the things that you enter in the interpreter as being within an implicit do
block7. Example:
Prelude> let x = 3 Prelude> x 3 Prelude> let x = 4 Prelude> x 4 Prelude> let f a b = a+b Prelude> f 2 3 5 Prelude> let f a b = a*b Prelude> f 2 3 6
You can either enclose blocks in curly brackets and delimit lines with semicolons, or you can use whitespace. If you use whitespace, just indent things that you would have put into curly brackets.
In the ghci
interpreter, you cannot use whitespace, you must use only "{}
" and ";
". You can only use whitespace in source code files.
Here is an example using "{}
" and ";
":
module Test where test = do {putStrLn "First line."; putStrLn "Second line."}
And here is the equivalent using whitespace:
module Test where test = do putStrLn "First line." putStrLn "Second line."
The rules for interpretation of whitespace in Haskell are sometimes called the rules of "layout".
Generally for the rest of this tutorial when I present code samples, I'll use the whitespace form, as if it was sitting inside a source code file. Sometimes, however, when I want to show you what happens when you evaluate something, I'll show you a ghci
trace and put in some commands in the form that uses {}
and ;
.
Examples:
x = 3 -- this is a single line comment {- this is a multiline comment -}
To import a module in a source code file, use the import
statement. For example, to import module Random
:
import Random
To import a module within the ghci
interpreter, use the :m
command. For example, to import module Random
:
Prelude> :m Random Prelude Random>
You can set breakpoints within do
blocks by using the ghci
along with the breakpoint
command, which is in the GHC.Base
module:
module Test where import GHC.Base main = do let a = 3 let b = a*2 breakpoint $ do print a+b
If you are using ghci
6.8.1 or later, you can use the more powerful debugging facilities included in ghci
. See http://haskell.org/ghc/docs/6.8.1/html/users_guide/ghci-debugger.html for details.
True
and False
are the result of boolean expressions.
Another special value in Haskell is the "unit" value, written:
()
This is sometimes used to mean something like what is called "null" in other languages (in Haskell, null
is used for something else; it's a boolean function that tells you if a list is empty).
By the way, sometimes you run into trouble because you expect the negation function ('-
') to bind very tightly to numbers, but it doesn't. Just use parenthesis. Example:
Prelude> abs 3 3 Prelude> abs -3 <interactive>:1:5: No instance for (Num (a -> a)) arising from the literal `3' at <interactive>:1:5 Possible fix: add an instance declaration for (Num (a -> a)) In the second argument of `(-)', namely `3' In the expression: abs - 3 In the definition of `it': it = abs - 3 Prelude> abs (-3) 3
When we tried to do abs -3
, what happened was that it was parsed as (abs -) 3
. Oh well.
Example8:
sgn x = if x < 0 then -1 else if x > 0 then 1 else 0
An else
is always required in Haskell.
Both the then
expression and the else
expression have to return compatibly-typed values (so that the overall expression has a sensible return type).
For example, if you put this incorrect code into Test.hs, then it won't compile:
badIf x = if x < 0 then 1 else "this is a string"
Here's the error you get:
Prelude> :l Test [1 of 1] Compiling Test ( Test.hs, interpreted ) Test.hs:10:15: No instance for (Num [Char]) arising from the literal `1' at Test.hs:10:15 Possible fix: add an instance declaration for (Num [Char]) In the expression: if x < 0 then 1 else "this is a string" In the definition of `badIf': badIf x = if x < 0 then 1 else "this is a string" Failed, modules loaded: none.
What the compiler is basically saying in this error is:
"I noticed that the type of the then
clause was a number ('Num'
), so I assumed that the type of the whole if/then/else
clause was a number. Then I noticed that the type of the else
clause was a string ('[Char]
'). But a string is not a type of number ('No instance for (Num [Char])
'). So the type of the else
clause is not compatible with the inferred type of the whole if/then/else
clause."
If you do an if/then/else
inside a do
block, and you want to do a sequence of actions inside the if/then/else
, then you have to open a new do
block.
For example, the following code is incorrect:
main = do let a = 3 putStrLn "Testing if a == 3..." if a == 3 then putStrLn "Yep, it is 3." putStrLn "Good." else putStrLn "Nope, a is not 3." putStrLn "That's impossible."
This is what you have to do:
main = do let a = 3 putStrLn "Testing if a == 3..." if a == 3 then do putStrLn "Yep, it is 3." putStrLn "Good." else do putStrLn "Nope, a is not 3." putStrLn "That's impossible."
\newpage
To get a line of input from the console and temporarily assign it to the symbol name "line", do this within a do
block:
line <- getLine
For example:
Prelude> do {putStrLn "Enter a name"; name <- getLine; putStrLn ("Hi " ++ name)} Enter a name Bayle Hi Bayle
(note: ++
is how you do string concatenation)
At this point you are probably wondering, "what's that <-
operator for?"
The full answer involves a theoretical issue with I/O, the deeper meaning of do
blocks, and monads. So I'll put off explaining it until later. For now, just pretend that name
is a mutable variable, and that within the context of a do
block, the '<-
' operator means "assign the result of this function to that variable." Note, however, that the '<-
' operator can only be used with certain functions. Functions which receive transient information from 'the outside world' are some of these functions.
When you use getLine
, you get back strings. If the user is actually entering something else type of thing, like a number, you'll have to explicitly convert the string to that other type. So, the following is incorrect code:
isUserInputGreaterThan3_bad = do input <- getLine; if input > 3 then putStrLn (input ++ " is greater than 3") else putStrLn (input ++ " is NOT greater than 3")
The error you get is:
Prelude> :l Test [1 of 1] Compiling Test ( Test.hs, interpreted ) Test.hs:6:17: No instance for (Num String) arising from the literal `3' at Test.hs:6:17 Possible fix: add an instance declaration for (Num String) In the second argument of `(>)', namely `3' In the predicate expression: input > 3 In the expression: if input > 3 then putStrLn (input ++ " is greater than 3") else putStrLn (input ++ " is NOT greater than 3") Failed, modules loaded: none.
A function that parses user input and turns strings into other types of values is (for example, numbers) is read
. This is the thing you need to fix the above example. For example:
isUserInputGreaterThan3 = do input <- getLine; let number = read(input) if number > 3.0 then putStrLn (input ++ " is greater than 3") else putStrLn (input ++ " is NOT greater than 3")
If the user enters something that isn't a number, you'll get an exception that looks like:
*Test> isUserInputGreaterThan3 woah *** Exception: Prelude.read: no parse
IMPORTANT NOTE: Note that we used let
=
to assign the result from read
, not <-
. Remember that <-
can only be used with certain special functions. read
is not one of them.
Another note. In the above code, it is important that we said number > 3.0
rather than number > 3
. If we hadn't done that, then the function would work find when the user enters and integer, but it would crash when the user tries to enter a number with a decimal point. The reason is that doing this caused Haskell's automatic type inference system to infer that number
supported 'Fractional
' operations. And this in turn caused Haskell to choose a variant of the read
function that returns something that supports 'Fractional
' operations, which is necessary for read
to be able to parse reals.
Here's the error you get if you put 3
instead of 3.0
:
Prelude> :l Test [1 of 1] Compiling Test ( Test.hs, interpreted ) Ok, modules loaded: Test. *Test> isUserInputGreaterThan3 4 4 is greater than 3 *Test> isUserInputGreaterThan3 3.5 *** Exception: Prelude.read: no parse
How are you supposed to figure out the root cause of the problem for something like this? We haven't yet talked enough about types to be able to explain how to troubleshoot this type of problem, but we'll come back to it later.
The show
function is sort of the opposite of the read
function. The show
converts various non-string data types to strings:
Prelude> 2 2 Prelude> show 2 "2" Prelude> "string" ++ 2 <interactive>:1:12: No instance for (Num [Char]) arising from the literal `2' at <interactive>:1:12 Possible fix: add an instance declaration for (Num [Char]) In the second argument of `(++)', namely `2' In the expression: "string" ++ 2 In the definition of `it': it = "string" ++ 2 Prelude> "string" ++ (show 2) "string2"
\newpage
There's something tricky about doing conditional I/O in Haskell. We haven't run into this yet in the examples, but in case you try to do it yourself I wanted to mention it now. You can skip over this section and come back to it later if you want.
As previously noted, a do
block that contains I/O has a return value of type 'IO
something' (where the 'something' varies). Since the return type of both the then
and the else
must be compatible, this means that if you do I/O in one condition and not in the other, you'll get an error about incompatible types.
For instance, the following incorrect code causes an error:
conditionalPrint bool str = if bool then print str else ()
Here's the error that you get:
Prelude> :l Test [1 of 1] Compiling Test ( Test.hs, interpreted ) Test.hs:12:12: Couldn't match expected type `IO ()' against inferred type `()' In the expression: if bool then print str else () In the definition of `conditionalPrint': conditionalPrint bool str = if bool then print str else () Failed, modules loaded: none.
What the compiler is saying is something like:
"I noticed that the type of the then
clause was IO ()
, so I assumed that the type of the whole if/then/else
clause was IO ()
. So I expected that the type of the else
clause would be IO ()
, too. But when I calculated the type of the else
clause, it turned out to be ()
. And these two types don't match."
To fix this (at least in the case of print
, which returns type IO ()
), you can use the following weird-looking expression, which generates a type of IO ()
without actually doing any I/O:
( (do {return ()})::IO () )
We'll explain this... later. Here's an example of it in use:
conditionalPrint bool str = if bool then print str else ( (do {return ()})::IO () )
And here's how it looks when you run it (assuming you put it into Test.hs
below an appropriate module header):
Prelude> :l Test [1 of 1] Compiling Test ( Test.hs, interpreted ) Ok, modules loaded: Test. *Test> conditionalPrint True "hi" "hi" *Test> conditionalPrint False "hi" *Test>
\newpage
Here's the analog of a for
loop. The following function prints out a count from a to b:
countFromAtoB a b = if a<b then do print a countFromAtoB (a+1) b else print a
Here's how it looks (assuming that you put the above code into Test.hs
, following an appropriate module declaration):
Prelude> :l Test [1 of 1] Compiling Test ( Test.hs, interpreted ) Ok, modules loaded: Test. *Test> countFromAtoB 3 5 3 4 5
You might be worried that if you did a loop with many iterations this way, the stack would overflow, because as the number of iterations increased, the recursion depth would increase, and the call stack would have to hold more and more stack frames. However, compilers for functional languages notice when the very last operation that a function does is a recursive call to itself, and they convert these structures to iteration in the compiled code. This is called "tail call optimization"9. So don't worry.
Here's the analog of a while
loop:
whileUserDoesntEnterQ = do putStrLn "Enter Q to quit" input <- getLine; if (input == "Q") then print "Bye" else whileUserDoesntEnterQ
\newpage
Now that we know how to do if/then/else
, basic console input and output, and loops using recursion, we can do this example10:
module Main where import Random main = do num <- randomRIO (1, 100) putStrLn "I'm thinking of a number between 1 and 100" doGuessing num doGuessing num = do putStrLn "Enter your guess:" guess <- getLine let guessNum = read guess if guessNum < num then do putStrLn "Too low!" doGuessing num else if read guess > num then do putStrLn "Too high!" doGuessing num else do putStrLn "You Win!"
Note that randomRIO
is one of those functions that uses the <-
operator. Since the result of this function will be different each time you call it, you can think of it as a function which returns transient information about some 'outside world'.
\newpage
Tuples are fixed-length, ordered collections of hetrogeneous elements.
To group two values together into an ordered pair, surround them with parentheses, for example11:
Prelude> (5,3) (5,3)
The values in a pair don't have to be the same type:
Prelude> (5, "hello") (5,"hello")
To extract the first and the second values from a pair, you could use the predefined functions fst
and snd
12:
Prelude> fst (5, "hello") 5 Prelude> snd (5, "hello") "hello"
In addition to pairs, you can construct triples, quadruples, etc13:
Prelude> (1,2,3) (1,2,3) Prelude> (1,2,3,4) (1,2,3,4)
In general, pairs, triples, quadruples, etc are called tuples.
Tuples are flexible because they can put together different data types. They are not equipped to (easily) change their length, however; if you want a variable-length collection of elements, you may be better off with a list.
Note: the functions fst
and snd
only work on pairs, not triples, etc. We'll see how to extract values from longer tuples later, when we talk about "pattern matching".
Lists are variable-length, ordered collections of homogenous elements.
To construct a list, surround the elements of the list with square brackets:
Prelude> [5,3,2] [5,3,2]
To add an element to the front of a list, use a colon (called a "cons" operator, short for "construct" as in "we are constructing a list"):
Prelude> 7:[5,3,2] [7,5,3,2]
You can build any list by starting with the empty list (written []
) and then adding in the elements one by one:
Prelude> 7:5:3:2:[] [7,5,3,2]
You can make lists of tuples, lists of lists, lists of functions, or whatever. However, the elements of a list must all be of the same type.
Strings are just lists of characters14:
Prelude> 'H':'e':'l':'l':'o':[] "Hello"
Some basic list functions are:
head
: the first element of a non-empty listtail
: all elements __but__ the first element of a non-empty listlength
: length of a listnull
: True iff the list is empty++
: concatenate two liststake
: make a new list with the first n elements of the input list (see below)filter
: elementwise filter elements out of a list (see below)map
: elementwise transform a list (see below)..
: enumerate from x to y (enumerable lists only!) (see below)zip
: "zip" two lists together into a single list with pairs (see below)take
takes 2 arguments, an integer and a list. It returns a new list with the first n elements of the list. Example:
Prelude> take 3 [3,6,2,5] [3,6,2]
filter
takes two arguments, a boolean function and a list. It returns another list. It applies the boolean function to each element of the list and only those elements which pass the test (that is, those elements for which the function returns True
) get added to the return list. For instance15:
Prelude> filter Char.isLower "Hello World" "elloorld"
map
takes two arguments, a function and a list, and returns another list. It applies the function to each element of the input list, and then adds the result of the function to the output list. For instance16:
Prelude> map Char.toUpper "Hello World" "HELLO WORLD"
If the elements of a list are a type that is __enumerable__ (that is, if their type is a member of the Enum
typeclass.. which will be introduced later), then you can use ..
as a shorthand when constructing a list, as in the following example. Int
and Char
(integers and characters) are examples of enumerable types.
Prelude> [1..10] [1,2,3,4,5,6,7,8,9,10]
To specify a step other than 1:
Prelude> [1,3..10] [1,3,5,7,9] Prelude> [1,4..10] [1,4,7,10]
Example of zip
:
Prelude> zip [1,2,3,4,5] ['a','b','c','d','e'] [(1,'a'),(2,'b'),(3,'c'),(4,'d'),(5,'e')]
There is some syntactic sugar for using map
and filter
on lists. This construct is called a list comprehension. It looks like this:
[expression containing x | x <- inputList, g x]
(except you can use any name in place of x
) and it evaluates to:
map (a function representing 'expression containing x') (filter g inputList)
For example17:
Prelude> map toLower (filter isUpper "Hello World") "hw" Prelude> [toLower x | x <- "Hello World", isUpper x] "hw"
You can put more than one filter
condition in the list comprehension:
Prelude> let li = [(0,0), (0,1), (1,0), (1,1)] Prelude> [(x, "hello") | x <- li, fst x == snd x, fst x < 1] [((0,0),"hello")]
List comprehensions can also be used to combine multiple lists in complex ways. Suppose18 you want to create a list of pairs, one for each point where 0 <
x <
2, 0 <
y <
3, and x <=
y.
Prelude> [(x,y) | y <- [0..5], x <- [0..5], x <= 2, y <= 3, x <= y] [(0,0),(0,1),(1,1),(0,2),(1,2),(2,2),(0,3),(1,3),(2,3)]
Latter comma-separated clauses can use the names assigned in former clauses:
Prelude> [(x,y) | x <- [0..2], y <- [x..3]] [(0,0),(0,1),(0,2),(0,3),(1,1),(1,2),(1,3),(2,2),(2,3)]
The order of the comma-separated clauses in the second part matters:
Prelude> [(x,y) | y <- [x..3], x <- [0..2]] <interactive>:1:15: Not in scope: `x' Prelude> [(x,y) | y <- [0..3], x <- [0..y], x <= 2] [(0,0),(0,1),(1,1),(0,2),(1,2),(2,2),(0,3),(1,3),(2,3)]
Note that the order of the elements in the resulting list is different from the way it was when x <-
came before y <-
.
Exercise 119 Use map to convert a string into a list of booleans, each element in the new list representing whether or not the original element was a lower-case character. That is, it should take the string "aBCde"
and return [True,False,False,True,True]
.
Exercise 220 Use the functions mentioned in this section (you will need two of them, plus Char.isLower
) to compute the number of lower-case letters in a string. For instance, on "aBCde"
it should return 3.
Exercise 321 Write a function that takes a list of pairs of length at least 2 and returns the first component of the second element in the list. So, when provided with [(5,'b'),(1,'c'),(6,'a')]
, it will return 1.
id
: the identity function; id x = x
==
: equality/=
: inequalitynot
: logical negation&&
: logical AND||
: logical ORand
: input: a list of booleans. output: True iff they are all Trueor
: input: a list of booleans. output: True iff any of them are True\newpage
You can temporarily define a symbol using let
... in
. Example22:
roots a b c = let det = sqrt (b*b - 4*a*c) in ((-b + det) / (2*a), (-b - det) / (2*a))
You can define multiple symbols at once like this23:
roots a b c = let det = sqrt (b*b - 4*a*c) twice_a = 2*a in ((-b + det) / twice_a, (-b - det) / twice_a)
Another construct which is like let
except that it goes in different places is where
. where
goes after the block of code rather than before it24 25:
roots a b c = ((-b + det) / (2*a), (-b - det) / (2*a)) where det = sqrt(b*b-4*a*c)
You can use let
or where
to temporarily redefine something which was already defined outside of the code block. Outside of the code block, the symbol goes back to its previous definition:
x = 1 y = let x = 2 in x*2 z = x*2
In that piece of code, you end up with x=1, y=4, z=2:
Prelude> let x=1 Prelude> let y=let {x = 2} in x*2 Prelude> let z=x*2 Prelude> x 1 Prelude> y 4 Prelude> z 2
You can have both a let
and a where
surrounding the same piece of code:
z = let x = 2 in x*y where y = 3
In that example, z=6.
If you tried to define the same symbol in both the let
and the where
clause, the let
clause takes precedence. But don't do that.
Another name for this sort of "temporary definition" is "local binding". Another name is "temporary declaration". When a local definition takes precedence over a definition in an outer scope, it is said that the local definition "shadows" the other definition.
Since (outside of a do
block, Haskell doesn't have26 sequential statements, let
and where
often substitute for what in other languages would be a list of sequential variables assignments. Example:
vectorLength x y z = let xSq = x^2 ySq = y^2 zSq = z^2 sumOfSquares = xSq+ySq+zSq ans = sqrt(sumOfSquares) in ans
\newpage
In this example, edgeList
represents the list of edges in a directed graph, represented as a list of pairs, where the first element of each pair is the edge source, and the second element is the target.
For example, [(0,1), (0,2), (2,3), (3,4)]
represents the graph with vertices 0,1,2,3,4 which looks like:
\begin{graph} rankdir = "LR"; size="2,2" 0 -> 1 0 -> 2 2 -> 3 3 -> 4 \end{graph}
Here's the source code:
search edgeList src dst = if src == dst then [src] else searchEdgesOriginatingAtSrc edgeList src dst searchEdgesOriginatingAtSrc edgeList src dst = if (null edgeList) then [] else let currentEdge = head edgeList currentEdgeSrc = fst currentEdge currentEdgeDst = snd currentEdge in if src == currentEdgeSrc then let searchResult = search edgeList currentEdgeDst dst in if not (null searchResult) then src:searchResult else searchEdgesOriginatingAtSrc (tail edgeList) src dst else searchEdgesOriginatingAtSrc (tail edgeList) src dst
Here's how it looks:
Prelude> :l Test [1 of 1] Compiling Test ( Test.hs, interpreted ) Ok, modules loaded: Test. *Test> let el = [(0,1), (0,2), (2,3), (3,4)] *Test> search el 0 3 [0,2,3] *Test> search el 0 4 [0,2,3,4] *Test> search el 4 0 []
We'll come back to this example27 a few times and rewrite it in various ways.
One alternative way that we can write this is to use a where
clause to define searchEdgesOriginatingAtSrc
within the scope of search
, in which case we don't have to pass in the arguments src
and dst
:
search edgeList src dst = if src == dst then [src] else searchEdgesOriginatingAtSrc edgeList where searchEdgesOriginatingAtSrc edgeList = if (null edgeList) then [] else let currentEdge = head edgeList currentEdgeSrc = fst currentEdge currentEdgeDst = snd currentEdge in if src == currentEdgeSrc then let searchResult = search edgeList currentEdgeDst dst in if not (null searchResult) then src:searchResult else searchEdgesOriginatingAtSrc (tail edgeList) else searchEdgesOriginatingAtSrc (tail edgeList)
\newpage
Example:
dayStrExpand abbrev = case abbrev of "M" -> "Monday" "T" -> "Tuesday" "W" -> "Wednesday" "Th" -> "Thursday" "F" -> "Friday" "Sat" -> "Saturday" "Sun" -> "Sunday" _ -> "(INVALID DAY ABBREVIATION!)"
The underscore is the "default" rule that fires if none of the others do. In Haskell the underscore represents a wildcard that matches anything.
Just like with then
and else
, each expression of the case
has to return compatibly-typed values (so that the overall case
expression has a sensible return type).
There is a special value in Haskell called undefined
. Example:
dayStrExpand abbrev = case abbrev of "M" -> "Monday" "T" -> "Tuesday" "W" -> "Wednesday" "Th" -> "Thursday" "F" -> "Friday" "Sat" -> "Saturday" "Sun" -> "Sunday" _ -> undefined
If you try to print out an undefined value, it results in an exception. Example:
Prelude> let x = undefined Prelude> let y = x + 1 Prelude> print y *** Exception: Prelude.undefined
Sometimes people use the symbol \bot (read "bottom"; if you are reading this in HTML, "bottom" looks like an upside-down 'T') to refer to Haskell's "undefined
".
Like in math, you can define functions piecewise. This is much like a case
statement. Example:
f 0 = 0 f 1 = 2 f 2 = 5
Note that in that example, the piecewise definition doesn't cover all of the possible cases. If you try to evaluate the function on an undefined case, you'll get an exception:
Prelude> let {f 0 = 0; f 1 = 2; f 2 = 5} Prelude> f 0 0 Prelude> f 1 2 Prelude> f 2 5 Prelude> f 3 *** Exception: <interactive>:1:5-29: Non-exhaustive patterns in function f
In a source code file, the different parts of the function definition don't even have to be next to each other. If you are doing this interactively in ghci
, however, they all have to be in the same let
, because otherwise the latter definitions overwrite the previous ones:
Prelude> let f 0 = 0 Prelude> let f 1 = 2 Prelude> f 1 2 Prelude> f 0 *** Exception: <interactive>:1:4-10: Non-exhaustive patterns in function f Prelude> let {f 0 = 0;f 1 = 2;} Prelude> f 1 2 Prelude> f 0 0
If multiple definitions match the same situation, the first one takes precedence:
Prelude> let {f 0 = 0; f 0 = 1} <interactive>:1:5: Warning: Pattern match(es) are overlapped In the definition of `f': f 0 = ... Prelude> f 0 0
A more interesting example:
Prelude> let {f 0 = 0; f x = 1/(-x)} Prelude> f (-10) 0.1 Prelude> f (-0.000001) 1000000.0 Prelude> f (0.000001) -1000000.0 Prelude> f (10) -0.1 Prelude> f 0 0.0
You can associate arbitrary conditions with each "piece" of a piecewise defined function using "guards". To do this, you write something like
functionName argument | condition1 = (function definition when condition1 is true) | condition2 = (function definition when condition2 is true) ... | otherwise = (function definition when none of the conditions are true)
where the condition1
, condition2
, etc. are boolean expressions. When the function is evaluated, all of the guards are tried in order until one matches.
For example:
abs x | x >= 0 = x | x < 0 = -x
or equivalently:
abs x | x < 0 = -x | otherwise = x
Rather than using fst
and snd
to extract values from pairs, you can use "pattern matching" in the function definition instead:
Prelude> let f (a,b) = 2*a+b Prelude> f (5,7) 17
You can do a similar thing with lists, in place of using head
and tail
:
Prelude> let f (x:xs) = do print x Prelude> let g (x:xs) = do print xs Prelude> f [5,7,3] 5 Prelude> g [5,7,3] [7,3]
In fact, you can do this for any data type for which you have a constructor28; you simply write the constructor in the function arguments section, and when the function is called Haskell will bind the symbols used in the constructor expression to the appropriate values in the data structure.
You can combine this with piecewise function definition as a substitute for if/then/else
expressions on parts of the data structure:
pairIs01Or10 (0,1) = True pairIs01Or10 (1,0) = True pairIs01Or10 (a,b) = False
In ghci
:
Prelude> let pairTest (0,1) = True; pairTest (1,0) = True; pairTest (a,b) = False Prelude> pairTest (0,0) False Prelude> pairTest (0,1) True Prelude> pairTest (1,0) True Prelude> pairTest (1,1) False Prelude> pairTest (1,2) False
However, all of the possible inputs to the piecewise-defined function must share some type (later on we'll talk about how to get around this restriction using "typeclasses"29). The following code is illegal because it implies that function f
should accept both pairs and lists:
f (a,b) = do print "pair" f (x:xs) = do print "list"
Here's the error you get in ghci
:
Prelude> let f (a,b) = do {print "pair"}; f (x:xs) = do {print "list"}; <interactive>:1:36: Couldn't match expected type `(t, t1)' against inferred type `[a]' In the pattern: x : xs In the definition of `f': f (x : xs) = do print "list"
You can also use the wildcard _
in patterns (instead of using "dummy variables" for values that you don't need; this makes the code easier to read):
fstIs10 (10,_) = True fstIs10 _ = False
In ghci
:
Prelude> let fstIs10 (10,_) = True; fstIs10 _ = False Prelude> fstIs10 (10,0) True Prelude> fstIs10 (9,0) False
\newpage
qsort [] = [] qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)
Or, equivalently, using list comprehensions instead of filter
:
qsort [] = [] qsort (x:xs) = qsort [e | e<-xs, e<x] ++ [x] ++ qsort [e | e<-xs, e>=x]
I must say; out of the programming languages that I know so far, Haskell is definitely the best one for expressing quicksort.
With the new syntax that we just covered, we can rewrite the path search algorithm much more concisely:
search edgeList src dst | src == dst = [src] | otherwise = searchEdgesOriginatingAtSrc edgeList where searchEdgesOriginatingAtSrc [] = [] searchEdgesOriginatingAtSrc ((currentEdgeSrc, currentEdgeDst):restOfEdges) | src == currentEdgeSrc = case search edgeList currentEdgeDst dst of [] -> searchEdgesOriginatingAtSrc restOfEdges successList -> src:successList | otherwise = searchEdgesOriginatingAtSrc restOfEdges
\newpage
Sometimes you want to define a once-off function, that is, a function that will only be used once. In this case, for clarity, you may want to just define the function right where it will be used rather than defining it elsewhere in the source code. Also, you might not want to bother to give the function a name.
To do this, you write something like:
(\argument1 argument2 -> (the definition of the function))
For example, recall the addition
example:
addition a b = a+b
The equivalent anonymous function would be written like this:
(\a b -> a + b)
This whole expression can then be used in place of whereever you would have used the word addition
before. For example:
Prelude> (\a b -> a+b) 2 3 5 Prelude> let addition a b = a+b Prelude> addition 2 3 5
Such nameless, once-off functions are called "anonymous" functions.
The example created an anonymous function with 2 arguments, but you can make anonymous function with just 1 argument, or with more than 2 arguments, in the same way.
They are useful as arguments to functions that take other functions as arguments, for example, map
:
Prelude> map (\x -> x + 1) [1,2,3] [2,3,4]
You can use any prefix function as if it were an infix function, and vice versa.
To use a prefix function as an infix function, put backquotes around it. To use an infix function as a prefix function, put parentheses around it. Example:
Prelude> let addition a b = a+b Prelude> addition 2 3 5 Prelude> 2 + 3 5 Prelude> 2 `addition` 3 5 Prelude> (+) 2 3 5
We'll be seeing a lot of this because infix functions have to be surrounded by parentheses if you are giving them as inputs to higher-order functions.
Higher-order functions are functions that operate on other functions. We've already seen some functions that take other functions as input; for instance, map
and filter
. Now I'll give some examples of functions that yield other functions as output.
'.
' is the function composition operator. It takes two functions as input and yields a new function as output. Specifically, the expression f . g
yields a function which first does function g
, and then does function f
on the output of g
. For example, if you have a list of numbers, and you want to first take their absolute value and then compute their square roots, you could either do it like this (without function composition):
Prelude> let li = [4.0,-9.0,-16.0,4.0] Prelude> let li2 = map abs li Prelude> let result = map sqrt li2 Prelude> li2 [4.0,9.0,16.0,4.0] Prelude> result [2.0,3.0,4.0,2.0]
Or you could do it like this (using function composition):
Prelude> let li = [4.0,-9.0,-16.0,4.0] Prelude> let result = map (sqrt . abs) li Prelude> result [2.0,3.0,4.0,2.0]
The two input functions to '.
' must each take a single argument.
flip
take a function as input and returns a new function as output. Specifically, it takes as input some function f
, where f takes two arguments, and it returns a new function f'
with the order of the arguments reversed (that is, (flip f) a b = f' a b = f b a
). For example:
Prelude> (/) 5 10 0.5 Prelude> (/) 10 5 2.0 Prelude> (flip (/)) 5 10 2.0 Prelude> (flip (/)) 10 5 0.5
Note: for this to work, you have to put parentheses around '/
'. (flip /)
doesn't work30.
OK, this may seem a bit weird.
If a function takes a bunch of arguments, you can create a new function by giving values to some of those arguments, and leaving the rest unbound.
For example, addition takes two arguments. If we fix one of those arguments permanently at 1, and leave the argument open, then we have created a function (call it plusOne) that takes one argument, and adds 1 to that argument. Here's how to do it:
Prelude> ((+) 1) 2 3 Prelude> ((+) 1) 5 6 Prelude> let plusOne x = ((+) 1) x Prelude> plusOne 2 3 Prelude> plusOne 5 6
(in other words, the function plusOne
has the same effect as the function (\x -> x + 1)
which we saw in an earlier example)
(note: remember that putting parentheses around an infix function makes it a prefix function. So, (+)
is the prefix version of the +
operator.)
Here's another example. In this example, we take map
and fix its first argument (the mapping function) to plusOne
-- but we leave the second argument open. The result is a specialized version of map
-- a function which takes a list and returns the result of incrementing each element in the list by 1:
Prelude> let plusOne x = ((+) 1) x Prelude> let incrementListElements = map plusOne Prelude> incrementListElements [1,2,3] [2,3,4]
We could have just used ((+) 1)
instead of plusOne
, since those expressions are equivalent (both of them represent a function which takes one argument and returns that argument plus one):
Prelude> let incrementListElements = map ((+) 1) Prelude> incrementListElements [1,2,3] [2,3,4]
You can do this with functions that take more than 2 arguments, too. Here's a function which tests to see if its third argument lies in between its first and second arguments:
Prelude> let isZInBetweenXandY x y z = (x < z) && (z < y) Prelude> isZInBetweenXandY 0 10 5 True Prelude> isZInBetweenXandY 0 10 15 False Prelude> isZInBetweenXandY 0 10 (-1) False
If we fix the first argument to 5, we get a function (call it isZInBetween5andY) which takes two arguments. The first argument of the new function is like the second argument of the original function, and the second argument of the new function is like the third argument of the original function:
Prelude> let isZInBetween5andY = (isZInBetweenXandY 5) Prelude> isZInBetween5andY 10 7 True Prelude> isZInBetween5andY 10 15 False Prelude> isZInBetween5andY 10 4 False
We could also fix both of the first two arguments of isZInBetweenXandY to 5 and 10, respectively, yielding a function waiting for just one more argument (call it isZInBetween5and10):
Prelude> let isZInBetween5and10 = (isZInBetweenXandY 5 10) Prelude> isZInBetween5and10 7 True Prelude> isZInBetween5and10 15 False Prelude> isZInBetween5and10 4 False
We can achieve the same result by starting with isZInBetween5andY and fixing its first argument:
Prelude> let isZInBetween5andY = (isZInBetweenXandY 5) Prelude> let isZInBetween5and10 = (isZInBetween5andY 10) Prelude> isZInBetween5and10 7 True Prelude> isZInBetween5and10 15 False Prelude> isZInBetween5and10 4 False
And there's no need to actually name the partially applied function (we just did that for readability):
Prelude> (isZInBetweenXandY 5 10) 7 True Prelude> (isZInBetweenXandY 5 10) 15 False Prelude> (isZInBetweenXandY 5 10) 4 False
In general, if there is a function f
which takes __n__ arguments, you can give it just the first __i__ arguments, enclose the whole things in parenthesis, and you have a function which is like f
but with the first __i__ arguments fixed, and which is still waiting to accept the other (__n__ - __i__) arguments.
Note that the syntax for normal ("total") function evaluation is just a special case of partial evaluation --- if you "fix" ALL of the arguments of a function, then it becomes a constant function, i.e. a value.
There's some syntactic sugar for partially applying infix functions. Example:
Prelude> (/2) 10 5.0 Prelude> (2/) 10 0.2
Partial function application provides a neat way to think of the anonymous function syntax. If you know about "lambda calculus" from theoretical computer science, that's what we're about to discuss (if you haven't heard of lambda calculus, don't worry about it, we're going to explain the relevant part).
Let's say you define addition as an anonymous function:
(\x y -> x + y)
You can think of what happens when this expression encounters a value in the following way. The x y
on the left (in between the backslash and the arrow) means that this expression "wants to eat" two values. When you feed it a value, it takes the leftmost variable in its "want-to-eat list", removes it from the list, and substitutes the value you fed it for that variable in the expression on the right. If the "want-to-eat list" is empty, then we remove the symbols \ ->
also.
The way that you feed it a value is to put a value on its right. When it eats the value, the value disappears from the right. For instance, if you feed it a 3, here's what happens:
(\x y -> x + y) 3 = (\y -> 3 + y)
Even when we fully apply the function, we can look at it as a succession of partial applications:
(\x y -> x + y) 3 5 = (\y -> 3 + y) 5 = (\ -> 3 + 5) = (3 + 5)
So, you see that the core of applying a function is really the substitution operation, where we substitute the arguments being given to the function for the argument symbols in the function body. Partial application just means that we substitute in for of the function arguments in the function body but also leave some other function arguments the way they were.
In this way we can see how normal function application can always be restated in terms of a succession of partial function applications. Keeping this in mind will make the type notation for functions make more sense.
\newpage
Every expression in Haskell is assigned a type.
As we mentioned before, you can use ghci
's :t
command to find out the type of a given expression. The results is called a "type signature". For example:
Prelude> :t 'c' 'c' :: Char Prelude> :t "a string" "a string" :: [Char]
This means that the type signature of the expression 'c'
is Char
and the type signature of the expression "a string"
is [Char]
.
[Char]
means a list whose elements are of type Char
.
The type signature of tuples is what you'd expect:
Prelude> :t ('a',"hi") ('a',"hi") :: (Char, [Char])
Types can be nested in all sorts of crazy ways:
Prelude> :t [("hello", ["goodbye"])] [("hello", ["goodbye"])] :: [([Char], [[Char]])]
If you ask for the type of a number, you get something unexpected:
Prelude> :t 1 1 :: (Num t) => t
This is our first encounter with a type class. Num
is a type class. A number of types can be grouped together into a "type class" (the word "class" here is being used similar to the word "set" in mathematics). For any given type class, some types are part of the class, and the rest aren't.
The motivation for the invention of type classes is that a type class can be associated with a list of functions that must be implemented for each type in the class. So, type classes are a lot like interfaces in Java. If you know that a given type is part of a given type class, then you know that certain specific functions are defined with respect to data structures of that type. To give a concrete example, any type in the type class Num
must support the addition operator (+
). Num
includes the types Int
(integer) and Float
, among other types. So, if you find yourself with two expressions of type Num
, then you are guaranteed31 to be able to add them together.
Type classes are sort of related to "classes" in object-oriented languages, although not as closely as to interfaces. In object-oriented languages, "objects" are individual instantiations of data structures, and objects are instances of classes. By contrast, the instances of a type class are __types__.
Type classes often appear in type signatures. Here's what they mean. We saw that the type signature of 1
is (Num t) => t
. The t
here is just a variable; it may as well have been a
or x
.
1 :: (Num t) => t
might be read as saying something like, "The type of 1 is some type t, where t belongs to the type class Num
". What that means is that the type of 1
is undetermined except that it is constrained to be some type within the Num
typeclass.
So, when you enter a numerical constant in Haskell, it enters the world not as an Int
or a Float
or anything else; it starts out being of undetermined type. Strange but true. However, it is constrained to be some type within the Num
typeclass, which is why you are allowed to apply operations like addition to it.
Example:
Prelude> :t not not :: Bool -> Bool
We'll go through this piece by piece.
not ::
means that the expression not
is the thing whose type we are examining.Bool -> Bool
means a function which takes something of type Bool
as input, and returns something of type Bool
.Next we'll look at a function that returns an output of a different type than its input:
Prelude> :t length length :: [a] -> Int
This says that length
is a function whose input must be a list of type a
, where a
can be any type, and whose output has type Int
. So, in other words, the input to length
must be a list, but it doesn't matter what type of element is in the list, and the output of length
will have type Int
.
Now we'll look at a function whose type signature involves a type class.
Prelude> :t abs abs :: (Num a) => a -> a
We'll go through this piece by piece.
(Num a) =>
means, "in the following type signature, __a__ must be in the type class Num
".a -> a
means a function which takes something with type a
as input, and returns something of type a
.Putting it together, this says that abs
will accept as input a single argument of any type, provided that that type is in the Num
type class, and that the output of abs
will be of the same type as the input.
Note that a -> a
doesn't mean that abs
returns the same __value__ that it was given! It just means that whatever it returns will have the same __type__ as the thing that it was given.
Now let's look at boolean AND (&&
), which is a function which takes two inputs of type Bool
, and produces one output of type Bool
Prelude> :t (&&) (&&) :: Bool -> Bool -> Bool
Weird, huh? The first thing to say about this is that there are implicit parentheses which are left out of this expression. It is assumed for type signatures that the arrows associate to the right unless there are parentheses that say otherwise. If we write the parentheses explicitly it would be:
(&&) :: Bool -> (Bool -> Bool)
Now, the key to understanding this is to think of partial function application. Imagine that we had the expression:
(&&) True False
As noted before, when a function takes multiple arguments, normal function application can always be thought of as a successive series of partial function applications. So, the above expression is equivalent to:
((&&) True) False
where ((&&) True)
means the function that you get if you start with (&&)
and then fix its first argument to True
, leaving the second argument unbound.
So, think of how (&&)
behaved right there. It started out as a function that took two arguments. But then we fed it an argument, and as a result we got a function that took only one argument.
So, (&&)
can equivalently be thought of as a function that takes __one__ argument as input, and then produces another function as output.
What is the type of the function ((&&) True)
? Well, this function takes one argument of type Bool
, and returns something of type Bool
. So, it must have type Bool -> Bool
. And indeed we see that this is right:
Prelude> :t ((&&) True) ((&&) True) :: Bool -> Bool
So, going back to the original function (&&)
, what type is it (if we consider what happens when we partially apply just a single argument)? Well, it takes as input something of type Bool
, and then it gives us back a new function which has type Bool -> Bool
. So, (&&)
is a function that goes from type Bool
to type Bool -> Bool
. So, its type is therefore:
Bool -> (Bool -> Bool)
And, remembering that the arrows are assumed to associate to the right, we see that this is equivalent to the reported type signature for (&&)
:
Bool -> Bool -> Bool
In general, the procedure for figuring out the type of something like a function is:
A -> X
, where X
is a placeholder for the type of the thing that you get when you apply the first argument.X
.In the previous example, our reasoning followed that procedure:
(&&)
takes is type Bool
(&&)
must be Bool -> X
, where X is the type of an expression that you get from applying an argument to (&&)
(an example of such an expression is ((&&) True)
).((&&) True)
?((&&) True)
takes is type Bool
, and it returns Bool
, so it has type Bool -> Bool
X = Bool -> Bool
into Bool -> X
, we see that the type of (&&)
is Bool -> (Bool -> Bool)
Example:
Prelude> :t map map :: (a -> b) -> [a] -> [b]
Including the implicit parentheses, this type signature could be written:
(a -> b) -> ([a] -> [b])
For example, consider
Prelude> let showInt :: Int -> String; showInt i = show i Prelude> :t showInt showInt :: Int -> String Prelude> map showInt [2,5,7] ["2","5","7"] Prelude> (map showInt) [2,5,7] ["2","5","7"] Prelude> :t (map showInt) (map showInt) :: [Int] -> [String]
We can see that map
takes one argument, which is a function of type a -> b
(in our case a is Int
and b is String
), and yields another function with type [a] -> [b]
(in our case, this other function is (map showInt)
). This explains why map
's type signature is:
(a -> b) -> ([a] -> [b])
In order to explicitly specify the type of some expression, write something like:
expression :: typeName
for example:
1 :: Int
is an expression for 1, with type Int
.
In order to explicitly specify the type of some named function, write something like this:
square :: Int -> Int square x = x*x
When you explicitly specify the type for some expression, it must match (i.e. be more specific than) the type that Haskell would infer for the same expression.
Infix operators must be surrounded by parentheses (i.e. converted to prefix form) in type declarations.
It is recommended32 that you explicitly specify the types of the outermost functions in your programs.
The type
command lets you define syntactic sugar for oft-used types. For example, if you were writing a program in which you often passed around lists of 3D coordinates, where each coordinate was a Double
, then you might get tired of typing [(Double,Double,Double)]
. So you could do:
type List3D = [(Double,Double,Double)]
An oft-used type synonym is:
type String = [Char]
\newpage
When you take the type of a function which does input or output, it yields something of type IO something
, where "something" is the type that you would "expect" that function to yield. Examples:
Prelude> :t putStrLn putStrLn :: String -> IO () Prelude> :t getLine getLine :: IO String
You'd expect that putStrLn
wouldn't return anything (in some languages you might call this a return type of 'void', but in Haskell we'd say it's '()
)'). You'd expect that getLine
would return a string.
Even some functions which don't seem to actually do IO have this in their type:
Prelude> :m Random Prelude Random> :t randomRIO randomRIO :: (Random a) => (a, a) -> IO a
What's happening here? Isn't the result of getLine
a string? Isn't the result of randomRIO
a number?
No, it isn't. Note that whenever we actually used these functions in examples, we used them within do
blocks, and we used the <-
operator to assign their result. For example:
Prelude> do {putStrLn "Enter a name"; name <- getLine; putStrLn ("Hi " ++ name)} Enter a name Bayle Hi Bayle
You can think of the <-
operator as removing the "IO" from the type of a value. Whenever you want to use a function with an "IO" in its return type, you'll need to use the <-
operator if you want to get at the "actual" return value.
The <-
operator only works within do
blocks. How do you pass a value out of the do
block? That is, how do you send a value from somewhere inside of a do
block to somewhere outside of the do
block, so that that value can be used somewhere else?
Well, you can send values out of do
blocks containing I/O, but those values are required to be one of those IO something
types! You're not allowed to smuggle any "naked" values out of an I/O-containing do
block.
A do
block returns the value of the expression on its last line33. If the last expression itself yields an IO something
type, all is well:
Prelude> :t do {getLine} do {getLine} :: IO String
If you try to return something of a "naked type" out of a do
block containing I/O, however, there's a type error34:
Prelude> :t do {putStrLn "hi"; True} <interactive>:1:19: Couldn't match expected type `IO t' against inferred type `Bool' In the expression: True
Incidentally, it is illegal for the last line to be a <-
expression; these things are so weird that they're technically not even considered to be expressions:
Prelude> :t do {name <- getLine} <interactive>:1:4: The last statement in a 'do' construct must be an expression
If you want to return something from a do
block containing I/O, and you are willing to submit to the indignity of changing its type from something
to IO something
, you can use the return
construct:
Prelude> :t do {putStrLn "hi"; return True} do {putStrLn "hi"; return True} :: IO Bool
return
is kinda the reverse of <-
; it changes a value of type something
into a value of type IO something
.
Unlike in some other programming languages, return
does not mean anything like "exit this function and return this value".
No. The reason is that Haskell wants to clearly segregate all computation which is causally linked to the outside world35. This is related to monads.
We'll explain why soon, but in order to have the background to do that we need to explain "referential transparency".
\newpage
A referentially transparent expression is one that can be replaced with its value without changing the behavior of the program. So, referential transparency means that, if you compute the function f(x) and find that f(x) = y, then every time you encounter f(x) in the future, you can just replace it with y.
This may sound like it is always the case, but here's some examples where it's not.
Consider the Python program:
a=1; c=a*5; a=5; d=a*5;
In Python, this results in c=5 and d=25. But if we assumed that we had referential transparency, then as soon as the interpreter saw "a=1", it would be allowed to substitute "1" for every occurence of the expression "a" from then on. And then at the end we'd have d=1*5
= 5.
So, in this case, if (after the first assignment to a) we had replaced every instance of the expression "a" with its value, then we would have changed the behavior of the program.
In general, if you ever find an expression that can return different things at different times, then that expression is not referentially transparent (in this case, "a" was shown to be such an expression).
Consider this Perl program:
srand(0); print "\n"; print rand(); print "\n"; print rand();
The function rand()
, which calls a pseudorandom number generator, returns a different result upon each call, even though it was passed the same arguments each time. This is because the result of the rand()
function doesn't just depend on the arguments that you pass it; it also depends on some __hidden state__ which is altered upon each call.
In Python, the input()
function reads input from the user. If the interpreter evaluated input()
once, and then after that, whenever it saw input()
, it just substitued the return value it got the first time without actually calling input()
again, that would change the program's behavior.
In Python, the print
command doesn't return anything. So, it doesn't return different things at different times. Still, it is not referentially transparent. If the interpreter just replaced every print
command with nothing, that would change the program's behavior.
In the last example, referential transparency failed not because the return value of an expression was not constant, but because print
has a side-effect. Side-effects are when the evaluation of a function has a "side-effect" beyond the computation of the return value. For instance, in Perl, consider the statement:
$a = (print "hi");
This statement does two kinds of things. First, it calls the print
function, which computes a return value, which gets put into variable a
. But the call to print
also has a __side-effect__: the argument is printed to the console.
The essence of referential transparency is that functions don't care about anything but their arguments and their return value. The point is that referentially transparent expressions and functions are just normal expressions and functions like you find in mathematics; they are fixed mappings from inputs to outputs. None of this funny business that you see in computer programming like x=x+1
or expressions like print
that __do__ things when you evaluate them; if they're referentially transparent, they're just mathematical expressions.
\newpage
Haskell tries to keep referentially transparent expressions segregated from other code. In this way, it's clear which expressions can be relied upon to be referentially transparent, and which cannot. In Haskell, we call referentially transparent functions "pure functions", and we call other functions "actions".
This segregation is achieved by marking actions as type IO something
.
In Haskell, just as pure functions are first-class objects, so are actions. For example, the type of putStrLn
is String -> IO ()
. This means that putStrLn
takes a String
as input and returns something of type IO ()
as output. IO ()
is an example of an "IO action". This should be thought of as a representation of the action of printing out that string to the console.
This action can be combined with other actions to make a sequence of actions. For instance, you might combine putStrLn "Line 1"
with putStrLn "Line 2"
to make a composite action that prints out two lines to the console. The way that you combine these two actions into a third action in Haskell is:
myCompositeAction = do putStrLn "Line 1" putStrLn "Line 2"
In this example, the type of myCompositeAction
is also IO ()
. So, the function myCompositeAction
doesn't "do" anything; it merely returns an IO action (which, in this case, happens to be a combination of two smaller actions).
How does anything get done, then? The function main
is special. main
returns an IO action, and then Haskell actually executes that IO action.
So, you should think of all of your functions which use IO as pure functions which operate on actions. Their job is to prepare the final IO action that will be returned by main
and then36 executed by Haskell.
IO actions aren't just predetermined sequences of input and output. Later subactions can conditionally depend on earlier subactions. For example, recall the doGuessing
subroutine in our random number guessing program:
doGuessing num = do putStrLn "Enter your guess:" guess <- getLine let guessNum = read guess if guessNum < num then do putStrLn "Too low!" doGuessing num else if read guess > num then do putStrLn "Too high!" doGuessing num else do putStrLn "You Win!"
The type of doGuessing
is (Ord a, Read a) => a -> IO ()
. This means essentially that this is a subroutine that takes a single input (a number37), and produces an IO action as an output. So, IO actions can be entire programs.
And main
returns an IO action, which is then run. In other words, your program, considered as a whole, IS an IO action.
We aren't going to explain yet what monads are, but here we will explain what monads are used for in the context of IO38:
I/O operations aren't referentially transparent because they interact with the outside world, yet this interaction isn't reflected in their formal inputs and outputs. For instance, a putStrLn
statement doesn't return anything or change any variables, yet it changes the outside world. Similarly, a getLine
statement both changes and is changed by the outside world, yet the outside world isn't reflected in its input arguments. So we could substantially improve the situation if we could explicitly pass around a symbol that represents the outside world. Maybe the signatures of putStrLn
and getLine
should be the following:
putStrLn :: OutsideWorld -> String -> OutsideWorld getLine :: OutsideWorld -> (OutsideWorld, String)
Instead of writing:
putStrLn "What's your name?" name <- getLine putStrLn ("Hello, " ++ name)
we'd write:
main outside = let outside2 = putStrLn outside "What's your name?" (outside3, name) = getLine outside2 in putStrLn outside3 ("Hello, " ++ name)
This correctly models the dependencies of the computation on the various interactions with the outside world. However, it is hard to read. Also, what should the computer do if you give it this?:
main outside = let outside2 = putStrLn outside "What's your name?" (outside3, name) = getLine outside2 in putStrLn outside2 ("Hello, " ++ name)
In that example, we passed outside2
in the last call to putStrLn
; but we should have passed outside3
.
Haskell address these problems by expressing program I/O using a weird formalism that allows them to essentially follow the idea embodied in these examples (they pass along a symbol representing the outside world) without the programmer having to write it that way. This solution is called monads. Monads can be implemented within Haskell with no special handling from the compiler39 (although it's nice if the compiler optimizes them).
Monads can also be used for other things besides I/O.
The do
block notation is syntactic sugar for monads. do
notation allows us to write code that looks like a sequence of imperative I/O commands, and then internally transforms it into a set of interdependent expressions which represents dependencies on the outside world by passing along an I/O symbol40.
So, in summary: programs that communicate with the outside world are first-class objects called "IO actions". When you write a function that does I/O, its return type will be an IO action. do
blocks are the syntax that Haskell uses for composing actions together into larger actions. The return type of a do
block is an action. Under the hood, do
blocks are syntactic sugar for a construct called "monads", which we haven't explained yet. But, in the case of IO actions, monads can be thought of as a pattern for writing code so that, technically, a symbol representing the "outside world" is passed to and from our IO functions, without cluttering up our function calls by actually writing that symbol.
\newpage
Haskell is a functional programming language.
When I heard that, I assumed that that was because Haskell lets functions be "first-class objects", and because Haskell provides you with cool higher-order operators to manipulate functions with. Both of these things are true, and I think both tend to go together with functional languages, but they don't get to the heart of the meaning of the word 'functional'.
Functional languages are often contrasted with imperative languages. Imperative languages tell the computer what to do. They tend to be step-by-step lists of instructions. To give an example in Python,
def factorial(k): x = 1 i = 1 while (i <= k): x = x*i i = i+1 return x print(factorial(5))
The function factorial
says something like:
__Set the contents of memory location "x" to 1. Then set the contents of memory location "i" to 1. Then, as long as the contents of memory location "i" is less than or equal to "k", do the following: multiply the contents of memory location "x" by the contents of memory location "i", then increment the contents of memory location "i".__
}}}By contrast, in Haskell, the focus is not on telling the computer a sequence of instructions, but rather on defining the functions that you want computed. Functions are defined much as they are in mathematics. Here's how you might define a function <code>factorial</code> in Haskell: {{{ factorial 0 = 1 factorial n = n*(factorial (n-1))
This sort of thing is Haskell's focus, and that is the reason that Haskell is called "functional". In Haskell, running a program corresponds to evaluating a function.
But Haskell has an imperative side too. It has to; a mathematical function just defines a mapping from inputs to outputs. You can define all the functions you want, and the computer might then be able to compute the output of any of those functions for any particular input, but if that was all you did, the computer would just sit there, because you haven't told it to __actually__ evaluate any particular function on any particular input (like a C program with lots of function definitions but an empty main()
). Also, mapping inputs to outputs doesn't provide a notion of __actions__ (example: the action of printing something to the screen).
So, just like "imperative" languages, Haskell does provide a facility for giving the computer a sequence of instructions. But the code that does this is clearly separated from the other code41. The imperative commands are found only within do
blocks42. Here's a program that is equivalent to the Python program above:
factorial 0 = 1 factorial n = n*(factorial (n-1)) main = do print(factorial 5)
Lazy means that expressions don't get evaluated until (and __unless__) they have to be. Contrast with C, a "strict" language; in C, each expression is evaluated as soon as it is encountered.
One neat thing about a lazy language is that you can create expressions which would take up an infinite amount of space or time to completely evaluate. For instance,
Prelude> do {print (take 5 [1..])} [1,2,3,4,5]
is Haskell for "print out the list consisting of the first five elements of the list [1..]", where [1..] is an infinite list consisting of all of the integers. It runs fine, because Haskell never actually tries to construct the infinite list.
When you define an expression in Haskell, Haskell doesn't actually try to evaluate it; it just remembers, "ok, if I ever need the definition of that expression, here's where to find it". Only when the expression is needed for I/O (or is needed in order to evaluate some other expression that is needed for I/O) does it get evaluated.
Laziness allows you to define infinite data structures, which are often cleaner to define and work with (see section Circular function definitions for infinite data structures, below). L
aziness also allows you to do things like make a lazy database query that, if not lazy, would return a result too large to fit into memory; and then to retrieve the results one-by-one (or in small sets) (note: Haskell's laziness won't automatically make an eager database query into a lazy one, your database framework would have to do that; but this is a good example of a situation in which laziness can be helpful).
In some cases, laziness can make it easier to reuse modular code; for example, if you are doing a search through state space, you can construct the state space lazily and pass it to some other routine, and the other routine can decide to do breadth-first search or depth-first search or what have you. This is more reusable because the construction of the state space is completely modularized from its exploration. This separation also tends to make the code clearer.
See this essay for more on the value of laziness: http://www.cs.chalmers.se/~rjmh/Papers/whyfp.html
Laziness has a price. When you pass around a variable with a lazy value that has not been fully evaluated, it has to keep track of the state of the partial evaluation, which can incur considerable memory overhead. In some cases the programmer must be aware of this sort of thing and plan carefully in order to avoid out of memory errors (see http://www.haskell.org/haskellwiki/Foldr_Foldl_Foldl%27 for an example). There is also some extra CPU processing needed to keep track of this stuff.
Pure means that code in Haskell __(except within "do" blocks43!)__ satisfies __referential transparency__. We'll define44 that in the next section.
Static typing means that type-checking is computed at compile-time rather than at run-time.
Strong typing is used in different ways by different people, but what I mean here is that you can't do something like a typecast in C; that is, you can't tell Haskell to let you get away with type violations.
Since I haven't actually programmed in Haskell much yet (todo), all I can do is repeat what I've heard from others.