Introduction
When we program in strongly typed languages, we often find
it is necessary to work with data types that capture the semantics of our
domain, e.g., when writing accounting software, we could model solutions in
terms of debits, credits, etc. Using domain semantics helps us to keep our solutions
transparent, simpler to understand, and more natural to work with. As a multi-paradigm
language, F# supports structured types, e.g., discriminated unions, as well
sophisticated objects that support full OO functionality. In this chapter, we
study F#’s built-in structured types and learn how to define our own. To begin,
let’s look at how we can access existing types via type abbreviations,
Type Abbreviations
The simplest way to get started with types is to discuss type
abbreviations, which are alternate names or aliases for existing types. We use type
abbreviations to give existing types meaningful names, in order to make code clearer
and easier to read. We also use them to create short names for types that are otherwise
unwieldy to work with. Additionally, we use type abbreviations to make it
easier to replace a type without changing all the code that uses that type. In
the following code snippet, we define several type abbreviations:
type
arrIndex = int
type
temperature = double
type
point = float * float
type student = string * float * bool
These type abbreviations are fundamentally aliases, e.g., arrIndex
for int
and point
for float
* float. Since the type names help indicate what the types are
used for, you can see how type abbreviations can help clarify intent.
Once defined, you can use abbreviated types anywhere you’d
use the orignal type. For example, given the type abbreviations defined above,
we can write the following;
let
student joe = ("joe", 3.2, true) or
let joe: student = ("joe",
3.2, true)
By convention, type abbreviations are lowercase, but this is
not a technical requirement or constraint.
Back in Chapter 9, we discussed generic functions, and how
F# enabled us to specify generic function arguments. The idea of generic
arguments extends to type abbreviations. For example, let suppose we want to
create a new abbreviated type based on .NET’s built-in generic collection class
System.Collection.Generic.Dictionary.
We can do this by creating a type abbreviation that accepts a generic argument.
The generic argument appears in angle brackets as part of the abbreviation’s
definition. In the following code snippet we define a type that represents an
associative array that uses string-based keys to hold generic values:
type AssocArray<'a> =
System.Collections.Generic.Dictionary<string, 'a>
Now, when we create an AssocArray, F# will
ensure that we supply a type parameter, which will be used to instantiate a new
System.Collections.Generic.Dictionary
that uses stores whatever type we specify, and uses strings as its lookup keys,
e.g.:
let
mylist = new AssocArray<int>()
Using type abbreviations, we can alias any type, including
arrays, function types, tuples, etc. The following example illustrates a type
abbreviation that aliases functional types:
type Transformer<'a, 'b, 'c> = ('a -> 'b) -> ('b -> 'c)
Type abbreviations are nothing more than aliases. During
F#’s compilation process, they are expanded into the standard type format
shared by all .NET languages.
Enumerations
While type abbreviations enable us to alias existing types,
enumerations give us a way to create our own types. Enumerations have a defined
set of named values. You can use them in place of literals to make code more
readable and maintainable. They take the following general form:
type enum-name =
| value1 = literal1
| value2 = literal2
| ...
| valueN = literalN
The literals can have the type sbyte, byte,
int16,
uint16,
int32
(the default), uint32, int64, uint64
and char.
Enumerations in F# are mapped to the.NET Enum
class, which is a value type, meaning they are allocated on the stack and use
pass-by-value semantics. In the following example, we define a simple
enumerated type to represent a Web protocol:
type
Protocol =
| FTP = 0
| HTTP = 1
Those coming from the C-family of languages beware. If
you omit the literals, you no longer have an enumeration, but a discriminated
union (discussed later in this chapter).
You can also compact the representation of enumerations by
placing them on the same line:
type
Color =
| Red = 0 | Green = 1 | Blue = 2
It is simple to convert an enumeration to its underlying,
corresponding value by using the appropriate casting operator:
type
Color =
| Red = 0 | Green = 1 | Blue = 2
let c =
Color.Blue
let n = int c // cast c to
corresonding int value
Whenever you access an enumerated vaue, F# requires that you
use the enumerator’s full name, e.g., you must use Color.Red,
not just Red by itself.
To change the default type of an enumeration from int32
(the default) to something else, you need to specify the type of each
element, as shown here:
type
Days =
| Sunday = 0x1uy
| Monday = 0x2uy
| Tuesday = 0x4uy
| Wednesday = 0x8uy
| Thursday = 0x10uy
| Friday = 0x20uy
| Saturday = 0x40uy
This example uses unsigned bytes (uy)
to represent the various days of the week.
You can use the enum function in the F#
library to generate an enumeration value, even a value other than one of the
predefined, named values. For example, given this Color enum:
type
Color =
| Red = 0 | Green = 1 | Blue = 2
You use the enum function to
programmatically create an additional entry:
let purple = enum<Color>(3)
We can also use enumerations in conjunction with pattern
matching, as demonstrated in the following example:
let
isPrimary color =
match color with
| Color.Red | Color.Green | Color.Blue-> true
| _ -> false
Records
The next rung in the type sophistication ladder of is that
of records. A record represents a simple aggregate of named values that enable
us to build data structures that contain named fields. The general form of a
record type is:
type [accessibility-modifier] typename = {
[ mutable ] label1 : type1;
[ mutable ] label2 : type2;
...
}
In the following example, we define a record that contains
information about a movie:
type
Movie = {
Title: string;
ReleaseYear: int;
}
Each record element consists of a named field and data type,
separated by a colon, e.g., Title: string. Each record element is
separated by a semicolon.
Once you’ve defined a record type, you can construct
instances of the record using using the let keyword, as shown
here:
let
funnyMovie =
{ Title="Pineapple Express";
ReleaseYear=2008 }
When you create an instance of a record, you need to
initialize every field. Forgetting to initialize a field results in a compiler
error.
Constructing a record instance looks similar to defining a
record. It helps to keep in mind that when you define a record type, you use
the keyword type and delineate the record’s fields and data
types. In contrast, when constructuing a record instance, you use the keyword let
and assign values (via =) to each of the record’s fields. You use the
brackets { } in both cases.
Record fields are not limited to simple types. You can use
lists, arrays, etc. as part of a record’s definition. To demonstrate this,
let’s define a new Movie record that includes the list of star
actors:
type
Movie = {
Title : string;
ReleaseYear: int;
Actors : string[];
}
let
funnyMovie = {
Title="Pineapple Express";
ReleaseYear=2008;
Actors=[|"Seth Rogen"; "James Franco"|]
}
By default, like all values in F#, record elements are
read-only. If you wish to make record elements writeable, use the mutable
keyword, as shown here:
type
Employee = {
Name : string // semicolon optional
when elements
mutable Salary : float // on a separate line
}
let joe
= { Name="Joe Smith"; Salary =
50000.0 }
joe.Salary <- joe.Salary * 1.1 // Give
Joe 10% raise
Note that if an element is inherently mutable,
as is the Actors field defined in the Movie type above (by
virtue of the fact that it is an array, and arrays are inherently mutable), you
can change its contents without using the mutable
keyword. What you cannot do is change the referenced element. In other
words, given the Movie type as defined, you can update or change
the contents of the existing Actors array, but you
cannot replace the array with a new array. In code:
funnyMovie.Actors.[0]
<- "Ed Helms" // works
funnyMovie.Actors
<- [| "Zach Galifianakis" |] // error, Actors is not mutable
If you define two or more record types and find that these
types contain identical element names, you will need to disambiguate them
during instance construction. For example, given an Employee
class and an Actor class, both with elements Name
and Salary,
you need to tell F# which record you’re trying to construct; otherwise, F#
will choose the most recently defined one. You disambiguate using the record
type name followed by a dot ( . ). Let’s look at an example:
type
Employee = {
Name : string;
mutable Salary : float;
}
type
Actor = {
Name : string;
mutable Salary : float;
}
let x =
{ Name="Joe Smith"; Salary=50000.0
} // Actor, since most recently defined
let y = { Employee.Name = "Mary
Brown"; Salary=60000.0 } // Employee,
explicit
Copying Records
In addition to the mechanisms we’ve discussed so far, we can
also construct records via cloning. For example, given a record of type Employee,
we can easily create an Employee record and copy it using the with
keyword. In the process of making the copy, we can replace whatever element
values we’d like, so that the newly constructed copy assumes the new values:
type
Employee = {
Name : string;
Salary : float;
HireYear : int;
}
let joe
= { Name="Joe Smith";
Salary=65000.0; HireYear=2001 }
let mary
= { joe with Name="Mary
Jones"; Salary=75000.0 }
// copy joe replacing name and salary
Augmenting Records with Operators
Although we’ve already introduced operator overloading, we
did so only from the perspective of introducing them in the global namespace.
Now that we’ve discussed record types, we’re in a position to revisit operator
overloading in a new context.
To overload an operator to be used with specific record
types, we use the following syntax:
static member (operator-symbols) (parameter-list) =
method-body
Here, we see a new keyword static, which means
that the member becomes part of the record type itself vs. instances of the type.
We’ll discuss static in detail when we cover classes. The member
keyword is also new, and tells us that the operator is a “member of”/”part of”
the public interface of the record type. The rest of the operator definition
remains unchanged vis-à-vis a standard operator as introduced previously.
In the following example, we define a record type called PolarCoord
that represents a point in polar coordinates. As part of the
record type, we define an operator + that enables us to
easily add PolarCoords together.
Since we’re defining the operator + to be applicable only
to PolarCoords,
we do not risk muddying the global namespace, nor do we overload the standard
plus (+) operator.
open
System
type
PolarCoord =
{
r : double
angle : double
}
static member (
+ ) (c1 : PolarCoord, c2 : PolarCoord) =
// Need to convert to rectangular coords, add, then
convert back
let (x1, y1) = (c1.r * Math.Cos(c1.angle), c1.r
* Math.Sin(c1.angle))
let (x2, y2) = (c2.r * Math.Cos(c2.angle), c2.r
* Math.Sin(c2.angle))
let (x3, y3) = (x1 + x2, y1 + y2)
let r3 = Math.Sqrt(x3 * x3 + y3 * y3);
// We need to define "r3" here, since the
"r" defined in
// the record (below) is not visible to the rest of
the
// record, and therefore, cannot be used in the match
expression.
// Go figure.
// Build the resultant PolarCoord
{
r = r3
angle =
match (x3, y3) with
| (0.0, 0.0) -> 0.0
| (x3, _) when x3 >= 0.0 -> Math.Asin(y3 / r3)
| (x3, _) when x3 < 0.0 -> -Math.Asin(y3 / r3) + Math.PI
}
let p1 =
{ r = 1.0; angle = 10.0 }
let p2 =
{ r = 5.0; angle = 72.5 }
let p3 = p1 + p2
When overloading operators in record types, whitespace is
critically important. The braces that define the record must be aligned, and
the static
keyword must vertically align with the closing brace.
Discriminated Unions
The next type in our F# bag of tricks is the discriminated
union, an important and often-used construct in F# programs. A
discriminated union (DU) is a data type with a finite number of distinct,
alternative representations. I think that it’s helpful to think of a DU as a polymorphic
type akin to a union in traditional C.
In functional programming, discriminated unions often are
used to replace single-inheritance hierarchies. For example, let’s assume that
we want to represent a set of simple operating system constructs such as files,
processes, and handles. To model this type of system in OO fashion, we’d most
likely create an abstract class OSObject and from this derive
subclasses such as File, Process, and Handle. We can do something very similar
with DUs, as shown in the following example:
type
OSObject =
| File of string // name
| Process of string * int // name and priority
Here, we are declaring a DU called OSObject
that contains two discriminators: File of type string,
and Process
of type string * int, which is a tuple. As you can see, each
discriminator can be of a different type. Note that each discriminator must
start with an uppercase character. This is somewhat odd due to the fact that F#
does not normally restrict case outside of its built-in keywords.
To use the OSObject DU, you create
instances of the discriminators directly:
let f = File("fsharp.txt")
let p = Process("iexplore.exe",
1)
Here we’re creating an OSObject of type File.
Since File was specified to be of type string,
F# generated a compliant constructor – and we’re invoking it. We also create a Process,
that accepts 2 arguments of type string and int
respectively.
In addition to creating discriminators of explicit type, you
can create discriminators that do not specify types explicitly. In the following
code, we add Unknown to our OSObject DU:
type
OSObject =
| File of string // name
| Process of string * int // tuple (name, priority)
| Unknown // unknown
type
When F# encounters Unknown, it generates a simple class that contains
a default constructor, enabling us to create instances, as shown here;
let u = Unknown
DUs enable us to treat the discriminators polymorphically.
For example, if we wanted to create a factory method for creating new OS
objects (OSObjects), we could do so as follows:
let
makeOSObject objtype =
if objtype = "file"
then File("")
else if objtype
= "process" then Process("", 0)
else Unknown()
Because is it an expression, the makeOSObject
function must resolve to a single return type. Since all of the potential return
types fall under the OSObject umbrella (they are polymorphic with
repsect to one another) F# considers a valid return type any OSObject.
This enables us to return a File, a Process or an Unknown.
To further flesh out DUs, let’s look at some examples that
could form the basis of a fantasy role-playing game:
type
FantasyCharacter =
| Warrior | Wizard | Thief
type
OffensiveWeapon = // ints represent offensive
strength
| Sword of int | Mace of
int | MagicStaff of int
type
DefensiveWeapon = // floats represet defensive
strength
| Shield of float | ChainMail of float
Once we’ve defined DUs, we can use them stand-alone or as components
of other types. For example, we can define a Player type that consists
of a string, a FantasyCharacter and an
OffensiveWeapon:
type
Player = {
Name: string;
Character: FantasyCharacter;
Offense: OffensiveWeapon;
}
let conan = { Name="Conan"; Character=Warrior; Offense=Sword(100)
}
We can even combine discriminators in tuples, arrays, etc.
Here is another example where we create a tuple our of two discriminators:
type
Player = {
Name: string; //
string
Character: FantasyCharacter; //
discriminator
Weapons: OffensiveWeapon * DefensiveWeapon // tuple
of discriminators
}
let conan = { Name="Conan";
Character=Warrior; Weapons=(Sword, Shield) }
Let’s look at another example, this time switching gears
from fantasy games to books. This example throws in type abbreviations for good
measure:
type
Title = string
type
PubYear = int
type
Author = string
type
Book =
| NonFiction of Title * Author * PubYear
| Fiction of Title
let
bizbook = NonFiction("Good to Great",
"Jim Collins", 2001)
let goodread = Fiction "Shogun"
Note here that we’ve abbreviated the types Title,
PubYear
and Author,
and defined the NonFiction discriminator as a tuple of them all.
As with other types, you can construct arrays and lists of discriminators, as
shown here:
let
bizlibrary = [
NonFiction("Tipping Point", "Malcom Gladwell", 2000);
NonFiction("Made to Stick", "Chip & Dan Heath", 2007);
Fiction("The Lost Symbol")
]
Recursive Definitions
A powerful feature of DUs is that they can include recursive
references. This enables us to define data structures in a recursive manner. The
classic example of a recursive data structure is a tree, and it turns out that
in F# we often use DUs to define them. In the following example, we use a DU to
define a recursive tree structure whose leaf nodes can contain any data type
(as defined by a generic parameter 'a):
type
Tree<'a> =
| Leaf of 'a
| Node of Tree<'a> *
Tree<'a>
Given the Tree structure above, we
can build a tree as follows:
let t = Node(Node(Leaf(100), Leaf(200)), Leaf(50))
You can also write a function
that walks the tree and prints out its values in an easy-to-read style:
let
walkTree tree =
let rec loop
depth tree =
let spacer = new
string(' ', depth)
match tree with
| Leaf(value) ->
printfn "%s |- %A" spacer value
| Node(tree1, tree2) ->
printfn "%s |" spacer
loop (depth + 1) tree1
loop (depth + 1) tree2
loop 0 tree
Here, again, I’d like to point out the common pattern of
using recursion and pattern matching to define concise, elegant processing
constructs.
A side note on the example: Although we have not covered it
yet, the match statement, among its other functions, can be used to
distinguish among a given set of discriminators. It’s sort of like a an if…else
chain or switch statement in C. We will cover match
fully in Chapter 12.
Options
F# ships with some built-in discriminated unions that prove
quite useful, one of which is the option type. The option
type enables our expressions to evaluate to “something” or “nothing” using the
discriminators Some and None respectively. In
other words, option is used when a value may or may not
exist. Here is its definition:
type 'a
option =
| None
| Some of 'a
When using the option type, our
expressions use Some to tell the caller that “the result is
valid” and None to tell the caller that “I cannot give you
an answer.” Although we could solve this problem in different ways,
using options is a well accepted F# idiom.
Let’s look at a few examples to help clarify how we use
options. In the first example, we will look at a case of classic division,
where we divide a numerator by some denominator:
let classicDiv a b = a / b
This is a perfectly valid function, and works properly
except when b = 0. If we call this function with b = 0,
F# will throw a DivideByZero exception. The caller can
certainly catch and deal with the exception, or we can use the option
type to assist the caller in understanding the nature of the return value. Here
is a safe version of classic division:
let
safeDiv a b =
if b <> 0 then
Some(a / b)
else None
Options are used frequently to check database query results,
report errors from object contruction, and in other instances where a value
might be legitimately null.
Until you learn to use pattern matching (in the next
chapter) to check the “Some-ness” or “None-ness” of a return
value, you can use the Option.isNone and Option.isSome
static
functions in an if expresson, as shown below:
let safeDiv
a b = if b <> 0
then Some(a/b) else None
let f =
safeDiv 9 3
if
Option.isNone(f) then printfn "You can't divide by 0!"
else printfn "The answer is
%d" f.Value
Notice here that you can pull the intrinsic value from the
option type via the Value function. You can accomplish the same
thing using Option.get.
Reference (Ref) Cells
Even though F# discourages side effects, there are times,
e.g., when interfacing with other .NET languages, that we need to change the
state or values, data structures, etc. We can use the mutable
keyword, as you’ve seen; however, mutable values have some limitations,
including that you cannot use them outside the scope in which they are declared
– for example, we cannot use a mutable value declared
externally inside a function. The following code snippet, for example, will not
compile.
let
counter() =
let mutable
count = 0
(fun () ->
count <- count + 1; count)
This example is trying to define a function counter
that returns a function (a closure) with a side effect. The side effect is
incrementing the mutable value count – but count
would then be used outside the scope in which it is declared whenever the
function returned from counter is invoked, and this is illegal. The specific
feedback we get from the F# compiler is:
The mutable variable 'count' is used in an invalid way.
Mutable variables may not be captured by closures. Consider eliminating this
use of mutation or using a heap-allocated mutable reference cell via 'ref' and
'!'
To make it simpler and more flexible to deal with mutable
state, F# offers the Ref type. The Ref type is a record
that wraps mutable contents of arbitrary type. A Ref type is defined as
follows:
type Ref<'a> =
{ mutable contents: 'a }
Values of type Ref are called reference
cells. Unlike mutable values, reference cells overcome the
limitation of being accessible outside their declaring scope. In addition, F#
provides helper functions for working with them. To create reference cells, we
use the ref keyword, followed by a reference expression, as
shown in the following example:
let
daughter = ref "kimberly"
Here, daughter represents a reference cell –
it “wraps” or “points to” the string "kimberly". To
change the value referenced by the reference cell, daughter, we use the reference
assignment operator :=
daughter
:= "chee"
Now the value of daughter is "chee”, and the
former value, "kimberly", is forgotten.
We access a reference cell’s value using the cell dereference
operator (!):
printfn "kim's nickname is %s"
!daughter
All in all, a ref type uses pointer-like semantics. It
refers to a value and must be dereferenced when used. The value returned by the
dereference operator (!) is read-only. To assign a value to a
reference cell, use the Value function and the destructive assignment
operator (<-), as shown below.
let pi =
ref 3.14
pi.Value
<- 3.1415926
Somewhat surprisingly, if we create a reference cell that
refers to another variable, changes that we make to the reference cell via :=
do not affect the referenced variable. In the following example, we can see
that y references x. When we increment y,
x
remains unchanged.
let x =
123 // x = 123
let y =
ref x // use x as initial value for y
y
:= !y + 1 // see note below
printfn "x=%d, y=%d" x
!y // print results (x=123, y=124)
The line y := !y + 1 is interesting. What’s
happening here is we’re using the old referenced value for y
to create a new, incremented value. y then references this
new value (124). The old value is forgotten by y,
but is not changed (x still = 123). The analog here
is pointers, e.g., from C or C++: the pointer is changed, not the thing the
pointer used to point to.
As you can see from the example above, the := operator is a
shorthand for assigning to contents, and !
is a shorthand for reading its value. The fact the := does not change the
references value is somewhat counterintuitive, since reference cells behave
very much like pointers otherwise.
The net net of all this? Use mutable when you can,
and reference cells otherwise. You can use mutable so long as the
mutable variable is in local scope.
Defining Generic Types
In an earlier chapter, we discussed generic functions, the
mechanics of which are largely handled by F#’s sophisticated type inference
system and automatic generalization facilities. We also discussed that type
abbreviations can contain generic designations. To finish the swing on
generics, I want to mention that F# supports our defining our own generic
types, just like the generic types that ship with F# and those in the .NET Base
Class Library (BCL), e.g., List.
Let’s suppose, for example, that I want to create a new
type, a Stack that can contain any type of object. While there are
several ways to skin that cat, one of the prevalent F# idioms is to define a
type-parameterized DU. To follow that idiom, we would define our type as a
generic discriminated union, as shown:
type 'a Stack =
| EmptyStack
| StackNode of 'a * 'a Stack
This definition creates a generic data structure. The
leading apostrophe in 'a is important. It’s a required literal part of the
syntax. Simply using the letter a without a leading
apostrophe will not work. The 'a preceding Stack tells us (and, as
importantly, the compiler) that this type is generic, which means the type can
be deduced when the DU constructor is invoked. When we supply a data element to
the Stack,
F# infers the particular type of the Stack and binds it to 'a
. Note the 'a is a placeholder for a concrete type. It is replaced by the type
supplied by the constructor’s caller. We now have a Stack containing one
element, and the Stack expects all future elements to be of the
same type. Now that we’ve created an instance of a Stack, one with a real type,
we have created a constructed type. The following code snippet shows 3
constructed types based on the generic Stack type:
let
iStack = StackNode(3, StackNode(5, EmptyStack))
let fStack
= StackNode(2.0, StackNode(1.0, EmptyStack))
let sStack = StackNode("world",
StackNode("hello", EmptyStack))
The only reason I mention the term constructed type, is
because it shows up in some F# literature. There’s nothing particularly significant
about that term.
A note about syntax. F# also supports an angle bracket
syntax when dealing with generic type parameters. We will discuss this more in
Chapter 13; however, for the curious, we could define the above Stack
as follows:
type
Stack<'a> =
| EmptyStack
| StackNode of 'a * 'a Stack
The two forms are logically identical. The are created
identically and behave the same way.
What You Need to Know
·
F# supports a rich set of aggregate types including enumerations,
records, discriminated unions, structures and classes. We will cover structures
and classes later.
·
In F#, you can define your own types via type abbreviations,
e.g., type student = string * float * bool. Type abbreviations are
aliases. When you define a type via a type specifier, you can include generics.
·
Enumerations represent named values. Each name must have an
associated value.
·
Records represent simple aggregates of name-value pairs.
Generally, a record’s name-value pairs are logically or semantically related.
·
You can enhance record types with functions (discussed later) and
custom operators.
·
A discriminated union is an aggregate type whose instances
resolve to one of its discriminants. In functional languages,
discriminated unions are used in places where OO languages might use lightweight
class hierarchies.
·
Option is a pre-defined discriminated union in
F# that’s used when a given expression has the potential for returning “no
value”, e.g., when you read from a database.
·
F# supports reference cells, which are wrappers around mutable
values. They are used when the semantics of the language prevent the explicit
use of mutable, such as when closures are formed.
·
F# supports your defining your own types, including generic
types.
|