Chapter 11

Aggregate Types

Main Page

Introduction

When we program in strongly typed languages, we often find it is necessary to work with data types that capture the semantics of our domain, e.g., when writing accounting software, we could model solutions in terms of debits, credits, etc. Using domain semantics helps us to keep our solutions transparent, simpler to understand, and more natural to work with. As a multi-paradigm language, F# supports structured types, e.g., discriminated unions, as well sophisticated objects that support full OO functionality. In this chapter, we study F#’s built-in structured types and learn how to define our own. To begin, let’s look at how we can access existing types via type abbreviations,

Type Abbreviations

The simplest way to get started with types is to discuss type abbreviations, which are alternate names or aliases for existing types.[1] We use type abbreviations to give existing types meaningful names, in order to make code clearer and easier to read. We also use them to create short names for types that are otherwise unwieldy to work with. Additionally, we use type abbreviations to make it easier to replace a type without changing all the code that uses that type. In the following code snippet, we define several type abbreviations:

type arrIndex = int

type temperature = double

type point = float * float

type student = string * float * bool

These type abbreviations are fundamentally aliases, e.g., arrIndex for int and point for float * float. Since the type names help indicate what the types are used for, you can see how type abbreviations can help clarify intent.

Once defined, you can use abbreviated types anywhere you’d use the orignal type. For example, given the type abbreviations defined above, we can write the following;

let student joe = ("joe", 3.2, true) or

let joe: student = ("joe", 3.2, true)

By convention, type abbreviations are lowercase, but this is not a technical requirement or constraint.

Back in Chapter 9, we discussed generic functions, and how F# enabled us to specify generic function arguments. The idea of generic arguments extends to type abbreviations. For example, let suppose we want to create a new abbreviated type based on .NET’s built-in generic collection class System.Collection.Generic.Dictionary. We can do this by creating a type abbreviation that accepts a generic argument. The generic argument appears in angle brackets as part of the abbreviation’s definition. In the following code snippet we define a type that represents an associative array that uses string-based keys to hold generic values:

type AssocArray<'a> = System.Collections.Generic.Dictionary<string, 'a>

Now, when we create an AssocArray, F# will ensure that we supply a type parameter, which will be used to instantiate a new System.Collections.Generic.Dictionary that uses stores whatever type we specify, and uses strings as its lookup keys, e.g.:

let mylist = new AssocArray<int>()

 

Using type abbreviations, we can alias any type, including arrays, function types, tuples, etc. The following example illustrates a type abbreviation that aliases functional types:

type Transformer<'a, 'b, 'c> = ('a -> 'b) -> ('b -> 'c)

Type abbreviations are nothing more than aliases. During F#’s compilation process, they are expanded into the standard type format shared by all .NET languages.

Enumerations

While type abbreviations enable us to alias existing types, enumerations give us a way to create our own types. Enumerations have a defined set of named values. You can use them in place of literals to make code more readable and maintainable. They take the following general form:

type enum-name =
    | value1 = literal1
    | value2 = literal2
    | ...
    | valueN = literalN

The literals can have the type sbyte, byte, int16, uint16, int32 (the default), uint32, int64, uint64 and char. Enumerations in F# are mapped to the.NET Enum class, which is a value type, meaning they are allocated on the stack and use pass-by-value semantics. In the following example, we define a simple enumerated type to represent a Web protocol:

type Protocol =

    | FTP = 0

    | HTTP = 1

Those coming from the C-family of languages beware. If you omit the literals, you no longer have an enumeration, but a discriminated union (discussed later in this chapter).

You can also compact the representation of enumerations by placing them on the same line:

type Color =

    | Red = 0 | Green = 1 | Blue = 2

It is simple to convert an enumeration to its underlying, corresponding value by using the appropriate casting operator:

type Color =

    | Red = 0 | Green = 1 | Blue = 2

 

let c = Color.Blue

let n = int c   // cast c to corresonding int value

Whenever you access an enumerated vaue, F# requires that you use the enumerator’s full name, e.g.,  you must use Color.Red, not just Red by itself.

To change the default type of an enumeration from int32 (the default) to something else, you need to specify the type of each element, as shown here:

type Days =

    | Sunday = 0x1uy

    | Monday = 0x2uy

    | Tuesday = 0x4uy

    | Wednesday = 0x8uy

    | Thursday = 0x10uy

    | Friday = 0x20uy

    | Saturday = 0x40uy

This example uses unsigned bytes (uy) to represent the various days of the week.

You can use the enum function in the F# library to generate an enumeration value, even a value other than one of the predefined, named values. For example, given this Color enum:

type Color =

    | Red = 0 | Green = 1 | Blue = 2

You use the enum function to programmatically create an additional entry:

let purple = enum<Color>(3)

We can also use enumerations in conjunction with pattern matching, as demonstrated in the following example:   

let isPrimary color =

    match color with

        | Color.Red | Color.Green | Color.Blue-> true

        | _ -> false

 

Records

The next rung in the type sophistication ladder of is that of records. A record represents a simple aggregate of named values that enable us to build data structures that contain named fields. The general form of a record type is:

type [accessibility-modifier] typename = {
    [ mutable ] label1 : type1;
    [ mutable ] label2 : type2;
    ...
    }

In the following example, we define a record that contains information about a movie:

type Movie = {

    Title: string;

    ReleaseYear: int;

}

      

Each record element consists of a named field and data type, separated by a colon, e.g., Title: string. Each record element is separated by a semicolon.

Once you’ve defined a record type, you can construct instances of the record using using the let keyword, as shown here:

let funnyMovie =

    { Title="Pineapple Express"; ReleaseYear=2008 }

When you create an instance of a record, you need to initialize every field. Forgetting to initialize a field results in a compiler error.

Constructing a record instance looks similar to defining a record. It helps to keep in mind that when you define a record type, you use the keyword type and delineate the record’s fields and data types. In contrast, when constructuing a record instance, you use the keyword let and assign values (via =) to each of the record’s fields. You use the brackets { } in both cases.

Record fields are not limited to simple types. You can use lists, arrays, etc. as part of a record’s definition. To demonstrate this, let’s define a new Movie record that includes the list of star actors:

type Movie = {

    Title : string;

    ReleaseYear: int;

    Actors : string[];

}

      

let funnyMovie = {

    Title="Pineapple Express"; ReleaseYear=2008;

    Actors=[|"Seth Rogen"; "James Franco"|]

}

By default, like all values in F#, record elements are read-only. If you wish to make record elements writeable, use the mutable keyword, as shown here:

type Employee = {

    Name : string               // semicolon optional when elements

    mutable Salary : float      // on a separate line

}

 

let joe = { Name="Joe Smith"; Salary = 50000.0 }

joe.Salary <- joe.Salary * 1.1  // Give Joe 10% raise

Note that if an element is inherently mutable, as is the Actors field defined in the Movie type above (by virtue of the fact that it is an array, and arrays are inherently mutable), you can change its contents without using the mutable keyword. What you cannot do is change the referenced element. In other words, given the Movie type as defined, you can update or change the contents of the existing Actors array, but you cannot replace the array with a new array. In code:

funnyMovie.Actors.[0] <- "Ed Helms"             // works

funnyMovie.Actors <- [| "Zach Galifianakis" |]  // error, Actors is not mutable

 

If you define two or more record types and find that these types contain identical element names, you will need to disambiguate them during instance construction. For example, given an Employee class and an Actor class, both with elements Name and Salary, you need to tell F# which record you’re trying to construct; otherwise, F# will choose the most recently defined one. You disambiguate using the record type name followed by a dot ( . ). Let’s look at an example:

type Employee = {

    Name : string;             

    mutable Salary : float;     

}

 

type Actor = {

    Name : string;             

    mutable Salary : float;  

}

 

let x = { Name="Joe Smith"; Salary=50000.0 } // Actor, since most recently defined

let y = { Employee.Name = "Mary Brown"; Salary=60000.0 } // Employee, explicit

Copying Records

In addition to the mechanisms we’ve discussed so far, we can also construct records via cloning. For example, given a  record of type Employee, we can easily create an Employee record and copy it using the with keyword. In the process of making the copy, we can replace whatever element values we’d like, so that the newly constructed copy assumes the new values:

type Employee = {

    Name     : string;             

    Salary   : float;     

    HireYear : int;

}

let joe = { Name="Joe Smith"; Salary=65000.0; HireYear=2001 }

let mary = { joe with Name="Mary Jones"; Salary=75000.0 }

    // copy joe replacing name and salary


Augmenting Records with Operators

Although we’ve already introduced operator overloading, we did so only from the perspective of introducing them in the global namespace. Now that we’ve discussed record types, we’re in a position to revisit operator overloading in a new context.

To overload an operator to be used with specific record types, we use the following syntax:

static member (operator-symbols) (parameter-list) =
    method-body

Here, we see a new keyword static, which means that the member becomes part of the record type itself vs. instances of the type. We’ll discuss static in detail when we cover classes. The member keyword is also new, and tells us that the operator is a “member of”/”part of” the public interface of the record type. The rest of the operator definition remains unchanged vis-à-vis a standard operator as introduced previously.

In the following example, we define a record type called PolarCoord that represents a point in polar coordinates. As part of the record type, we define an operator + that enables us to easily add PolarCoords together. Since we’re defining the operator + to be applicable only to PolarCoords, we do not risk muddying the global namespace, nor do we overload the standard plus (+) operator.

open System

 

type PolarCoord =

    {

        r : double

        angle : double

    }

    static member ( + ) (c1 : PolarCoord, c2 : PolarCoord) =

        // Need to convert to rectangular coords, add, then convert back

        let (x1, y1) = (c1.r * Math.Cos(c1.angle), c1.r * Math.Sin(c1.angle))

        let (x2, y2) = (c2.r * Math.Cos(c2.angle), c2.r * Math.Sin(c2.angle))

        let (x3, y3) = (x1 + x2, y1 + y2)

        let r3 = Math.Sqrt(x3 * x3 + y3 * y3);

            // We need to define "r3" here, since the "r" defined in

            // the record (below) is not visible to the rest of the

            // record, and therefore, cannot be used in the match expression.

            // Go figure.

           

        // Build the resultant PolarCoord

        {

            r = r3

            angle =

                match (x3, y3) with

                | (0.0, 0.0) -> 0.0

                | (x3, _) when x3 >= 0.0 -> Math.Asin(y3 / r3)

                | (x3, _) when x3 < 0.0 -> -Math.Asin(y3 / r3) + Math.PI

        }

       

let p1 = { r = 1.0; angle = 10.0 }

let p2 = { r = 5.0; angle = 72.5 }

let p3 = p1 + p2

When overloading operators in record types, whitespace is critically important. The braces that define the record must be aligned, and the static keyword must vertically align with the closing brace.

Discriminated Unions

The next type in our F# bag of tricks is the discriminated union, an important and often-used construct in F# programs. A discriminated union (DU) is a data type with a finite number of distinct, alternative representations. I think that it’s helpful to think of a DU as a polymorphic type akin to a union in traditional C.

In functional programming, discriminated unions often are used to replace single-inheritance hierarchies. For example, let’s assume that we want to represent a set of simple operating system constructs such as files, processes, and handles. To model this type of system in OO fashion, we’d most likely create an abstract class OSObject and from this derive subclasses such as File, Process, and Handle. We can do something very similar with DUs, as shown in the following example:

type OSObject =

    | File of string            // name

    | Process of string * int   // name and priority

   

Here, we are declaring a DU called OSObject that contains two discriminators: File of type string, and Process of type string * int, which is a tuple. As you can see, each discriminator can be of a different type. Note that each discriminator must start with an uppercase character. This is somewhat odd due to the fact that F# does not normally restrict case outside of its built-in keywords.

To use the OSObject DU, you create instances of the discriminators directly[2]:

let f = File("fsharp.txt")
let p = Process("iexplore.exe", 1)

Here we’re creating an OSObject of type File. Since File was specified to be of type string, F# generated a compliant constructor – and we’re invoking it. We also create a Process, that accepts 2 arguments of type string and int respectively.

In addition to creating discriminators of explicit type, you can create discriminators that do not specify types explicitly. In the following code, we add Unknown to our OSObject DU:

type OSObject =

    | File of string            // name

    | Process of string * int   // tuple (name, priority)

    | Unknown                   // unknown type

When F# encounters Unknown, it generates a simple class that contains a default constructor, enabling us to create instances, as shown here;

let u = Unknown

DUs enable us to treat the discriminators polymorphically. For example, if we wanted to create a factory method for creating new OS objects (OSObjects), we could do so as follows:

let makeOSObject objtype =

    if objtype = "file" then File("")

    else if objtype = "process" then Process("", 0)

    else Unknown()

Because is it an expression, the makeOSObject function must resolve to a single return type. Since all of the potential return types fall under the OSObject umbrella (they are polymorphic with repsect to one another) F# considers a valid return type any OSObject. This enables us to return a File, a Process or an Unknown.

To further flesh out DUs, let’s look at some examples that could form the basis of a fantasy role-playing game:

type FantasyCharacter =

    | Warrior | Wizard | Thief

   

type OffensiveWeapon =  // ints represent offensive strength

    | Sword of int | Mace of int | MagicStaff of int

   

type DefensiveWeapon =  // floats represet defensive strength

    | Shield of float | ChainMail of float

 

Once we’ve defined DUs, we can use them stand-alone or as components of other types. For example, we can define a Player type that consists of a string, a FantasyCharacter and an OffensiveWeapon:

type Player = {

    Name: string;

    Character: FantasyCharacter;

    Offense: OffensiveWeapon;

}

let conan = { Name="Conan"; Character=Warrior; Offense=Sword(100) }

We can even combine discriminators in tuples, arrays, etc. Here is another example where we create a tuple our of two discriminators:

type Player = {

    Name: string;                                // string

    Character: FantasyCharacter;                 // discriminator

    Weapons: OffensiveWeapon * DefensiveWeapon   // tuple of discriminators

}
let conan = { Name="Conan"; Character=Warrior; Weapons=(Sword, Shield) }

Let’s look at another example, this time switching gears from fantasy games to books. This example throws in type abbreviations for good measure:

type Title = string

type PubYear = int

type Author = string

 

type Book =

    | NonFiction of Title * Author * PubYear

    | Fiction of Title

 

let bizbook = NonFiction("Good to Great", "Jim Collins", 2001)

let goodread = Fiction "Shogun"

Note here that we’ve abbreviated the types Title, PubYear and Author, and defined the NonFiction discriminator as a tuple of them all. As with other types, you can construct arrays and lists of discriminators, as shown here:

let bizlibrary = [

    NonFiction("Tipping Point", "Malcom Gladwell", 2000);

    NonFiction("Made to Stick", "Chip & Dan Heath", 2007);

    Fiction("The Lost Symbol")

]

Recursive Definitions

A powerful feature of DUs is that they can include recursive references.  This enables us to define data structures in a recursive manner. The classic example of a recursive data structure is a tree, and it turns out that in F# we often use DUs to define them. In the following example, we use a DU to define a recursive tree structure whose leaf nodes can contain any data type (as defined by a generic parameter 'a):

type Tree<'a> =

    | Leaf of 'a

    | Node of Tree<'a> * Tree<'a>

Given the Tree structure above, we can build a tree as follows:

let t = Node(Node(Leaf(100), Leaf(200)), Leaf(50))

You can also write a function[3] that walks the tree and prints out its values in an easy-to-read style:

let walkTree tree =

    let rec loop depth tree =

        let spacer = new string(' ', depth)

        match tree with

        | Leaf(value) ->       

            printfn "%s |- %A" spacer value

        | Node(tree1, tree2) ->

            printfn "%s |" spacer

            loop (depth + 1) tree1

            loop (depth + 1) tree2

    loop 0 tree

Here, again, I’d like to point out the common pattern of using recursion and pattern matching to define concise, elegant processing constructs.

A side note on the example: Although we have not covered it yet, the match statement, among its other functions, can be used to distinguish among a given set of discriminators. It’s sort of like a an if…else chain or switch statement in C. We will cover match fully in Chapter 12.

Options

F# ships with some built-in discriminated unions that prove quite useful, one of which is the option type[4]. The option type enables our expressions to evaluate to “something” or “nothing” using the discriminators Some and None respectively. In other words, option is used when a value may or may not exist. Here is its definition:

type 'a option =

    | None

    | Some of 'a

When using the option type, our expressions use Some to tell the caller that “the result is valid” and None to tell the caller that “I cannot give you an answer.” Although we could solve this problem in different ways, using options is a well accepted F# idiom.

Let’s look at a few examples to help clarify how we use options. In the first example, we will look at a case of classic division, where we divide a numerator by some denominator:

let classicDiv a b = a / b

This is a perfectly valid function, and works properly except when b = 0. If we call this function with b = 0, F# will throw a DivideByZero exception. The caller can certainly catch and deal with the exception, or we can use the option type to assist the caller in understanding the nature of the return value. Here is a safe version of classic division:

let safeDiv a b =

    if b <> 0 then Some(a / b)

    else None

Options are used frequently to check database query results, report errors from object contruction, and in other instances where a value might be legitimately null.

Until you learn to use pattern matching (in the next chapter) to check the “Some-ness” or “None-ness” of a return value, you can use the Option.isNone and Option.isSome static functions in an if expresson, as shown below:    

let safeDiv a b = if b <> 0 then Some(a/b) else None

let f = safeDiv 9 3

if Option.isNone(f) then printfn "You can't divide by 0!"

else printfn "The answer is %d" f.Value

Notice here that you can pull the intrinsic value from the option type via the Value function. You can accomplish the same thing using Option.get.

Reference (Ref) Cells

Even though F# discourages side effects, there are times, e.g., when interfacing with other .NET languages, that we need to change the state or values, data structures, etc. We can use the mutable keyword, as you’ve seen; however, mutable values have some limitations, including that you cannot use them outside the scope in which they are declared – for example, we cannot use a mutable value declared externally inside a function. The following code snippet, for example, will not compile.

let counter() =

    let mutable count = 0

    (fun () -> count <- count + 1; count)

This example is trying to define a function counter that returns a function (a closure) with a side effect.  The side effect is incrementing the mutable value count  – but count would then be used outside the scope in which it is declared whenever the function returned from counter is invoked, and this is illegal. The specific feedback we get from the F# compiler is:

The mutable variable 'count' is used in an invalid way. Mutable variables may not be captured by closures. Consider eliminating this use of mutation or using a heap-allocated mutable reference cell via 'ref' and '!'        

To make it simpler and more flexible to deal with mutable state, F# offers the Ref type. The Ref type is a record that wraps mutable contents of arbitrary type. A Ref type is defined as follows:

type Ref<'a> =
    { mutable contents: 'a }

Values of type Ref are called reference cells. Unlike mutable values, reference cells overcome the limitation of being accessible outside their declaring scope. In addition, F# provides helper functions for working with them. To create reference cells, we use the ref keyword, followed by a reference expression, as shown in the following example:

let daughter = ref "kimberly"

Here, daughter represents a reference cell – it “wraps” or “points to” the string "kimberly". To change the value referenced by the reference cell, daughter, we use the reference assignment operator  := 

daughter := "chee"

 

Now the value of daughter is "chee”, and the former value, "kimberly", is forgotten.

We access a reference cell’s value using the cell dereference operator (!):

printfn "kim's nickname is %s" !daughter

All in all, a ref type uses pointer-like semantics. It refers to a value and must be dereferenced when used. The value returned by the dereference operator (!) is read-only. To assign a value to a reference cell, use the Value function and the destructive assignment operator (<-), as shown below.

let pi = ref 3.14

pi.Value[5] <- 3.1415926

Somewhat surprisingly, if we create a reference cell that refers to another variable, changes that we make to the reference cell via := do not affect the referenced variable. In the following example, we can see that y references x. When we increment y, x remains unchanged.

let x = 123                  // x = 123

let y = ref x                // use x as initial value for y

y := !y + 1                  // see note below

printfn "x=%d, y=%d" x !y    // print results (x=123, y=124)

The line y := !y + 1 is interesting. What’s happening here is we’re using the old referenced value for y to create a new,  incremented value. y then references this new value (124). The old value is forgotten by y, but is not changed (x still = 123). The analog here is pointers, e.g., from C or C++:   the pointer is changed, not the thing the pointer used to point to.

As you can see from the example above, the := operator is a shorthand for assigning to contents, and ! is a shorthand for reading its value. The fact the := does not change the references value is somewhat counterintuitive, since reference cells behave very much like pointers otherwise.

The net net of all this? Use mutable when you can, and reference cells otherwise. You can use mutable so long as the mutable variable is in local scope.

Defining Generic Types

In an earlier chapter, we discussed generic functions, the mechanics of which are largely handled by F#’s sophisticated type inference system and automatic generalization facilities. We also discussed that type abbreviations can contain generic designations. To finish the swing on generics, I want to mention that F# supports our defining our own generic types, just like the generic types that ship with F# and those in the .NET Base Class Library (BCL), e.g., List.

Let’s suppose, for example, that I want to create a new type, a Stack that can contain any type of object. While there are several ways to skin that cat, one of the prevalent F# idioms is to define a type-parameterized DU. To follow that idiom, we would define our type as a generic discriminated union, as shown:

type 'a Stack =
    | EmptyStack
    | StackNode of 'a * 'a Stack

This definition creates a generic data structure. The leading apostrophe in 'a is important.  It’s a required literal part of the syntax.  Simply using the letter a without a leading apostrophe will not work.  The 'a preceding Stack tells us (and, as importantly, the compiler) that this type is generic, which means the type can be deduced when the DU constructor is invoked. When we supply a data element to the Stack, F# infers the particular type of the Stack and binds it to 'a . Note the  'a is a placeholder for a concrete type. It is replaced by the type supplied by the constructor’s caller. We now have a Stack containing one element, and the Stack expects all future elements to be of the same type. Now that we’ve created an instance of a Stack, one with a real type, we have created a constructed type.  The following code snippet shows 3 constructed types based on the generic Stack type:

let iStack = StackNode(3, StackNode(5, EmptyStack))

let fStack = StackNode(2.0, StackNode(1.0, EmptyStack))

let sStack = StackNode("world", StackNode("hello", EmptyStack))

The only reason I mention the term constructed type, is because it shows up in some F# literature. There’s nothing particularly significant about that term.

A note about syntax. F# also supports an angle bracket syntax when dealing with generic type parameters. We will discuss this more in Chapter 13; however, for the curious, we could define the above Stack as follows:

type Stack<'a> =

    | EmptyStack

    | StackNode of 'a * 'a Stack

The two forms are logically identical. The are created identically and behave the same way.

What You Need to Know

·         F# supports a rich set of aggregate types including enumerations, records, discriminated unions, structures and classes. We will cover structures and classes later.

·         In F#, you can define your own types via type abbreviations, e.g., type student = string * float * bool. Type abbreviations are aliases. When you define a type via a type specifier, you can include generics.

·         Enumerations represent named values. Each name must have an associated value.

·         Records represent simple aggregates of name-value pairs. Generally, a record’s name-value pairs are logically or semantically related.

·         You can enhance record types with functions (discussed later) and custom operators.

·         A discriminated union is an aggregate type whose instances resolve to one of its discriminants. In functional languages, discriminated unions are used in places where OO languages might use lightweight class hierarchies.

·         Option is a pre-defined discriminated union in F# that’s used when a given expression has the potential for returning “no value”, e.g., when you read from a database.

·         F# supports reference cells, which are wrappers around mutable values. They are used when the semantics of the language prevent the explicit use of mutable, such as when closures are formed.

·         F# supports your defining your own types, including generic types.

 

 

 

 



[1] Type abbreviations are much like typedefs in C/C++.

[2] This is possible because for each discriminator defined, under the covers F# generates a simple class with a constructor that expects arguments of the speficied types.

[3] Adapted from F Sharp Programming Wikibooks (http://en.wikibooks.org/wiki/F_Sharp_Programming/Discriminated_Unions)

[4] Defined in the Microsoft.FSharp.Core.Option module.

[5] There is another function, contents, that is used for OCaml compatibility. it does the same thing as Value.

 

Feedback

We welcome your feedback. If you have comments or questions about this chapter, please feel free to e-mail us at

Keep Reading

Next Chapter...