Functional programming and the real world What is functional programming (FP) and what is its role in today's software engineering world?
1. What is functional programming?
Talking to
dima_i earlier today, I realized that I started to take everybody's familiarity with FP for granted.
This is a bit startling, especially considering that it took me about 6 years to go from an FP skeptic to an ardent proponent. Judging by my blog posts back in 2008, I was quite wary of FP and regarded it as a somewhat reckless experiment. However, it was fascinating to read about FP and to learn new mathematical concepts, so I kept at it. Already a couple of years later, it is clear that I regarded FP as a
viable and interesting experiment. At some point around mid-2013 I realized that the object-oriented and imperative paradigms are a failure and won't help me with my job (as a software engineer). From 2014 onwards, I regarded FP as a viable and practical paradigm that merely suffers from being unconventional and unpublicized among software developers. Three years later, many of my blog posts are about technical aspects of FP - more precisely, about a peculiar new branch of computer science that I call
functional type theory (FTT). This theory underlies functional programming and can be studied in its own right, regardless of the particular FP language, because it applies equally to all FP languages.
So, perhaps it is helpful to summarize what FP/FTT is.
I will be giving examples in Scala and Haskell, but keep in mind that essentially the same code works in Standard ML, OCaml, Swift, or F#.
Definition: A programming language
follows FTT if it has all of the following features.
- The language must support expressions of arbitrary complexity.
An expression should not be limited to simple arithmetic (such as a + b*c), as it is in many programming languages. An expression should be able to contain arbitrary code if/then/else branches, to define temporary variables and use them, or to define new temporary functions - with or without names - and use them. The syntax for expressions should be sufficiently powerful that we should be able to write our entire program as a single expression if we wished.
Haskell adopts this minimalistic approach: the program is simply the expression named "main".
In Scala, due to Java compatibility requirement, one must define a class and a static method named "main"; but the body of this method can be written, if we so desire, as a single expression encompassing our entire program.
- Each expression (named or nameless) must have a fixed type, determined at compile time and unchanged during run time.
Example (Scala):
val q = "xyz" + 1
The expression computes a string "xyz1", so the value q is automatically determined to be of type String. It will be a compile-time error to use q in a context that requires, say, an array or a Boolean value.
- Functions should be also treated as values with fixed types.
A function type is written as A -> B, which means that a function accepts an argument of type A and returns a result of type B. It is a compile-time error to pass an argument of a wrong type, or to use the result value as a value of another type.
Example (Scala):
val f = (x: Int) => x + 1
This defines the value f, which is a function taking an integer argument x and returning x + 1. Since x + 1 is also integer, the function f is automatically determined to have the type Int => Int. It will be a compile-time error to call f on a String argument, or to use the result of f as a non-integer.
In Haskell, the same example looks like this:
f :: Int -> Int
f = \ x -> x + 1
Haskell has a code convention that function types should be declared on a separate line before defining the function.
However, this is optional, and in any case, the Haskell compiler will automatically determine the type Int -> Int for the function f.
- The language must support type constructors, i.e. types parameterized by other types. This applies also to function types: the language must support functions with type parameters.
For example, we should be able to define a type representing an array of elements of type T, where T is a type variable that stands for an unknown type. We should be able to write a function that takes an argument of type "array of T" without specifying what T is. Later, we should be able to call that function on any specific array type (such as, array of strings or array of arrays of functions of type Int -> Int, or whatever else). If the language supports this feature, we call "array" a type constructor.
In Scala, type parameters are written in square brackets, for example Array[T]. In Haskell, the syntax is "Array t" without any brackets.
Here is a Scala example of a function parameterized by type T and taking arguments also parameterized by T:
def getFirstElementOrDefault[T] (arr: Array[T], defaultValue: T): T =
if (arr.nonEmpty) arr(0) else defaultValue
Java has this feature under the name of "generics"; C++ has it too, with "templates".
- The language must support function calls that take other functions as arguments and/or return a new function as the result value. If the language supports this, we call those functions higher-order functions.
Example: define a function `compose` that takes two arbitrary functions f, g and returns their composition as a new function.
Haskell:
compose: (b -> c) -> (a -> b) -> a -> c
compose f g = \x -> f (g x)
Scala is a bit more verbose since functions with type parameters require a "def", but it's basically the same code as in Haskell:
def compose[A,B,C](f: B => C, g: A => B): A => C = (x: A) => f(g(x))
- The language must support product types ("tuples", "conjunctions") and co-product types ("sums", "disjunctions", or "variants"), as well as the special type Unit that has only one value.
Example of a tuple type is (Int, String, Int) -- this syntax works for both Scala and Haskell. It contains three values, each of its respective type, so that the type of each place in the tuple is fixed. Tuples have operations such as "take the first element", "take the second element" etc. In effect, tuples are records where fields have no names.
Example of a disjunction type is Either: this type is parameterized by two types A and B, and the value Either[A,B] contains one of the values of types A or B (but not both). In both Scala and Haskell, all members of a disjunction type must be named.
Haskell has a short syntax for disjunction types:
type WhatAddress = StreetAddress String | POBox (String, Int) | AddressNotGiven
Disjunction types are almost unknown in programming languages of Algol family (C, C++, Java, Python, etc.). The only sort of disjunction type in C and Java is the "enum" type, which is a disjunction of constant labels.
C: typedef enum { Red, Green, Blue } Color
Haskell: type Color = Red | Green | Blue
So, disjunction types can be seen as extended "enum"s that additionally carry a value (or several values) on each label.
Optional features for a more advanced FP language are higher-order types and type classes. Only Scala and Haskell support both of these features. OCaml has second-order types as part of its module system, but no type classes. Neither Swift nor F# support higher-order types or type classes.
- Higher-order types are types parameterized by type constructors. For instance, the standard function "map" applies a transformation to each element of a data container (such as array, queue, tree, etc.). Since "map" exists for many different types of data containers, we may want to define "map" so that its type is parameterized by the type of the container.
Scala: def map[Container[_], A, B](f: A=> B)(c: Container[A]): Container[B] = ...
Haskell:
map :: (a -> b) -> container a -> container b
Defined in this way, the function "map" has a higher-order type.
If a language does not support higher-order types, its standard library will have to define "map" separately for each container (either with a different name or in a different namespace, such as List.map, Array.map, Set.map and so on).
- A type class is a set of type constructors for which a function with a specific type signature exists. The function (or several such functions) should be parameterized by the type constructor, as we just saw in the "map" example.
For example, each of the Scala type constructors Array, List, Seq, HashMap, IndexedSeq, Set, Option, Future has a specific "map" function defined on it. All these "map" functions have the same signature, as we just saw, parameterized by the constructor. For this reason, all these type constructors belong to the same type class - the class of "type constructors that have `map` defined on them". This type class is called "functor". Examples of other type classes are "applicative functor" and "monad".
What is functional type theory (FTT)?
It is a branch of category theory mixed with formal logic. FTT studies the data types and type constructors used by FP languages, in order to help the programmer to decide what types to use for specific programming tasks.
Typical questions considered by FTT are:
- If we have a pair of lists (List[A], List[B]), can we convert that value into a single list of pairs List[ (A, B) ]?
What are other type constructors besides List that have this property, and what are type constructors that cannot have this property?
- Can we define "map" for the type R x = Int -> x, where x is a type parameter (i.e. an arbitrary, unknown type)?
Can we do that for Q x = x -> Int or for D x = (x -> Float) -> x?
- Can we define a function of type (((((A -> B) -> B) -> A) -> B) -> B) for arbitrary types A and B?
What would be the code of this function, if we can define it?
- If we have a type constructor A that does not have a "map" operation, can we define another type constructor F that does little more than wrap values of type A x but has a "map" function with the standard type (a -> b) -> F a -> F b? Can we define this F through A in a generic way, so that the code for F is parameterized by A?
While writing code, an FTT-aware programmer will be trained to recognize situations when the code can be refactored to use parameterized types and standard functions such as "map". Then, the programmer will see that before writing further code, one must understand what types are necessary for solving the task at hand, and how the initially available types can be transformed into the desired final result type. Thus, the programmer will have to answer some questions about types, such as the questions shown above.
To answer these questions, the programmer will take a piece of paper and write some calculations. Alternatively, there could be automatic tools for deciding some of these questions (although automatic derivation in the relevant logics is still at the forefront of research).
Category theory and formal logic provide tools for answering all these questions. The amount of required mathematical material is certainly not large, and the complexity of the relevant calculations are not higher than that of (European) high-school algebra and calculus. Nevertheless, new programmers need some time - perhaps a year or two of learning and on-the-job practice - before FTT becomes as obvious as simplifying (x-y)*(x+y).
FTT today is still in its infancy. The features of FP that I listed here are design patterns that showed their usefulness in the course of the last 20 years of exploration. However, it is far from clear what other design patterns might be possible. For example, as recently as 2016, there was still no consensus about the best way to combine arbitrary monads: with monad transformers, through MTL and "power lifting", with free monads and functor co-products, or with specialized "effect types" that utilize the "open co-product" structure. So, it is safe to say that the full implications of monads for programming is still a research area today. Twenty years from now, these questions will surely have been settled.
I have not mentioned dependent types, and languages such as Coq, Agda, and Idris. Using dependent types in FP is another promising research direction, which is still further away from settling into an established engineering discipline than the areas of FTT I outlined.
What are the benefits of FP?
There are many levels of complexity in FP/FTT, and, as I said, certain features are still in the stage of experimental exploration. Most companies actually using FP are utilizing only the features I listed, and this already allows them to reap two main benefits of FP:
- The code contains no loops: all iterative processing of data structures is performed using standard higher-order functions such as "map", "filter", "flatMap", "fold", "zip" and so on. This avoids errors and makes code very concise.
- All types are checked at compile time, and the type system (especially the combination of functional types, product types, and co-product types) is powerful enough to avoid the necessity of runtime typecasts or type checking. This again avoids large classes of errors associated with "null" or other special values used in traditional programming languages. At the same time, the type system allows programmers to express complicated idioms, such as massively concurrent computations, more easily
2. Functional programming is far from mainstream software engineering.
Let's start with some facts.
- Haskell and OCaml are general-purpose, reasonably well-supported FP languages. One of the first published books about programming in these languaged had titles "
Real World Haskell" and "
Real World OCaml". However, there are no such books for other programming languages such as Python, JavaScript, or C#.
To be precise, there is one book from 2002 with the title "
Java tutorial for the real world" and a couple of more specialized Java books with a "Real World" in their subtitle. However, no book about another programming language would try to make a loud statement that,
contrary to appearances, the language is actually ready to be used in the real world. Haskell and OCaml books needed to do that! This is all the more significant since there are so few books in total about Haskell or OCaml, and so many about Java.
- Programming languages most in demand today, according to different sources and in decreasing popularity order:
SQL, JavaScript, C#, Java, Python, PHP, C++, T-SQL, C, Ruby, PowerShell, PerlThe 12 "up-and-coming" languages are
Go, Swift, Hack, Rust, Julia, Dart, Erlang, Scala, Elixir, Haskell, Clojure, Lua - but that's just an analyst's opinion and prediction. None of these languages are nearly at the point of being mainstream.
SQL, Java, Javascript, Python, C# are consistently among the most demanded languages as determined for
2016 and
2017. None of these languages are FP.
- New FP languages invented by major companies include Apple's Swift and Microsoft's F#. Adoption of these languages is relatively small.
- Large and innovative Internet software companies such as Facebook, Twitter, LinkedIn, or Netflix use FP languages such as Scala and Haskell. However, as the language popularity data shows, the use of FP at these companies is not enough to skew the statistics towards FP. We must conclude that the overwhelming majority of the software industry - both the large established companies such as Microsoft, Amazon, and Google, and smaller businesses - use only non-FP languages.
- The rapidly growing data science/machine learning community is at 91% usage of
Python, R, SAS, and SQL. These languages are not FP, although some of them do have certain FP features (notably R), and the linked report says that Scala and Julia are "not declining in popularity" for data science, while
another report claims that Scala grows faster than Julia, while still clearly behind Python & Co. On the other hand, the data engineering community use technologies such as Kafka, Akka, and Spark, which are all Scala-based and heavily use FP. It seems that the only wide industrial use of FP today is "big data" engineering; many reports such as
this one say that Scala is on the rise for data engineering while the usage of Matlab and SAS is declining.
3. Why are data scientists not embracing Haskell or Scala?
One explanation is that the software industry is very conservative, and/or that software engineers are not math PhDs who can study functional type theory and become proficient. Perhaps this is sufficient to explain why most Web servers today are still written in JavaScript, PHP, or Ruby rather than in Haskell or Scala.
However, data scientists do not fit this picture. They often have PhD degrees in science or mathematics, and they would certainly have no trouble learning the basic rudiments of category theory and abstract algebra that are used by FTT.
My explanation is that data scientists are much more interested in science than in software engineering. They see software engineering as a necessary evil or, at best, as a tool for implementing mathematical calculations or other algorithms. Even if it were possible today to learn FTT from a concise book adapted to science-minded people, a typical data scientist would probably find that kind of material uninteresting.
As for ordinary software engineers, most of them seem to be quite uninterested in learning any "theory". They are quite content to suffer with the bugs in their programs, just as users have by now learned that software products are "inherently" unreliable and cannot be depended on in the same way as we depend on, say, the highway bridges not collapsing as we drive over them.
There are, however, two cases of FP adoption that I found illuminating.
The first case is the widespread adoption of Scala in the data engineering community. Scala is not a particularly elegant language (no comparison with Haskell or Agda here), but Scala has the key benefits that data engineers realized:
- compatibility with the entire Java ecosystem - all libraries and deployment tools can be used in Scala with very little effort
- a powerful type system (having type constructors) that allows programmers to easily implement the stream processing paradigm for big data
Stream processing of big data is the only reason, in my view, why Scala had any chance for being used in data engineering or anywhere at all in the software industry. The previous methodology (Google's map/reduce) is too restrictive to implement arbitrary data flow. Scala-based technology such as Spark and Akka have been adopted because there was no other solution; Java or C++ is too difficult to use for implementing a concurrent, massively parallel application. So it is not the theoretical elegance of FTT but the very real necessity of building a stream processing pipeline that forced software engineers to turn to Scala.
The second case is the adoption of R by data scientists (I am now expanding on what
dima_i has told me today). This is the most puzzling example of FP adoption.
Although the R language has an old lineage and is not inspired by FP ideas, it has some accidental FP features: it can treat functions as values, and its standard library contains some higher-order functions such as "map", which are used to process arrays and matrices. A crucial design goal was that "map", "filter" and other higher-order functions should be optimized in the R interpreter to run fast enough on large data sets.
An unintended consequence of this design goal is that, while R permits programmers to write traditional loops, its interpreter is too slow to provide acceptable performance for the loop-based code. Therefore, the users of R are
gently forced to use higher-order functions such as "map" or "filter" in their programs! No amount of admonishments or demonstrations of mathematical elegance would have moved R users towards functional programming; what worked was a simple deficiency of an erroneously designed programming language.