lj_chaource: Polynomial functors are not always distributive

Polynomial functors are not always distributive
Category theory is useful for programming because modern functional languages allow us to reason about code directly in terms of category-theoretic notions such as "functor" or "natural transformation".

Every programming language defines a certain category. The objects of that category are all the data types available in the language (for example, "integer", "boolean", "array of strings", "a pair of matrices of floating-point numbers", etc.). The morphisms are the functions that can be implemented in the language, taking one data type and returning another. For example, in some programming language the function x -> (if x = "hello" then true else false) is a morphism from "string" to "boolean".

Category theory has a number of constructions, such as the initial / terminal objects, products and co-products, etc., and it turns out that programming is easier if these constructions exist in the programming language's category. This gives valuable guidance about the features that a programming language "should" have.

Modern functional languages are such that their categories have initial and terminal objects, products and co-products. Additionally, one can directly work with functors and natural transformations. Since all we have is just one category, all functors map into a sub-category. However, it is still useful to think about them as functors.

In what follows, I develop a small bit of category theory that is directly applicable to functional programming. Please let me know if some terminology is not standard or misleading; I am not an expert in category theory.

The axioms of category are that

1) for each object t there is an identity morphism id: t->t

2) for any two morphisms f: r->s and g: s->t there is a "composition" morphism g.f: r->t

Both these axioms are evidently satisfied by the programming language's category. The identity function and the function composition are present in every functional language.

The terminal object in the category is denoted by "unit" or "1". The product of two objects is denoted by r*s, and the co-product ("disjoint union" or "sum") is denoted by r+s. We assume that all these always exist in our category. (This is the case for all modern functional languages.)

A type constructor is a map from objects to objects that is not necessarily a functor. Denote it as A(t). In a programming language, type constructors normally represent some kind of "generic containers of data". For example, the type of "array of strings" is denoted as Array[String] in Scala, Array<String> in Java, Array String in Haskell, and string array in OCaml. In our notation, this would be "A(t)" where "Array" is "A" and "String" is "t". The "Array" is the type constructor, and the "String" is a specific type; but a type constructor can be used with every type.

A type constructor A(t) is a functor if for any morphism f: r->s there exists a morphism A(f): A(r) -> A(s). The existence must be verified constructively; so there must exist a function that takes f and returns A(f). This function is called "fmap" in Haskell, and its type can be written like this:

fmap_A :: (r->s) -> (A(r) -> A(s)).

Definition. A polynomial type constructor is defined inductively: for any type t, A(t) is either a constant, fixed type "c", or it is the type "t" itself, or it is a product of two polynomial type constructors, or it is a "sum" of two polynomial type constructors.

Example: A(t) = a + b*t + c*t*t is a polynomial type constructor that involves constant types "a", "b", and "c". The type A(t) represents either a value of type "a", or a pair of values of types "b" and "t", or a triple of values of types "c", "t", and "t". Other examples of polynomial type constructors: Id(t)=t (identity functor); Unit(t)=1 (returning the unit type for all types t); A(t) = 1 + t*t*t.

Statement: Any polynomial type constructor is a functor.

Proof: By induction on the definition of the polynomial type constructor. If A(t)=c for a constant type c, it is a functor with "fmap" always returning the identity map id: c->c. If A(t)=t, it is again a functor with "fmap" returning "id". If A(t)=B(t)*C(t), where B and C are already proved to be functors, we define "fmap_A" for A(t) as a product of "fmap_B" and "fmap_C". Similarly for A(t)=B(t)+C(t).

Example: To see that the type constructor A(t) = 1 + t is a functor, we need, for any morphism f:r->s, a morphism fmap_A : A(r)->A(s). We can define fmap_A (1+0) = 1+0, and fmap_A (0+x) = 0 + f(x). It is straightforward to check the functor properties.

So, from now on we can simply talk about polynomial functors. They are quite important for programming, since almost all data types are polynomial functors (or recursive completions of polynomial functors).

Definition. A natural transformation T:A~>B between functors A and B is, for every object r, a morphism T(r):A(r)->B(r) that "commutes with fmap". In other words, for any types r,s and for any morphism f: r->s we should have the commutative diagram

        T(r)
   A(r)  ->   B(r)

    |          |
A(f)|      B(f)|
    |          |
    V          V
        T(s)
   A(s)  ->   B(s)

Definition: A functor F is pointed if there exists a natural transformation Id ~> F.

Examples of pointed functors: F(t) = 1 + 1 + t*t; F(t) = t+t+t. Examples of non-pointed functors: F(t) = c*t where c is some fixed type. A polynomial functor is pointed when it is a polynomial in t with "coefficients" that are all units, without externally defined fixed types.

Definition: A functor F is distributive if there is a natural transformation of functors F(a)*F(b) ~> F(a*b). A functor F is co-distributive if there is a natural transformation of functors F(a+b) ~> F(a)+F(b).

Distributive functors are also called "applicative" in the programming language community, but this term doesn't really explain much. On the other hand, I am tempted to call co-distributive functors "linear".

Definition: A pointed functor F is monadic if there is a natural transformation between functors F(F(t)) ~> F(t).

It is not obvious but a monadic functor is necessarily distributive.

Pointed, monadic, distributive and co-distributive polynomial functors seem to be quite important for practical programming. However, I'm not sure how to characterize them. Not all polynomial functors are distributive or co-distributive, but many are.

By analogy we can define co-pointed and co-monadic functors. They seem to be much less used in functional programming practice today. Is a co-monadic functor always co-distributive? Is there a different, more transparent way of characterizing all these properties of functors?

For example, F(t) = 1+t is a very frequently used monadic functor. However, F(t)=1+t is not co-pointed, and so it is not co-monadic.