Fold

An Explanation for Composability of Clojure Transducers

In A Tutorial on the Universality and Expressiveness of Fold, Graham Hutton presents the advantages of programming with folds in a clear and understandable way. Most of the core concepts exposed there, among them the universal property and the fusion property of folds, can also be found in Bird's book. In this short article we'll apply some of those ideas to Clojure transducers, and we'll show how the fusion property implies the efficiency and composability of transducers. Comparing left and right folds in Clojure, and their laziness, is beyond the scope of this article: Clojure's reduce is (notationally) a left fold, and we'll assume all lists are left lists (say Clojure vectors). Note also that what follows is not about Clojure's fold as found in clojure.core.reducers.

The Universal Property of Fold

Following Hutton, we can define a fold function on lists by means of the following properties. For sets $\alpha$ and $\beta$, consider the set of all functions $f\colon\beta\times\alpha\rightarrow\beta$, which we'll call right actions of $\alpha$ on $\beta$ and denote by $A_{\beta,\alpha}$. When the function $f$ is clear from the context, we'll just write $b\cdot a$ for $f(b, a)$. The $\mathsf{fold}$ function can be defined as

$$\mathsf{fold}\colon A_{\beta,\alpha}\times\beta\times[\alpha]\rightarrow\beta$$

such that, for a given $b\in\beta$, the following properties hold:

$$\mathsf{fold}(f, b, [\,]) = b \tag{u1}$$

$$\mathsf{fold}(f, b, \bar{x}\colon\!x) = f(\mathsf{fold}(f, b, \bar{x}), x) \tag{u2}$$

where $\bar{x}\colon\!x$ is the list obtained by conjoining an element $x$ of $\alpha$ to a list $\bar{x}$ (in Clojure, $(\mathsf{conj}\,\bar{x}\,x)$).

Now for fixed $b$, (u1) and (u2) do indeed form a universal property, i.e. if such a function exists then it is unique, and $\mathsf{fold}$ is fully characterized by these properties. Uniqueness is proven by induction: assume there are functions $\mathsf{f}$ and $\mathsf{f}'$ which both satisfy (u1) and (u2) but which disagree, i.e. $\mathsf{f}(f, b, \hat{x}) \neq \mathsf{f}'(f, b, \hat{x})$, on some list $\hat{x}$, and pick such an $\hat{x}$ of minimal length. If $\hat{x}=\bar{x}\colon\!x$ is non-empty, then by (u2) the two functions must also disagree on the shorter list $\bar{x}$, contradicting minimality; hence $\hat{x}$ must have length zero, contradicting (u1).

We can also restate (u2) by saying that, for all $f$ and $b$, the partial application $\mathsf{fold}_{f\,b}$, seen as a function from $[\alpha]$ to $\beta$, commutes with $\colon\!x$ and $\cdot x$, that is:

$$\mathsf{fold}_{f\,b}(\bar{x}\colon\!x) = \mathsf{fold}_{f\,b}(\bar{x})\cdot x$$

Existence of a fold function is shown by implementing it, which is possible in most programming languages. In Clojure, fold is the reduce function, but the universal properties (u1) and (u2) actually provide a constructive definition:

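(defn fold [f i l]
  (if (empty? l)
    ;; (u1): folding the empty list returns the initial value
    i
    ;; (u2): fold the initial segment, then act with the last element
    (f (fold f i (butlast l)) (last l))))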

Since fold above satisfies (u1) and (u2) by definition, and we can easily prove the same for reduce, they have to be the same function. Still, we can build a cheap function-equality check on integer vectors

(require '[clojure.test.check :as c])
(require '[clojure.test.check.generators :as gen])
(require '[clojure.test.check.properties :as p])
(defn =' [& fns]
 (let [pr (p/for-all [v (gen/vector gen/int)]
           (apply = (map #(% v) fns)))]
   (c/quick-check 100 pr)))

to get a hint that our definition is sound, trying out some concrete examples:

(=' (partial fold + 0)
    (partial reduce + 0))
;; => {:result true, :num-tests 100, :seed 1571307035067}
(=' (partial fold str "")
    (partial reduce str ""))
;; => {:result true, :num-tests 100, :seed 1571307035336}
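Beyond comparing whole functions on random inputs, (u1) and (u2) can also be checked directly on sample values. The following is a minimal sketch; the helper names u1-holds? and u2-holds? are hypothetical and not part of the article's code:

```clojure
;; Hypothetical helpers: direct checks of the universal property
;; for any candidate fold function taking (f, initial value, list).
(defn u1-holds?
  "(u1): folding the empty list returns the initial value."
  [fold-fn f b]
  (= b (fold-fn f b [])))

(defn u2-holds?
  "(u2): folding (conj xs x) equals acting with x on the fold of xs.
  xs must be a vector, so that conj appends at the end."
  [fold-fn f b xs x]
  (= (fold-fn f b (conj xs x))
     (f (fold-fn f b xs) x)))

(u1-holds? reduce + 0)
(u2-holds? reduce + 0 [1 2 3] 4)
(u2-holds? reduce str "" ["a" "b"] "c")
```

Both checks take the candidate fold function as an argument, so they can be applied to reduce as well as to any hand-written fold.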

Expressing List Operations in Terms of Fold: Transducers

It is possible to express a lot of functions on lists in terms of fold, amongst the most popular are filter and map:

(defn filter' [pred]
  (fn [xs x] (if (pred x) (conj xs x) xs)))
(fold (filter' odd?) [] (range 10))
;; => [1 3 5 7 9]
(defn map' [phi]
  (fn [xs x] (conj xs (phi x))))
(fold (map' inc) [] (range 9))
;; => [1 2 3 4 5 6 7 8 9]

where $(\mathtt{filter}'\,\pi)$ and $(\mathtt{map}'\,\phi)$ are actions of natural numbers on lists of natural numbers. Clojure transducers take this pattern one step further: they encapsulate list-like operations independently of the reducing function. Formally, transducers are transformations of actions, i.e. functions from actions to actions, which behave functorially with respect to folds; this will be explained later.
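To make the idea of a transformation of actions concrete, here is a minimal sketch of hand-written transducers restricted to the two-argument (step) arity. The names map-t and filter-t are hypothetical; Clojure's own transducers also implement init and completion arities, which are left out here:

```clojure
;; Hypothetical, simplified transducers: each takes an action rf
;; (a reducing function) and returns a new action.
(defn map-t [phi]
  (fn [rf]
    (fn [acc x] (rf acc (phi x)))))

(defn filter-t [pred]
  (fn [rf]
    (fn [acc x] (if (pred x) (rf acc x) acc))))

;; The same transformation works with different reducing functions.
(reduce ((map-t inc) conj) [] (range 3))    ; builds [1 2 3]
(reduce ((filter-t odd?) +) 0 (range 6))    ; sums 1, 3 and 5
```

Neither definition mentions conj or +: the list-like operation is stated once and the reducing function is supplied afterwards.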

Functions like filter and map in Clojure, when given a single argument, return a transducer. For instance, look at

(let [s (with-out-str (clojure.repl/source filter))]
   (println (clojure.string/join (take 420 s))))
;; => nil

and let's see it applied to the conj function first

(fold ((filter odd?) conj) [] (range 6))
;; => [1 3 5]
(fold ((map inc) conj) [] (range 9))
;; => [1 2 3 4 5 6 7 8 9]

and then to the + function

(fold ((filter odd?) +) 0 (range 6))
;; => 9

The strong point of transducers in practice is that they let us stack reducing operations in a composable way, while the input list is visited just once. Take for instance:

(def coll [{:a 1} {:a 2} {:a 3} {:a 4}])
(->> coll
     (map :a)
     (filter odd?)
     (map inc)
     (reduce + 0))
;; => 6

At each step above an intermediate sequence is produced and fed to the next computation, which iterates through it again. With transducers this won't happen: the following snippet of code reads the input collection just once, encoding the transformations in a single action:

(def xf (comp (map :a)
              (filter odd?)
              (map inc)))
(reduce (xf +) 0 coll)             
;; => 6

which in Clojure is (almost) the same as the simpler form

(transduce xf + 0 coll)
;; => 6

Later we'll also see the reason for the contravariant order of composition: the transducers passed to comp are applied left to right, not in the usual right-to-left order of function composition.
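For comparison, the single action encoded by applying (comp (map :a) (filter odd?) (map inc)) to + can be written out by hand. This is only an illustrative sketch with a hypothetical name, step, and not what Clojure generates internally:

```clojure
(def coll [{:a 1} {:a 2} {:a 3} {:a 4}])

;; Hypothetical hand-fused action: extract :a, keep odd values,
;; increment them and add them to the accumulator, all in one step.
(defn step [acc m]
  (let [a (:a m)]
    (if (odd? a)
      (+ acc (inc a))
      acc)))

(reduce step 0 coll) ; a single pass over coll
```

Writing such a fused function by hand for every pipeline does not scale, which is where composing transducers helps.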

Fusion Property and the Composition of Folds

Having shown that many functions on lists can be expressed in terms of fold, when can we actually assert that a composition of folds can be expressed as a fold of a single action? One step in this direction is given by the fusion property.

Given right $\alpha$-actions $f\colon\beta\rightarrow\alpha\rightarrow\beta$ and $g\colon\beta'\rightarrow\alpha\rightarrow\beta'$, we call a function $\phi\colon\beta\rightarrow\beta'$ a morphism from $f$ to $g$ if $\phi(f(b, x)) = g(\phi(b), x)$ holds for every $b\in\beta$ and $x\in\alpha$.

We can prove that $\mathsf{fold}$ is stable under the application of morphisms, i.e. given a morphism $\phi$ of actions like the one above, we have:

$$\phi\circ\mathsf{fold}_{f\,b} = \mathsf{fold}_{g\,\phi(b)}$$

To prove the above equality we appeal to the universal property: if we can prove (u1) and (u2) for $\phi\circ\mathsf{fold}_{f\,b}$, with respect to the action $g$ and the initial value $\phi(b)$, then the equality above must hold for every list in $[\alpha]$. While (u1) is trivial to see, (u2) follows by combining the commutative diagram for (u2) with the one defining a morphism.
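Spelled out as equations, the (u2) step of this argument is a short chain, where $f$ and $g$ are the two actions and $\phi$ is the morphism between them:

```latex
\begin{aligned}
(\phi\circ\mathsf{fold}_{f\,b})(\bar{x}\colon\!x)
  &= \phi\big(f(\mathsf{fold}_{f\,b}(\bar{x}),\, x)\big)
  && \text{by (u2) for } \mathsf{fold}_{f\,b} \\
  &= g\big(\phi(\mathsf{fold}_{f\,b}(\bar{x})),\, x\big)
  && \text{since } \phi \text{ is a morphism from } f \text{ to } g
\end{aligned}
```

This is (u2) for the action $g$, and (u1) holds with initial value $\phi(b)$ because $(\phi\circ\mathsf{fold}_{f\,b})([\,]) = \phi(b)$.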

Now, if we want to compose folds as functions of lists, we have to restrict to some specific class of actions. We say that an action on lists $f\colon[\alpha]\times\alpha\rightarrow[\alpha]$ splits if there exists some function $f'\colon\alpha\rightarrow\alpha$ such that $\bar{x}\cdot a = \bar{x}\colon\!f'(a)$ for all $\bar{x}\in[\alpha]$ and $a\in\alpha$. Note that the actions defined in the examples above all split (with some formal imagination in the filter case). Given a splitting $\alpha$-action $f$, we define a function of actions $\mathsf{t}_f\colon A_{\beta,\alpha} \rightarrow A_{\beta, \alpha}$ which sends an action $g$ of $\alpha$ on $\beta$ to the action

$$\mathsf{t}_f(g)(b, a) = g(b, f'(a))$$

We can now state a transducing property for folds of splitting actions: for every splitting $f$ and every action $g$ we have

$$\mathsf{fold}(g, b, l') = \mathsf{fold}(\mathsf{t}_f(g), b, l)$$

where $l'$ is $\mathsf{fold}(f, [\,], l)$.
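The lemma can be tried out in Clojure. The sketch below assumes that $\mathsf{t}_f(g)$ sends $(b, a)$ to $g(b, f'(a))$, which matches how ((map inc) +) behaves; the helper names splitting-action and t are hypothetical:

```clojure
;; Hypothetical helpers, named after the symbols used in the text.
(defn splitting-action
  "The list action induced by f': append (f' a) to the list xs."
  [f']
  (fn [xs a] (conj xs (f' a))))

(defn t
  "Assumed form of t_f: sends an action g to (fn [b a] (g b (f' a)))."
  [f']
  (fn [g]
    (fn [b a] (g b (f' a)))))

;; Folding g over the transformed list l' agrees with folding
;; the transformed action over the original list l.
(let [l  [1 2 3]
      l' (reduce (splitting-action inc) [] l)]
  (= (reduce + 0 l')
     (reduce ((t inc) +) 0 l)))
```

The left-hand side builds the intermediate vector l', while the right-hand side never materializes it.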

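To prove the transducer lemma it's enough to show that $\mathsf{fold}_{g\,b}$ is a morphism between the actions $f$ and $\mathsf{t}_f(g)$:

$$\mathsf{fold}_{g\,b}(\bar{x}\colon\!f'(a)) = g(\mathsf{fold}_{g\,b}(\bar{x}), f'(a)) = \mathsf{t}_f(g)(\mathsf{fold}_{g\,b}(\bar{x}), a)$$

The fusion property, applied with $\phi = \mathsf{fold}_{g\,b}$ and initial value $[\,]$, then gives the lemma, since $\phi([\,]) = b$ by (u1).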

Let's apply the lemma above to a simple Clojure case where $g$ is + and $f$ is ((map inc) conj); then $f$ splits (by definition) and $\mathsf{t}_f(g)$ is ((map inc) +):

(=' (comp (partial reduce + 0)
          (partial map inc))
  
    (comp (partial reduce + 0)
          (partial reduce ((map inc) conj) []))
    (partial reduce ((map inc) +) 0))          
;; => {:result true, :num-tests 100, :seed 1571307036569}

Now since function composition is associative, repeating the step above we also get for instance

(=' (comp (partial reduce + 0)
          (partial map inc)
          (partial filter odd?))
    (comp (partial reduce ((map inc) +) 0)
          (partial reduce ((filter odd?) conj) []))
    (partial reduce ((filter odd?) ((map inc) +)) 0)
    
    (partial reduce ((comp (filter odd?) (map inc)) +) 0))
;; => {:result true, :num-tests 100, :seed 1571307036798}

which explains the contravariant behaviour in the composition of transducers with respect to the composition of the non-transduced forms.
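A small sketch of this ordering, using only clojure.core functions: composing transducers with comp applies them left to right, while composing the corresponding sequence functions applies them right to left.

```clojure
;; Transducers: filter first, then map.
(into [] (comp (filter odd?) (map inc)) (range 6))

;; Sequence functions: comp runs right to left, so to filter first
;; the filter has to come last in the composition.
((comp (partial map inc) (partial filter odd?)) (range 6))

;; Swapping the transducers changes the result: map first, then filter.
(into [] (comp (map inc) (filter odd?)) (range 6))
```

The first two expressions contain the same elements, 2, 4 and 6, while the third yields 1, 3 and 5.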

Stateful transducers and cat

There are some transducers which escape the pure form of splitting actions as defined above, most notably cat

(clojure.repl/source cat)
;; => nil

which is used to flatten list outputs on the fly:

(let [coll [{:a [1]} {:a [2]} {:a [3]}]
      xf (comp (map :a) cat)]
 (reduce (xf +) 0 coll))
;; => 6
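To see why cat falls outside the splitting form, here is a simplified sketch of a cat-like transducer. The name cat' is hypothetical, and unlike clojure.core/cat this version ignores early termination and the init and completion arities:

```clojure
;; Hypothetical, simplified cat: one input element xs may contribute
;; many elements to the output, so the step is itself a fold over xs
;; with the same action rf.
(defn cat' [rf]
  (fn [acc xs] (reduce rf acc xs)))

(reduce (cat' conj) [] [[1] [2 3] []])
(reduce ((comp (map :a) cat') +) 0 [{:a [1]} {:a [2]} {:a [3]}])
```

A splitting action appends exactly one element per input, whereas here a single input can append zero, one or several.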

Stateful transducers also escape the splitting form, for instance the 0-ary form of distinct

(clojure.repl/source distinct)
;; => nil

or the 1-ary form of take-while

(clojure.repl/source take-while)
;; => nil

which uses the reduced trick to short-circuit the fold, allowing for very nice stuff like

(def terms [{:do true :val 1} 
            {:do true :val 2}
            {:do true :val 2}
            {:do true :val 1}
            {:do true :val 3}
            {:do false :val 4}
            {:do true :val 5}
            {:do true :val 6}])
(let [xf (comp (take-while :do)
               (map :val)
               (distinct))]
  (reduce (xf +) 0 terms))
;; => 6
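Both ideas can be sketched in a few lines. The names take-while' and distinct' are hypothetical, and the definitions only cover the two-argument step arity, so they are simplified relative to the clojure.core versions: the first wraps the accumulator with reduced to stop the fold, the second keeps the set of seen elements in a volatile.

```clojure
;; Hypothetical, simplified take-while: as soon as pred fails,
;; wrap the accumulator in `reduced` so that reduce stops early.
(defn take-while' [pred]
  (fn [rf]
    (fn [acc x]
      (if (pred x)
        (rf acc x)
        (reduced acc)))))

;; Hypothetical, simplified distinct: the set of elements seen so far
;; is local mutable state, created once per application to rf.
(defn distinct' []
  (fn [rf]
    (let [seen (volatile! #{})]
      (fn [acc x]
        (if (contains? @seen x)
          acc
          (do (vswap! seen conj x)
              (rf acc x)))))))

(reduce ((take-while' pos?) +) 0 [1 2 -1 10])
(reduce ((distinct') conj) [] [1 2 2 1 3])
```

In the first call the fold stops at -1, so 10 is never visited; in the second, repeated elements are skipped because of the state held in seen.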

If you'd like to discuss this, find me at @lo_zampino on Twitter.

Appendix

;; deps.edn
{:deps {org.clojure/test.check {:mvn/version "0.9.0"}}}
(require '[clojure.test.check])
(clojure-version)
;; => "1.10.0"