Some time ago I ported most of Clojureâs core
namespace to Fennel and made it into a library called fennel-cljlib.
This was my first library for Fennel, so it wasnât really great in terms of how it was implemented.
While it was making Fennel more like Clojure syntax-wise, which I like, it wasnât following Clojureâs semantics that well.
The main thing that was missing is a proper seq
abstraction, which Clojure relies on for providing both lazy sequences and generic functions that can work on any data type that implements ISeq
.
Such functions are map
, filter
, take
and so on.
Since then, Iâve made a few more libraries for Fennel, which were somewhat more narrowly focused, and one of such libraries was lazy-seq - an implementation of Clojureâs lazy sequences.
It doesnât feature chunked sequences, (yet, maybe), but it implements almost all sequence-related functions from Clojure.
And you can throw pretty much any Lua data structure that implements pairs
or ipairs
into it, and it will work out how to lazily transform it into a sequence.
This was one of the missing pieces for the fennel-cljlib, as its implementation of seq
simply made a shallow copy of a given object in a linear time, making sure that the result is sequential.
With the implementation of seq
from the lazy-seq library, I could rewrite fennel-cljlib, also making all sequence-related functions lazy.
And while this will make the library more Clojure-like one piece is still missing from both fennel-cljlib and lazy-seq libraries.
Transducers.
Iâm quite familiar with transducers, well, I use them regularly at work, and I read about their implementation a few years ago.
However, Iâve never implemented a transduceable context, i.e. a function that accepts a transducer, and applies it to elements of a given collection.
So, as a part of the rewrite of the fennel-cljlib library, I needed not only to port transducers themselves, but I also had to implement such functions as into
, transduce
, and sequence
, which are transduceable contexts.
Thankfully, into
and transduce
are written in Clojure, and are very straightforward to understand, but the sequence
function is not.
Hereâs its source code:
(defn sequence
([coll]
(if (seq? coll) coll
(or (seq coll) ())))
([xform coll]
(or (clojure.lang.RT/chunkIteratorSeq
(clojure.lang.TransformerIterator/create xform (clojure.lang.RT/iter coll)))
()))
([xform coll & colls]
(or (clojure.lang.RT/chunkIteratorSeq
(clojure.lang.TransformerIterator/createMulti
xform
(map #(clojure.lang.RT/iter %) (cons coll colls))))
())))
It is written mostly via Java interop, and I canât use this in my port of the clojure.core
namespace to Fennel, because Fennel runs on Lua, and Lua canât really interop with Java.
So I had to reimplement this function in Fennel, and this is what motivated me to write this post.
Also, I donât really know Java, so understanding how sequence
works was a challenge on its own.
The interesting thing is, after Iâve tried to implement this function several times, I understood that I, actually, donât understand transducers as well as I thought.
So I had to go back a bit and learn how transducers actually work, and why are they written the way they are.
It was really fascinating, and after a bit of trial and error, Iâve managed to implement sequence
in Clojure first, and then port it to Fennel, using my implementation of lazy sequences.
I will show the resulting code later in this post, but first, letâs understand what transducers are, and how they work.
Transducers
First, a bit of theory.
A transducer is a function that describes the process of transformation, without knowing how exactly the thing it transforms is organized.
It is not the same as generic functions, because transducers are generic in a bit different way.
For example, we all know how map
works in Clojure - you pass it a function and a collection, and it applies the function to each element of the collection:
(map inc [1 2 3]) ;; => (2 3 4)
Map
walks through the given collection, applies the given function to each element, and puts results into a sequence.
Seems nice, and Clojure actually makes map
generic by transforming the given collection to a sequence before actually mapping over it.
However, this approach isnât generic enough, because there are things that canât be transformed into a sequence in an efficient or even meaningful way.
One such thing is an asynchronous channel, and when Clojure developers were working on the core.async
library, theyâve realized that sequence manipulation functions are usable in the context of a channel, but reimplementing all these functions is a no-go.
So how did Clojure developers solve this problem?
By decoupling the transformation process from the collection it transforms.
Hereâs a helpful analogy.
Imagine an airport worker, whose job is to weigh and sort the luggage before it goes into a plane.
Their basic instructions are:
- Take a bag;
- Measure its weight and put a sticker on the bag;
- If the bag weight is bigger than X, donât put the bag on the plane;
- Hand the bag over.
Note that this process, while can be applied to a single bag at a time, doesnât at all specify how bags are coming to you and how they leave you.
One day bags can come to you in containers brought by a vehicle, the other day they can come on a conveyor, it doesnât matter to you - you just take a bag, weigh it, put a sticker and hand it over to another car or another conveyor.
These details should not matter, because your job remains the same, even if you take bags from different sources every day.
This is, of course, an analogy, but it applies to programming pretty well.
Bags are items in the collection youâre going to map
a function over and then filter
them out.
The function is what you do with the bag, in the case of our fellow airport worker weighing bags and putting stickers.
In addition, notice that we first weight the bag, and then filter it out immediately, in oppose to weighting all bags and then filtering them one by one.
However, in a programming language, the way how items are coming to us completely depends on the collection.
And how we collect results into another collection depends on the implementation of the map
function.
These are two main things stopping us from describing an arbitrary transformation process, without tying ourselves to a particular collection implementation or a conversion protocol.
Looking at other languages, which provide different classes for different data structures, most of the time map
is a method and not a function.
This way map
can be implemented in terms of the collection youâre mapping through, usually producing the same collection as a result, because this method knows how to map a function over this particular collection implementation.
Methods do not fly in functional languages, so another approach, some languages take is to provide different implementations of map
functions via namespaces.
For example, Elixir has a map
function implemented for lists and another implementation for streams:
# Enum versions of map and filter
iex(1)> (0..3
|> Enum.map(fn(x) -> IO.puts("map #{x}"); x + 1 end)
|> Enum.filter(fn(x) -> IO.puts("filter #{x}"); rem(x, 2) != 0 end)
|> List.first)
map 0
map 1
map 2
map 3
filter 1
filter 2
filter 3
filter 4
1
# Stream versions of map and filter
iex(2)> (0..3
|> Stream.map(fn(x) -> IO.puts("map #{x}"); x + 1 end)
|> Stream.filter(fn(x) -> IO.puts("filter #{x}"); rem(x, 2) != 0 end)
|> Enum.to_list
|> List.first)
map 0
filter 1
map 1
filter 2
map 2
filter 3
map 3
filter 4
1
The difference is quite substantial, as streams apply function composition to each element one by one, (similarly to how our airport worker does) whereas enumerations are fully traversed by map
first and only then the result is being passed to filter
.
This distinction is possible thanks to different implementations of map
, but be it a method of a specific class, or a namespaced function thatâs exactly what Clojure developers wanted to avoid.
So transducers were created.
Understanding transducers
To understand transducers, we first need to understand what map
essentially does and how we can abstract it away from both the input collection and the result it produces.
It may seem obvious: map
applies a function to each element of a collection and puts the result into a new collection.
Iâve marked important things in bold because if we think about these a bit, weâll see that there are some concrete actions that the map function performs, which should be abstracted away.
First, letâs implement map
in terms of reduce
.
This function is actually very similar to how mapv
is implemented in Clojure, except weâve left out some optimizations:
(defn mapr
"A version of `map` that uses `reduce` to traverse the collection and
build the result."
[f coll]
(reduce (fn reducer [result val] (conj result (f val))) [] coll))
;; ^ ^ ^ ^ ^ ^
;; | | | | | |
;; | | | | | `Collection to iterate
;; | | | | `Collection to put results
;; | | | `Function that produces the result
;; | `A reducing function that`knows how to put elements to result
;; `This is how we get a collection element one by one
Here, reduce
takes care of how to traverse the collection, and the reducer
function takes care of how to put elements to the result.
It may seem that weâve decoupled these steps from map
but weâve just moved them into another place.
More than that, if we were to implement filter
this way we would have to put the logic, that decides what elements are going to be left out, into an anonymous function:
(defn filterr
"A version of `filter` that uses `reduce` to traverse the collection
and build the result."
[f coll]
(reduce (fn [res x] (if (f x) (conj res x) res)) [] coll))
;; ^ ^
;; | |
;; | `Logic that decides whether the value will be left out
;; `generic way to iterate through a collection
Notice, that both mapr
and filterr
share a lot of structure, the only difference here is how to put the resulting value into the collection.
This should give us a hint on how we can abstract this away.
And given that Clojure is functional, we can write functions that accept other functions and return new functions, which will provide a generic way of how to put the result into a collection:
(defn map-transducer [f]
(fn [reducer]
(fn [result value]
(reducer result (f value)))))
(defn filter-transducer [pred]
(fn [reducer]
(fn [result value]
(if (pred value)
(reducer result value)
result))))
This isnât what a transducer really is like, but a first real step towards them.
The key point here is that now, we can describe a reducing process without knowing how to put the modified item into the resulting collection (or channel, or socket, or whatever).
The only thing we need to implement for collection now is reduce
or some other function that has the same interface as reduce
.
This is how we can use these prototype transducers:
user> (def incrementer (map-transducer inc)) ; a function that knows how to increment
#'user/incrementer
user> (incrementer conj) ; teaching `incrementer` how to put elements to the result
#function[user/map-transducer/fn--5698/fn--5699]
user> (reduce (incrementer conj) [] [1 2 3]) ; using this transducer in `reduce`
[2 3 4]
user> (reduce ((filter-transducer odd?) conj) [] [1 2 3]) ; same for `filter-transducer`
[1 3]
So what happens here is that when we call map-transducer
and pass it a function inc
it returns a function, that accepts the reducer
also known as the reducing function of just rf
for short.
We then call this function, passing it the reducing function conj
and get another function, that accepts the results so far, and the element to process.
This function then calls inc
, which weâve supplied in the first step, on the element, and uses conj
to put the resulting value to result
.
In other words, by passing inc
and conj
to the transducer weâve basically constructed (fn [result value] (conj res (inc value)))
function, that is then used by reduce
.
Hereâs a demonstration:
user> (reduce ((map-transducer inc) conj) [] [1 2 3])
[2 3 4]
user> (reduce (fn [result value] (conj result (inc value))) [] [1 2 3])
[2 3 4]
And thatâs basically what transducers are all about!
Theyâre just a composition of functions, that produces the final transformation function, that acts as a single step over the given collection.
And the amazing part of such design is that transducers can be composed with other transducers:
user> (reduce ((comp (map-transducer inc)
(filter-transducer odd?))
conj)
[] [1 2 3 4 5 6])
[3 5 7]
user> (reduce (fn [result value] ; above is essentially the same as this
((fn [result value] ; function composition
(if (odd? value)
(conj result value)
result))
result (inc value)))
[] [1 2 3 4 5 6])
[3 5 7]
It may be a little hard to process, but donât worry, I will go into details after we complete the implementation of transducers, as weâre not yet finished.
Completing transducers
Most transducers donât have any intermediate state, but some do.
For example, the partition-all
function takes a collection and returns a list of partitions of elements from this collection.
Transducer, that is returned by this function, needs to store elements inside an array, and only after the current partition is filled it will append it to the result.
Seems logical, however, if the number of elements canât be equally partitioned, some will be left over:
user> (partition-all 3 (range 8)) ; a regular partition-all call
((0 1 2) (3 4 5) (6 7))
user> (defn partition-all-transducer
"Our naive implementation of `partition-all` as a transducer."
[n]
(fn [reducing-function]
(let [p (volatile! [])]
(fn [result value]
(vswap! p conj value) ; building the partition
(if (= (count @p) n)
(let [p* @p]
(vreset! p []) ; clearing the partition storage
(reducing-function result p*)) ; adding the partition to the result
result ; returning result as is, if the partition is not yet complete
)))))
#'user/partition-all-transducer
user> (reduce ((partition-all-transducer 3) conj) [] (range 8))
[[0 1 2] [3 4 5]]
We can see that in the case of our implementation, only complete partitions were added to the result, yet there should be an additional incomplete partition, as shown by the direct partition-all
call.
This is because our transducer is missing a so-called completion step:
(defn partition-all-transducer [n]
(fn [reducing-function]
(let [p (volatile! [])]
(fn
([result] ; completion arity
(if (pos? (count @p))
(reducing-function result @p)
(reducing-function result)))
([result value] ; reduction arity
(vswap! p conj value)
(if (= (count @p) n)
(let [p* @p]
(vreset! p [])
(reducing-function result p*))
result))))))
Here Iâve added another arity, that must be called after the reduction process is complete.
This arity checks if the array weâve used to store the incomplete partition is not empty.
If itâs not, it means that there are some leftovers, that we need to add to the result.
So it calls reducing-function
with this array, otherwise it will call reducing-function
with result
only, propagating completion step down the line.
Invoking this arity after reduce
completed, we can see that all partitions are present:
user> (let [f ((partition-all-transducer 3) conj)
res (reduce f [] (range 8))]
(f res)) ;; complete step
[[0 1 2] [3 4 5] [6 7]]
Our reduce
example has become way too verbose, and thereâs also a potential for error if our transducer leaked from this scope and someone else used it after it was completed.
Notice that Iâve forgotten to clear the volatile p
in the completion step, and if someone else calls this particular function again, these leftover elements will be added to the result again.
(Try to achieve that.)
Therefore, Clojure abstracts the process of finalizing a transducer into a function, conventionally called transduce
:
(defn transduce
([xform f coll]
(transduce xform f (f) coll))
([xform f init coll]
(let [f (xform f)
ret (reduce f init coll)]
(f ret))))
Thereâs one additional arity that can be added to our transducer implementation, which is used for initialization, done by calling the reducing function without arguments.
This arity takes zero arguments, and it is optional, as not all reducing functions can come up with a meaningful initialization process, but itâs better to supply it than not.
So a complete implementation of the map-transducer
function is:
(defn map-transducer [f]
(fn [reducing-function]
(fn
([] ; init
(reducing-function))
([result] ; complete
(reducing-function result))
([result input] ; step
(reducing-function result (f input)))
([result input & inputs] ; step with multiple inputs
(reducing-function result (apply f input inputs))))))
Code Snippet 2:
A complete implementation of a mapping transducer
And thatâs it!
This is a complete implementation of a transducer.
Clojure provides this as an additional arity of map
where you only supply a function, without the collection, thus we donât even need a separate function for this, it was merely for demonstration purposes.
If you look back at the Elixir example, you can see that when Stream
implementations of map
and filter
are used, each function is applied in quick succession, opposed to Enum
version, where map
is applied first, and then the result is filter
âed as a whole.
With transducers, we just implemented, or with ones available to us in Clojure we can do exactly the same:
user> (->> (range 4)
(map (fn [x] (println "map" x) (inc x)))
(filter (fn [x] (println "filter" x) (odd? x)))
first)
;; map 0
;; map 1
;; map 2
;; map 3
;; filter 1
;; filter 2
;; filter 3
;; filter 4
;; => 1
user> (->> (range 4)
(transduce
(comp (map (fn [x] (println "map" x) (inc x)))
(filter (fn [x] (println "filter" x) (odd? x))))
conj)
first)
;; map 0
;; filter 1
;; map 1
;; filter 2
;; map 2
;; filter 3
;; map 3
;; filter 4
;; => 1
Now letâs really understand how transducers work.
Understanding the inverse order in comp
and how transducers are composed
The transduce
call may have seemed a bit complicated because of comp
, but hereâs a nice trick Iâve found.
Simply remember, that the order of transducers in comp
is exactly the same as the order of calls in the ->>
macro.
However, it may seem counter-intuitive, because usually functions in comp
are applied in the reverse order, e.g.:
((comp a b c d) x)
; expressions are aligned for clarity
(a (b (c (d x))))
In other words, even though functions are provided in order a
, b
, c
, d
, the call order will be d
, c
, b
, a
.
And in the case of transducers the composition works the same way, itâs just we have one extra step after weâve passed the reducing function.
As Iâve demonstrated in this example, the composition basically is engineered in such a way that it kinda inverses its order twice.
The first inversion happens after we compose functions, and the second inversion happens when we pass the reducing function.
Letâs use the substitution model, sometimes referred to as normal order evaluation to see why this happens.
Substitution here basically means that before we do any reduction, we expand all forms until they only contain primitives.
This is done by substituting names with expressions they refer to.
For example, given these three functions:
(defn add [a b] (+ a b))
(defn square [x] (* x x))
(defn sum-squares [a b] (+ (square a) (square b)))
We can walk through the (sum-squares (add 1 1) (add 1 2))
expression, and see how it will be evaluated in normal order, and how it differs from applicative order:
Normal order |
Applicative order |
(sum-squares (add 1 1) (add 1 2)) |
(sum-squares (add 1 1) (add 1 2)) |
(sum-squares (+ 1 1) (+ 1 2)) |
(sum-squares (+ 1 1) (+ 1 2)) |
(+ (square (+ 1 1)) (square (+ 1 2))) |
(sum-squares 2 3) |
(+ (* (+ 1 1) (+ 1 1)) (* (+ 1 2) (+ 1 2))) |
(+ (square 2) (square 3)) |
(+ (* 2 2) (* 3 3)) |
(+ (* 2 2) (* 3 3)) |
(+ 4 9) |
(+ 4 9) |
13 |
13 |
Notice, that in normal order (+ 1 1)
and (+ 1 2)
are executed twice, because all substituting happens before any reduction.
In applicative order, reduction happens before substituting, so every expression is computed only once.
Applicative order is more in line with real evaluation rules, but itâs harder to see whatâs happening when we compose things, so Iâll use the normal order to show how transducers are composed.
With this in mind, letâs try to walk through the following expression:
((comp (map-transducer inc)
(filter-transducer odd?))
conj)
First, we need to substitute map-transducer
and filter-transducer
with their (simplified) implementations:
((comp ((fn [f]
(fn [rf]
(fn [result value]
(rf result (f value))))) inc)
((fn [f]
(fn [rf]
(fn [result value]
(if (f value)
(rf result value)
result)))) odd?))
conj)
Next, letâs substitute f
with inc
and odd?
and get rid of function calls, substituting them with anonymous functions that accept rf
:
((comp (fn [rf]
(fn [result value]
(rf result (inc value))))
(fn [rf]
(fn [result value]
(if (odd? value)
(rf result value)
result))))
conj)
Now, knowing that composition of functions f
and g
is (fn [x] (f (g x)))
we can substitute comp
with this expression, and then substitute f
and g
in this expression with our functions from the previous step:
((fn [x]
((fn [rf]
(fn [result value]
(rf result (inc value))))
((fn [rf]
(fn [result value]
(if (odd? value)
(rf result value)
result)))
x)))
conj)
Note that according to composition rules, odd?
should be executed first, and inc
would follow it, but weâre not done composing yet.
Letâs substitute x
for conj
and remove the function call:
((fn [rf]
(fn [result value]
(rf result (inc value))))
((fn [rf]
(fn [result value]
(if (odd? value)
(rf result value)
result)))
conj))
We can now substitute rf
for conj
in the innermost function call, and remove the innermost function that accepts rf
:
((fn [rf]
(fn [result value]
(rf result (inc value))))
(fn [result value]
(if (odd? value)
(conj result value)
result)))
Finally, we can substitute outer rf
with the entire inner function body, and remove another call:
(fn [result value]
((fn [result value]
(if (odd? value)
(conj result value)
result))
result (inc value)))
As the last step, weâre substituting the inner functionâs value
to (inc value)
and result
to result
to eventually get:
(fn [result value]
(if (odd? (inc value))
(conj result (inc value))
result))
The final function that will be executed by reduce
, which is just an ordinary two-argument function!
And as you can see inc
happens before odd?
.
So yes, the order in comp
may seem inverse, as inc
and odd
appear in a logical order, but thanks to how the whole composition process evolves, this logical order can be preserved in the resulting function.
I hope that with this substitution model you can now understand the whole composition process of transducers, which is not a trivial process by any means.
But Iâm actually amazed by how this simple idea achieves such a complete abstraction that can be used in all kinds of transformation contexts.
Speaking of transduceable contexts, letâs implement one!
Implementing sequence
transduceable context
Now weâre ready to implement sequence
in Clojure, without direct Java interop.
First, letâs remember what reduce
does:
- Accepts a function, initial value, and a collection;
- Gets an element of a collection and passes the initial value and the element to the reducing function;
- The reducing function returns current result;
- The result is then passed to the reducing function alongside the next element from the collection;
- Once the collection is exhausted, the result is returned.
Thus, reduce
can be written as an ordinary loop.
However, sequence
is lazy, therefore we canât just loop through the collection, but thankfully, lazy sequences can be recursive.
And, thereâs another problem, we must append each element to the sequence weâre producing, but we also need to check if weâve actually finished, or if we need to skip the element because it is filtered out.
And we need to call the completion step somewhere.
Though, if you think about it, we donât need for our transducer to actually append elements to a collection, it can merely do the transformation, and since we know what kind of collection weâre building, we can build it later.
With this approach, we can check the result of a transducer on each step and act accordingly.
First, letâs finalize the transducer with a reducing function:
(defn sequence [xform coll]
(let [f (xform (completing #(cons %2 %1)))]
;; lazy loop?
))
It may seem that weâre using cons
here because weâre building a sequence, but itâs not.
We could actually use anything, like conj
or even a function that returns something that we can later distinguish from other values.
In cons
call, we have to reverse arguments though, because cons
adds the second argument to a list, provided as a first argument.
And completing
simply adds a completion step that just returns the value itâs been given, in other words, we could write it as (fn ([a] a) ([a b] (cons b a)))
.
Now letâs figure out how to loop through the collection.
We can start with an ordinary loop and then convert it to recursion, adding laziness as the last step.
(defn sequence [xform coll]
(let [f (xform (completing #(cons %2 %1)))
res (loop [s (seq coll)
res ()]
(if s
(let [x (f nil (first s))]
(if (seq? x)
(recur (next s) (concat res x))
(recur (next s) res)))
res))]
res))
Letâs try it:
user> (sequence (map inc) [1 2 3])
(2 3 4)
user> (sequence (partition-all 2) [1 2 3 4 5])
([1 2] [3 4])
user> (clojure.core/sequence (partition-all 2) [1 2 3 4 5])
([1 2] [3 4] [5])
Seems to work, but weâre missing the completion step (hence no incomplete partition in the result), so letâs add it:
user> (defn sequence [xform coll]
(let [f (xform (completing #(cons %2 %1)))
res (loop [s (seq coll)
res ()]
(if s
(let [x (f nil (first s))]
(if (seq? x)
(recur (next s) (concat res x))
(recur (next s) res)))
(f res)))] ; complete
res))
user> (sequence (partition-all 2) [1 2 3 4 5])
([5] [1 2] [3 4])
Oops, remember, that weâre using cons
and we canât really call f
with res
as a completion step, because our reducing function doesnât know how to build the whole collection, only how to transform a single element.
Instead, we have to call it with nil
as before, and concat
it with the result:
user> (defn sequence [xform coll]
(let [f (xform (completing #(cons %2 %1)))
res (loop [s (seq coll)
res ()]
(if s
(let [x (f nil (first s))]
(if (seq? x)
(recur (next s) (concat res x))
(recur (next s) res)))
(concat res (f nil))))] ; proper completion
res))
user> (sequence (partition-all 2) [1 2 3 4 5])
([1 2] [3 4] [5])
Now letâs make it recursive, by replacing loop
with an anonymous function, and recur
with actual recursion:
user> (defn sequence [xform coll]
(let [f (xform (completing #(cons %2 %1)))
step (fn step [coll res]
(if-some [s (seq coll)]
(let [x (f nil (first s))]
(if (seq? x)
(step (next s) (concat res x))
(step (next s) res)))
(concat res (f nil))))]
(step coll ())))
#'user/sequence
user> (sequence (partition-all 2) [1 2 3 4 5])
([1 2] [3 4] [5])
user> (dorun (sequence (map inc) (range 100000)))
Execution error (StackOverflowError) at user/sequence$step (REPL:25).
null
It still seems to work, but overflows with enough elements.
Luckily, we can use lazy-seq
to eliminate this problem, and actually make our implementation lazy:
user> (defn sequence [xform coll]
(let [f (xform (completing #(cons %2 %1)))
step (fn step [coll res]
(if-some [s (seq coll)]
(let [x (f nil (first s))]
(if (seq? x)
(lazy-seq (step (rest s) (concat res x)))
(lazy-seq (step (rest s) res))))
(concat res (f nil))))]
(step coll ())))
user> (dorun (sequence (map inc) (range 100000)))
Execution error (StackOverflowError) at user/sequence$step (REPL:1).
null
And it still throws the StackOverflowError
.
Why?
Well, because weâre not really lazy yet.
Instead of passing the result to the next iteration of step
, as we did in loop
we should use the result of step
and concatenate with it.
Letâs reorganize our function:
user> (defn sequence [xform coll]
(let [f (xform (completing #(cons %2 %1)))]
((fn step [coll]
(if-some [s (seq coll)]
(let [res (f nil (first s))]
(if (seq? res)
(concat res (lazy-seq (step (rest s))))
(step (rest s))))
(f nil)))
coll)))
user> (dorun (sequence (map inc) (range 100000)))
nil
Now it doesnât overflow.
However, it is not yet ready to be used, because reduce
, as you may know, can be terminated with reduced
, and some transducers, like take
leverage that to terminate the process.
So we need to check for reduced?
in our implementation.
Not only that, but we have to call the completion step on the value, returned by dereferencing the reduced
object, otherwise it will not be added to the resulting sequence properly.:
user> (defn sequence [xform coll]
(let [f (xform (completing #(cons %2 %1)))]
((fn step [coll]
(if-some [s (seq coll)]
(let [res (f nil (first s))]
(cond (reduced? res) (f (deref res)) ; checking for early termination
(seq? res) (concat res (lazy-seq (step (rest s))))
:else (step (rest s))))
(f nil)))
coll)))
#'user/sequence
user> (sequence (comp (partition-all 2) (take 5)) (range))
([0 1] [2 3] [4 5] [6 7] [8 9])
user> (sequence (comp (take 5) (partition-all 2)) (range))
([0 1] [2 3] [4])
And weâre done!
Well, almost, sequence
should coerce the result to an empty sequence, and our current version will return nil
if the transducer never returned anything.
Itâs easy enough to fix:
(defn sequence [xform coll]
(let [f (xform (completing #(cons %2 %1)))]
(or ((fn step [coll]
(if-some [s (seq coll)]
(let [res (f nil (first s))]
(cond (reduced? res) (f (deref res))
(seq? res) (concat res (lazy-seq (step (rest s))))
:else (step (rest s))))
(f nil)))
coll)
())))
Code Snippet 3:
Final version of our sequence
transducer.
We can see that it is lazy, by using side-effecting transducers, like ones with println
in those:
user> (sequence
(comp (map (fn [x] (println "map" x) (inc x)))
(filter (fn [x] (println "filter" x) (odd? x)))
(take 3))
(range))
map 0
filter 1
(1map 1
filter 2
map 2
filter 3
3map 3
filter 4
map 4
filter 5
5)
Code Snippet 4:
Notice that the output is all messed up, because the sequence started printing before it was realized, and side effects appeared during the pretty printing process, which itself is lazy.
And as you can see, our sequence
behaves in the same way as the clojure.core/sequence
, regarding laziness:
user> (do (sequence
(comp (map (fn [x] (println "map" x) (inc x)))
(filter (fn [x] (println "filter" x) (odd? x))))
(range)) ;; note, infinite range
nil)
map 0
filter 1
nil
user> (do (clojure.core/sequence
(comp (map (fn [x] (println "map" x) (inc x)))
(filter (fn [x] (println "filter" x) (odd? x))))
(range))
nil)
map 0
filter 1
nil
I think now sequence
is completed, though Iâll need to test it very extensively in the future.
Iâve already tested it a lot when I was porting it to Fennel, and I think it should work correctly, looking at the code at least I donât see anything that could go wrong.
What have I learned
Implementing transduceable context, in this case, the sequence
function, was a nice puzzle.
Iâm sure itâs not as efficient, as the clojure.core
version, mainly due to the use of concat
but I wasnât able to come up with a better way of building the result which can be done lazily.
And after actually implementing sequence
, porting it and a lot of transducers to Fennel, Iâve finally figured out how they actually work, and I hope that now you understand it too!
This is why I love Clojure - the developers are putting a lot of thought into such features, and not taking shortcuts, like reimplementing all functions for data sources that canât be transformed into sequences.
For me, itâs a really practical language, with a lot of tools that can enhance the programming experience, and make it fun.
Of course, if you have any questions, feel free to email me, and Iâll try to answer them, and maybe update the post for future readers.
This topic is a bit complicated, so I hope it was not a boring post.
Thanks for reading!
Permalink