Watching Intro to Clojure With AustinClojure

Last weekend I participated in an event by the Austin Clojure User Group. They watched the LispCast Intro to Clojure screencasts as a group in a classroom setting. Someone had bought a group license and donated it to the group.

We had a group of 12, and many had done little to no Clojure before. The two days were divided into 4-hour chunks with snacks and caffeinated beverages. The group was led by Norman, one of the co-founders of the user group. We watched a bit of the video, paused it, and then worked on the exercises on our laptops. We had time for questions and troubleshooting… and for Norman to come by and say, “Oh Nola, you forgot this part…” :)

Day One (first video)

The first video was basically working in the REPL. We learned how to write expressions and create basic functions, and covered basic operations, conditionals, and branching. It was really great: nobody was fighting with their editor, worrying about where to save a file, or doing any project setup. Just learning how to use the REPL. I think this is a great way to get beginners interested without boring them or making them think they have to memorize a list of functions they don’t really understand.
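
To give a flavor of that first session (my own sketch from memory, not the course’s exact exercises), we typed things like this at the REPL:

user=> (+ 1 2 3)
6
user=> (defn shout [s] (str s "!"))
#'user/shout
user=> (shout "hello")
"hello!"
user=> (if (> 3 2) "three is bigger" "two is bigger")
"three is bigger"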

Near the end of the day we were typing longer and longer expressions, and it was kind of frustrating at times when you made a typo or forgot a ) and had to start your function over. Someone had a great solution: type it in a text editor and then paste it into the REPL. Still, I got behind a few times and went to the cheatsheet in the Austin Clojure repo and copied and pasted the function, allowing me to move on and not stress over the syntax.

Day Two (second video)

We started out by making a new project and using the library we had worked with the day before, adding some of the functions from the previous day. We learned how to run a Clojure program from the command line and how to set the “main” function that is run when the app starts. We learned how to take multiple arguments. We also learned that Clojure code is compiled in order :) which tripped up some of us who called functions before defining them (see the sketch below). While the previous lesson only used vectors, we covered the set… and later the map. Introducing these at the point where you need more than a vector is, I think, brilliant. Versus a lesson starting out with “here’s a set, here’s a map, go forth and prosper,” having a purpose for each one really solidifies the reasons for each data structure.
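
For anyone who hits the same ordering gotcha, here’s a minimal sketch (my own example, not the course’s project; the namespace name is made up) showing a -main entry point and declare, which lets you reference a function before its definition appears in the file:

(ns myapp.core
  (:gen-class))

;; Forward-declare greet so -main can reference it even though
;; greet's definition appears later in the file.
(declare greet)

(defn -main
  "Entry point; with :main myapp.core in project.clj, `lein run` calls this."
  [& args]
  (println (greet (first args))))

(defn greet [name]
  (str "Hello, " (or name "Clojure") "!"))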

I’ve done Ruby for 9 years, PHP for 6, JavaScript for 16… it’s hard being a noob again! Though I’ve poked at Clojure over the last few months, I really felt like I got a pretty good grasp of it this weekend. Thanks Austin Clojure User Group and LispCast’s Intro to Clojure by Eric Normand.

Now to watch part 3, then LispCast’s Web Development in Clojure!!!

Permalink

Haskell Data Analysis Cookbook - a Book Review

As with my previous post, Clojure Data Analysis Cookbook - a Book Review, this time I was offered the chance to review Haskell Data Analysis Cookbook by Nishant Shukla. First impressions: these are two very similar and related books with some overlapping ideas, but not only are the programming languages totally different in "genre", the content itself also covers some different data analysis ground, so they could be treated as complementary books.


The book itself is very example oriented (much like the Clojure Data Analysis Cookbook), basically being a collection of code recipes for accomplishing various common tasks for data analysis.  It does give you some quick explanations of why and what else to "see also".

It gives you recipes to take in raw data in the form of CSV, JSON, XML, or whatever, including data that lives on web servers (via HTTP GET or POST requests). Then there are recipes to build up datasets in MongoDB or SQLite databases, to clean up that data, to do analysis (e.g. clustering with k-means), and to visualize, present, and export that analysis.

Each recipe is more or less self-contained, without much building on top of previous recipes. That makes the book more "random access". It's less a book to read through cover to cover, and more of a handy reference to use by full-text searching for key terms, clicking on the relevant topic in the table of contents, or looking up terms in the index. It's definitely a book I'd rather have as a PDF ebook so that I can access it anywhere in the world and do full-text searches in it. It does come in Mobi and ePub formats as well, and code samples are provided in a separate zipped download.

Having said that, you can tell whether a book was made to be seriously used as a reference by looking at its index. There are 9 pages of index, equivalent to about 2.9% of the pages preceding it. This book can certainly be used as a reference.

As a reference book, it's great for people who already have some familiarity with Haskell in general. If you don't know Haskell, this book won't teach it to you. That is, unfortunately, possibly a missed marketing opportunity, as those who don't know Haskell (but know another programming language) really only need a small amount of it to understand how functions are written and follow what's going on in the book. This means that if you know another programming language and a bit about data analysis, you could use this book to learn some Haskell, so long as you pick up the basic syntax with another tutorial in hand (so it's really not a show-stopper to using this book).

Similarly, I'd say you had best be familiar with how to do data analysis as a discipline in itself.  If you don't know whether to do clustering or regression, or whether to use a K-NN or K-means, this book won't teach it to you.

Much of that is, of course, echoing the Clojure Data Analysis Cookbook. Where the Haskell Data Analysis Cookbook differs, it gives the two books a set of complementary ideas. Whereas both books talk about concurrency and parallelism, the Clojure DAC goes into those topics (including distributed computing) in much more detail.

On the other hand, whereas both books talk about preparing and processing data (prior to performing statistics or machine learning on it), the Haskell DAC goes into much more detail on topics like processing strings with more advanced algorithms (as in computing the Jaro-Winkler distance between strings, not like doing substring/concat operations), computing hashes and using bloom filters, and working with trees and graphs (as in node-and-link graph theory graphs, not grade-school bar graphs).

So in some sense, the Haskell Data Analysis Cookbook has more theory-heavy topics (graphs and trees!), whilst the Clojure Data Analysis Cookbook has more "engineering" topics (concurrency, parallelism, and distributed computing).

Neither book is a comprehensive treatise on its topic, but someone who needs a practical refresher on working with graphs and trees may find the Haskell Data Analysis Cookbook quite useful.

All in all, I'd say this is a decent book: if you have some familiarity with Haskell, have some familiarity with basic technologies like JSON, MongoDB, or SQLite, have taken a class or two on data analysis or machine learning in university (or a MOOC?), and aren't expecting a lot of hand-holding, then this book is a great guide to start you off doing some data analysis with Haskell.

Permalink

Clojure Gazette 1.91


core.async, Thinking, and Invention

Clojure Gazette

Issue 1.91 August 31, 2014


Editorial

Hi Gazette Readers,

I've been thinking a bit about bias recently. I've started putting ads in the Gazette and I think the issue of bias needs to be addressed.

The Clojure Gazette is me. It's my voice and my interests. I post things I come across and that I think are worth sharing. You subscribe to hear my bias, my particular perspective.

I try to bring this same perspective to the selection and presentation of the sponsors. It's not only about choosing who to show, but also choosing what to talk about out of all the cool stuff they are doing.

If you're subscribed, you must like my bias :) Thanks!

Rock on!
Eric Normand <ericwnormand@gmail.com> @ericnormand

PS Learn more about the Clojure Gazette and subscribe.

Sponsor: Factual

 
Factual is a location platform that enables development of personalized, contextually relevant experiences in a mobile world. Factual is hiring outstanding people in Los Angeles, San Francisco, and Shanghai to help solve the following problem: How to organize ALL of the world's data. 
 
You could join a team of excellent developers at a growing company with a casual, start-up vibe. Thanks, Factual, for supporting the Gazette. Now go apply

SICP Distilled Kickstarter


Tom Hall is doing a Kickstarter to write a book as a companion to Structure and Interpretation of Computer Programs. Kickstarter projects are always a risk, but what convinced me about this one is that Tom Hall seems to know of and maybe have read the Feynman Lectures on Physics. It now seems to be at a pay-what-you-want price. DISCUSS

The Hitchhiker's Guide to the Curry-Howard Correspondence


Chris Ford is a great speaker. This talk shows some of the cool stuff you can do if you have a formal, static type system in your language. If you've ever wondered how logic relates to functional programming, check this out. Did you know that function application is Modus ponens? DISCUSS

Clojure for the Brave and True


One of the great resources for beginners out there. It just has this great personality that oozes through. You can buy a digital edition that works on your Kindle, Desktop, and iPad. It supports a good cause. And you'll get updates as they come out. DISCUSS

core.async webinar recording


Well, I was sad to miss this webinar hosted by David Nolen. But they recorded it and here it is. DISCUSS

Implementation details of core.async


Rich Hickey explaining the inner workings of core.async. There are lots of constraints on the design. It's a really interesting talk. I listened to it twice and may have to listen again :) DISCUSS

Clojure eXchange 2014


Early Bird tickets are available for this London event. The discount is steep! I've never been to Clojure eXchange, but I watched many videos and there's always something good going on. DISCUSS

Thinking for Programmers


Leslie Lamport, winner of the Turing Award, teaches you how to think. Two things he relies on in his mind: pure functions and sequences of values (instead of mutation). Sound familiar? DISCUSS

Invention, Innovation and ClojureScript


David Nolen talks about ClojureScript, Om, immutability, and the future. DISCUSS

EuroClojure 2014 Videos


I guess I've been watching a lot of EuroClojure videos, trying to catch up. I haven't seen them all, so if you like to watch conference talks, jump right in! DISCUSS
Copyright © 2014 LispCast, All rights reserved.



Permalink

Introducing reagent-forms

One thing I’ve always found to be particularly tedious is having to create data bindings for form elements. Reagent makes this much more palatable than most libraries I’ve used. All you need to do is create an atom and use it to track the state of the components.

However, creating the components and binding them to the state atom is still a manual affair. I decided to see if I could roll this into a library that would provide a simple abstraction for tracking the state of the fields in forms.
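
For context, here’s a minimal sketch of the manual approach described above (my own example, not the reagent-forms API): a Reagent atom holds the form state, and each input component reads from and writes to it.

(ns example.form
  (:require [reagent.core :as r]))

;; One atom tracks the whole form's state.
(defonce form-state (r/atom {:name ""}))

;; A text input bound by hand: its value comes from the atom,
;; and every change writes back into it.
(defn name-input []
  [:input {:type "text"
           :value (:name @form-state)
           :on-change #(swap! form-state assoc :name (-> % .-target .-value))}])

(defn form []
  [:div
   [name-input]
   [:p "Hello, " (:name @form-state)]])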

[...]

Permalink

Meltdown 1.1.0-beta1 is released

TL;DR

Meltdown is a Clojure interface to Reactor, an asynchronous programming, event passing and stream processing toolkit for the JVM.

1.1.0-beta1 is a development milestone that updates Reactor to the most recent point release.

Changes between 1.0.0 and 1.1.0

Reactor Update

Reactor is updated to 1.1.x.
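
If you want to try the beta, the Leiningen dependency should look roughly like this (a sketch; double-check the exact coordinates and version against the project README):

;; project.clj dependency (coordinates assumed; verify in the README)
[clojurewerkz/meltdown "1.1.0-beta1"]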

Change log

The Meltdown change log is available on GitHub.

Meltdown is a ClojureWerkz Project

Meltdown is part of the group of libraries known as ClojureWerkz, together with

  • Langohr, a Clojure client for RabbitMQ that embraces the AMQP 0.9.1 model
  • Elastisch, a Clojure client for ElasticSearch
  • Monger, a Clojure MongoDB client for a more civilized age
  • Cassaforte, a Clojure Cassandra client
  • Titanium, a Clojure graph library
  • Neocons, a client for the Neo4J REST API
  • Quartzite, a powerful scheduling library

and several others. If you like Meltdown, you may also like our other projects.

Let us know what you think on Twitter or on the Clojure mailing list.

About the Author

Michael on behalf of the ClojureWerkz Team

Permalink

CSP and transducers in JavaScript

Learning Clojure has introduced me to some really fascinating ideas. I really believe in the importance of trying new things, so I've been playing with two of them — an old idea and a new one: the Clojure/core.async interpretation of C. A. R. Hoare's Communicating Sequential Processes (CSP), and Rich Hickey's transducers, coming soon to a Clojure near you.

Hopefully, this post will serve two purposes: to solidify these ideas in my mind by explaining them, and — by proxy — help someone else to understand. To weed out mistakes and weaknesses in my own thinking, I'm pretty explicit about each small conceptual step, particularly when it comes to transducers. It's a long one, but hopefully useful!

tl;dr: There's code on GitHub.

First, a quick introduction to the mass of prior work here...

CSP?

CSP is a formalised way to describe communication in concurrent systems. If that sounds a little dry, it's because it is — but like many a snore-inducing concept, when hurled at problems in the real world things get a whole lot more interesting. A bit like yoghurt.

core.async?

Just over a year ago an implementation of CSP called core.async was released to the Clojure community, offering "facilities for async programming and communication." It introduced channels, a simple way to coordinate entities in a system. The library is compile-target agnostic so it can also be used from ClojureScript.

Transducers?

The most recent development in this epic saga (spanning almost 40 years of computing history!) is transducers, a "powerful and composable way to build algorithmic transformations". Again dry, but very powerful in use — and very hard for me to understand!

Talks about how these concepts tie together fascinated me, and I've been toying with the ideas using ClojureScript and David Nolen's excellent Om framework. In addition, these ideas tie closely with Twitter's Flight framework on which I work.

However I've never felt truly comfortable with what's going on under the hood, and since the best way to learn anything is to do it yourself, I've been experimenting!

Oh, and just quickly — I'm not going to spend very much time on why you might want this stuff. Many of the links above will help.

What's the problem?

There's a whole stack of ideas that combine to make channels and transducers valuable, but I'll pick just one: events are a bad primitive for data flow. They require distribution of mutable state around your code, and it's not idiomatic or pleasant to flow data through events:

pubsub.on('users:response', function (users) {
    users
        .filter(function (user) {
            return !user.muted;
        })
        .forEach(function (user) {
            pubsub.emit('posts:request', {
                user: user.id
            });
        });
});

pubsub.on('posts:response', function (data) {
    ...
});

pubsub.emit('users:request');

Events are fine for one-shot notifications, but break down when you want to coordinate data from a number of sources. Event handlers tend to not be very reusable or composable.

core.async's channels offer an alternative that is ideal for flow control, reuse and composability.

I'll leave it to David Nolen to show you why.

Channels in JavaScript

The first step was to implement the core.async primitive — channels — and their fundamental operations: put and take.

Channels are pretty simple: they support producers and consumers that put values to, and take values from, the channel. The default behaviour is "one-in, one-out" — a take from the channel will give you only the least-recently put value, and you have to explicitly take again to get the next value. They're like queues.

It's immediately obvious that this decouples the producer and consumer – they each only have to know about the channel to communicate, and it's many-to-many: multiple producers can put values for multiple consumers to take.

I'm not going to detail the exact implementation here, but making a new channel is as simple as asking for one: var c = chan().

You can try channels out in this JS Bin:

JS Bin

If you get errors, make sure to click 'Run'.

Stuck for ideas? Try:

> c = chan()
...
> chan.put(c, 10)
...
> chan.take(c, console.log.bind(console, 'got: '))
got: 10
...
> chan.take(c, console.log.bind(console, 'got: '))
...
> chan.put(c, 20)
got: 20
...

Nice. I've added a few upgrades, but fundamentally things stay the same.

By the way... these ideas are firmly rooted in functional programming, so I'm avoiding methods defined on objects where possible, instead preferring functions that operate on simple data structures.

We have working channels!

Transducers in JS

Above, transducers were described as a "powerful and composable way to build algorithmic transformations." While enticing, this doesn't really tell us much. Rich Hickey's blog post, from which that quote is taken, expands somewhat but I still found them very hard to comprehend.

In fact, understanding them meant spending hours frustratedly scribbling on a mirror with a whiteboard pen.

To me, transducers are a generic and composable way to operate on a collection of values, producing a new value or new collection of new values. The word 'transducer' itself can be split into two parts that reflect this definition: 'transform' — to produce some value from another — and 'reducer' — to combine the values of a data structure to produce a new one.

To understand transducers I built up to them from first principles by taking a concrete example and incrementally making it more generic, and that's what we're going to do now.

We're aiming for a "composable way to build algorithmic transformations."

I hope you're excited.

popcorn

From the bottom to the top...

First, we have to realise that many array (or other collection) operations like map, filter and reverse can be defined in terms of a reduce.

To start with, here's an example that maps over an array to increment all its values:

[1,2,3,4].map(function (input) {
    return input + 1;
}) // => [2,3,4,5]

Pretty simple. Note that two things are implicit here:

  • The return value is built up from a new, empty array.
  • Each value returned is added to the end of the new array as you would do manually using JavaScript's concat.

With this in mind, we can convert the example to use .reduce:

[1,2,3,4].reduce(function (result, input) {
    return concat(result, input + 1);
}, []) // => [2,3,4,5]

To get around JavaScript's unfortunate Array concat behaviour, I've redefined it to a function called concat that adds a single value to an array:

function concat(a, b) {
    return a.concat([b]);
}

By the way, we're about to get into higher-order function territory. If that makes you queasy, it might be time to do some reading and come back later!

Our increment-map-using-reduce example isn't very generic, but we can make it more so by wrapping it up in a function that takes an array to be incremented:

function mapWithIncrement(collection) {
    return collection.reduce(function (result, input) {
        return concat(result, input + 1);
    }, []);
}

mapWithIncrement([1,2,3,4]) // => [2,3,4,5]

This can be taken a step further by passing the transformation as a function. We'll make one called inc:

function inc(x) {
    return x + 1;
}

Using this with any collection requires another higher-order function, map, that combines the transform and the collection.

This is where things start to get interesting: this function contains the essence of what it means to map — we reduce one collection to another by transforming the values and concatenating the results together.

function map(transform, collection) {
    return collection.reduce(function (result, input) {
        return concat(result, transform(input));
    }, []);
}

In use, it looks like this:

map(inc, [1,2,3,4]) // => [2,3,4,5]

Very nice.

Algorithmic transformations

So what's the next abstraction in our chain? It's perhaps worth restating the goal: a "composable way to build algorithmic transformations." There are two key phrases there: "algorithmic transformations" and "composable". We'll deal with them in that order.

map, defined above, is a kind of algorithmic transformation. Another I mentioned earlier is filter, so let's define that in the same way we did for map.

Filter better fits the word "reduce" because it can actually produce fewer values than it was given.

We're going to quickly jump from a concrete example, through the reduce version, to a generic filter function that defines the essence of what it means to filter.

// Basic filter
[1,2,3,4].filter(function (input) {
    return (input > 2);
}) // => [3,4]

// Filter with reduce
[1,2,3,4].reduce(function (result, input) {
    return (
        input > 2 ?
            concat(result, input) :
            result
    );
}, []) // => [3,4]

// Transform (called the predicate)
function greaterThanTwo(x) {
    return (x > 2);
}

// And finally, filter as function
function filter(predicate, collection) {
    return collection.reduce(function (result, input) {
        return (
            predicate(input) ?
                concat(result, input) :
                result
        );
    }, [])
}

In use, it looks like this:

filter(greaterThanTwo, [1,2,3,4]) // => [3,4]

Composable

Now that we can construct a couple of different algorithmic transformations, we're missing the "composable" bit from that original definition. We should fix that.

How does composability apply to the algorithmic transformations we've already defined — map and filter? There are two ways to combine these transformations:

  • Perform the first transformation on the whole collection before moving on to the second.
  • Perform all transformations on the first element of the collection before moving on to the second.

We can already do the former:

filter(greaterThanTwo, map(inc, [1,2,3,4])) // => [3,4,5]

We can even use compose:

var incrementAndFilter = compose(
    filter.bind(null, greaterThanTwo),
    map.bind(null, inc)
);

incrementAndFilter([1,2,3,4]) // => [3,4,5]

However, this has a number of issues:

  • It cannot be parallelised.
  • It cannot be done lazily.
  • The operations are tied very closely to the input and output data structures.

The converse is true for the latter way of combining the transformations, so it is the much more desirable end result.

For a discussion of why this is the case, look into the fork-join model.

Frankly, I found this extremely difficult; I just couldn't understand how they could be composed generically.

Time to dig deeper, and talk about reducing functions.

Reducing functions

A reducing function is any function that can be passed to reduce. They have the form: (something, input) -> something. They're the inner-most function in the map and filter examples.

These are the things we need to be composing, but right now they are hidden away in map and filter.

function map(transform, collection) {
    return collection.reduce(
        // Reducing function!
        function (result, input) {
            return concat(result, transform(input));
        },
        []
    );
}

function filter(predicate, collection) {
    return collection.reduce(
        // Reducing function!
        function (result, input) {
            return (
                predicate(input) ?
                    concat(result, input) :
                    result
            );
        },
        []
    )
}

To get at the reducing functions, we need to make map and filter more generic by extracting the pieces they have in common:

  • Use of collection.reduce
  • The 'seed' value is an empty array
  • The concat operation performed on result and the input (transform-ed or not)

First, let's pull out the use of collection.reduce and the seed value. Instead we can produce reducing functions and pass them to .reduce:

function mapper(transform) {
    return function (result, input) {
        return concat(result, transform(input));
    };
}

function filterer(predicate) {
    return function (result, input) {
        return (
            predicate(input) ?
                concat(result, input) :
                result
        );
    };
}

[1,2,3,4].reduce(mapper(inc), []) // => [2,3,4,5]
[1,2,3,4].reduce(filterer(greaterThanTwo), []) // => [3,4]

Nice! We're getting closer but we still cannot compose two or more reducing functions. The last piece of shared functionality is the key: the concat operation performed on result and the input.

Remember we said that reducing functions have the form (something, input) -> something? Well, concat is just one such function:

function concat(a, b) {
    return a.concat([b]);
}

That means there's actually two reducing functions:

  • One that defines the job (mapping, filtering, reversing...)
  • Another that, within the job, combines the existing result with the input

So far we have only used concat for the latter, but who says we have to? Could we use another, completely different reducing function – like, say, one produced from mapper?

Yes, we could.


To build up to composing our reducing functions we'll start with a very explicit example, rewriting filterer to use mapper to combine the result with the input, and exploring how the data flows around.

Before we do that, we need a new function: identity. It simply returns whatever it is given:

function identity(x) {
    return x;
}

[1,2,3,4].reduce(mapper(identity), []) // => [1,2,3,4]

We can rewrite filter to use mapper quite easily:

function lessThanThree(x) {
    return (x < 3);
}

function mapper(transform) {
    return function (result, input) {
        return concat(result, transform(input));
    };
}

function filterer(predicate) {
    return function (result, input) {
        return (
            predicate(input) ?
                mapper(identity)(result, input) :
                result
        );
    };
}

[1,2,3,4].reduce(filterer(lessThanThree), []) // => [1,2]

To see how this works, let's step through it:

  1. filterer(lessThanThree) produces a reducing function which is passed to .reduce.
  2. The reducing function is passed result — currently [] — and the first input, 1.
  3. The predicate is called and returns true, so the first expression in the ternary is evaluated.
  4. mapper(identity) returns a reducing function, which is then called with [] and 1.
    1. The reducing function's transform function — identity — is called, returning the same input it was given.
    2. The input is concat-ed onto the result and returned.
  5. The new result — now [1] — is returned, and so the reduce cycle continues.

I'd recommend running this code and looking for yourself!

What has this gained us? Well, now we can see that a reducing function can make use of another reducing function – it doesn't have to be concat!

In fact, if we altered filterer to use mapper(inc), we'd get:

[1,2,3,4].reduce(filterer(lessThanThree), []) // => [2,3]

This is starting to feel a lot like composable algorithmic transformation, but we don't want to be manually writing composed functions – we want to use compose!

If we pull out the inner reducing function (the combiner), we make reducing functions that express the essence of their job without being tied to any particular way of combining their arguments.

We'll change the names again to express the nature of what's going on here:

function mapping(transform) {
    return function (reduce) {
        return function (result, input) {
            return reduce(result, transform(input));
        };
    };
}

function filtering(predicate) {
    return function (reduce) {
        return function (result, input) {
            return (
                predicate(input) ?
                    reduce(result, input) :
                    result
            );
        };
    };
}

Those new inner functions – the ones that take a reduce function — are transducers. They encapsulate some reducing behaviour without caring about the nature of the result data structure.

In fact, we've offloaded the responsibility of combining the transformed input with the result to the user of the transducer, rather than expressing it within the reducing function. This means we can reduce generically into any data structure!

Let's see this in use by creating that filtering-and-incrementing transducer again:

var filterLessThanThreeAndIncrement = compose(
    filtering(lessThanThree),
    mapping(inc)
);

[1,2,3,4].reduce(filterLessThanThreeAndIncrement(concat), []) // => [2,3]

Wow. Notice:

  • We only specify the seed data structure once, when we use the transducer.
  • We only tell the transducers how to combine their input with the result once (in this case, with concat), by passing it to the filterLessThanThreeAndIncrement transducer.

To prove that this works, let's turn it into an object with the resulting values as keys without altering the reducing functions.

[1,2,3,4].reduce(filterLessThanThreeAndIncrement(function (result, input) {
    result[input] = true;
    return result;
}), {}) // => { 2: true, 3: true }

Woo!


Let's try it with some more complex data. Say we have some posts:

var posts = [
    { author: 'Agatha',  text: 'just setting up my pstr' },
    { author: 'Bert',    text: 'Ed Balls' },
    { author: 'Agatha',  text: '@Bert fancy a thumb war?' },
    { author: 'Charles', text: '#subtweet' },
    { author: 'Bert',    text: 'Ed Balls' },
    { author: 'Agatha',  text: '@Bert m(' }
];

Let's pull out who's talked to who and build a graph-like data structure.

function graph(result, input) {
    result[input.from] = result[input.from] || [];
    result[input.from].push(input.to);
    return result;
}

var extractMentions = compose(
    // Find mentions
    filtering(function (post) {
        return post.text.match(/^@/);
    }),
    // Build object with {from, to} keys
    mapping(function (post) {
        return {
            from: post.author,
            to: post.text.split(' ').slice(0,1).join('').replace(/^@/, '')
        };
    })
);

posts.reduce(extractMentions(graph), {}) /* =>
    { Agatha: ['Bert', 'Bert'] } */

Applying transducers to channels

Now we have all the parts of a "composable way to build algorithmic transformations" we can start applying them to any data pipeline – so let's try channels. Again, I'm not going to show you the channel-level implementation, just some usage examples.

We're going to listen for DOM events and put them into a channel that filters only those that occur on even x & y positions and maps them into a triple of [type, x, y].

First, two additions to our function library:

// Put DOM events into the supplied channel
function listen(elem, type, c) {
    elem.addEventListener(type, function (e) {
        chan.put(c, e);
    });
}

function even(x) {
    return (x % 2 === 0);
}

Now let's create a channel, and pass it a transducer. The transducer will be used to reduce the data that comes down the channel.

var c = chan(
    1, // Fixed buffer size (only one event allowed)
    compose(
        // Only events with even x & y
        filtering(function (e) {
            return (
                even(e.pageX) &&
                even(e.pageY)
            );
        }),
        // e -> [type, x, y]
        mapping(function (e) {
            return [e.type, e.pageX, e.pageY];
        })
    )
);

Next we'll hook-up the events and the channel:

listen(document, 'mousemove', c);

And, finally, take in a recursive loop:

(function recur() {
    chan.take(c, function (v) {
        console.log('got', v);
        recur()
    });
}());

Running this code, you should see lots of events in your console – but only those with even x & y positions:

> got ["mousemove", 230, 156]
> got ["mousemove", 232, 158]
> got ["mousemove", 232, 160]
> got ["mousemove", 234, 162]

Stateful transducers

Finally, let's take a look at a stateful transducer, building a gateFilter to detect "dragging" using mousedown and mouseup events, and a keyFilter that matches against a property of the channel data.

function gateFilter(opener, closer) {
    var open = false;
    return function (e) {
        if (e.type === opener) {
            open = true;
        }
        if (e.type === closer) {
            open = false;
        }
        return open;
    };
}

function keyFilter(key, value) {
    return function (e) {
        return (e[key] === value);
    };
}

var c = chan(
    1,
    compose(
        // Only allow through when mouse has been down
        filtering(gateFilter('mousedown', 'mouseup')),
        // Filter by e.type === 'mousemove'
        filtering(keyFilter('type', 'mousemove')),
        // e -> [x, y]
        mapping(function (e) {
            return [e.pageX, e.pageY];
        })
    )
);

// Listen for relevant events
listen(document, 'mousemove', c);
listen(document, 'mouseup',   c);
listen(document, 'mousedown', c);

// Take in a loop
(function recur() {
    chan.take(c, function (v) {
        console.log('got', v);
        recur()
    });
}());

Whew. Pretty cool, eh?

And finally...

I think there's a great deal of expressive power here, particularly in making it easy to reason about data flow in large applications.

My real goal is to explore the Actor model as it relates to front-end engineering, particularly in preventing an explosion of complexity with increasing scale. It's the model Flight uses, but I'm not wholly convinced events — while perfect for one-shot notifications — are the right primitive for coordinating behaviour and flow-control.

The result of this work is on GitHub, so please do check that out, and email or tweet me with feedback.

Finally finally, a massive thank-you to Stuart & Passy who gave me top-notch feedback on this article!

Permalink

Using core.async for Producer-Consumer Workflows

I've found core.async to be versatile for many workflows. One that I've used on several occasions is a producer-consumer model to distribute work among multiple consumers.

Say we want to process a lot of data coming in from a single source (e.g. stdin) and then output the results to a single destination (e.g. stdout). We can think of this as a producer-consumer problem.

A naive solution may look like this:

(defn process
  "Do 'work'"
  [line]
  (Thread/sleep 10))

(def stdin-reader
  (java.io.BufferedReader. *in*))

;; Read each line from stdin
(doseq [line (line-seq stdin-reader)]
  (process line)
  (println line))

But if process is expensive, we would find this program unacceptably slow. On my machine, this baseline program takes 115.5 seconds using 9% of the CPU to process 10,000 lines.

How can we speed this up?

It's apparent that our program isn't taking advantage of our CPU—we're only using 9% of it. We should deploy threads! Perhaps we can use java.util.concurrent.Executors to deploy a pool of workers, where each worker outputs their work into a shared Java concurrent queue. Another thread can then consume from this queue to print to stdout.
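
That Executors-plus-queue approach might look roughly like the sketch below (my own rough version of the idea being described, not code from this post; the pool size and queue type are assumptions):

(import '(java.util.concurrent Executors LinkedBlockingQueue))

;; A fixed pool of workers processes lines and puts results onto a
;; shared blocking queue; a single printer thread drains the queue.
(def results (LinkedBlockingQueue.))
(def pool (Executors/newFixedThreadPool 8))

;; Printer thread: take from the queue and print, forever.
(doto (Thread. (fn [] (while true (println (.take results)))))
  (.setDaemon true)
  (.start))

;; Submit one task per line of input.
(doseq [line (line-seq stdin-reader)]
  (.execute pool (fn [] (.put results (process line)))))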

A simple alternative

A simple alternative is to use core.async. Using core.async's thread macro, we can create consumers that take data from an input channel, process the data, and put the result to an output channel.

To do this, we'll first create our two channels:

(ns my-test-ns
  (:require [clojure.core.async :as async]))

(defn process [line]
  (Thread/sleep 10)
  line)

(def stdin-reader
  (java.io.BufferedReader. *in*))

(def in-chan (async/chan))
(def out-chan (async/chan))

Then, using thread, we'll create consumers that live in their own threads so that they can do blocking takes and puts. We could use a go block here, which would employ its own thread pool, but I've found that go threads result in less throughput than dedicated threads for CPU-heavy work. Our consumers look like this:

(defn start-async-consumers
  "Start num-consumers threads that will consume work
  from the in-chan and put the results into the out-chan."
  [num-consumers]
  (dotimes [_ num-consumers]
    (async/thread
      (while true
        (let [line (async/<!! in-chan)
              data (process line)]
          (async/>!! out-chan data))))))

Note that async/thread immediately returns a channel that will receive the result of the body. We ignore the return value, however, since our consumers are long-lived.

Then we'll print out the processed items by taking from the output channel:

(defn start-async-aggregator
  "Take items from the out-chan and print it."
  []
  (async/thread
    (while true
      (let [data (async/<!! out-chan)]
        (println data)))))

Finally we can start our program:

(do
  (start-async-consumers 8)
  (start-async-aggregator)
  (doseq [line (line-seq stdin-reader)]
    (async/>!! in-chan line)))

This core.async version ran in 20.086 seconds using 54% of the CPU to process 10,000 lines. Compared to the 115.5 seconds of the baseline program, this version is almost six times faster.

Naturally, there are cases where we should prefer Java's Executors library to core.async. In fact, thread does use Executors under the hood. But why not take advantage of core.async? You'll get the benefits of core.async channels and operations on top of thread's simplicity for producer-consumer workflows.

For further study

Rich Hickey: Clojure core.async Channels

Timothy Baldridge: Core.Async (video)

Source and results

Get the source code here.

# Set up input file
repeat 10000 echo "." >> input

# Run inline
time lein run inline < input > output
# 9.56s user 1.54s system 9% cpu 1:55.50 total

# Run async
time lein run async < input > output
# 9.30s user 1.57s system 54% cpu 20.086 total

Permalink

Fastest json parser

I just completed the first stable version of my JSON parser, which beats jackson, clj-json, data.json, cheshire and the like by seconds and takes half the time the boon parser takes.

see:

https://github.com/gerritjvv/pjson

JSON objects are APersistentMap instances and JSON arrays are APersistentVector instances, for 100% Clojure integration.

Permalink

Clojurescript, Om and ReactJS Talk for Portland JavaScript Admirers

These are my slides and speaker notes for a short talk I did at the Portland JavaScript Admirers meetup meeting on Wed Sep 27th 2014. The meeting started out with Jesse Hallett introducing ReactJS to the audience of JS programmers. Wat? Clojure – Modern LISP targeting JVM Clojurescript – Clojure adapted for JavaScript Om – […]

The post Clojurescript, Om and ReactJS Talk for Portland JavaScript Admirers appeared first on E-String.

Permalink

SICP Distilled

SICP Distilled is one of the most interesting Kickstarter projects I’ve seen in a while.

Its creator Tom Hall is planning to create some nice companion resources for SICP with code examples in Clojure. In his own words:

It’s a long book, with lots of exercises and lots of people I know have started, loved it, but somehow not finished.

Abelson and Sussman themselves highlight the important lessons of SICP in their paper Lisp: A Language For Stratified Design and I have my own favourite bits.

As the book itself is available online for free I want to make the perfect accompaniment to it – an ebook summarising the key ideas, short videos describing them, screencasts of solving some of the exercises, translation of the examples into Clojure, example projects, partial solutions for you to complete (similar to 4clojure and Clojure koans) and a place to discuss solving them with people and hopefully keep momentum and get it finished!

Something to be enjoyed alongside SICP, rather than completely replace it.

Maybe some ideas come out a little different in Clojure, or I take a slightly novel approach (hence idiosyncratic), maybe I miss something out (hence tour, sorry), but I hope we can have some fun along the way.

Tom Hall SICP Distilled

I’m one of those many people who never did finish SICP (although I hope to do that some day), so I can totally relate to Tom’s words. I’ve already backed his campaign and I hope more of you will do the same!

Permalink

Copyright © 2009, Planet Clojure. No rights reserved.
Planet Clojure is maintained by Baishampayan Ghose.
Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
Theme by Brajeshwar.