What the Reagent Component?!

Did you know that when you write a form-1, form-2 or form-3 Reagent component they all default to becoming React class components?

For example, if you were to write this form-1 Reagent component:

(defn welcome []
  [:h1 "Hello, friend"])

By the time Reagent passes it to React it would be the equivalent of you writing this:

class Welcome extends React.Component {
  render() {
    return <h1>Hello, friend</h1>
  }
}

Okay, so Reagent components become React class components. Why do we care? Because this depth of understanding means we can better understand what Reagent is doing on our behalf, and the result of all of this "fundamental" learning is that we can more effectively harness JavaScript from within ClojureScript.

A Pseudoclassical Pattern

The reason all of your Reagent components become class components is because all of the code you pass to Reagent is run through an internal Reagent function called create-class. The interesting part of this is how create-class uses JavaScript mechanics to transform the Reagent component you wrote into something that is recognized as a React class component. Before we look into what create-class is doing, it's helpful to review how "classes" work in JavaScript.
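
(As an aside: create-class isn't only internal plumbing. Reagent also exposes it publicly, and it's what you reach for when writing a form-3 component. A minimal sketch using Reagent's lifecycle-map style:)

(ns example.core
  (:require [reagent.core :as r]))

(defn welcome []
  (r/create-class
   {:display-name "welcome"
    ;; lifecycle methods are supplied as map entries
    :component-did-mount
    (fn [this] (js/console.log "welcome mounted"))
    ;; the render function, written as ordinary hiccup
    :reagent-render
    (fn [] [:h1 "Hello, friend"])}))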

Prior to ES6, JavaScript did not have classes, and this made some JS developers sad because classes are a common pattern used to structure one's code and provide mechanisms for:

  • instantiation
  • inheritance
  • polymorphism

But as I said, prior to ES6, JavaScript did not have a formal syntax for "classes". This led the JavaScript community to develop a series of instantiation patterns to help simulate classes.

Of all of these patterns, the pseudoclassical instantiation pattern became one of the most popular ways to simulate a class in JavaScript. This is evidenced by the fact that many of the "first generation" JavaScript libraries and frameworks, like the Google Closure Library and Backbone, are written in this style.

The reason we are going over this history is that a programming language has both "patterns" and "syntax". The challenge with "patterns" is:

  • They are disseminated culturally
  • They are often not easy to search
  • They often require a deeper understanding of the language and the problem being solved before you can appreciate why the pattern became accepted.

The last point is relevant to our conversation because patterns ultimately make assumptions: assumptions about our understanding of the problem being solved and about where and when a pattern should be used. The end result is that a pattern can become a "thing" we just do, all while forgetting why we started doing it in the first place or what the world could look like without it.

For example, the most common way of writing a React class component is to use ES6 class syntax. But did you know that ES6 class syntax is little more than syntactic sugar around the pseudoclassical instantiation pattern?

For example, you can write a valid React class component using the pseudoclassical instantiation pattern like this:

// 1. define a function (component) called `Welcome`
function Welcome(props, context, updater) {
  React.Component.call(this, props, context, updater)

  return this
}

// 2. connect `Welcome` to the `React.Component` prototype
Welcome.prototype = Object.create(React.Component.prototype)

// 3. re-define the `constructor`
Object.defineProperty(Welcome.prototype, 'constructor', {
  enumerable: false,
  writable: true,
  configurable: true,
  value: Welcome,
})

// 4. define your React components `render` method
Welcome.prototype.render = function render() {
  return <h1>Hello, Reagent</h1>
}

While the above is a valid React class component, it's also verbose and error-prone. For these reasons, ES6 introduced class syntax to the language:

class Welcome extends React.Component {
  render() {
    return <h1>Hello, Reagent</h1>
  }
}

For those looking for further evidence, we can support the claim that ES6 classes produce the same thing as the pseudoclassical instantiation pattern by using JavaScript's built-in introspection tools to compare the two.

pseudoclassical instantiation pattern:

function Welcome(props, context, updater) {
  React.Component.call(this, props, context, updater)

  return this
}

// ...repeat steps 2 - 4 from above before completing the rest

var welcome = new Welcome()

Welcome.prototype instanceof React.Component
// => true

Object.getPrototypeOf(Welcome.prototype) === React.Component.prototype
// => true

welcome instanceof React.Component
// => true

welcome instanceof Welcome
// => true

Object.getPrototypeOf(welcome) === Welcome.prototype
// => true

React.Component.prototype.isPrototypeOf(welcome)
// => true

Welcome.prototype.isPrototypeOf(welcome)
// => true

ES6 class:

class Welcome extends React.Component {
  render() {
    console.log('ES6 Inheritance')
  }
}

var welcome = new Welcome()

Welcome.prototype instanceof React.Component
// => true

Object.getPrototypeOf(Welcome.prototype) === React.Component.prototype
// => true

welcome instanceof React.Component
// => true

welcome instanceof Welcome
// => true

Object.getPrototypeOf(welcome) === Welcome.prototype
// => true

React.Component.prototype.isPrototypeOf(welcome)
// => true

Welcome.prototype.isPrototypeOf(welcome)
// => true

What does all of this mean? As far as JavaScript and React are concerned, both definitions of the Welcome component are valid React class components.

With this in mind, let's look at Reagent's create-class function and see what it does.

What Reagent Does

The history lesson from the above section is important because create-class uses a modified version of the pseudoclassical instantiation pattern. Let's take a look at what we mean.

The following code sample is a simplified version of Reagent's create-class function:

function cmp(props, context, updater) {
  React.Component.call(this, props, context, updater)

  return this
}

goog.extend(cmp.prototype, React.Component.prototype, classMethods)

goog.extend(cmp, React.Component, staticMethods)

cmp.prototype.constructor = cmp

What we have above is Reagent's take on the pseudoclassical instantiation pattern, with a few minor tweaks:

// 1. we copy the properties + methods of React.Component
goog.extend(cmp.prototype, React.Component.prototype, classMethods)

goog.extend(cmp, React.Component, staticMethods)

// 2. the constructor is not as "thorough"
cmp.prototype.constructor = cmp

Exploring point 1, we see that Reagent has opted to copy the properties and methods of React.Component directly onto the Reagent components we write. That is what's happening here:

goog.extend(cmp.prototype, React.Component.prototype, classMethods)

If we were using the traditional pseudoclassical approach we would instead do this:

cmp.prototype = Object.create(React.Component.prototype)

Thus, the difference is that Reagent's approach copies all the methods and properties from React.Component to the cmp prototype, whereas the traditional approach links the cmp prototype to the React.Component prototype. The benefit of linking is that each time you instantiate a Welcome component, the Welcome component does not need to re-create all of React.Component's methods and properties.

Exploring the second point, Reagent is doing this:

cmp.prototype.constructor = cmp

whereas with the traditional pseudoclassical approach we would instead do this:

Object.defineProperty(Welcome.prototype, 'constructor', {
  enumerable: false,
  writable: true,
  configurable: true,
  value: Welcome,
})

The difference between the above approaches is that if we just use = as Reagent does, we create an enumerable constructor. This can have implications depending on who consumes our classes, but in our case we know that only React is going to consume our class components, so we can do this with relative confidence.

What are some of the more interesting results of the above two Reagent modifications? First, if React depended on JavaScript introspection to tell whether or not a component is a child of React.Component, we would not be happy campers:

Welcome.prototype instanceof React.Component
// => false...Welcome is not a child of React.Component

Object.getPrototypeOf(Welcome.prototype) === React.Component.prototype
// => false...React.Component is not part of Welcome's prototype chain

welcome instanceof React.Component
// => false...Welcome is not an instance of React.Component

welcome instanceof Welcome
// => true...welcome is a child of Welcome

Object.getPrototypeOf(welcome) === Welcome.prototype
// => true...welcome is linked to Welcome prototype

React.Component.prototype.isPrototypeOf(welcome)
// => false...React.Component.prototype is not in welcome's prototype chain

Welcome.prototype.isPrototypeOf(welcome)
// => true...Welcome.prototype is in welcome's prototype chain

What the above shows is that Welcome is not a child of React.Component, even though it has all the properties and methods that React.Component has. This is why we're lucky that React is smart about detecting class vs. function components.

Second, by copying rather than linking prototypes we could incur a performance cost, but again, in our case this cost is negligible.

Conclusion

I think it's important to dive into the weeds. In my experience, it's these detours and this thorough questioning of topics that have led to considerable improvements in my programming skill and general comfort with increasingly challenging topics.

However, I think the biggest thing for me is something I referenced a few times in this post: "cultural knowledge". I have come to see that it is one of the most powerful tools we have as a species. It's unfortunate that this kind of information is not always available, and my hope is that I could fill some of the gaps with this writing and maybe even open the door to works which can be built on top of it.

Less philosophically though, I find it encouraging to know that everything is, generally speaking, JavaScript under the hood. This is important because it allows us to take advantage of what has come before and really dig into interesting ways we can use and manipulate JS from within CLJS.


What are the Clojure Tools?

This post is about "getting" the Clojure Tools. The reason? They stumped me in the beginning, and I felt like if I could make someone's journey just a bit easier, that might be a good thing.

My Clojure learning journey started by asking questions like:

  • How do I install Clojure?
  • How do I run a Clojure program?
  • How do I manage Clojure packages (dependencies)?
  • How do I configure a Clojure project?
  • How do I build Clojure for production?

Now, when I started working with Clojure the answer to these questions was: choose either lein or boot. Then Rich Hickey and his ride or die homeboys rolled by and provided their own answer: the Clojure Tools.

Their vision, like the vision of Clojure itself, is a bit offbeat. So, this post is about reviewing the Clojure Tools and figuring out a mental model for them.

At a high level, the Clojure Tools currently consist of:

  • Clojure CLI
  • tools.build

The first is a CLI tool and the second is a Clojure library which provides some helper functions to make it easier to build Clojure artifacts. The rest of this post will dig into each of these tools.

Clojure CLI

The Clojure CLI is made up of the following subprograms:

  • clj/clojure - a bash script that runs Clojure programs
  • tools.deps.alpha - a Clojure library that resolves dependencies and builds the classpath
  • deps.edn - an edn configuration file

And here is what it looks like to use the Clojure CLI and some of the things it can do:

Run a Clojure repl

clj

Run a Clojure program

clj -M -m your-clojure-program

manage Clojure dependencies

clj -Sdeps '{:deps {bidi/bidi {:mvn/version "2.1.6"}}}'

The above is just the tip of the Clojure CLI iceberg. I have omitted more interesting examples so we can focus on the Clojure CLI at a higher level. In honor of said "high level" overview, the following sections will cover each of the Clojure CLI's subprograms.

clj/clojure

As we see above, the Clojure CLI is invoked by calling one of the two shell commands:

  • clj
  • clojure

When you read through the Official Deps and CLI Guide you will see that you can use either clj or clojure. clj is the recommended version, and when you start to look at open source code, you will see that both are used.

What's the difference between these two commands? clj is mainly used during development, while clojure is mainly used in production or CI environments. The reason for this is that clj is a light wrapper around the clojure command.

The clj command wraps the clojure command in another tool called rlwrap. rlwrap improves the developer experience by making it easier to type in the terminal while you're running your Clojure REPL.

The tradeoff for the convenience provided by clj is that clj introduces dependencies. This is a tradeoff because you may not have access to rlwrap in production. In addition, a tool like rlwrap can make it harder to compose the clj command with other tools. As a result, it's a common practice to use clojure in production/CI environments.

Now that we see that they're both more or less the same command, what do they do? clj/clojure has one job: run Clojure programs against a classpath. If you dig into the clj/clojure bash script you'll see that it ultimately calls a command like this:

java [java-opt*] -cp classpath clojure.main [init-opt*] [main-opt] [arg*]

The above might look like a simple command, but the value of clj/clojure is that you, as a new Clojure developer, don't have to manually build the classpath, figure out the exact right Java command to run, or make it all work across different environments (Windows, Linux, macOS, etc.).
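
You can peek at the classpath the CLI computes with the -Spath flag (the output below is illustrative and truncated):

clj -Spath
src:/Users/you/.m2/repository/org/clojure/clojure/1.10.3/clojure-1.10.3.jar:...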

In summary, clj/clojure is about running Clojure programs against a classpath, and it orchestrates other tools to do so. For example, in order to run against a classpath, there has to be a classpath. clj/clojure is not responsible for figuring out the classpath, though. That's a job for tools.deps.alpha.

tools.deps.alpha

tools.deps.alpha is a Clojure library responsible for managing your dependencies. What it does is:

  • reads in dependencies from a deps.edn file
  • resolves the dependencies and their transitive dependencies
  • builds a classpath

Note that I said it's a Clojure library. You don't have to be using the Clojure CLI in order to use this tool; you can use it by itself if you want to.
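
For example, here's a rough sketch of driving it from a REPL. The function names (resolve-deps, make-classpath) come from the library's API, but signatures have shifted between versions, so treat this as illustrative rather than definitive:

(require '[clojure.tools.deps.alpha :as deps])

;; a deps map, i.e. the same shape as a deps.edn file
(def deps-map
  '{:deps {bidi/bidi {:mvn/version "2.1.6"}}
    :mvn/repos {"central" {:url "https://repo1.maven.org/maven2/"}
                "clojars" {:url "https://repo.clojars.org/"}}})

;; resolve the dependency tree (downloading artifacts as needed)
(def lib-map (deps/resolve-deps deps-map nil))

;; build a classpath string from the resolved libraries
(deps/make-classpath lib-map ["src"] nil)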

What makes tools.deps.alpha so great is that it's a small and focused library. There isn't much more to say about this other than if you want to learn more about the history, development and goals of the tool from the Clojure team I recommend listening to this episode of Clojure Weekly Podcast which features Alex Miller, the author of tools.deps.alpha.

As noted above, the first thing tools.deps.alpha is going to do is read in your project configuration and deps. This information is stored in deps.edn.

deps.edn

The deps.edn file is a Clojure map with a specific structure. When you run clj/clojure, one of the first things it does is find a deps.edn file and read it in.

deps.edn is where you configure your project and specify project dependencies. At its heart, deps.edn is just an edn file. You can think of it as Clojure's version of package.json.

Here is an example of what a deps.edn file looks like:

{:deps    {...}
 :paths   [...]
 :aliases {...}}

As you can see, we use keywords like :deps, :paths and :aliases to describe the project and the dependencies it requires.
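
To make that concrete, a small project's deps.edn might look like this (the library and version choices here are purely illustrative):

{:paths ["src"]
 :deps  {org.clojure/clojure {:mvn/version "1.10.3"}
         bidi/bidi           {:mvn/version "2.1.6"}}
 :aliases
 {:dev {:extra-paths ["dev"]
        :extra-deps  {org.clojure/tools.namespace {:mvn/version "1.1.0"}}}}}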

Tools.Build

This is the newest Clojure tool. It's been in the works for a while and might be the simplest to understand conceptually: it's a Clojure library with functions that do things like build a jar, an uberjar, etc.
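
For a taste, a build script is just a Clojure file (conventionally build.clj) whose functions call into clojure.tools.build.api. Here's a minimal sketch; the option maps follow the documented API, but treat the details as illustrative:

(ns build
  (:require [clojure.tools.build.api :as b]))

(def class-dir "target/classes")

(defn jar [_]
  ;; copy sources into the output directory
  (b/copy-dir {:src-dirs ["src"] :target-dir class-dir})
  ;; zip the output directory up into a jar
  (b/jar {:class-dir class-dir
          :jar-file  "target/my-lib.jar"}))

Conventionally you'd wire this up behind a :build alias and invoke it with something like clj -T:build jar.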

One distinction that's important to note is that tools.build is not the same as the Clojure CLI tool's -T switch. I am calling this out now because when tools.build was released the Clojure CLI was also enhanced to provide the -T switch. As one can imagine, this could be seen as confusing because of the similarity of their names.

The best way that I can currently explain the -T switch is by saying that it's meant to be another level of convenience provided by the Clojure CLI.

Regarding usage, it helps to first break down the main types of Clojure programs one might build into 3 subcategories:

  • A tool
  • A library
  • An app

You would use -T for Clojure programs that you want to run as a "tool". For example, deps-new is a Clojure library which creates new Clojure projects based on a template you provide. This is a great example of a Clojure project which is built to be a "tool".
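
Assuming you've installed deps-new as a tool named new (the tool name is whatever you chose at install time), invoking it looks something like this:

clojure -Tnew app :name myusername/myapp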

I don't want to go into more detail about -T now because that means we would have to dive into other Clojure CLI switches like -X and -M. That's for another post. On to the Installer!

Installer

The "Clojure CLI Installer" is a fancy way of referring to the brew tap used to install Clojure on mac and linux machines. As of February 2020, Clojure started maintaining their own brew tap. Thus, if you installed the Clojure CLI via

brew install clojure

you will likely want to uninstall clojure and install the following:

brew install clojure/tools/clojure

In all likelihood, you would probably be fine with brew install clojure, as it will still receive updates. However, while brew install clojure will still see some love, it won't be as active as the clojure/tools/clojure tap.

clj v lein v boot

This section will provide a quick comparison of clj, lein and boot.

Firstly, all of the above tools address more or less the same problems, each in its own way. Your job is to choose the one you like best.

If you're curious which to choose, my answer is the Clojure CLI. The reason I like the Clojure CLI is that the tool is simple. You can read through clj and tools.deps.alpha in an afternoon and understand what they are doing if you had to. The same (subjectively, of course) cannot be said for lein or boot. This is true not just of the implementation, but also of the usage. Yes, lein seems easier to start with, but the moment you break away from the beginner examples you are left deep in the woods without a compass.

Secondly, the Clojure Tools promote libraries over frameworks. This is important when working with a language like Clojure because it really does reward you for breaking down your thinking.

Finally, the Clojure community is really leaning into building tools for Clojure CLI. For example, where lein used to have significantly more functionality, the community has built a ton of incredible tools that will cover many of your essential requirements.

So yes, Clojure Tools for the win.


WASM PART II: Using Rust and ClojureScript

In the last post we studied what WebAssembly (WASM) is and its influence on modern web development, and we presented a simple demo using WAT-to-WASM compilation.

In this post, we will start solving a more complex problem using ClojureScript, Reagent, and Rust (more details in the next post) compiled to WASM.

Wait what? WAT is not enough?!

In the previous post we talked about WAT, but we have to tell you something: yes, WAT is not enough, because we will probably need to write more complex code to cover business requirements.

We don't want to solve complex problems using a stack and a bunch of binary operations. For this reason, we will use Rust.

What is Rust? Why is so much love coming from the community?

Rust is a programming language that focuses on speed, safety, concurrency, and memory safety. The speed comes almost naturally to the language, because it ensures low-level access to memory while remaining safe.

Rust solves problems that languages like C/C++ have been battling for a long time such as memory errors and concurrent programming.

We are convinced that safety is one of the biggest advantages of Rust. Just to mention one topic here, we have Rust's dual-mode model: Safe Rust and Unsafe Rust.

With Safe Rust we have restrictions over certain parts of the code. These restrictions keep those parts of the code safe, so they keep working properly.

On the other hand, Unsafe Rust gives more flexibility to the developers, but they have to be extra careful to ensure that their code is safe.

These high-level features, and many others, make Rust one of the most loved languages among developers.

We should also mention other features, such as the useful packages from crates.io, but we can talk more about this topic in another article.

Given this summary about Rust, we can continue to our little game application: TiedUp!

TiedUp!

TiedUp is a simple game about guessing the three characters that end the word on the left side and begin the word on the right side. For example, if the left word is docTOR and the right word is TORtoise, these words are tied by the three characters TOR.

In this sample project we will present three pairs of words loaded from an external file using Rust, and we will receive this data by calling the respective function from ClojureScript.

Technical Design

Technically this problem is not complicated to solve, but it touches on many interesting topics.

The first topic is the ClojureScript + Reagent code. We created a repository here with the initial source code. We are using only one namespace for simplicity, but we welcome any criticism and improvement opportunities, so fork/clone it and give us feedback; we will appreciate it.

The second topic is the Rust => WASM code. For this purpose, we will use wasm-bindgen. This tool gives us the superpower of compiling Rust code to WASM.

In this project we will use Rust to load a *.json file with a few pairs of words. It's a pretty simple task, but it will show you how to integrate Rust, WASM, and ClojureScript.

This task will be explained in the next part of this post, because it requires working with the serialization process for the data shared between all the layers of the architecture.

The image below presents the way we are organizing the application:

Application components

In this article, we will only talk about the client code (the ClojureScript layer), but in the third part of this series we will cover the "Serde" process.

ClojureScript + Reagent – Presentation layer and Game Logic

In this layer, we create a simple GUI to present and control the user's interaction with the game. It looks pretty much like the following image:

Basic GUI

The code architecture

The code architecture is super simple: we are using only one namespace for the game logic, the core namespace.

The core namespace contains the basic application components, logic and rendering. The most important functions of this namespace are: get-pairs, verify-answer, update-game-state and create-board.
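
All of these functions revolve around a single Reagent atom holding the game state. Its real definition lives in the repository, but judging from the keys used in the code below, it presumably looks something like this sketch:

;; a sketch of the game state; the actual definition is in the repo
(defonce app-state
  (reagent.core/atom
   {:actual-pair-index 0    ;; which pair of words is in play
    :first-letter      \a   ;; current guesses for the three tying characters
    :second-letter     \a
    :third-letter      \a}))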

get-pairs loads the data required to present the right word, the left word, and the correct answer. This part will be replaced by a WASM function call in the next part of the series.

(defn get-pairs
  "This will be replaced by rust wasm function"
  []
  [{:left-word "headache"
    :right-word "chemical"
    :answer "che"}
   {:left-word "ukelele"
    :right-word "element"
    :answer "ele"}
   {:left-word "dialogue"
    :right-word "guessing"
    :answer "gue"}])

verify-answer determines if the three characters used are correct. Almost all the logic depends on this function.

(defn verify-answer
  [first-box second-box third-box pair]
  (let [first-letter (:value (get first-box 1))
        second-letter (:value (get second-box 1))
        third-letter (:value (get third-box 1))
        answer (str first-letter second-letter third-letter)]
    (= answer (:answer pair))))

update-game-state modifies the game state every time the user tries a new answer.

(defn update-game-state
  []
  (swap! app-state assoc :actual-pair-index (+ (:actual-pair-index @app-state) 1))
  (js/alert guessed-one-word-message))

create-board returns every component of the GUI to be used by the Reagent render function.

(defn create-board
  []
  (let [first-letter (character-box (:first-letter @app-state))
        first-letter-minus-button (character-change-button - first-letter :first-letter)
        first-letter-plus-button (character-change-button + first-letter :first-letter)
        second-letter (character-box (:second-letter @app-state))
        second-letter-minus-button (character-change-button - second-letter :second-letter)
        second-letter-plus-button (character-change-button + second-letter :second-letter)
        third-letter (character-box (:third-letter @app-state))
        third-letter-minus-button (character-change-button - third-letter :third-letter)
        third-letter-plus-button (character-change-button + third-letter :third-letter)
        pair (get (get-pairs) (:actual-pair-index @app-state))]
    [:div
     [:tbody
      [:tr
       [:td (left-word (:left-word pair))]
       [:td (character-component first-letter-plus-button first-letter first-letter-minus-button)]
       [:td (character-component second-letter-plus-button second-letter second-letter-minus-button)]
       [:td (character-component third-letter-plus-button third-letter third-letter-minus-button)]
       [:td (right-word (:right-word pair))]]]
     [:div (verify-button first-letter second-letter third-letter pair)]]))

How to run the project?

Please refer to the following file for instructions on running the project.

Conclusion

In this post, we shared the first part of the TiedUp code, covering the most important parts of the presentation layer, a very basic explanation of Rust, and the structure of our project.

We know the most interesting part will be the integration with Rust + WASM, but don't worry, we will cover it in the next post.

Happy Hacking! =)



Clojure Deref (Jan 21, 2022)

Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem. (@ClojureDeref RSS)

Highlights

In Clojure-related business news, Dividend Finance is being acquired by Fifth Third. Also, there was a round of funding for Green Labs. Both are users of Clojure, congrats to all!

From the core

Last week we did a roundup of jiras in Clojure 1.11.0-alpha4 and I was surprised that so many people actually watched it, so thanks for the feedback and interest. We'll try to do a bit more of that kind of thing in the future. Since the alpha4 release we've been working on several things that came out of the release, and a few of those changes have started to hit master this week:

  • CLJ-2685 - fix to the generative tests for iteration which were occasionally failing

  • CLJ-2689 - fix to the clojure.math tests to make them less strict so they pass on M1 builds (java.lang.Math methods mostly allow differences of ~1 ulp "unit in last place")

  • CLJ-2690 - updated the docstring and some arg names for iteration

No new release yet but I expect the next one will be 1.11.0-beta1. Big thanks to everyone who tried alpha4 and gave feedback.

Podcasts and videos

Libraries and Tools

New releases and tools this week:

  • uclj 0.1.3 - Small, quick, native Clojure interpreter

  • pathom-viz 2022.1.21 - Visualization tools for Pathom

  • ring-logger 1.1.1 - Log ring requests & responses using your favorite logging backend

  • lacinia 1.1 - GraphQL implementation in pure Clojure

  • lacinia-pedestal 1.1 - Expose Lacinia GraphQL as Pedestal endpoints

  • relic 0.1.2 - Functional relational programming for Clojure/Script

  • automigrate 0.1.0 - database auto-migration tool for Clojure

  • honeysql 2.2.858 - Turn Clojure data structures into SQL

  • remorse 0.1.0 - Keyword to morse code conversion

  • clojure-lsp 2022.01.20-14.12.43 - A Language Server for Clojure(script)

  • statecharts 1.0.0-alpha8 - A Statechart library for CLJ(S)

  • pitoco - Create Malli (API) schemas from captured HTTP requests and responses or from recording inputs and outputs of instrumented functions

  • clojure-test 2.0.156 - A clojure.test-compatible version of the classic Expectations testing library

  • clj-gatling 0.17.0 - Load testing library for Clojure

  • alpine-version-clj 0.0.1 - Parse and compare Alpine/Gentoo package version strings in clojure

  • fulcro 3.5.11 - A library for development of single-page full-stack web applications in clj/cljs

  • fulcro-rad 1.1.4 - Fulcro Rapid Application Development

  • fulcro-rad-semantic-ui 1.2.5 - Semantic UI Rendering Plugin for RAD

  • clojure-extras 0.5.0 - Custom features added on top of Cursive for Clojure Lovers

  • symspell-clj 0.3.0 - SymSpell spell checker in Clojure

  • stub 0.2.2 - Library to generate stubs for other Clojure libraries

  • pod-babashka-aws 0.1.2 - AWS pod wrapping the Cognitect aws-api library

  • clj-kondo 2022.01.15 - A linter for Clojure code that sparks joy

  • create-shadowfront 0.0.13 - Get shadow-cljs + reagent up and running fast


Ginger Names

The ginger language has, so far, 2 core types implemented: numbers and names. Obviously there will be more coming later, but at this stage of development these are all that’s really needed. Numbers are pretty self explanatory, but it’s worth talking about names a bit.

As they are currently defined, a name’s only property is that it can either be equal or not equal to another name. Syntactically they are encoded as being any alphanumeric token starting with an alphabetic character. We might think of them as being strings, but names lack nearly all capabilities that strings have: they cannot be iterated over, they cannot be concatenated, they cannot be split. Names can only be compared for equality.

Utility

The use-case for names is self-explanatory: they are words which identify something from amongst a group.

Consider your own name. It might have an ostensible meaning. Mine, Brian, means “high” (as in… like a hill, which is the possible root word). But when people yell “Brian” across the room I’m in, they don’t mean a hill. They mean me, because that word is used to identify me from amongst others. The etymology is essentially background information which doesn’t matter.

We use names all the time in programming, though we don't always call them that. Variable names, package names, type names, function names, struct field names. There are also keys used in hash maps, which are essentially names, as well as enumerations. By defining name as a core type we can cover a lot of ground.

Precedence

This is not the first time a name has been used as a core type. Ruby has symbols, which look like :this. Clojure has keywords, which also look like :this, and it has symbols, which look like this. Erlang has atoms, which don’t have a prefix and so look like this. I can’t imagine these are the only examples. They are all called different things, but they’re all essentially the same thing: a runtime value which can only be compared for equality.

I can't speak much about Ruby, but I can speak about Clojure and Erlang.

Clojure is a LISP language, meaning the language itself is described using the data types and structures built into the language. Ginger is also a LISP, though it uses graphs instead of lists.

Clojure keywords are generally used as keys in hash maps, as sentinel values, and as enumerations. Besides keywords, Clojure also makes use of symbols, which are used for variable and library names. There seems to be some kind of split ability on symbols, as they are expected to be separated on their periods when importing, as in clojure.java.io. There's also a quoting mechanism in Clojure, where prefixing a symbol, or other value, with a single quote, like 'this, prevents it from being evaluated as a variable or function call.
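
A quick Clojure REPL sketch of the difference:

:foo       ;; => :foo (a keyword evaluates to itself)
(def x 42)
x          ;; => 42 (a symbol normally evaluates to what it names)
'x         ;; => x (quoting suppresses that evaluation)
''x        ;; => (quote x) (quotes nest)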

It’s also possible to have something get quoted multiple layers deep, like '''this. This can get confusing.

Erlang is not a LISP language, but it does have atoms. These values are used in the same way that Clojure keywords are used. There is no need for a corresponding symbol type like Clojure has, since Erlang is not a LISP and has no real macro system. Atoms are sort of used like symbols, in that functions and packages are identified by an atom, and so one can "call" an atom, like this(), in order to evaluate it.

Just Names

I don't really see the need for Clojure's separation between keywords and symbols. Symbols still need to be quoted in order to prevent evaluation either way, so you end up with three different entities to juggle (keywords, symbols, and symbols which won't be evaluated). Erlang's solution is simpler: atoms are just atoms, and since evaluation is explicit there's no need for quoting. Ginger names are like Erlang atoms in that they are the only tool at hand.

The approaches of Erlang vs Clojure could be reframed as explicit vs implicit evaluation of operation calls.

In ginger evaluation is currently done implicitly, but in only two cases:

  • A value on an edge is evaluated to the first value which is a graph (which then gets interpreted as an operation).

  • A leaf vertex with a name value is evaluated to the first value which is not a name.

In all other cases, the value is left as-is. A graph does not need to be quoted, since the need to evaluate a graph as an operation is already based on its placement as an edge or not. So the only case left where quoting is needed (if implicit evaluation continues to be used) is a name on a leaf vertex, as in the example before.

As an example of explicit vs implicit quoting in ginger: suppose we want to programmatically call the AddValueIn method on a graph, which terminates an open edge into a value, and that value is a name. It might look like this with implicit evaluation (the clojure-like example):

out = addValueIn < (g (quote < someName;) someValue; );

* or, to borrow the clojure syntax, where single quote is a shortcut:

out = addValueIn < (g; 'someName; someValue; );

In an explicit evaluation language, which ginger so far has not been and so this will look weird, we might end up with something like this:

out = addValueIn < (eval < g; someName; eval < someValue; );

* with $ as sugar for `eval`, like ' is a shortcut for `quote` in clojure:

out = addValueIn < ($g; someName; $someValue; );

I don’t like either pattern, and since it’s such a specific case I feel like something less obtrusive could come up. So no decisions here yet.

Uniqueness

There’s another idea I haven’t really gotten to the bottom of yet. The idea is that a name, maybe, shouldn’t be considered equal to the same name unless they belong to the same graph.

For example:

otherFoo = { out = 'foo } < ();

out = equal < ('foo;  otherFoo; );

This would output false. otherFoo's value is the name foo, and the value it's being compared to is also a name foo, but they are from different graphs and so are not equal. In essence, names are automatically namespaced.

This idea only really makes sense in the context of packages, where a user (a developer) wants to import functionality from somewhere else and use it in their program. The code package which is imported will likely use name values internally to implement its functionality, but it shouldn't need to worry about naming conflicts with values passed in by the user. While it's possible to avoid conflicts if a package is designed conscientiously, it's also easy to mess up if one isn't careful. This becomes especially true when combining packages with overlapping functionality, where the data returned from one might look similar to that used by the other without actually being compatible.

On the other hand, this could create some real headaches for the developer, as they chase down errors which are caused because one foo isn’t actually the same as another foo.

What it really comes down to is the mechanism which packages use to function as packages. Forced namespaces will require packages to export all names which they expect the user to need to work with the package. So the ergonomics of that exporting, both on the user’s and package’s side, are really important in order to make this bearable.

So it’s hard to make any progress on determining if this idea is gonna work until the details of packaging are worked out. But for this idea to work the packaging is going to need to be designed with it in mind. It’s a bit of a puzzle, and one that I’m going to marinate on longer, in addition to the quoting of names.

And that's names: their current behavior and possible future behavior. Keep an eye out for more ginger posts in… many months, because I'm going to go work on other things for a while (I say, with a post from a month ago having ended with the same sentiment).


Why Are My (Clojure) Stack Traces Missing? The Little-Known OmitStackTraceInFastThrow Flag

Problem

Sometimes you see exceptions on the JVM that are missing the message and the stack trace. They can look like this in your logs:

ERROR java.lang.ClassCastException

Or perhaps like this if you're using Clojure (printed via prn):

#error {
 :cause nil
 :via
 [{:type java.lang.ClassCastException
   :message nil}]
 :trace
 []}

Or perhaps like this (printed via clojure.stacktrace/print-stack-trace):

java.lang.ClassCastException: null
 at [empty stack trace]

Solution

This happens because of a HotSpot optimization. The release notes for JDK 5.0 tell us:

After recompilation, the compiler may choose a faster tactic using preallocated exceptions that do not provide a stack trace. To disable completely the use of preallocated exceptions, use this new flag: -XX:-OmitStackTraceInFastThrow.

This little-known flag has been around since Java 5! So to get stack traces for your exceptions, you need to set the -XX:-OmitStackTraceInFastThrow JVM flag. Note that this flag disables the OmitStackTraceInFastThrow feature due to the - before Omit.

Here are some ways you can set the flag:

  1. Pass the option to java when running a jar file: java -XX:-OmitStackTraceInFastThrow -jar myjar.jar
  2. Set the JAVA_TOOL_OPTIONS env var JAVA_TOOL_OPTIONS=-XX:-OmitStackTraceInFastThrow
  3. (Clojure-specific) Set :jvm-opts ["-XX:-OmitStackTraceInFastThrow"] in your deps.edn or project.clj
  4. (Clojure-specific) Use clj -J-XX:-OmitStackTraceInFastThrow with Clojure CLI
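
For option 3, note that in deps.edn the :jvm-opts key lives under an alias; a sketch (the :dev alias name here is arbitrary):

{:aliases
 {:dev {:jvm-opts ["-XX:-OmitStackTraceInFastThrow"]}}}

You'd then activate it with clj -A:dev.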

Discussion

Why is omitting stack traces the default? Some Java code uses exceptions for control flow, and can end up throwing lots of exceptions in a hotspot. Code that uses exceptions for control flow usually only cares about the class of the exception, not the message or the stack trace. Leaving out the stack trace can make throwing the exception orders of magnitude faster so HotSpot helpfully performs this optimization for you.

I'd argue that since using exceptions for control flow is a bad idea anyway, you don't want to benefit from this optimization. Also, in my experience making production problems harder to debug (by throwing away important error messages) is not worth a slight performance increase. I recommend starting every project with -XX:-OmitStackTraceInFastThrow, measuring performance, and optimizing hotspots where necessary. If it turns out you need to turn on OmitStackTraceInFastThrow, do it consciously after evaluating the pros and cons for your project.

For fun, I ran a quick poll on the Clojurians Slack to check how many people know this flag. Here are the results. Turns out that at least among this Slack crowd the flag is fairly well-known.

PS. Reproducing The Problem

I had to try out a couple of ways to reproduce the problem. The following snippet seems the most reliable so far. The exception needs to get thrown multiple times for HotSpot to optimize the throw.

;; warm up HotSpot: throw (and swallow) the same exception many times
(dotimes [i 10000]
  (try (dissoc (list) :x)
       (catch Exception e nil)))

;; after the warm-up, this throw may use a preallocated exception
;; with no message and no stack trace
(dissoc (list) :x)


Transducers: Middleware for Reducing Functions (Part 4)

This is a four-part series. You can find the parts here:

Wow! We did it! In our last episode, we built ourselves a real transducer that actually works with Clojure’s transduce function. It included all three reducing function arities and implemented a stateful reducing function. It was our own version of partition-all.

(defn cc-partition-all-3 [n]
  (fn [rf]
    (let [p (volatile! [])]
      (fn
        ([]                             ; arity-0
         (rf))
        ([state]                        ; arity-1
         (if (> (count @p) 0)
           (rf (rf state @p))
           (rf state)))
        ([state input]                  ; arity-2
         (let [p' (vswap! p conj input)]
           (if (= n (count p'))
             (do (vreset! p [])
                 (rf state p'))
             state)))))))

We’re nearing the end of our journey to understand transducers, but there’s a bit more we need to cover.

In cc-partition-all-3, we saw that a middleware reducing function can call the downstream reducing function or not, as it pleases, and that it can choose whichever of the different arities it wants. We saw that the reduction context (e.g., transduce) can signal that the reduction is done by calling the arity-1 reducing function.

But what if the reducing function decides that the reduction is done before the reduction context does? How can a reducing function signal to the reduction context that it wants to stop?

Let’s consider a reimplementation of take-while.

(defn cc-take-while-1 [pred]
  (fn [rf]
    (let [stopped (volatile! false)]
      (fn
        ([]
         (rf))
        ([state]
         (rf state))
        ([state input]
         (cond
           @stopped state
           (pred input) (rf state input)
           :else (do (vreset! stopped true)
                     state)))))))

We start out with stopped being false, and as long as pred applied to the input returns true, we just keep calling the downstream reducing function. As soon as pred returns false, we set stopped to true and return state without calling the downstream reducing function. Once stopped is true, we ignore all further calls and simply return state unchanged.

This works great.

user> (transduce (cc-take-while-1 #(< % 5)) conj (range 10))
[0 1 2 3 4]

“Okay, was that it? Are we done?” you ask.

Not quite. While cc-take-while-1 works, it’s inefficient. Once we flip stopped to true, we correctly ignore all further inputs, but we’re still having to process every item in the collection. When there are only a handful of items, that’s no big deal, but it adds up with large collections.

For instance, let’s run some timing tests using Criterium.

user> (let [coll (vec (range 10))]
        (bench (transduce (cc-take-while-1 #(< % 5)) conj coll)))
Evaluation count : 166427760 in 60 samples of 2773796 calls.
             Execution time mean : 359.708462 ns
    Execution time std-deviation : 0.399059 ns
   Execution time lower quantile : 358.765407 ns ( 2.5%)
   Execution time upper quantile : 360.449649 ns (97.5%)
                   Overhead used : 5.939916 ns

Found 4 outliers in 60 samples (6.6667 %)
	low-severe	 2 (3.3333 %)
	low-mild	 1 (1.6667 %)
	high-mild	 1 (1.6667 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil
user> (let [coll (vec (range 10000))]
        (bench (transduce (cc-take-while-1 #(< % 5)) conj coll)))
Evaluation count : 706140 in 60 samples of 11769 calls.
             Execution time mean : 85.261613 µs
    Execution time std-deviation : 131.542776 ns
   Execution time lower quantile : 84.971377 µs ( 2.5%)
   Execution time upper quantile : 85.464019 µs (97.5%)
                   Overhead used : 5.939916 ns

Found 2 outliers in 60 samples (3.3333 %)
	low-severe	 2 (3.3333 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil

Here, we see that processing ten elements in the collection only takes us 359.7 nanoseconds, but processing 10,000 elements in the collection balloons up to 85.2 microseconds. So, even though the result of each call is the same, returning the first five elements, the second case with 10,000 elements takes much longer.

It doesn't have to be this way, however. Once we set stopped to true, we're going to throw away everything else that comes to us, so it would be nice if there were a way to tell the reduction context to stop early. Fortunately, there is.

Clojure has a function named reduced that will encapsulate the return state in a special wrapper that signals to the reduction context that it should stop processing. The reduction context can use a predicate named reduced? to check the newly returned state value to see if it’s wrapped in this special wrapper. You can pull the final state value out of the wrapper using @ (deref).

user> (reduced [1 2 3])
#<Reduced@56e48adb: [1 2 3]>
user> (reduced? (reduced [1 2 3]))
true
user> (reduced? [1 2 3])
false
user> @(reduced [1 2 3])
[1 2 3]

So, we can stop the reduction in our version of take-while by simply returning a reduced value as soon as the predicate returns false.

(defn cc-take-while-2 [pred]
  (fn [rf]
    (fn
      ([]
       (rf))
      ([state]
       (rf state))
      ([state input]
       (if (pred input)
         (rf state input)
         (reduced state))))))

Note that this also allows us to remove the state in the reducing function. When we’re done, we’re done.
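
And it behaves exactly like the first version:

user> (transduce (cc-take-while-2 #(< % 5)) conj (range 10))
[0 1 2 3 4]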

You can see the impact of this on our ten vs. 10,000 elements test.

user> (let [coll (vec (range 10))]
        (bench (transduce (cc-take-while-2 #(< % 5)) conj coll)))
Evaluation count : 209638200 in 60 samples of 3493970 calls.
             Execution time mean : 284.600497 ns
    Execution time std-deviation : 1.499647 ns
   Execution time lower quantile : 282.564290 ns ( 2.5%)
   Execution time upper quantile : 287.687909 ns (97.5%)
                   Overhead used : 5.939916 ns
nil
user> (let [coll (vec (range 10000))]
        (bench (transduce (cc-take-while-2 #(< % 5)) conj coll)))
Evaluation count : 211362120 in 60 samples of 3522702 calls.
             Execution time mean : 280.628692 ns
    Execution time std-deviation : 0.486964 ns
   Execution time lower quantile : 279.166659 ns ( 2.5%)
   Execution time upper quantile : 281.293161 ns (97.5%)
                   Overhead used : 5.939916 ns

Found 2 outliers in 60 samples (3.3333 %)
	low-severe	 2 (3.3333 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil

Now, the timings are both about the same, 280 nanoseconds or so. They both process the first five elements of the collection and then return (reduced state), immediately stopping the reduction.

With that, our multi-part series diving into transducers comes to a close. Hopefully, this has demystified the subject and you understand now why I say that transducers are just middleware for reducing functions. In this series, you’ve learned how to implement your own transducers, including transducers that return stateful reducing functions. You’ve seen how to clean up state at the end of a reduction using the arity-1 reducing function and how to stop the reduction process using reduced as soon as the reducing function determines that it’s pointless to continue. You should now have the skills to make use of standard transducers and write your own when your programs require it.

You can find all the source code from this series in the Clojure Crazy GitHub repository.


Getting Started with Clojure(Script) – Eduardo’s Journey

2022 is just beginning, Clojure was never as hipster and attractive as it is now, and I've decided it is time to share a bit of my quest to learn Clojure with you folks. At the same time, I'll try to provide some guidance on how to get your feet wet with this amazing language.

A Bit of Context


What is Clojure?

As defined by its creator, Rich Hickey,

Clojure is a robust, practical, and fast programming language with a set of useful features that together form a simple, coherent, and powerful tool.

Among its main features I’d like to highlight:

  • Dynamic Development: Clojure is a dynamic environment you can interact with.
  • LISP: Clojure is a member of the Lisp family of languages.
  • Functional Programming: Clojure provides the tools to avoid mutable state, provides functions as first-class objects, and emphasizes recursive iteration instead of side-effect-based looping. Though impure, it still stands behind the philosophy that more functional programs are more robust.

Clojure was designed to be a hosted language and supports the dynamic implementation of Java interfaces and classes. All the code is compiled to JVM bytecode.

What is ClojureScript?

ClojureScript is a compiler for Clojure that targets JavaScript. It emits JavaScript code that is compatible with the advanced compilation mode of the Google Closure optimizing compiler.

This means that nowadays we’re able to produce code in Clojure that is ready to run in a browser. This is really helping a lot to leverage the usage of this amazing language.

Where am I coming from?

If you’re reading this article, chances are you’re already a professional programmer. But if not, just keep reading, this article is also for you, and maybe you can start your amazing journey into programming today!

I met Clojure for the first time in 2019. I was a JavaScript frontend developer working with JS/React/Node. Thanks to the great efforts of the JavaScript community and the React team, React/Redux started to leverage the best side of JavaScript, the functional one. And that’s when I started to pay attention to the functional paradigm.

I already had some experience with JavaScript pitfalls, mainly the lack of immutability, which can quickly lead you into some “not so funny” quirks.

One day I met a friend who convinced me to try Clojure, and I really enjoyed the experience. Clojure has almost no syntax, which makes it really quick to grok, and it's one of the most high-level languages you can use. And I really mean it. Problems that are not very easy to solve in JavaScript can quickly be solved in Clojure.

I must say that I already had knowledge of HTML, CSS, and JavaScript (specifically Node.js and React) in my toolbelt. And this really helped me get into Clojure quickly. But if you're just beginning, no worries, you can learn it on the go.

Because ClojureScript compiles to JavaScript, knowing a bit of JavaScript can help you make use of existing packages/solutions in the JavaScript ecosystem. Also, some ClojureScript frontend frameworks make use of React, so knowing React is also a plus for quickly grokking how these frameworks work.

No matter where you’re coming from, where should you start learning Clojure?

Where should you start learning Clojure?

Well, first things first. I think you should know a bit about Rich Hickey and his motivation to create Clojure. That’s the best way to understand what’s behind the man and the language.

  • Are We There Yet? (2009): In this talk Rich Hickey advocated for the reexamination of basic principles like state, identity, value, time, types, genericity, complexity, as they are used by OOP today, to be able to create the new constructs and languages to deal with the massive parallelism and concurrency of the future.
  • Hammock Driven Development (2010): Rich Hickey talks about strategies he’s used personally for solving difficult problems. This is a more general talk, but will make you want to install a hammock in your office 😅
  • Simple Made Easy (2011): Rich Hickey emphasizes simplicity's virtues over easiness, showing that while many choose what is easy, they may end up with complexity, and the better way is to choose simplicity.
  • The Value of Values (2012): Rich Hickey compares value-oriented programming with place-oriented programming, concluding that the time of imperative languages has passed and it is the time of functional programming.

After getting through these videos, I’d definitely recommend you read the Love Letter To Clojure, from Gene Kim, and my favorite book of all time: Clojure for the Brave and True, a great pearl written by Daniel Higginbotham. It is free to read online, written with a great sense of humor, and guides you from zero to hero, raising awareness on the main aspects and functionalities of the language. Make sure you read it at least till the end of Part II.

At this point in time, you should be ready and anxious 😅 to get your hands dirty and start to tackle some challenges in Clojure.

Well, if you’re not proficient with Emacs, I’d recommend you go with VSCode. There’s a great addon for VSCode, called Calva, which will provide you with the essential tools to connect to or start a REPL. It also includes inline code evaluation, Paredit, code formatting, a test runner, syntax highlighting, linting, and more. It’s free to use, and it exists thanks to Peter Strömberg aka PEZ.

You might want to begin with some simple Clojure challenges. You can pick Rich 4Clojure and later Codewars.

Other important resources

By now, you might already be able to tackle some Clojure challenges. Still, there are some obstacles in the way, such as tooling and frameworks.

To overcome this difficulty, I completed some of Jacek Schae's and Eric Normand's (PurelyFunctional.tv) courses, which are great. Eric Normand has an amazing newsletter that inspires me every week, and he also wrote a great book that helps a lot if you're coming from JavaScript. It's called Grokking Simplicity and it really helps you think with a functional approach in mind.

Main difficulties along the journey

I think my journey to learn Clojure was pretty straightforward, although I had some small difficulties along the way.

My first obstacle was wanting to learn Emacs while learning Clojure. I totally recommend you stick to your current code editor; otherwise, there will be a lot of new things around and frustration can rise a bit. VS Code is an excellent option, as it is a very simple code editor. Plus, nowadays it provides you with great functionality and support for learning.

Also, the Clojure community is very mature, so don't expect those very, very beginner-friendly tutorials where everything is explained step-by-step. The way I found to overcome this was to join the Clojurians Slack. Despite being small and mature, the Clojure community is very warm and happy to see someone new joining in. So, if you have any questions, don't hesitate and feel free to post them there. Even the "not so dumb" questions are answered quickly and always with a very didactic approach.

You'll be surprised at how common it is for the creator of a library to directly answer you. This rocks, as you're always talking with knowledgeable people who are ready to teach and help you on a daily basis.

Last but not least, tooling got in my way a bit. More with Clojure than with ClojureScript, I found it a bit difficult to put together all the pieces of the puzzle to build a backend. There are a lot of packages to mix and lots of new documentation to read. Watching Jacek Schae and Eric Normand's courses provided great help with these difficulties, and that's why I totally recommend them.

Conclusion

Learning Clojure was – and still is – a great and enjoyable adventure. I think almost everyone in the Clojure ecosystem has a very pleasant experience while writing code with it. First, because it’s a very high-level language. Second, because the functional approach saves you lots (and I really mean LOTS) of lines of code that would otherwise be needed to solve the same problem in another non-functional language such as JavaScript.

Have you already taken these steps and gained experience with Clojure? Check out Flexiana’s current job openings and apply to join our multinational team.



Transducers: Middleware for Reducing Functions (Part 3)

This is a four-part series. You can find the parts here:

Welcome back. In our last episode, we got within spitting distance of “real” transducers. We saw how transducers are middleware wrapped around a reducing function. And we actually wrote some basic transducers.

In this episode, we’ll pull together some of the last bits and have you writing real transducers that will work with Clojure’s standard transduction functions (e.g., transduce). We’ll cover the multiple arities that reducing functions need to have. We’ll even write a “stateful transducer” and learn why that’s an inaccurate name. Strap in; we have a lot to get to and this is going to move fast.

If you were paying attention during the last episode, you might have noticed that our cc-xd function looks a lot like Clojure’s standard transduce function.

user> (defn cc-xd [xf rf init coll]
        (reduce (xf rf) init coll))
#'user/cc-xd
user> (doc transduce)
-------------------------
clojure.core/transduce
([xform f coll] [xform f init coll])
  reduce with a transformation of f (xf). If init is not
  supplied, (f) will be called to produce it. f should be a reducing
  step function that accepts both 1 and 2 arguments, if it accepts
  only 2 you can add the arity-1 with 'completing'. Returns the result
  of applying (the transformed) xf to init and the first item in coll,
  then applying xf to that result and the 2nd item, etc. If coll
  contains no items, returns init and f is not called. Note that
  certain transforms may inject or skip items.
nil

Both functions take the same arguments in the same order. There is some difference in terminology, however. First, I use the term xf where the doc-string uses xform. The doc-string names the result of applying xform to rf as xf. Personally, I think the doc-string is confusing, and the official Clojure documentation about transducers and various blogs use different argument names in different spots. I think this overall inconsistency, coupled with the functions-returning-anonymous-functions-that-return-anonymous-functions nature of the problem, is part of the reason that folks have been so confused about transducers.

So, I’ve tried to be consistent throughout this blog series:

  • A reducing function is always named rf.
  • A transducer is always named xf.
  • Since transducers compose with comp, a stack of transducers is also a transducer and is named xf.
  • The result of applying a transducer to a reducing function is another reducing function, so the result is named rf (if it’s actually given a name at all).
  • The names of the arguments to a reducing function are state and input.
  • When we compose individual transducers together, I have called that a “transducer stack” to distinguish it from a single, atomic transducer.
  • The “reduction context” is the logic that is calling the reducing function to perform the processing. Example contexts are transduce, reduce, or even core.async channels.

Now, back to cc-xd and transduce. The main difference between the two is that our version always takes four arguments, while the standard version of transduce has a three-argument version. The doc-string says that if the initial value is not supplied, then the zero-arity version of the reducing function is called to produce it. In other words, it invokes (rf) (without any transducers wrapped around it) to create the initial value. You can see that a function like conj produces an initial value if you call it with no arguments. So does + or *.

user> (conj)
[]
user> (+)
0
user> (*)
1

These are all good reducing functions for use with transduce as they return an “identity value” of some sort.

Note that previously we said that a reducing function is one that takes two arguments, a state and an input, and returns a new state. That’s still true, but now we need to expand our definition. A proper reducing function that works with transducers includes two arities. The arity-0 version returns an identity value appropriate for use as the initial value of a reduction and the arity-2 version performs a reduction step.
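Here's a minimal sketch of such a reducing function (a hypothetical sum-rf of my own, not from clojure.core), showing both arities:

user> (defn sum-rf
        ([] 0)                           ; arity-0: identity value for the reduction
        ([state input] (+ state input))) ; arity-2: perform one reduction step
#'user/sum-rf
user> (reduce sum-rf (sum-rf) [1 2 3 4])
10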

It’s important to note that transduce and reduce have different behavior in the case where they don’t have an initial value. reduce uses the first value of the collection and starts processing at the second value. This allows you to use functions like min or max as a reducing function. But these functions won’t work without an initial value if you try to use them with transduce. You can use first and rest to pick off the first value if you need to.

user> (max)
Execution error (ArityException) at user/eval8583 (form-init7109873409143739450.clj:535).
Wrong number of args (0) passed to: clojure.core/max
user> (transduce identity min [1 2 3])
Execution error (ArityException) at user/eval8585 (form-init7109873409143739450.clj:538).
Wrong number of args (0) passed to: clojure.core/min
user> (reduce min [1 2 3])
1
user> (let [coll [1 2 3]]
        (transduce identity min (first coll) (rest coll)))
1

Here, for simplicity, I’m just using identity as the transducer, which just returns the reducing function, unchanged.

We can modify our cc-xd function to have the same arities and behavior as the standard transduce function.

user> (defn cc-xd-2
        ([xf rf coll]
         (cc-xd xf rf (rf) coll))
        ([xf rf init coll]
         (reduce (xf rf) init coll)))
#'user/cc-xd-2
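
Let's give the new three-argument arity a quick check:

user> (cc-xd-2 (map inc) conj (range 5))
[1 2 3 4 5]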

When you consider transduce’s arguments, one question that comes up is why xf and rf are separate arguments. Why can’t we just apply xf to rf ahead of time and pass one argument to transduce? That would make the function signature basically the same as reduce. In fact, under limited circumstances, we can do exactly this, and even pass wrapped reducing functions to reduce itself.

user> (def filter-odd-times-five-xf (comp (filter odd?)
                                          (map (partial * 5))))
#'user/filter-odd-times-five-xf
user> (def rf (filter-odd-times-five-xf conj))
#'user/rf
user> (reduce rf [] (range 10))
[5 15 25 35 45]

So, why do it any other way? Wouldn’t it be faster and more efficient to apply the transducer to the reducing function once and then reuse it possibly multiple times? Yes, it would be marginally faster, but only by a hair; the dominant portion of any transduction is spent actually processing the elements.

The reason we do it this way is because the reducing function created by a transducer is allowed to be stateful. Such a transducer is often called a “stateful transducer” when you read Clojure documentation about transducers, but as Christophe Grand pointed out several years ago, this is not quite accurate. The transducers themselves are stateless (remember, a transducer simply takes a reducing function and returns another, wrapped reducing function), but the new reducing function returned by the transducer may keep state during the reduction process. This state is initialized when the transducer creates the new reducing function. So, it’s better if we create our transducer stack separately, a stateless operation, and then apply that to our reducing function right before we perform the reduction. If we don’t do this, then the next time we reduce/transduce with this reducing function, we’ll end up with stale state and things won’t work as expected. The example above worked because the reducing functions are stateless.
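
To make the stale-state problem concrete, here's a small sketch using the standard take transducer, whose reducing function counts how many items it has passed along:

user> (def take-two-rf ((take 2) conj))
#'user/take-two-rf
user> (reduce take-two-rf [] (range 10))
[0 1]
user> (reduce take-two-rf [] (range 10)) ; stale state: the internal counter is already exhausted
[]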

Further, some transduction contexts choose their own reducing function, so we’d like to be able to pass around our transducer stack before it’s applied to a reducing function. For instance, into can take a transducer stack and uses Clojure’s optimized transient data structure processing to improve performance. It does not take a separate reducing function but chooses one based on the output data type.

user> (into #{} filter-odd-times-five-xf (range 10))
#{15 25 35 5 45}

Examples of standard transducers that create stateful reducing functions include drop, keep-indexed, partition-all, partition-by, take, and take-nth. All of these require the created reducing function to keep track of a count of items seen or an index of some sort. The doc-strings for transducer-creating functions will specify whether they are stateful or not.

Let’s try our hand at writing a transducer that creates a stateful reducing function. As an exercise, we’ll re-write partition-all. This will demonstrate how a reducing function should store state and how to handle various special cases during the reduction process.

Here’s a first attempt. Remember that partition-all groups the processed collection into partitions of n items. At the end of the process, whatever items remain are emitted as their own short partition.

(defn cc-partition-all-1 [n]
  (fn [rf]
    (let [p (volatile! [])]             ; reducing function state
      (fn [state input]
        (let [p' (vswap! p conj input)]
          (if (= n (count p'))
            (do (vreset! p [])
                (rf state p'))
            state))))))

You can see the same basic transducer structure that we’ve seen before, but here we’ve wrapped a let form containing a binding for a volatile around the returned reducing function. This is what holds our state.

The processing is pretty simple. Every time the reducing function is called, it adds the input item to the saved state, p, using vswap!, which returns the new value. We capture this in p'. Then if the number of items in p' is equal to the partition size, we reset p back to an empty vector and call the downstream reducing function with the partition. If the count of elements in the growing partition is not equal to n, we just return the state, unchanged.

That’s an important new point about transducers: the new reducing function can control whether it calls the downstream reducing function or not.

Let’s see how it works.

user> (cc-xd-2 (cc-partition-all-1 3) conj [] (range 10))
[[0 1 2] [3 4 5] [6 7 8]]

Hmmm. Well, that mostly worked. We got partitions of size three. But we didn’t get all 10 items in the result like we should have. What’s going on?

The problem here is that when the reduction process is finished, our cc-partition-all-1 is still holding onto an item in its state. It has no way of knowing that the reduction is complete. We need a way to signal it and say “Hey, we’re done here, so perform any clean-up work you need to.” Transducers do this by adding yet another arity to the reducing function: an arity-1 version.

(defn cc-partition-all-2 [n]
  (fn [rf]
    (let [p (volatile! [])]             ; reducing function state
      (fn
        ([state]                        ; arity-1
         (if (> (count @p) 0)
           (rf (rf state @p))
           (rf state)))
        ([state input]                  ; arity-2
         (let [p' (vswap! p conj input)]
           (if (= n (count p'))
             (do (vreset! p [])
                 (rf state p'))
             state)))))))

The arity-1 function includes the logic to clean up. We simply take the partition that is in the process of being formed and, if there are any elements remaining in it, we pass those to the arity-2 downstream reducing function. This adds the final partition to the output. Then we call the arity-1 version of the downstream reducing function on the new state that was just returned from the arity-2 version. This signals the downstream reducing function that it should perform any cleanup in the case that it’s stateful as well. If there are no items in the partition, then we simply pass along the state to the downstream arity-1 version. If you have a reducing function that doesn’t support arity-1, you can easily create a new version that does by passing it to completing. The created arity-1 function will simply be identity.
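
For example, here's a sketch with a hypothetical two-argument step function that gains an identity arity-1 via completing:

user> (defn step [state input] (+ state input)) ; arity-2 only
#'user/step
user> (transduce identity (completing step) 0 [1 2 3])
6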

Now, we also need a new version of our transduction function, cc-xd-3.

(defn cc-xd-3
  ([xf rf coll]
   (cc-xd-3 xf rf (rf) coll))
  ([xf rf init coll]
   (let [rf' (xf rf)]
     (rf' (reduce rf' init coll)))))

Now things work as expected.

user> (cc-xd-3 (cc-partition-all-2 3) conj [] (range 10))
[[0 1 2] [3 4 5] [6 7 8] [9]]

So, now we know that a proper reducing function for use with transducers has arity-0, arity-1, and arity-2 versions.

user> (conj)
[]
user> (conj [1])
[1]
user> (conj [1] 2)
[1 2]
user> (+)
0
user> (+ 1)
1
user> (+ 1 2)
3
user> (*)
1
user> (* 1)
1
user> (* 1 2)
2

While transduce doesn’t strictly need it, we should also add an arity-0 function to our partition-all reimplementation: cc-partition-all-3. In this case, the arity-0 function simply delegates to the downstream reducing function. That’s going to be pretty standard behavior; it’s rare that a transducer needs to manipulate the initial value.

(defn cc-partition-all-3 [n]
  (fn [rf]
    (let [p (volatile! [])]
      (fn
        ([]                             ; arity-0
         (rf))
        ([state]                        ; arity-1
         (if (> (count @p) 0)
           (rf (rf state @p))
           (rf state)))
        ([state input]                  ; arity-2
         (let [p' (vswap! p conj input)]
           (if (= n (count p'))
             (do (vreset! p [])
                 (rf state p'))
             state)))))))

Now, we can use our cc-partition-all-3 with Clojure’s standard transduction functions.

user> (transduce (cc-partition-all-3 3) conj [] (range 10))
[[0 1 2] [3 4 5] [6 7 8] [9]]

Boom. Nailed it. We have a legitimate transducer.

Okay, that’s all for now. Next time, we’ll take a look at how to abort a reduction or transduction right in the middle.

Permalink

100 Languages Speedrun: Episode 57: Scala

Scala is one of the JVM languages trying to dethrone Java. Currently Kotlin is leading this race by a lot, but Scala, Clojure, and Groovy are all quite popular, with JRuby being somewhat behind them in the race.

This post is about Scala 2. Scala 3, which is currently in development, plans to make fundamental changes to the language.

Hello, World!

println("Hello, World!")

You can run it without separate compilation:

$ scala hello.scala
Hello, World!

FizzBuzz

The FizzBuzz is very reasonable:

// FizzBuzz in Scala
for (i <- 1 to 100) {
  if (i % 15 == 0) {
    println("FizzBuzz")
  } else if (i % 5 == 0) {
    println("Buzz")
  } else if (i % 3 == 0) {
    println("Fizz")
  } else {
    println(i)
  }
}

Unicode

Just like Kotlin, Clojure, and Groovy, Scala string handling is also broken for any data involving characters outside the Unicode Basic Multilingual Plane. It's broken on the JVM, and so every language using it directly has broken string handling. JRuby is the only major JVM language which had the courage to fix it, and pay the performance price for that.

val strings = List("Hello", "Źółw", "💩")
for (s <- strings) {
  println(s"Length of $s is ${s.length}")
}
$ scala unicode.scala
Length of Hello is 5
Length of Źółw is 4
Length of 💩 is 2

On the upside, Scala has nice string interpolation. Of course with its own unique syntax, as that's one thing every language uses different syntax for, with no consensus emerging so far.

Fibonacci

On the upside, we don't need pointless return.

On the downside, we need to declare types for both arguments and return values; type inference fails us here. Scala has some limited type inference, but in this case it would give us error: recursive method fib needs result type.

def fib(n : Int) : Int = {
  if (n <= 2) {
    1
  } else {
    fib(n - 1) + fib(n - 2)
  }
}

for (n <- 1 to 20) {
  println(s"fib($n) = ${fib(n)}")
}
$ scala fib.scala
fib(1) = 1
fib(2) = 1
fib(3) = 2
fib(4) = 3
fib(5) = 5
fib(6) = 8
fib(7) = 13
fib(8) = 21
fib(9) = 34
fib(10) = 55
fib(11) = 89
fib(12) = 144
fib(13) = 233
fib(14) = 377
fib(15) = 610
fib(16) = 987
fib(17) = 1597
fib(18) = 2584
fib(19) = 4181
fib(20) = 6765

Types

Scala, just like Haskell, has an extremely complicated type system, featuring type classes. By the way, this is one of the features getting a full rewrite in Scala 3, so presumably the Scala devs are not too happy with its state.

Let's try to define a function like this, in a way that would allow valid combinations (like Int + Int, or String + String), but not invalid combinations (like Int + String, or Double + HttpRequest):

def add(a, b) = {
  println(s"${a} + ${b} = ${a + b}")
}

The first step is to make this function generic:

def add[T](a : T, b : T) = {
  println(s"${a} + ${b} = ${a + b}")
}

This won't work - as + is not defined for every type. Scala gives a completely meaningless error message (required: String; incompatible interpolation method s) for this, but it's no big deal.

What works is defining a type class (spelled trait in Scala, though the documentation still refers to it as a "type class") called Additive, then defining various instances of Additive[...], and then passing an Additive as an implicit parameter.

trait Additive[T] {
  def plus(x: T, y: T): T
}

implicit object AdditiveInt extends Additive[Int] {
  def plus(x: Int, y: Int): Int = x + y
}

implicit object AdditiveDouble extends Additive[Double] {
  def plus(x: Double, y: Double): Double = x + y
}

implicit object AdditiveString extends Additive[String] {
  def plus(x: String, y: String): String = x + y
}

def add[T](a : T, b : T)(implicit additive: Additive[T]) = {
  println(s"${a} + ${b} = ${additive.plus(a, b)}")
}

add(6, 0.9)
add(400, 20)
add("foo", "bar")

How does it compare with other languages with complex type systems:

  • Crystal managed to figure it out with zero type annotations
  • Scala managed to do this, with very heavy annotations
  • Haskell almost works, with very heavy annotations and a few language extensions, but then in the end it doesn't (this is largely because Haskell string is not a real type, it's just a list of characters, and Haskell is bad at type classes over such composite types)
  • OCaml doesn't even try

Simple Types

For simple immutable classes, Scala supports case class shortcut, very similar to Kotlin's data class. It defines common basic operations like ==, .toString(), hashCode() and so on.

case class Point(x : Double, y : Double) {
  def length() = Math.sqrt(x * x + y * y)
}

val a = List(1, 2, 3)
val b = List(1, 2, 3)
val c = Point(30.0, 40.0)
val d = Point(30.0, 40.0)

println(a == b)
println(c == d)
println(null == d)
println(s"len of ${c} is ${c.length()}")
$ scala point.scala
true
true
false
len of Point(30.0,40.0) is 50.0

This is all fine.

Generic Point Type

So let's define a generic Point type that can be Point[Int] or Point[Double] or such, and always implement +. This can be done, but it's quite convoluted:

trait Additive[T] {
  def plus(x: T, y: T): T
}

implicit object AdditiveInt extends Additive[Int] {
  def plus(x: Int, y: Int): Int = x + y
}

implicit object AdditiveDouble extends Additive[Double] {
  def plus(x: Double, y: Double): Double = x + y
}

implicit object AdditiveString extends Additive[String] {
  def plus(x: String, y: String): String = x + y
}

case class Point[T](x : T, y : T)(implicit additive: Additive[T]) {
  def +(other : Point[T]) = {
    Point(additive.plus(x, other.x), additive.plus(y, other.y))
  }
}

implicit def AdditivePoint[T] : Additive[Point[T]] = {
  new Additive[Point[T]] {
    def plus(x: Point[T], y: Point[T]): Point[T] = x + y
  }
}

def add[T](a : T, b : T)(implicit additive: Additive[T]) = {
  println(s"${a} + ${b} = ${additive.plus(a, b)}")
}

println(Point(1, 2) + Point(3, 4))
add(Point(300, 60), Point(120, 9))
add(Point(6.0, 250.0), Point(0.9, 170.0))
add(Point("foo", "much"), Point("bar", "wow"))
$ scala typeclasses2.scala
Point(4,6)
Point(300,60) + Point(120,9) = Point(420,69)
Point(6.0,250.0) + Point(0.9,170.0) = Point(6.9,420.0)
Point(foo,much) + Point(bar,wow) = Point(foobar,muchwow)

What Scala has is definitely among the most complicated type systems ever, and we're really just barely scratching the surface here.

One important advantage Scala has over Haskell is that if the type system really doesn't like what you're doing, you can just declare something as Any and do all the type checking at runtime. Haskell is fundamentalist about type checking, and if the type checker doesn't like your perfectly valid code, there's nothing you can do about that.

What Scala notably lacks is union types, which are extremely necessary for such basic things like parsing JSON. Scala 3 plans to add union types. And speaking of parsing JSON...

Libraries

Scala doesn't come with any JSON library, which, in this day and age, is ridiculous.

It looks like there are two popular package managers for Scala - the Scala-specific sbt and the more generic Gradle. Both are convoluted enough that I'll just pass on this whole mess to keep this post a reasonable size.

That's not a Scala specific issue, the whole JVM world suffers from extremely convoluted package management, and lacks anything comparable to rubygems or npm or pip.

This is sort of acceptable for bigger projects, as all that sbt or gradle setup will be a tiny part, but for small ones, it's a huge pain.

This might have been acceptable 10 years ago; nobody should accept it today.

Functional Programming

Basic functional programming patterns work just as you'd expect. There's the placeholder argument syntax _, similar to Perl's $_ or Kotlin's it.

val alist = List(1, 2, 3, 4, 5)

println(alist.map{ x => x * 2 })
println(alist.map{ _ * 2 })
println(alist.filter{ _ % 2 == 1 })
println(alist.reduce{ (a, b) => a + b })

JVM Interoperability

Scala can call any JVM code, so in theory, it should have access to the entire JVM ecosystem, right? Well, in practice if you actually try to do that, it will look like ass:

import java.awt._
import java.awt.event._
import javax.swing._

var clicks = 0

val f = new JFrame()

f.setLayout(new GridLayout(2, 1))
f.setSize(300, 300)

val l = new JLabel("0")
l.setHorizontalAlignment(SwingConstants.CENTER)
f.add(l)

val b = new JButton("click me")
f.add(b)
b.addActionListener(new ActionListener {
  def actionPerformed(e: ActionEvent) : Unit = {
    clicks += 1
    l.setText(s"${clicks} clicks")
  }
})

f.setVisible(true)

Click Counter

So in practice you'll be using Scala wrappers, like this one.

How bad it is when you don't have a Scala wrapper depends on the library, but Java and Scala diverge semantically far more than, let's say, Java and Kotlin.

Should you use Scala?

I'd recommend against it.

If you don't need the JVM, Scala is not for you. Scala is a ridiculously overcomplicated language, in particular with a ridiculously overcomplicated type system, somehow still lacking basic functionality like JSON parsing or a modern package manager, with a terrible track record of breaking backwards compatibility (and Scala 3 coming soon to break it even harder), a highly fragmented ecosystem, and apparently a lot of maintainer drama on top of it (I didn't look too closely at that).

If you specifically need something that runs on a JVM, it's a closer call, but I'd still not recommend it. Scala can use JVM libraries, but due to semantic mismatch, it will be quite awkward. And apparently the biggest JVM ecosystem - Android - isn't even really supported by Scala. For the "better Java" role Scala was aiming at, Kotlin just does it much better. If you're more flexible about your JVM language, one of Clojure, Groovy, or JRuby might be a better choice.

On the other hand, if you need something that runs on a JVM, but you don't care about Android, don't actually need to use too many JVM libraries (only the popular ones that mostly have Scala wrappers), actively want Scala's ridiculously overcomplicated type system, aren't terribly bothered by backwards compatibility issues, and so on - then Scala might actually work for you. I don't expect that to be too many people.

Code

All code examples for the series will be in this repository.

Code for the Scala episode is available here.

Permalink

Recurse Center: Week Two

This post is about my second week at the Recurse Center.

Clojure

One of my primary goals at the Recurse Center is to learn how to write programs in Clojure. To this end, I had spent the first week familiarising myself with the basic syntax of Clojure. This week, I could delve deeper and wrote some not-so-simple programs.

Before the start of the retreat, I had asked for suggestions for getting started with Clojure. Kapil Reddy was kind enough to point me to an on-boarding course, clojure-by-example, that he, Aditya Athalye and a few others had designed.

The week began with a long and super helpful Clojure pairing session with Aditya, who gave me a detailed walk-through of clojure-by-example. Until then, I had only read the first three chapters of Clojure for the Brave and True. While I had a theoretical overview of some of the basic semantics of Clojure, this pairing session gave me a more hands-on take on how to read, evaluate and write expressions in Clojure. This was the perfect start to my week!

During the rest of the week, I concentrated on completing the next three chapters of Clojure for the Brave and True (Core Functions in Depth, Functional Programming, and Organizing Your Project: A Librarian’s Tale). Learning about the sequence abstraction was truly fascinating:

It doesn’t matter how a particular data structure is implemented: when it comes to using seq functions on a data structure, all Clojure asks is “can I first, rest, and cons it?” If the answer is yes, you can use the seq library with that data structure. (source)

While reading Clojure for the Brave and True, I learnt about some of the key concepts of functional programming. Specifically, reading about pure functions and their use-cases was a lot of fun! I would like to spend some time reading more on concepts (like ‘referential transparency’ and ‘functional composition’) that are fundamental to functional programming. Also, reading about higher-order functions was very enjoyable, as it reminded me of my early adventures with map, reduce and filter.

I had struggled to visualise the importance of lazy sequences while writing programs. So, I ended up spending a considerable amount of time playing around with lazy sequences. In the end, I published a blog post (with a cheeky title) summarising my findings.

As reading several pages of Clojure for the Brave and True can get a bit monotonous, I would often take a detour to implement some of the examples and concepts explained in the book. It was particularly satisfying to take up Daniel’s challenge in Chapter 4:

If you want an exercise that will really blow your hair back, try implementing map using reduce and then do the same for filter and some after you read about them later in this chapter.

Here’s my implementation of map-using-reduce:

(defn map-using-reduce
  [map-fn xs]
  (seq (reduce (fn f
                 [aggregator new-val]
                 (conj (vec aggregator) (map-fn new-val)))
               (empty xs)
               xs)))

Here’s filter-using-reduce:

(defn filter-using-reduce
  [filter-fn xs]
  (seq (reduce (fn
                 [aggregator new-val]
                 (if (filter-fn new-val)
                   (conj (vec aggregator) new-val)
                   aggregator))
               (empty xs)
               xs)))

And, some-using-reduce:

(defn some-using-reduce
  [some-fn xs]
   (reduce (fn [aggregator new-val]
                  (if-not aggregator
                    (if (some-fn new-val)
                      (some-fn new-val)
                      aggregator)
                    aggregator))
                nil
                xs))
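
And a quick sanity check of all three in the REPL:

user=> (map-using-reduce inc [1 2 3])
(2 3 4)
user=> (filter-using-reduce odd? [1 2 3 4])
(1 3)
user=> (some-using-reduce even? [1 3 4 5])
true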

Here’s my GitHub repository, hosting my white-board adventures with Clojure!

Also, as suggested by Aditya, I keep solving the 4Clojure problems mentioned at the end of each exercise of clojure-by-example. So far I have solved a handful of them; I intend to solve many more this week. Here’s where I write down my solutions to 4Clojure problems: https://github.com/oitee/clojure-by-example/blob/master/src/clojure_by_example/4-clojure-exercises.clj

Using NGINX to Use a Custom Domain

While most of the week was spent on Clojure, I could sneak out some time to complete a long-standing task related to Twirl. Twirl is a URL-shortening web app. As it is hosted on Heroku (free tier), the domain name of the app is pretty long (oteetwirl.herokuapp.com). This makes the short links generated by Twirl not so short after all. I used NGINX on a Google Cloud VM I had spawned recently to generate a custom short domain name for my application. Twirl is now accessible at: https://twirl.otee.dev. This makes the resultant short links shorter! Here’s a detailed account of the steps I followed to complete this task!

Others

  • I paired with Kanyisa, to continue working on building a REPL version of the Mastermind game. We think we would need one more session to complete this.
  • I attended a very insightful session hosted by Julia Evans on How DNS works and how it breaks.

Permalink

Who Moved My Cheese: Laziness in Clojure

In this post, I try to understand what lazy sequences are and how to create our own lazy sequence in Clojure.

Lazy Sequences in Clojure

Clojure reference explains laziness as:

Most of the sequence library functions are lazy, i.e. functions that return seqs do so incrementally, as they are consumed, and thus consume any seq arguments incrementally as well.

The important part of this is that many library functions that produce sequences (e.g. lists) do so incrementally, as they are consumed. This means that unless there is someone to consume the sequence, nothing really happens.

This can be particularly perplexing to the eyes of a beginner learning Clojure; for instance, it is not obvious why the following does not print anything:

(defn print-numbers
  [n]
  (map println (range n)) ;; .... (1)
  (println "Done printing: " n))

When this function was called with proper arguments, the REPL produced the following output:

butterfly.core=> (print-numbers 9)
Done printing:  9
nil

It turns out that the function print-numbers produces a lazy sequence on line 1 (as indicated in the code-block above), i.e., the call to map produces a lazy sequence. As no one is consuming the sequence returned by map, it is never really realised. For this reason, the println inside the map is never executed. This is what lazy evaluation means.

This can be fixed in any of the following ways.

Using mapv

Use a non-lazy version of map, i.e., mapv

(defn print-numbers
  [n]
  (mapv println (range n))
  (println "Done printing: " n))

Using map + doall

Wrap map with doall which realises a lazy sequence

(defn print-numbers
  [n]
  (doall (map println (range n)))
  (println "Done printing: " n))

Using doseq

Instead of generating a sequence, we can use doseq which acts on each element of a sequence

(defn print-numbers
  [n]
  (doseq [i (range n)]
    (println i))
  (println "Done printing: " n))

Using run!

A better version (for the present example) than using doseq is to use run! which applies a given function on every element of a sequence, without generating another sequence (unlike map)

(defn print-numbers
  [n]
  (run! println (range n))
  (println "Done printing: " n))

Generating a Lazy Sequence 💪🏼

Let us now try to generate an infinite lazy sequence of prime numbers.

First, we write a function which returns true if a number is prime:

(defn is-prime?
  [n]
  (not-any? (fn [factor]
              (zero? (mod n factor)))
            (range 2 (dec n))))
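
A couple of quick checks:

butterfly.core=> (is-prime? 7)
true
butterfly.core=> (is-prime? 9)
false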

Let us now define an infinite sequence of prime numbers:

(def infinite-primes
  (filter is-prime? (drop 2 (range))))

This infinite-primes var applies a filter operation to an infinite sequence of numbers. Because filter and range both produce lazy sequences, this will not hang our program. In fact, this is the power of lazy sequences: we can compute as many prime numbers as we need, when we need them, without having to know a priori how many we may need.

Thus, all of the following generates prime number sequences of varying lengths:

user=> (take 5 infinite-primes)
(2 3 5 7 11)
user=> (take 10 infinite-primes)
(2 3 5 7 11 13 17 19 23 29)
user=> (take 100 infinite-primes)
(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 379 383 389 397 401 409 419 421 431 433 439 443 449 457 461 463 467 479 487 491 499 503 509 521 523 541)

Constructing a Lazy Sequence from Scratch 🧙🏼

In the above part, we generated a lazy sequence by leveraging the library functions filter and range both of which are lazy.

In order to construct a lazy sequence from scratch, we can use lazy-seq. But before writing a lazy-primes function, let us write an eager-primes function that constructs a sequence of prime numbers from scratch:

(defn eager-primes
  ([n] (eager-primes n 2 []))
  ([n curr xs]
   (if (= (count xs) n)
     xs
     (if (is-prime? curr)
       (recur n (inc curr) (conj xs curr)) 
       (recur n (inc curr) xs)))))

Here’s how eager-primes can be used:

butterfly.core=> (eager-primes 8)
[2 3 5 7 11 13 17 19]

This is not a lazy sequence, because irrespective of how many primes are consumed, eager-primes will always generate n prime numbers, no more, no less.

So, in order to make it lazy, first we need to get rid of n, so that the production of primes depends on how many are consumed and not on a pre-defined number. For example, here we need only 10 primes, but eager-primes will still generate 100 primes

(take 10 (eager-primes 100))

In order to make it a lazy sequence, we need to stop the recursion. The way to do this is to wrap the recursive call in lazy-seq. This tells Clojure that if anyone is consuming from this sequence, this is the step to repeat. In other words, if we have a pause/resume functionality around recur, we have essentially generated a lazy sequence. But, in practice, we cannot actually use recur, because when we wrap lazy-seq around the recursive call, the recursive call is no longer a tail call. So, we have to make the recursive call by using the function name.

(defn primes
  ([] (primes 2))
  ([curr]
   (if (is-prime? curr)
     (cons curr (lazy-seq (primes (inc curr))))
     (lazy-seq (primes (inc curr))))))

Output:

butterfly.core=> (def all-primes (primes))
#'butterfly.core/all-primes

butterfly.core=> (take 5 all-primes )
(2 3 5 7 11)

butterfly.core=> (take 10 all-primes )
(2 3 5 7 11 13 17 19 23 29)

butterfly.core=> (take 15 all-primes )
(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47)

Important things to note:

  1. Why we must use cons: cons prepends an element to the front of a sequence, without traversing the rest of the sequence. This is important because, if we are generating an infinite sequence, we cannot traverse the sequence fully. In contrast, conj adds a new element at the natural insertion point of a concrete, realised collection (the end, in the case of a vector), which, by definition, requires the collection to be realised. This is why we must use cons instead of conj.

  2. Use of lazy-seq: Whenever Clojure sees lazy-seq, it stops evaluation until someone realises it. This means that a cons onto a lazy sequence is a sequence which has not really been computed. It will be computed when someone traverses that list, i.e., realises it. The way this works is that cons points to an element and a lazy sequence which, again, includes a cons cell that points to an element and another lazy sequence. The Clojure environment evaluates these lazy-seqs on demand as the traversal progresses. Therefore, it only evaluates that part of the sequence which has been traversed at least once.
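
A tiny REPL sketch (with a hypothetical sequence s of my own) makes both the deferral and the caching visible:

user=> (def s (lazy-seq (println "realizing!") (cons 1 nil)))
#'user/s
user=> (first s)
realizing!
1
user=> (first s) ;; already realised: the body does not run again
1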

To visualize the above, let us try to understand how a lazy sequence is traversed upon realization. For example, when four elements are taken out of the lazy sequence, the following happens:

Every cons cell has a subsequent sequence, which is lazy. This means that it has not been computed. Therefore, only when we traverse that lazy sequence do we find another cons cell with the next prime number, followed by another lazy sequence. This can be thought of as the proverbial Russian dolls: only by opening the first doll do you see the next doll (and no more).

This traversal of lazy sequences and realisation (and caching of realised cons cells) is transparently done by Clojure, as they are consumed by some enclosing code. As a corollary, by virtue of the REPL having a print step, lazy sequences are realised on the spot if directly written in the REPL (i.e., the REPL tries to consume the entire lazy sequence in order to print it).

user=> (cons 1 (range))
;; this will never finish on the REPL

Who Moved My Cheese: Why laziness is not a bad thing

In the book ‘Who Moved My Cheese’, the author says that the lazy mouse ultimately loses out on life because when the going gets tough, the hard-working mouse (eager mouse) finds a solution and the lazy one, out of sheer laziness, perishes.

However, in a modern-day language like Clojure, a lazy sequence can prove to be very useful, because some problems, like the prime numbers sequence, are by definition infinite. If we built a web-page that displayed a paginated result of prime numbers, any language that did not implement a lazy sequence of prime numbers would have to re-evaluate all the prime numbers for every page. This means that to generate the 500th prime number, we would have to generate 499 prime numbers and then the 500th one. Consequently, to generate the 501st prime number, we would have to re-generate the first 500 prime numbers all over again (as seen in the eager-primes example above). In Clojure this would not be required, because we maintain only one sequence of prime numbers whose elements are realised as required, so no recomputation would be necessary.
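
To sketch this with the infinite-primes sequence from earlier (assuming 3571 and 3581 are indeed the 500th and 501st primes):

user=> (nth infinite-primes 499) ;; realises and caches the first 500 primes
3571
user=> (nth infinite-primes 500) ;; only one new prime is computed; the rest come from the cache
3581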

Edit: In an earlier version, I had mentioned that a lazy sequence can provide efficiency. This is not accurate: a lazy sequence does not improve the inherent algorithmic efficiency of an expression. (However, caching of a lazy sequence does prevent recomputations). Thanks to Aditya Athalye, for pointing this out!

Permalink

Reading the present moment

This is an experiment I am doing about introducing a bit of self-referential stuff in Chapter 13 of "Data-Oriented Programming".

I was inspired by the "Gödel, Escher, Bach" masterpiece. I'm not sure yet if it will make it into the official version of the book, though. It depends on your feedback.

Throughout the book, Joe — a senior Clojure developer — reveals the secrets of Data-Oriented Programming to Theo and Dave — two fellow developers — who get quite excited about this new paradigm.

In Chapter 13, Dave tests a piece of code he wrote, using as an example the book "Data-Oriented Programming" written by yours truly.

var yehonathan = {
  "name": "Yehonathan Sharvit",
  "bookIsbns": ["9781617298578"]
};

Author.myName(yehonathan, "html");
// → "<i>Yehonathan Sharvit</i>"

And that’s how the self-referential fun begins…​

Please read this article on a device with a wide screen, like a desktop or a tablet. I don’t think it renders well on a mobile phone.

When Theo comes to Dave’s desk to review his implementation of the "list of authors" feature, he asks him about the author that appears in the test of Author.myName.

THEO: Who is Yehonathan Sharvit?

DAVE: I don’t really know. The name appeared when I googled for "Data-Oriented Programming" yesterday. He wrote a book on the topic. I thought it would be cool to use its ISBN in my test.

THEO: Does his book present DOP in a similar way to what Joe taught us?

DAVE: I don’t know. I guess I’ll find out when I receive the print book I ordered.

A few days later, Dave walks to Theo’s cube holding a package. Dave opens the package and they take a look at the cover together.

THEO: Wow, that’s-- that’s…​ odd. The woman on the cover - she’s so familiar. I could swear she’s the girl my grandparents knew from this Greek island called Santorini. My grandparents were born there, speak often of their childhood friend and have a photo of her. But how could a girl from their little island wind up on the cover of this book?

DAVE: That’s so cool!

Dave opens the book with Theo looking over his shoulder. They scan the table of contents.

THEO: It looks like this book covers all the same topics Joe taught us.

DAVE: This is great!

Dave leafs through a few random sections. His attention is caught by a bit of dialog.

DAVE: Theo, this is so strange!

THEO: What?

DAVE: The characters in Sharvit’s book have the same names as ours!

THEO: Let me see…​

Theo turns to a page from the first chapter. He and Dave read this passage side by side.

Data-Oriented Programming: Chapter 1

THEO: Hey Dave! How’s it going?

DAVE: Today? Not great. I’m trying to fix a bug in my code! I can’t understand why the state of my objects always changes. I’ll figure it out though, I’m sure. How’s your day going?

THEO: I just finished the design of a system for a new customer.

DAVE: Cool! Would it be OK for me to see it? I’m trying to improve my design skills.

THEO: Sure! I have the diagram on my desk. We can take a look now if you like.

DAVE: I remember this situation. It was around a year ago, just a few weeks after I had joined Albatross.

Theo’s face turns pale.

THEO: I don’t feel well.

Theo gets up to splash cold water on his face. When he comes back, still pale, but in better control of his emotions, he tries to remember the situation described in the first chapter of the book.

THEO: Was it when I showed you my design for Klafim prototype?

DAVE: Exactly! I was quite impressed by your class hierarchy diagrams.

THEO: Oh no! Don’t remind me of that time. The long hours of work on such a complex OOP system gave me nightmares.

DAVE: I remember it as a fun period. Every week I was learning a new technology: GraphQL, Elasticsearch, DataDog, Bigtable, Spring, Express…​

THEO: Luckily, I met Joe a few days later.

DAVE: Apropos Joe, you never told me exactly how you met him.

THEO: Well now you’ll know everything. The meeting is recounted quite accurately at the beginning of Chapter 2.

Dave reads a few lines in the beginning of Chapter 2.

Data-Oriented Programming: Chapter 2

The next morning, Theo asks on Hacker News and on Reddit for ways to reduce system complexity and build flexible systems. Some folks mention using different programming languages, others talk about advanced design patterns. Finally, Theo’s attention gets captured by a comment from a user named Joe who mentions "Data-Oriented programming" and claims that its main goal is to reduce system complexity. Theo has never heard this term before. Out of curiosity he decides to contact Joe by email.

What a coincidence! Joe lives in San Francisco too. Theo invites him to a meeting in his office.

Joe is a 40-year old developer. He’d been a Java developer for nearly a decade before adopting Clojure around 7 years ago.

When Theo tells Joe about the Library Management System he designed and built, and about his struggles to adapt to changing requirements, Joe is not surprised.

DAVE: The book doesn’t say if it was on Hacker News or on Reddit that you exchanged with Joe.

THEO: I remember it very well: It was on Reddit. In the "r/clojure" community.

While they talk, Dave leafs through the pages of the book and comes across a curious passage from Chapter 15…​

Data-Oriented Programming: Chapter 15

DAVE: I get that. But what happens if the code of the function modifies the data that we are writing. Will we write the original data to the file, or the modified data?

THEO: I’ll let you think about that while I get a cup of coffee at the museum coffee shop. Would you like one?

DAVE: Yes, an espresso please.

THEO: I have a weird sensation of déjà lu.

DAVE: Me too.

DAVE: Do you know what déjà lu means?

THEO: No. But it sounds like it’s related to déjà vu.

Dave and Theo sit quietly, pondering the meaning of "déjà lu" and the bigger puzzle of this weird book.

DAVE: That’s it! I think I got the hang of it.

Dave shows Theo the result from Google translate with the "Detect language" option activated.

DAVE: In French, "déjà lu" means "already read".

THEO: Do you think that the author is French?

DAVE: Probably. That would explain some odd turns of phrase I’ve noticed here and there in the way the characters express themselves.

THEO: But of course! At least we have found a point on which we are not identical to the characters in this book.

DAVE: Anyway, a déjà lu must be when you live through a situation that you have already read in a book.

THEO: But I don’t think we’ve ever been together at a museum!

DAVE: Me neither. Could this book be telling not only the past but also the future?

THEO: A future that we will already know when it will happen since we are now reading it.

Dave and Theo together:

 — A déjà lu!

DAVE: This book tells our past and our future. I wonder if it also tells our present.

THEO: What chapter do you think we would be at the moment?

DAVE: Let’s see. At the end of Chapter 12, there’s a beautiful drawing of the JSON schema cheatsheet we made together last week. It means, that we should now be in Chapter 13.

Dave slowly turns the pages of the book, until he finds the line that tells the present moment.

Data-Oriented Programming: Chapter 13

DAVE: This book tells our past and our future. I wonder if it also tells our present.

THEO: What chapter do you think we would be at the moment?

DAVE: Let’s see. At the end of Chapter 12, there’s a beautiful drawing of the JSON schema cheatsheet we made together last week. It means, that we should now be in Chapter 13.

Dave slowly turns the pages of the book, until he finds the line that tells the present moment.

Data-Oriented Programming: Chapter 13

DAVE: This book tells our past and our future. I wonder if it also tells our present.

THEO: What chapter do you think we would be at the moment?

DAVE: Let’s see. At the end of Chapter 12, there’s a beautiful drawing of the JSON schema cheatsheet we made together last week. It means, that we should now be in Chapter 13.

Dave slowly turns the pages of the book, until he finds the line that tells the present moment.

Data-Oriented Programming: Chapter 13

DAVE: This book tells our past and our future. I wonder if it also tells our present.

THEO: What chapter do you think we would be at the moment?

DAVE: Let’s see. At the end of Chapter 12, there’s a beautiful drawing of the JSON schema cheatsheet we made together last week. It means, that we should now be in Chapter 13.

Dave slowly turns the pages of the book, until he finds the line that tells the present moment.

THEO: Dave! This is freaking me out! I think we should close this book immediately and forget all about it.

DAVE: I can’t. I’m too curious to discover my future.

THEO: You’ll have to do it without me. Joe told us many times we should never mess with the state of a system.

DAVE: Wait! It’s true that Joe taught us the merits of immutability. But that only concerns the past state of a system. He never said we didn’t have the right to mutate our future!

THEO: You mean that reading beyond Chapter 13 won’t necessarily lock us in a predefined scenario?

DAVE: I hope so!

Hoping to stay in control of their destiny, Theo and Dave start reading Chapter 14 of "Data-Oriented Programming".

Please share your thoughts about this self-referential stuff by replying to this tweet.

Did you enjoy this self-referential stuff in Chapter 13?

Do you think it’s a good idea to include this self-referential stuff in the book?

How would you make it better?

Permalink

Copyright © 2009, Planet Clojure. No rights reserved.
Planet Clojure is maintained by Baishampayan Ghose.
Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
Theme by Brajeshwar.