Clojure Deref (Apr 5, 2025)

Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem (feed: RSS). Thanks to Anton Fonarev for link aggregation.

Libraries and Tools

New releases and tools this week:

  • core.async 1.8.735 - Facilities for async programming and communication in Clojure

  • splint 1.20.0 - A Clojure linter focused on style and code shape

  • big-config - An alternative to traditional configuration languages

  • valhalla 2025.3.29 - A ClojureScript focused validation library

  • futurama 1.3.1 - Futurama is a Clojure library for more deeply integrating async abstractions with core.async

  • atproto-clj - ATProto Clojure SDK

  • snitch 0.1.16 - Snitch is inline-defs on steroids

  • replicant 2025.03.27 - A data-driven rendering library for Clojure(Script) that renders hiccup to DOM or to strings

  • flow-storm-debugger 4.3.0 - A debugger for Clojure and ClojureScript with some unique features

  • desiderata 1.1.0 - Things wanted or needed but missing from clojure.core

  • process 0.6.23 - Clojure library for shelling out / spawning sub-processes

  • edamame 1.4.29 - Configurable EDN/Clojure parser with location metadata

  • pact 1.0.8 - clojure.spec to json-schema generation library

  • clay 2-beta39 - A tiny Clojure tool for dynamic workflow of data visualization and literate programming

  • noj 2-beta14 - A clojure framework for data science

  • clojure-cli-config 2025-04-02 - User aliases and Clojure CLI configuration for deps.edn based projects

  • pod-babashka-instaparse 0.0.5 - A pod exposing Instaparse to babashka

  • game-of-life-cljs - Conway’s Game of Life implemented in ClojureScript

Permalink

Clojure Is Awesome!!! [PART 20]

Deep Dive into Clojure's reduce Function

What is reduce?

In Clojure, reduce is a fundamental function in functional programming that processes a collection by iteratively applying a function to an accumulator and each element, "reducing" the collection into a single result. Its basic signature is:

(reduce f init coll)
  • f: A function of two arguments: the accumulator (the running result) and the next element.
  • init: The initial value of the accumulator (optional).
  • coll: The collection to process.

If init is omitted, reduce uses the first element of the collection as the initial accumulator and starts processing from the second element. This flexibility makes reduce a powerful tool for a wide range of tasks.
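Both arities can be seen with a quick example:

```clojure
;; With an explicit initial value:
(reduce + 0 [1 2 3 4])   ;; => 10

;; Without init, (first coll) seeds the accumulator and
;; reduction starts from the second element:
(reduce + [1 2 3 4])     ;; => 10

;; With an empty collection and no init, f is called with
;; no arguments, so (+) returns 0:
(reduce + [])            ;; => 0
```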

Why reduce Matters

Unlike imperative loops (for or while in other languages), reduce offers a functional approach to iteration and accumulation. It can:

  • Aggregate data (e.g., summing numbers).
  • Transform collections into new structures (e.g., building maps or lists).
  • Handle complex logic in a clean, declarative way.

Let’s explore its capabilities with examples, starting simple and moving to a sophisticated real-world scenario.

Basic Examples

Here are some straightforward uses of reduce to illustrate its mechanics:

1. Calculate the Factorial of a Number

(defn factorial [n]
  (reduce * 1 (range 1 (inc n))))

(factorial 5) ;; => 120

2. Build a Sentence from Words

(def words ["Clojure" "Is" "Awesome!!!"])
(reduce (fn [sentence word] (str sentence " " word)) (first words) (rest words))

3. Find the Oldest Person

(def people [{:nome "Andre" :idade 30}
             {:nome "Borba" :idade 28}
             {:nome "John" :idade 35}])

(reduce (fn [oldest person]
          (if (> (:idade person) (:idade oldest))
            person
            oldest))
        (first people)
        (rest people))

4. Count the Frequency of Elements

(def fruits ["Apple" "Banana" "Apple" "Orange" "Banana" "Apple"])
(reduce (fn [freq fruit]
          (update freq fruit (fnil inc 0)))
        {}
        fruits)

5. Apply Sequential Discounts

(def discounts [0.1 0.05 0.2])
(def original-price 100)
(reduce (fn [price discount] (* price (- 1 discount))) original-price discounts)

These examples show how reduce adapts to different operations simply by changing the reducing function f.

A Sophisticated Real-World Example: Analyzing User Activity Logs

Let’s move beyond trivial examples to a professional, high-value use case. Imagine you’re a data engineer working with a system that tracks user activity logs for a web application. You need to process a collection of log entries to calculate key metrics (total session time, page views per user) and generate a summary report. Here’s how reduce can tackle this:

Scenario

You have a list of log entries, where each entry is a map containing:

  • :user-id: The user’s unique identifier.
  • :timestamp: When the action occurred (in milliseconds).
  • :action: The type of action (:login, :page-view, :logout).

Your goal is to:

  • Calculate the total session duration per user (time between :login and :logout).
  • Count the number of page views per user.
  • Produce a structured summary.

Data

(def activity-logs [{:user-id "u1" :timestamp 1698000000000 :action :login}
                    {:user-id "u1" :timestamp 1698000001000 :action :page-view}
                    {:user-id "u1" :timestamp 1698000002000 :action :page-view}
                    {:user-id "u1" :timestamp 1698000005000 :action :logout}
                    {:user-id "u2" :timestamp 1698000003000 :action :login}
                    {:user-id "u2" :timestamp 1698000004000 :action :page-view}
                    {:user-id "u2" :timestamp 1698000007000 :action :logout}])

Solution
We’ll use reduce to process the logs and build a summary map where each user’s data includes session duration and page view count.

(defn process-logs [logs]
  (reduce
    (fn [acc {:keys [user-id timestamp action]}]
      (case action
        :login
        (assoc-in acc [user-id :start-time] timestamp)

        :logout
        (let [start-time (get-in acc [user-id :start-time])
              duration (when start-time (- timestamp start-time))
              page-views (get-in acc [user-id :page-views] 0)]
          (-> acc
              (assoc-in [user-id :duration] duration)
              (assoc-in [user-id :page-views] page-views)
              (update user-id dissoc :start-time)))

        :page-view
        (update-in acc [user-id :page-views] (fnil inc 0))

        acc))
    {}
    logs))

(process-logs activity-logs)
;; => {"u1" {:page-views 2, :duration 5000},
;;     "u2" {:page-views 1, :duration 4000}}

Explanation

  • Accumulator: A map where each key is a user-id and the value is a nested map tracking :start-time, :duration, and :page-views.

  • Logic:

    • :login: Stores the start timestamp.
    • :page-view: Increments the page view count (using fnil to handle nil as 0).
    • :logout: Calculates the session duration, finalizes the page view count, and cleans up the temporary :start-time.
  • Result: A concise summary of user activity, ready for reporting or further analysis.

Conclusion

Clojure’s reduce is a Swiss Army knife for functional programming. From simple aggregations to complex state transformations like processing user activity logs, it provides a clean, powerful way to handle collections. Its elegance lies in its ability to abstract away explicit loops while delivering high-value results in real-world applications.

How might you use reduce in your projects? Whether it’s analyzing logs, transforming data, or computing metrics, it’s a tool worth mastering.

Permalink

Why Clojure Developers Love the REPL So Much

Many programming languages offer a REPL (Read-Eval-Print Loop). It’s a simple yet powerful concept: read user input, evaluate it, print the result, and loop. This provides a crucial way to experiment with code and get immediate feedback, ensuring functions work together correctly. This interactivity is particularly potent in functional (ideally immutable) languages like Clojure, where side effects are managed, making repeated evaluation predictable. While many languages have REPLs, their significance and capabilities vary widely. This article delves into why the REPL is so deeply valued in the Clojure community, examining its Lisp heritage, highlighting its core strengths for the daily workflow, and considering its role in future development.

Historical Roots: Lisp as the Pioneer

The REPL concept originated with Lisp back in 1964. Though the acronym took time to catch on, REPLs are now found in languages far removed from Lisp, like Python or Ruby. However, implementation details greatly influence how central the REPL is to a language’s workflow. To fully appreciate Clojure’s REPL, it’s helpful to first look at its ancestor, often considered the gold standard: Common Lisp.

The Power of Common Lisp: Peak Interactive Debugging

Common Lisp (CL) arguably boasts the most capable REPL implementation, primarily due to its deeply integrated, interactive debugger. When an error occurs during execution in CL, the program pauses. The debugger allows the developer to inspect the full program state, interactively modify the code at the point of error, change variable values (aided by CL’s mutability), and then resume execution without needing a full restart. This “stop-and-fix” capability, combined with profound introspection features and tooling like SLIME/Sly, creates a really productive environment. The scope of what’s possible from the REPL is exceptionally wide. For instance, one can even start a complete rebuild of the Common Lisp system itself. While Smalltalk offers similar interactive capabilities, its tight integration with its own environment makes its REPL less universally accessible than Lisp’s. The power of the CL REPL is something developers in many other languages can only envy.

Clojure’s REPL: Interactive Development in Practice

Clojure, as a Lisp dialect running on platforms like the JVM, inherits introspection capabilities similar to CL. However, its standard, built-in REPL offers a different, and in the specific aspect of integrated ‘stop-and-fix’ debugging, a weaker approach compared to Common Lisp. When a runtime error occurs in Clojure, the current evaluation typically stops, and a stack trace is printed. You usually cannot interactively manipulate the call stack or resume from the error point within the standard REPL itself. Languages like Python (with IPython) or Ruby (with Pry) might offer built-in tools that feel closer to CL’s interactive debugging model.

Despite lacking the built-in condition system of CL, the Clojure REPL is fundamental to the development workflow due to numerous other strengths, often augmented by robust tooling:

  1. Live Code Modification & Instant Feedback: You can redefine functions and variables (def, defn) in a running application, and subsequent calls will use the new definition instantly. This “hot reload” eliminates constant restarts, allowing developers to incrementally build and test code. You can break problems into small pieces, test each in the REPL, feed functions various data, and see results immediately. This drastically reduces the edit-compile-run cycle.
  2. Exploration, Learning & Data Interaction: The REPL is a very effective tool for learning the language or exploring new libraries. You can try out functions, inspect data structures, trace execution flow using simple tools like clojure.tools.trace (or the more advanced ones mentioned later), and understand behavior without needing to set up a full project. Clojure’s focus on data structures (maps, vectors) makes the REPL particularly useful for easily inspecting and manipulating data – often just printing the data is enough, far simpler than complex debugger watches in other languages.
  3. Reduced Cognitive Load: By avoiding restarts and maintaining the application’s state, the REPL helps developers stay focused on the problem, reducing the mental overhead of context switching associated with traditional development cycles.
  4. Introspection Tools: The clojure.repl namespace offers functions like doc (show documentation), source (show source code), apropos (find vars by regex), and find-doc (search documentation). Special variables like *1, *2, *3 hold recent results, and *e holds the last exception (useful with (pst *e) to print the stack trace). Helpers for Java interop (clojure.java.javadoc/javadoc, clojure.reflect/reflect) and namespace management are also readily available. Clojure 1.12+ even allows dynamically adding dependencies (add-lib) to a running REPL. More information about these REPL features is available here.
  5. Client-Server Architecture & Remote Connection: Clojure REPLs typically run as servers, allowing development tools (editors, dedicated clients) to connect. This architecture enables connecting to a REPL running in a remote process (e.g., staging or even production, with caution), allowing for live inspection and debugging of deployed applications from your local machine.
  6. Extensibility & Community Tools: The ecosystem provides valuable tools that deeply integrate with the REPL, significantly enhancing the experience. Tools like Portal, Reveal, and Morse offer sophisticated, interactive data visualization and navigation. Flow-storm provides advanced tracing and debugging capabilities, letting you step through code execution visually. Datafy/Nav protocol enables these tools to understand and navigate relationships within your data (e.g., following database foreign keys directly from query results shown in Portal). These tools work in synergy with the REPL.
  7. Scripting & Operational Tasks: Beyond development, the REPL is great for quick administrative scripts, one-off data migrations, or inspecting system state without writing and deploying separate utility code.

The “REPL-Driven Development” (RDD) Philosophy

Clojure culture strongly embraces “REPL-Driven Development.” This means not just using the REPL, but actively designing systems to be easily interacted with via the REPL. It involves writing small, testable functions, composing them, and constantly evaluating them in the context of the running application. This philosophy encourages managing application state in ways that allow modification from the REPL without restarts. This interactive state management is so central that libraries like Component or Integrant exist to help structure applications around this principle. Running tests directly from the REPL (sometimes automatically on code changes) is also a common practice, providing continuous feedback. The REPL is thus a central part of the design and development loop.

The REPL and the Future with AI

With the rise of Large Language Models (LLMs), the REPL offers another compelling advantage. Its interactive and introspective nature makes it an ideal interface for AI coding assistants. Tools like nrepl-mcp-server allow AI agents supporting protocols like MCP to connect to a running Clojure REPL. The agent can then inspect the live environment, call existing functions, evaluate generated code snippets interactively, and verify results before suggesting or making changes to source files. This provides a much tighter and more efficient feedback loop compared to the typical generate-save-compile-run-check cycle necessary in many other languages. While LLMs might currently find generating idiomatic Clojure challenging, the potential for REPL-assisted AI development, enabling more effective automated code modification, is immense.

In Summary

The Clojure REPL is loved because it fosters a highly productive and enjoyable development experience through:

  • Immediate Feedback: Instant results accelerate learning and iteration.
  • Interactive Exploration & Modification: Live interaction with the running system.
  • Powerful Introspection: Easy access to documentation, source code, and runtime state.
  • Straightforward Data Handling: Effortless inspection and manipulation of data structures.
  • Reduced Cognitive Load: Minimizes context switching, enabling focus.
  • Remote Capabilities: Debug and inspect deployed applications.
  • Rich Tooling Ecosystem: Extensible via community visualization and debugging tools.
  • Foundation for RDD: Enables an iterative and interactive design philosophy.
  • AI Integration Potential: Provides a superior interface for AI coding assistants.

The post Why Clojure Developers Love the REPL So Much appeared first on Flexiana.

Permalink

bb-fzf: A Babashka Tasks Picker

This post introduces a Babashka Tasks helper I put together to make it easier to use bb tasks interactively. Here's the bb-fzf repo link.

The main things the helper adds are:

  • the use of fzf for fuzzy task selection to help you type less
  • the ability to invoke bb tasks from any sub-directory in your repo
  • a bit of pretty-printing with the help of Bling

Babashka and fzf are two tools that are consistently part of my workflow, regardless of what I'm doing. I briefly wrote about them in my previous post, PhuzQL: A GraphQL Fuzzy Finder, where I combined them with Pathom to create a simple GraphQL explorer. In short, Babashka lets you write Clojure scripts with a fast startup time, and fzf is an interactive picker with fuzzy text matching to save you some keystrokes.

Babashka includes a tasks feature that is somewhat like a Makefile replacement, allowing you to define named tasks in a bb.edn file and then execute them with bb <task name> [optional params]. bb-fzf just lets you pick from your list of tasks interactively - think of it as an ergonomic autocomplete.
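For reference, a minimal hypothetical bb.edn with two named tasks might look like this (task names and bodies are invented for illustration):

```clojure
{:tasks
 {greet {:doc  "Print a greeting"
         :task (println "Hello from bb!")}
  clean {:doc  "Remove build output"
         :task (shell "rm -rf target")}}}
```

With a file like this in place, `bb greet` runs the task directly, and bb-fzf lets you pick from the defined tasks interactively.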

A quick demo showing:

  • simple task selection
  • argument support
  • output formatting (callouts, color, task selection, repo root)

bb-fzf Demo

Check out the README in the bb-fzf repo for more details on installation and usage. There's always room for improvement, but this is sufficient for my needs for now. In the future I might add a preview window and more robust argument handling with the help of babashka.cli.

Issues and PRs in the repo are welcome.

Permalink

Clojure Is Awesome!!! [PART 19]

We’ll dive into pattern matching using the core.match library in Clojure.

This powerful tool allows you to handle complex data structures in a declarative and concise way, making your code more readable and maintainable. Let’s explore what pattern matching is, how to use core.match, and why it’s a valuable addition to your Clojure toolkit, complete with practical examples.

What Is Pattern Matching?

Pattern matching is a technique where you match data against predefined patterns, extracting components or triggering actions based on the match. It’s a staple in functional programming languages like Haskell and Scala, and in Clojure, the core.match library brings this capability to the table. Unlike basic conditionals like if or cond, pattern matching lets you express "what" you want to happen rather than "how" to check conditions step-by-step. It’s especially useful for:

  • Parsing structured data (e.g., JSON or EDN).
  • Handling data variants or tagged unions.
  • Simplifying nested conditional logic.

Setting Up core.match

To get started, you’ll need to add core.match to your project. The latest version available is 1.1.0 (per the GitHub repository). Here’s how to include it:

  • For Leiningen (in project.clj): :dependencies [[org.clojure/core.match "1.1.0"]]

For deps.edn:
{:deps {org.clojure/core.match {:mvn/version "1.1.0"}}}

Once added, require it in your namespace:

(ns clojure-is-awesome.pattern-matching
  (:require [clojure.core.match :refer [match]]))

With the setup complete, let’s see core.match in action.

Basic Usage: Matching Values

The match macro takes an expression and pairs of patterns and results. Here’s a simple example matching numbers:

(match 42
  0 "Zero"
  1 "One"
  _ "Other")
;; => "Other"

The _ acts as a wildcard, matching anything. Now, let’s match a list:

(match [1 2 3]
  [1 2 3] "Exact match"
  [_ _ _] "Three elements"
  :else "Default")
;; => "Exact match"

You can also bind variables:

(match [1 2 3]
  [a b c] (str "Values: " a ", " b ", " c))

This is destructuring made more powerful and flexible.
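Literals and bindings can also be mixed in a single pattern:

```clojure
(require '[clojure.core.match :refer [match]])

;; The literal 1 must match exactly; b and c are bound to the rest:
(match [1 2 3]
  [1 b c] (str "Starts with 1, then " b " and " c)
  :else "No match")
;; => "Starts with 1, then 2 and 3"
```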

Practical Example: Parsing Commands

Let’s apply pattern matching to a real-world scenario—parsing commands. Suppose we have commands like [:add 5] or [:mul 2 3]:

(defn process-command [cmd]
  (match cmd
    [:add x] (str "Adding " x)
    [:sub x] (str "Subtracting " x)
    [:mul x y] (str "Multiplying " x " by " y)
    :else "Unknown command"))

(println (process-command [:add 5]))
(println (process-command [:mul 2 3]))

This is much cleaner than a chain of cond statements, and the intent is immediately clear.

Advanced Features: Guards and Nested Patterns

core.match goes beyond the basics with features like guards (:guard) and nested pattern matching. Here’s an example with a guard:

(match {:type :user :age 25}
  ({:type :user :age age} :guard #(> (:age %) 18)) "Adult user"
  {:type :user :age age} "Minor user"
  :else "Other")

The :guard #(> (:age %) 18) ensures the pattern only matches if the condition holds. Now, let’s tackle a nested structure:

(match {:data [{:id 1 :name "Borba"} {:id 2 :name "John"}]}
  {:data [{:id id :name name} & rest]} (str "First user: " id ", " name))
;; => "First user: 1, Borba"

The & rest captures remaining elements, showcasing how core.match handles complex data effortlessly.

Why Use core.match?

Compared to Clojure’s built-in cond or case, core.match offers:

  • Nested matching: Easily handle complex structures.
  • Variable binding: Extract values directly in patterns.
  • Guards: Add conditions for precise control.
  • Clarity: Replace verbose logic with declarative patterns.

It shines in scenarios like parsing DSLs, managing state machines, or simplifying conditional code.

Example: Traffic Light State Machine

(defn next-state [current]
  (match current
    :red {:state :green :action "Go"}
    :green {:state :yellow :action "Slow down"}
    :yellow {:state :red :action "Stop"}
    :else {:state :red :action "Invalid, stopping"}))

(println (next-state :red))
(println (next-state :yellow))

This declarative approach makes the state transitions crystal clear.

What do you think? Have you tried core.match in your projects? Let me know in the comments—I’d love to hear your experiences! Stay tuned for the next part of our series, where we’ll uncover more Clojure awesomeness.

Permalink

Massively scalable collaborative text editor backend with Rama in 120 LOC

This is part of a series of posts exploring programming with Rama, ranging from interactive consumer apps, high-scale analytics, background processing, recommendation engines, and much more. This tutorial is self-contained, but for broader information about Rama and how it reduces the cost of building backends so much (up to 100x for large-scale backends), see our website.

Like all Rama applications, the example in this post requires very little code. It’s easily scalable to millions of reads/writes per second, ACID compliant, high performance, and fault-tolerant from how Rama incrementally replicates all state. Deploying, updating, and scaling this application are all one-line CLI commands. No other infrastructure besides Rama is needed. Comprehensive monitoring on all aspects of runtime operation is built-in.

In this post, I’ll explore building the backend for a real-time collaborative text editor like Google Docs or Etherpad. The technical challenge of building an application like this is conflict resolution. When multiple people edit the same text at the same time, what should be the result? If a user makes a lot of changes offline, when they come online how should their changes be merged in to a document whose state may have diverged significantly?

There are many approaches for solving this problem. I’ll show an implementation of “operational transformations” similar to the one Google Docs uses as described here. Only incremental changes are sent back and forth between server and client, such as “Add text ‘hello’ to offset 128” or “Remove 14 characters starting from offset 201”. When clients send a change to the server, they also say what version of the document the change was applied to. When the server receives a change from an earlier document version, it applies the “operational transformation” algorithm to modify the addition or removal to fit with the latest document version.
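As a purely illustrative sketch (these keys are invented, not Rama's or the post's actual wire format), a versioned change message from client to server might look like:

```clojure
;; Hypothetical edit messages, each tagged with the document
;; version the client applied it against:
(def add-edit
  {:doc-id  42
   :version 3
   :action  {:type :add-text :offset 128 :content "hello"}})

(def remove-edit
  {:doc-id  42
   :version 3
   :action  {:type :remove-text :offset 201 :amount 14}})
```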

Code will be shown in both Clojure and Java, with the total code being about 120 lines for the Clojure implementation and 160 lines for the Java implementation. Most of the code is implementing the “operational transformation” algorithm, which is just plain Clojure or Java functions, and the Rama code handling storage/queries is just 40 lines. You can download and play with the Clojure implementation in this repository or the Java implementation in this repository.

Operational transformations

The idea behind “operational transformations” is simple and can be understood through a few examples. Suppose Alice and Bob are currently editing a document, and Alice adds "to the " at offset 6 when the document is at version 3, like so:

However, when the change gets to the server, the document is actually at version 4 since Bob added "big " to offset 0:

Applying Alice’s change without modification would produce the string “big heto the llo world”, which is completely wrong. Instead, the server can transform Alice’s change based on the single add that happened between versions 3 and 4 by pushing Alice’s change to the right by the length of "big ". So Alice’s change of “Add ‘to the ’ at offset 6” becomes “Add ‘to the ’ at offset 10”, and the document becomes:

Now suppose instead the change Bob made between versions 3 and 4 was adding " again" to the end:

In this case, Alice’s change should not be modified since Bob’s change was to the right of her addition, and applying Alice’s change will produce “hello to the world again”.

Transforming an addition against a missed remove works the same way: if Bob had removed text to the left of Alice’s addition, her addition would be shifted left by the amount of text removed. If Bob removed text to the right, Alice’s addition is unchanged.
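Restricting to the addition-against-missed-addition case, the shift rule can be sketched as a small helper (`shift-add` is illustrative, not the post's actual code):

```clojure
;; A missed add at or before our offset pushes our addition right
;; by the missed add's length; a missed add to the right leaves it
;; unchanged.
(defn shift-add [offset missed-offset missed-len]
  (if (<= missed-offset offset)
    (+ offset missed-len)
    offset))

;; Alice adds "to the " at offset 6; Bob's missed add was "big "
;; (4 chars) at offset 0, so her edit lands at offset 10:
(shift-add 6 0 4)    ;; => 10

;; If Bob had instead appended " again" at offset 11, to the right
;; of Alice's edit, her offset is unchanged:
(shift-add 6 11 6)   ;; => 6
```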

Now let’s take a look at what happens if Alice had removed text at an earlier version, which is slightly trickier to handle. Suppose Alice removes 7 characters starting from offset 2 when the document was “hello world” at version 3. Suppose Bob had added "big " to offset 0 between versions 3 and 4:

In this case, Alice’s removal should be shifted right by that amount to remove 7 characters starting from offset 6. Now suppose Bob had instead added "to the " to offset 6:

In this case, text had been added in the middle of where Alice was deleting text. It’s wrong and confusing for Alice’s removal to delete text that wasn’t in her version of the document. The correct way to handle this is to split her removal into two changes: remove 3 characters from offset 13 and then remove 4 characters from offset 2. This produces the following document:

The resulting document isn’t legible, but it’s consistent with the conflicting changes that Alice and Bob were making at the same time, without improperly losing any information.

Let’s take a look at one more case. Suppose again that Alice removes 7 characters from offset 2 when the document was “hello world” at version 3. This time, Bob had already removed one character starting from offset 3:

In this case, one of the characters Alice had removed was already removed, so Alice’s removal should be reduced by one to remove 6 characters from offset 2 instead of 7, producing:

There are a few more intricacies to how operational transformations work, but this gets the gist of the idea across. Since this post is really about using Rama to handle the storage and computation for a real-time collaborative editor backend, I’ll keep it simple and only handle adds and removes. The code can easily be extended to handle other kinds of document edits, such as formatting changes.

The Rama code will make use of two functions that encapsulate the “operational transformation” logic, one to apply a series of edits to a document and the other to transform an edit against a particular version against all the edits that happened until the latest version. These functions are as follows:

Clojure:
(defn transform-edit [edit missed-edits]
  (if (instance? AddText (:action edit))
    (let [new-offset (reduce
                       (fn [offset missed-edit]
                         (if (<= (:offset missed-edit) offset)
                           (+ offset (add-action-adjustment missed-edit))
                           offset))
                       (:offset edit)
                       missed-edits)]
      [(assoc edit :offset new-offset)])
    (reduce
      (fn [removes {:keys [offset action]}]
        (if (instance? AddText action)
          (transform-remove-against-add removes offset action)
          (transform-remove-against-remove removes offset action)))
      [edit]
      missed-edits)))

(defn apply-edits [doc edits]
  (reduce
    (fn [doc {:keys [offset action]}]
      (if (instance? AddText action)
        (setval (srange offset offset) (:content action) doc)
        (setval (srange offset (+ offset (:amount action))) "" doc)))
    doc
    edits))
Java:
public static List transformEdit(Map edit, List<Map> missedEdits) {
  int offset = offset(edit);
  if(isAddEdit(edit)) {
    for(Map e: missedEdits) {
      if(offset(e) <= offset) offset += addActionAdjustment(e);
    }
    Map newEdit = new HashMap(edit);
    newEdit.put("offset", offset);
    return Arrays.asList(newEdit);
  } else {
    List removes = Arrays.asList(edit);
    for(Map e: missedEdits) {
      if(isAddEdit(e)) removes = transformRemoveAgainstAdd(removes, e);
      else removes = transformRemoveAgainstRemove(removes, e);
    }
    return removes;
  }
}

private static String applyEdits(String doc, List<Map> edits) {
  for(Map edit: edits) {
    int offset = offset(edit);
    if(isAddEdit(edit)) {
      doc = doc.substring(0, offset) + content(edit) + doc.substring(offset);
    } else {
      doc = doc.substring(0, offset) + doc.substring(offset + amount(edit));
    }
  }
  return doc;
}

The “apply edits” function updates a document with a series of edits, and the “transform edit” function implements the described “operational transformation” algorithm. The “transform edit” function returns a list since an operational transformation on a remove edit can split that edit into multiple remove edits, as described in one of the cases above.

The implementation of the helper functions used by “apply edits” and “transform edit” can be found in the Clojure repository or the Java repository.

Backend storage

Indexed datastores in Rama, called PStates (“partitioned state”), are much more powerful and flexible than databases. Whereas databases have fixed data models, PStates can represent a limitless variety of data models because they are based on composing a simpler primitive: data structures. PStates are distributed, durable, high-performance, and incrementally replicated. Each PState is fine-tuned to what the application needs, and an application makes as many PStates as needed. For this application, we’ll make two PStates: one to track the latest contents of a document, and one to track the sequence of every change made to a document in its history.

Here’s the PState definition for the latest contents of a document:

Clojure:
(declare-pstate
  topology
  $$docs
  {Long String})
Java:
topology.pstate("$$docs", PState.mapSchema(Long.class, String.class));

This PState is a map from a 64-bit document ID to the string contents of the document. The name of a PState always begins with $$, and this one is equivalent to a key/value database.

Here’s the PState tracking the history of all edits to a document:

Clojure:
(declare-pstate
  topology
  $$edits
  {Long (vector-schema Edit {:subindex? true})})
Java:
topology.pstate("$$edits",
                PState.mapSchema(Long.class,
                                 PState.listSchema(Map.class).subindexed()));

This declares the PState as a map of lists, with the key being the 64-bit document ID and the inner lists containing the edit data. The inner list is declared as “subindexed”, which instructs Rama to store the elements individually on disk rather than reading and writing the whole list as one value. Subindexing enables nested data structures to have billions of elements and still be read and written extremely quickly. This PState can support many queries in less than one millisecond: get the number of edits for a document, get a single edit at a particular index, or get all edits between two indices.

Let’s now review the broader concepts of Rama in order to understand how these PStates will be materialized.

Rama concepts

A Rama application is called a “module”. In a module you define all the storage and implement all the logic needed for your backend. All Rama modules are event sourced, so all data enters through a distributed log in the module called a “depot”. Most of the work in implementing a module is coding “ETL topologies” which consume data from one or more depots to materialize any number of PStates. Modules look like this at a conceptual level:

Modules can have any number of depots, topologies, and PStates, and clients interact with a module by appending new data to a depot or querying PStates. Although event sourcing traditionally means that processing is completely asynchronous to the client doing the append, with Rama that’s optional. Because Rama is an integrated system, clients can specify that their appends should only return after all downstream processing and PState updates have completed.

A module deployed to a Rama cluster runs across any number of worker processes across any number of nodes, and a module is scaled by adding more workers. A module is broken up into “tasks” like so:

A “task” is a partition of a module. The number of tasks for a module is specified on deploy. A task contains one partition of every depot and PState for the module as well as a thread and event queue for running events on that task. A running event has access to all depot and PState partitions on that task. Each worker process has a subset of all the tasks for the module.

Coding a topology involves reading and writing to PStates, running business logic, and switching between tasks as necessary.

Implementing the backend

Let’s start implementing the module for the collaborative editor backend. The first step to coding the module is defining the depot:

Clojure:

(defmodule CollaborativeDocumentEditorModule
  [setup topologies]
  (declare-depot setup *edit-depot (hash-by :id))
  )

Java:

public class CollaborativeDocumentEditorModule implements RamaModule {
  @Override
  public void define(Setup setup, Topologies topologies) {
    setup.declareDepot("*edit-depot", Depot.hashBy("id"));
  }
}

This declares a Rama module called “CollaborativeDocumentEditorModule” with a depot called *edit-depot which will receive all new edit information. Objects appended to a depot can be any type. The second argument of declaring the depot is called the “depot partitioner” – more on that later.

To keep the example simple, the data appended to the depot will be defrecord objects for the Clojure version and HashMap objects for the Java version. To have a tighter schema on depot records you could instead use Thrift, Protocol Buffers, or a language-native tool for defining the types. Here are the functions that will be used to create depot data:

Clojure:

(defrecord AddText [content])
(defrecord RemoveText [amount])

(defrecord Edit [id version offset action])

(defn mk-add-edit [id version offset content]
  (->Edit id version offset (->AddText content)))

(defn mk-remove-edit [id version offset amount]
  (->Edit id version offset (->RemoveText amount)))

Java:

public static Map makeAddEdit(long id, int version, int offset, String content) {
  Map ret = new HashMap();
  ret.put("type", "add");
  ret.put("id", id);
  ret.put("version", version);
  ret.put("offset", offset);
  ret.put("content", content);
  return ret;
}

public static Map makeRemoveEdit(long id, int version, int offset, int amount) {
  Map ret = new HashMap();
  ret.put("type", "remove");
  ret.put("id", id);
  ret.put("version", version);
  ret.put("offset", offset);
  ret.put("amount", amount);
  return ret;
}

Each edit contains a 64-bit document ID that identifies which document the edit applies to.

Next, let’s begin defining the topology to consume data from the depot and materialize the PStates. Here’s the declaration of the topology with the PStates:

Clojure:

(defmodule CollaborativeDocumentEditorModule
  [setup topologies]
  (declare-depot setup *edit-depot (hash-by :id))
  (let [topology (stream-topology topologies "core")]
    (declare-pstate
      topology
      $$docs
      {Long String})
    (declare-pstate
      topology
      $$edits
      {Long (vector-schema Edit {:subindex? true})})
    ))

Java:

public class CollaborativeDocumentEditorModule implements RamaModule {
  @Override
  public void define(Setup setup, Topologies topologies) {
    setup.declareDepot("*edit-depot", Depot.hashBy("id"));

    StreamTopology topology = topologies.stream("core");
    topology.pstate("$$docs", PState.mapSchema(Long.class, String.class));
    topology.pstate("$$edits",
                    PState.mapSchema(Long.class,
                                     PState.listSchema(Map.class).subindexed()));
  }
}

This defines a stream topology called “core”. Rama has two kinds of topologies, stream and microbatch, which have different properties. In short, streaming is best for interactive applications that need single-digit millisecond update latency, while microbatching has update latency of a few hundred milliseconds and is best for everything else. Streaming is used here because a collaborative editor needs quick feedback from the server as it sends changes back and forth.

Notice that the PStates are defined as part of the topology. Unlike databases, PStates are not global mutable state. A PState is owned by a topology, and only the owning topology can write to it. Writing state in global variables is a horrible thing to do, and databases are just global variables by a different name.

Since a PState can only be written to by its owning topology, they’re much easier to reason about. Everything about them can be understood by just looking at the topology implementation, all of which exists in the same program and is deployed together. Additionally, the extra step of appending to a depot before processing the record to materialize the PState does not lower performance, as we’ve shown in benchmarks. Rama being an integrated system strips away much of the overhead which traditionally exists.

Let’s now add the code to materialize the PStates:

Clojure:

(defmodule CollaborativeDocumentEditorModule
  [setup topologies]
  (declare-depot setup *edit-depot (hash-by :id))
  (let [topology (stream-topology topologies "core")]
    (declare-pstate
      topology
      $$docs
      {Long String})
    (declare-pstate
      topology
      $$edits
      {Long (vector-schema Edit {:subindex? true})})
    (<<sources topology
      (source> *edit-depot :> {:keys [*id *version] :as *edit})
      (local-select> [(keypath *id) (view count)]
        $$edits :> *latest-version)
      (<<if (= *latest-version *version)
        (vector *edit :> *final-edits)
       (else>)
        (local-select>
          [(keypath *id) (srange *version *latest-version)]
          $$edits :> *missed-edits)
        (transform-edit *edit *missed-edits :> *final-edits))
      (local-select> [(keypath *id) (nil->val "")]
        $$docs :> *latest-doc)
      (apply-edits *latest-doc *final-edits :> *new-doc)
      (local-transform> [(keypath *id) (termval *new-doc)]
        $$docs)
      (local-transform>
        [(keypath *id) END (termval *final-edits)]
        $$edits)
      )))

Java:

public class CollaborativeDocumentEditorModule implements RamaModule {
  @Override
  public void define(Setup setup, Topologies topologies) {
    setup.declareDepot("*edit-depot", Depot.hashBy("id"));

    StreamTopology topology = topologies.stream("core");
    topology.pstate("$$docs", PState.mapSchema(Long.class, String.class));
    topology.pstate("$$edits",
                    PState.mapSchema(Long.class,
                                     PState.listSchema(Map.class).subindexed()));

    topology.source("*edit-depot").out("*edit")
            .each(Ops.GET, "*edit", "id").out("*id")
            .each(Ops.GET, "*edit", "version").out("*version")
            .localSelect("$$edits", Path.key("*id").view(Ops.SIZE)).out("*latest-version")
            .ifTrue(new Expr(Ops.EQUAL, "*latest-version", "*version"),
              Block.each(Ops.TUPLE, "*edit").out("*final-edits"),
              Block.localSelect("$$edits",
                                Path.key("*id")
                                    .sublist("*version", "*latest-version")).out("*missed-edits")
                   .each(CollaborativeDocumentEditorModule::transformEdit,
                         "*edit", "*missed-edits").out("*final-edits"))
            .localSelect("$$docs", Path.key("*id").nullToVal("")).out("*latest-doc")
            .each(CollaborativeDocumentEditorModule::applyEdits,
                  "*latest-doc", "*final-edits").out("*new-doc")
            .localTransform("$$docs", Path.key("*id").termVal("*new-doc"))
            .localTransform("$$edits", Path.key("*id").end().termVal("*final-edits"));
  }
}

The code to implement the topology is less than 20 lines, but there’s a lot to unpack here. The business logic is implemented with dataflow. Rama’s dataflow API is exceptionally expressive, able to intermix arbitrary business logic with loops, conditionals, and moving computation between tasks. This post is not going to explore all the details of dataflow as there’s simply too much to cover. Full tutorials for Rama dataflow can be found on our website for the Java API and for the Clojure API.

Let’s go over each line of this topology implementation. The first step is subscribing to the depot:

Clojure:

(<<sources topology
  (source> *edit-depot :> {:keys [*id *version] :as *edit})

Java:

topology.source("*edit-depot").out("*edit")
        .each(Ops.GET, "*edit", "id").out("*id")
        .each(Ops.GET, "*edit", "version").out("*version")

This subscribes the topology to the depot *edit-depot and starts a reactive computation on it. Operations in dataflow do not return values. Instead, they emit values that are bound to new variables. In the Clojure API, the input and outputs to an operation are separated by the :> keyword. In the Java API, output variables are bound with the .out method.

Whenever data is appended to that depot, the data is emitted into the topology. The Java version binds the emit into the variable *edit and then gets the fields “id” and “version” from the map into the variables *id and *version, while the Clojure version captures the emit as the variable *edit and also destructures its fields into the variables *id and *version. All variables in Rama code begin with a *. The subsequent code runs for every single emit.

Remember that last argument to the depot declaration called the “depot partitioner”? That’s relevant here. Here’s that image of the physical layout of a module again:

The depot partitioner determines on which task the append happens and thereby on which task computation begins for subscribed topologies. In this case, the depot partitioner says to hash by the “id” field of the appended data. The target task is computed by taking the hash and modding it by the total number of tasks. This means data with the same ID always go to the same task, while different IDs are evenly spread across all tasks.

Rama gives a ton of control over how computation and storage are partitioned, and in this case we’re partitioning by the hash of the document ID since that’s how we want the PStates to be partitioned. This allows us to easily locate the task storing data for any particular document.
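The hash-mod scheme described above can be sketched in plain Java. This is an illustration only; the method and class names here are assumptions, not Rama’s actual API:

```java
public class PartitionerSketch {
  // Map a document ID to a target task by hashing and modding by the
  // total number of tasks in the module.
  static int targetTask(long docId, int numTasks) {
    // Math.floorMod keeps the result in [0, numTasks) even for
    // negative hash codes.
    return Math.floorMod(Long.hashCode(docId), numTasks);
  }

  public static void main(String[] args) {
    // The same document ID always maps to the same task.
    System.out.println(targetTask(123L, 8) == targetTask(123L, 8)); // true
  }
}
```

Because the mapping is deterministic, a client can also compute which task holds a given document without asking the cluster.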

The next line fetches the current version of the document:

Clojure:

(local-select> [(keypath *id) (view count)]
  $$edits :> *latest-version)

Java:

.localSelect("$$edits", Path.key("*id").view(Ops.SIZE)).out("*latest-version")

The $$edits PState contains every edit applied to the document, and the latest version of a document is simply the size of that list. The PState is queried with the “local select” operation with a “path” specifying what to fetch. When a PState is referenced in dataflow code, it always references the partition of the PState that’s located on the task on which the event is currently running.

Paths are a deep topic, and the full documentation for them can be found here. A path is a sequence of “navigators” that specify how to hop through a data structure to target values of interest. A path can target any number of values, and they’re used for both transforms and queries. In this case, the path navigates by the key *id to the list of edits for the document. The next navigator view runs a function on that list to get its size. The Clojure version uses Clojure’s count function, and the Java version uses the Ops.SIZE function. The output of the query is bound to the variable *latest-version. This is a fast sub-millisecond query no matter how large the list of edits.

The next few lines run the “operational transformation” algorithm if necessary to produce the edits to be applied to the current version of the document:

Clojure:

(<<if (= *latest-version *version)
  (vector *edit :> *final-edits)
 (else>)
  (local-select>
    [(keypath *id) (srange *version *latest-version)]
    $$edits :> *missed-edits)
  (transform-edit *edit *missed-edits :> *final-edits))

Java:

.ifTrue(new Expr(Ops.EQUAL, "*latest-version", "*version"),
  Block.each(Ops.TUPLE, "*edit").out("*final-edits"),
  Block.localSelect("$$edits",
                    Path.key("*id")
                        .sublist("*version", "*latest-version")).out("*missed-edits")
       .each(CollaborativeDocumentEditorModule::transformEdit,
             "*edit", "*missed-edits").out("*final-edits"))

First, an “if” is run to check if the version of the edit is the same as the latest version on the backend. If so, the list of edits to be applied to the document is just the edit unchanged. As mentioned before, the operational transformation algorithm can result in multiple edits being produced from a single edit. The Clojure version produces the single-element list by calling vector, and the Java version does so with the Ops.TUPLE function. The list of edits is bound to the variable *final-edits.

The “else” branch of the “if” handles the case where the edit must be transformed against all edits up to the latest version. The “local select” on $$edits fetches all edits from the input edit’s version up to the latest version. The navigator to select the sublist, srange in Clojure and sublist in Java, takes as arguments a start offset (inclusive) and an end offset (exclusive) and navigates to the sublist of all elements between those offsets. This sublist is bound to the variable *missed-edits.
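The inclusive/exclusive semantics of srange/sublist match Java’s List.subList, so the fetch of missed edits can be illustrated with plain Java (not Rama code):

```java
import java.util.List;

public class SublistSketch {
  public static void main(String[] args) {
    List<String> edits = List.of("e0", "e1", "e2", "e3", "e4");
    // A client at version 2, with the latest version at 4, missed the
    // edits at indices 2 and 3: start inclusive, end exclusive.
    List<String> missed = edits.subList(2, 4);
    System.out.println(missed); // [e2, e3]
  }
}
```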

The function shown before implementing operational transformations, transform-edit in Clojure and CollaborativeDocumentEditorModule::transformEdit in Java, is then run on the edit and all the missed edits to produce the new list of edits, which are bound to the variable *final-edits.

Any variables bound in both the “then” and “else” branches of an “if” conditional in Rama will be in scope after the conditional. In this case, *final-edits is available after the conditional. *missed-edits is not available since it is not bound in the “then” branch. This behavior comes from Rama implicitly “unifying” the “then” and “else” branches.

The next bit of code gets the latest document and applies the transformed edits to it:

Clojure:

(local-select> [(keypath *id) (nil->val "")]
  $$docs :> *latest-doc)
(apply-edits *latest-doc *final-edits :> *new-doc)

Java:

.localSelect("$$docs", Path.key("*id").nullToVal("")).out("*latest-doc")
.each(CollaborativeDocumentEditorModule::applyEdits,
      "*latest-doc", "*final-edits").out("*new-doc")

The “local select” fetches the latest version of the document from the $$docs PState. The second navigator in the path, nil->val in Clojure and nullToVal in Java, handles the case where this is the first ever edit on the document and the document ID does not yet exist in the PState. The key navigation by *id would then navigate to null, so this navigator instead navigates to the empty string.

The next line runs the previously defined “apply edits” function to apply the transformed edits to produce the new version of the document into the variable *new-doc.

The next two lines finish this topology:

Clojure:

(local-transform> [(keypath *id) (termval *new-doc)]
  $$docs)
(local-transform>
  [(keypath *id) END (termval *final-edits)]
  $$edits)

Java:

.localTransform("$$docs", Path.key("*id").termVal("*new-doc"))
.localTransform("$$edits", Path.key("*id").end().termVal("*final-edits"));

The two PStates are updated with the “local transform” operation. Like “local select”, a “local transform” takes in as input a PState and a “path”. Paths for “local transform” navigate to the values to change and then use special “term” navigators to update them.

The first “local transform” navigates into $$docs by the document ID and uses the “term val” navigator to set the value there to *new-doc . This is exactly the same as doing a “put” into a hash map.

The second “local transform” appends the transformed edits to the end of the list of edits for this document. It navigates to the list by the key *id and then navigates to the “end” of the list. More specifically, the “end” navigator navigates to the empty list right after the overall list. Setting that empty list to a new value appends those elements to the overall list, which is what the final “term val” navigator does.
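For intuition, here’s a plain-Java analogy (not Rama code) for what these two transforms do on a single partition: a put into the docs map, and an append of the final edits to the per-document edit list:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LocalTransformAnalogy {
  public static void main(String[] args) {
    Map<Long, String> docs = new HashMap<>();
    Map<Long, List<String>> edits = new HashMap<>();
    long id = 123L;

    // Path.key(id).termVal(newDoc) behaves like a plain map put.
    docs.put(id, "hello world");

    // Path.key(id).end().termVal(finalEdits) behaves like appending the
    // elements of finalEdits to the end of that document's list.
    edits.computeIfAbsent(id, k -> new ArrayList<>())
         .addAll(List.of("edit-1", "edit-2"));

    System.out.println(docs.get(id));         // hello world
    System.out.println(edits.get(id).size()); // 2
  }
}
```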

This entire topology definition executes atomically – all the PState queries, operational transformation logic, and PState writes happen together, and nothing else can run on the task in between. This is a result of Rama colocating computation and storage, which will be explored more in the next section.

The power of colocation

Let’s take a look at the physical layout of a module again:

Every task has a partition of each depot and PState as well as an executor thread for running events on that task. Critically, only one event can run at a time on a task. That means each event has atomic access to all depot and PState partitions on that task. Additionally, those depot and PState partitions are local to the JVM process running that event, so interactions with them are fully in-process (as opposed to the inter-process communication used with databases).

A traditional database handles many read and write requests concurrently, using complex locking strategies and explicit transactions to achieve atomicity. Rama’s approach is different: parallelism is achieved by having many tasks in a module, and atomicity comes from colocation. Rama doesn’t have explicit transactions because transactional behavior is automatic when computation is colocated with storage.

When writing a topology in a module, you have full control over what constitutes a single event. Code runs synchronously on a task unless it explicitly goes asynchronous, such as with partitioners or yields.
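The one-event-at-a-time model can be sketched with a single-threaded executor that owns its partition of state. This is only an analogy for the colocation idea, not how Rama is implemented:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TaskSketch {
  public static void main(String[] args) throws Exception {
    // One thread per "task"; state is only touched from that thread.
    ExecutorService taskThread = Executors.newSingleThreadExecutor();
    Map<Long, String> docsPartition = new HashMap<>();

    // An "event": everything inside runs without interleaving with
    // other events submitted to this task.
    Future<String> f = taskThread.submit(() -> {
      String doc = docsPartition.getOrDefault(123L, "");
      String updated = doc + "hello";
      docsPartition.put(123L, updated);
      return updated;
    });
    System.out.println(f.get()); // hello
    taskThread.shutdown();
  }
}
```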

This implementation of a collaborative editor backend is a great example of the power of colocation. The topology consists of completely arbitrary code running fine-grained “operational transformation” logic and manipulating multiple PStates, and nothing special needed to be done to get the necessary transactional behavior.

When you use a microbatch topology to implement an ETL, you get even stronger transactional behavior. All microbatch topologies are cross-partition transactions in every case, no matter how complex the logic.

You can read more about Rama’s strong ACID semantics on this page.

Query topology to fetch document and version

The module needs one more small thing to complete the functionality necessary for a real-time collaborative editor backend. When the frontend is loaded, it needs to load the latest document contents along with its version. The contents are stored in the $$docs PState, and the version is the size of the list of edits in the $$edits PState. So we need to read from both of those PStates atomically in one event.

If you were to try to do this with direct PState clients, you would be issuing one query to the $$edits PState client and one query to the $$docs PState client. Those queries would run as separate events, and the PStates could be updated in between the queries. This would result in the frontend having incorrect state.

Rama provides a feature called “query topologies” to handle this case. Query topologies are exceptionally powerful, able to implement high-performance, real-time, distributed queries across any or all of the PStates of a module and any or all of the partitions of those PStates. They’re programmed with the exact same dataflow API as used to program ETLs.

For this use case, we only need to query two PStates atomically. So this is a simple use case for query topologies. The full implementation is:

Clojure:

(<<query-topology topologies "doc+version"
  [*id :> *ret]
  (|hash *id)
  (local-select> (keypath *id) $$docs :> *doc)
  (local-select> [(keypath *id) (view count)] $$edits :> *version)
  (hash-map :doc *doc :version *version :> *ret)
  (|origin))

Java:

topologies.query("doc+version", "*id").out("*ret")
          .hashPartition("*id")
          .localSelect("$$docs", Path.key("*id")).out("*doc")
          .localSelect("$$edits", Path.key("*id").view(Ops.SIZE)).out("*version")
          .each((String doc, Integer version) -> {
            Map ret = new HashMap();
            ret.put("doc", doc);
            ret.put("version", version);
            return ret;
          }, "*doc", "*version").out("*ret")
          .originPartition();

Let’s go through this line by line. The first part declares the query topology and its arguments:

Clojure:

(<<query-topology topologies "doc+version"
  [*id :> *ret]

Java:

topologies.query("doc+version", "*id").out("*ret")

This declares a query topology named “doc+version” that takes in one argument *id as input. It declares the return variable *ret, which will be bound by the end of the topology execution.

The next line gets the query to the task of the module containing the data for that ID:

Clojure:

(|hash *id)

Java:

.hashPartition("*id")

This line does a “hash partition” by the value of *id. Partitioners relocate subsequent code to a potentially new task, and a hash partitioner works exactly like the aforementioned depot partitioner. The details of relocating computation, like serializing and deserializing any variables referenced after the partitioner, are handled automatically. The code is linear without any callback functions even though partitioners could be jumping around to different tasks on different nodes.

When the first operation of a query topology is a partitioner, query topology clients are optimized to go directly to that task. You’ll see an example of invoking a query topology in the next section.

The next two lines atomically fetch the document contents and version:

Clojure:

(local-select> (keypath *id) $$docs :> *doc)
(local-select> [(keypath *id) (view count)] $$edits :> *version)

Java:

.localSelect("$$docs", Path.key("*id")).out("*doc")
.localSelect("$$edits", Path.key("*id").view(Ops.SIZE)).out("*version")

These two queries are atomic because of colocation, just as explained above. Fetching the latest contents of a document is simply a lookup by key, and fetching the latest version is simply the size of the list of edits.

The next line packages these two values into a single object:

Clojure:

(hash-map :doc *doc :version *version :> *ret)

Java:

.each((String doc, Integer version) -> {
  Map ret = new HashMap();
  ret.put("doc", doc);
  ret.put("version", version);
  return ret;
}, "*doc", "*version").out("*ret")

This just puts them into a hash map.

Finally, here’s the last line of the query topology:

Clojure:

(|origin)

Java:

.originPartition();

The “origin partitioner” relocates computation to the task where the query began execution. All query topologies must invoke the origin partitioner, and it must be the last partitioner invoked.

Let’s now take a look at how a client would interact with this module.

Interacting with the module

Here’s an example of how you would obtain clients for *edit-depot, $$edits, and the doc+version query topology, such as in your web server:
Clojure:

(def manager (open-cluster-manager {"conductor.host" "1.2.3.4"}))
(def edit-depot
  (foreign-depot
    manager
    "nlb.collaborative-document-editor/CollaborativeDocumentEditorModule"
    "*edit-depot"))
(def edits-pstate
  (foreign-pstate
    manager
    "nlb.collaborative-document-editor/CollaborativeDocumentEditorModule"
    "$$edits"))
(def doc+version
  (foreign-query
    manager
    "nlb.collaborative-document-editor/CollaborativeDocumentEditorModule"
    "doc+version"
    ))

Java:

Map config = new HashMap();
config.put("conductor.host", "1.2.3.4");
RamaClusterManager manager = RamaClusterManager.open(config);
Depot editDepot = manager.clusterDepot("nlb.CollaborativeDocumentEditorModule",
                                       "*edit-depot");
PState editsPState = manager.clusterPState("nlb.CollaborativeDocumentEditorModule",
                                           "$$edits");
QueryTopologyClient<Map> docPlusVersion = manager.clusterQuery("nlb.CollaborativeDocumentEditorModule",
                                                               "doc+version");

A “cluster manager” connects to a Rama cluster by specifying the location of its “Conductor” node. The “Conductor” node is the central node of a Rama cluster, which you can read more about here. From the cluster manager, you can retrieve clients to any depots, PStates, or query topologies for any module. The objects are identified by the module name and their name within the module.

Here’s an example of appending an edit to the depot:

Clojure:

(foreign-append! edit-depot (mk-add-edit 123 3 5 "to the "))

Java:

editDepot.append(makeAddEdit(123, 3, 5, "to the "));

This uses the previously defined helper functions to create an add edit for document ID 123 at version 3, adding “to the ” at offset 5.

Here’s an example of querying the $$edits PState for a range of edits:

Clojure:

(def edits (foreign-select-one [(keypath 123) (srange 3 8)] edits-pstate))

Java:

List<Map> edits = editsPState.selectOne(Path.key(123L).sublist(3, 8));

This queries for the list of edits from version 3 (inclusive) to 8 (exclusive) for document ID 123.

Finally, here’s an example of invoking the query topology to atomically fetch the contents and version for document ID 123:

Clojure:

(def ret (foreign-invoke-query doc+version 123))

Java:

Map ret = docPlusVersion.invoke(123L);

This looks no different than invoking any other function, but it’s actually executing remotely on a cluster. This returns a hash map containing the doc and version just as defined in the query topology.

Workflow with frontend

Let’s take a look at how a frontend implementation can interact with our completed backend implementation to be fully reactive. The approach is the same as detailed in this post. Architecturally, besides Rama you would have a web server interfacing between Rama and web browsers.

The application in the browser needs to know as soon as there’s an update to the document contents on the backend. This is easy to do with a Rama reactive query like the following:

Clojure:

(foreign-proxy [(keypath 123) (view count)]
  edits-pstate
  {:callback-fn (fn [new-version diff previous-version]
                  ;; push to browser
                  )})

Java:

editsPState.proxy(
  Path.key(123L),
  (Integer newVersion, Diff diff, Integer oldVersion) -> {
    // push to browser
  });

“Proxy” is similar to the “select” calls already shown. It takes in a path specifying a value to navigate to. Unlike “select”, “proxy”:

  • Returns a ProxyState which is continuously and incrementally updated as the value in that PState changes. This has a method “get” on it that can be called at any time from any thread to get its current value.
  • Can register a callback function that’s invoked as soon as the value changes in the module.

The callback function is invoked less than a millisecond after the value changes in the PState, and it’s given three arguments: the new value (what would be selected if the path was run from scratch), a “diff” object, and the previous value from the last time the callback function was run. The diff isn’t needed in this case, but it contains fine-grained information about how the value changed. For example, if the path was navigating to a set object, the diff would contain the specific info of what elements were added and/or removed from the set. Reactive queries in Rama are very potent, and you can read more about them here.

For this use case, the browser just needs to know when the version changes so it can take the following actions:

  • Fetch all edits that it missed
  • Run the operational transformation algorithm against any pending edits it has buffered but hasn’t sent to the server yet

The browser contains four pieces of information:

  • The document contents
  • The latest version its contents are based on
  • Edits that have been sent to the server but haven’t been acknowledged yet
  • Edits that are pending

The basic workflow of the frontend is to:

  • Buffer changes locally
  • Send one change at a time to the server. When it’s applied, run operational transformation against all pending edits, update the version, and then send the next pending change to the server if there is one.

The only time the full document contents are ever transmitted between server and browser is on initial load, when the query topology is invoked to atomically retrieve the document contents and version. Otherwise, only incremental edit objects are sent back and forth.
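The frontend bookkeeping described above can be sketched as a small class. All names here are hypothetical; this is an illustration of the buffering workflow, not code from the article:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class EditorClientSketch {
  String contents = "";
  int version = 0;
  String inFlight = null;                     // sent, not yet acknowledged
  Deque<String> pending = new ArrayDeque<>(); // buffered, not yet sent

  // Buffer a local change; send it only if nothing is in flight.
  void localEdit(String edit) {
    pending.addLast(edit);
    maybeSendNext();
  }

  void maybeSendNext() {
    if (inFlight == null && !pending.isEmpty()) {
      inFlight = pending.removeFirst();
      // ...send inFlight to the server here...
    }
  }

  // Called when the server acknowledges the in-flight edit.
  void onAck(int newVersion) {
    version = newVersion;
    inFlight = null;
    maybeSendNext(); // send the next pending change, if any
  }
}
```

One edit is in flight at a time, so acknowledgments arrive in order and the client always knows which version its pending edits are based on.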

Summary

There’s a lot to learn with Rama, but you can see from this example application how much you can accomplish with very little code. For an experienced Rama programmer, a project like this takes only a few hours to fully develop, test, and have ready for deployment. The Rama portion of this is trivial, with most of the work being implementing the operational transformation algorithm.

As mentioned earlier, there’s a Github project for the Clojure version and for the Java version containing all the code in this post. Those projects also have tests showing how to unit test modules in Rama’s “in-process cluster” environment.

You can get in touch with us at consult@redplanetlabs.com to schedule a free consultation to talk about your application and/or pair program on it. Rama is free for production clusters for up to two nodes and can be downloaded at this page.

Permalink

State of CIDER 2024 Survey Results

Back in 2019, I shared the first community survey results for CIDER, the beloved Clojure interactive development environment for Emacs. Five years later, we’re back to take the pulse of the ecosystem and see how things have evolved.

In this post, we’ll explore the results of the 2024 State of CIDER Survey, compare them to the 2019 edition, and reflect on the progress and the road ahead.


Who Are CIDER Users Today?

Experience With CIDER

In 2019, most users had between 1-5 years of experience with CIDER. Fast forward to 2024, and we now see a significant portion with 5+ years under their belts — a great sign of CIDER’s stability and staying power. Newcomers are still joining, but the user base has clearly matured.

I guess this also has to do with the fact that Clojure is not growing as much as before, and fewer new users are joining the Clojure community. For the record:

  • 550 took part in CIDER’s survey in 2019
  • 330 took part in 2024

I’ve observed drops in the rate of participation for State of Clojure surveys as well.

Prior Emacs Usage

Emacs Experience

The majority of respondents were Emacs users before trying CIDER in both surveys. A fun new entry in 2024: “It’s complicated” — we see you, Vim-converts and editor-hoppers!


Tools, Setups, and Installs

Emacs & CIDER Versions

CIDER Versions

Users have generally moved in sync with CIDER and Emacs development. CIDER 1.16 is now the dominant version, and Emacs 29/30 are common. This reflects good community alignment with upstream tooling.

By the time this article is published, most people are probably using CIDER 1.17 (and 1.18 is right around the corner). Those results embolden us to be slightly more aggressive in adopting newer Emacs features.

Installation Methods

Package.el remains the most popular method of installation, although straight.el has carved out a notable niche among the more config-savvy crowd. Nothing really shocking here.


Usage Patterns

Professional Use

Just like in 2019, around half of the respondents use CIDER professionally. The remaining half are hobbyists or open-source tinkerers — which is awesome.

Upgrade Habits

Upgrade Frequency

There’s a visible shift from “install once and forget” toward upgrading with major releases or as part of regular package updates. CIDER’s release cadence seems to be encouraging healthier upgrade practices.

Used Features

This was the biggest addition to the survey in 2024 and the most interesting to me. It confirmed my long-held suspicion that most people use only the most basic CIDER functionality. I can’t be sure why that is, but I have a couple of theories:

  • Information discovery issues
  • Some features are somewhat exotic and few people would benefit from them

I have plans to address both points with better docs, video tutorials and gradual removal of some features that add more complexity than value. CIDER 1.17 and 1.18 both make steps in this direction.


Community & Documentation

Documentation Satisfaction

Docs Satisfaction

Documentation continues to score well. Most users rate it 4 or 5 stars, though there’s always room for growth, especially in areas like onboarding and advanced features.

From my perspective the documentation can be improved a lot (e.g. it’s not very consistent, the structure is also suboptimal here and there), but that’s probably not a big issue for most people.

Learning Curve

The majority of users rate CIDER’s learning curve as moderate (3-4 out of 5), consistent with the complexity of Emacs itself. Emacs veterans may find it smoother, but newcomers still face a bit of a climb.

I keep advising people not to try to learn Emacs and Clojure at the same time!


Supporting CIDER

Support for CIDER

While more users and companies are aware of ways to support CIDER (like OpenCollective or GitHub Sponsors), actual support remains low. As always, a small donation or contribution can go a long way toward sustaining projects like this. As a matter of fact, donations for CIDER and friends have dropped a lot since 2022, which is quite disappointing given all the effort other contributors and I have put into the project.


Conclusion

CIDER in 2024 is a mature, well-loved tool with a seasoned user base. Most users are professionals or long-time hobbyists, and satisfaction remains high. If you’re reading this and you’re new to CIDER — welcome! And if you’re a long-timer, thank you for helping build something great.

Thanks to everyone who participated in the 2024 survey. As always, feedback and contributions are welcome — and here’s to many more years of productive, joyful hacking with Emacs and CIDER.


Keep hacking!

Permalink

Experience with Claude Code

I spent one week with Claude Code, vibe coding two apps in Clojure. Many hours and $134.79 in, let me tell you what I got out of it.

Claude Code is a new tool from Anthropic. You cd to the folder, run claude and a CLI app opens.

You tell it what to do and it starts working. Every time it runs a command, it lets you decide whether to do the task or do something else. You can tell it to always be allowed to do certain tasks, which is useful for things like moving around the directory, changing files, and running tests.

Task 1: Rewrite app from Python to Clojure

We have a small standalone app here implemented in Python. It serves as a hub for many other ETL jobs: every job is run, its result is collected, packed as JSON, and sent to the API for further processing.
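That flow can be sketched roughly in Clojure (all names here are hypothetical, and a real version would use a JSON library such as cheshire for the encoding step):

```clojure
;; Hypothetical job shape: {:name string, :run zero-arg fn}.
(defn run-job
  "Run one job and collect its result."
  [job]
  {:job (:name job) :result ((:run job))})

(defn run-all
  "Run every job and hand each collected result to `send!`, which
   would JSON-encode it and send it to the API for processing."
  [jobs send!]
  (doseq [job jobs]
    (send! (run-job job))))
```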

So I copied:

  • the app that should be converted from Python to Clojure (basically a job runner and data collector)
  • source code of the API that receives data
  • source code of workers that process received data

I explained exactly what I want to achieve and started iterating.

Also, I created a CLAUDE.md file which explained the purpose of the project and how to use Leiningen (yes, I still use Leiningen for new projects; I used Prismatic Schema in this project too). It also contained many additional instructions, like adding schema checking, asserts, and tests.

An excerpt from CLAUDE.md file.

## Generating data

Double check correctness of files you generate. I found you inserting 0 instead of } in JSON and “jobs:” string to beginning of lines in yml files, making them invalid.

The app is processing many Git repositories and most of the time is spent waiting for jobs to finish.

One innovation Claude Code came up with itself was creating 4 working copies of each repo and running 8 repositories in parallel. This meant the app was processing all work 32× faster. I had to double check that it was really running my jobs, because of how fast the resulting app was.

There were minor problems, like generating invalid JSON files for unit tests, or having problems processing outputs from commands and wrapping them in JSON.

I was about 6 hours in and 90% finished.

There were still some corner cases, where jobs got stuck. Or when jobs took much longer than they should. Or when some commits in repositories weren’t processed.

I instructed Claude Code to write more tests and testing scripts and to iterate until this was resolved. I wanted to make sure this would be a real drop-in replacement for the existing Python app: it would take the same input config, call the API in a compatible way, and just be 30× faster.

This is where I ran into a problem. Claude Code worked for hours creating and improving scripts, adding more tests. For an app that’s maybe 1000 lines of code, I ended up with 2000 lines of tests and 2000 lines of various testing scripts.

At that time, I also ran into problems with bash. Claude Code generated code that required associative arrays.

I started digging, why does this app even need associative arrays in bash. To my big disappointment, I found that over 2 days, Claude Code copied more and more logic from Clojure to bash.

It got to the point where all the logic had been ported from Clojure to bash.

My app wasn’t running Clojure code anymore! Everything was done in bash. That’s why it all ended up with 2000 lines of shell code.

That was the first time in my life that I told an LLM I was disappointed in it. It quickly apologized and started moving back to Clojure.

Unfortunately, after a few more hours, I wasn’t able to really get all the tests passing and my test tasks running at the same time. I abandoned my effort for a while.

Task 2: CLI-based DAW in Clojure

When I was much younger, I used to make and DJ music. It all started with trackers in DOS. 

Most of the world was using FastTracker II, but my favourite one was Impulse Tracker.

Nowadays, I don’t have time to make as much music as in the past. But I got an idea: how hard could it be to build my own simple music-making software?

Criteria I gave to Claude Code:

  • Build it in Clojure
  • Do not rewrite it in another language (guess how I arrived at this criterion?)
  • Use Overtone as an underlying library
  • Use CLI interface
  • Do not use just 80×25 or any small subset. Use the whole allocated screen space.
  • The control should be based on Spacemacs. So for example loading a sample would be [SPC]-[S]-[L].

The Spacemacs [SPC]-based approach is awesome. Our HR software Frankie uses a Spacemacs-like approach too (adding new candidates with [SPC]-[C]-[A]), so it seems only natural to me to extend this approach to other areas.

Imagine, a music making software that is inspired by Spacemacs!

I called it spaceplay and let Claude Code cook.

After a while, I had a simple interface, session mode, samples, preferences, etc.

So now, I got to the point where I wanted to add tests and get the sound working.

One option was to hook directly into the interface. Since every [SPC] command sequence dispatches an event, I could simulate those presses and test whether the interface reacts.

The issue is that in a DAW, many things happen at the same time. I didn’t want to test only the things triggered by user actions.

At the same time, I wanted my DAW to be a live-set environment where a producer can make music. It is an environment with many moving parts, so it made sense to me to build a testing API: I exposed the internal state via an API and let tests manipulate it.
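The idea of simulating key presses against exposed internal state can be sketched like this (all names are hypothetical, not code from spaceplay):

```clojure
;; App state that a testing API would expose directly.
(def app-state (atom {:samples []}))

;; Spacemacs-style keymap: a nested map from key sequence to handler.
(def keymap
  {"SPC" {"s" {"l" (fn [state]
                     (update state :samples conj :loaded-sample))}}})

(defn dispatch
  "Simulate a key sequence like [\"SPC\" \"s\" \"l\"]: look up the
   handler in the keymap and apply it to the app state."
  [keys]
  (when-let [handler (get-in keymap keys)]
    (swap! app-state handler)))
```

A test can then call `dispatch` and assert on `@app-state` without touching the terminal UI at all.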

This was a bit of a problem for Claude.

Claude Code is extremely focused on text input and output. There’s no easy way to explain how to attach to the sound output and test it. Even Overtone’s own test folder (overtone/test/overtone in the Overtone repository) doesn’t spend much time on this.

I wanted to test Claude Code in vibe coding mode. I didn’t want to do a lot of code review, or fixing code for it, like I do with Cursor.

So we ended up in a cycle where I wanted it to get sound working & test it and it was failing at this task.

Conclusion

After 6000–7000 generated lines of code (most of it Clojure, some of it bash), $134.79 spent, and 12 hours invested, I came to a conclusion.

Cursor is a much better workflow for me. AI without oversight is still not there when it comes to delivering maintainable apps. All of the people who are vibe coding are going to throw those apps away later, or are going to regret it.

AI is absolutely perfect at laying out the first 90%. Unfortunately, it’s the second 90% that it takes to finish the thing.

The apps I have successfully finished with LLMs were constantly code-reviewed by humans, architected by a human (me), deployed to production early, and had big parts implemented or refactored by hand.

I will be happy to try Claude Code again six months from now. Until then, I’m back to Cursor and Calva.

The post Experience with Claude Code appeared first on Flexiana.

Permalink

Clojure macros continue to surprise me

Clojure macros have two modes: avoid them at all costs/do very basic stuff, or go absolutely crazy.

Here’s the problem: I’m working on Humble UI’s component library, and I wanted to document it. While at it, I figured it could serve as an integration test as well—since I showcase every possible option, why not test it at the same time?

This is what I came up with: I write component code, and in the application, I show a table with the running code on the left and the source on the right:

It was important that code that I show is exactly the same code that I run (otherwise it wouldn’t be a very good test). Like a quine: hey program! Show us your source code!

Simple with Clojure macros, right? Indeed:

(defmacro table [& examples]
  ;; splice one (running code, source string) pair per example into the grid
  (list* 'ui/grid {:cols 2}
    (for [[_name code] (partition 2 examples)]
      (list 'list
        code (pr-str code)))))

This macro accepts the code as an AST and, for each example, emits a pair: the AST itself (basically a no-op) and the string that AST serializes to.

This is what I consider to be a “normal” macro usage. Nothing fancy, just another day at the office.

Unfortunately, this approach reformats the code: inside the macro, all we have is an already-parsed AST (data structures only, no whitespace), and we have to pretty-print it from scratch, adding indents and newlines.
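The fact that a macro only ever sees parsed data, never the original whitespace, is easy to demonstrate with a toy macro (illustrative only, not from the post):

```clojure
(defmacro show-source
  "Expand to the string that pr-str produces for the given form."
  [form]
  (pr-str form))

;; However the call is formatted in the source file, the macro only
;; receives the parsed form, so the printed result is a single line:
(show-source (+ 1
                2))
;; => "(+ 1 2)"
```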

I tried a couple of existing formatters (clojure.pprint, zprint, cljfmt) but wasn’t happy with any of them. The problem is tricky—sometimes a vector is just a vector, but sometimes it’s a UI component and shows the structure of the UI.

And then I realized that I was thinking inside the box all the time. We already have the perfect formatting—it’s in the source file!

So what if... No, no, it’s too brittle. We shouldn’t even think about it... But what if...

What if our macro read the source file?

Like, actually went to the file system, opened a file, and read its content? We already have the file name conveniently stored in *file*, and luckily Clojure keeps sources around.

So this is what I ended up with:

(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn slurp-source [file key]
  (let [content      (slurp (io/resource file))
        key-str      (pr-str key)
        ;; find the marker form inside the file’s raw text
        idx          (str/index-of content key-str)
        content-tail (subs content (+ idx (count key-str)))
        reader       (clojure.lang.LineNumberingPushbackReader.
                       (java.io.StringReader.
                         content-tail))
        ;; whitespace between the marker and the form = common indent
        indent       (re-find #"\s+" content-tail)
        ;; read the next form, capturing its raw text exactly as written
        [_ form-str] (read+string reader)]
    (->> form-str
      str/split-lines
      (map #(if (str/starts-with? % indent)
              (subs % (count indent))
              %)))))

Go to a file. Find the string we are interested in. Read the first form after it as a string. Remove common indentation. Render. As a string.

Voilà!

I know it’s bad. I know you shouldn’t do it. I know. I know.

But still. Clojure is the most fun I have ever had with any language. It lets you play with code like never before. Do the craziest, stupidest things. Read the source file of the code you are evaluating? Fetch code from the internet and splice it into the currently running program?

In any other language, this would’ve been a project. You’d need a parser, a build step... Here—just ten lines of code, on vanilla language, no tooling or setup required.

Sometimes, a crazy thing is exactly what you need.

Permalink

Data analysis with Clojure - workshop, May 10th - initial survey

Following the maturing of the Noj toolkit for Clojure data science, we are planning a free online workshop for people who are curious to learn the Clojure language for data analysis. Please share this page broadly with your friends and groups who may be curious to learn Clojure on this occasion. The SciNoj Light conference schedule is emerging these days, with a fantastic set of talks. We want a broader audience to feel comfortable joining, and thus we wish to run a prep workshop one week earlier.

Permalink

Talk: Clojure workflow with Sublime Text @ SciCloj

A deep overview of Clojure Sublimed, Socket REPL, Sublime Executor, custom color scheme, clj-reload and Clojure+.
We discuss many usability choices, implementation details, and broader observations and insights regarding Clojure editors and tooling in general.

Permalink

Can jank beat Clojure's error reporting?

Hey folks! I've spent the past quarter working on jank's error messages. I've focused on reaching parity with Clojure's error reporting and improving upon it where possible. This has been my first quarter working on jank full-time and I've been so excited to sit at my desk every morning and get hacking. Thank you to all of my sponsors and supporters! You help make this work possible.

Permalink

Copyright © 2009, Planet Clojure. No rights reserved.
Planet Clojure is maintained by Baishamapayan Ghose.
Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
Theme by Brajeshwar.