Sayid Redux

I have a soft spot for Sayid - it’s one of the most ingenious Clojure tools ever built, and also one of the most neglected. It’s an omniscient debugger: instead of stopping your program at a breakpoint, it quietly records every call to the functions you’ve traced and lets you rummage through the recording afterwards. It’s the kind of thing you demo to people and watch their jaw drop. And it had been sitting practically unmaintained for the better part of a decade.

Here’s the awkward part: that neglect is largely on me. Bill Piel, Sayid’s original author, handed me the keys ages ago, and I’ve been… let’s say a less than exemplary steward. I’d merge the occasional patch to keep the lights on, but until very recently I’d done precious little to actually move the project forward.

The best time to maintain your open-source project was six years ago. The second best time is now.

– Ancient proverb, lightly adapted

So what finally lit a fire under me? Blame CIDER 2.0. I’ve been reworking CIDER’s debugging and tracing story, and at some point I sat down to make the built-in tracer smarter. A few hours in it hit me: nothing I could realistically bolt onto the built-in tracing would come anywhere close to what Sayid already does. So instead of building a worse Sayid, I dusted off the real one, gave it a good scrub, and here we are - Sayid 0.4.

What is Sayid, anyway?

For the uninitiated: Sayid records the arguments, return value, timing and full call tree of the functions you trace, so you can go back and inspect exactly what happened - no breakpoints, no println, no re-running the thing five times.

While we’re on the subject of embarrassing confessions: I’ve been the maintainer of this thing for years and I still have no idea what “Sayid” actually means or what it’s a reference to. If you happen to know, please, put me out of my misery - I’d love to finally get the joke.

Trace a namespace, run your code, and pop open the workspace with C-c s w:

▾ (demo.coins/can-afford? [:quarter :dime :nickel :penny] 45) => true
    (demo.coins/total-cents [:quarter :dime :nickel :penny]) => 45

can-afford? says true, but four coins worth 41 cents shouldn’t cover a 45-cent tab. Something’s off, and the bug lives inside total-cents. This is where Sayid really shines - flip on an inner trace and it records every expression inside the function:

▾ (demo.coins/can-afford? [:quarter :dime :nickel :penny] 45) => true
  ▾ (demo.coins/total-cents [:quarter :dime :nickel :penny]) => 45
    ▾ (apply + (map coin-values coins)) => 45
        (map coin-values coins) => (25 10 5 5)

There it is, staring right at us: (25 10 5 5), when the last value should be 1. A penny is worth five cents in our coin map. We never wrote a single println, and we never had to guess where to put a breakpoint - we just looked at what actually happened.
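
A minimal sketch of a demo.coins namespace that would produce this trace. Only the function names and the body of total-cents come from the trace above; the contents of coin-values and the body of can-afford? are assumptions made for illustration.

```clojure
(ns demo.coins)

;; Assumed coin map: the :penny entry carries the bug the trace exposed.
(def coin-values
  {:quarter 25
   :dime    10
   :nickel  5
   :penny   5})   ; should be 1

;; Body taken from the inner trace: (apply + (map coin-values coins))
(defn total-cents [coins]
  (apply + (map coin-values coins)))

;; Assumed body: the trace only shows the call and its result.
(defn can-afford? [coins price]
  (>= (total-cents coins) price))

(can-afford? [:quarter :dime :nickel :penny] 45)
;; => true, because total-cents returns 45 instead of 41
```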

That’s the pitch, and it’s a great one. So why was such a cool tool gathering dust?

Why an omniscient debugger, in 2026?

Honestly, part of it is that Clojure folks are spoiled. We have the REPL, so the reflex for most of us is to just re-run a form with some tap> or println sprinkled in and eyeball the output. That works right up until it doesn’t - until the bug is three layers deep in a map over a lazy seq, or only shows up on the 400th call, or lives in code you didn’t write and don’t feel like instrumenting by hand.

A traditional stepping debugger stops the world and makes you drive. Sayid does the opposite: it lets the program run to completion and then hands you the entire execution history as something you can navigate at your own pace. tools.trace is in the same spirit, but it dumps text to stdout - Sayid keeps structured data and gives you a query DSL to slice it. That’s a much better fit for how we actually work in Clojure: run it, capture everything, explore the data.
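
To make “run it, capture everything, explore the data” concrete, here is a toy recorder in plain Clojure. It is emphatically not how Sayid is implemented; recorded and recording are hypothetical names, and the sketch ignores timing, nesting and everything else a real tracer has to care about.

```clojure
;; Every call to a wrapped function appends a plain map to this atom.
(def recording (atom []))

(defn recorded
  "Wraps f so that each call logs its name, arguments and return value."
  [fn-name f]
  (fn [& args]
    (let [ret (apply f args)]
      (swap! recording conj {:name fn-name :args (vec args) :return ret})
      ret)))

(def add (recorded 'add +))

(add 1 2)
(add 10 20)

;; Run first, ask questions later: the history is ordinary data.
(filter #(> (:return %) 5) @recording)
;; => ({:name add, :args [10 20], :return 30})
```

The point of the sketch is the last expression: once calls are stored as maps, “querying the trace” is just filter, map and friends.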

And here’s the thing that made me want to revive it rather than reinvent it - “capture everything as data and explore it later” is exactly the workflow that’s becoming more relevant, not less. Structured execution traces are gold, whether the thing doing the exploring is you, a data-inspection tool like Portal, or an AI assistant trying to understand why your code misbehaved.

Out with the old coordinates

First order of business was dragging the project into the present.

Sayid used to live under com.billpiel/sayid on Clojars, with namespaces like com.billpiel.sayid.core. The new home is clojure-emacs, so the artifact is now published as mx.cider/sayid:

{:user {:plugins [[mx.cider/sayid "0.4.0"]]}}

I also dropped the personal-domain prefix from every namespace - it’s plain sayid.core, sayid.trace, sayid.nrepl-middleware and so on now. The old com.billpiel/sayid coordinates still get the same releases for the time being, so nobody’s dependency breaks overnight, but the future is mx.cider.

Data first, strings later

Here’s the change I’m most excited about. Sayid’s nREPL middleware used to hand the Emacs client a pre-rendered blob of text plus a pile of text properties for colouring. In other words, the server did all the rendering and the client was a dumb terminal. That single decision is a big part of why there was exactly one Sayid client.

So the middleware now speaks data. There’s a family of new ops - sayid-get-workspace-data, sayid-query-data and friends - that return the recorded call tree as honest, navigable data instead of a wall of text:

{"id"       "4793"
 "name"     "demo.coins/can-afford?"
 "args"     ["[:quarter :dime :nickel :penny]" "45"]
 "return"   "true"
 "file"     "demo/coins.clj"
 "line"     12
 "children" [...]}

The structural bits (ids, names, timings, source location, the tree shape) come across as real nested maps and lists you can walk by key. The captured values are pr-str’d, since an arbitrary Clojure value can’t always round-trip over the wire - but that’s the only place strings sneak in, and it’s exactly where you’d expect them.

The upshot: any editor or tool that speaks nREPL can now fetch a workspace and render it however it likes, and a REPL one-liner or a Portal tap gets you the same data with zero Sayid-specific machinery. The whole thing is written up in doc/nrepl-api.md, so you don’t have to reverse-engineer the wire format from the Emacs client the way you would have before.
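
As a rough illustration of what “walk by key” buys you, the sketch below flattens a call tree and picks out calls by name. The workspace value is hand-written to match the shape of the example map above; the child node and its id are made up, a single root is assumed, and all-calls and calls-of are hypothetical helpers rather than part of Sayid. doc/nrepl-api.md remains the authority on the real format.

```clojure
;; Hand-written stand-in for a decoded reply (assumed shape, made-up child).
(def workspace
  {"id"       "4793"
   "name"     "demo.coins/can-afford?"
   "args"     ["[:quarter :dime :nickel :penny]" "45"]
   "return"   "true"
   "children" [{"id"       "4794"
                "name"     "demo.coins/total-cents"
                "args"     ["[:quarter :dime :nickel :penny]"]
                "return"   "45"
                "children" []}]})

(defn all-calls
  "Flattens a call tree into a seq of call maps, parents before children."
  [root]
  (tree-seq #(seq (get % "children")) #(get % "children") root))

(defn calls-of
  "Returns every recorded call of the function named fn-name."
  [root fn-name]
  (filter #(= fn-name (get % "name")) (all-calls root)))

(map #(get % "return") (calls-of workspace "demo.coins/total-cents"))
;; => ("45")
```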

A UI that doesn’t feel like 2015

With the data ops in place, I rebuilt the Emacs UI on top of them. The workspace and the “what’s traced” views are now proper foldable trees, built on CIDER’s new cider-tree-view. You get real folding (TAB), navigation (n/p), jump-to-source (RET), and - my favourite - c i hands the actual captured value straight to CIDER’s inspector, so you can drill into an argument or a return value as a live, navigable object rather than squinting at its printed form.

There’s also a query layer wired into the tree: f narrows to every recorded call of the function at point, i focuses a single call and its subtree. On a big trace that’s the difference between “wall of text” and “actually finding the thing”.

The best part is that the client is now smaller, not bigger - all the tree rendering, folding and value inspection are handled by mature components instead of bespoke code painting text properties by hand. That’s the payoff of moving rendering to the client: the server ships data, and the client is free to be as fancy or as plain as it wants.

Wait, you broke everything?

Backwards compatibility is a promise you make to the users you have. Sayid had exactly one, and reader, I am that user.

– Me, rationalising

Yeah, I did. I went pretty wild with the breaking changes this time around - new artifact coordinates, new namespaces, a reworked nREPL API, a bumped minimum CIDER version. Normally I’d bend over backwards to keep old clients working, but here I made a deliberate call: as far as I can tell, the bundled Emacs client was the only client Sayid ever had. Dancing around imaginary third parties to preserve compatibility nobody was relying on would have just made the project harder to adopt and harder to maintain.

So I opted for sweeping changes that leave Sayid in a much better place to build on, rather than a museum of backwards-compatible cruft. If it turns out I was wrong and you were quietly depending on the old coordinates - I’m sorry, and do let me know, because that’s genuinely useful information.

Take it for a spin

That’s the gist of it. Sayid is alive again, it’s leaner, it speaks data, and it has a UI I’m not embarrassed to demo. What I’d love now is for more people to actually use it and tell me whether the new direction resonates.

So please - [mx.cider/sayid "0.4.0"], trace something gnarly, and pop open the tree. Then head over to the issue tracker and tell me what you think: what feels great, what feels rough, what’s missing. I have ideas for where to take it next (bounding the recording so you can safely trace a whole namespace under a test suite is high on the list), but I’d rather steer by what people actually want out of it.

Big thanks to Bill Piel for building such a wonderful tool in the first place - I’m merely standing on the shoulders of a giant here. And thanks in advance to everyone willing to kick the tyres on the revival!

That’s all from me for now. Keep hacking!


nREPL and ClojureScript: Demystifying Piggieback

I can’t carry it for you… but I can carry you!

– Samwise Gamgee, on the virtues of piggyback

If you’ve ever fired up a ClojureScript REPL from your editor and it just worked, there’s a decent chance Piggieback was quietly doing the heavy lifting behind the scenes. It’s one of those libraries that’s been around forever, that everyone in the CIDER world depends on, and that almost nobody actually understands. For years I counted myself firmly in the “almost nobody” camp.

That changed recently. I finally had to do some serious work on Piggieback myself, and to my mild horror I realized I’d forgotten most of how it works internally. So I did what I always do when I need to understand something properly - I dug in, refactored it into a shape I could actually reason about, and then wrote it all down. This article is that write-up.

A little backstory

Piggieback was created by Chas Emerick back in August 2012. The very first commit is dated August 10th, 2012, which makes the project a few weeks shy of fourteen years old as I write this. That’s a very long time in software years.

The name is a pun, of course - Piggieback rides piggyback on top of an existing nREPL server. Why “piggie” and not “piggy”? That’s one of life’s great unsolved mysteries. Only Chas knows, and he’s not telling.[1]

The problem it set out to solve was very concrete. nREPL talks to a Clojure runtime. ClojureScript, meanwhile, compiles to JavaScript and runs somewhere else entirely - a browser tab, a Node process, back in the day even Rhino or Nashorn running inside the same JVM. So how do you get your editor, which only knows how to talk nREPL to a JVM, to evaluate ClojureScript in some far-flung JavaScript runtime? Piggieback’s answer was pretty clever: don’t build a new protocol, don’t build a new server, just hijack the existing nREPL session and quietly reroute its evaluation to ClojureScript.

The project changed hands a couple of times over the years. It started life as com.cemerick/piggieback, deployed to Maven Central. When Chas stepped back it moved under the CIDER umbrella and became cider/piggieback on Clojars, and eventually landed in the nREPL org where it lives today. We kept the cider group id to avoid breaking everyone’s deps.edn for no good reason.[2]

Historically I wasn’t the one driving Piggieback’s development. That credit goes to a few other people - Chas himself, of course, but also Bruce Hauman (of Figwheel fame, who shaped a lot of the evaluation machinery) and Michael Griffiths (who did a ton of work on printing and nREPL integration). For a long time I was happy to let Piggieback tick along in the background while smarter people than me handled its guts.

But times change. These days CIDER and the rest of its supporting projects, Piggieback very much included, have exactly two active maintainers: me and Alex Yakushev. When you’re down to a crew that small, “I’ll let someone else understand the ClojureScript internals” stops being a viable strategy. So I sat down and learned. Eventually we all have to, I guess. :D

Which brings me to a small confession: neither Alex nor I is an actual ClojureScript expert. We’re Clojure folks who wandered into the cljs internals out of necessity, and we muddle through. So if you are one, and you’ve got a bit of time and goodwill to spare, help with the development of Piggieback (and CIDER more broadly) would be most appreciated. There’s plenty of interesting work to go around!

Two languages, one server

Here’s the thing that still makes me smile about Piggieback’s design: a single nREPL server can host both Clojure and ClojureScript evaluation at the same time, on a per-session basis. One editor connection can be running Clojure, another can be running ClojureScript, and the server doesn’t blink.

And the client doesn’t have to do anything special either. There’s no :env :cljs parameter to pass, no dedicated cljs-eval op to call. You keep sending plain old eval messages with your session id, exactly as you would for Clojure. Piggieback figures out the rest.

The way you “enter” ClojureScript is by evaluating a single form in your session:

(require 'cljs.repl.node)
(cider.piggieback/cljs-repl (cljs.repl.node/repl-env))
;; To quit, type: :cljs/quit

From that point on, every eval and load-file in that session is ClojureScript, until you send :cljs/quit to switch back. Meanwhile your colleague’s session on the same server is still merrily evaluating Clojure. It’s a bit like flipping a switch that only affects your own room in a shared house.

This is what I mean by session-based dispatch, and it’s the conceptual heart of the whole thing:

 nREPL server (one JVM)
 +-----------------------------------------------------------+
 |                                                           |
 |  session A --eval-->  Clojure                             |
 |                                                           |
 |  session B --eval-->  ClojureScript  -->  Node runtime    |
 |                                                           |
 |  session C --eval-->  ClojureScript  -->  browser tab     |
 |                                                           |
 +-----------------------------------------------------------+

How it actually works

These aren’t the droids you’re looking for.

– Obi-Wan Kenobi, nREPL middleware

Let’s peel back a layer. Piggieback is, first and foremost, an nREPL middleware - a function that wraps the nREPL handler and gets a crack at every message before (or instead of) the default machinery.

Its middleware, wrap-cljs-repl, does something pretty simple: for each incoming message, it checks whether the session currently has an active ClojureScript REPL. If it does and the op is eval or load-file, Piggieback handles the message itself. Otherwise it passes the message straight through to the normal Clojure handling. Distilled to its essence, it looks like this:

(defn wrap-cljs-repl [handler]
  (fn [{:keys [session op] :as msg}]
    (if (and (@session #'*cljs-repl-env*)          ; are we in cljs mode?
             (#{"eval" "load-file"} op))           ; and is this an eval?
      (piggieback-eval msg)                        ; yes: hijack it
      (handler msg))))                             ; no: business as usual

The *cljs-repl-env* dynamic var stored in the session is the linchpin. Its presence is the signal “this session is doing ClojureScript”. It’s also what other middleware, like cider-nrepl, look for to know they’re dealing with a cljs session - more on that in a bit.
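
If you want to poke at the dispatch rule without ClojureScript or nREPL on the classpath, the distilled middleware runs fine against stubs. Everything below except the shape of wrap-cljs-repl is a stand-in: the local *cljs-repl-env* var, the two tagging handlers and the fake sessions exist only for this sketch.

```clojure
;; Local stand-in for cider.piggieback/*cljs-repl-env*.
(def ^:dynamic *cljs-repl-env* nil)

;; Stub handlers that just tag the message with whoever handled it.
(defn piggieback-eval [msg] (assoc msg :handled-by :cljs))
(defn default-handler [msg] (assoc msg :handled-by :clj))

(defn wrap-cljs-repl [handler]
  (fn [{:keys [session op] :as msg}]
    (if (and (get @session #'*cljs-repl-env*)   ; session in cljs mode?
             (#{"eval" "load-file"} op))        ; and an op we hijack?
      (piggieback-eval msg)
      (handler msg))))

(def handle (wrap-cljs-repl default-handler))

;; Sessions are modelled as atoms holding a map keyed by vars.
(def clj-session  (atom {}))
(def cljs-session (atom {#'*cljs-repl-env* ::fake-node-env}))

(:handled-by (handle {:op "eval"     :session clj-session}))   ;; => :clj
(:handled-by (handle {:op "eval"     :session cljs-session}))  ;; => :cljs
(:handled-by (handle {:op "describe" :session cljs-session}))  ;; => :clj
```

Note the third call: even in a cljs session, anything that isn’t eval or load-file falls through to the regular handler.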

Here’s the bigger picture, showing where Piggieback sits in the stack:

   editor                         JVM: nREPL server
 +---------+   nREPL protocol   +-------------------------------------------+
 | CIDER,  |<--(bencode/edn)--->|  transport + session                      |
 | Calva,  |                    |        |                                  |
 | vim,    |                    |        v   middleware stack               |
 | ...     |                    |   wrap-print > wrap-cljs-repl > eval ...  |
 +---------+                    |                    |                      |
                                |      (eval/load-file, only when the       |
                                |       session has *cljs-repl-env* set)    |
                                |                    v                      |
                                |      cljs.repl  (IJavaScriptEnv)          |
                                |                    |                      |
                                +--------------------+----------------------+
                                                     v
                                       Node process  /  browser tab  /  ...

The clever bit is that Piggieback is written in Clojure and runs on the JVM, yet it drives evaluation in ClojureScript. It does this through ClojureScript’s own Clojure-facing API - specifically the cljs.repl/IJavaScriptEnv protocol, which every ClojureScript REPL environment implements:

(defprotocol IJavaScriptEnv
  (-setup    [repl-env opts]         "initialize the environment")
  (-evaluate [repl-env filename line js] "evaluate a javascript string")
  (-load     [repl-env provides url] "load code at url into the environment")
  (-tear-down [repl-env]             "dispose of the environment"))

So the flow, roughly, is this. Your ClojureScript form comes in as text. Piggieback reads it with the ClojureScript reader, the ClojureScript compiler turns it into JavaScript, and -evaluate ships that JavaScript off to the runtime (Node, browser, whatever). The result comes back as a string; Piggieback reads it and sends it to your editor as an ordinary nREPL :value. Your editor has no idea a whole JavaScript runtime was involved.

Starting the REPL vs. evaluating in it

There are actually two distinct code paths inside Piggieback, and understanding the split explains a lot of its quirks.

Setup is the fiddly one. To bootstrap a ClojureScript REPL you need to load cljs.core, set up the analyzer, wire up the compiler environment, and require a bunch of REPL helpers into cljs.user. Rather than reimplement all that, Piggieback leans on ClojureScript’s own cljs.repl/repl* - it briefly drives the real REPL loop, feeds it a single namespace-require form, and then bails out.

There’s a funny wrinkle here. cljs.repl/repl* insists on tearing the environment down when its loop exits, which would immediately kill the Node process we just launched. Piggieback’s workaround is to wrap your repl-env in a delegating environment whose -tear-down is a deliberate no-op, forwarding every other method to the real thing:

(deftype DelegatingReplEnv [repl-env]
  cljs.repl/IJavaScriptEnv
  (-setup    [_ opts]           (cljs.repl/-setup repl-env opts))
  (-evaluate [_ filename line js] (cljs.repl/-evaluate repl-env filename line js))
  (-load     [_ provides url]   (cljs.repl/-load repl-env provides url))
  (-tear-down [_])              ; <- the whole point: swallow the tear-down
  ;; ... plus map-like access, delegated to repl-env ...
  )
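
The effect of the no-op tear-down is easy to see with a fake environment. This is a self-contained sketch, not Piggieback’s code: JsEnv is a cut-down local copy of the protocol so it runs without ClojureScript, and FakeEnv and DelegatingEnv are hypothetical names.

```clojure
;; Local stand-in for cljs.repl/IJavaScriptEnv (-load omitted for brevity).
(defprotocol JsEnv
  (-setup     [env opts])
  (-evaluate  [env filename line js])
  (-tear-down [env]))

;; A fake runtime that only tracks whether it is still alive.
(defrecord FakeEnv [state]
  JsEnv
  (-setup     [_ _opts] (reset! state :running))
  (-evaluate  [_ _filename _line js] (str "evaluated: " js))
  (-tear-down [_] (reset! state :stopped)))

;; Forwards everything except -tear-down, which deliberately does nothing.
(deftype DelegatingEnv [env]
  JsEnv
  (-setup     [_ opts] (-setup env opts))
  (-evaluate  [_ filename line js] (-evaluate env filename line js))
  (-tear-down [_] nil))

(def real-env (->FakeEnv (atom :new)))
(def wrapped  (DelegatingEnv. real-env))

(-setup wrapped {})
(-tear-down wrapped)                  ; what the REPL loop does on exit
@(:state real-env)                    ;; => :running
(-evaluate wrapped "<repl>" 1 "1+1")  ;; => "evaluated: 1+1"
```

Tearing down real-env directly would flip its state to :stopped; going through the wrapper leaves the runtime alive for the steady-state evaluations that follow.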

Steady-state evaluation is the other path, and it’s the one your editor hits on every eval. Here Piggieback skips the REPL loop entirely and calls cljs.repl/evaluate-form directly, reusing the compiler environment and repl-env it squirreled away in the session during setup. One form in, one printed value out.

 (cljs-repl node-env)          (eval "(+ 1 1)")
        |                              |
        v                              v
   drive repl* once             read cljs form
   (delegating env)                   |
        |                       compile to JS
   -setup: launch Node                |
        |                        -evaluate  -->  Node  -->  "2"
   stash repl-env +                   |
   compiler-env in session      read back, send :value

The compiler environment, and why your tools love it

Now for my favourite part, because it’s the bit that makes ClojureScript tooling actually good rather than merely functional.

When Piggieback sets up a REPL, it stashes the ClojureScript compiler environment in the session, under a well-known var:

;; in the session atom, keyed by the var itself:
{#'cider.piggieback/*cljs-compiler-env*  <the compiler env atom>
 #'cider.piggieback/*cljs-repl-env*      <the (delegating) repl-env>}

That compiler environment is a treasure trove. It holds the analyzer’s full picture of your program - every namespace, every var, their arglists, docstrings, source locations, the works. And because Piggieback parks it somewhere stable and predictable, other middleware can reach in and grab it.

That’s exactly what cider-nrepl does. When you ask CIDER for completion candidates, or arglists, or “jump to definition” in a ClojureScript buffer, cider-nrepl pulls the compiler env out of the session:

;; cider-nrepl, roughly:
(get @session #'cider.piggieback/*cljs-compiler-env*)

…and then runs static analysis against it. No extra round-trip to the browser, no runtime reflection, just the compiler’s own knowledge sitting right there for the taking. It’s a nice example of how a small, stable contract between two libraries enables a whole class of features neither could build alone. Every time autocompletion works in a .cljs buffer, this little handshake is why.
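
To give a feel for what “static analysis against the compiler env” can look like, here is a sketch over a hand-built map. The key layout (:cljs.analyzer/namespaces, then the namespace, then :defs) is an assumption about the analyzer’s data, heavily simplified; the entries, file name and line numbers are invented, and var-info and complete are hypothetical helpers, not cider-nrepl functions.

```clojure
(require '[clojure.string :as str])

;; Hand-built stand-in for a compiler env atom (assumed, simplified layout).
(def compiler-env
  (atom {:cljs.analyzer/namespaces
         {'demo.coins
          {:defs {'total-cents {:name     'demo.coins/total-cents
                                :arglists '([coins])
                                :file     "demo/coins.cljs"
                                :line     9}
                  'can-afford? {:name     'demo.coins/can-afford?
                                :arglists '([coins price])
                                :file     "demo/coins.cljs"
                                :line     12}}}}}))

(defn var-info
  "Looks up the analyzer's record for a var, or nil if unknown."
  [env ns-sym var-sym]
  (get-in @env [:cljs.analyzer/namespaces ns-sym :defs var-sym]))

(defn complete
  "Returns the sorted names of vars in ns-sym that start with prefix."
  [env ns-sym prefix]
  (->> (get-in @env [:cljs.analyzer/namespaces ns-sym :defs])
       keys
       (map name)
       (filter #(str/starts-with? % prefix))
       sort))

(complete compiler-env 'demo.coins "can")                   ;; => ("can-afford?")
(:arglists (var-info compiler-env 'demo.coins 'total-cents)) ;; => ([coins])
```

Nothing here touches a JavaScript runtime, which is the whole appeal: completion and arglists are plain map lookups.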

Which environments does it support?

Because Piggieback talks to the generic cljs.repl/IJavaScriptEnv protocol, it works with pretty much any standard ClojureScript REPL environment. These days that mostly means Node.js (cljs.repl.node) - the workhorse for non-browser development, scripts, and, importantly, test suites; this is what Piggieback’s own tests run against. The other big one is the browser environment (cljs.repl.browser), which evaluates in a real browser tab via a little websocket dance and is great for actual web development. Beyond those there’s GraalJS and anything else that happens to ship an IJavaScriptEnv implementation.

A couple of historical footnotes are worth a mention too. The very first versions of Piggieback supported Rhino (dropped in 0.3.0) and later Nashorn, both in-JVM JavaScript engines that have since faded from relevance. The tests were moved off Nashorn and onto Node years ago.

There’s one category Piggieback can’t help with, though: self-hosted ClojureScript, like Lumo or Planck.[3] Piggieback is fundamentally a Clojure program leaning on ClojureScript’s Clojure API. If there’s no JVM in the picture at all, there’s nothing for it to hook into. For those you want a native ClojureScript nREPL implementation like nrepl-cljs instead.

The limitations

Piggieback’s simplicity is its greatest strength, but simplicity always has a price. Let me be honest about the sharp edges.

The most annoying one is that you can only run one Node REPL per JVM. ClojureScript’s cljs.repl.node keys some of its internal state on the thread name, which collapses nREPL’s worker threads together, so two Node REPLs in the same JVM end up clobbering each other. This one’s upstream, not Piggieback’s fault, but it’s a real ceiling.

Then there’s the fact that you can’t interrupt a running evaluation. A ClojureScript form runs in the JavaScript runtime, on a thread Piggieback doesn’t control, and there’s no portable way to interrupt code that’s already executing over there. So a runaway expression means restarting the runtime - Clojure’s interrupt op simply doesn’t apply here.

Multi-form evaluation is limited too. For performance reasons Piggieback doesn’t spin up a fresh REPL for each evaluation, which means that if you send several top-level forms in a single message, only the first one gets evaluated. Rarely an issue in practice, but a sharp edge nonetheless.
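
From the outside, the symptom resembles what Clojure’s own read-string does with several forms: it reads the first and stops. A client that wants every form evaluated could split the code and send one message per form. The sketch below uses the Clojure reader as a stand-in for the ClojureScript one, and top-level-forms is a hypothetical helper, not something Piggieback provides.

```clojure
(import '(java.io PushbackReader StringReader))

(defn top-level-forms
  "Reads every top-level form in code and returns them as a vector."
  [code]
  (binding [*read-eval* false]            ; never execute #= reader code
    (let [rdr (PushbackReader. (StringReader. code))]
      (loop [forms []]
        (let [form (read {:eof ::eof} rdr)]
          (if (= ::eof form)
            forms
            (recur (conj forms form))))))))

;; One read, one form: everything after the first form is ignored.
(read-string "(+ 1 1) (+ 2 2)")
;; => (+ 1 1)

;; Splitting first yields one string per form, ready to send separately.
(map pr-str (top-level-forms "(+ 1 1) (+ 2 2)"))
;; => ("(+ 1 1)" "(+ 2 2)")
```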

And finally, Piggieback is coupled to ClojureScript’s compiler internals. There’s no public “evaluate this cljs form and hand me the result” API, so it has to reach into cljs.analyzer, cljs.env, and the non-public bits of cljs.repl. It works because ClojureScript changes slowly, but it’s inherently a bit fragile.

None of these are dealbreakers for the things Piggieback is good at. But they do point at why it’s not the only game in town.

The other approach: shadow-cljs

If you’ve done any serious ClojureScript app development in the last several years, you’ve almost certainly used shadow-cljs, Thomas Heller’s build tool. And here’s a fact that surprises a lot of people: shadow-cljs does not use Piggieback. It ships its own nREPL middleware.

The philosophical difference is worth spelling out. Piggieback is a thin bridge to the official cljs.repl environments; it assumes very little and owns very little. shadow-cljs takes the opposite stance - it owns the entire build and runtime lifecycle (watching your files, hot-reloading code, managing multiple runtimes, tracking which build is which), and its nREPL integration is a natural extension of all that machinery. When you eval a form in a shadow-cljs setup it routes to whichever runtime is attached to your build, with hot-reload and the rest already in play.

So which is “better”? That’s the wrong question, in my opinion. They’re solving overlapping but quite different problems, and the right choice depends almost entirely on what kind of ClojureScript environment you’re working in. If you’re building a web app with a modern toolchain, shadow-cljs’s integrated approach is probably a better fit and a nicer experience overall - the hot-reload story alone is worth a lot. If you’re on a plain deps.edn or Leiningen project, poking at a Node REPL, or building tooling that targets the official ClojureScript REPL, Piggieback’s minimalism is exactly what you want. The flip side of shadow-cljs owning everything is that you have to buy into its world; the flip side of Piggieback owning almost nothing is that ceiling of one Node REPL per JVM. Both are entirely valid trade-offs. I use both, depending on the day.

What we’ve been up to lately

Piggieback had drifted into maintenance mode for a while, so a big chunk of my recent work was simply modernizing it - part of the same recent burst of energy that’s producing CIDER 2.0 - and then fixing a pile of long-standing bugs now that I actually understood the thing.

On the correctness front, the recent 0.6.x and 0.7.0 releases sorted out a batch of old issues: ClojureScript output going missing after reconnecting to a session (#111), evaluation breaking after a REPL setup error (#62), garbled printing of unknown tagged literals (#120), and a cryptic error when cljs-repl was invoked outside a session (#124). On top of that, load-file now evaluates the source that’s actually in your editor buffer, unsaved changes and all, instead of silently re-reading the file from disk - which is how nREPL has always behaved for Clojure. Closing a session now tears down its ClojureScript REPL too, so you stop leaking Node processes when you disconnect without a polite :cljs/quit. And describe finally advertises ClojureScript status, so tools can detect that a session is in cljs mode straight from the protocol instead of having to guess.

There was also a serious bit of internal restructuring. The old three-file load/in-ns contraption is gone, all the ClojureScript coupling now lives behind one clean namespace, and the runtime code generation for the delegating env was replaced with a single, comprehensible type. The public contract didn’t change one bit; the insides are just far less scary now. If you want the gory details, I wrote both an architecture document and a roadmap while I was in there - partly for future contributors, partly for future me, who will inevitably forget all of this again.

What’s next?

A few things I’ve been ruminating on for the future. Piggieback still drives cljs.repl/repl* for the initial bootstrap, which is the source of most of its remaining awkwardness. In principle we could replicate just the setup we need and unify the two eval paths into one, but that trades one kind of coupling for another, so I’ve deferred it until the benefit clearly outweighs the risk.

There are also a couple of things I’d love to see fixed upstream in ClojureScript itself, because they’re not really Piggieback’s problems: the thread-name state in cljs.repl.node that caps us at one Node REPL per JVM, and the way the Node env captures *out* at setup time. Fixing those upstream would help everyone, shadow-cljs included. And if those pieces land, better multi-runtime support becomes a real possibility. None of it is urgent, though. Piggieback works, and works well, for what it does.

Epilogue

If there’s one thing I hope you take away from this, it’s that Piggieback is not as scary as its reputation suggests. Underneath the intimidating “it evaluates ClojureScript over nREPL” description is a genuinely simple idea - hijack a session, delegate to cljs.repl, stash the compiler env where tools can find it - with a bit of cleverness sprinkled on top. It’s a great example of how a small, sharp design can endure for well over a decade with its core intact.

For years the community kept expecting native nREPL servers running on self-hosted ClojureScript to show up and make Piggieback obsolete. Those never really materialized. Piggieback, meanwhile, is still here, still doing its job, still quietly powering ClojureScript REPLs in editors everywhere. It’s clearly not the only game in town anymore - shadow-cljs’s approach is a better fit for a lot of people, and that’s great. But there’s something to be said for a little library that picked a simple design back in 2012 and rode it all the way to today.

That’s all from me, folks! Keep hacking!

  1. The official project FAQ’s answer to “Why ‘piggieback’ instead of ‘piggyback’?” is, and I quote, “That’s one of life’s greatest mysteries. Only Chas can answer that one.” I’ve decided to preserve the mystery. 

  2. This is also why the main namespace is cider.piggieback rather than nrepl.piggieback. Renaming it would break every existing config for zero real benefit. Boring stability beats exciting churn. 

  3. The self-hosted ClojureScript scene has gone rather quiet, by the way. Lumo was archived back in 2022 and Planck hasn’t cut a release since 2024. These days the “Clojure without a JVM” itch mostly gets scratched by SCI-based tools like nbb and scittle - a different beast (an interpreter, not the self-hosted compiler), but very much alive. Which makes that native ClojureScript nREPL server I keep pining for feel further away than ever. 


Compilers, Editors, Typefaces and More From the Ground Up

In the four years since my previous progress-update article, I have been working on a series of experimental projects and learning about the low-level fundamentals of software—down to the finicky particulars of things like text rendering and x86 machine code generation.

Here is a brief, incomplete listing of these projects:

  • (Late 2022) Structural editor for Clojure, with code stored non-hierarchically in a database. (Early prototyping phase.)

  • (First half of 2023) Compiler, structural editor and debugger for a custom language with a bytecode interpreter and x64 machine code backend. The language is imperative, strongly typed and assumes manual memory management, but has a Clojure-like syntax.

  • (Mid 2023) ButteryTaskbar2: a reimplementation of my original utility for Windows, improved to be simpler and more efficient.

  • Basic GPU-powered 2D rendering via Direct3D 12.

  • OKLCH colour picker.

  • (Second half of 2023) Two different attempts at a structural editor for the Jai programming language.

  • A simple alternative language to Markdown.

  • A tool for viewing and editing notes using a graph-based structure similar to Roam Research. Notes are written in my Markdown alternative.

  • Static website generator using my Markdown alternative for webpage content.

  • A programmatic generator for my six-segment logo, using TinyVG for rendering.

  • XML parser.

  • Implementation of Myers’ and Dijkstra’s algorithms for diffing a source code AST, together with a fun (but vitally useful) visualisation of the diffing algorithm.

  • (First half of 2024) A text editor for code, featuring syntax highlighting, Tree-sitter integration, adjustable panels, regular expression text search, modal editing, macros and undo history.

  • libgrapheme ported to Jai.

  • (Second half of 2024) Compiler and bytecode interpreter for a custom modern C-like language.

  • (Late 2024) A simple bespoke tool for tracking deadlines and reminders. Uses SQLite and a minimal browser-based UI. I continue to use this tool to this day.

  • Win32 API bindings generator, which converts the JSON data from win32json into a serialised binary form, which is then selectively converted into code declarations.

  • (2025 onwards) A blank-slate code repository with a bespoke build system and custom implementations of core libraries for purposes like memory management, file system access, hash tables, etc. This is the basis of all projects that follow.

  • (Early 2025) A utility to provide various comforts on Windows, including navigation key bindings, synchronised display brightness control, and sticky window edges (like macOS).

  • A more ergonomic and flexible alternative to JSON.

  • Basic SIMD-accelerated software renderer written in assembly for AVX2.

  • (Mid 2025) Initial prototyping work on a new design of a structural code editor, inspired by the Kyra language and editor.

  • Glyph rasteriser (CPU-based), capable of directly rasterising both linear and quadratic edges. Inspired by the approach in stb_truetype, but designed to be SIMD-friendly.

  • A new programmatically-defined quasi-proportionally-spaced typeface designed to be both legible and fast to render, by minimising vertices and preferring quadratic Bézier curves over cubics. A simple bespoke text layout system replaces the dependence on Harfbuzz.

  • (Late 2025) Another attempt at a UI toolkit, which includes a more powerful layout system, better support for incremental updates, and dependency analysis to optimise GPU draw calls.

  • Vulkan bindings generator, which sources data from the official XML-based specification.

  • (2026) A revised text editor, bringing together my latest work on UI systems and text rendering. It is powerful enough that I now use it as my primary code editor for my Jai-based projects. Like 4coder, it implements virtual whitespace, which greatly improves the experience of working with code indentation.

All of these projects constitute an extensive volume of research and time spent in deep thought. This article does not aim to give exhaustive coverage of all the ideas and implementation details across these four years, but merely to serve as an overview of some of the most interesting points.

Reinventing Software Development Tooling

Recurring themes in my work include compilers, innovative code editors, and novel programming language design. I feel there is an opportunity to bring substantial improvements to the development experience by reducing superfluous sources of friction, cutting out bureaucratic busywork, and tightening feedback loops. My main points of focus have been:

  • Structural code editors to give a more refined editing experience with more powerful and robust code intelligence and transformations.

  • Rethinking code organisation to make it easier to scale, manage, and adapt complex codebases. This means storing code declarations in some kind of graph database instead of a tree of files.

  • A more interactive debugging experience leveraging tightly integrated code introspection along with dynamic code compilation and execution.

My research in this area is still early and highly experimental, though I now have a better idea of what probably doesn’t work, and what could work.

Whilst searching for prior art, I encountered this video which is the best demonstration I’ve found of a structural editor and its benefits.

2023H1 Programming System

I started off by implementing a programming language with a syntax very much like Clojure, because I was most comfortable with that at the time. The simplicity of Clojure’s syntax makes it very powerful, and makes it a particularly good fit for a structural code editor.

In addition to a structural editor, this project implements a primitive compiler that emits both bytecode and (to a limited extent) x64 machine code. Initially, I implemented a tree-walking interpreter, but that was awfully slow. The bytecode interpreter was a significant improvement, but still orders of magnitude slower than machine code. The x64 backend is capable of directly generating a valid Windows executable binary without needing to depend on an assembler. The most sophisticated program I tested was one that counts the total number of lines of all text files in a directory.

I also implemented a primitive debugger that integrates with the bytecode interpreter. Unfortunately, I do not have any images of this and now I cannot easily compile the project. This debugger had a GUI that displayed a listing of the bytecode instructions as well as the current contents of virtual registers and stack memory. You could set breakpoints, step through the program, and watch the program state change in real time.
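For readers unfamiliar with the idea, the core of such a debugger is small: an interpreter that can execute exactly one instruction at a time, plus a set of breakpoint addresses. The following Python sketch is purely illustrative; the instruction set, the `VM` class and the `PROGRAM` listing are all hypothetical and are not taken from the original project.

```python
class VM:
    """Tiny register-based bytecode interpreter with stepping and breakpoints."""

    def __init__(self, program, num_registers=4):
        self.program = program
        self.regs = [0] * num_registers   # virtual registers shown by a debugger UI
        self.pc = 0                       # index of the next instruction to execute
        self.halted = False
        self.breakpoints = set()          # instruction indices to stop at

    def step(self):
        """Execute exactly one instruction."""
        if self.halted:
            return
        op, *args = self.program[self.pc]
        self.pc += 1
        if op == "load":                  # load dst, constant
            self.regs[args[0]] = args[1]
        elif op == "add":                 # add dst, a, b
            self.regs[args[0]] = self.regs[args[1]] + self.regs[args[2]]
        elif op == "jlt":                 # jlt a, b, target: jump if regs[a] < regs[b]
            if self.regs[args[0]] < self.regs[args[1]]:
                self.pc = args[2]
        elif op == "halt":
            self.halted = True
        else:
            raise ValueError(f"unknown opcode {op!r}")

    def run(self):
        """Run until the program halts or the next breakpoint is reached."""
        self.step()  # always make progress, even if currently stopped on a breakpoint
        while not self.halted and self.pc not in self.breakpoints:
            self.step()


# Hypothetical program: sums the integers 1..5 into register 0.
PROGRAM = [
    ("load", 0, 0),    # 0: total = 0
    ("load", 1, 0),    # 1: i = 0
    ("load", 2, 5),    # 2: limit = 5
    ("load", 3, 1),    # 3: one = 1
    ("add", 1, 1, 3),  # 4: i = i + one
    ("add", 0, 0, 1),  # 5: total = total + i
    ("jlt", 1, 2, 4),  # 6: loop back to 4 while i < limit
    ("halt",),         # 7
]
```

A GUI like the one described would simply redraw `regs` and `pc` after each call to `step` or `run`.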

I was initially using Skia for rendering, but later switched to doing basic software rendering with FreeType and Harfbuzz handling text rasterisation and layout. In the bytecode interpreter, I used Dyncall for dynamically calling procedures in dynamically-loaded libraries.

2023H2 Programming System

My subsequent experiments involved building structural editors for the Jai programming language. The problem with a Lisp-like syntax is that its simplicity means it can be verbose when expressing the same amount of information that a statically-typed language needs to express. In a statically-typed language like Jai, it pays off to have more complex syntax in return for brevity.

At first, I tried an editor with four types of node: token, string, block and newline. A block corresponds to a pair of brackets—(), [] or {}—and contains a list of nodes. However, this did not feel powerful enough, so I proceeded to implement an alternative design with a more heterogeneous AST that more closely reflects what the compiler is working with. For example, declarations, binary operations, if statements and procedures are each their own distinct type of node in the editor.
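As a rough illustration of the first design, the four node kinds can be modelled as a small tagged tree. This is only a Python sketch, not the Jai code of the actual editor; the class names and the `parse` helper are hypothetical, and string escapes are deliberately ignored.

```python
from dataclasses import dataclass, field

BRACKETS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(BRACKETS.values())
DELIMS = set(BRACKETS) | CLOSERS | {'"'}


@dataclass
class Token:
    text: str


@dataclass
class String:
    text: str


@dataclass
class Newline:
    pass


@dataclass
class Block:
    open: str                                   # one of "(", "[", "{"
    children: list = field(default_factory=list)


def parse(src):
    """Parse source text into a list of top-level nodes."""
    root = Block(open="")                       # synthetic root holding top-level nodes
    stack = [root]
    i = 0
    while i < len(src):
        ch = src[i]
        if ch == "\n":
            stack[-1].children.append(Newline())
            i += 1
        elif ch.isspace():
            i += 1                              # other whitespace only separates tokens
        elif ch in BRACKETS:
            block = Block(open=ch)
            stack[-1].children.append(block)
            stack.append(block)
            i += 1
        elif ch in CLOSERS:
            if len(stack) == 1 or BRACKETS[stack[-1].open] != ch:
                raise ValueError(f"unbalanced {ch!r} at offset {i}")
            stack.pop()
            i += 1
        elif ch == '"':
            j = src.find('"', i + 1)            # no escape handling in this sketch
            if j < 0:
                raise ValueError("unterminated string")
            stack[-1].children.append(String(src[i + 1:j]))
            i = j + 1
        else:
            j = i
            while j < len(src) and not src[j].isspace() and src[j] not in DELIMS:
                j += 1
            stack[-1].children.append(Token(src[i:j]))
            i = j
    if len(stack) != 1:
        raise ValueError("unclosed bracket")
    return root.children
```

Every editing operation then only has to understand these four shapes, which is exactly the uniformity that the more heterogeneous design gives up.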

The result was a fun, visually unique demonstration, but it revealed a challenge in the design space of structural editors: increasing the heterogeneity of the editor AST greatly increases complexity and makes editing operations similarly heterogeneous. In order to create nodes of a certain type, my solution was to use special key bindings, text commands, and automatic conversion of a typed-out keyword into its corresponding type of node. This does not feel as fluid as the experience of editing a uniform array of characters in a traditional text file. Manipulating the tree becomes more difficult since it requires a larger set of more specialised operations, which imposes a greater cognitive burden.

There are also questions like:

  • How specialised should the editor’s AST be?

  • Should you be able to insert AST nodes of a certain type into a slot that is invalid for its type?

  • Should you be allowed to partially select the contents of a node in the same selection that contains its siblings? How should selection ranges on a tree structure work?

Having done this experiment, I am now leaning towards a structure that is more uniform, but that needs further research.

Graphics Rendering

When implementing GUIs from scratch, there is a question of how to render graphical elements to the screen (after everything is laid out). A few options:

  • Use a library (like Skia or Blend2D) that abstracts away all the difficult parts.

  • Build a CPU-based software renderer.

  • Use the GPU by directly interfacing with an API like OpenGL, Direct3D or Vulkan.

Initially, I was using Skia because it is very capable, cross-platform, and was familiar to me. If I recall correctly, I was having performance issues rendering text in Skia. Almost certainly, this was because I was using it wrong and probably there was a lack of glyph caching, but I took this as a good excuse to write an extremely simple software renderer and handle text myself using FreeType and Harfbuzz.

My original software renderer was slow and I needed more performance, so I looked around. For a brief moment, I used Blend2D, a 2D CPU-based graphics rasterisation library, but performance was still not satisfactory. Furthermore, Blend2D is a large, complex project that I would rather not depend on.

In 2023, I eventually gave in to GPU rendering and spent over two weeks learning how to use Direct3D 12 to produce the most basic 2D graphics. This was some serious drudgery, but in the end I had working shaders for drawing rectangles, images and glyph runs of text. OffsetAllocator was used for managing GPU heap memory. The speed gain from GPU rendering was especially apparent in the OKLCH colour picker I made.

Later, in 2025, I explored writing a CPU-based software renderer in x86 assembly language. Because this used AVX2 instructions and could operate on eight pixels at a time, it was much faster than my prior software rendering attempt. I experimented with an implementation that operates on 8-bit integer colour channels, and another that operates on 16-bit floating-point colour channels for better blending accuracy (using special FP16 x86 SIMD instructions). My aim with this effort was to win some simplicity and avoid having to interface with GPU graphics APIs, all of which have significant start-up latency.

This software renderer was initially inspired by the design in Handmade Hero, which approximates gamma correction with square and square root operations. Unfortunately, this approximation results in a very noticeable loss of accuracy, and attempting anything more accurate is prohibitively expensive. Even without accurate gamma correction, the software renderer is still much slower than the GPU. Therefore, I had to conclude that the GPU was the only feasible option for performant and accurate 2D graphics, which sadly meant enduring more of the misery of working with complex graphics APIs.
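To give a feel for the size of that error, the sketch below compares the exact sRGB transfer function with the square/square-root shortcut when blending two channel values. It is written in Python for readability rather than AVX2 assembly, the function names are hypothetical, and the sRGB constants are the standard published ones.

```python
import math


def srgb_to_linear(c):
    """Exact sRGB decode for a channel value in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c):
    """Exact sRGB encode for a linear channel value in [0, 1]."""
    return c * 12.92 if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055


def blend_exact(a, b, t):
    """Blend two sRGB-encoded values in linear light using the exact curve."""
    lin = (1 - t) * srgb_to_linear(a) + t * srgb_to_linear(b)
    return linear_to_srgb(lin)


def blend_approx(a, b, t):
    """Same blend, approximating decode as c*c and encode as sqrt(c)."""
    lin = (1 - t) * a * a + t * b * b
    return math.sqrt(lin)


def blend_naive(a, b, t):
    """Blend directly on the encoded values (no gamma handling at all)."""
    return (1 - t) * a + t * b


def max_decode_error(levels=256):
    """Largest gap between c*c and the exact decode over all 8-bit levels."""
    values = [i / (levels - 1) for i in range(levels)]
    return max(abs(v * v - srgb_to_linear(v)) for v in values)
```

Blending black and white at 50% shows the three behaviours side by side: the shortcut lands between the naive result and the exact one, which is better than no correction but still visibly off.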

Approaching the end of 2025, my faith in the Windows operating system was nearing zero. I did not want to tie myself to this rapidly decaying platform if I could avoid it. Thus, I started learning Vulkan and used it to implement a basic 2D renderer. Vulkan Guide helped me get set up and I used the VulkanMemoryAllocator library because apparently it’s silly not to. HowToVulkan is a newer guide that I haven’t looked at but seems promising.

Vulkan has the benefit of giving you access to the latest features in GPU technology, whereas OpenGL is antiquated and better avoided nowadays. Vulkan is not fun to work with, but that’s where the future is headed, and that’s not a decision I get to make. The bindless APIs in the newer Vulkan versions make things easier and enable substantial simplifications such as eliminating the need for a glyph atlas.

Designing a Typeface and Text Layout System

In designing UI systems, it has become clear to me that text rendering is by far the most complex and computationally-expensive part of rendering simple 2D UIs. This is because glyphs first have to be rasterised from a vector format into a bitmap image, then a potentially very complex set of rules needs to be applied to compute the position of each glyph in a string of text.

Glyph caching helps improve performance by saving the bitmap of a glyph, but the bitmap is only valid for a specific sub-pixel offset, and glyphs can be positioned at any non-integer coordinate. If you round the glyphs to the nearest pixel boundary, the text becomes noticeably unevenly spaced.

In my prior text rendering solution, I instead rounded each glyph to the nearest 1/3-pixel boundary which reduces the visual error, but requires up to three cached bitmaps per glyph.
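A minimal sketch of that scheme, assuming horizontal text so that only the x coordinate needs quantising. The names `quantise_x` and `GlyphCache` are hypothetical, and the `rasterise` callback stands in for whatever rasteriser is in use.

```python
import math

SUBPIXEL_STEPS = 3  # glyph origins snap to multiples of 1/3 pixel


def quantise_x(x):
    """Snap x to the nearest 1/3 pixel.

    Returns (whole_pixel, subpixel_index) with subpixel_index in {0, 1, 2}.
    """
    steps = math.floor(x * SUBPIXEL_STEPS + 0.5)
    return divmod(steps, SUBPIXEL_STEPS)


class GlyphCache:
    """Caches one bitmap per (glyph, sub-pixel offset) pair."""

    def __init__(self, rasterise):
        self.rasterise = rasterise   # callable(glyph_id, x_offset) -> bitmap
        self.bitmaps = {}

    def get(self, glyph_id, x):
        """Return (pixel_x, bitmap) for drawing glyph_id at horizontal position x."""
        pixel, sub = quantise_x(x)
        key = (glyph_id, sub)
        if key not in self.bitmaps:
            # Rasterise with the glyph shifted right by sub/3 of a pixel.
            self.bitmaps[key] = self.rasterise(glyph_id, sub / SUBPIXEL_STEPS)
        return pixel, self.bitmaps[key]
```

However many positions a glyph is drawn at, the cache never holds more than three bitmaps for it, and the worst-case placement error drops from half a pixel to a sixth of a pixel.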

Another solution I tried was to instead use a word cache: split up the text into short segments (words) and rasterise each word as a single bitmap image. This means that the spacing between glyphs in a single word is perfect, at the expense of more memory spent on storing bitmaps.

But I wanted something better. Following good engineering principles, I looked at my specific situation and considered, from first principles, what an optimal text rendering design would look like. Here are some of the constraints and assumptions I came up with:

  • Support only for left-to-right scripts like Latin, Cyrillic and Greek.

  • Glyphs must always be pixel-aligned (so there is one bitmap per glyph with no need for sub-pixel variants).

  • Support for characters of various widths (not just monospaced typefaces).

  • Kerning is applied to each pair of graphemes, with no further context allowed.

  • Font size is determined by cap height which must be an integer.

  • No sub-pixel anti-aliasing like ClearType; it complicates colour blending, only works well in limited cases, and results in colour fringing. I would prefer using high-DPI displays so that sub-pixel anti-aliasing can be a forgotten artefact of the past.
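Under these constraints, text layout collapses to integer arithmetic over a table of advances and a table of pair adjustments. The sketch below illustrates the idea in Python; the `layout` function and the example tables are hypothetical, and for brevity it iterates over Python characters rather than true grapheme clusters.

```python
def layout(text, advances, kerning):
    """Place glyphs left to right on whole-pixel positions.

    advances: dict mapping a character to its integer advance width in pixels.
    kerning:  dict mapping a (left, right) pair to an integer pixel adjustment.
    Returns (positions, total_width), where positions[i] is the x of text[i].
    """
    positions = []
    x = 0
    prev = None
    for ch in text:
        if prev is not None:
            # Pair kerning only: no context beyond the previous character.
            x += kerning.get((prev, ch), 0)
        positions.append(x)
        x += advances[ch]
        prev = ch
    return positions, x
```

Because every quantity is an integer, each glyph needs exactly one cached bitmap, and measuring a string costs two table lookups per character.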

By constraining the problem I am attempting to solve, I can search for opportunities to make the targeted solution significantly superior to general solutions like TrueType and Harfbuzz. And indeed, I concluded that I would be better off throwing out TrueType/OpenType, Harfbuzz (or equivalent) and FreeType (or equivalent).

Firstly, I needed a glyph rasteriser to replace FreeType. The goal of this is to have a simple piece of code that does exactly what I need in a way I control. I learned about the mathematics of glyph rasterisation, giving careful consideration to floating-point precision limitations, and proceeded to implement a rasteriser in less than 800 lines. This rasteriser directly supports quadratic Bézier curves, unlike stb_truetype, which has to approximate such curves by a series of straight line segments.
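One way to see why quadratic curves can be handled directly: finding where a quadratic Bézier crosses a horizontal scanline only requires solving a quadratic equation in the curve parameter t. The Python sketch below shows that single step under simplifying assumptions. It is not the rasteriser described here: it computes no anti-aliased coverage, and it uses the textbook quadratic formula without the floating-point care a real implementation needs. The function names are hypothetical.

```python
import math


def quad_point(p0, p1, p2, t):
    """Evaluate a quadratic Bézier curve at parameter t."""
    mt = 1.0 - t
    return (mt * mt * p0[0] + 2 * mt * t * p1[0] + t * t * p2[0],
            mt * mt * p0[1] + 2 * mt * t * p1[1] + t * t * p2[1])


def scanline_crossings(p0, p1, p2, y):
    """Return the sorted x coordinates where the curve crosses the line at height y.

    Expanding the curve gives y(t) = a*t^2 + b*t + p0.y, so the crossings are
    the roots of a*t^2 + b*t + (p0.y - y) = 0 that lie in [0, 1).
    """
    a = p0[1] - 2 * p1[1] + p2[1]
    b = 2 * (p1[1] - p0[1])
    c = p0[1] - y
    if abs(a) < 1e-12:
        if abs(b) < 1e-12:
            return []                      # horizontal segment: no crossing
        ts = [-c / b]                      # degenerate case: a straight line
    else:
        disc = b * b - 4 * a * c
        if disc < 0:
            return []                      # the scanline misses the curve
        root = math.sqrt(disc)
        ts = [(-b - root) / (2 * a), (-b + root) / (2 * a)]
    # Half-open interval so a crossing at a shared endpoint is counted once.
    return sorted(quad_point(p0, p1, p2, t)[0] for t in ts if 0.0 <= t < 1.0)
```

A flattening rasteriser would instead replace the curve with several line segments and intersect each of them, trading one square root for extra segments and a small geometric error.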

Secondly, I needed a typeface. Since I had chosen to reject the design of TrueType, I had shut myself out of the ecosystem of pre-existing typefaces, so I now had to create my own. This process took much longer than I initially anticipated, and so I spent months swimming in a laborious mathematical soup of twists, turns, and dead ends.

TrueType fonts work by representing each glyph as a collection of contours, where each contour consists of segments of either quadratic Bézier curves or straight lines (which are just a degenerate case of a quadratic Bézier curve). It gets further complicated by font hinting, which requires the use of an interpreter to read TrueType bytecode instructions which make adjustments to the vertices for better alignment with the pixel boundaries at a certain font size.

I also wanted to support hinting (where the glyph shape is optimised for the pixel boundaries at a certain size) but I wanted to implement it in a simpler way. Instead of representing my typeface in a bespoke data format (something like a ttf or otf file), I decided that my typeface would be defined in executable code for maximum flexibility. With this design, the process of obtaining a set of glyph contours involves calling a function with three key parameters: the typeface, the font size, and the glyph number. The code then directly computes the optimal glyph vectors depending on the font size, without needing a complicated bytecode interpreter. It’s all just regular imperative code.
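The sketch below shows what a glyph defined in code can look like, using two trivially simple letters. It is a hypothetical Python illustration rather than the actual typeface code: the `glyph_contours` signature mirrors the three parameters described above, but the `stem_ratio` and `width_ratio` parameters and the glyph shapes are invented for the example, and glyphs are selected by character instead of glyph number for readability.

```python
import math


def round_half_up(value):
    """Round to the nearest integer, with halves rounding up."""
    return math.floor(value + 0.5)


def rect(x0, y0, x1, y1):
    """Closed rectangular contour as a list of (x, y) vertices."""
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]


def glyph_contours(typeface, cap_height, glyph):
    """Return the contours of one glyph, already fitted to the pixel grid.

    typeface:   dict of design parameters (hypothetical: stem_ratio, width_ratio).
    cap_height: font size in whole pixels.
    glyph:      which glyph to produce.
    """
    # Hinting is ordinary code: the stem is rounded to a whole number of
    # pixels (never fewer than one), so its edges land on pixel boundaries.
    stem = max(1, round_half_up(typeface["stem_ratio"] * cap_height))
    if glyph == "I":
        return [rect(0, 0, stem, cap_height)]
    if glyph == "L":
        width = max(stem + 1, round_half_up(typeface["width_ratio"] * cap_height))
        return [
            rect(0, 0, stem, cap_height),   # vertical stem
            rect(stem, 0, width, stem),     # horizontal foot, same thickness
        ]
    raise KeyError(glyph)
```

The point of the design is visible even at this scale: the size-dependent decisions that TrueType expresses as bytecode are just a `max` and a rounding call.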

Two main principles informed the design of my typeface:

  • Legibility: characters should be reasonably distinct and readable, even at rather small font sizes.

  • Performance: there should be minimal computational cost involved in generating the glyph vectors and rasterising them. This meant that I aimed to use as few curve segments as possible, including minimising the use of cubic Bézier curves (which must be broken up into quadratics prior to rasterisation).
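To show why cubics cost more, here is a sketch of the conversion step in Python. It splits a cubic in half with de Casteljau subdivision and approximates each half with a single quadratic using the common midpoint formula. This is a standard textbook approach and not necessarily the one used for this typeface; a production version would keep subdividing until an error bound is met. All function names are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation between two 2D points."""
    return (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)


def split_cubic(p0, p1, p2, p3, t=0.5):
    """Split a cubic Bézier at t into two cubics (de Casteljau's algorithm)."""
    a, b, c = lerp(p0, p1, t), lerp(p1, p2, t), lerp(p2, p3, t)
    d, e = lerp(a, b, t), lerp(b, c, t)
    m = lerp(d, e, t)                      # the point on the curve at t
    return (p0, a, d, m), (m, e, c, p3)


def cubic_to_quadratic(p0, p1, p2, p3):
    """Approximate one cubic by one quadratic with the same endpoints.

    The control point (3*(p1 + p2) - p0 - p3) / 4 is exact whenever the cubic
    is really a degree-elevated quadratic, and an approximation otherwise.
    """
    ctrl = ((3 * (p1[0] + p2[0]) - p0[0] - p3[0]) / 4,
            (3 * (p1[1] + p2[1]) - p0[1] - p3[1]) / 4)
    return (p0, ctrl, p3)


def cubic_to_two_quadratics(p0, p1, p2, p3):
    """Split the cubic at its midpoint and approximate each half."""
    left, right = split_cubic(p0, p1, p2, p3)
    return [cubic_to_quadratic(*left), cubic_to_quadratic(*right)]
```

Every cubic therefore turns into at least two segments for the rasteriser, which is the cost that a quadratic-first design avoids.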

To make the glyphs look as good as possible, I studied quite a bit of mathematical theory as I needed to be able to generate curves that ideally maintain G2 continuity, but these curves need to be cheap to generate. For weeks, I chased after mathematical derivations in SageMath trying to find formulae for computing curves that satisfied various invariants. This turned out to be much less fruitful than I had hoped; often, the sets of equations would be too complex to solve, or no solution would exist at all. I had also taken plenty of time to check the prior research, including the work done by Raph Levien, but I wanted something as simple as possible and much of the research involved mathematical expressions that lack an analytical solution and thus are not trivial to compute (which would violate my performance constraints).
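For reference, G2 continuity at a join means the two curves meet, share a tangent direction, and have equal signed curvature at that point. For a quadratic Bézier with control points P0, P1, P2, the curvature at either end has a simple closed form, which makes the condition cheap to check. The Python sketch below uses that standard result from differential geometry; it is not the construction used for the typeface, the function names are hypothetical, and it assumes no control point coincides with its neighbouring endpoint.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1])


def cross(a, b):
    return a[0] * b[1] - a[1] * b[0]


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def end_curvatures(p0, p1, p2):
    """Signed curvature of a quadratic Bézier at t = 0 and at t = 1.

    With u = p1 - p0 and v = p2 - p1:
        k(0) = cross(u, v) / (2 * |u|^3)
        k(1) = cross(u, v) / (2 * |v|^3)
    """
    u, v = sub(p1, p0), sub(p2, p1)
    area = cross(u, v)
    return area / (2 * math.hypot(*u) ** 3), area / (2 * math.hypot(*v) ** 3)


def is_g2_join(first, second, tol=1e-9):
    """Check whether quadratic `second` continues quadratic `first` with G2 continuity."""
    if first[2] != second[0]:
        return False                                   # G0: curves must meet
    t_out = sub(first[2], first[1])                    # tangent leaving `first`
    t_in = sub(second[1], second[0])                   # tangent entering `second`
    scale = math.hypot(*t_out) * math.hypot(*t_in)
    if abs(cross(t_out, t_in)) > tol * scale or dot(t_out, t_in) <= 0:
        return False                                   # G1: same tangent direction
    k_out = end_curvatures(*first)[1]
    k_in = end_curvatures(*second)[0]
    return abs(k_out - k_in) <= tol * max(1.0, abs(k_out), abs(k_in))
```

Checking a join is easy; the hard part described above is the inverse problem of choosing control points so that the check passes while the outline still has the intended shape.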

Eventually, I completed the designs for the ASCII range of characters. You may think, as I did, that the small range of ASCII characters wouldn’t take that much work to design. But I was waiting for the day that it would end, and I felt like it would never come.

Putting this all together into a text rendering system, I was definitely surprised by how well the result turned out. The text rendering system is now much simpler, and I can now comfortably use my own proportionally-spaced font in my own code editor.

From Clojure to Jai

This section presents some of my reasoning as to why my style of programming dramatically shifted away from the dynamic, functional, GC-powered land of Clojure and towards embracing a more imperative, statically-typed, systems-level kind of programming.

Clojure Is Too Slow, and Performance Matters

Having become reasonably experienced using the Clojure programming language for over two years, I repeatedly found myself running into the same insurmountable obstacle: performance. Put simply, Clojure is too inefficient for high-performance software, and the program start-up time is unacceptable. I have spent a lot of time optimising performance in Clojure programs but nothing is ever enough.

In my previous article, I discussed my attempt at building a compiler for a custom JVM-based language which aimed to be more performant, but eventually I had to face reality: the Java Virtual Machine is too slow and limiting.

There is a good reason the JVM has not taken over the world of performance-critical software (operating systems, codecs, game engines). Even in cases where performance is less critical, choosing to build your software with inefficient technology like the JVM still does a disservice to your users, since the result is often:

  • Less responsive and more sluggish (even for simple UIs, the slightest difference matters).

  • Less power efficient (especially important for laptops).

  • More complex (larger binary sizes, possibly requiring a separate JVM installation).

To illustrate the importance of performance, if a compiler takes two seconds to compile instead of one, the added delay further breaks the flow of development and dampens one’s spirit. Billions of people use software every day, so each minor delay quickly adds up to an unfathomably great waste of human hours spent waiting for software to respond.

Performance also affects the user’s pattern of behaviour. For example, the slow file search in Windows File Explorer means that I never use that feature. If the performance of the search were much better, then superior workflows could be enabled that take advantage of that search feature. This is not merely hypothetical: nowadays, with FilePilot, I use text search to rapidly navigate the file system all the time.

A common dogma in computer programming is that you should not be optimising for performance until you know your performance requirements and bottlenecks. To a large extent that is true, but it is not an excuse for things that are slow for no good reason; these things, polluted with junk, can be made faster without sacrificing a good programming experience, and in fact the programmer’s experience will often be improved by prioritising performance and simplicity.

As soon as you choose to use something like the JVM, you have already accepted a significant performance and memory penalty. If people don’t think about performance from the beginning, we end up with a proliferation of slow software (the world we live in now). You may take steps to optimise the program (which naturally makes it more brittle), but you will always have the foundational issue of running on a suboptimal, managed VM.

Programs should be fast by default—and the language should push the programmer in the direction of something that is not horribly inefficient. At the very least, there should be a pathway to transform the program into something that makes reasonably efficient use of the hardware.

Also, if you are developing a library, you should assume highly strict performance criteria, since you do not know in which contexts your library may be used. Pouring a huge volume of effort into a fundamentally slow technology may waste the time of your users, and it will also cut off those users for whom the library is too slow: their time is wasted because they cannot do what they want without reimplementing the library’s features.

In the case of desktop applications, it is imperative that they be fast and responsive, out of respect for the user. Being fast and responsive should be the norm, without any special effort. Computers are incredibly fast (fast enough to drive interactive 3D worlds with complicated graphics), yet even simple programs struggle to meet reasonable performance expectations. Even if you do make the program feel responsive, its inefficiencies can still impact the user experience. For instance, you may consume an absurd amount of memory that reduces the user’s multitasking potential. Additionally, through excessive CPU usage, you could bring real discomfort to the user due to increased fan noise.

Performance aside, the JVM’s class loader system was difficult to work with, and it was impossible to do the things I wanted in a non-clunky way. The JVM is drowning in unnecessary complexity, which I found apparent when working with its bytecode and flawed object model.

I think good engineering is important, and that means valuing efficient and simple solutions. The JVM is not that, and nor is it something that can be fixed; its problems stretch down to the roots.

Finding a Good Programming Language

At the dawn of 2023, I departed from the sweet comforts of JVM-enforced memory safety and escaped into the wilderness of systems-level languages with manual memory management.

I had briefly tried Rust in late 2021, but never again. The misguided design of this language creates tremendous friction and imposes artificial puzzles that get in the way of solving the actual problem at hand. The costs outweigh the benefits, at least in most cases, and the compile times are torturous.

So, I began to learn Zig as I set out to create a new compiler. However, it didn’t take long before my frustration with Zig’s language design reached a critical threshold. For instance, the Zig compiler treats any unused variable as an error, and that is just not the way I want to program.

I promptly adopted Odin, which has a far superior design philosophy. There is still a lot I don’t like about it—like the restrictive namespacing system—but it worked well enough for my compiler, interpreter and debugger project of the first half of 2023. Eventually, the growing project size (albeit still modest) meant that the LLVM compile times were unfortunately becoming a significant source of friction.

In the middle of 2023, I began using the Jai programming language. It is, by an order of magnitude, more well-designed and powerful than any other language in its class, thanks to a unique combination of fast compile times, powerful polymorphism, meta-programming powered by a bytecode interpreter, helpful error messages and ergonomic syntax.

Despite being rather complex, Jai actually feels very simple, uniform, and almost Lisp-like in its design. Unlike other languages, Jai spends its complexity budget wisely, in the places where the added complexity pays off the most. I am using Jai to this day and that is not changing any time soon.

But the Libraries! The Ecosystem!

One of the benefits of being tapped into the Clojure and JVM ecosystem is the access to a vast range of software libraries. However, I value that much less now than I used to.

Building upon layers of abstraction can often be harmful due to the additional constraints of that abstraction. The API of a library may push you towards a design that is not optimal for the problem you are trying to solve. Also, libraries are usually designed to support a variety of use-cases, which probably means that there is more abstraction (and more features) than you truly need.

Often, I have run into trouble when my specific use-case does not fit the exact mould of the library. As a result, when going to do something specific, the library’s design may outright prevent me from doing what I want without having to fork and modify the source code—an undesirable solution that partially defeats the point of using a library in the first place.

Furthermore, I don’t tend to use a large number of third-party libraries, and many of the ones I do use could be feasibly reimplemented myself. I’d say writing that extra code is worth it in many cases, especially where the library in question is simple or only a small subset of it is used. As a bonus, the code you write is tailored to your specific needs.

In any case, I’m of the philosophy that you shouldn’t be afraid to throw away code often. An implication of this is that code should be easy to write and refactor, and a reason for this practice is that the more you work on a project, the better an idea you get about what your program should ultimately look like. Rather than being held back by old, uninformed code, you should have the freedom to rebuild substantial pieces from a simpler foundation backed by experience. A library is analogous to the old, uninformed code you wrote before you fully understood your needs.

Remember: Source code you don’t own or understand is a liability; it’s good to keep things minimal.

Dynamic Code Execution

For me, the most compelling feature of the JVM was the ability to dynamically modify and inspect the live program. That said, with some work, simple hot code loading can be implemented in an unmanaged language. By not having to deal with the class loader system, it is possible to design a programming language with a much better system for dynamically loading code, tailored and fine-tuned according to how I want it. In fact, it may turn out that the simpler solution overall is to implement this stuff from scratch and bypass the complexity put forth by the JVM.

Another consideration: how dynamic do you need to be? By using self-describing objects everywhere, the JVM is highly dynamic, but at the great expense of performance. Is that trade-off worth it? I don’t think so, after having gained experience in modern unmanaged languages.

I believe it is preferable to opt-in to these dynamic runtime features in the places and times you need them. Ideally, no changes to the source code should be necessary to compile a more dynamic and introspective version of the program compared to a completely static version. Additionally, in the spirit of Mouldable Development, it should be cheap to create ad-hoc tools for live visualisation, debugging, and modification of the program. The video games industry has some good examples of bespoke in-house tooling.

Conclusion

From a bird’s-eye view, I have presented some hopefully interesting aspects of my projects. This skips over many details like the mathematics involved in typeface design as well as entire topics including the compiler implementations. Also, I haven’t touched on foundational systems such as the memory allocators and the arena allocator system that I use throughout my codebase. There is also a lot yet to talk about relating to UI event handling, caching, and layout. Maybe another day.

To wrap up, I would like to emphasise a piece of truth that I repeatedly discovered as I set out down these paths of learning unfamiliar topics: the low-level details are not as frightening as they may seem at first.

  • When I was writing Clojure, the idea of working with JVM bytecode seemed daunting. Then, I spent time learning about it and found it was not so difficult after all.

  • Coming from garbage-collected languages, the idea of using a language with manual memory management sounded very tricky and error-prone, but it’s not actually that bad with the right techniques.

  • Learning assembly language and manually generating x64 machine code was intimidating at first, but it just takes perseverance to read through the vast Intel documentation before being able to create something simple that works. Assembly language itself is actually very simple to understand.

  • Learning to use a graphics API, particularly one like D3D12 or Vulkan, seemed like a monumental task due to the vast amount of prerequisite knowledge. And indeed it is, because there is no happiness to be found in modern graphics APIs. However, once you dip a foot in and start learning, you start to pick things up, things get more familiar, and thus things get a bit easier.

Not everything is like this—some things are just really difficult and may even take a lifetime to master. For instance, I’ve made no attempt at an optimising compiler because there is no limit to how complicated that gets when taken to the fullest extent (see LLVM). Also, when working with modern graphics APIs like D3D12, even industry professionals are prone to make subtle memory management mistakes thanks to the overly convoluted nature of these APIs.

But for many things, all it takes is the initial step of dipping your toes in the water, then as you begin uncovering new territory, everything becomes a lot clearer and less daunting.


Thanks for reading. To possibly receive future updates, you may choose to subscribe via email.


From Prototype to Production: The Business Case for Chachaml Part 3

This is the final article in our series, Clojure Meets Production MLOps: How chachaml Delivers AI-Native Workflows.

If you’re joining here, you may want to start with the earlier posts in the series.

In this final article, we’ll focus on the business side. We’ll look at what adopting chachaml means for engineering teams, how it helps shorten the path from prototype to production, and why that matters when you’re building AI systems that need to last.

Real-World Use Case #1: End-to-End Model Development Inside a Clojure System

A team builds a recommendation engine for a Clojure app. It uses data from their existing databases and services. That data moves through feature pipelines and into model training.

With chachaml, every training run is tracked automatically. Everything—parameters, metrics, artifacts, and results—is gathered in one place, making it easy to review and contrast your experiments. 

Once the model looks good, the team registers it, handles versioning, and promotes it through the model registry. After that, deployment is just part of the flow, and they monitor its performance from there.

The whole process happens in a single environment: 

Data Ingestion → Training → Tracking → Registry → Deployment

Example: End-to-end run in a single Clojure form

;; 30-second quickstart — full end-to-end tracking
(ml/with-run {:experiment "quickstart"}
  (ml/log-params {:lr 0.01 :epochs 50})
  (ml/log-metric :accuracy 0.94)
  (ml/log-artifact "model" {:weights [1.0 2.0] :bias 0.3}))

There’s no need to piece together multiple tools or move between different systems.

Benefits

  • Single technology stack.
  • Less operational complexity.
  • Faster iteration.
  • Easier team collaboration.
  • Better visibility across the ML lifecycle.

For teams already using Clojure, this makes it much easier to take machine learning projects from development to production.


Real-World Use Case #2: Hybrid Python + Clojure ML Infrastructure.

For machine learning, many teams don’t use just one language.

A common setup is a Python data science team working alongside a Clojure production platform.

Data scientists train and evaluate models in Python. The production application runs in Clojure.

Chachaml helps connect both sides.

Teams can send runs, metrics, and artifacts using simple HTTP APIs. 

If they are working in Python, they can connect with chachaml using scikit-learn or libpython-clj2. All artifacts are stored in a shared S3-compatible storage, so everyone—across different teams and environments—can access what they require.

Here’s the general flow: 

Python Training → chachaml Tracking → Shared Artifact Storage → Clojure Production Platform

Example: Hybrid Python + Clojure workflow

;; Python side: push run via HTTP (shown above)
;; Clojure side: load artifact and serve
(let [model (reg/load-model "iris-classifier")]
  (predict model new-features))

;; Shared S3 artifact storage available to both sides
(artifact/store! run-id "model" model {:backend :s3})

This setup comes with some real perks: 

  • Python tooling for data science.
  • Clojure for production systems.
  • Shared experiment tracking.
  • Centralized artifact management.
  • Less duplication across teams.
  • A consistent view of the ML lifecycle.

Teams don’t have to replace their preferred tools; all components are centralized, making ML operations much easier to handle. 


What This Means for CTOs and Engineering Leaders

➜ Strategic Benefits

For engineering leaders, machine learning isn’t just about building models anymore—it’s become all about keeping operations running smoothly. 

These days, as teams adopt more AI systems, they end up managing a growing set of tools, workflows, and deployment setups. Before the team knows it, the whole setup becomes complicated and hard to handle.

A unified approach changes the game. 

  • Decrease tool sprawl.
  • Improve governance.
  • Increase reproducibility.
  • Lower operational complexity.

And as machine learning becomes part of core business systems, those benefits become more important.

➜ Evaluating Long-Term ML Platform Decisions

Choosing an MLOps platform is a long-term architecture decision. It shapes how teams build and deploy models, and how they run and troubleshoot them afterwards.

Factor                  Python-Centric  chachaml + Clojure
Stack consistency       Low             High
Operational complexity  High            Lower
REPL workflow           Limited         Native
JVM integration         Moderate        Excellent

If a team already works with Clojure and the JVM, sticking with that ecosystem reduces complexity. 

Teams can reduce the challenges of managing multiple platforms, so workflows move faster and the infrastructure remains less complex. 

The idea isn’t to avoid Python completely—it’s mainly about simplifying workflows by using one solid platform instead of many. 


Flexiana’s View on the Future of MLOps in Clojure

➜ The Ecosystem Is Entering a New Phase

Machine learning in Clojure isn’t just an experiment anymore. It used to be all about building models, managing data, and doing research, but that’s changed. 

Now, more teams are actually putting machine learning systems in place and handling everything that comes with keeping them up and running. 

AI is emerging in everyday business tools, too, which means teams need better ways to track experimental progress, manage their models, assess results, and collaborate across projects. 

As adoption grows, operational tooling becomes just as important as model development. That’s usually a sign that an ecosystem is maturing.

➜ Why chachaml Matters Beyond One Library

chachaml is part of a larger shift in the Clojure ecosystem.

The challenge for most organizations is no longer training a model. It’s managing machine learning systems over time. That requires tracking, governance, deployment workflows, monitoring, and collaboration.

These are operational problems, and they need operational tools.

At Flexiana, we’ve spent years building production systems, working with machine learning workloads in Clojure, and helping teams manage hybrid ML environments that combine Clojure services with Python-based tooling. Those experiences highlighted a common gap: Teams could build models but lacked the infrastructure to run them in production.

As machine learning becomes a standard part of software architecture, Clojure teams will need tools that support the full lifecycle of those systems.

That’s why chachaml matters. It’s not only about machine learning. It brings production ML to Clojure with lasting infrastructure. 


To Sum Up

chachaml really steps up Clojure’s MLOps game. This platform puts everything in one place. Teams can track their experiments, manage pipelines, monitor models, and stay on top of everything—all without the usual hassle. Collaboration actually feels easy here. 

The built-in MCP support is a solid perk, as it lets AI agents work directly with the ML data and operations.

At Flexiana, we’re excited about this. It feels like a big step toward making machine learning in Clojure more reliable and ready for serious use. 

Get started with chachaml. Talk to Flexiana.

Review the source code, documentation, and examples on GitHub to see how chachaml works in practice. Connect the chachaml MCP server to tools like Claude Code or Continue and explore machine learning data through natural language queries. 

If you’re evaluating machine learning infrastructure or planning a production ML system, we’d be happy to talk.

We help teams with:

  • Clojure ML architecture
  • Production MLOps platforms
  • Hybrid JVM/Python systems
  • AI infrastructure strategy
  • Enterprise software development

Whether you’re building a new ML platform or improving an existing one, we can help you design an approach that fits your systems, workflows, and long-term goals.

The post From Prototype to Production: The Business Case for Chachaml Part 3 appeared first on Flexiana.

Permalink

I am sorry, but everyone is getting syntax highlighting wrong

Translations: Russian

Syntax highlighting is a tool. It can help you read code faster. Find things quicker. Orient yourself in a large file.

Like any tool, it can be used correctly or incorrectly. Let’s see how to use syntax highlighting to help you work.

Christmas Lights Diarrhea

Most color themes have a unique bright color for literally everything: one for variables, another for language keywords, constants, punctuation, functions, classes, calls, comments, etc.

Sometimes it gets so bad one can’t see the base text color: everything is highlighted. What’s the base text color here?

The problem with that is, if everything is highlighted, nothing stands out. Your eye adapts and considers it a new norm: everything is bright and shiny, and instead of getting separated, it all blends together.

Here’s a quick test. Try to find the function definition here:

and here:

See what I mean?

So yeah, unfortunately, you can’t just highlight everything. You have to make decisions: what is more important, what is less. What should stand out, what shouldn’t.

Highlighting everything is like assigning “top priority” to every task in Linear. It only works if most of the tasks have lesser priorities.

If everything is highlighted, nothing is highlighted.

Enough colors to remember

There are two main use-cases you want your color theme to address:

  1. Look at something and tell what it is by its color (you can tell by reading text, yes, but why do you need syntax highlighting then?)
  2. Search for something. You want to know what to look for (which color).

1 is a direct index lookup: color → type of thing.

2 is a reverse lookup: type of thing → color.

Truth is, most people don’t do these lookups at all. They might think they do, but in reality, they don’t.

Let me illustrate. Before:

After:

Can you see it? I misspelled return as retunr and its color switched from red to purple.

I can’t.

Here’s another test. Close your eyes (not yet! Finish this sentence first) and try to remember what color your color theme uses for class names?

Can you?

If the answer to both questions is “no”, then your color theme is not functional. It might give you comfort (as in: I feel safe. If it’s highlighted, it’s probably code) but you can’t use it as a tool. It doesn’t help you.

What’s the solution? Have an absolute minimum of colors. So little that they all fit in your head at once. For example, my color theme, Alabaster, only uses four:

  • Green for strings
  • Purple for constants
  • Yellow for comments
  • Light blue for top-level definitions

That’s it! And I was able to type it all from memory, too. This minimalism allows me to actually do lookups: if I’m looking for a string, I know it will be green. If I’m looking at something yellow, I know it’s a comment.
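To make the two lookups concrete, here is a tiny sketch that writes the four colors above down as a map and derives the reverse direction from it. The keyword names are made up for illustration; the point is only that a palette this small fits in one short map, and in your head.

```clojure
(require '[clojure.set :as set])

;; Direct lookup: color -> kind of thing (the four colors listed above).
;; The keyword names are illustrative, not taken from any theme file.
(def color->kind
  {:green      :string
   :purple     :constant
   :yellow     :comment
   :light-blue :top-level-definition})

;; Reverse lookup: kind of thing -> color.
(def kind->color (set/map-invert color->kind))

(color->kind :yellow) ;; => :comment
(kind->color :string) ;; => :green
```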

Limit the number of different colors to what you can remember.

If you swap green and purple in my editor, it’ll be a catastrophe. If somebody swapped colors in yours, would you even notice?

What should you highlight?

Something there isn’t a lot of. Remember—we want highlights to stand out. That’s why I don’t highlight variables or function calls—they are everywhere, your code is probably 75% variable names and function calls.

I do highlight constants (numbers, strings). These are usually used more sparingly and often are reference points—a lot of logic paths start from constants.

Top-level definitions are another good idea. They give you an idea of a structure quickly.

Punctuation: it helps to separate names from syntax a little bit, and you care about names first, especially when quickly scanning code.

Please, please don’t highlight language keywords: class, function, if, else, stuff like this. You rarely look for them: “where’s that if” is a valid question, but you will be looking not at the if keyword, but at the condition after it. The condition is the important, distinguishing part. The keyword is not.

Highlight names and constants. Grey out punctuation. Don’t highlight language keywords.

Comments are important

The tradition of using grey for comments comes from the times when people were paid by line. If you have something like

of course you would want to grey it out! This is bullshit text that doesn’t add anything and was written to be ignored.

But for good comments, the situation is opposite. Good comments ADD to the code. They explain something that couldn’t be expressed directly. They are important.

So here’s another controversial idea:

Comments should be highlighted, not hidden away.

Use bold colors, draw attention to them. Don’t shy away. If somebody took the time to tell you something, then you want to read it.

Two types of comments

Another secret nobody is talking about is that there are two types of comments:

  1. Explanations
  2. Disabled code

Most languages don’t distinguish between those, so there’s not much you can do syntax-wise. Sometimes there’s a convention (e.g. -- vs /* */ in SQL), then use it!

Here’s a real example from a Clojure codebase that makes perfect use of two types of comments:

Disabled code is gray, explanation is bright yellow

Light or dark?

Per statistics, 70% of developers prefer dark themes. Being in the other 30%, that question always puzzled me. Why?

And I think I have an answer. Here’s a typical dark theme:

and here’s a light one:

On the latter one, colors are way less vibrant. Here, I picked them out for you:

Notice how many colors there are. No one can remember that many.

This is because dark colors are in general less distinguishable and more muddy. Look at Hue scale as we move brightness down:

Basically, in the dark part of the spectrum, you just get fewer colors to play with. There’s no “dark yellow” or good-looking “dark teal”.

Nothing can be done here. There are no magic colors hiding somewhere that have both good contrast on a white background and look good at the same time. By choosing a light theme, you are dooming yourself to a very limited, bad-looking, barely distinguishable set of dark colors.

So it makes sense. Dark themes do look better. Or rather: light ones can’t look good. Science ¯\_(ツ)_/¯

But!

But.

There is one trick you can do, that I don’t see a lot of. Use background colors! Compare:

The first one has nice colors, but the contrast is too low: letters become hard to read.

The second one has good contrast, but you can barely see colors.

The last one has both: high contrast and clean, vibrant colors. Lighter colors are readable even on a white background since they fill a lot more area. Text is the same brightness as in the second example, yet it gives the impression of clearer color. It’s all upside, really.

UI designers have known about this trick for a while, but I rarely see it applied in code editors:

If your editor supports choosing background color, give it a try. It might open light themes for you.

Bold and italics

Don’t use. This goes into the same category as too many colors. It’s just another way to highlight something, and you don’t need too many, because you can’t highlight everything.

In theory, you might try to replace colors with typography. Would that work? I don’t know. I haven’t seen any examples.

Using italics and bold instead of colors

Myth of number-based perfection

Some themes pay too much attention to being scientifically uniform. Like, all colors have the exact same lightness, and hues are distributed evenly on a circle.

This could be nice (to know if you have OCD), but in practice, it doesn’t work as well as it sounds:

OkLab l=0.7473 c=0.1253 h=0, 45, 90, 135, 180, 225, 270, 315

The idea of highlighting is to make things stand out. If you make all colors the same lightness and chroma, they will look very similar to each other, and it’ll be hard to tell them apart.

Our eyes are way more sensitive to differences in lightness than in color, and we should use it, not try to negate it.
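As a rough illustration of what such a "uniform" palette looks like in numbers, the sketch below converts the eight OkLCH values from the caption above into OkLab coordinates using the standard polar-to-Cartesian step (a = c·cos h, b = c·sin h). The function and var names are made up for this example.

```clojure
(defn oklch->oklab
  "Converts OkLCH (lightness, chroma, hue in degrees) to OkLab [L a b]."
  [l c h-deg]
  (let [h (Math/toRadians (double h-deg))]
    [l (* c (Math/cos h)) (* c (Math/sin h))]))

;; The eight hues from the caption, all with the same lightness and chroma.
(def palette
  (mapv #(oklch->oklab 0.7473 0.1253 %) (range 0 360 45)))

(defn chroma
  "Distance of an OkLab color from the gray axis."
  [[_ a b]]
  (Math/sqrt (+ (* a a) (* b b))))

;; Every color has exactly the same lightness...
(distinct (map first palette)) ;; => (0.7473)
;; ...and the same chroma, so only the hue angle tells them apart.
(map chroma palette)
```

Lightness, the channel our eyes are best at, carries no information at all in a palette like this.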

Let’s design a color theme together

Let’s apply these principles step by step and see where it leads us. We start with the theme from the start of this post:

First, let’s remove highlighting from language keywords and re-introduce base text color:

Next, we remove color from variable usage:

and from function/method invocation:

The thinking is that your code is mostly references to variables and method invocation. If we highlight those, we’ll have to highlight more than 75% of your code.

Notice that we’ve kept variable declarations. These are not as ubiquitous and help you quickly answer a common question: where does this thing come from?

Next, let’s tone down punctuation:

I prefer to dim it a little bit because it helps names stand out more. Names alone can give you the general idea of what’s going on, and the exact configuration of brackets is rarely equally important.

But you might roll with base color punctuation, too:

Okay, getting close. Let’s highlight comments:

We don’t use red here because you usually need it for squiggly lines and errors.

This is still one color too many, so I unify numbers and strings to both use green:

Finally, let’s rotate colors a bit. We want to respect nesting logic, so function declarations should be brighter (yellow) than variable declarations (blue).

Compare with what we started:

In my opinion, we got a much more workable color theme: it’s easier on the eyes and helps you find stuff faster.

Shameless plug time

I’ve been applying these principles for about 8 years now.

I call this theme Alabaster and I’ve built it a couple of times for the editors I used:

It’s also been ported to many other editors and terminals; the most complete list is probably here. If your editor is not on the list, try searching for it by name—it might be built-in already! I always wondered where these color themes come from, and now I became an author of one (and I still don’t know).

Feel free to use Alabaster as is or build your own theme using the principles outlined in the article—either is fine by me.

As for the principles themselves, they worked out fantastically for me. I’ve never wanted to go back, and just one look at any “traditional” color theme gives me a scare now.

I suspect that the only reason we don’t see more restrained color themes is that people never really thought about it. Well, this is your wake-up call. I hope this will inspire people to use color more deliberately and to change the default way we build and use color themes.

Permalink

Statistics made simple

I have a weird relationship with statistics: on one hand, I try not to look at it too often. Maybe once or twice a year. It’s because analytics is not actionable: what difference does it make if a thousand people saw my article or ten thousand?

I mean, sure, you might try to guess people’s tastes and only write about what’s popular, but that will destroy your soul pretty quickly.

On the other hand, I feel nervous when something is not accounted for, recorded, or saved for future reference. I might not need it now, but what if ten years later I change my mind?

Seeing your readers also helps to know you are not writing into the void. So I really don’t need much, something very basic: the number of readers per day/per article, maybe, would be enough.

Final piece of the puzzle: I self-host my web projects, and I use an old-fashioned web server instead of delegating that task to Nginx.

Static sites are popular and for a good reason: they are fast, lightweight, and fulfil their function. I, on the other hand, might have an unfinished gestalt or two: I want to feel the full power of the computer when serving my web pages, to be able to do fun stuff that is beyond static pages. I need that freedom that comes with a full programming language at your disposal. I want to program my own web server (in Clojure, sorry everybody else).

Existing options

All this led me on a quest for a statistics solution that would uniquely fit my needs. Google Analytics was out: bloated, not privacy-friendly, terrible UX, Google is evil, etc.

What is going on?

Some other JS solution might’ve been possible, but still questionable: SaaS? Paid? Will they be around in 10 years? Self-host? Are their cookies GDPR-compliant? How to count RSS feeds?

Nginx has access logs, so I tried server-side statistics that feed off those (namely, Goatcounter). Easy to set up, but then I needed to create domains for them, manage accounts, monitor the process, and it wasn’t even performant enough on my server/request volume!

My solution

So I ended up building my own. You are welcome to join, if your constraints are similar to mine. This is how it looks:

It’s pretty basic, but does a few things that were important to me.

Setup

Extremely easy to set up. And I mean it as a feature.

Just add our middleware to your Ring stack and get everything automatically: collecting and reporting.

(def app
  (-> routes
    ...
    (ring.middleware.params/wrap-params)
    (ring.middleware.cookies/wrap-cookies)
    ...
    (clj-simple-stats.core/wrap-stats))) ;; <-- just add this

It’s zero setup in the best sense: nothing to configure, nothing to monitor, minimal dependency. It starts to work immediately and doesn’t ask anything from you, ever.

See, you already have your web server, why not reuse all the setup you did for it anyway?
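If you haven’t written Ring middleware before, here is a minimal sketch of the general shape. It is not the clj-simple-stats implementation: the hits atom, the wrap-count-hits name, and the recorded fields are all assumptions made for illustration. A middleware is just a function that takes a handler and returns a new handler.

```clojure
;; Hypothetical sketch, NOT the clj-simple-stats implementation.
;; An in-memory log of requests; a real tool would persist these.
(def hits (atom []))

(defn wrap-count-hits
  "Ring-style middleware: records one entry per request, then returns
   the wrapped handler's response unchanged."
  [handler]
  (fn [request]
    (let [response (handler request)]
      (swap! hits conj {:uri    (:uri request)
                        :status (:status response)
                        :agent  (get-in request [:headers "user-agent"])})
      response)))

;; Usage with a stand-in handler.
(def app
  (wrap-count-hits (fn [_request] {:status 200 :body "hello"})))

(app {:uri "/blog" :headers {"user-agent" "curl/8.0"}})
```

Because the wrapper sees both the request and the response, it can record status codes and redirects without the rest of the app knowing about it.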

Request types

We distinguish between request types. In my case, I am only interested in live people, so I count them separately from RSS feed requests, favicon requests, redirects, wrong URLs, and bots. Bots are particularly active these days. Gotta get that AI training data from somewhere.
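A classifier for this could look roughly like the sketch below. The rules (which paths count as feeds, which User-Agent substrings mark a bot) are illustrative guesses, not the ones the library uses.

```clojure
(require '[clojure.string :as str])

(defn request-type
  "Hypothetical classifier; every rule here is an illustrative guess."
  [{:keys [uri status headers]}]
  (let [ua (str/lower-case (get headers "user-agent" ""))]
    (cond
      (= uri "/favicon.ico")              :favicon
      (and status (<= 300 status 399))    :redirect
      (= status 404)                      :wrong-url
      (re-find #"bot|crawler|spider" ua)  :bot
      (str/ends-with? uri ".xml")         :feed
      :else                               :browser)))

(request-type {:uri "/" :status 200
               :headers {"user-agent" "Googlebot/2.1"}})
;; => :bot
```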

RSS feeds are live people in a sense, so extra work was done to count them properly. Same reader requesting feed.xml 100 times in a day will only count as one request.

Hosted RSS readers often report user count in User-Agent, like this:

Feedly/1.0 (+http://www.feedly.com/fetcher.html; 457 subscribers; like FeedFetcher-Google)

Mozilla/5.0 (compatible; BazQux/2.4; +https://bazqux.com/fetcher; 6 subscribers)

Feedbin feed-id:1373711 - 142 subscribers

My personal respect and thank you to everybody on this list. I see you.
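Pulling the subscriber count out of User-Agent strings like the ones above takes one regular expression. This is a sketch under the assumption that the count always appears as "N subscribers"; the function name is made up.

```clojure
(defn subscribers
  "Returns the subscriber count advertised in a User-Agent string,
   or nil when there is none."
  [user-agent]
  (when-let [[_ n] (re-find #"(\d+) subscribers" user-agent)]
    (Long/parseLong n)))

(subscribers "Feedly/1.0 (+http://www.feedly.com/fetcher.html; 457 subscribers; like FeedFetcher-Google)")
;; => 457
(subscribers "Feedbin feed-id:1373711 - 142 subscribers")
;; => 142
(subscribers "curl/8.0")
;; => nil
```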

Graphs

Visualization is important, and so is choosing the correct graph type. This is wrong:

Continuous line suggests interpolation. It reads like between 1 visit at 5am and 11 visits at 6am there were points with 2, 3, 5, 9 visits in between. Maybe 5.5 visits even! That is not the case.

This is how a semantically correct version of that graph should look:

Some attention was also paid to having reasonable labels on axes. You won’t see something like 117, 234, 10875. We always choose round numbers appropriate to the scale: 100, 200, 500, 1K etc.
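One common way to get round labels is to restrict the step to 1, 2 or 5 times a power of ten. The sketch below shows that general technique; it is an assumption about how this could be done, not the code the library ships, and nice-step and ticks are made-up names.

```clojure
(defn nice-step
  "Smallest step of the form 1, 2 or 5 times a power of ten that covers
   max-value in at most max-ticks steps. Never smaller than 1, since
   visit counts are whole numbers."
  [max-value max-ticks]
  (let [raw (/ (double max-value) max-ticks)
        mag (Math/pow 10 (Math/floor (Math/log10 raw)))
        k   (first (filter #(>= (* % mag) raw) [1 2 5 10]))]
    (max 1 (long (* k mag)))))

(defn ticks
  "Axis labels from 0 up to the first round number at or above max-value."
  [max-value max-ticks]
  (let [step (nice-step max-value max-ticks)]
    (range 0 (+ max-value step) step)))

(nice-step 10875 5) ;; => 5000
(ticks 234 5)       ;; => (0 50 100 150 200 250)
```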

Goes without saying that all graphs have the same vertical scale and synchronized horizontal scroll.

Insights

We don’t offer much (as I don’t need much), but you can narrow reports down by page, query, referrer, user agent, and any date slice.

Not implemented (yet)

It would be nice to have some insights into “What was this spike caused by?”

Some basic breakdown by country would be nice. I do have IP addresses (for what they are worth), but I need a way to package GeoIP into some reasonable size (under 1 Mb, preferably; some loss of resolution is okay).

Finally, one thing I am really interested in is “Who wrote about me?” I do have referrers, only question is how to separate signal from noise.

Performance. DuckDB is a sport: it compresses data and runs column queries, so storing extra columns per row doesn’t affect query performance. Still, each dashboard hit is a query across the entire database, which at this moment (~3 years of data) sits around 600 MiB. I definitely need to look into building some pre-calculated aggregates.

One day.
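For what it’s worth, the idea behind pre-calculated aggregates is simple: roll raw rows up into per-day counts once, and let the dashboard read the small table. Here is an in-memory illustration in plain Clojure; the :date and :path keys are hypothetical, and a real version would most likely be a query maintained inside the database.

```clojure
(defn daily-counts
  "Rolls raw hits up into a map of [date path] -> number of hits."
  [raw-hits]
  (frequencies (map (juxt :date :path) raw-hits)))

(def raw-hits
  [{:date "2025-01-01" :path "/blog/a"}
   {:date "2025-01-01" :path "/blog/a"}
   {:date "2025-01-02" :path "/blog/a"}])

(daily-counts raw-hits)
;; => {["2025-01-01" "/blog/a"] 2, ["2025-01-02" "/blog/a"] 1}
```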

How to get

Head to github.com/tonsky/clj-simple-stats and follow the instructions:

Let me know what you think! Is it usable to you? What could be improved?

Permalink

Clojure on Fennel part four: Parsing (again)

Other than that, the parsing is complete, and we can look at the compiler part of the ClojureFnl project. But that’s gonna be in the next post.

Sike!

Wasn’t my intention to fake out like this again, but while working on the compiler, I had an important realization that led me to redesign the parser completely. And it’s a bit of a shame, because I already had a decent part of the compiler working, being able to run the REPL and do various cool things with it. But, better sooner than later, I guess.

I had a good amount of the post about the compiler already written too, but now I’ll have to discard all of that. So instead, I decided to talk about the new parser by explaining why the old one failed me. And to do so, I’ll give you a brief look at how the compiler currently works with parsed code.

Old parser and compiler

So, in the previous post, we’ve reached a point where the parser can read code into a tagged tree. I spent a fair amount of time discussing how I wanted this specifically, and now it feels like I’m backpedalling, but a tagged tree is not the way forward. Here’s the idea.

Currently, we’re reading this expression (+ 1 2) into:

[:code
 [:list
  [:symbol "+"]
  [:whitespace " "]
  [:number "1"]
  [:whitespace " "]
  [:number "2"]]]

The compiler then walks this tree, taking the first value of each node to decide what to do. First, it sees :code - that’s just an entry point, containing multiple top-level expressions. Next, it sees :list and dispatches to a function dedicated to compiling lists.

This function checks the first element of the list, sees that it is a :symbol, and checks whether this symbol is somehow special. While + doesn’t look that special, it is to the compiler, because Clojure’s + and Fennel’s + are not the same thing. In Clojure, + is a function; in Fennel, it is a special.

So the compiler replaces + with clojure_core_ns.add, and then proceeds over the rest of the forms in the list, recursively calling itself over each. In the end, we get this, written into a string builder:

(clojure_core_ns.add 1 2)
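To make the walk concrete, here is a toy version of that dispatch, written in Clojure rather than Fennel and with no relation to the actual ClojureFnl source. The renames table and the compile-node multimethod are hypothetical, and unlike the real compiler this sketch renames + wherever it appears, not only in head position.

```clojure
(require '[clojure.string :as str])

;; Hypothetical rename table: Clojure symbol -> runtime function name.
(def renames {"+" "clojure_core_ns.add"})

;; Dispatch on the tag, i.e. the first element of each node.
(defmulti compile-node first)

(defmethod compile-node :code [[_ & forms]]
  (str/join "\n" (map compile-node forms)))

(defmethod compile-node :list [[_ & children]]
  ;; Whitespace nodes carry no meaning for the output, so drop them.
  (let [forms (remove #(= :whitespace (first %)) children)]
    (str "(" (str/join " " (map compile-node forms)) ")")))

(defmethod compile-node :symbol [[_ s]] (get renames s s))
(defmethod compile-node :number [[_ s]] s)

(compile-node
 [:code
  [:list
   [:symbol "+"] [:whitespace " "]
   [:number "1"] [:whitespace " "]
   [:number "2"]]])
;; => "(clojure_core_ns.add 1 2)"
```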

This is a somewhat simplified explanation, because I’m omitting scope resolution, symbol shadowing, and other such things that the compiler currently tracks. For actual specials, like Clojure’s let*, the story is a bit different too. The compiler has a dedicated compile-special function for all Clojure specials provided by cljlib as Fennel macros. And that’s where things started to go haywire.

Look at this code:

(def ^:private foo 42)

It’s read like this:

[:list
 [:symbol "def"]
 [:whitespace " "]
 [:metadata
  [:metadata-entry [:keyword ":private"]]
  [:whitespace " "]
  [:symbol "foo"]]
 [:whitespace " "]
 [:number "42"]]

Here’s a problem - the parser reads metadata ^:private, and metadata in Clojure is attached to symbols, so the grammar I use reads the next value after the metadata and wraps it into a single node:

[:metadata
 [:metadata-entry [:keyword ":private"]]
 [:whitespace " "]
 [:symbol "foo"]]

The compiler, while compiling this list, sees def as the first symbol and enters compile-special.def. But def expects a symbol to bind the value to, while here we have a different node type, called :metadata. So my compiler had to account for all cases where metadata can appear - and it wasn’t easy to do.

I tried to side-step this by always expecting metadata, because it can appear almost anywhere, and discarding it, because Fennel’s concept of metadata is a bit different. In which I mostly succeeded. But this wasn’t the only problematic thing to support.
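The "expect metadata everywhere and discard it" approach can be sketched as a tree rewrite that replaces each :metadata node with the value it wraps. This is an illustration in Clojure, assuming (as in the trees shown above) that the wrapped value is always the last child of the node; strip-metadata is a made-up name.

```clojure
(defn strip-metadata
  "Replaces every [:metadata ... target] node with its target, recursively.
   Assumes the annotated value is the last child of the :metadata node."
  [node]
  (cond
    (not (vector? node))       node
    (= :metadata (first node)) (strip-metadata (peek node))
    :else                      (mapv strip-metadata node)))

(def def-form
  [:list
   [:symbol "def"]
   [:whitespace " "]
   [:metadata
    [:metadata-entry [:keyword ":private"]]
    [:whitespace " "]
    [:symbol "foo"]]
   [:whitespace " "]
   [:number "42"]])

(strip-metadata def-form)
;; => [:list [:symbol "def"] [:whitespace " "] [:symbol "foo"]
;;     [:whitespace " "] [:number "42"]]
```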

Here’s another example, now from Edamame:

(defn- read-token
  "Read in a single logical token from the reader"
  ^String [#?(:clj rdr :cljs ^not-native rdr :cljr rdr) _kind initch]
  (loop [sb #?(:clj (StringBuilder.)
               :cljs (StringBuffer.)
               :cljr (StringBuilder.))
         ch initch]
    (if (or (whitespace? ch)
            (macro-terminating? ch)
            (nil? ch))
      (do (when ch
            (r/unread rdr ch))
          (str sb))
      (recur #?(:clj (.append sb ch) :cljs (.append sb ch) :cljr (.Append sb (str ch))) (r/read-char rdr)))))

It reads into this tagged tree:

[:list
 [:symbol "defn-"]
 [:whitespace " "]
 [:symbol "read-token"]
 [:whitespace "\n  "]
 [:string "\"Read in a single logical token from the reader\""]
 [:whitespace "\n  "]
 [:metadata
  [:metadata-entry [:symbol "String"]]
  [:whitespace " "]
  [:vector
   [:conditional
    [:list
     [:keyword ":clj"]
     [:whitespace " "]
     [:symbol "rdr"]
     [:whitespace " "]
     [:keyword ":cljs"]
     [:whitespace " "]
     [:metadata
      [:metadata-entry [:symbol "not-native"]]
      [:whitespace " "]
      [:symbol "rdr"]]
     [:whitespace " "]
     [:keyword ":cljr"]
     [:whitespace " "]
     [:symbol "rdr"]]]
   [:whitespace " "]
   [:symbol "_kind"]
   [:whitespace " "]
   [:symbol "initch"]]]
 [:whitespace "\n  "]
 [:list
  [:symbol "loop"]
  [:whitespace " "]
  [:vector
   [:symbol "sb"]
   [:whitespace " "]
   [:conditional
    [:list
     [:keyword ":clj"]
     [:whitespace " "]
     [:list [:symbol "StringBuilder."]]
     [:whitespace "\n               "]
     [:keyword ":cljs"]
     [:whitespace " "]
     [:list [:symbol "StringBuffer."]]
     [:whitespace "\n               "]
     [:keyword ":cljr"]
     [:whitespace " "]
     [:list [:symbol "StringBuilder."]]]]
   [:whitespace "\n         "]
   [:symbol "ch"]
   [:whitespace " "]
   [:symbol "initch"]]
  [:whitespace "\n    "]
  [:list
   [:symbol "if"]
   [:whitespace " "]
   [:list
    [:symbol "or"]
    [:whitespace " "]
    [:list [:symbol "whitespace?"] [:whitespace " "] [:symbol "ch"]]
    [:whitespace "\n            "]
    [:list
     [:symbol "macro-terminating?"]
     [:whitespace " "]
     [:symbol "ch"]]
    [:whitespace "\n            "]
    [:list [:symbol "nil?"] [:whitespace " "] [:symbol "ch"]]]
   [:whitespace "\n      "]
   [:list
    [:symbol "do"]
    [:whitespace " "]
    [:list
     [:symbol "when"]
     [:whitespace " "]
     [:symbol "ch"]
     [:whitespace "\n            "]
     [:list
      [:symbol "r/unread"]
      [:whitespace " "]
      [:symbol "rdr"]
      [:whitespace " "]
      [:symbol "ch"]]]
    [:whitespace "\n          "]
    [:list [:symbol "str"] [:whitespace " "] [:symbol "sb"]]]
   [:whitespace "\n      "]
   [:list
    [:symbol "recur"]
    [:whitespace " "]
    [:conditional
     [:list
      [:keyword ":clj"]
      [:whitespace " "]
      [:list
       [:symbol ".append"]
       [:whitespace " "]
       [:symbol "sb"]
       [:whitespace " "]
       [:symbol "ch"]]
      [:whitespace " "]
      [:keyword ":cljs"]
      [:whitespace " "]
      [:list
       [:symbol ".append"]
       [:whitespace " "]
       [:symbol "sb"]
       [:whitespace " "]
       [:symbol "ch"]]
      [:whitespace " "]
      [:keyword ":cljr"]
      [:whitespace " "]
      [:list
       [:symbol ".Append"]
       [:whitespace " "]
       [:symbol "sb"]
       [:whitespace " "]
       [:list [:symbol "str"] [:whitespace " "] [:symbol "ch"]]]]]
    [:whitespace " "]
    [:list [:symbol "r/read-char"] [:whitespace " "] [:symbol "rdr"]]]]]]

Yes, it’s abysmal, but bear with me. I had to work with this, after all, thinking that this is a blessing.

The compiler sees defn- and enters the compile-special.defn- function. Defn expects a function name, then a vector for its arguments. Here, instead of the vector, we have a metadata node again. This is fine, since I just mentioned above that I managed to sidestep this problem.

Since this is an arglist, we need to compile it in a special way, adding its symbols to the function scope, etc. However, instead of arguments, we see the conditional node:

[:vector
 [:conditional
  [:list
   [:keyword ":clj"] [:whitespace " "] [:symbol "rdr"] [:whitespace " "]
   [:keyword ":cljs"] [:whitespace " "] [:metadata [:metadata-entry [:symbol "not-native"]] [:whitespace " "] [:symbol "rdr"]] [:whitespace " "]
   [:keyword ":cljr"] [:whitespace " "] [:symbol "rdr"]]]
 [:whitespace " "]
 [:symbol "_kind"]
 [:whitespace " "]
 [:symbol "initch"]]

I didn’t think it was allowed in Clojure to use conditional reading inside forms like this. But apparently it’s OK, and my compiler failed to deal with it.

So now, any node can not only be a metadata node, but also a conditional node. This complicates things, but at this point I’m still thinking that I can persevere.

So I did. I added support for almost all specials provided by cljlib. The main thing that was left to do was macros. And then it hit me:

I can’t do macros like that!

Why? Why, of course, because macros don’t emit tagged trees that my compiler understands. They emit code!

Here’s a simple macro:

(defmacro unless [test & body]
  `(when (not ~test)
     ~@body))

We can parse it:

[:list
 [:symbol "defmacro"] [:whitespace " "] [:symbol "unless"] [:whitespace " "]
 [:vector
  [:symbol "test"] [:whitespace " "] [:symbol "&"] [:whitespace " "] [:symbol "body"]]
 [:whitespace " "]
 [:backtick
  [:list
   [:symbol "when"] [:whitespace " "]
   [:list [:symbol "not"] [:whitespace " "] [:unquote [:symbol "test"]]]
   [:whitespace " "] [:unquote-splicing [:symbol "body"]]]]]

We can probably even compile it to something that we could then call during the compiler step. However, what will a call like (unless true (println 42)) return?

(when (not true) (println 42)), of course. It’s a Lisp macro, what else did you expect?

But what does the compiler expect?

[:list
 [:symbol "when"]
 [:whitespace " "]
 [:list [:symbol "not"] [:whitespace " "] [:symbol "true"]]
 [:whitespace " "]
 [:list [:symbol "println"] [:whitespace " "] [:number "42"]]]

Oh, it wants this.

I have to convert the macro’s output into a string, parse it, and feed it into the compiler if I want macros to work at all. And that’s BAD. If only I had realized this sooner!

This was the final nail in the coffin of the tagged tree approach for my project. I knew I had to rewrite the parser to emit actual data structures that I’ll be able to emit from macros as well, and compile them.
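
To make the mismatch concrete, here is a minimal sketch of the same macro in stock Clojure (not ClojureFnl). The expansion comes back as an ordinary sequence of symbols, with when and not namespace-qualified by the syntax quote. There are no [:list ...] or [:symbol ...] tags for a tagged-tree compiler to match on.

```clojure
;; Stock Clojure, for comparison only.
(defmacro unless [test & body]
  `(when (not ~test)
     ~@body))

(def expansion (macroexpand-1 '(unless true (println 42))))

expansion
;; => (clojure.core/when (clojure.core/not true) (println 42))

;; The result is plain data: a seq that starts with a symbol.
(seq? expansion)            ;; => true
(symbol? (first expansion)) ;; => true
```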

New parser reader

So I decided that I need a proper lisp reader that will produce data structures:

;; Welcome to Fennel Proto REPL 0.6.4-dev
;; Fennel version: 1.7.0-dev
;; Lua version: PUC Lua 5.5
;; Work directory: ~/Projects/fennel/ClojureFnl/
>> (local reader (require :impl.reader))
nil
>> (reader.read-string "(def ^:private foo 42)")
(def foo 42)
>> (local {: meta : second} (require :clojure.core))
nil
>> (meta (second (reader.read-string "(def ^:private foo 42)")))
{:private true}

As can be seen, the new reader module provides the read-string function that produces data structures. Notably, there are no longer any metadata nodes - metadata is assigned to the symbol foo in this case.

Same goes for the reader conditionals:

>> (reader.read-string "[0 #?(:clj 1 :cljfnl 2) #?@(:clj [2 3] :cljfnl [3 4]) 5]")
[0 2 3 4 5]

These are now correctly spliced at read time.
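
For comparison, a small sketch of the same two behaviours in stock Clojure. It assumes a regular Clojure REPL, so it uses the :clj and :cljs features rather than ClojureFnl's :cljfnl, and it has to opt in to reader conditionals with {:read-cond :allow} because read-string rejects them by default.

```clojure
;; Metadata is attached to the symbol, not kept as a separate node.
(def private-def (read-string "(def ^:private foo 42)"))

private-def                 ;; => (def foo 42)
(meta (second private-def)) ;; => {:private true}

;; Reader conditionals are resolved and spliced at read time.
(def spliced
  (read-string {:read-cond :allow}
               "[0 #?(:clj 1 :cljs 2) #?@(:clj [2 3] :cljs [3 4]) 5]"))

spliced ;; => [0 1 2 3 5]
```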

This also solved macro problems for the most part:

>> (reader.read-string "`(+ ~x ~@y ~z)")
(clojure.core/seq
 (clojure.core/concat
  (clojure.core/list (quote clojure.core/+))
  (clojure.core/list x)
  y
  (clojure.core/list z)))

And it matches what Clojure itself does:

Clojure 1.12.5
user=> (read-string "`(+ ~x ~@y ~z)")
(clojure.core/seq
 (clojure.core/concat
  (clojure.core/list (quote clojure.core/+))
  (clojure.core/list x)
  y
  (clojure.core/list z)))

Due to this change, the compiler doesn’t have to know about syntax quote (`) at all, and thus macros will correctly return these kinds of lists.
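
A short sketch in stock Clojure of why this is enough: the expansion is ordinary code, so evaluating it with x, y and z bound (the values here are made up for illustration) yields the list a macro would return, and that list can itself be evaluated.

```clojure
;; The reader has already turned the syntax quote into seq/concat/list calls.
(def expansion (read-string "`(+ ~x ~@y ~z)"))

;; Evaluate the expansion with x, y and z bound, the way a macro body would.
(def result (eval (list 'let '[x 1 y [2 3] z 4] expansion)))

result        ;; => (clojure.core/+ 1 2 3 4)
(eval result) ;; => 10
```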

So, now we can read Clojure code into data structures. Previously, the parser returned [:code A B ...] kind of result, where A, B, and the rest are all top-level forms. The new reader also does this, but returns them as a list:

>> (reader.read-string-all "1 \"str\" :keyword ::namespaced #:map{:key :val}")
(1 "str" :keyword :user/namespaced {:map/key :val})

However, there’s a problem.

Consider the ::namespaced keyword. Both my reader and Clojure read it as :user/namespaced, but this is because the default namespace in the REPL is user. For my parser, it’s mostly an arbitrary choice because it doesn’t know anything about the runtime.

Clojure 1.12.5
user=> (read-string "::x")
:user/x
user=> (ns foo)
nil
foo=> (read-string "::x")
:foo/x

And when compiling a file, the file might have an ns declaration, or even multiple of them:

(ns foo)

(println ::x)

(ns bar)

(println ::x)

If we were to run this code, we would see :foo/x then :bar/x.

My parser currently reads all input, meaning it transforms all top-level forms from text to data. However, this also means that it doesn’t understand anything about namespaces.

Previously, this was fine because the compiler handled this - the tagged tree didn’t resolve anything. Currently, however, we need to construct namespaced keys, and we need to know the namespace. But it’s not possible unless we parse the input expression by expression, instead of all at once.

Hence, I had to update the grammar so it could parse expression by expression. This way, I could read source code form by form, and compile each form separately. And if the compiler encounters an ns declaration, it could update some state, so the reader would know how to read namespaced keywords, and other things that may include namespaces.
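
The same idea can be sketched in stock Clojure, where the state in question is the dynamic var *ns*. The function name read-forms-tracking-ns is made up for this illustration and is not part of ClojureFnl. It reads one form at a time and switches the current namespace whenever it sees an ns form, without evaluating anything else.

```clojure
(import '[java.io PushbackReader StringReader])

(defn read-forms-tracking-ns
  "Reads every top-level form in source, one at a time. When a form is an
  ns declaration, rebinds *ns* so that later ::keywords resolve against it."
  [source]
  (let [rdr (PushbackReader. (StringReader. source))
        eof (Object.)]
    (binding [*ns* *ns*]
      (loop [forms []]
        (let [form (read {:eof eof} rdr)]
          (if (identical? form eof)
            forms
            (do
              ;; Only the namespace name is needed, so create-ns is enough.
              (when (and (seq? form) (= 'ns (first form)))
                (set! *ns* (create-ns (second form))))
              (recur (conj forms form)))))))))

(read-forms-tracking-ns "(ns foo) (println ::x) (ns bar) (println ::x)")
;; => [(ns foo) (println :foo/x) (ns bar) (println :bar/x)]
```

Reading the whole string in one go would resolve both keywords against whatever namespace happened to be current before the first form.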

But we still need to pass this state to the reader. And my reader is based on a PEG grammar.

Luckily for me, the lpeg library I use has a Carg function that can pass additional arguments into the PEG parser. This way I can write my transformation function T:

(fn T [tag patt]
  (/ (* (Ct (* (Cc tag) patt)) (Carg 1))
     (fn [node state]
       (case node
         ;; ----8<----
         [:macro-keyword bo data]
         (read-macro-keyword data bo state)
         [:conditional data]
         (read-conditional data state)
         [:conditional-splicing data]
         (conditional-splicing data state)
         ;; ----8<----
         _ _))))

This state can be passed anew after reading each expression, or mutated in place - either way, the reader now knows how to access state. The reader itself is still stateless.

The reader now supports all Clojure syntax and produces data structures:

;; Clojure                                          ;; Fennel
1                                                   ;; 1
0.5                                                 ;; 0.5
1e10                                                ;; 10000000000.0
1/2                                                 ;; 1/2
1N                                                  ;; 1N
1.33333333333333333333333333333333M                 ;; 1.33333333333333333333333333333333M
0xFF                                                ;; 255
017                                                 ;; 15
16rFF                                               ;; 255
\c                                                  ;; "c"
"string"                                            ;; "string"
\u0041                                              ;; "A"
\o101                                               ;; "A"
:keyword                                            ;; :keyword
:namespaced/keyword                                 ;; :namespaced/keyword
::auto-keyword                                      ;; :user/auto-keyword
symbol                                              ;; symbol
namespaced/symbol                                   ;; namespaced/symbol
#'var                                               ;; (var var)
'quoted-symbol                                      ;; (quote quoted-symbol)
(list)                                              ;; (list)
[vector]                                            ;; [vector]
{hash map}                                          ;; {hash map}
#:namespaced{:map 42}                               ;; {:namespaced/map 42}
#::{:auto :namespaced}                              ;; {:user/auto :namespaced}
#{hash set}                                         ;; #{set hash}
#"regexp"                                           ;; "regexp"
@dereferencing                                      ;; (clojure.core/deref dereferencing)
#(+ %1 %2)                                          ;; (fn* [p__0__ p__1__] (+ p__0__ p__1__))
^:meta data                                         ;; data
#_discard                                           ;;
[#?(:cljfnl "reader") #?@(:cljfnl [:conditionals])] ;; ["reader" :conditionals]
nil                                                 ;; nil
true                                                ;; true
false                                               ;; false
##NaN                                               ;; .nan
##Inf                                               ;; .inf
#inst "2022"                                        ;; #inst "2022-01-01T00:00:00.000-00:00"
#uuid "c6e8050a-789b-4305-b518-8f0f7c31da7a"        ;; #uuid "c6e8050a-789b-4305-b518-8f0f7c31da7a"

Note about data structures: even though we’re working in Fennel, the produced maps, vectors and lists are custom data structures that are implemented in the cljlib library. These are implementations of persistent data structures I’ve talked about in part 2.

Further work

With that, I can work on the compiler, now for real. It’s a shame that I basically have to do all the work on the compiler again from scratch, because the underlying data has changed so much, but this simplifies a lot of things, so I’m OK with that.

One thing that the reader doesn’t support yet is read-time evaluation with #=expr. This requires a working compiler, and I’ll have to first implement it, then integrate it back into the reader.

Another thing I was thinking about was to drop the LPEG parser altogether. Maybe, when I have the compiler working and have support for all Clojure runtime semantics in place, I’ll make a fork of the Edamame parser and replace the current reader with it, adding support for ClojureFnl via reader conditionals. This will remove a C library dependency, which is a good thing for distribution. I still have another C library for arbitrary-precision numbers, but it can be worked around.

But, lesson learned - when implementing a lisp, even if it is just a transpiler, don’t try to cut corners, and make a proper reader.

Next post, for sure, will be about the compiler!


Inside chachaml: Core Capabilities for AI-Native Workflows in Clojure Part 2

This is Part 2 of our series, Clojure Meets Production MLOps: How chachaml Delivers AI-Native Workflows.

If you missed Part 1, Why Machine Learning Projects Fail in Production—and How chachaml Solves the MLOps Gap, we covered a common issue in machine learning. Models are easier to build than they are to operate. We also looked at why MLOps has become the real bottleneck and how chachaml helps address it.

In this article, we focus on the platform itself. You’ll see how chachaml supports AI-native workflows and the capabilities that help teams move from experiments to production.

Next, in Part 3, From Prototype to Production: The Business Case for chachaml, we’ll look at the practical side of adoption and how these capabilities can help teams deliver AI systems more efficiently.

The Architecture Behind chachaml

➜ Where chachaml Fits in Modern ML Infrastructure

Machine learning involves more than model training. Data moves through several stages before reaching production.

chachaml handles the core of machine learning operations, sitting between the data pipelines and deployment. It covers:

  • Workflow orchestration.
  • Experiment management.
  • Artifact tracking.
  • Team collaboration.

This makes it easier to track what was trained, how it was trained, and what got to production. 

➜ Integrating with Existing Clojure Infrastructure

chachaml works with existing Clojure systems, including:

  • Datomic applications.
  • Event-driven architectures.
  • JVM services.
  • Existing data pipelines.

➜ Acting as the Coordination Layer

chachaml brings everything together in one spot.

Teams can add MLOps without changing infrastructure. 


Core Capability #1: Pipeline Orchestration

➜ Why ML Pipelines Become Unmanageable

As machine learning projects grow, teams start to notice steps being overlooked. Common problems include:

  • Manual steps for execution.
  • Inconsistent runs.
  • Reproducibility issues.

Without a structured process, it’s hard to know what was run, when it was run, and whether the same results can be reproduced later.

➜ How chachaml Pipelines Solve This

chachaml provides a structured way to run machine learning workflows. It supports,

  • Workflow execution.
  • Step dependencies.
  • Tracked pipeline runs.

This helps teams move from ad hoc scripts to repeatable processes. The result is, 

  • Greater reliability.
  • Better repeatability.
  • Easier auditing and troubleshooting.

Example: Pipeline definition

(pipe/run-pipeline! "train-and-deploy"
  [{:name "preprocess" :fn (fn [_ctx] preprocessed-data)}
   {:name "train"      :fn (fn [ctx] (train (:prev-result ctx)))}
   {:name "evaluate"   :fn (fn [ctx] (eval-model (:prev-result ctx)))}
   {:name "register"   :fn (fn [ctx] (reg/register-model ...))}])
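
To show the shape of the idea, here is a minimal sketch in plain Clojure of how steps could be chained through :prev-result. It is not chachaml's implementation: run-steps and the three toy steps are hypothetical, and a real runner would also persist each run.

```clojure
(defn run-steps
  "Runs steps in order. Each step fn receives a ctx map whose :prev-result
  is the previous step's return value (nil for the first step).
  Returns a vector of {:name ... :result ...}, one entry per step."
  [steps]
  (loop [remaining steps, prev nil, log []]
    (if-let [{step-name :name step-fn :fn} (first remaining)]
      (let [result (step-fn {:prev-result prev})]
        (recur (rest remaining) result (conj log {:name step-name :result result})))
      log)))

(def run-log
  (run-steps
   [{:name "preprocess" :fn (fn [_ctx] [1 2 3])}
    {:name "train"      :fn (fn [ctx] (reduce + (:prev-result ctx)))}
    {:name "evaluate"   :fn (fn [ctx] (> (:prev-result ctx) 5))}]))

(mapv :result run-log) ;; => [[1 2 3] 6 true]
```

Keeping a record per step is what makes a run auditable afterwards: the log shows what ran, in which order, and what each step produced.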



➜ New Pipelines UI in v0.7.0

Version 0.7.0 adds a dedicated Pipelines page.

Now everyone can see what’s happening in the pipeline, track status in real time, and get a sense of what’s running, with no need to dig through endless logs or scripts. That means:

  • More pipeline visibility.
  • Workflow observability.
  • Better team coordination.

Teams get one clear view of the ML workflow—no misunderstandings.


Core Capability #2: Experiment Tracking That Fits the REPL

➜ Why Experiment Tracking Matters

Training a model is easy. Understanding why it performed better than another model is harder.

Teams need a way to compare runs, track changes, and reproduce results. Without that, experiments quickly become difficult to manage. Experiment tracking helps with

  • Model comparisons.
  • Team collaboration.
  • Reproducibility.

➜ Native Tracking Workflows

chachaml brings experiment tracking directly into the Clojure workflow. Teams can track,

  • Runs.
  • Parameters.
  • Metrics.
  • Notes.

Everything is stored together, making it easier to understand how a model was trained and why it performed the way it did.

deftracked and Automatic Run Capture

One of chachaml’s most useful features is deftracked.

It automatically captures function executions as tracked runs. This provides,

  • Function-level tracking.
  • Minimal setup.
  • A workflow that feels natural to Clojure developers.

Example: deftracked usage

(deftracked train-model [config data]
  (ml/log-params config)
  (let [model (fit data config)]
    (ml/log-metric :accuracy (evaluate model))
    model))

(train-model {:lr 0.01} training-data)
;; => auto-creates a run, logs params + metric, no extra wiring

Developers can add tracking without changing how they normally write code.

➜ Batch Metric Logging for Large Training Runs

When teams run big training jobs, they end up with tons of metrics. 

chachaml handles this by letting them batch metric logs, so they don’t waste time or resources. It keeps things moving more quickly and smoothly. Among the advantages are,

  • Better performance.
  • Improved scalability.
  • Lower logging overhead.

This keeps experiment tracking practical even for larger workloads.
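
The general technique can be sketched in a few lines of plain Clojure. This is an illustration of batching, not chachaml's API: log-in-batches! and the write-batch! callback are hypothetical names, and the callback stands in for whatever performs a single write to storage.

```clojure
(defn log-in-batches!
  "Groups metrics into chunks of batch-size and calls write-batch! once per
  chunk instead of once per metric. Returns the number of writes performed."
  [write-batch! batch-size metrics]
  (let [batches (partition-all batch-size metrics)]
    (doseq [batch batches]
      (write-batch! (vec batch)))
    (count batches)))

;; Collect the writes in an atom so the effect is visible.
(def writes (atom []))

(def write-count
  (log-in-batches! #(swap! writes conj %)
                   2
                   [{:step 1 :loss 0.9} {:step 2 :loss 0.7} {:step 3 :loss 0.6}
                    {:step 4 :loss 0.5} {:step 5 :loss 0.45}]))

write-count ;; => 3 writes instead of 5
```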


Core Capability #3: Model Lifecycle Management

➜ The Problem with Ad-Hoc Model Storage

Training a model is only part of the job. Teams also need to know which version is deployed, how it was trained, and where it came from. Without a structured system, problems appear, such as:

  • Version confusion.
  • Deployment mistakes.
  • Missing history and provenance.

As more models move through development and production, managing them becomes harder.

➜ Registry-Based Model Management

chachaml includes a model registry that helps teams manage models throughout their lifecycle. It supports,

  • Model versioning.
  • Promotion workflows.
  • Production stages.

Teams now have a clear route from experiment to deployment. It helps ensure that the correct model is put into production.

➜ Comparing Model Versions Over Time

The registry also makes it easier to understand how models change over time. Teams can,

  • Browse version history.
  • Compare model versions.
  • Review historical performance.

This helps teams make better deployment decisions.

Example: Model registry workflow

(reg/register-model "iris-classifier" {:artifact "model" :stage :staging})
(reg/promote! "iris-classifier" 1 :production)
(reg/load-model "iris-classifier")     ;; loads latest production version
(reg/diff-versions "iris-classifier" 1 2)
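
The concepts behind those calls (numbered versions, stages, promotion) can be modelled with a plain map. The sketch below is a toy model for illustration only; register, promote and latest-in-stage are hypothetical functions and say nothing about how chachaml stores its registry.

```clojure
(defn register
  "Adds a new version of model-name in the :staging stage."
  [registry model-name artifact]
  (let [versions (get registry model-name [])
        entry    {:version (inc (count versions)) :artifact artifact :stage :staging}]
    (assoc registry model-name (conj versions entry))))

(defn promote
  "Moves one version of model-name to the given stage."
  [registry model-name version stage]
  (update registry model-name
          (fn [versions]
            (mapv #(if (= version (:version %)) (assoc % :stage stage) %)
                  versions))))

(defn latest-in-stage
  "Returns the highest version of model-name in stage, or nil."
  [registry model-name stage]
  (->> (get registry model-name)
       (filter #(= stage (:stage %)))
       last))

(def registry
  (-> {}
      (register "iris-classifier" "model-a")
      (register "iris-classifier" "model-b")
      (promote "iris-classifier" 1 :production)))

(:version (latest-in-stage registry "iris-classifier" :production)) ;; => 1
(:version (latest-in-stage registry "iris-classifier" :staging))    ;; => 2
```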

Core Capability #4: AI-Native MLOps Through MCP

➜ Why Traditional MLOps Interfaces Are Changing

Developers aren’t working the same way anymore. 

With AI coding assistants and agent-based tools now part of daily workflows, users prefer not to search through dashboards or dig into logs. They’d rather just ask a question in plain English and get an answer on the spot. 

As AI agents become more capable, operational tools need a way to work with them directly.

chachaml’s MCP Server Changes the Game

One of the most distinctive features of chachaml is its built-in MCP (Model Context Protocol) server.

It exposes 16 tools through JSON-RPC, following the standard MCP specification, allowing AI agents to interact directly with machine learning data and workflows. Agents can:

  • Query runs.
  • Query experiments.
  • Inspect models.
  • Access the model registry.

The MCP server works with tools like:

  • Claude Code
  • Continue
  • Other MCP-compatible clients

Example: Connecting chachaml MCP to Claude Code

// claude_desktop_config.json
{
  "mcpServers": {
    "chachaml": {
      "command": "clojure",
      "args": ["-M:mcp"],
      "cwd": "/path/to/your/project"
    }
  }
}

// Available tools: list_runs, best_run, compare_runs,
//   list_experiments, get_model, list_models, and 10 more

So instead of clicking around and searching manually, developers talk to the system and get what they need right away. 

➜ Example Workflow

Let’s say a developer asks, “Which experiment had the highest accuracy last month?” 

The AI agent pushes that request to chachaml through MCP. 

chachaml digs through runs, experiments, and model metadata, finds the answer, and provides it. The agent returns the answer.

No manual searching. No custom queries. No switching between tools.

➜ Why This Matters for Future AI Engineering

Most MLOps platforms were designed around dashboards and user interfaces.

chachaml adds another option: direct access through AI agents.

Machine learning runs, experiments, models, and registry data become available via natural-language requests.

As AI-assisted development becomes more common, this approach becomes increasingly useful.

chachaml isn’t just designed for machine learning workflows. It’s designed for workflows that AI agents can understand and use.

Core Capability #5: Chat With Your ML Data

➜ From Dashboards to Conversations

No more searching through multiple dashboards and run histories just to find a simple answer.

There’s a /chat interface where teams can ask for details about experiments, models, or runs—just by typing questions in simple language. 

No more switching between multiple screens. Need some info? Just ask.

➜ Practical Examples

Questions might look like:

  • Which run performed best?
  • Show experiments with accuracy above 0.95.
  • Compare model versions.
  • Which model is currently in production?

The system finds the relevant data and returns the answer directly.

➜ Backed by Real Tool Calls, Not Hallucinations

Example: Querying ML data via chat

(chat/ask
  "Which run in experiment 'iris' has the best accuracy?"
  {:provider :anthropic
   :api-key "sk-ant-..."
   :model "claude-sonnet-4-20250514"})

;; => {:answer "Run abc123 has the highest accuracy at 0.94..."
;;     :iterations 2}

The chat interface isn’t guessing.

Behind the scenes, chachaml uses MCP tools to query experiments, runs, models, and registry data.

Every response comes straight from the database—no guesses, no made-up info, just real data. 

That matters a lot when developers need precise details to make decisions. They get a workflow that feels conversational, yet it always stays connected with what’s actually happening in their projects. 


Core Capability #6: Production Visibility Through the Web UI

➜ Why ML Teams Need More Than Logs

Logs can tell teams what happened. They don’t always show what’s happening right now.

As machine learning projects get bigger, teams really need a clear view of what’s going on—whether that’s experiments, models, pipelines, or deployments. 

Without this kind of visibility, tracking bugs becomes a major issue, and important details get scattered across tools and forgotten.

Common challenges include,

  • Fragmented observability.
  • Difficult debugging.
  • Limited visibility across workflows.

A central interface makes it easier to understand the system’s state.

➜ Inside the Eight-Page UI

chachaml includes a web UI built for day-to-day machine learning operations.

Teams can access:

  • Runs dashboard
  • Experiment views
  • Search
  • Model registry
  • Compare page
  • Chat page
  • Pipeline page
  • Artifact views

This gives developers, data scientists, and operators a shared view of machine learning activity.

➜ Visual Analytics for Production Teams

The UI includes tools for analyzing and comparing results over time. Features include

  • Vega-Lite charts.
  • Metric curves.
  • Comparison overlays.
  • CSV exports.

With one dedicated place to track progress, teams can compare their runs, spot performance trends, and figure out what went wrong—without digging through endless log files or searching through giant spreadsheets. 

Everything lives in one interface, which makes it easier to monitor models and see how they change over time.


Core Capability #7: Alerts, Collaboration, and Operational Monitoring

➜ Detecting Problems Before Users Do

No machine learning model stays perfect. Data changes, people do new things, and suddenly what worked yesterday starts to fall apart.

The sooner teams catch these problems, the easier they are to fix. 

Suppose a team sets up an alert when model accuracy dips below a set threshold—this lets them jump in and solve the problem before users even notice anything is wrong. 

Slack Alert Integration

With chachaml, teams can integrate automated Slack alerts via webhooks. It’s easy to set up rules like “Ping me if accuracy drops below 0.90.” 

Example: Slack alert configuration

(alerts/set-alert! "accuracy-drop"
  {:experiment  "production"
   :metric-key  :accuracy
   :op          :<
   :threshold   0.9
   :webhook-url "https://hooks.slack.com/services/..."})

(alerts/check-alerts!)
;; => Slack message fired when accuracy drops below 0.90
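
The rule itself boils down to comparing the latest metric value against the threshold with the configured operator. A minimal sketch of that check in plain Clojure follows; comparators and alert-fires? are hypothetical names for illustration and are not part of chachaml.

```clojure
;; Map the :op keyword from the rule to a comparison function.
(def comparators {:< < :<= <= :> > :>= >=})

(defn alert-fires?
  "True when value compared against the rule's :threshold with its :op holds.
  Unknown operators never fire."
  [{:keys [op threshold]} value]
  (let [compare-fn (get comparators op)]
    (boolean (and compare-fn (compare-fn value threshold)))))

(def rule {:metric-key :accuracy :op :< :threshold 0.9})

(alert-fires? rule 0.87) ;; => true
(alert-fires? rule 0.93) ;; => false
```

With :op :< the rule is strict, so an accuracy of exactly 0.9 does not fire.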

That way, teams can:

  • Jump on problems faster.
  • Keep track of model health.
  • Improve their oversight.

Teams don’t have to wait until something goes wrong after deployment—now they will know right away when there’s a problem. 

User Attribution and Team Accountability

As projects grow, it’s really helpful to see who did what: who ran an experiment, trained a model, or pushed out a deployment.

chachaml comes with tools like:

  • CHACHA_USER
  • My Runs views
  • Ownership tracking

Example: HTTP Write API (curl / Python / Go)

# Log a run from any language via HTTP
RUN_ID=$(curl -s -X POST http://localhost:8080/api/w/runs \
  -H 'Content-Type: application/json' \
  -d '{"experiment":"iris","name":"from-python"}' | jq -r .id)

curl -X POST http://localhost:8080/api/w/runs/$RUN_ID/metrics \
  -H 'Content-Type: application/json' \
  -d '{"accuracy":0.94}'

Teams get a full record across runs, experiments, and models, so everyone can tell who’s responsible for what and when it happened. 


Core Capability #8: Open Integration Across Languages and Teams

➜ Not Everyone Needs to Be Writing Clojure

Most engineering teams use more than one language. Even if the company leans heavily on Clojure, it will probably still need Python or Go for machine learning work. 

That’s just how things go. chachaml gets it; it’s flexible enough to fit into whatever environment teams are already using. 

HTTP Write APIs

Chachaml comes with HTTP write APIs so outside apps can push their data straight into the platform. 

Teams can log runs, metrics, parameters, and artifacts from:

  • Python.
  • Go.
  • curl.
  • Any tool that can make HTTP requests.

It’s super easy for teams to plug in without disrupting their current workflows. 

Chachaml serves as a central hub for tracking and managing all machine learning tasks in one place, rather than replacing preferred tools. 
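
The same holds for other JVM services. As a rough sketch, the run-creation request from the curl example elsewhere in this article can be built with the JDK's own java.net.http client (Java 11 or later), with no extra dependencies. The endpoint and JSON fields are taken from that curl example; json-post-request is a hypothetical helper, and the run name "from-jvm" is made up.

```clojure
(import '[java.net URI]
        '[java.net.http HttpRequest HttpRequest$BodyPublishers])

(defn json-post-request
  "Builds (but does not send) a POST request carrying a JSON body."
  [url json-body]
  (-> (HttpRequest/newBuilder)
      (.uri (URI/create url))
      (.header "Content-Type" "application/json")
      (.POST (HttpRequest$BodyPublishers/ofString json-body))
      (.build)))

(def create-run
  (json-post-request "http://localhost:8080/api/w/runs"
                     "{\"experiment\":\"iris\",\"name\":\"from-jvm\"}"))

(.method create-run)    ;; => "POST"
(str (.uri create-run)) ;; => "http://localhost:8080/api/w/runs"
```

Sending it is a matter of passing the request to a java.net.http.HttpClient together with a body handler such as HttpResponse$BodyHandlers/ofString, against a running chachaml server.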

sklearn Interoperability

Many machine learning teams already use Python libraries such as scikit-learn.

Through libpython-clj2, chachaml can work with those libraries while keeping tracking and management inside the Clojure ecosystem.

It includes helpers such as:

  • tracked-fit!
  • tracked-predict
  • evaluate!

Example: sklearn interoperability via libpython-clj2

(sk/train-and-evaluate!
  (lm/LogisticRegression :max_iter 200)
  X-train y-train X-test y-test
  :register-as "iris-classifier"
  :stage       :staging)

;; => trains in Python sklearn, logs metrics + registers model in chachaml

Teams can train and evaluate models using familiar Python tools while still recording experiments and results in chachaml.

Shared Artifact Storage with S3

Machine learning projects generate many artifacts, including models, datasets, reports, and evaluation results.

chachaml supports S3-compatible storage, including

  • MinIO.
  • DigitalOcean Spaces.
  • AWS S3.

Teams can keep all their artifacts in one place and share them, no matter what environment they’re working in. 

It’s just simpler that way—everyone stays aligned, and moving work from development to staging to production doesn’t mean teams have to rethink how they handle their artifacts. 


Storage, Scaling, and Production Readiness

➜ From SQLite to PostgreSQL

Most teams get started with a simple setup. With chachaml, teams can use SQLite for small projects or local development. Thanks to SQLite’s WAL mode, teams achieve better concurrency without complicating deployment. 

When the team grows and the demands increase, switching to PostgreSQL is easy. Teams don’t have to change how they work: the API and workflows stay exactly the same. And when teams need to scale up, chachaml works with HikariCP for connection pooling.

So it handles bigger workloads and more users without difficulty. 

➜ Dockerized Deployment

Getting started doesn’t take much effort, either. chachaml offers Docker-based deployment options, so starting the app, database, and web interface simultaneously is pretty simple. 

If a team wants to move from a local setup to something everyone can access, there’s no need to rebuild its infrastructure—everything just works. 

The post Inside chachaml: Core Capabilities for AI-Native Workflows in Clojure Part 2 appeared first on Flexiana.


The Rest of the Story: June Edition - JVM Weekly vol. 181

Before we get into the technical weeds: is it this hot where you are too? I’m writing this from London, where the thermometer is closing in on 34 degrees, the sky is perfectly cloudless, and the islanders are reacting as if the apocalypse had begun (air conditioning here remains a largely theoretical concept - in the Kensington Event Center it was as hot as an all-inclusive hotel in Egypt).


But enough about the weather, because June in the JVM world ran just as hot.

This time the month split neatly into two halves. On one side, the JDK did that slow, unglamorous work that keeps Java in the game: a native Argon2 moved closer to the standard library, Babylon started lowering Java straight onto Tensor Cores, and the Vector API went through an honest, public reckoning with its own limitations. On the other side, the tooling layer reorganized itself around agents: JetBrains and Microsoft, in the same week, bet on almost perfectly opposite strategies, Spring AI reached 2.0 GA, and the enterprise MCP story I keep saying someone should finally write up keeps coming together. Let’s get started!

1. June: The Rest of the Story

I’ll start with a topic that, if you want to understand where AI-assisted programming is heading, says more than any chat-window demo: in that same stretch of June, JetBrains and Microsoft bet on almost perfectly opposite strategies for the agent harness, the layer that actually runs long, multi-step agent sessions.

JetBrains open-sourced Mellum2, a 12-billion-parameter coding model that activates only 2.5 billion parameters per token thanks to a Mixture-of-Experts architecture, routing each token through a subset of 64 experts. JetBrains’ Nikita Pavlichenko and Anton Semenkin call it a “focal model”: fast, specialized, and deliberately not racing the frontier models on breadth. The benchmarks back up that niche: on a single H100 it matches Qwen2.5-7B almost exactly in single-request mode (192 vs 193 tokens/s), but under concurrent load (and only that kind counts in production) it beats Qwen2.5-7B by 21% and Qwen3-8B by 79%, while its “thinking” variant hits 78.4% on EvalPlus.

A small model only needs one H100…

The technical report is honest about the cost: once you shift evaluation toward broad reasoning (GPQA Diamond, MMLU-Redux), the larger models take the lead again. Mellum2, though, runs on infrastructure you control. Claude Code and Codex run locally but route inference through the Anthropic and OpenAI APIs; Cursor’s work on Composer stays tied to Cursor’s platform (now with an xAI layer outside your walls). Mellum2 is a bet that ownership and on-prem will matter as AI moves deeper into the SDLC.

BTW: last Thursday I was at AI Tinkerers in Warsaw, where Damian Bogunowicz was talking about exactly this, Mellum2.

And then, as if for contrast, Microsoft went for full consolidation. Ji Dong, Senior PM on GitHub Copilot for JetBrains, announced that Copilot CLI is becoming the default agent harness in GitHub Copilot for JetBrains, with the IDE’s own local harness slated for deprecation. The reasoning is pure platform economics: maintaining a separate harness for JetBrains meant features and models landed there later than on Copilot’s other surfaces, so folding everything into Copilot CLI buys faster parity and (they claim) higher-quality results. CLI sessions run independently in the background while the IDE starts, monitors, and steers them, which incidentally makes parallel agent runs the default, and existing local sessions convert automatically.

Put them side by side and you have the whole tension of this moment. JetBrains: own your model, own your harness, run it yourself. Microsoft: one harness to rule them all, ship faster by not maintaining variants. Both approaches are reasonable and, interestingly, both can be the future at the same time.


Remember how last month I joked that someone should write a roundup of the coalescing agentic JVM ecosystem? It keeps coalescing. And meanwhile Markus Eisele added an argument complementary to March’s MCP feature in Open Liberty: instead of rewriting a Jakarta EE 10/11 monolith into microservices, turn it into an MCP server, using the Adapter pattern to draw a hard trust boundary so agents get capability-scoped tools rather than raw access to the database and filesystem. He follows that with the Java Agent Skills Kit proposal, which lands squarely on the SKILL.md / SkillsJars thread we’ve been tracking: stop encoding agent behavior as one big, brittle prompt string, and start treating capabilities as separate, versionable, unit-testable components.

Bringing SOLID to agentic workflows is a sentence that’s either the future or a consultant’s slide. This month I’m leaning toward the former.

(It’s getting crowded - I really do need to finally write that roundup article, don’t I? 😉)


Sonatype is tightening the screws on OSS publishing to Maven Central. For now softly (the UI shows how far over the limit you are), and from August “for real.”

Brian Fox (CTO of Sonatype, and at one time chair of Apache Maven) laid it out on the company blog. The schedule is two-stage: from June 16 a “soft” phase with notifications only is underway, and from August 11, 2026 hard enforcement kicks in, meaning publishing is paused until usage comes down, an exception is approved, or a paid option is purchased. Three monthly metrics per organization are counted (file count, size, and number of releases), and as three-month averages at that, so a one-off spike or an urgent CVE fix won’t get you booted on its own.

Where this comes from: over the last 90 days, 10% of namespaces accounted for more than 88% of published files and more than 90% of the space taken by new releases, so the target is commercial-scale patterns (huge artifacts, very frequent releases, Central as the last-mile for SDKs), not ordinary OSS. For those over the threshold there are three routes: cut unnecessary publishing (Fox writes plainly that this is not the place for every CI build), request an exception for an unusual OSS project, or move to paid Maven Central Publisher Pro.

If you maintain anything with a regular release cycle, these two months until August are a good moment to look into the Usage Center and check which side of the threshold you’re sitting on.

I can confirm firsthand that the problem isn’t purely theoretical, because one of the affected namespaces happens to be my employer’s. Both org.virtuslab and com.softwaremill are already over the threshold, so we are watching developments closely.

Nice shout-out opportunity - VirtusLab is a team behind Scala 😉


On the security front, which always gets lost in the JEP noise, native Argon2 is becoming a Preview feature. JEP 8377081 moved from Draft to Submitted, bringing RFC 9106-compliant password hashing into the JDK itself.

All three variants (Argon2d, Argon2i, and the recommended Argon2id) come through the SecretKeyFactory SPI, and Argon2ParameterSpec exposes memory cost, time cost, and parallelism, so you can tune resistance to brute-force attacks from GPUs and ASICs. The practical benefit is one fewer reason to drag Bouncy Castle into your dependency tree just to hash a password like it’s 2015, and the JDK’s cryptographic foundation finally moves on from compute-bound PBKDF2 to something memory-hard.

Following March’s quietly impressive HPKE and ML-DSA bundle in JDK 26, the platform’s posture around post-quantum and modern hashing improves release by release.


The most interesting thing in Java’s AI/ML story that is not a wrapper over someone else’s API: Project Babylon is putting Java onto GPU Tensor Cores. Using Code Reflection and the Heuristic Accelerator Toolkit, Babylon can now lower Java functional interfaces to NVVM IR and PTX and perform half-precision matrix multiplication directly on the HMMA.m16n8k16.f16 instruction, with no JNI and no hand-written CUDA.

The HAT API introduces tensor types (hat.T16x16x16) that use CodeModel analysis to optimize tiling and data flow. Still very experimental, but the direction, namely a unified model in which OpenJDK lowers high-level Java straight to hardware-specific IR, is one I’d keep an eye on.


March gave us the “Java is fast, your code might not be” discussion; June gave us the JDK team admitting that the modern API doesn’t always win. Two threads on panama-dev are worth your time. The first is a scalability wall: VectorMask.toLong() and fromLong() rest on 64-bit primitives, capping masks at 64 lanes, which falls apart against ARM SVE’s ambitions reaching 2048 bits, unless the signatures change.
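
The arithmetic behind that wall is simple enough to sketch. The snippet below deliberately avoids the incubating Vector API and only counts lanes; the class and method names are made up for illustration.

```java
final class MaskLanes {

    /** Number of lanes in a vector of the given width and element size, both in bits. */
    static int laneCount(int vectorBits, int elementBits) {
        return vectorBits / elementBits;
    }

    /** A long carries one bit per lane, so it can describe at most Long.SIZE (64) lanes. */
    static boolean maskFitsInLong(int vectorBits, int elementBits) {
        return laneCount(vectorBits, elementBits) <= Long.SIZE;
    }

    public static void main(String[] args) {
        // 512-bit vector of bytes: exactly at the limit.
        System.out.println(laneCount(512, 8) + " " + maskFitsInLong(512, 8));     // 64 true
        // 2048-bit vector of 32-bit ints: still fits.
        System.out.println(laneCount(2048, 32) + " " + maskFitsInLong(2048, 32)); // 64 true
        // 2048-bit vector of bytes: four times too many lanes for one long.
        System.out.println(laneCount(2048, 8) + " " + maskFitsInLong(2048, 8));   // 256 false
    }
}
```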

The second, more humbling, is a set of JMH audits showing that the Vector API still doesn’t confidently beat C2 auto-vectorization, including float-to-float conversions where the scalar loop that C2 generated was leaner than the explicit Vector API version (which drowns in guards and uncommon-trap paths). There’s even the counterintuitive finding that holding vector constants in static final fields costs more than building them locally inside the loop.


Speaking of Leyden, which gave us the GC-independent AOT object cache in JDK 26 (JEP 516), the project now wants AOT caching for custom class loaders. The proposal extends the AOT/CDS benefits beyond the system and platform loaders to “safe and reasonable” non-subclassed URLClassLoader instances, identifying them during training runs, recreating them in the assembly phase, and storing linked Class mirrors in the cache so that getClassLoader() resolves correctly in production.

On top of that, anything that deviates from standard URLClassLoader behavior is excluded for cache stability, which is just the usual CDS caution, but it’s a real step toward fast startup for the many frameworks that lean heavily on non-standard loading.


On the governance front: Thomas Stuefe nominated Robert Toyonaga (Red Hat and IBM) for OpenJDK Committer status on the basis of 17 non-trivial patches in HotSpot Runtime, centered on Native Memory Tracking and JDK Flight Recorder, which is exactly the diagnostics you appreciate at 2 a.m. during a production memory leak.


Two more things from the corners of language design and polyglottery. Project Amber debated erasure-based union types (Integer | Float reducing to the nearest common ancestor), a “fast path” with no JVM changes to exhaustive matching in switch without formal sealed hierarchies. The community is split on whether erasure-only convenience will later create friction with Valhalla’s long-term goals for specialized generics, which is a fair worry.

And in the “de-JVM-ing” genre of functional languages that gave us swc4j and WebAssembly4J last month, ClojureWasm is a from-scratch runtime in Zig and Clojure that skips both the JVM and GraalVM Native Image, targeting sub-1 MB Wasm binaries with an allocator tuned for persistent data structures.


Finally, a Quarkus cluster and a nice callback. Quarkus is moving to Vert.x 5 in version 4, reworking its reactive core and leaning on SmallRye Common I/O and a NIO.2 Zip filesystem provider to attack archive-scanning bottlenecks (the public API stays stable; internal SPIs and handler APIs will change).

On top of that, the new Quarkus Pi4J extension moves Pi4J context initialization and hardware providers to the build phase, giving CDI-injected GPIO/I2C/SPI/PWM and sub-second native startup on a Raspberry Pi, which is a credible “Java instead of Python or C++” argument for IoT (and a tidy continuation of Pi4J joining Commonhaus in March).

And here’s the promised callback: when Floci was a GitHub All-Star last month, I noted that the community wanted Quarkus DevServices integration. Markus Eisele delivered exactly that, wiring Floci in as Dev Services for S3, SSM, and SQS right into the quarkus dev loop, without LocalStack’s login-token tax. The ecosystem listens, as you can see.

Oh, and one more for the “human-readable text as a weapon” file: Ken Kousen described a delightfully malicious escalation of using property-based testing with jqwik on both sides of the prompt-injection war, namely engineers fuzzing LLMs for injection vulnerabilities with @Provide and Arbitraries, and, on the dark side, developers poisoning their own code with data-nuking payloads for when an autonomous “vibe coder” swallows them.

The lesson is unchanging and unspectacular: sandbox anything that lets an LLM touch your filesystem.


A small note to close on: 👩🏻‍💻 Marit van Dijk (Java Champion and Developer Advocate at JetBrains) has kicked off a new series on the IntelliJ IDEA YouTube channel, where she talks with people from the JVM community.

The first episode is a conversation with Moritz Halbritter from the Spring team about JSpecify, the standardized nullness annotations that, after being adopted in Spring Framework 7 and Spring Boot 4 (plus support in IntelliJ itself), are shifting from a curiosity into an everyday tool for null-safety in Java.

Episode 1 on YouTube

2. Release Radar

Kotlin 2.4.0

Kotlin 2.4.0 is the release where the headline feature of the last few versions stops being preview. It promotes context parameters to Stable, and the opt-in -Xcontext-parameters flag disappears, so if you have @OptIn(ExperimentalContextParameters::class) scattered around, you can clean it up. They were still experimental when I covered 2.3.20 in March, so this is the moment they move into the “put it in your public API” zone. The use case is the one that’s hit everyone: dragging a logger, a transaction handle, or a tenant ID through five layers of functions just so the deepest one can read it. The compiler injects the context as the first argument at compile time (zero runtime overhead versus a plain call), it works with suspend functions, and unlike ThreadLocal it survives on iOS and JS targets in KMP, because there’s no thread-local there to break:

// Context parameters - Stable in 2.4.0; the -Xcontext-parameters flag is gone

context(tx: DatabaseTransaction)
suspend fun saveOrder(order: Order) {
    tx.persist(order)   // tx is resolved from the surrounding context, not passed in
}

The second genuinely pleasant quality-of-life change is explicit backing fields going Stable, collapsing that “private mutable plus public read-only” dance every Compose and StateFlow codebase is full of into a single declaration:

// Before: standard Compose/StateFlow boilerplate

private val _uiState = MutableStateFlow(UiState())
val uiState: StateFlow<UiState> = _uiState

// After (2.4.0): one declaration, explicit backing field

val uiState: StateFlow<UiState>
    field = MutableStateFlow(UiState())

Beyond the language there’s plenty here: kotlin.uuid.Uuid is now Stable in the common standard library (with new isSorted() / isSortedBy() checks), Kotlin/JVM can emit Java 26 bytecode with annotations in metadata enabled by default, and Kotlin/Native gains Swift Package Manager dependencies via a swiftPMDependencies block, closing out the CocoaPods-replacement story I flagged in March, plus Swift export (now in Alpha) mapping suspend functions to Swift async and Flow to AsyncSequence.

Kotlin/Wasm incremental compilation is Stable and on by default, and experimental WebAssembly Component Model support lands for cross-language Wasm interop. One administrative morsel worth knowing: each kotlin-stdlib release line on the JVM now gets an 18-month security support window with backported fixes. Compatible with Gradle 9.5.0.

Release Notes

Kotlin Toolchain 0.11 (formerly Amper)

Kotlin Toolchain 0.11 (covered by Joffrey Bion) is the release where Amper stops being Amper. If you missed the KotlinConf keynote, the headline goes like this: Amper grew into Kotlin Toolchain and jumped straight to Alpha, which in JetBrains-speak means “we’re committing to maintaining this, so test away.”

TLDR:

At the heart of the change is a single kotlin command, conceived as one entry point into all of Kotlin: with it you create a project, build, run, test, package, and publish, and in the future also format your code and generate documentation. No build-tool decisions up front, no plugin wiring before you write your first line. The CLI can now be installed globally (sdk install kotlintoolchain) and called from outside the project directory, with the wrapper still pulling the version matched to a given repo. The long-awaited publishing of JVM libraries to Maven, including Maven Central, also enters preview, where the Toolchain takes on the whole tedious ritual (sources and javadoc JARs, PGP signatures, POM metadata, checksums) and reduces a release to kotlin publish mavenCentral, or with publishingMode: auto to a fully automated pipeline.

My favorite little detail, very on-theme for this edition: the new generated section in plugin.yaml, which registers generated sources, resources, classes, and cinterop definitions in one readable place, designed so these artifacts are recognizable at a glance not only to humans but also to AI agents and other tools. This is exactly the same thinking as JobRunr’s context7.json (more on that shortly), just built straight into the plugin model. The rest, for extension authors, is new checks and commands declarations (with the kotlin check and kotlin do commands), public entry points to tasks that are private by default, and a few fresh references like ${project.rootDir}.

One migration gotcha to close on: you can’t upgrade automatically from old Amper - the wrappers have to be swapped manually via kotlin update --create.

Release Notes

Spring AI 2.0

The third heavyweight release: Spring AI 2.0 reached GA, and being Spring, it came with a small cloud of accompanying articles. The core is the exit from experimental to stable: ChatModel, EmbeddingModel, and VectorStore are now a provider-neutral abstraction layer, so switching between OpenAI, Anthropic, and a local Ollama stops being a refactor, while the polished VectorStore abstractions and automatic metadata filtering aim squarely at cutting RAG boilerplate.

On that base, two newer things stand out: SelfCorrectingStructuredOutputConverter, which closes the feedback loop around malformed JSON (when the output fails schema validation, it packs the original instruction, the bad output, and the exception into a new prompt and retries up to a configurable retryCount)...

…and a composable tool-calling API, which replaces imperative tool binding with declarative ToolCallback beans, automating schema generation and threading state through ToolContext.
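
The self-correcting loop described above can be sketched in plain Java. This is not Spring AI's implementation or API: the class, the convert method, and the wording of the corrective prompt are all hypothetical, and the "model" is a stub function so the example runs on its own.

```java
import java.util.function.Function;

final class SelfCorrectingLoop {

    /**
     * Asks the model, tries to parse the answer, and on failure re-prompts with
     * the original instruction, the bad output, and the error message.
     */
    static <T> T convert(String instruction,
                         Function<String, String> model,
                         Function<String, T> parser,
                         int retryCount) {
        String prompt = instruction;
        RuntimeException lastFailure = null;
        // One initial attempt plus up to retryCount corrective attempts.
        for (int attempt = 0; attempt <= retryCount; attempt++) {
            String output = model.apply(prompt);
            try {
                return parser.apply(output);
            } catch (RuntimeException e) {
                lastFailure = e;
                prompt = instruction
                        + "\nYour previous output was:\n" + output
                        + "\nIt failed with: " + e.getMessage()
                        + "\nReturn only corrected output.";
            }
        }
        throw new IllegalStateException(
                "no valid output after " + retryCount + " retries", lastFailure);
    }

    public static void main(String[] args) {
        // Stub model: answers badly first, then correctly once it sees the feedback.
        Function<String, String> fakeModel =
                prompt -> prompt.contains("It failed with") ? "42" : "forty-two";

        Integer value = convert("Return the answer as an integer.",
                fakeModel, Integer::parseInt, 2);
        System.out.println(value); // 42
    }
}
```

The point of the pattern is that the failure itself becomes part of the next prompt, so the retry is informed rather than blind.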

Craig Walls also documents the edges, namely ElevenLabs’ SpeechModel for streamed TTS and the Embabel abstraction layered over Spring AI and LangChain4j. Spring AI didn’t arrive alone, by the way: in the same window Spring Boot 4.1.0 shipped (more in the Radar below).


Spring Boot 4.1.0 & Spring Data 2026.0.0

Spring Boot 4.1.0 shipped alongside Spring AI 2.0 GA and is a fairly routine bump. For me the strongest point is first-class Spring gRPC support, now with reference documentation and a @GrpcAdvice annotation for centralized gRPC exception handling, which finally gives gRPC the same autoconfigured “convention over configuration” treatment REST has enjoyed for a decade.

On the security side, both reactive and blocking HTTP clients can be equipped with an InetAddressFilter that blocks outbound requests to specific addresses, a built-in hardening primitive against SSRF that you’d otherwise write by hand or pull from a library. The observability story keeps deepening (updated OpenTelemetry support, including OTel SDK environment variables, plus KafkaListenerObservationConvention beans applied automatically to the container factory), there’s file rotation support for Log4j, context now propagates automatically to @Async methods on separate threads, and a new @AutoConfigureWebServer slice annotation helps tests that need a real embedded server.

On the cleanup side: everything deprecated in 4.0 is removed, the layertools jar mode is gone, the withdrawn Apache Derby integration is deprecated (migrate to H2 or HSQL), and LiveReload in Devtools is deprecated with no replacement. It also rolls up all the fixes from 4.0.7. Treat it and Spring Data 2026.0.0 as one coordinated upgrade.

Release Notes


Gradle 9.6.0

Gradle 9.6.0 is a minor update. It improves Configuration Cache hit rate by precisely tracking project properties supplied via the org.gradle.project.<name> system properties and ORG_GRADLE_PROJECT_<name> environment variables: previously, changing any such value invalidated the cache, even if that property was never read during configuration.
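
A toy model makes the difference concrete: if the cache key is derived only from the properties that were actually read, changing an unread property leaves the key untouched. This is a sketch of the idea, not Gradle's implementation; TrackedProperties and its methods are invented names.

```java
import java.util.Map;
import java.util.TreeMap;

final class TrackedProperties {
    private final Map<String, String> values;
    // Sorted so the key does not depend on the order in which properties were read.
    private final Map<String, String> reads = new TreeMap<>();

    TrackedProperties(Map<String, String> values) {
        this.values = values;
    }

    /** Returns the property and remembers that it was read, along with its value. */
    String get(String name) {
        String value = values.get(name);
        reads.put(name, String.valueOf(value));
        return value;
    }

    /** Cache key built only from what was read during "configuration". */
    String cacheKey() {
        return reads.toString();
    }

    /** Helper: simulate a configuration run that reads the given property names. */
    static String keyAfterReading(Map<String, String> values, String... names) {
        TrackedProperties properties = new TrackedProperties(values);
        for (String name : names) {
            properties.get(name);
        }
        return properties.cacheKey();
    }

    public static void main(String[] args) {
        Map<String, String> first = Map.of("version", "1.0", "buildNumber", "41");
        Map<String, String> second = Map.of("version", "1.0", "buildNumber", "42");

        // Only "version" is read, so the changed buildNumber does not matter.
        System.out.println(keyAfterReading(first, "version")
                .equals(keyAfterReading(second, "version")));     // true
        // Once buildNumber is read, the keys differ and the cache is missed.
        System.out.println(keyAfterReading(first, "buildNumber")
                .equals(keyAfterReading(second, "buildNumber"))); // false
    }
}
```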

There’s also a neat fix with an infrastructural flavor that fits perfectly with Gradle’s own AWS migration story (worth reading the writeup of the AWS-powered build engine behind 200 million plugin downloads a month): Gradle’s log files generated a large volume of I/O, and on network block storage like AWS EBS that triggered IOPS throttling and real slowdowns. 9.6.0 reworks that implementation, delivering significant gains on low-IOPS storage.

Looking toward Gradle 10, both implicit and explicit (findProperty(), property(), hasProperty()) lookups resolved from parent projects now emit deprecation warnings, with an opt-in NO_IMPLICIT_LOOKUP_IN_PARENT_PROJECTS preview to adopt the future behavior early, and lazy property types relax the requirement for an exact type match in the Groovy DSL.

Release Notes


Helidon 4.5.0

Helidon 4.5.0 is a release on the 4.x line following the solid LTS 4.4 (the one that brought Java Verified Portfolio support, LangChain4j agentic patterns, MCP 1.1, and the plan to rename the next line to Helidon 27 in step with JDK 27). 4.5.0 itself delivers hardened value encryption for shared-secret configuration in Helidon Config, closes out the project’s effort around API stability with dedicated stability annotations, and carries the usual fixes and dependency updates. Java 21 remains the floor, Java 25 recommended.

Release Notes


JobRunr 8.7.0

JobRunr 8.7.0 (announced by Nicholas D’hondt) makes the library noticeably easier to embed.

And since ClawRunr is built on it, “easier to embed” now reads as “a better foundation for the agentic JVM stack.”

The headline is lazy server initialization: JobRunr ships full starters for Spring, Quarkus, and Micronaut, but everyone else had to fight a greedy Fluent API that started the Background Job Server and Dashboard the moment you called useBackgroundJobServer().

Now both start on initialize(), the new getBackgroundJobServer() / getDashboardWebServer() accessors hand you the servers to manage their lifecycle, and the old configuration-ordering constraints disappear (builder calls can go in any order). On top of that: the dashboard adopts the system color scheme by default and links directly to each release’s release notes, and Jackson3JsonMapper now deserializes common collection types (List.of(...), Set.of(...), HashMap, and friends) out of the box.

My favorite little detail, very on-theme for this edition: the repo added a context7.json so that AI coding assistants pull the exact, up-to-date JobRunr configuration.

JobRunr Pro adds batch continuations created from within a batch, plus a round of dashboard security hardening (the license key validated server-side, authorization enforced on Server-Sent Events streams, stricter API key checks for the Multi-Cluster Dashboard).

Release Notes


A2A Java SDK 1.0.0

The A2A Java SDK reached GA at 1.0.0, and it matters, because Agent2Agent (an open standard under the Linux Foundation) is a protocol that lets agents from any language, framework, or vendor discover each other’s capabilities, delegate tasks, and collaborate over JSON-RPC, gRPC, or HTTP+JSON. Getting to 1.0 required a real spec-conformant transition: AgentCard now advertises a supportedInterfaces list instead of separate url/transport fields, kind type discriminators are gone, the entire spec module was modernized onto Java records, and JSpecify null-safety annotations were adopted throughout - the Maven coordinates also moved to org.a2aproject.sdk on Maven Central.

The 1.0.0 release itself adds a new integration test kit, a Quarkus-based agent for testing interoperability between SDKs, and exposes HTTP response headers through the A2AHttpResponse interface and A2AClientHTTPError.

Following it comes A2A SDK for Jakarta Servers 1.0.0-RC1 (WildFly integration, under org.wildfly.a2a). Both ADK for Java and the Spring AI A2A integration already use this SDK, so it’s effectively the interop layer for the agentic ecosystem this whole edition is circling.

Documentation | GitHub

3. GitHub All-Stars

June was a strong month for the “zero-config Java” itch and the “look inside the JAR” genre. Four picks.

jhostty - “Probably the Most Portable Terminal in the World,” in One File

Here’s JBang itself reminding everyone why it still sets the bar. jhostty, from Max Rydahl Andersen, is a full-featured terminal emulator in a single Java file (~940 lines) on Java 25, embedding the Ghostty engine as a hardware-accelerated JavaFX control via GhosttyFX (0.1.169) by vlaaad. No build system, no IDE, no project setup, just JBang and one file: jbang jhostty@maxandersen (JBang pulls Java 25 automatically).

The feature list is absurdly complete for a script the author says was written “in a few hours a Sunday morning”: freely nested horizontal and vertical splits, ten themes switchable on the fly (the whole UI adapts, context menus and split dividers included), independent per-pane zoom (keyboard, scroll wheel, trackpad pinch, with the percentage in the title bar), every system font (terminal favorites like JetBrains Mono and Fira Code listed first), drag-and-drop of files and URLs, clickable link detection, shell integration (⌘F/Ctrl+F), and a native macOS menu bar.

A perfect “signal” project for how far GhosttyFX and JBang have come: native-quality, low-latency desktop UI tucked into one file you launch with a single command.

Nuts - “The Package Manager Java Never Had”

After nine years of development, Nuts (Network Unified Tool System) reached a round 1.0.0.0, and its author describes it without false modesty as “the package manager Java never had.” The idea: manage dependencies at runtime, not build time, by reading Maven descriptors directly and solving the perennial fat-JAR problem at the source, since only the JARs and dependencies (and, for native binaries, only the assets) actually needed land on the target machine.

Nuts requires no custom descriptors or build tool and doesn’t change classloading behavior: it just resolves the dependency tree, builds the classpath, and runs the app. What makes it distinctive is a shared workspace across all applications, the ability to keep multiple versions of the same app side by side, and automatic provisioning of the required JDK, so nuts install myapp pulls the latest version along with its dependencies and runtime while optimizing network and disk. The author offers the best analogy himself: it’s npm/nvm or uv, but for the Java ecosystem.

The ecosystem is still firmly in container-and-fat-JAR territory, so approach it as a thought-provoking alternative rather than a Monday-morning migration, but nine years of development and a 1.0 milestone show real commitment.

Jet - A Turnkey Java HTTP Client and Server, Without the Kotlin Runtime

Jet, by Jacob Peterson, is, in its own words, “a simple, lightweight, modern, turnkey Java web client and server library” (Java 25, MIT-licensed, already on Maven Central at 3.3.0). The contrarian “Java-first” pitch: Javalin-style fluent DX without the mandatory kotlin-stdlib dependency. The architecture is modular: Common provides native HTTP header models and data structures (something the author flags competitors lack), URL building, and I/O utilities; Server is JetServer.Builder, a Handle/Handler/Router/Route system, session support, and, nicely, built-in Let’s Encrypt certificates; on top of that sit a dedicated OpenAPI annotations module and a Gradle plugin that generates the spec (with a README section arguing why a plugin rather than an annotation processor). A Client module is on the roadmap. A detail very on-theme for this edition: the README has its own “Personal Note About AI” section where the author addresses AI’s role in building the library.

Early days (~5 stars), but the niche for Java purists who find Spring Boot overkill and Javalin’s Kotlin roots a dealbreaker is real.

Marshal - Behavioral Supply-Chain Security for JVM Dependencies

Marshal does behavioral supply-chain security for JVM dependencies, with a very concrete model: it watches how packages change on Maven Central and scores every update on a 0-100 risk scale. A maintainer swap, a dropped GPG signature between versions, a sudden jump in dependency count, these all surface the day a version is published, long before a CVE exists.

It plugs in as a GitHub Action on pull requests touching pom.xml or build.gradle(.kts): it detects the build tool, scans the dependency changes, and comments on the PR with a finding (e.g. “javax.activation:activation, ORANGE 55/100, GPG signature dropped between versions”), and with threshold: red plus a required check it can block the merge. Crucially, if dependencies can’t be resolved (because the build fails), the check fails rather than passing silently, on the principle that a scanner that can’t analyze your project shouldn’t report it as clean.

It targets teams that auto-merge dependency updates directly. Still early (v0.1.0, a handful of stars), but the direction, catching the intentional backdoors that CVE-scanning alone misses, is the right one.


PS: And yes, I’m finally accepting that the JVM’s agentic ecosystem has coalesced enough that I have to stop joking about writing that overview and just write it. Consider this your warning. 😉

PS2: JVM Weekly landed at the top of Hacker News last Friday. Wow! Thank you all ❤️

PS3: From the “JVM Weekly in strange places” series - Stansted Express.



Drift Checks for a Self-Hosting Compiler

The bug you cannot see coming

Here is a scenario every contributor eventually hits. You edit a source file, forget one rebuild step, and commit. Everything looks fine: the code compiles, the tests pass, the review is green. Then, a week later, something breaks somewhere that has nothing to do with your change, and you lose an afternoon before you trace it back to that one forgotten step.

That is the worst kind of bug, because there is no error and no stack trace to follow, just wrong behavior a long way from its cause. In MAGIC it has one main source: the repo commits generated files.

Why a compiler commits its own output

A compiler turns source code into something else, and MAGIC turns Clojure source into compiled .dll files (a .dll is the .NET unit of compiled code). Most projects rebuild those files on every build, so they always match the source. MAGIC cannot do that for all of them, because the compiler is partly built from its own previous output. It compiles itself, which means some of its compiled files have to be committed and then reused to build the next version.

A committed file is really a snapshot of its source at one moment in time. Edit the source without rebuilding, and that snapshot quietly becomes a lie. That mismatch has a name: drift.

flowchart LR
    S["You edit a source file"] --> Q{"Rebuild its<br/>committed output?"}
    Q -->|yes| OK["In sync ✅"]
    Q -->|no| BAD["Drift ❌<br/>stale file, no error,<br/>a bug shows up later"]

You cannot catch drift by comparing files

The obvious fix is to rebuild the file and compare it against the committed one: if they differ, someone forgot a step. Unfortunately that does not work, because MAGIC produces a slightly different file every time, even from identical source. It picks random internal names while compiling, and .NET stamps every assembly with a fresh ID, so the bytes never match even when nothing meaningful has changed. Comparing binaries would fail on every commit and tell us nothing useful.

So instead of comparing the binaries, we compare the source they came from.

The trick: fingerprint the source

For each committed binary, we record a fingerprint of its source. A fingerprint is just a SHA hash, a short string that changes whenever the file changes. We keep all of them together in a manifest file that is committed to the repo, with one line per source that maps it to the hash it was last built from. The entry for clojure.core, for example, looks like this:

clojure.core {:source "magic-compiler/src/stdlib/clojure/core.clj"
              :sha256 "33435bc12bcc893ebe91319b5cefa21aa6cfb31254b5269c4cc687feedebb1d2"}

When the binary is rebuilt, we recompute that hash and write it back here, so the manifest moves in lockstep with the source. When the source is edited but the binary is not, the hash in the manifest still describes the old version, and that mismatch is the signal we are looking for. Because the manifest is itself a committed file, the mismatch surfaces as an ordinary change in git, which, as we will see next, is exactly what the check inspects.
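
The comparison at the heart of this can be sketched in a few lines of Java. The real check rewrites the manifest and lets git report the difference; this sketch compares in memory instead, with a plain map standing in for the manifest. The class and method names, and the sample source strings, are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.Map;

final class SourceFingerprint {

    /** SHA-256 of the given bytes as a lowercase hex string. */
    static String sha256Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return HexFormat.of().formatHex(digest.digest(content));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** True when the recorded hash no longer matches the current source. */
    static boolean hasDrifted(Map<String, String> manifest, String name, byte[] currentSource) {
        return !sha256Hex(currentSource).equals(manifest.get(name));
    }

    public static void main(String[] args) {
        byte[] builtFrom = "(ns clojure.core)".getBytes(StandardCharsets.UTF_8);
        byte[] edited = "(ns clojure.core) ; edited".getBytes(StandardCharsets.UTF_8);

        // The manifest records the hash of the source the binary was last built from.
        Map<String, String> manifest = Map.of("clojure.core", sha256Hex(builtFrom));

        System.out.println(hasDrifted(manifest, "clojure.core", builtFrom)); // false
        System.out.println(hasDrifted(manifest, "clojure.core", edited));    // true
    }
}
```

Nothing here looks at the binary at all, which is exactly why the random names and fresh assembly IDs stop mattering.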

A few outputs are not compiled binaries at all: some generated C#, a version number, a copied Unity package. Those come out identical every time, so for them we skip the fingerprint and simply regenerate and compare directly.

That leaves two rules, and together they are the whole idea:

  • If the output is deterministic (the same bytes every time), regenerate it and compare.
  • If the output is a compiled binary, compare the source fingerprints, never the bytes.

The call sites it generates ahead of time

One of the things the check regenerates is C#, and it is worth a closer look. When MAGIC compiles a method call, it does not hard-wire the target. It emits a call site: a small object that finds the right method the first time, caches it by the argument types it saw, and reuses it on every later call, so the lookup cost is paid only once. A call site has to be written for a fixed number of arguments, and Clojure calls functions with anywhere from none to many. ClojureCLR builds these while the program runs. MAGIC cannot, because Unity's IL2CPP compiles everything to C++ ahead of time, and there is no runtime left in which to generate code.
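
For readers who have not met the pattern, here is a minimal sketch of a caching call site for exactly one argument, written in Java rather than the C# that MAGIC emits. It is an illustration of the idea only: the class, the resolver function, and the demo methods are hypothetical and far simpler than a real call site.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class CachingCallSite {
    // The slow path: given the argument type, find the method to call.
    private final Function<Class<?>, Function<Object, Object>> resolver;
    // The fast path: targets already resolved, keyed by argument type.
    private final Map<Class<?>, Function<Object, Object>> cache = new HashMap<>();
    private int lookups = 0;

    CachingCallSite(Function<Class<?>, Function<Object, Object>> resolver) {
        this.resolver = resolver;
    }

    /** Fixed arity of one: resolves on first sight of a type, then reuses the target. */
    Object invoke(Object arg) {
        Class<?> type = arg.getClass();
        Function<Object, Object> target = cache.get(type);
        if (target == null) {
            lookups++;
            target = resolver.apply(type);
            cache.put(type, target);
        }
        return target.apply(arg);
    }

    int lookups() {
        return lookups;
    }

    /** Demo site: length for strings, negation for integers. */
    static CachingCallSite lengthOrNegate() {
        return new CachingCallSite(type -> {
            if (type == String.class) return arg -> ((String) arg).length();
            if (type == Integer.class) return arg -> -((Integer) arg);
            throw new IllegalArgumentException("no method for " + type);
        });
    }

    public static void main(String[] args) {
        CachingCallSite site = lengthOrNegate();
        System.out.println(site.invoke("abc"));   // 3
        System.out.println(site.invoke("hello")); // 5
        System.out.println(site.lookups());       // 1
        System.out.println(site.invoke(7));       // -7
        System.out.println(site.lookups());       // 2
    }
}
```

Because invoke takes exactly one argument, a different class is needed for every arity, which is what makes generating them attractive.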

So MAGIC generates the call sites up front, one set for each call arity up to twenty, as ordinary C# that IL2CPP can compile like anything else. Rather than hand-write twenty near-identical copies of five classes, a few Mustache templates stamp them out, and the result is committed as .g.cs files. Edit a template and forget to regenerate, and that committed C# goes stale. This is the easy case from earlier: the output is plain text and identical every run, so the check just regenerates it and compares it byte for byte.
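
The stamping and the byte-for-byte check can be sketched together. Plain string replacement stands in for Mustache here, and the one-line template is invented for illustration; MAGIC's real templates and generated classes are not shown in this post.

```java
import java.util.StringJoiner;

final class CallSiteTemplates {
    // Hypothetical template; the {{...}} placeholders are filled by simple replacement.
    private static final String TEMPLATE =
            "class CallSite{{arity}} { object Invoke({{params}}); }";

    /** Renders the template for one arity. */
    static String render(int arity) {
        StringJoiner params = new StringJoiner(", ");
        for (int i = 0; i < arity; i++) {
            params.add("object arg" + i);
        }
        return TEMPLATE
                .replace("{{arity}}", Integer.toString(arity))
                .replace("{{params}}", params.toString());
    }

    /** Renders every arity from 0 to maxArity, one declaration per line. */
    static String renderAll(int maxArity) {
        StringBuilder out = new StringBuilder();
        for (int arity = 0; arity <= maxArity; arity++) {
            out.append(render(arity)).append('\n');
        }
        return out.toString();
    }

    /** Deterministic output, so staleness is a plain equality check. */
    static boolean isStale(String committed, int maxArity) {
        return !renderAll(maxArity).equals(committed);
    }

    public static void main(String[] args) {
        System.out.println(render(2));
        // class CallSite2 { object Invoke(object arg0, object arg1); }

        String committed = renderAll(20);
        System.out.println(isStale(committed, 20));               // false
        System.out.println(isStale(committed + "// edit\n", 20)); // true
    }
}
```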

One version for the whole monorepo

Another regenerated value is a version number, and the reason it needs guarding is a good one. MAGIC is a monorepo: six projects that used to live in six separate repositories, each with its own version, now share one version. It is written once, in version.edn, and an MSBuild rule feeds it into every C# project automatically, so the whole stack always builds as a matched set. You can depend on MAGIC 0.8.0 and get one coherent version of every piece, instead of juggling six numbers that drift apart on their own.

There is exactly one file that rule cannot reach: the Unity package's package.json, which is plain JSON, outside MSBuild. A small task copies the version into it, and the drift check makes sure nobody bumps version.edn and forgets to. It is a tiny check guarding a deliberate decision: one version, for everything.

What CI runs

All of this runs from a single command, bb check-drift, on every change in GitHub CI. It regenerates or re-fingerprints everything, then asks git one simple question: did any tracked file change? If the answer is yes, something was not refreshed, and the build fails. It prints the diff of exactly which files changed, followed by the fix command for each kind of drift, so you can match the two and run the right one.

flowchart TD
    Start["bb check-drift"] --> S1["Regenerate the C# from templates"]
    S1 --> S2["Re-fingerprint the stdlib sources"]
    S2 --> S3["Re-fingerprint the compiler + clojure.core sources"]
    S3 --> S4["Sync the Unity version number"]
    S4 --> S5["Regenerate the Unity mirror package"]
    S5 --> Q{"Did any tracked<br/>file change?"}
    Q -->|no| PASS["Pass ✅"]
    Q -->|yes| FAIL["Fail ❌<br/>prints the diff of what drifted,<br/>plus the fix command for each case"]

The check covers every generated file in the repo, including the compiler itself and clojure.core, the heart of the language. Those are the files where a stale copy would do the most damage, so they are guarded exactly like everything else.

The fix and its binary travel together

Day to day, the check is the backstop, not the main event. The habit that keeps the repo honest is simpler than the machinery behind it: whenever you change a compiler or stdlib source, you rebuild its committed binary and commit the two as a pair. The source change is one commit, and the refreshed .dll rides along in a companion commit tagged chore(bootstrap) that names the fix it belongs to. Each fix carries its refresh right behind it, so a slice of the log reads in pairs:

* 8c9a0d7e - fix(compiler): resolve inherited interface properties via interface walk
* 5da7deff - chore(bootstrap): refresh analyze-host-forms DLL for inherited interface property fix
* 7b639c7b - fix(compiler): resolve proxy-super base type from enclosing proxy
* 8ae40888 - chore(bootstrap): refresh typed-passes DLL for proxy-super shadowed-this fix

Anyone reading the history sees each change and its regenerated binary side by side, and the drift check is simply there for the day you forget the second half.

The payoff

The result is that nobody has to remember the rebuild steps, and nobody can quietly skip them either. If you edit a source and forget to rebuild it, the check fails in CI the moment you push, and shows you what drifted and how to fix it. A whole class of silent, confusing bugs simply stops existing. That is the beauty of it.

On programming languages, targets, and platforms

I started as a Java developer, but for some time now, I have been broadening my horizons. Recently, I thought about how early languages were dedicated to a single target and platform, and how they are now widening their focus. In this post, I want to write down my thoughts in the hope that they may be useful to others, and most likely to my future self.

Definitions

You may have been wondering about the title terms. I'm pretty sure that if you read this post, you have a pretty good picture of what a programming language is. Some may disagree on some finer points or raise a hair-splitting one, but it's not a PhD thesis, only a post on my blog. I must define what I mean by target and platform in the context of this post before going further.

Target
A target only makes sense in the context of compiled programming languages. For example, C's target is native code, and Java's is bytecode.

Platform
A platform is the system that will ultimately run the target. Native code runs on the operating system; bytecode on the JVM.

Early programming languages

Early programming languages had a single target and platform. I mentioned C and Java, but Ruby, Python, JavaScript, etc., were all the same.

| Programming language | Target | Platform |
| --- | --- | --- |
| C | Native code | Operating system |
| C++ | Native code | Operating system |
| Java | Bytecode | JVM |
| Python | - | Python runtime |
| TypeScript | JavaScript | Browser & server-side JS |
| JavaScript | - | Browser |

I believe it was the case for a long time. It changed at some point, though.

Multi-target is the new black

The first time I heard about multi-target was with Scala. Scala came from the single-target era and targeted bytecode on the JVM platform. However, in 2015, Martin Odersky announced Scala.js, which added JavaScript to Scala's targets.

The original article was published on InfoWorld, but it seems to have redirection issues nowadays. Here's the introduction on a copy:

Scala, developed as a functional and object-oriented language for the JVM, is now multiplatform, with developers using it in abundance on JavaScript via Scala.js, Scala founder Martin Odersky says.

With Scala.js, developers write code in Scala, and the code is then compiled to JavaScript, analogous to using Microsoft's TypeScript. Developers can leverage their Scala skills in Web development. "[Scala is] very popular on JavaScript now," Odersky said at the Scala Days conference in San Francisco.

– Scala.js lets you compile Scala to JavaScript

While Scala was the first I had heard about, other languages started to target JavaScript: I know at least Kotlin and Clojure, two originally JVM-bound languages. From that point on, it seemed every language started to add more targets. When they didn't, third parties tried to do it.

I believe that Kotlin was the epitome of such a strategy. It started with JavaScript, but the team later added native with LLVM and is currently working on WebAssembly, as far as I know. Java developers weren't left behind. The GraalVM project, managed by Oracle, allows them to generate native code. A third-party project, TeaVM, targets JavaScript and WebAssembly. It seems that nowadays lots of languages target Wasm.

Here's a small excerpt of languages and what targets and platforms they support at the time of this writing. It can't be exhaustive and obviously focuses on the JVM ecosystem, which I happen to know better.

| Programming language | Native/supporting project | Target | Platform |
| --- | --- | --- | --- |
| Java | Native | Bytecode | JVM |
| | GraalVM | Native code | Desktop OS |
| | TeaVM | JavaScript | Browser |
| | | WebAssembly | Wasm runtime |
| | | C | - |
| Kotlin | Native | Bytecode | JVM |
| | | JavaScript | Browser |
| | | | Server-side runtimes |
| | | Native | Desktop OS |
| | | | iOS |
| | | | Android |
| Dart | Native | JavaScript | Browser |
| | | WebAssembly | Wasm runtime |
| | DartNative | Native | Desktop OS |
| | | | iOS |
| | | | Android |
| Rust | Native | Native code | Desktop OS |
| | Native via a target | WebAssembly | Wasm runtime |
| | rustc_codegen_jvm | Bytecode | JVM |
| Zig | Native | Native code | Desktop OS |
| | Zig and WebAssembly | WebAssembly | Wasm runtime |
| Swift | Native | Native code | Desktop OS |
| | | ARM machine code | Android |
| | SwiftWasm | WebAssembly | Wasm runtime |
| Python | Native | Python runtime | Desktop OS |
| | jythonc (archived) | Bytecode | JVM |
| | py2wasm | WebAssembly | Wasm runtime |

Again, this is just a snapshot of projects I know at the time of this writing.

Discussion

Programming languages were initially focused on a single target and platform. With time, more and more languages have broadened their horizons, either as part of the language itself or through third-party efforts.

It made sense so far. The bigger your organization's investment in a programming language, the harder it is to pivot to another one. In that light, GraalVM's native is a value proposition for Java-heavy organizations using Kubernetes. The JVM was meant for long-running tasks, while Kubernetes is built on the idea of stopping pods and starting new ones.

However, with the fast progress of AI, I wonder whether the trend will continue. Some might question the value. Instead of GraalVM'ifying existing Java applications, why not rewrite them directly in Rust, which natively compiles to machine code? This won't happen for core applications at the beginning, but I expect some may experiment with peripheral ones. If the experience nets benefits in terms of resource usage, then it may gnaw at core apps.

I'll be watching the next few years closely to see whether I'm imagining things or whether the landscape is in for a complete upheaval.

Originally published at A Java Geek on June 21st, 2026.

ClojureScript vs. CoffeeScript for Processing sketches

DECEMBER 2015 UPDATE: I’ve run some tests with the latest ClojureScript and made some minor changes to the code, which significantly improved performance. Read this post for updated numbers. Keeping this post around for archival reasons.

The set up

This whole thing started not as a performance test, but as me experimenting with ClojureScript and Quil while reading Matt Pearson’s Generative Art. As such, it is not the most scientific of comparisons; it was born out of my notes from exploring how to do sketches with ClojureScript for the web.

Consider the following example:

It initializes a number of circles with a random radius and movement direction, and then each frame we:

  • Update their positions,
  • Check whether any of the circles overlaps with any other, and finally,
  • When two overlap, draw a circle at the overlap center, with a radius of the overlap amount.
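The overlap test in those steps can be sketched as follows. This is a Python illustration rather than the ClojureScript or CoffeeScript source, and both the `Circle` type and the choice of the midpoint between the two centers as the "overlap center" are assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Circle:
    x: float
    y: float
    radius: float


def overlap(a: Circle, b: Circle) -> Optional[Circle]:
    """Return the circle to draw for an overlapping pair, or None if they do not overlap."""
    distance = math.hypot(b.x - a.x, b.y - a.y)
    amount = a.radius + b.radius - distance  # how far the two circles reach into each other
    if amount <= 0:
        return None
    # Assumption: the "overlap center" is the midpoint between the two centers.
    return Circle((a.x + b.x) / 2, (a.y + b.y) / 2, amount)


print(overlap(Circle(0, 0, 3), Circle(4, 0, 3)))
print(overlap(Circle(0, 0, 1), Circle(5, 0, 1)))
```

Two circles of radius 3 whose centers are 4 apart overlap by 2, so the first call yields a circle of radius 2 halfway between them; the second pair is too far apart and yields None.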

As you can see, it’s unlike a visualization where you take a set of existing items and display them: it requires us to alter a significant amount of data every frame. I wrote it first in ClojureScript, but my initial implementation had some serious performance issues, so after optimizing a bit I decided to rewrite it in CoffeeScript for a performance comparison.

Some notes:

  • I’ve written Clojure before, even if I’m still a relatively new user.
  • This was the first bit of CoffeeScript that I wrote, and I mostly picked it because I wanted to compare in the browser and I’m not too keen on JavaScript.

This means that chances are both pieces lend themselves to optimization, but I don’t expect one language had an unfair advantage over the other (unlike, say, if I’d written one of the examples in C#, where I have a much better idea of the optimization trade-offs).

Performance

The initial ClojureScript implementation used maps for everything, which was horribly slow - I got about 9fps for 100 circles. A quick profile showed that a lot of time was being spent on accessing the map elements, and changing it so that the circle information was passed as a datatype improved performance by 300%.

As David Nolen commented on Twitter:

Here’s the frames-per-second that Chrome reported for each implementation:

| Circles | CoffeeScript | ClojureScript |
| --- | --- | --- |
| 100 | 38 | 27 |
| 150 | 34 | 6 |
| 170 | 32 | N/A |
| 200 | 25 | N/A |
| 350 | 8 | N/A |

Unsurprisingly, CoffeeScript comes out ahead on something that requires a lot of data modification. What did surprise me was how much better it scaled: it took 350 wandering circles for the performance of the CoffeeScript version to drop as far as the performance the ClojureScript implementation had with only 150.

It could be argued that the comparison is unfair, since CoffeeScript gets to be mutable and if performance is the main constraint, then the GC cost of immutability is likely to bite you when doing an example such as this one.  Valid points, but I precisely wanted to push ClojureScript to see how it behaved in this case, and as I mentioned only thought about doing a CoffeeScript variant for comparison afterwards.

I then tried to find out where that time was going…

Profiling

… which brought me to the next issue - the ClojureScript version has significantly more noise in the profiler, making optimization more difficult. Check for instance this trace:

CLJS trace, expanded

Most of the time is being spent on garbage collection or on core functions.  Our own code is in random_circles.cljs, which as you can see is way down the list. There must be some of our own functions involved in there, of course, but we need to dig really deep to figure out which of them are.

CLJS trace, expanded

By comparison, the equivalent CoffeeScript trace is pretty straightforward.

CoffeeScript trace

Even without digging in at all, it’s clear that most of the time in the ClojureScript version is going to garbage collection and the collection functions themselves. This is an advantage that CoffeeScript also has in this scenario, since it gets to just use JavaScript’s native arrays and we don’t need to continually dispose of calculated data for the circles (even though we do it for the intersections).

Some parting thoughts

Code length: the CoffeeScript implementation is 95 lines long, ClojureScript is 165. That I did not expect. The latter has a different indentation and could probably be compacted, but I’m not sure how readable the actual functions would end up without extracting some datatype values in a let.

Mutability: I could write the ClojureScript version with mutable datatypes, but that would also make the code longer and less readable.  I may try it for performance’s sake - I expect however that while it would remove the GC hit, the time spent on the CLJS function cost would still remain.

Reducing iterations: The ClojureScript implementation can likely be improved by making the functions that iterate over circles less independent, so that we perform all actions at once. In this way, instead of doing first a pass that figures out the overlap and then another that draws it, we could do a single pass that takes care of both.
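As a rough illustration of that idea (in Python, with hypothetical `find_overlap` and `draw` callbacks rather than the actual sketch code), the difference between the two shapes looks like this:

```python
def two_pass(circles, find_overlap, draw):
    """First collect every overlap into a list, then draw them in a second pass."""
    overlaps = []
    for i, a in enumerate(circles):
        for b in circles[i + 1:]:
            hit = find_overlap(a, b)
            if hit is not None:
                overlaps.append(hit)
    for hit in overlaps:
        draw(hit)


def single_pass(circles, find_overlap, draw):
    """Draw each overlap as soon as it is found; no intermediate list is built."""
    for i, a in enumerate(circles):
        for b in circles[i + 1:]:
            hit = find_overlap(a, b)
            if hit is not None:
                draw(hit)


def overlap_1d(a, b):
    """Toy stand-in: circles reduced to (center, radius) pairs on a line."""
    amount = a[1] + b[1] - abs(a[0] - b[0])
    return amount if amount > 0 else None


drawn = []
single_pass([(0, 3), (4, 3), (20, 1)], overlap_1d, drawn.append)
print(drawn)  # [2]
```

Both versions draw the same things; the single-pass one simply skips the temporary collection, which is the kind of allocation the profiler traces above point at.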

P5.js vs. Processing.js: One somewhat embarrassing note is that I realized too late that I had written the ClojureScript example against Processing.js and the CoffeeScript one against P5.js… which, it turns out, aren’t quite the same thing. As you’ve seen in the Profiling section, however, the actual rendering library was not the main performance bottleneck. I also found an article pointing out that Processing.js actually has a performance advantage when rendering, so I don’t expect rewriting the CoffeeScript version to use it would make any significant difference in favor of ClojureScript.

The source: Right, sources. CoffeeScript version, requiring P5.js; and the ClojureScript implementation, requiring Quil.

Extending types from Clojure while working on the REPL

In the datatypes chapter of Programming Clojure, 2nd edition (page 155), there’s the following bit where the CryptoVault is extended to support the default input streams:

Book capture

The calls to spit/slurp didn’t work in my tests at first, even after reloading the namespace.

After a few experiments, it turned out that I was using an instance of the type that had been created before I extended the type, so it didn’t have the necessary associations.

It appears that an instance keeps the function associations of when it was created, instead of referring to the latest definition… probably because it’s an entirely different type that just happens to have the same name.

REPL type redefinition

It’s somewhat surprising at first, but it makes sense immediately if you think of the type not as being reloaded, but as being redefined.
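The same "redefined, not reloaded" effect can be shown by analogy in Python. This is only an analogy, not Clojure's actual mechanism, and the `Vault` class is a made-up stand-in for the book's CryptoVault:

```python
class Vault:
    def describe(self):
        return "vault"


old = Vault()  # instance created before the redefinition


class Vault:  # same name, but a brand-new class object
    def describe(self):
        return "vault"

    def read(self):
        return "contents"


new = Vault()

print(hasattr(old, "read"))    # False: old still points at the first class
print(hasattr(new, "read"))    # True
print(type(old) is type(new))  # False: two different types sharing one name
```

The old instance never picks up the new method, because it belongs to a different type that merely shares the name.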

Making Magic stable

ClojureCLR vs MAGIC

The first question is always why not just use ClojureCLR, David Miller's mature Clojure-to-.NET port. It runs well on the desktop, but it builds its call sites (the small objects that dispatch each method call) by emitting IL, the bytecode the .NET runtime executes, while the program runs. It does this through a part of .NET called the DLR, the Dynamic Language Runtime. Unity's IL2CPP backend compiles everything to C++ ahead of time, so there is no runtime left to execute IL that was generated on the fly. iOS forces IL2CPP, because Apple forbids any third-party JIT, and Android forces it too, not through a JIT ban but because Google mandates 64-bit and Unity's Mono has no ARM64 build. Our games target both, so ClojureCLR is out. MAGIC emits fully static IL instead, and that is the whole reason it exists. The mechanics are in docs/why-magic.md.

How we use it at Flybot

At Flybot we helped port a client's old Java game libraries to Clojure. Then, because we knew MAGIC already existed, we took on the harder task of making those Clojure libraries run as .NET DLLs inside Unity. The payoff is that the same game APIs run in both the server backend and the Unity frontend, written once. I worked closely with Ramsey across two stretches, first on performance and then on stability (the earlier story), until those games shipped in production. I was doing the bug reporting, he was fixing the compiler.

The compiler did its job, but the toolchain around it was painful. Six repositories, each with its own version and no shared release. Ramsey's time for it had become limited, so bugs could sit. And the internals were undocumented, with no public dev workflow, so contributing meant first reverse-engineering how the pieces fit, which is what our clients did for the Unity part over the years. By the time I took it on, their repos still carried workarounds just to get MAGIC to compile and integrate with Unity.

1. Gather the six repos into one

flowchart LR
    m1["magic"] --> gfr
    m2["mage"] --> gfr
    cr["Clojure.Runtime"] --> gfr
    mr["Magic.Runtime"] --> gfr
    no["nostrand"] --> gfr
    mu["Magic.Unity"] --> gfr
    gfr{{"git-filter-repo<br/>(full history kept)"}} --> mono["flybot-sg/magic<br/>one repo · one version · one CI"]

A monorepo is a single repository that holds several related projects which ship together. It was the right call here, because these six always worked as one system. One version instead of six, one place to file bugs, and the freedom to land a compiler change, the runtime tweak it needs, and a stdlib fix in a single commit, instead of coordinating three separate repos. A fix that touches several components at once stops being a chore.

I used git-filter-repo to merge the six trees while keeping every author and commit date from 2014 onwards. That was deliberate, and better than a clean import: it keeps the credit with Ramsey, and since the runtime is forked from ClojureCLR, with David Miller and the ClojureCLR contributors too.

It also lets anyone trace a bug back to the commit that introduced it. A human, and an LLM especially, can bisect far faster when the entire history of every piece sits in one place. Owning the repo under our own org also means fixes ship when they are ready, not when an external maintainer is free.
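For readers who have not used bisection, the idea behind it is a binary search over the commit history. Here is a minimal Python sketch of that idea, assuming a linear history and a hypothetical `is_bad` test; it is not how git itself is implemented, and in practice you would just run git bisect.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad is true.

    Assumes commits are ordered oldest to newest, the last one is bad,
    and once a commit is bad every later one is bad too.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # the culprit is mid or earlier
        else:
            lo = mid + 1    # the culprit is after mid
    return commits[lo]


history = ["a", "b", "c", "d", "e"]  # hypothetical commit ids
print(first_bad(history, lambda c: c in {"d", "e"}))  # d
```

Each step halves the range, so the number of builds grows with the logarithm of the history length. That only works across components when all of their history lives in one repo.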

2. Build tooling instead of becoming a compiler expert

I am not a compiler expert, and trying to become one was not the objective. The better move was to make the compiler legible to anyone, me included. Everything runs as a Babashka (bb) task, and two of them carry most of the debugging: bb pipeline walks a form through the reader, AST, and emitted IL, and bb prepl-eval runs a form against a live MAGIC runtime. Between them, that is usually enough to see where something goes wrong without reading the compiler internals.

For example, for (+ 1 2), bb pipeline shows how it compiles, walking the form through the reader, AST and type stages down to the symbolic IL the emitter produces:

$ bb pipeline '(+ 1 2)'

================================================================
FORM   (+ 1 2)
================================================================

================================================================
AST (skeleton)
================================================================
{:args ...
 :fn {:op :var, :assignable? false, :var #'clojure.core/+, :form +},
 :original ...
 :type System.Int64,
 :op :intrinsic,
 :il-fn #object[MetaWrapper 0x189702c9 "clojure.lang.AFunction+MetaWrapper"],
 :form (+ 1 2)}

================================================================
TYPES (3 typed nodes)
================================================================
  :intrinsic (+ 1 2) :: System.Int64
  :const 1 :: System.Int64
  :const 2 :: System.Int64

================================================================
SYMBOLIC IL (3 instructions)
================================================================
  ldc.i8 1
  ldc.i8 2
  add.ovf

There is no Var lookup and no IFn.invoke: + is recognised as an intrinsic, a function the compiler knows how to emit directly, so it lowers to three CLR instructions. bb prepl-eval is the other half, running the same form on a live MAGIC runtime and handing back a structured reply:

$ bb prepl-eval '(+ 1 2)'

{:tag :ret, :val "3", :ns "user", :ms 2.1492, :form "(+ 1 2)"}

Between them I can see both what the compiler emits and what it actually does, which is most of what I need to localise a bug. The same scriptable tools are what let Claude Code reproduce a bug and narrow it to the offending stage, so I am not the only one who can move the compiler forward.
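To make the three-instruction listing concrete, here is a toy Python evaluator for just the two opcodes that appear in it. It is an illustration of how a stack machine consumes that IL, not a model of the CLR or of MAGIC's emitter.

```python
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def run(instructions):
    """Evaluate a list of symbolic IL lines on a simple operand stack."""
    stack = []
    for line in instructions:
        op, *args = line.split()
        if op == "ldc.i8":      # push a 64-bit integer constant
            stack.append(int(args[0]))
        elif op == "add.ovf":   # pop two values, push their sum, trap on overflow
            right, left = stack.pop(), stack.pop()
            total = left + right
            if not INT64_MIN <= total <= INT64_MAX:
                raise OverflowError("add.ovf: result does not fit in Int64")
            stack.append(total)
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return stack.pop()


print(run(["ldc.i8 1", "ldc.i8 2", "add.ovf"]))  # 3
```

The result matches the "3" that bb prepl-eval reports, and the overflow check mirrors why the compiler picked add.ovf rather than a plain add.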

3. A CI gate, because drift is invisible

MAGIC is written in Clojure and compiles itself, so the repo has to commit some generated files, including the compiler's own compiled output. Edit a source but forget to regenerate the file built from it, and the two fall out of sync with no build error, just an unknown bug downstream. That mismatch is drift, and bb check-drift regenerates each generated file and fails if anything moved:

| Generated artifact | Source of truth | How check-drift catches it |
| --- | --- | --- |
| Callsite .g.cs (magic-runtime) | .mustache templates | regenerate, then byte-diff the output (deterministic) |
| Stdlib .clj.dlls | magic-compiler/src/stdlib/** | source SHA in stdlib-manifest.edn, not DLL bytes (compilation is non-deterministic) |
| Compiler + bootstrap .clj.dlls (nostrand/references) | the magic.* compiler, mage, clojure.core, the bootstrap stdlib | source SHA in bootstrap-manifest.edn, the set stdlib-manifest skips |
| Unity package.json version | version.edn | compare the version field |
| magic-unity-dual variant | magic-unity | regenerate from the default, then diff |

So none of it is left to vigilance: every PR and push runs bb check-drift and the full test suite, and a tag push builds and publishes the release. On a self-hosting compiler, that is the line between dependable and working only until I forget a step. How the check pulls this off, byte-comparing the deterministic outputs and fingerprinting the source of the binaries that are not, is detailed in Drift Checks for a Self-Hosting Compiler.
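The fingerprinting half of that can be sketched briefly. This is a Python illustration under stated assumptions: the real manifests are EDN files whose exact layout is not shown here, so the manifest below is a hypothetical plain mapping from source path to the SHA recorded when its DLL was last built.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def source_sha(path: Path) -> str:
    """SHA-256 of a source file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def stale_sources(manifest: Dict[str, str], root: Path) -> List[str]:
    """Return sources whose current SHA differs from the one recorded at build time.

    The compiled DLL bytes are never compared, since compilation is not
    deterministic; only the source is fingerprinted.
    """
    stale = []
    for relative, recorded in sorted(manifest.items()):
        source = root / relative
        if not source.is_file() or source_sha(source) != recorded:
            stale.append(relative)
    return stale
```

An edited source whose binary was not rebuilt shows up in the returned list, because its recorded SHA no longer matches, even though the stale DLL itself is never inspected.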

4. A smoke project for the bugs CI cannot catch

The bugs that worried me most run fine on Mono and break only once Unity transpiles to C++ for a device, the one path CI does not exercise. Rather than keep rediscovering them inside our large game projects, I built a standalone Unity project that collects a minimal repro of every IL2CPP edge case we have hit, grouped into five suites (value types, letfn, polymorphism, control flow, stdlib), 52 checks that run green on both Mono and Standalone Mac IL2CPP at the press of one button. The rule is simple: whenever an IL2CPP bug gets fixed, its minimal repro lands here in the same commit, so the suite only grows and a bug we have already paid for cannot come back unnoticed.

One check shows why it earns its place. Calling an instance method on a primitive value is everywhere in real code:

(.ToString 42)    ; long   => "42"
(.GetType 90.0)   ; double => System.Double
(.Equals 7 7)     ; long   => true

All of it ran fine on Mono but threw InvalidProgramException the moment IL2CPP transpiled it. MAGIC had emitted a plain callvirt (the CLR instruction for a virtual method call) where a value type needs constrained.callvirt, the variant required when the receiver is a value type. Mono's JIT accepted the sloppy IL, and IL2CPP's verifier did not. That is exactly the failure CI cannot see, green on Mono, so only an actual IL2CPP build surfaces it. The fix and these checks shipped together. The consumer-side IL2CPP details live in docs/unity-integration.md.

5. Foundation first, then the backlog

Only with the monorepo, tooling, CI, and smoke suite in place did I start on the bugs that had been open on Ramsey's repos for years. The order was the point: without the safety net, fixing a compiler you do not fully understand is how you trade one bug for two. Each commit references the issue it closes, including the original nasser/* numbers, and the conventions in CONTRIBUTING.md mean a human or an LLM can file and fix without re-asking how we work.

The releases came fast once the base held:

timeline
    title MAGIC release arc (May to June 2026)
    v0.1.0 May 22 : Monorepo, bb tooling, CI, IL2CPP smoke
    v0.2.0 May 23 : Compiler and stdlib bug fixes
    v0.3.0 Jun 01 : Clojure 1.10 stdlib, magic.flags
    v0.4.0 Jun 04 : Native deps.edn in Nostrand
    v0.5.0 Jun 04 : Consumer quality-of-life
    v0.6.0 Jun 07 : Unity editor/player coexistence
    v0.7.0 Jun 09 : Dual Unity package
    v0.8.0 Jun 24 : Compiler fixes, bootstrap drift guard

Versioning is one version.edn, and bb tag cuts the tag that a CI job turns into a published release tarball on GitHub. One command, and a release builds and ships itself with nothing done by hand. That predictable, hands-off release path is what the single shared repo finally makes possible. Per-release detail is in the CHANGELOG.

A good measure of "dependable" is what disappears. The old way to use David Miller's clr.test.check was to comment out its clojure.core require and rewrite every core/let to its fully qualified form, just to dodge a MAGIC bug. After the v0.2.0 fixes, his port compiled under MAGIC with zero source patches, sooner than I expected; our clr.test.check fork now adds only a build harness and CI. Then, testing against our own libraries, I found that some workarounds were still necessary, because MAGIC had never been fully ported to Clojure 1.10. So v0.3.0 filled that gap and put every compiler option behind one magic.flags namespace.

6. Write a small resolver instead of adopting clr.tools.deps

MAGIC predates deps.edn, Clojure's dependency file, so each consumer carried a CLR-side project.edn next to its deps.edn: two files that drifted apart, with private-repo tokens inlined. David Miller maintains a CLR port of tools.deps (clr.tools.deps), but it leans on a few stdlib functions newer than MAGIC's Clojure 1.10 base, and cherry-picking and porting those in just to adopt it was not worth it. Our need was narrow anyway: resolve git and local coordinates transitively, skip Maven, and authenticate through the developer's own git and SSH config. So I wrote a small native deps.edn resolver, with a :clr alias (via :override-deps) to swap a JVM library for its CLR fork, plus shared dotnet.clj helpers so a project wires its build in a few lines. Trying the port was not wasted, though: loading it under MAGIC tripped a real compiler bug, a let-bound local thrown inside a catch, which I fixed in v0.3.0. That was the v0.4.0 and v0.5.0 work; the consumer guide is docs/porting-libraries-to-magic.md.
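The narrow behaviour described there (walk dependencies transitively, apply overrides, skip Maven) fits in a short sketch. This Python version is an illustration only, not the resolver that ships with Nostrand: the in-memory `deps_files` mapping, the library names, and the URLs are hypothetical, and "first declaration wins" is a simplification of real version selection.

```python
from typing import Dict, List

Coord = Dict[str, str]        # e.g. {"git/url": ..., "git/sha": ...} or {"local/root": ...}
DepsFile = Dict[str, Coord]   # one library's declared dependencies


def resolve(root: str, deps_files: Dict[str, DepsFile], overrides: DepsFile) -> Dict[str, Coord]:
    """Walk dependencies breadth-first, keeping only git and local coordinates."""
    resolved: Dict[str, Coord] = {}
    queue: List[str] = [root]
    while queue:
        lib = queue.pop(0)
        for dep, coord in deps_files.get(lib, {}).items():
            coord = overrides.get(dep, coord)  # an override replaces the declared coordinate
            if "mvn/version" in coord:         # Maven coordinates are skipped
                continue
            if dep not in resolved:            # first declaration wins; also stops cycles
                resolved[dep] = coord
                queue.append(dep)
    return resolved


deps_files = {
    "app": {
        "fun-map": {"mvn/version": "0.0.0"},  # hypothetical JVM coordinate
        "helpers": {"local/root": "../helpers"},
    },
    "fun-map": {
        "matcho": {"git/url": "https://example.invalid/matcho.git", "git/sha": "abc123"},
    },
}
clr_overrides = {
    "fun-map": {"git/url": "https://example.invalid/fun-map-clr.git", "git/sha": "def456"},
}

print(list(resolve("app", deps_files, clr_overrides)))  # ['fun-map', 'helpers', 'matcho']
print(list(resolve("app", deps_files, {})))             # ['helpers']
```

With the override in place, the JVM library is swapped for its git-hosted fork and its own dependencies are followed. Without it, the Maven coordinate is dropped and nothing beneath it is visited.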

7. The right runtime per phase, in Unity

With consumers able to build and depend on CLR libraries cleanly, the last piece left was the one we actually ship into. The earlier workflow packaged the MAGIC-compiled runtime as NuGet and was too slow to work against, because every change meant a full compile and repackage before it showed up in the editor.

The split that fixes this was not my idea. Hong, an engineer on our client's Unity team, had arrived at it out of necessity: run stock ClojureCLR in the editor, where it loads Clojure from source and compiles it to IL in memory as the program runs, so hot reload works, and ship MAGIC only in the player build, where its static IL is what IL2CPP needs. The diagram below shows that split, the same Clojure source meeting a different compiler at each phase.

flowchart TD
    src["Your Clojure (.cljc)"]
    src --> ed
    src --> pl
    subgraph ed["Unity Editor"]
        ccl["stock ClojureCLR<br/>in-memory IL, hot reload"]
    end
    subgraph pl["Player build"]
        magic["MAGIC<br/>static IL, IL2CPP"]
    end

My job was to make MAGIC support that arrangement first-class, so v0.6.0 lets a project keep ClojureCLR in the editor and MAGIC in the player without the two colliding, leaving their existing workflow untouched. Distribution moved to the magic-unity UPM package (a Unity Package Manager package, pinned by git URL), so there is no hand-maintained copy to drift.

That left one rough edge: with both runtimes present, Unity logged 46 benign "incompatible with the editor" lines on every reload, and a consumer rightly complained. So v0.7.0 ships two UPM variants and lets a project pick the one that fits:

| UPM variant | MAGIC runtime in the editor? | Player build | Coexistence editor noise | Pick it when |
| --- | --- | --- | --- | --- |
| sg.flybot.magic.unity (default) | included, runs in Play mode | MAGIC | 46 benign lines | the project runs MAGIC everywhere |
| sg.flybot.magic.unity.dual | excluded via !UNITY_EDITOR | MAGIC | none | the project keeps stock ClojureCLR as its editor runtime |

I generate the dual from the default and gate it in CI, rather than maintain a second copy by hand. The mechanics are in docs/unity-integration.md.

8. Pull the scattered forks into one org

A stable compiler paid off in a place I had not planned for: our own dependency forks. While MAGIC was unstable we could not reuse existing CLR ports cleanly, so they lived as forks on personal GitHub accounts, drifting from upstream. That is a bus-factor risk: if one person's account went away, so did the fork. I gathered the ones we depend on into the flybot-sg org next to the compiler, and brought them back in line with upstream. They landed in three different shapes, itself a measure of how far the compiler had come: clr.test.check is a straight drop-in, since David Miller's port now builds under MAGIC untouched and the fork adds only CI; fun-map needed a genuine .cljc CLR port; and matcho took a single reader-conditional. Claude Code made the porting quick wherever the interop was not exotic.

Rich comment tests were the one gap left: the rich-comment-tests library relies heavily on the JVM, so my colleague Parth had the idea to extract the assertions on the JVM and emit a plain .cljc test file with regular deftest that the CLR can run. That became rct-clr. To run all of it on both runtimes in one pipeline, I built the ci-clj-clr image.

9. Docs that ship with the code

The toolchain had to be usable by people who did not build it, and by the LLMs working alongside them. The CLR is unfamiliar ground for most Clojure developers, so the docs are written for both, human and LLM alike, to make the CLR ecosystem easier to enter. Three of them carry it, all in-repo and versioned with the code:

| Document | What it covers |
| --- | --- |
| docs/ | why MAGIC exists, porting a library, cross-platform .cljc, and the Unity integration |
| Component READMEs | what each piece is, with the Clojure version, runtimes, and Unity version it is tested against |
| CHANGELOG | one entry per release, every issue it closes (including the upstream nasser/* numbers) |

The porting guide is the one that matters most: it is detailed enough that a developer, or an LLM pointed at it, can take a JVM Clojure library to a .NET DLL and load it in Unity with little prior MAGIC knowledge.

What is next

The next real effort is dropping Mono for CoreCLR, which Unity is moving to and Nostrand still predates. That is a better investment than porting Clojure 1.11, which can wait.

Clojure Deref (Jun 23, 2026)

Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem (feed: RSS).

Selected Highlights

Ring released v1.15.5 which contains a security fix for a regular expression denial of service (ReDoS) attack.

SCI released v0.13.53 to fix a sandbox escape. SCI is the Small Clojure Interpreter used in babashka, nbb, clerk, joyride and other projects.

If you like Reagami, but you want even more pure data in your lightweight ClojureScript UIs, look no further. Team Replicant combined powers with Team Squint to bring your data-driven UIs into focus by squinting, of course.

And speaking of Squint, as of v0.13.194, lazy seqs got a big performance boost, and you can specify dependencies with :git/sha and :local/root. See all the details.

Not to be outdone, thanks to a team assist, Replicant released v2026.06.1 to fix a long standing annoyance with tidying DOM nodes. Another team score for Christian Johansen and Michiel Borkent.

Perhaps you’re not fond of web UIs because your heart’s been stolen by the CLI. Now you can tell your CLI, "you complete me." Quite literally! Babashka CLI added shell completions and automatic help. It’s like Cupid fired a ClojureDart right through the heart. Oh yes, Babashka CLI added support for that too. See all the details.

Sometimes, you just don’t want to talk. (It’s not you, it’s me!) In the spirit of rejection, Sente v1.22 added a flexible connection rejection hook. Try again later. Maybe.

If you want to talk, NATS style, but you’ve been flummoxed by all that complicated stateful interop. No one saying what they really mean in plain data. Oh! Why do we play these games? Agonize no longer. Claxon is here to cut the drama and simplify your NATS messaging through a minimal, data-driven API with only one dependency. If only it could bring such simplicity and clarity to the complicated matters of the heart.

Maybe you didn’t say what you really meant? Perhaps it could be even better? Say it again, even faster with rewrite-clj v1.2.55. It’s the latest in a series of performance improvements. See the changes.

And if you want to go even faster, take a look at Raster: fast numerical computing for Clojure that compiles down to JVM bytecode or WebAssembly (WASM). For some fun, play Astroids or Valley. For something serious, look at Uniform Manifold Approximation and Projection or Embedding Vector Oriented Clustering.

And speaking of WASM, what if your Clojure host was WASM itself? What if WASM was the lingua franca to let you call functions from Rust, Go, Zig, C or any other language that can target WASM? ClojureWasm is an experimental dialect to explore that future.

Also on the young dialect front, Jolt has pivoted, yet again, to target Chez Scheme. Goodbye Janet. The heart is fickle. Chez is faster, with true threading, and easy access to native code, while still starting a lightweight runtime instantly.

On the tried-and-true dialect front, if you’re using Calva for ClojureScript via shadow-cljs, Calva v2.0.592 has improved support for working with simultaneous runtimes. It’s particularly helpful when working with multiple devices at once.

If you like to use Calva for AI-assisted coding, Calva Backseat Driver added support for Cursor with zero config.

Clojure is taking on Electronic Health Records (EHR). Have you suffered through EHR integrations? Have you told yourself, "There must be an easier way!" Look no further than the ehr-adapter.

Do you use libpython-clj as a bridge to PyTorch? Would you like to cut down on the boilerplate? Take a look at clj-pytorch to see how it can help.

With all that important work getting done, don’t forget to have a little fun with Clojure too. Arne Brasseur started a list for Awesome Creative Clojure projects. Take a look. Better yet, go make something delightful and share it with the community. Perhaps you’ll find yourself in the Deref too.

Clojure/Conj 2026

Wednesday, September 30 is Workshop Day.

This year, as a gift to the community, the full day of workshops is included at one low price, starting at just $42!

One morning + one afternoon session of your choice: Babashka, Datomic, Calva, AI tooling, and more.

Get your Workshop Day pass before the price goes up.

Also, the room block at the conference hotel is available now! The group rate expires August 31.

Jank Survey

Jeaye Wilkerson would like to get jank into your hands as soon as possible. How does that involve you? Help Jeaye by taking a short survey.

Upcoming Events

Podcasts, videos, and media

Libraries and Tools

Debut release

  • awesome-creative-clojure - A list of creative coding resources for Clojure and its dialects

  • umap-rstr - UMAP (Uniform Manifold Approximation and Projection) for Clojure — a port of umap-learn on the raster typed-dispatch compiler.

  • evoc-rstr - EVoC (Embedding Vector Oriented Clustering) for Clojure — a port of the Tutte Institute’s evoc on raster + umap-rstr.

  • codetutor - An AI Pair Programmer, that teaches you to code as you write, for Emacs

  • claxon - Minimal, pure Clojure, data-driven NATS client

  • ehr-adapter - Simplify EHR integrations by defining and validating your entire provider connection pipeline using native Clojure data structures.

  • clj-pytorch - A Clojure wrapper around PyTorch with libpython-clj

  • huffman-tree - A small Huffman tree implementation in Clojure

  • fol - A Clojure dialect that combines persistent data structures (from Clojure), CLOS-style object orientation with persistent objects, and array programming capabilities (inspired by Q/APL).

Updates

  • partial-cps 0.1.58 - A lean and efficient continuation passing style transform, includes async-await support.

  • ziggurat 4.12.1 - A stream processing framework to build stateless applications on Kafka

  • hikari-cp 4.1.0 - A Clojure wrapper to HikariCP JDBC connection pool

  • json-schema 0.4.8 - Clojure library for JSON Schema validation and generation - Draft-07 compatible

  • hawk 1.0.15 - It watches your code like a hawk! You like tests, right? Then run them with our state-of-the-art Clojure test runner.

  • clojuressh 1.0.0 - A Clojure library for using SSH in Clojure that is API compatible with bbssh

  • sente 1.22.0-RC1 - Realtime web comms library for Clojure/Script

  • dda-tara 3.0.2 - Threat modeling toolkit based on threagile

  • rewrite-clj 1.2.55 - Rewrite Clojure code and edn

  • clj-tg-bot-api 1.2.271 - 🤖 The latest Telegram Bot API spec and client lib for Clojure-based apps

  • teensyp 0.7.2 - A small, zero-dependency Clojure TCP server that uses Java NIO

  • sci 0.13.53 - Configurable Clojure/Script interpreter suitable for scripting and Clojure DSLs

  • phel-lang 0.45.1 - A functional, Lisp-inspired language that compiles to PHP. Inspired by Clojure, Phel brings macros, persistent data structures, and expressive functional idioms to the PHP ecosystem.

  • next-jdbc 1.3.1118 - A modern low-level Clojure wrapper for JDBC-based access to databases.

  • temporal-clojure-sdk 2.1.0 - A Temporal SDK for Clojure

  • replicant 2026.06.1 - A data-driven rendering library for Clojure(Script) that renders hiccup to DOM or to strings.

  • raster 0.1.15 - Fast, functional numerical computing for Clojure/JVM.

  • mcp-server 0.3.50 - MCP Server library

  • ClojureWasm 1.0.0-alpha.1 - A lightweight Clojure runtime in Zig — call WebAssembly from Clojure to tap libraries written in any language.

  • calva-backseat-driver 0.0.37 - VS Code AI Agent Interactive Programming. Tools for Copilot and other assistants. Can also be used as an MCP server.

  • nexus 2026.06.2 - Data-driven action dispatch for Clojure(Script): Build systems that are easier to test, observe, and extend

  • calva 2.0.593 - Clojure & ClojureScript Interactive Programming for VS Code

  • persistent-sorted-set 0.4.124 - Fast B-tree based persistent sorted set for Clojure/Script

  • plumcp 0.2.2 - Clojure/ClojureScript library for making MCP servers and clients

  • yamlstar 0.1.9 - A YAML framework for all programming languages

  • mranderson 0.6.0 - Dependency inlining and shadowing

  • encore 3.167.0 - Core utils library for Clojure/Script

  • svar 0.7.33 - Type‑safe LLM output for Clojure. Works with any text‑only model.

  • ansatz 0.1.64 - Dependently typed Clojure DSL with a Lean4 compatible kernel.

  • clj-midas 1.0.0 - Clojure client library for the California Energy Commission’s MIDAS API

  • cli 0.11.74 - Turn Clojure functions into CLIs!

  • squint 0.13.195 - Light-weight ClojureScript dialect

  • ring 1.15.5 - Clojure HTTP server abstraction

  • generate 1.0.69 - code generation for Clojure projects
