Build and Deploy Web Apps With Clojure and Fly.io

This post walks through a small web development project using Clojure, covering everything from building the app to packaging and deploying it. It’s a collection of insights and tips I’ve learned from building my Clojure side projects, but presented in a more structured format.

As the title suggests, we’ll be deploying the app to Fly.io. It’s a service that allows you to deploy apps packaged as Docker images on lightweight virtual machines.[1] My experience with it has been good; it’s easy to use and quick to set up. One downside of Fly is that it doesn’t have a free tier, but if you don’t plan on leaving the app deployed, it barely costs anything.

This isn’t a tutorial on Clojure, so I’ll assume you already have some familiarity with the language as well as some of its libraries.[2]

Project Setup

In this post, we’ll be building a barebones bookmarks manager as the demo app. Users can log in using basic authentication, view all bookmarks, and create a new bookmark. It’ll be a traditional multi-page web app and the data will be stored in a SQLite database.

Here’s an overview of the project’s starting directory structure:

.
├── dev
│   └── user.clj
├── resources
│   └── config.edn
├── src
│   └── acme
│       └── main.clj
└── deps.edn

And here are the libraries we’re going to use. If you have some Clojure experience or have used Kit, you’re probably already familiar with all the libraries listed below.[3]

deps.edn
{:paths ["src" "resources"]
 :deps {org.clojure/clojure               {:mvn/version "1.12.0"}
        aero/aero                         {:mvn/version "1.1.6"}
        integrant/integrant               {:mvn/version "0.11.0"}
        ring/ring-jetty-adapter           {:mvn/version "1.12.2"}
        metosin/reitit-ring               {:mvn/version "0.7.2"}
        com.github.seancorfield/next.jdbc {:mvn/version "1.3.939"}
        org.xerial/sqlite-jdbc            {:mvn/version "3.46.1.0"}
        hiccup/hiccup                     {:mvn/version "2.0.0-RC3"}}
 :aliases
 {:dev {:extra-paths ["dev"]
        :extra-deps  {nrepl/nrepl    {:mvn/version "1.3.0"}
                      integrant/repl {:mvn/version "0.3.3"}}
        :main-opts   ["-m" "nrepl.cmdline" "--interactive" "--color"]}}}

I use Aero and Integrant for my system configuration (more on this in the next section), Ring with the Jetty adaptor for the web server, Reitit for routing, next.jdbc for database interaction, and Hiccup for rendering HTML. From what I’ve seen, this is a popular “library combination” for building web apps in Clojure.[4]

The user namespace in dev/user.clj contains helper functions from Integrant-repl to start, stop, and restart the Integrant system.

dev/user.clj
(ns user
  (:require
   [acme.main :as main]
   [clojure.tools.namespace.repl :as repl]
   [integrant.core :as ig]
   [integrant.repl :refer [set-prep! go halt reset reset-all]]))

(set-prep!
 (fn []
   (ig/expand (main/read-config)))) ;; we'll implement this soon

(repl/set-refresh-dirs "src" "resources")

(comment
  (go)
  (halt)
  (reset)
  (reset-all))

Systems and Configuration

If you’re new to Integrant or other dependency injection libraries like Component, I’d suggest reading “How to Structure a Clojure Web”. It’s a great explanation of the reasoning behind these libraries. Like most Clojure apps that use Aero and Integrant, my system configuration lives in a .edn file. I usually name mine resources/config.edn. Here’s what it looks like:

resources/config.edn
{:server
 {:port #long #or [#env PORT 8080]
  :host #or [#env HOST "0.0.0.0"]
  :auth {:username #or [#env AUTH_USER "john.doe@email.com"]
         :password #or [#env AUTH_PASSWORD "password"]}}

 :database
 {:dbtype "sqlite"
  :dbname #or [#env DB_DATABASE "database.db"]}}

In production, most of these values will be set using environment variables. During local development, the app will use the hard-coded default values. We don’t have any sensitive values in our config (e.g., API keys), so it’s fine to commit this file to version control. If there are such values, I usually put them in another file that’s not tracked by version control and include them in the config file using Aero’s #include reader tag.
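As a sketch of that last point, a git-ignored secrets.edn (the file name and keys here are hypothetical) could be merged in like this:

```clojure
;; resources/config.edn -- sketch; assumes a git-ignored resources/secrets.edn
;; next to it containing something like {:api-key "..."}
{:server  {:port #long #or [#env PORT 8080]}
 :secrets #include "secrets.edn"}
```

By default, Aero resolves #include paths relative to the including file, so the secrets file can live right next to config.edn.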

This config file is then “expanded” into the Integrant system map using the expand-key method:

src/acme/main.clj
(ns acme.main
  (:require
   [aero.core :as aero]
   [clojure.java.io :as io]
   [integrant.core :as ig]))

(defn read-config
  []
  {:system/config (aero/read-config (io/resource "config.edn"))})

(defmethod ig/expand-key :system/config
  [_ opts]
  (let [{:keys [server database]} opts]
    {:server/jetty (assoc server :handler (ig/ref :handler/ring))
     :handler/ring {:database (ig/ref :database/sql)
                    :auth     (:auth server)}
     :database/sql database}))

The system map is created in code instead of being in the configuration file. This makes refactoring your system simpler as you only need to change this method while leaving the config file (mostly) untouched.[5]
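To make this concrete, with the default values from config.edn, the expanded system map looks roughly like this (shown as data; ig/ref creates the references between components):

```clojure
;; Approximate result of ig/expand with the default config values:
{:server/jetty {:port    8080
                :host    "0.0.0.0"
                :auth    {:username "john.doe@email.com"
                          :password "password"}
                :handler (ig/ref :handler/ring)}
 :handler/ring {:database (ig/ref :database/sql)
                :auth     {:username "john.doe@email.com"
                           :password "password"}}
 :database/sql {:dbtype "sqlite"
                :dbname "database.db"}}
```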

My current approach to Integrant + Aero config files is mostly inspired by the blog post “Rethinking Config with Aero & Integrant” and Laravel’s configuration. The config file follows a similar structure to Laravel’s config files and contains the app configurations without describing the structure of the system. Previously, I had a key for each Integrant component, which led to the config file being littered with #ig/ref and more difficult to refactor.

Also, if you haven’t already, start a REPL and connect to it from your editor. Run clj -M:dev if your editor doesn’t automatically start a REPL. Next, we’ll implement the init-key and halt-key! methods for each of the components:

src/acme/main.clj
(ns acme.main
  (:require
   ;; ...
   [acme.handler :as handler]
   [acme.util :as util]
   [next.jdbc :as jdbc]
   [ring.adapter.jetty :as jetty]))
;; ...
;; ...

(defmethod ig/init-key :server/jetty
  [_ opts]
  (let [{:keys [handler port]} opts
        jetty-opts (-> opts (dissoc :handler :auth) (assoc :join? false))
        server     (jetty/run-jetty handler jetty-opts)]
    (println "Server started on port " port)
    server))

(defmethod ig/halt-key! :server/jetty
  [_ server]
  (.stop server))

(defmethod ig/init-key :handler/ring
  [_ opts]
  (handler/handler opts))

(defmethod ig/init-key :database/sql
  [_ opts]
  (let [datasource (jdbc/get-datasource opts)]
    (util/setup-db datasource)
    datasource))

The setup-db function creates the required tables in the database if they don’t exist yet. This works fine for database migrations in small projects like this demo app, but for larger projects, consider using libraries such as Migratus (my preferred library) or Ragtime.

src/acme/util.clj
(ns acme.util 
  (:require
   [next.jdbc :as jdbc]))

(defn setup-db
  [db]
  (jdbc/execute-one!
   db
   ["create table if not exists bookmarks (
       bookmark_id text primary key not null,
       url text not null,
       created_at datetime default (unixepoch()) not null
     )"]))

For the server handler, let’s start with a simple function that returns a “hi world” string.

src/acme/handler.clj
(ns acme.handler
  (:require
   [ring.util.response :as res]))

(defn handler
  [_opts]
  (fn [req]
    (res/response "hi world")))

Now all the components are implemented. We can check if the system is working properly by evaluating (reset) in the user namespace. This will reload your files and restart the system. You should see this message printed in your REPL:

:reloading (acme.util acme.handler acme.main)
Server started on port  8080
:resumed

If we send a request to http://localhost:8080/, we should get “hi world” as the response:

$ curl localhost:8080/
# hi world

Nice! The system is working correctly. In the next section, we’ll implement routing and our business logic handlers.

Routing, Middleware, and Route Handlers

First, let’s set up a ring handler and router using Reitit. We only have one route, the index / route that’ll handle both GET and POST requests.

src/acme/handler.clj
(ns acme.handler
  (:require
   [reitit.ring :as ring]))

(declare index-page index-action) ;; implemented later in this namespace

(def routes
  [["/" {:get  #'index-page
         :post #'index-action}]])

(defn handler
  [opts]
  (ring/ring-handler
   (ring/router routes)
   (ring/routes
    (ring/redirect-trailing-slash-handler)
    (ring/create-resource-handler {:path "/"})
    (ring/create-default-handler))))

We’re also composing in some of Reitit’s useful default handlers:

  • redirect-trailing-slash-handler to resolve routes with trailing slashes,
  • create-resource-handler to serve static files, and
  • create-default-handler to handle common 40x responses.

Implementing the Middleware

If you remember the :handler/ring component from earlier, you’ll notice that it has two dependencies: database and auth. Currently, they’re inaccessible to our route handlers. To fix this, we can inject these components into the Ring request map using a middleware function.

src/acme/handler.clj
;; ...

(defn components-middleware
  [components]
  (let [{:keys [database auth]} components]
    (fn [handler]
      (fn [req]
        (handler (assoc req
                        :db database
                        :auth auth))))))
;; ...

The components-middleware function takes in a map of components and creates a middleware function that “assocs” each component into the request map.[6] If you have more components such as a Redis cache or a mail service, you can add them here.

We’ll also need a middleware to handle HTTP basic authentication.[7] This middleware will check if the username and password from the request map match the values in the auth map injected by components-middleware. If they match, then the request is authenticated and the user can view the site.

src/acme/handler.clj
(ns acme.handler
  (:require
   ;; ...
   [acme.util :as util]
   [ring.util.response :as res]))
;; ...

(defn wrap-basic-auth
  [handler]
  (fn [req]
    (let [{:keys [headers auth]} req
          {:keys [username password]} auth
          authorization (get headers "authorization")
          correct-creds (str "Basic " (util/base64-encode
                                       (format "%s:%s" username password)))]
      (if (and authorization (= correct-creds authorization))
        (handler req)
        (-> (res/response "Access Denied")
            (res/status 401)
            (res/header "WWW-Authenticate" "Basic realm=protected"))))))
;; ...

A nice feature of Clojure is that interop with the host language is easy. The base64-encode function is just a thin wrapper over Java’s Base64.Encoder:

src/acme/util.clj
(ns acme.util
   ;; ...
  (:import java.util.Base64))

(defn base64-encode
  [s]
  (.encodeToString (Base64/getEncoder) (.getBytes s)))
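If you want to sanity-check the encoding outside the REPL, the same value can be produced from the shell (using the default credentials from config.edn):

```shell
# Encode "username:password" the same way the Java interop code above does.
# Prefixed with "Basic ", this is the Authorization header value the
# middleware compares against.
printf '%s' 'john.doe@email.com:password' | base64
```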

Finally, we need to add them to the router. Since we’ll be handling form requests later, we’ll also bring in Ring’s wrap-params middleware.

src/acme/handler.clj
(ns acme.handler
  (:require
   ;; ...
   [ring.middleware.params :refer [wrap-params]]))
;; ...

(defn handler
  [opts]
  (ring/ring-handler
   ;; ...
   {:middleware [(components-middleware opts)
                 wrap-basic-auth
                 wrap-params]}))

Implementing the Route Handlers

We now have everything we need to implement the route handlers or the business logic of the app. First, we’ll implement the index-page function, which renders a page that:

  1. Shows all of the user’s bookmarks in the database, and
  2. Shows a form that allows the user to insert new bookmarks into the database
src/acme/handler.clj
(ns acme.handler
  (:require
   ;; ...
   [next.jdbc :as jdbc]
   [next.jdbc.sql :as sql]))
;; ...

(defn template
  [bookmarks]
  [:html
   [:head
    [:meta {:charset "utf-8"}]
    [:meta {:name    "viewport"
            :content "width=device-width, initial-scale=1.0"}]]
   [:body
    [:h1 "bookmarks"]
    [:form {:method "POST"}
     [:div
      [:label {:for "url"} "url "]
      [:input#url {:name "url"
                   :type "url"
                   :required true
                   :placeholder "https://en.wikipedia.org/"}]]
     [:button "submit"]]
    [:p "your bookmarks:"]
    [:ul
     (if (empty? bookmarks)
       [:li "you don't have any bookmarks"]
       (map
        (fn [{:keys [url]}]
          [:li
           [:a {:href url} url]])
        bookmarks))]]])

(defn index-page
  [req]
  (try
    (let [bookmarks (sql/query (:db req)
                               ["select * from bookmarks"]
                               jdbc/unqualified-snake-kebab-opts)]
      (util/render (template bookmarks)))
    (catch Exception e
      (util/server-error e))))
;; ...

Database queries can sometimes throw exceptions, so it’s good to wrap them in a try-catch block. I’ll also introduce some helper functions:

src/acme/util.clj
(ns acme.util
  (:require
   ;; ...
   [hiccup2.core :as h]
   [ring.util.response :as res])
  (:import java.util.Base64))
;; ...

(defn prepend-doctype
  [s]
  (str "<!doctype html>" s))

(defn render
  [hiccup]
  (-> hiccup h/html str prepend-doctype res/response (res/content-type "text/html")))

(defn server-error
  [e]
  (println "Caught exception: " e)
  (-> (res/response "Internal server error")
      (res/status 500)))

render takes a hiccup form and turns it into a ring response, while server-error takes an exception, logs it, and returns a 500 response.

Next, we’ll implement the index-action function:

src/acme/handler.clj
;; ...

(defn index-action
  [req]
  (try
    (let [{:keys [db form-params]} req
          value (get form-params "url")]
      (sql/insert! db :bookmarks {:bookmark_id (random-uuid) :url value})
      (res/redirect "/" 303))
    (catch Exception e
      (util/server-error e))))
;; ...

This is an implementation of a typical post/redirect/get pattern. We get the value from the URL form field, insert a new row in the database with that value, and redirect back to the index page. Again, we’re using a try-catch block to handle possible exceptions from the database query.

That should be all of the code for the controllers. If you reload your REPL and go to http://localhost:8080, you should see something that looks like this after logging in:

Screenshot of the app

The last thing we need to do is to update the main function to start the system:

src/acme/main.clj
;; ...

(defn -main [& _]
  (-> (read-config) ig/expand ig/init))

Now, you should be able to run the app using clj -M -m acme.main. That’s all the code needed for the app. In the next section, we’ll package the app into a Docker image to deploy to Fly.

Packaging the App

While there are many ways to package a Clojure app, Fly.io specifically requires a Docker image. There are two approaches to doing this:

  1. Build an uberjar and run it using Java in the container, or
  2. Load the source code and run it using Clojure in the container

Both are valid approaches. I prefer the first since its only runtime dependency is the JVM. We’ll use the tools.build library to build the uberjar. Check out the official guide for more information on building Clojure programs. Since tools.build is a library, we add it to our deps.edn file under an alias:

deps.edn
{;; ...
 :aliases
 {;; ...
  :build {:extra-deps {io.github.clojure/tools.build 
                       {:git/tag "v0.10.5" :git/sha "2a21b7a"}}
          :ns-default build}}}

Tools.build expects a build.clj file in the root of the project directory, so we’ll need to create that file. This file contains the instructions to build artefacts, which in our case is a single uberjar. There are many great examples of build.clj files on the web, including from the official documentation. For now, you can copy+paste this file into your project.

build.clj
(ns build
  (:require
   [clojure.tools.build.api :as b]))

(def basis (delay (b/create-basis {:project "deps.edn"})))
(def src-dirs ["src" "resources"])
(def class-dir "target/classes")

(defn uber
  [_]
  (println "Cleaning build directory...")
  (b/delete {:path "target"})

  (println "Copying files...")
  (b/copy-dir {:src-dirs   src-dirs
               :target-dir class-dir})

  (println "Compiling Clojure...")
  (b/compile-clj {:basis      @basis
                  :ns-compile '[acme.main]
                  :class-dir  class-dir})

  (println "Building Uberjar...")
  (b/uber {:basis     @basis
           :class-dir class-dir
           :uber-file "target/standalone.jar"
           :main      'acme.main}))

To build the project, run clj -T:build uber. This will create the uberjar standalone.jar in the target directory. The uber in clj -T:build uber refers to the uber function from build.clj. Since the build system is a Clojure program, you can customise it however you like. If we try to run the uberjar now, we’ll get an error:

# build the uberjar
$ clj -T:build uber
# Cleaning build directory...
# Copying files...
# Compiling Clojure...
# Building Uberjar...

# run the uberjar
$ java -jar target/standalone.jar
# Error: Could not find or load main class acme.main
# Caused by: java.lang.ClassNotFoundException: acme.main

This error occurs because the main class required by Java hasn’t been generated. To fix this, we need to add the :gen-class directive to our main namespace. This instructs the Clojure compiler to generate a Java class whose static main method calls our -main function.

src/acme/main.clj
(ns acme.main
  ;; ...
  (:gen-class))
;; ...

If you rebuild the project and run java -jar target/standalone.jar again, it should work perfectly. Now that we have a working build script, we can write the Dockerfile:

Dockerfile
# install additional dependencies here in the base layer
# separate base from build layer so any additional deps installed are cached
FROM clojure:temurin-21-tools-deps-bookworm-slim AS base

FROM base AS build
WORKDIR /opt
COPY . .
RUN clj -T:build uber

FROM eclipse-temurin:21-alpine AS prod
COPY --from=build /opt/target/standalone.jar /
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "standalone.jar"]

It’s a multi-stage Dockerfile. We use the official Clojure Docker image for the build stage that produces the uberjar. Once it’s built, we copy it into a smaller image that contains only the Java runtime.[8] This gives us a smaller final image and faster Docker builds, since the layers are cached more effectively.

That should be all for packaging the app. We can move on to the deployment now.

Deploying with Fly.io

First things first, you’ll need to install flyctl, Fly’s CLI tool for interacting with their platform. Create a Fly.io account if you haven’t already. Then run fly auth login to authenticate flyctl with your account.

Next, we’ll need to create a new Fly App:

$ fly app create
# ? Choose an app name (leave blank to generate one): 
# automatically selected personal organization: Ryan Martin
# New app created: blue-water-6489

Another way to do this is with the fly launch command, which automates a lot of the app configuration for you. We have some steps to do that are not done by fly launch, so we’ll be configuring the app manually. I also already have a fly.toml file ready that you can copy straight into your project.

fly.toml
# replace these with your app and region name
# run `fly platform regions` to get a list of regions
app = 'blue-water-6489' 
primary_region = 'sin'

[env]
  DB_DATABASE = "/data/database.db"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = "stop"
  auto_start_machines = true
  min_machines_running = 0

[mounts]
  source = "data"
  destination = "/data"
  initial_size = 1

[[vm]]
  size = "shared-cpu-1x"
  memory = "512mb"
  cpus = 1
  cpu_kind = "shared"

These are mostly the default configuration values with some additions. Under the [env] section, we’re setting the SQLite database location to /data/database.db. The database.db file itself will be stored in a persistent Fly Volume mounted on the /data directory. This is specified under the [mounts] section. Fly Volumes are similar to regular Docker volumes but are designed for Fly’s micro VMs.

We’ll need to set the AUTH_USER and AUTH_PASSWORD environment variables too, but not through the fly.toml file as these are sensitive values. To securely set these credentials with Fly, we can set them as app secrets. They’re stored encrypted and will be automatically injected into the app at boot time.

$ fly secrets set AUTH_USER=hi@ryanmartin.me AUTH_PASSWORD=not-so-secure-password
# Secrets are staged for the first deployment

With this, the configuration is done and we can deploy the app using fly deploy:

$ fly deploy
# ...
# Checking DNS configuration for blue-water-6489.fly.dev
# Visit your newly deployed app at https://blue-water-6489.fly.dev/

The first deployment will take longer since it’s building the Docker image for the first time. Subsequent deployments should be faster due to the cached image layers. You can click on the link to view the deployed app, or you can also run fly open, which will do the same thing. Here’s the app in action:

The app in action

If you made additional changes to the app or fly.toml, you can redeploy the app using the same command, fly deploy. The app is configured to auto stop/start, which helps to cut costs when there’s not a lot of traffic to the site. If you want to take down the deployment, you’ll need to delete the app itself using fly app destroy <your app name>.

Adding a Production REPL

This is an interesting topic in the Clojure community, with varying opinions on whether or not it’s a good idea. Personally, I find having a REPL connected to the live app helpful, and I often use it for debugging and running queries on the live database.[9] Since we’re using SQLite, we don’t have a database server we can directly connect to, unlike Postgres or MySQL.

If you’re brave, you can even restart the app directly from the REPL without redeploying. It’s also easy to go wrong with, which is why some prefer not to use it.

For this project, we’re gonna add a socket REPL. It’s very simple to add (you just need to add a JVM option) and it doesn’t require additional dependencies like nREPL. Let’s update the Dockerfile:

Dockerfile
# ...
EXPOSE 7888
ENTRYPOINT ["java", "-Dclojure.server.repl={:port 7888 :accept clojure.core.server/repl}", "-jar", "standalone.jar"]

The socket REPL will be listening on port 7888. If we redeploy the app now, the REPL will be started, but we won’t be able to connect to it. That’s because we haven’t exposed the service through Fly proxy. We can do this by adding the socket REPL as a service in the [services] section in fly.toml.

However, doing this will also expose the REPL port to the public. This means that anyone can connect to your REPL and possibly mess with your app. Instead, what we want to do is to configure the socket REPL as a private service.

By default, all Fly apps in your organisation live in the same private network. This private network, called 6PN, connects the apps in your organisation through WireGuard tunnels (a VPN) using IPv6. Fly private services aren’t exposed to the public internet but can be reached from this private network. We can then use WireGuard to connect to this private network to reach our socket REPL.

Fly VMs are also configured with the hostname fly-local-6pn, which maps to its 6PN address. This is analogous to localhost, which points to your loopback address 127.0.0.1. To expose a service to 6PN, all we have to do is bind or serve it to fly-local-6pn instead of the usual 0.0.0.0. We have to update the socket REPL options to:

Dockerfile
# ...
ENTRYPOINT ["java", "-Dclojure.server.repl={:port 7888,:address \"fly-local-6pn\",:accept clojure.core.server/repl}", "-jar", "standalone.jar"]

After redeploying, we can use the fly proxy command to forward the port from the remote server to our local machine.[10]

$ fly proxy 7888:7888
# Proxying local port 7888 to remote [blue-water-6489.internal]:7888

In another shell, run:

$ rlwrap nc localhost 7888
# user=>

Now we have a REPL connected to the production app! rlwrap is used for readline functionality, e.g. up/down arrow keys, vi bindings. Of course, you can also connect to it from your editor.

Deploy with GitHub Actions

If you’re using GitHub, we can also set up automatic deployments on pushes/PRs with GitHub Actions. All you need is to create the workflow file:

.github/workflows/fly.yaml
name: Fly Deploy
on:
  push:
    branches:
      - main
  workflow_dispatch:

jobs:
  deploy:
    name: Deploy app
    runs-on: ubuntu-latest
    concurrency: deploy-group
    steps:
      - uses: actions/checkout@v4
      - uses: superfly/flyctl-actions/setup-flyctl@master
      - run: flyctl deploy --remote-only
        env:
          FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}

To get this to work, you’ll need to create a deploy token from your app’s dashboard. Then, in your GitHub repo, create a new repository secret called FLY_API_TOKEN with the value of your deploy token. Now, whenever you push to the main branch, this workflow will automatically run and deploy your app. You can also manually run the workflow from GitHub because of the workflow_dispatch option.

End

As always, all the code is available on GitHub. Originally, this post was just about deploying to Fly.io, but along the way, I kept adding on more stuff until it essentially became my version of the user manager example app. Anyway, hope this post provided a good view into web development with Clojure. As a bonus, here are some additional resources on deploying Clojure apps:


  1. The way Fly.io works under the hood is pretty clever. Instead of running the container image with a runtime like Docker, the image is unpacked and “loaded” into a VM. See this video explanation for more details. ↩︎

  2. If you’re interested in learning Clojure, my recommendation is to follow the official getting started guide and join the Clojurians Slack. Also, read through this list of introductory resources. ↩︎

  3. Kit was a big influence on me when I first started learning web development in Clojure. I never used it directly, but I did use their library choices and project structure as a base for my own projects. ↩︎

  4. There’s no “Rails” for the Clojure ecosystem (yet?). The prevailing opinion is to build your own “framework” by composing different libraries together. Most of these libraries are stable and are already used in production by big companies, so don’t let this discourage you from doing web development in Clojure! ↩︎

  5. There might be some keys that you add or remove, but the structure of the config file stays the same. ↩︎

  6. “assoc” (associate) is a Clojure slang that means to add or update a key-value pair in a map. ↩︎

  7. For more details on how basic authentication works, check out the specification. ↩︎

  8. Here’s a cool resource I found when researching Java Dockerfiles: WhichJDK. It provides a comprehensive comparison of the different JDKs available and recommendations on which one you should use. ↩︎

  9. Another (non-technically important) argument for live/production REPLs is just because it’s cool. Ever since I read the story about NASA’s programmers debugging a spacecraft through a live REPL, I’ve always wanted to try it at least once. ↩︎

  10. If you encounter errors related to WireGuard when running fly proxy, you can run fly doctor, which will hopefully detect issues with your local setup and also suggest fixes for them. ↩︎


Advent of Code 2024 in Zig

This post is about seven months late, but here are my takeaways from Advent of Code 2024. It was my second time participating, and this time I actually managed to complete it.[1] My goal was to learn a new language, Zig, and to improve my DSA and problem-solving skills.

If you’re not familiar, Advent of Code is an annual programming challenge that runs every December. A new puzzle is released each day from December 1st to the 25th. There’s also a global leaderboard where people (and AI) race to get the fastest solves, but I personally don’t compete in it, mostly because I want to do it at my own pace.

I went with Zig because I have been curious about it for a while, mainly because of its promise of being a better C and because TigerBeetle (one of the coolest databases now) is written in it. Learning Zig felt like a good way to get back into systems programming, something I’ve been wanting to do after a couple of chaotic years of web development.

This post is mostly about my setup, results, and the things I learned from solving the puzzles. If you’re more interested in my solutions, I’ve also uploaded my code and solution write-ups to my GitHub repository.

My Advent of Code results page

Project Setup

There were several Advent of Code templates in Zig that I looked at as a reference for my development setup, but none of them really clicked with me. I ended up just running my solutions directly using zig run for the whole event. It wasn’t until after the event ended that I properly learned Zig’s build system and reorganised my project.

Here’s what the project structure looks like now:

.
├── src
│   ├── days
│   │   ├── data
│   │   │   ├── day01.txt
│   │   │   ├── day02.txt
│   │   │   └── ...
│   │   ├── day01.zig
│   │   ├── day02.zig
│   │   └── ...
│   ├── bench.zig
│   └── run.zig
└── build.zig

The project is powered by build.zig, which defines several commands:

  1. Build
    • zig build - Builds all of the binaries for all optimisation modes.
  2. Run
    • zig build run - Runs all solutions sequentially.
    • zig build run -Day=XX - Runs the solution of the specified day only.
  3. Benchmark
    • zig build bench - Runs all benchmarks sequentially.
    • zig build bench -Day=XX - Runs the benchmark of the specified day only.
  4. Test
    • zig build test - Runs all tests sequentially.
    • zig build test -Day=XX - Runs the tests of the specified day only.

You can also pass the optimisation mode that you want to any of the commands above with the -Doptimize flag.

Under the hood, build.zig compiles src/run.zig when you call zig build run, and src/bench.zig when you call zig build bench. These files are templates that import the solution for a specific day from src/days/dayXX.zig. For example, here’s what src/run.zig looks like:

src/run.zig
const std = @import("std");
const puzzle = @import("day"); // Injected by build.zig

pub fn main() !void {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const allocator = arena.allocator();

    std.debug.print("{s}\n", .{puzzle.title});
    _ = try puzzle.run(allocator, true);
    std.debug.print("\n", .{});
}

The day module is an anonymous import injected dynamically by build.zig during compilation. This allows a single run.zig or bench.zig to be reused for all solutions, avoiding boilerplate in the solution files. Here’s a simplified version of my build.zig file that shows how this works:

build.zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    const run_all = b.step("run", "Run all days");
    const day_option = b.option(usize, "ay", ""); // The `-Day` option

    // Generate build targets for all 25 days.
    for (1..26) |day| {
        const day_zig_file = b.path(b.fmt("src/days/day{d:0>2}.zig", .{day}));

        // Create an executable for running this specific day.
        const run_exe = b.addExecutable(.{
            .name = b.fmt("run-day{d:0>2}", .{day}),
            .root_source_file = b.path("src/run.zig"),
            .target = target,
            .optimize = optimize,
        });

        // Inject the day-specific solution file as the anonymous module `day`.
        run_exe.root_module.addAnonymousImport("day", .{ .root_source_file = day_zig_file });

        // Install the executable so it can be run.
        b.installArtifact(run_exe);

        // ...
    }
}

My actual build.zig has some extra code that builds the binaries for all optimisation modes.

This setup is pretty barebones. I’ve seen other templates do cool things like scaffold files, download puzzle inputs, and even submit answers automatically. Since I wrote my build.zig after the event ended, I didn’t get to use it while solving the puzzles. I might add these features if I decide to do Advent of Code again this year with Zig.

Self-Imposed Constraints

While there are no rules to Advent of Code itself, to make things a little more interesting, I set a few constraints and rules for myself:

  1. The code must be readable. By “readable”, I mean the code should be straightforward and easy to follow. No unnecessary abstractions. I should be able to come back to the code months later and still understand (most of) it.
  2. Solutions must be a single file. No external dependencies. No shared utilities module. Everything needed to solve the puzzle should be visible in that one solution file.
  3. The total runtime must be under one second.[2] All solutions, when run sequentially, should finish in under one second combined. I set this constraint to push my performance engineering skills.
  4. Parts should be solved separately. This means: (1) no solving both parts simultaneously, and (2) no doing extra work in part one that makes part two faster. The aim of this is to get a clear idea of how long each part takes on its own.
  5. No concurrency or parallelism. Solutions must run sequentially on a single thread. This keeps the focus on the efficiency of the algorithm. I can’t speed up slow solutions by using multiple CPU cores.
  6. No ChatGPT. No Claude. No AI help. I want to train myself, not the LLM. I can look at other people’s solutions, but only after I have given my best effort at solving the problem.
  7. Follow the constraints of the input file. The solution doesn’t have to work for all possible scenarios, but it should work for all valid inputs. If the input file only contains 8-bit unsigned integers, the solution doesn’t have to handle larger integer types.
  8. Hardcoding is allowed. For example: size of the input, number of rows and columns, etc. Since the input is known at compile-time, we can skip runtime parsing and just embed it into the program using Zig’s @embedFile.

Most of these constraints are designed to push me to write clearer, more performant code. I also wanted my code to look like it was taken straight from TigerBeetle’s codebase (minus the assertions).[3] Lastly, I just thought it would make the experience more fun.

Favourite Puzzles

From all of the puzzles, here are my top 3 favourites:

  1. Day 6: Guard Gallivant - This is my slowest day (in benchmarks), but also the one I learned the most from. Some of these learnings include: using vectors to represent directions, padding 2D grids, metadata packing, system endianness, etc.
  2. Day 17: Chronospatial Computer - I love reverse engineering puzzles. I used to do a lot of these in CTFs during my university days. The best thing I learned from this day is the realisation that we can use different integer bases to optimise data representation. This helped improve my runtimes in the later days 22 and 23.
  3. Day 21: Keypad Conundrum - This one was fun. My gut told me it could be solved greedily by always choosing the best move. It was right. Though I did have to scroll Reddit for a bit to figure out the step I was missing: you have to visit the farthest keypads first. This is also my longest solution file (almost 400 lines) because I hardcoded the best-moves table.

Honourable mention:

  1. Day 24: Crossed Wires - Another reverse engineering puzzle. Confession: I didn’t solve this myself during the event. After 23 brutal days, my brain was too tired, so I copied a random Python solution from Reddit. When I retried it later, it turned out to be pretty fun. I still couldn’t find a solution I was satisfied with though.

Programming Patterns and Zig Tricks

During the event, I learned a lot about Zig and performance, and also developed some personal coding conventions. Some of these are Zig-specific, but most are universal and can be applied across languages. This section covers general programming and Zig patterns I found useful. The next section will focus on performance-related tips.

Comptime

Zig’s flagship feature, comptime, is surprisingly useful. I knew Zig uses it for generics and that people do clever metaprogramming with it, but I didn’t expect to be using it so often myself.

My main use for comptime was to generate puzzle-specific types. All my solution files follow the same structure, with a DayXX function that takes some parameters (usually the input length) and returns a puzzle-specific type, e.g.:

src/days/day01.zig
fn Day01(comptime length: usize) type {
    return struct {
        const Self = @This();
        
        left: [length]u32 = undefined,
        right: [length]u32 = undefined,

        fn init(input: []const u8) !Self {}

        // ...
    };
}

This lets me instantiate the type with a size that matches my input:

src/days/day01.zig
// Here, `Day01` is called with the size of my actual input.
pub fn run(_: std.mem.Allocator, is_run: bool) ![3]u64 {
    // ...
    const input = @embedFile("./data/day01.txt");
    var puzzle = try Day01(1000).init(input);
    // ...
}

// Here, `Day01` is called with the size of my test input.
test "day 01 part 1 sample 1" {
    var puzzle = try Day01(6).init(sample_input);
    // ...
}

This allows me to reuse logic across different inputs while still hardcoding the array sizes. Without comptime, I’d have to either create a separate function for each of my different inputs or dynamically allocate memory, since I couldn’t hardcode the array size.

I also used comptime to shift some computation to compile-time to reduce runtime overhead. For example, on day 4, I needed a function to check whether a string matches either "XMAS" or its reverse, "SAMX". A pretty simple function that you can write as a one-liner in Python:

example.py
def matches(pattern, target):
    return target == pattern or target == pattern[::-1]

Typically, a function like this requires some dynamic allocation to create the reversed string, since the length of the string is only known at runtime.[4] For this puzzle, since the words to reverse are known at compile-time, we can do something like this:

src/days/day04.zig
fn matches(comptime word: []const u8, slice: []const u8) bool {
    var reversed: [word.len]u8 = undefined;
    @memcpy(&reversed, word);
    std.mem.reverse(u8, &reversed);
    return std.mem.eql(u8, word, slice) or std.mem.eql(u8, &reversed, slice);
}

This creates a separate function for each word I want to reverse.[5] Each function has an array with the same size as its word. This removes the need for dynamic allocation and makes the code run faster. As a bonus, Zig rejects the call when the word isn’t compile-time known, so you get an immediate compile error if you pass in a runtime value.

Optional Types

A common pattern in C is to return special sentinel values to denote missing values or errors, e.g. -1, 0, or NULL. In fact, I did this on day 13 of the challenge:

src/days/day13.zig
// We won't ever get 0 as a result, so we use it as a sentinel error value.
fn count_tokens(a: [2]u8, b: [2]u8, p: [2]i64) u64 {
    const numerator = @abs(p[0] * b[1] - p[1] * b[0]);
    const denominator = @abs(@as(i32, a[0]) * b[1] - @as(i32, a[1]) * b[0]);
    return if (numerator % denominator != 0) 0 else numerator / denominator;
}

// Then in the caller, skip if the return value is 0.
if (count_tokens(a, b, p) == 0) continue;

This works, but it’s easy to forget to check for those values, or worse, to accidentally treat them as valid results. Zig improves on this with optional types. If a function might not return a value, you can return ?T instead of T. This also forces the caller to handle the null case. Unlike C, null isn’t a pointer but a more general concept. Zig treats null as the absence of a value for any type, just like Rust’s Option<T>.

The count_tokens function can be refactored to:

src/days/day13.zig
// Return null instead if there's no valid result.
fn count_tokens(a: [2]u8, b: [2]u8, p: [2]i64) ?u64 {
    const numerator = @abs(p[0] * b[1] - p[1] * b[0]);
    const denominator = @abs(@as(i32, a[0]) * b[1] - @as(i32, a[1]) * b[0]);
    return if (numerator % denominator != 0) null else numerator / denominator;
}

// The caller is now forced to handle the null case.
if (count_tokens(a, b, p)) |n_tokens| {
    // logic only runs when n_tokens is not null.
}

Zig also has a concept of error unions, where a function returns either a value or an error. In Rust, this is Result<T, E>. You could also use an error union instead of an optional for count_tokens; Zig doesn’t force a single approach. I come from Clojure, where returning nil for an error or missing value is common.
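
For comparison, here’s a sketch of what an error-union version of count_tokens could look like (the error set and its name are my own invention, not from the actual solution):

```zig
const NoSolution = error{NoSolution};

// Same logic as before, but the "no result" case is now an explicit error.
fn count_tokens(a: [2]u8, b: [2]u8, p: [2]i64) NoSolution!u64 {
    const numerator = @abs(p[0] * b[1] - p[1] * b[0]);
    const denominator = @abs(@as(i32, a[0]) * b[1] - @as(i32, a[1]) * b[0]);
    if (numerator % denominator != 0) return error.NoSolution;
    return numerator / denominator;
}

// The caller can propagate the error with `try` or handle it with `catch`:
// const n_tokens = count_tokens(a, b, p) catch continue;
```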

Grid Padding

This year had a lot of 2D grid puzzles (arguably too many). A common feature of grid-based algorithms is the out-of-bounds check. Here’s what it usually looks like:

example.zig
fn dfs(map: [][]u8, position: [2]i8) u32 {
    const x, const y = position;
    
    // Bounds check here.
    if (x < 0 or y < 0 or x >= map.len or y >= map[0].len) return 0;

    if (map[x][y] == .visited) return 0;
    map[x][y] = .visited;

    var result: u32 = 1;
    for (directions) |direction| {
        result += dfs(map, position + direction);
    }
    return result;
}

This is a typical recursive DFS function. After doing a lot of this, I discovered a nice trick that not only improves code readability, but also its performance. The trick here is to pad the grid with sentinel characters that mark out-of-bounds areas, i.e. add a border to the grid.

Here’s an example from day 6:

Original map:               With borders added:
                            ************
....#.....                  *....#.....*
.........#                  *.........#*
..........                  *..........*
..#.......                  *..#.......*
.......#..        ->        *.......#..*
..........                  *..........*
.#..^.....                  *.#..^.....*
........#.                  *........#.*
#.........                  *#.........*
......#...                  *......#...*
                            ************

You can use any value for the border, as long as it doesn’t conflict with valid values in the grid. With the border in place, the bounds check becomes a simple equality comparison:

example.zig
const border = '*';

fn dfs(map: [][]u8, position: [2]i8) u32 {
    const x, const y = position;
    if (map[x][y] == border) { // We are out of bounds
        return 0;
    }
    // ...
}

This is much more readable than the previous code. Plus, it’s also faster since we’re only doing one equality check instead of four range checks.

That said, this isn’t a one-size-fits-all solution. It only works for algorithms that traverse the grid one step at a time. If your logic jumps multiple tiles, it can still go out of bounds (unless you widen the border to account for this). This approach also uses a bit more memory than the regular one, as you have to store extra characters.

SIMD Vectors

This could also go in the performance section, but I’m including it here because the biggest benefit I get from using SIMD in Zig is the improved code readability. Because Zig has first-class support for vector types, you can write elegant and readable code that also happens to be faster.

If you’re not familiar with vectors, they are a special collection type used for Single instruction, multiple data (SIMD) operations. SIMD allows you to perform computation on multiple values in parallel using only a single CPU instruction, which often leads to some performance boosts.[6]

I mostly use vectors to represent positions and directions, e.g. for traversing a grid. Instead of writing code like this:

example.zig
next_position = .{ position[0] + direction[0], position[1] + direction[1] };

You can represent position and direction as 2-element vectors and write code like this:

example.zig
next_position = position + direction;

This is much nicer than the previous version!

Day 25 is another good example of a problem that can be solved elegantly using vectors:

src/days/day25.zig
var result: u64 = 0;
for (self.locks.items) |lock| { // lock is a vector
    for (self.keys.items) |key| { // key is also a vector
        const fitted = lock + key > @as(@Vector(5, u8), @splat(5));
        const is_overlap = @reduce(.Or, fitted);
        result += @intFromBool(!is_overlap);
    }
}

Expressing the logic as vector operations makes the code cleaner since you don’t have to write loops and conditionals as you typically would in a traditional approach.

Performance Tips

The tips below are general performance techniques that often help, but like most things in software engineering, “it depends”. These might work 80% of the time, but performance is often highly context-specific. You should benchmark your code instead of blindly following what other people say.

This section would’ve been more fun with concrete examples, step-by-step optimisations, and benchmarks, but that would’ve made the post way too long. Hopefully, I’ll get to write something like that in the future.[7]

Minimise Allocations

Whenever possible, prefer static allocation. It’s cheaper since it just involves moving the stack pointer, whereas dynamic allocation carries overhead from the allocator machinery. That said, it’s not always the right choice, since it has some limitations: stack size is limited, the memory size must be compile-time known, its lifetime is tied to the current stack frame, etc.

If you need to do dynamic allocations, try to reduce the number of times you call the allocator. The number of allocations you do matters more than the amount of memory you allocate. More allocations mean more bookkeeping, synchronisation, and sometimes syscalls.

A simple but effective way to reduce allocations is to reuse buffers, whether they’re statically or dynamically allocated. Here’s an example from day 10. For each trail head, we want to create a set of trail ends reachable from it. The naive approach is to allocate a new set every iteration:

src/days/day10.zig
for (self.trail_heads.items) |trail_head| {
    var trail_ends = std.AutoHashMap([2]u8, void).init(self.allocator);
    defer trail_ends.deinit();
    
    // Set building logic...
}

What you can do instead is to allocate the set once before the loop. Then, each iteration, you reuse the set by emptying it without freeing the memory. For Zig’s std.AutoHashMap, this can be done using the clearRetainingCapacity method:

src/days/day10.zig
var trail_ends = std.AutoHashMap([2]u8, void).init(self.allocator);
defer trail_ends.deinit();

for (self.trail_heads.items) |trail_head| {
    trail_ends.clearRetainingCapacity();
    
    // Set building logic...
}

If you use static arrays, you can also just overwrite existing data instead of clearing it.

A step up from this is to reuse multiple buffers. The simplest form of this is to reuse two buffers, i.e. double buffering. Here’s an example from day 11:

src/days/day11.zig
// Initialise two hash maps that we'll alternate between.
var frequencies: [2]std.AutoHashMap(u64, u64) = undefined;
for (0..2) |i| frequencies[i] = std.AutoHashMap(u64, u64).init(self.allocator);
defer for (0..2) |i| frequencies[i].deinit();

var id: usize = 0;
for (self.stones) |stone| try frequencies[id].put(stone, 1);

for (0..n_blinks) |_| {
    var old_frequencies = &frequencies[id % 2];
    var new_frequencies = &frequencies[(id + 1) % 2];
    id += 1;

    defer old_frequencies.clearRetainingCapacity();

    // Do stuff with both maps...
}

Here we have two maps to count the frequencies of stones across iterations. Each iteration will build up new_frequencies with the values from old_frequencies. Doing this reduces the number of allocations to just 2 (the number of buffers). The tradeoff here is that it makes the code slightly more complex.

Make Your Data Smaller

A common piece of performance advice is to develop “mechanical sympathy”: understand how your code is processed by your computer. An example of this is structuring your data so it works better with your CPU, e.g. keeping related data close in memory to take advantage of cache locality.

Reducing the size of your data helps with this. Smaller data means more of it can fit in cache. One way to shrink your data is through bit packing. This depends heavily on your specific data, so you’ll need to use your judgement to tell whether this would work for you. I’ll just share some examples that worked for me.

The first example is in day 6 part two, where you have to detect a loop, which happens when you revisit a tile from the same direction as before. To track this, you could use a map or a set to store the tiles and visited directions. A more efficient option is to store this direction metadata in the tile itself.

There are only four tile types, which means you only need two bits to represent the tile types as an enum. If the enum size is one byte, here’s what the tiles look like in memory:

.obstacle -> 00000000
.path     -> 00000001
.visited  -> 00000010
.exit     -> 00000011

As you can see, the upper six bits are unused. We can store the direction metadata in the upper four bits. One bit for each direction. If a bit is set, it means that we’ve already visited the tile in this direction. Here’s an illustration of the memory layout:

        direction metadata   tile type
           ┌─────┴─────┐   ┌─────┴─────┐
┌────────┬─┴─┬───┬───┬─┴─┬─┴─┬───┬───┬─┴─┐
│ Tile:  │ 1 │ 0 │ 0 │ 0 │ 0 │ 0 │ 1 │ 0 │
└────────┴─┬─┴─┬─┴─┬─┴─┬─┴───┴───┴───┴───┘
   up bit ─┘   │   │   └─ left bit
    right bit ─┘ down bit

If your language supports struct packing, you can express this layout directly:[8]

src/days/day06.zig
const Tile = packed struct(u8) {
    const TileType = enum(u4) { obstacle, path, visited, exit };

    up: u1 = 0,
    right: u1 = 0,
    down: u1 = 0,
    left: u1 = 0,
    tile: TileType,

    // ...
}

Doing this avoids extra allocations and improves cache locality. Since the direction metadata is colocated with the tile type, everything fits together in cache. Accessing the directions only requires a few bitwise operations instead of a fetch from another region of memory.

Another way to do this is to represent your data using alternate number bases. Here’s an example from day 23. Computers are represented as two-character strings made up of only lowercase letters, e.g. "bc", "xy", etc. Instead of storing this as a [2]u8 array, you can convert it into a base-26 number and store it as a u16.[9]

Here’s the idea: map 'a' to 0, 'b' to 1, up to 'z' at 25. Each character in the string becomes a digit of a base-26 number. For example, "bc" ([2]u8{ 'b', 'c' }) has the base-26 digits 1 and 2, which is the base-10 number 28 (1×26 + 2 = 28).

While they take the same amount of space (2 bytes), a u16 has some benefits over a [2]u8:

  1. It fits in a single register, whereas you need two for the array.
  2. Comparison is faster as there is only a single value to compare.
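
The conversion itself is tiny. Here’s roughly what it looks like in Zig (the function name is mine, for illustration):

```zig
// Convert a two-letter computer name into a base-26 number stored as a u16.
fn encode(name: [2]u8) u16 {
    // 'a' maps to 0 and 'z' to 25; the first character is the high digit.
    return @as(u16, name[0] - 'a') * 26 + (name[1] - 'a');
}

// encode(.{ 'b', 'c' }) == 1 * 26 + 2 == 28
```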

Reduce Branching

I won’t explain branchless programming here; Algorithmica explains it way better than I can. While modern compilers are often smart enough to compile away branches, they don’t catch everything. I still recommend writing branchless code whenever it makes sense. It also has the added benefit of reducing the number of codepaths in your program.

Again, since performance is very context-dependent, I’ll just show you some patterns I use. Here’s one that comes up often:

src/days/day02.zig
if (is_valid_report(report)) {
    result += 1;
}

Instead of the branch, cast the bool into an integer directly:

src/days/day02.zig
result += @intFromBool(is_valid_report(report))

Another example is from day 6 (again!). Recall that to know if a tile has been visited from a certain direction, we have to check its direction bit. Here’s one way to do it:

src/days/day06.zig
fn has_visited(tile: Tile, direction: Direction) bool {
    switch (direction) {
        .up => return tile.up == 1,
        .right => return tile.right == 1,
        .down => return tile.down == 1,
        .left => return tile.left == 1,
    }
}

This works, but it introduces a few branches. We can make it branchless using bitwise operations:

src/days/day06.zig
fn has_visited(tile: Tile, direction: Direction) bool {
    const int_tile = std.mem.nativeToBig(u8, @bitCast(tile));
    const mask = direction.mask();
    const bits = int_tile & 0x0f; // Keep only the four direction bits
    return bits & mask == mask;
}

While this is arguably cryptic and less readable, it does perform better than the switch version.

Avoid Recursion

The final performance tip is to prefer iterative code over recursion. Recursive functions bring the overhead of allocating stack frames. While recursive code is more elegant, it’s also often slower unless your language’s compiler can optimise it away, e.g. via tail-call optimisation. As far as I know, Zig doesn’t have this, though I might be wrong.

Recursion also has the risk of causing a stack overflow if the execution isn’t bounded. This is why code that is mission- or safety-critical avoids recursion entirely. It’s in TigerBeetle’s TIGERSTYLE and also NASA’s Power of Ten.

Iterative code can be harder to write in some cases, e.g. DFS maps naturally to recursion, but most of the time it is significantly faster, more predictable, and safer than the recursive alternative.
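
The usual workaround is to manage the stack yourself. Here’s a minimal sketch of the earlier DFS rewritten iteratively with an explicit stack, combined with the grid-padding trick from before (the marker values and direction table are placeholders, not from my actual solutions):

```zig
const std = @import("std");

const border: u8 = '*';
const visited: u8 = 'X';
const directions = [_]@Vector(2, i8){ .{ 0, 1 }, .{ 1, 0 }, .{ 0, -1 }, .{ -1, 0 } };

fn dfs(allocator: std.mem.Allocator, map: [][]u8, start: @Vector(2, i8)) !u32 {
    // Replace the call stack with an explicit, heap-allocated one.
    var stack = std.ArrayList(@Vector(2, i8)).init(allocator);
    defer stack.deinit();
    try stack.append(start);

    var result: u32 = 0;
    while (stack.popOrNull()) |position| {
        const x: usize = @intCast(position[0]);
        const y: usize = @intCast(position[1]);
        // With a padded grid, the bounds check is a single comparison.
        if (map[x][y] == border or map[x][y] == visited) continue;
        map[x][y] = visited;
        result += 1;
        for (directions) |direction| {
            try stack.append(position + direction);
        }
    }
    return result;
}
```

The tradeoff is an explicit allocation for the stack, but the traversal depth is now bounded by the size of the grid rather than by the call stack.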

Benchmarks

I ran benchmarks for all 25 solutions in each of Zig’s optimisation modes. You can find the full results and the benchmark script in my GitHub repository. All benchmarks were done on an Apple M3 Pro.

As expected, ReleaseFast produced the best result, with a total runtime of 85.1 ms. I’m quite happy with this, considering the two constraints that limited the optimisations I could apply:

  • Parts should be solved separately - Some days can be solved in a single go, e.g. day 10 and day 13, which could’ve saved a few milliseconds.
  • No concurrency or parallelism - My slowest days are the compute-heavy ones that are easily parallelisable, e.g. day 6, day 19, and day 22. Without this constraint, I could probably reach sub-20 milliseconds total(?), but that’s for another time.

You can see the full benchmarks for ReleaseFast in the table below:

Day Title Parsing (µs) Part 1 (µs) Part 2 (µs) Total (µs)
1 Historian Hysteria 23.5 15.5 2.8 41.8
2 Red-Nosed Reports 42.9 0.0 11.5 54.4
3 Mull It Over 0.0 7.2 16.0 23.2
4 Ceres Search 5.9 0.0 0.0 5.9
5 Print Queue 22.3 0.0 4.6 26.9
6 Guard Gallivant 14.0 25.2 24,331.5 24,370.7
7 Bridge Repair 72.6 321.4 9,620.7 10,014.7
8 Resonant Collinearity 2.7 3.3 13.4 19.4
9 Disk Fragmenter 0.8 12.9 137.9 151.7
10 Hoof It 2.2 29.9 27.8 59.9
11 Plutonian Pebbles 0.1 43.8 2,115.2 2,159.1
12 Garden Groups 6.8 164.4 249.0 420.3
13 Claw Contraption 14.7 0.0 0.0 14.7
14 Restroom Redoubt 13.7 0.0 0.0 13.7
15 Warehouse Woes 14.6 228.5 458.3 701.5
16 Reindeer Maze 12.6 2,480.8 9,010.7 11,504.1
17 Chronospatial Computer 0.1 0.2 44.5 44.8
18 RAM Run 35.6 15.8 33.8 85.2
19 Linen Layout 10.7 11,890.8 11,908.7 23,810.2
20 Race Condition 48.7 54.5 54.2 157.4
21 Keypad Conundrum 0.0 1.7 22.4 24.2
22 Monkey Market 20.7 0.0 11,227.7 11,248.4
23 LAN Party 13.6 22.0 2.5 38.2
24 Crossed Wires 5.0 41.3 14.3 60.7
25 Code Chronicle 24.9 0.0 0.0 24.9

A weird thing I found when benchmarking is that for day 6 part two, ReleaseSafe actually ran faster than ReleaseFast (13,189.0 µs vs 24,370.7 µs). Their outputs are the same, but for some reason, ReleaseSafe is faster even with the safety checks still intact.

The Zig compiler is still very much a moving target, so I don’t want to dig too deep into this, as I’m guessing this might be a bug in the compiler. This weird behaviour might just disappear after a few compiler version updates.

Reflections

Looking back, I’m really glad I decided to do Advent of Code and followed through to the end. I learned a lot of things. Some are useful in my professional work, some are more like random bits of trivia. Going with Zig was a good choice too. The language is small, simple, and gets out of your way. I learned more about algorithms and concepts than the language itself.

Besides what I’ve already mentioned earlier, here are some examples of the things I learned:

Some of my self-imposed constraints and rules ended up being helpful. I can still (mostly) understand the code I wrote a few months ago. Putting all of the code in a single file made it easier to read since I don’t have to context switch to other files all the time.

However, some of them did backfire a bit, e.g. the two constraints that limit how I can optimise my code. Another one is the “hardcoding allowed” rule. I used a lot of magic numbers, which helped improve performance, but I didn’t document them, so after a while I didn’t even remember how I got them. I’ve since gone back and added explanations in my write-ups, but next time I’ll remember to at least leave comments.

One constraint I’ll probably remove next time is the no concurrency rule. It’s the biggest contributor to the total runtime of my solutions. I don’t do a lot of concurrent programming, even though my main language at work is Go, so next time it might be a good idea to use Advent of Code to level up my concurrency skills.

I also spent way more time on these puzzles than I originally expected. I optimised and rewrote my code multiple times. I also rewrote my write-ups a few times to make them easier to read. This is by far my longest side project yet. It was a lot of fun, but it also took a lot of time and effort. I almost gave up on the write-ups (and this blog post) because I didn’t want to explain my awful day 15 and day 16 code. I ended up taking a break for a few months before finishing it, which is why this post is published in August lol.

Just for fun, here’s a photo of some of my notebook sketches that helped me visualise my solutions. See if you can guess which days these are from:

Photos of my notebook sketches

What’s Next?

So… would I do it again? Probably, though I’m not making any promises. If I do join this year, I’ll probably stick with Zig. I had my eyes on Zig since the start of 2024, so Advent of Code was the perfect excuse to learn it. This year, there aren’t any languages in particular that caught my eye, so I’ll just keep using Zig, especially since I have a proper setup ready.

If you haven’t tried Advent of Code, I highly recommend checking it out this year. It’s a great excuse to learn a new language, improve your problem-solving skills, or just learn something new. If you’re eager, you can also do the previous years’ puzzles as they’re still available.

One of the best aspects of Advent of Code is the community. The Advent of Code subreddit is a great place for discussion. You can ask questions and also see other people’s solutions. Some people also post really cool visualisations like this one. They also have memes!


  1. I failed my first attempt horribly with Clojure during Advent of Code 2023. Once I reached the later half of the event, I just couldn’t solve the problems with a purely functional style. I could’ve pushed through using imperative code, but I stubbornly chose not to and gave up… ↩︎

  2. The original constraint was that each solution must run in under one second. As it turned out, the code was faster than I expected, so I increased the difficulty. ↩︎

  3. TigerBeetle’s code quality and engineering principles are just wonderful. ↩︎

  4. You can implement this function without any allocation by mutating the string in place or by iterating over it twice, which is probably faster than my current implementation. I kept it as-is as a reminder of what comptime can do. ↩︎

  5. As a bonus, I was curious as to what this looks like compiled, so I listed all the functions in this binary in GDB and found:

    72: static bool day04.Day04(140).matches__anon_19741;
    72: static bool day04.Day04(140).matches__anon_19750;

    It does generate separate functions! ↩︎

  6. Well, not always. The number of SIMD instructions depends on the machine’s native SIMD size. If the length of the vector exceeds it, Zig will compile it into multiple SIMD instructions. ↩︎

  7. Here’s a nice post on optimising day 9’s solution with Rust. It’s a good read if you’re into performance engineering or Rust techniques. ↩︎

  8. One thing about packed structs is that their layout is dependent on the system endianness. Most modern systems are little-endian, so the memory layout I showed is actually reversed. Thankfully, Zig has some useful functions to convert between endianness like std.mem.nativeToBig, which makes working with packed structs easier. ↩︎

  9. Technically, you can store two-digit base-26 numbers in a u10, as there are only 26² = 676 possible values. Most systems pad values to byte boundaries, though, so a u10 would still be stored as a u16, which is why I just went straight for it. ↩︎

Permalink

How to Get Started with Machine Learning (2026 Implementation Guide)

Moving from data collection to actual AI software development and machine learning implementation is no longer just a nice-to-have; it is how businesses stay in business.

In 2026, businesses that invest in AI development services or partner with a reputable machine learning development company can finally turn all that raw data into AI-powered business intelligence (BI) that actually works.

Businesses that wait are already behind. Their competitors are busy automating workflows, personalizing customer interactions, and growing faster thanks to machine learning. This guide walks through how to get started with ML in 2026 and highlights common mistakes that trip up most newcomers.

So why step in now? Three things make 2026 the year everything shifts for AI:

  • Mature ecosystems: Tools are ready to use. Platforms like AWS SageMaker, Google Vertex AI, and new options for private deployments make machine learning more accessible than ever.
  • Regulatory clarity: The rules are now clear. GDPR, CCPA, and the new AI Act lay out exactly how to use AI responsibly.
  • Competitive necessity: The pressure is on. Whether it is predicting customer churn or automating paperwork, machine learning has moved beyond trials. It is just the way business operates now.

The 5-Step Roadmap to AI Integration

Learning to use AI effectively is best approached as a journey rather than a single step. Organizations can follow five clear stages that build on one another: defining the problem, preparing the data, selecting the appropriate model, testing through a pilot, and finally scaling the solution thoughtfully and responsibly. Each stage plays a critical role. By progressing methodically, teams can avoid costly mistakes while giving their AI initiatives a strong foundation for long-term success.

Step ❶: Problem Definition

Start by figuring out where AI can actually help. The best projects begin with a real business pain point and a measurable goal. Do not waste time on unclear ideas; focus on a specific, measurable goal.

Some typical examples? 

  • Churn prediction for subscription businesses.
  • Document automation for legal or finance teams.
  • Fraud detection in banking.

These are practical cases with a clear impact. They don’t need huge datasets, complex systems, or long setup times. They work, and they show the company that AI is real.

👉 McKinsey & Company estimates that AI and analytics could add $3.5 to $5.8 trillion in value each year across industries, showing the strong ROI of well-planned machine learning.

A good machine learning development company can identify the right starting point, help businesses secure that early win, and lay a foundation for greater achievements in AI software development.

Step ❷: Data Audit & Preparation

Strong models depend on strong data. Before building anything, businesses should take a good look at what they already have.

Key questions to consider include:

  •  Is the data fragmented across multiple systems? If so, efforts should be made to break down data silos and establish unified access.
  • Is the data consistent over time, or have calculation methods changed?
  • Can teams access the data while remaining fully compliant with security, privacy, and regulatory requirements?
  • Is the data clean, structured, and consistent? This may require removing duplicates, standardizing formats, and addressing missing or incomplete values.

Structured data, such as CRM records, is typically easier to manage and analyze. However, organizations should not overlook unstructured data, including emails, PDFs, and images, which often contain valuable insights. This is where AI development services add significant value by organizing and transforming unstructured information into formats that can be effectively analyzed. Even the most advanced models cannot compensate for poor-quality data. Simply put, reliable and well-prepared data is essential for achieving meaningful AI outcomes.

Structured vs. Unstructured Data Readiness

| Category | Structured Data | Unstructured Data |
| --- | --- | --- |
| Format | Organized, labeled | Raw, messy, no fixed format |
| Examples | CRM records, transactions | Emails, PDFs, images, videos |
| Ease of Use | Ready for ML models | Needs cleaning and processing |
| Preparation | Minimal work | Heavy preprocessing required |
| Use Cases | Churn prediction, fraud detection | Sentiment analysis, document automation |

Step ❸: Choosing the Right Model

Not every problem requires the same AI approach. The key is selecting the method that best fits the use case.

  • When labeled data is available and the goal is to predict a specific outcome, such as identifying customers likely to churn, supervised learning is often the most effective choice. If labeled data is not available, unsupervised learning can help uncover hidden patterns, such as grouping customers with similar behaviors.
  • For tasks involving large volumes of text, such as extracting key insights or summarizing contracts, large language models (LLMs) are particularly well-suited.

Choosing the right approach ensures that AI solutions remain practical, efficient, and aligned with business objectives.

Step ❹: Pilot & MVP

Avoid deploying AI across the entire organization at once, as this can introduce unnecessary risk and complexity. Instead, begin with a minimum viable product (MVP) or a focused pilot to validate the approach and gather insights before scaling.

Start small. Test with real data. See how it performs, and gather feedback from the people who use it. That builds trust and helps convince skeptics. Privacy‑first AI software development should test in secure environments and safeguard sensitive data.

Step ❺: Scaling & Optimization

If the pilot proves successful, the next step is to scale it thoughtfully. However, AI systems cannot simply be deployed and left unattended—they require ongoing monitoring and maintenance. As business conditions and data evolve, models can drift and lose accuracy. Organizations should continuously evaluate model performance, retrain with updated data, and monitor for bias, security, or compliance concerns. When managed effectively, AI-powered business intelligence (BI) can significantly transform how organizations analyze data and make decisions.

Reports run on their own, dashboards update in real time, and decisions happen faster. AI is now at the center of operations.

Key Takeaway

Bringing AI into the business is not a one-shot deal. Start small, prove it works, and then scale up carefully. Every step a business takes cuts down risk and builds momentum. With the right AI software development and machine learning development company, AI goes from experiment to essential, and helps businesses grow in a way that is smart, safe, and aligned with their goals.

Build vs. Buy in Machine Learning

| Category | Off‑the‑Shelf AI APIs | Custom AI Software Development |
| --- | --- | --- |
| Data Privacy | High risk of leakage, limited control over shared data | Privacy‑first, full control |
| Accuracy | Generic results | Tuned to your data |
| Cost | High per request, low upfront costs | Lower long‑term |
| Flexibility | Limited options | Full roadmap control |
| Integration | Quick plug‑and‑play | Tailored to existing systems |
| Scalability | May hit usage limits | Scales with your infrastructure |
| Support | Vendor‑dependent | In‑house expertise |
| Speed to Launch | Fast start | Longer build time |
| Ownership | No IP ownership | Full IP ownership |
| Customization | One‑size‑fits‑all | Designed for your needs |

Essential Tools for the AI Tech Stack in 2026

Core Languages

  • Python → Preferred for research and quick experiments, thanks to its vast libraries and vibrant community.
  • Clojure → It is gaining popularity in production thanks to its functional design, which boosts scalability and reliability.
  • Rust → Comes into play when companies require raw speed. In large-scale AI systems, it keeps operations fast and memory-safe.
  • Julia → It is great if companies are very involved in math or scientific computing.

📌 Note: People really see Clojure as a solid choice for production machine learning. 

Machine learning development company Flexiana uses Clojure for a reason- it helps them create systems that actually last.

  • Functional style → Since Clojure works with immutable data, businesses get fewer unexpected side effects, leading to fewer bugs creeping in. That is a big deal when businesses are running massive operations and need to trust their systems.
  • Concurrency → When it comes to handling lots of tasks at once, Clojure does the job well. It runs on the JVM, so it handles the heavy, parallel workloads businesses see in large machine learning pipelines.
  • Python interop → Flexiana runs production systems in Clojure but still trains models in Python. With libpython-clj, Python models can run directly inside Clojure. This way, teams get Python’s rich ML ecosystem plus Clojure’s stability- the best of both worlds.
  • Maintainability → Long‑term upkeep is easier with Clojure’s clean, composable design, especially when businesses are not just experimenting but actually running ML in production.
  • Ecosystem fit → Flexiana already has experience with Clojure. Keeping everything in the same language just makes their whole stack neater and more consistent.

Flexiana picks Clojure because it gives us control and reliability for real-world machine learning- without giving up the flexibility of Python when we need it. It is a solid balance between trying new things and keeping everything running smoothly.

Infrastructure

  • AWS SageMaker → It covers everything- training, deploying, monitoring- all in one spot.
  • Google Vertex AI → It organizes business datasets, pipelines, and deployed models.
  • Azure ML → It is a go-to if the team is already using Microsoft tools.
  • Privacy‑first local setups → On-prem or edge- help keep sensitive data protected.
  • Hybrid models → They give businesses cloud power while letting them keep control where they need it.
  • Containers & orchestration → Tools like Docker and Kubernetes, plus serverless endpoints on AWS and Google Cloud, keep business models portable and simple to run.

Libraries

  • Clojure → Users lean on scicloj.ml for building functional ML pipelines, and can reach major Python libraries via libpython-clj.
  • Python → PyTorch and TensorFlow are still the kings of deep learning.
  • Specialized → For something more specialized, Hugging Face leads in NLP, RAPIDS focuses on GPU data science, and LangChain handles LLM workflows.
  • Visualization → When businesses need to see their data, Plotly and Vega stand out, and now AI-powered dashboards are appearing too.

MLOps & Tooling

  • Experiment tracking → To track experiments, MLflow and Weights & Biases get the job done.
  • Monitoring → Evidently AI and Arize help businesses keep an eye on their models.
  • Version control → DVC and Git workflows manage both data and models.
  • Pipeline automation → Automate business pipelines with Kubeflow and Airflow.
  • CI/CD for AI → If businesses want CI/CD, GitHub Actions and Jenkins (with ML plugins) keep things moving.

Security & Privacy

AI has become part of everyday business operations, making data protection more important than ever. Organizations cannot afford mistakes when it comes to sensitive information or regulatory compliance. Because of this, companies are placing much greater emphasis on security and privacy when developing AI systems. Several approaches are commonly used to achieve this.

  • Federated Learning:
    Instead of sending all raw data to a central server, federated learning allows AI models to learn directly from data where it already exists. Only model updates are shared, not the actual data. This approach helps keep sensitive information private while still improving the model. It also supports compliance with privacy regulations such as GDPR and HIPAA.
  • Differential Privacy:
    Differential privacy protects individuals by introducing small amounts of random noise into datasets during analysis. This allows teams to detect useful patterns and insights without exposing personal or identifiable information.
  • Zero-Trust Architecture:
    Zero-trust security operates on the principle that no user or system is automatically trusted. Every request must be verified for identity and permission before access is granted. While strict, this model significantly reduces the risk of unauthorized access from both external threats and internal misuse.
  • Synthetic Data:
    In many situations, real data cannot be shared due to privacy restrictions. Synthetic data provides a useful alternative. It is artificially generated but designed to mimic the patterns of real datasets. This allows teams to train AI models effectively without compromising anyone’s privacy.
  • Data Consistency and Calculation Drift:
    AI systems can fail if the underlying data or calculations change unexpectedly. For example, modifying how metrics are measured or adjusting formulas can disrupt model predictions. Regular data audits help teams detect these issues early, ensuring that AI systems continue to perform reliably.
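As a toy illustration of the differential-privacy idea above, here is a hedged Python sketch. The function names and the unit sensitivity of a counting query are assumptions for the example, not a production mechanism:

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) noise via inverse-CDF sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1, u) * math.log(1 - 2 * abs(u))

def private_count(true_count: int, epsilon: float) -> float:
    """Release a count with noise calibrated to the privacy budget epsilon.

    A counting query has sensitivity 1: adding or removing one person
    changes the result by at most 1, so noise scale = sensitivity / epsilon.
    """
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon)
```

Smaller epsilon means more noise and stronger privacy; analysts still see useful aggregates, but no single individual's record is exposed.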

Emerging Trends Shaping AI Strategy

AI is evolving rapidly, and new technologies continue to influence how organizations design and deploy intelligent systems. Several important trends are currently shaping AI strategies.

  • Edge AI:
    Instead of relying entirely on cloud infrastructure, some AI models now run directly on devices such as smartphones, smartwatches, or other edge devices. Processing data locally reduces latency, improves response time, and enhances privacy since data does not always need to leave the device.
  • Green AI:
    Training large AI models can consume significant amounts of energy. Green AI focuses on improving efficiency by using smaller models, optimized computing techniques, and cleaner energy sources. The goal is to reduce environmental impact while also lowering operational costs.
  • AutoML (Automated Machine Learning):
    AutoML tools automate many complex machine learning tasks, such as model selection and hyperparameter tuning. This allows organizations with limited AI expertise to build effective models quickly, making AI development more accessible.
  • AI Governance:
    As AI systems become more widely used, proper oversight becomes essential. Organizations must be able to explain how their models make decisions and demonstrate that their systems operate fairly and responsibly. This involves maintaining audit trails, monitoring for bias, and clearly documenting models. Transparency is not only important for regulators but also for building trust with users and customers.

(Image: comprehensive machine learning and AI development stack)

Addressing the Biggest Obstacle: Privacy & Compliance

Key Regulations to Know

  • GDPR (EU): Strong data laws and severe fines for errors.
  • CCPA (California): Demands clear privacy rights and transparency for consumers.
  • Right to be Forgotten: People can ask to have their data erased.
  • EU AI Act (2026): New rules will categorize AI systems by risk.

Privacy-First AI Software Development 

  • Start with privacy. Make compliance part of business AI from day one, not just an add-on later.
  • Collect less data. Only grab what the business really needs- avoid stockpiling.
  • Use privacy tools. Consider anonymization, encryption, or even synthetic data to protect people’s information.
  • Keep track of everything. Know exactly where business data comes from and how you are using it.

Essential Operations

  • Monitor automatically: Set up automatic monitoring to spot privacy issues as they happen.
  • Keep detailed records: Have a clear audit trail for every AI decision.
  • Explain decisions: Explain the business’s AI decisions, both for the users and for regulators. No hidden components.
  • Enable user control: Give users the ability to edit or delete their data at any time.

Preparing for the Future

  • Risk classification: High-risk AI (such as in hiring, healthcare, and law enforcement) is subject to stricter rules.
  • Human oversight: Keep humans in the loop. Big decisions need a real person to review them.
  • Global standards: Plan for global rules. Every country’s got its own standards, so avoid being unprepared.
  • Continuous updates: Stay up to date with changing regulations.

Bottom line: Make privacy and compliance part of your AI plan from the very beginning. It is much easier to build now than to rush and fix later.

Why Machine Learning with Clojure is the Secret Weapon

Concurrency

Clojure lets teams run a bunch of tasks at once without worrying about them conflicting. That is significant when teams are handling real-time data. Consider business dashboards- they stay up-to-date, even as new numbers roll in. The retail team can watch sales, inventory, and customer trends update in real time and immediately modify their marketing offers.

Stability

With Clojure, the data does not change. Once teams set it, it gets locked in. When they run experiments or build models, the results stay the same. It makes bugs easier to find and builds trust in the data.

Code comparison:
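A minimal sketch of the comparison. Only the Python half is executable here; the Clojure side is shown in comments:

```python
# Python's list is mutable: appending changes the original object
# that every reference points to.
xs = [1, 2, 3]
ys = xs          # ys is the SAME object, not a copy
ys.append(4)
assert xs == [1, 2, 3, 4]   # the "original" changed too

# Clojure's vector is immutable: (conj [1 2 3] 4) returns a NEW
# vector and leaves [1 2 3] untouched. The closest Python analogue
# is a tuple, where "adding" builds a fresh copy:
t1 = (1, 2, 3)
t2 = t1 + (4,)   # new tuple; t1 is unchanged
assert t1 == (1, 2, 3)
assert t2 == (1, 2, 3, 4)
```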

Takeaway: Python → list changes. Clojure → vector stays, new copy made.

Interoperability

Clojure is compatible with business structures since it runs on the JVM. Teams get all the benefits of functional programming, but they can still use Python’s machine learning libraries or Java’s tools whenever they want. That makes things smoother. For example, a financial services company can run dependable pipelines in Clojure and still plug in models from TensorFlow or PyTorch.

Concurrency + Stability + Interoperability → Clojure makes machine learning practical, reliable, and ready for real business.

Choosing a Machine Learning Development Company

Technical Depth vs. API Wrappers

Look, not all AI partners build things the same way. Some providers just put a thin API wrapper on an existing tool and call it done. Sure, it is fast, but businesses won’t get anything unique or scalable out of it. The real value comes from teams that dig deeper- they design custom models, set up pipelines tailored to the enterprise, and actually integrate everything with the business. Quick fixes might get things started, but they won’t last as business requirements grow.

If your business wants something that scales with you, ignore the surface-level details and find an AI software development partner who understands how to build real AI systems from the start.

Ethical Standards

Accuracy is not the only thing that counts in AI. Responsibility matters just as much. The right company does not just build models- they make sure those models are fair, explainable, and transparent. Businesses should be able to trust their work, and so should the customers. Plus, with all the rules around AI these days, businesses need an AI development service that takes ethics seriously, not one that treats it like a checkbox at the end.

Who is Flexiana

Flexiana is a global machine learning development company with over 70 developers working across more than 25 programming languages. We don’t do one-size-fits-all projects. Instead, we work on custom solutions designed for your business, not someone else’s. What really sets us apart?

  • We write clean, reproducible code, so businesses actually understand and trust the models.
  • We build for the long term, making sure the system can scale as you grow.
  • We bring a ton of experience, from AI and blockchain to complex enterprise systems.

Flexiana works with companies that want both technical smarts and strong ethical standards. We are not just delivering code- we are offering privacy‑first AI software development that lasts and protects your privacy at every step.

Build smarter with AI‑powered business intelligence (BI)– connect with our team.

FAQs on Getting Started with ML

Q1: How much data do I need for machine learning?  

Honestly, it depends on what you’re trying to build. Some models manage with just a few thousand records, while deep learning projects may require millions. But here’s the thing: clean, relevant data beats sheer scale almost every time. A good machine learning development company can help you figure out the right balance.

Q2: Is AI only for large enterprises?  

Not at all. Small and mid-sized businesses use AI regularly. With the right AI software development partner, even a small team can set up automation, create forecasting tools, or dig into customer insights. The scale might change, but the value’s there for everyone.

Q3: What’s the difference between AI and ML?  

AI (artificial intelligence) is the big idea- making machines act smart. ML (machine learning) is one way to do that. It learns patterns from data. So, AI is the goal, and ML is the method. Most AI development services use ML as their main engine.

Q4: What makes privacy important in AI?  

Privacy matters- a lot. When you take a privacy-first approach to AI software development, you handle data responsibly, keep models compliant, and build trust with users. Skipping this step just invites risk, no matter how good your models are.

Q5: Why choose machine learning with Clojure?  

Clojure really stands out for machine learning. Clojure’s immutable data makes experiments easy to repeat, and its concurrency lets you build real‑time pipelines that stay stable under heavy load. That’s why many teams choose Clojure for ML systems.

Q6: Can AI help with business intelligence?  

Definitely. AI‑powered Business Intelligence (BI) dashboards process live data and give decision‑makers instant insights. Companies don’t just spot trends or risks- they can act and respond before it’s too late.

In Summary 

Machine learning is not optional now- it is how companies survive. The ones jumping in early get to build systems that actually scale, stay on the right side of ethics, and turn AI into real results. Wait too long, and your business will be left scrambling while everyone else uses AI to move faster, save money, and identify opportunities your business will miss.

Here’s why it matters:

  • Scalability → ML grows with you. No more systems slowing you down.
  • Trust → Designing for privacy and keeping things transparent wins over customers and regulators.
  • Speed → AI-powered business intelligence gives leaders real-time insights so they can act fast, before risks blow up.

Curious about what’s happening at Flexiana? Subscribe to our newsletter—it lands every two months, promise no spam!

The post How to Get Started with Machine Learning (2026 Implementation Guide) appeared first on Flexiana.


Time-Travel Debugging in State Management: Part 1 — Foundations & Patterns

📖 Series: Time-Travel Debugging in State Management (Part 1 of 3)

From debugging tool to competitive UX advantage

Introduction

Imagine: you're testing a checkout form. The user fills in all fields, clicks "Pay"... and gets an error.

You start debugging. But instead of reproducing the scenario again and again, you simply rewind the state back — to the moment before the error. Like in a video game where you respawn from the last checkpoint.

This is Time-Travel Debugging — the ability to move between application states over time.

💡 Key Insight: In modern applications, time-travel has evolved from exclusively a developer tool to a standalone user-facing feature that becomes a competitive product advantage.

💡 Note: The techniques and patterns in this series work for both scenarios — debugging AND user-facing undo/redo.

Use Cases

| Domain | Examples | History Depth | Value |
| --- | --- | --- | --- |
| 📝 Text Editors | Google Docs, Notion | 500-1000 steps | Version history, undo/redo |
| 📋 Forms & Builders | Typeform, Tilda | 50-100 steps | Real-time change reversal |
| 🎨 Graphic Editors | Figma, Canva | 50-100 steps | Design experimentation |
| 💻 Code Editors | VS Code, CodeSandbox | 500+ steps | Local change history |
| 🏗️ Low-code Platforms | Webflow, Bubble | 100-200 steps | Visual version control |
| 🎬 Video Editors | Premiere Pro, CapCut | 10-20 steps | Edit operation rollback |

In this article, we'll explore architectural patterns that work across all these domains — from simple forms to complex multimedia systems.

Terminology

This article uses the following terms:

| Term | Description | Library Equivalents |
| --- | --- | --- |
| State Unit | Minimal indivisible part of state | Universal concept |
| Atom | State unit in atom-based libraries | Jotai: atom, Recoil: atom, Nexus State: atom |
| Slice | Logically isolated part of state | Redux Toolkit: createSlice, Zustand: state key |
| Observable | Reactive object with auto-tracking | MobX: observable, Valtio: proxy, Solid.js: signal |
| Store | Container for state units (global state) | Zustand: store, Redux: store |
| Snapshot | State copy at a point in time | Universal term |
| Delta | Difference between two snapshots | Universal term |

💡 Note: "State unit" is used as a universal abstraction. Depending on your library, this might be called:

  • Atom (Jotai, Recoil, Nexus State)
  • Slice / state key (Redux, Zustand)
  • Observable property (MobX, Valtio)
  • Signal (Solid.js, Preact)

Terminology Mindmap

mindmap
  root((State Unit))
    Atom
      Jotai
      Recoil
      Nexus State
    Slice
      Redux Toolkit
      Zustand
    Observable
      MobX
      Valtio
    Signal
      Solid.js
      Preact

Code Examples Across Libraries

// Nexus State / Jotai / Recoil
const countAtom = atom(0);

// Zustand (state unit equivalent)
const useStore = create((set) => ({
  count: 0, // ← this is a "state unit"
}));

// Redux Toolkit (state unit equivalent)
const counterSlice = createSlice({
  name: 'counter',
  initialState: { value: 0 },
  //              ^^^^^^^^^^^ this is a "state unit"
});

// MobX (state unit equivalent)
const store = makeObservable({
  count: 0, // ← this is a "state unit"
});

Why "state unit"?

  1. Universality — works for any library (not just atom-based)
  2. Precision — emphasizes minimality and indivisibility
  3. Neutrality — not tied to specific library terminology

What is Time-Travel Debugging?

Definition

Time-Travel Debugging is a debugging method where the system preserves state history and allows developers to:

  • View previous application states
  • Navigate between states (forward and backward)
  • Analyze differences between states
  • Replay action sequences

Key Capabilities

interface TimeTravelAPI {
  // Navigation
  undo(): boolean;
  redo(): boolean;
  jumpTo(index: number): boolean;

  // Availability checks
  canUndo(): boolean;
  canRedo(): boolean;

  // History
  getHistory(): Snapshot[];
  getCurrentSnapshot(): Snapshot | undefined;

  // Management
  capture(action?: string): Snapshot;
  clearHistory(): void;
}
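To make the API concrete, here is a minimal in-memory implementation sketch. It is hypothetical (no specific library's API): `capture` takes the state to record as an argument rather than reading it from a store, and history is an array plus a cursor.

```typescript
// Minimal sketch of the TimeTravelAPI above. A cursor moves along an
// array of snapshots; capturing after an undo discards the "future".
type Snapshot = { id: number; state: unknown; action?: string };

class SimpleTimeTravel {
  private history: Snapshot[] = [];
  private cursor = -1; // index of the current snapshot
  private nextId = 0;

  capture(state: unknown, action?: string): Snapshot {
    // A new capture after undo forks history: drop redo entries first
    this.history = this.history.slice(0, this.cursor + 1);
    const snap: Snapshot = { id: this.nextId++, state, action };
    this.history.push(snap);
    this.cursor = this.history.length - 1;
    return snap;
  }

  canUndo(): boolean { return this.cursor > 0; }
  canRedo(): boolean { return this.cursor < this.history.length - 1; }

  undo(): boolean {
    if (!this.canUndo()) return false;
    this.cursor--;
    return true;
  }

  redo(): boolean {
    if (!this.canRedo()) return false;
    this.cursor++;
    return true;
  }

  jumpTo(index: number): boolean {
    if (index < 0 || index >= this.history.length) return false;
    this.cursor = index;
    return true;
  }

  getHistory(): Snapshot[] { return this.history; }
  getCurrentSnapshot(): Snapshot | undefined { return this.history[this.cursor]; }
  clearHistory(): void { this.history = []; this.cursor = -1; }
}
```

Real implementations add snapshot capping, delta storage, and persistence, which the sections below cover.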

Use Cases

  1. Debugging complex states — when bugs reproduce only after specific action sequences
  2. Regression analysis — understanding which change caused an issue
  3. Training & demos — step-by-step user scenario replay
  4. Automated testing — sequence reproduction for tests

Historical Context

Early Implementations

Time-travel debugging isn't new. The first significant implementations appeared in the mid-2000s:

| Year | System | Description |
| --- | --- | --- |
| 2004 | Smalltalk Squeak | One of the first environments with state "rollback" |
| 2010 | OmniGraffle | Undo/redo for graphic operations |
| 2015 | Redux DevTools | Popularized time-travel for web apps |
| 2016 | Elm Time Travel | Built-in support via immutable architecture |
| 2019 | Akita (Angular) | Built-in time-travel for Angular |
| 2021 | Elf (ngneat) | Reactive state management on RxJS with DevTools |
| 2020+ | Modern Libraries | Jotai, Zustand, MobX with plugins |

Evolution of Approaches

timeline
    title Time-Travel Debugging Evolution
    2010-2015 : Simple Undo/Redo
              : Full snapshots
              : Limited depth
    2015-2020 : DevTools Integration
              : Redux DevTools
              : Action tracking
    2020+     : Optimized Systems
              : Delta compression
              : User-facing features

Generation 1 (2010-2015): Simple undo/redo stacks

  • Full state copy storage
  • Limited history depth
  • No async support

Generation 2 (2015-2020): DevTools integration

  • Change visualization
  • Redux-like architecture support
  • Action-based tracking

Generation 3 (2020+): Optimized systems

  • Delta compression
  • Smart memory cleanup
  • Atomic state support
  • State visualizer tools

Architectural Patterns

1. Command Pattern

Classic approach where each state change is encapsulated in a command object:

interface Command<T> {
  execute(): T;
  undo(): void;
  redo(): void;
}

// For atom-based libraries (Jotai, Recoil, Nexus State)
class SetAtomCommand<T> implements Command<T> {
  constructor(
    private atom: Atom<T>,
    private newValue: T,
    private oldValue?: T
  ) {}

  execute(): T {
    this.oldValue = this.atom.get();
    this.atom.set(this.newValue);
    return this.newValue;
  }

  undo(): void {
    this.atom.set(this.oldValue!);
  }

  redo(): void {
    this.execute();
  }
}

// For Redux / Zustand (equivalent)
class SetStateCommand<T extends Record<string, any>> implements Command<void> {
  private oldValue: any;

  constructor(
    private store: Store<T>,
    private slice: keyof T,
    private newValue: any
  ) {}

  execute(): void {
    this.oldValue = this.store.getState()[this.slice];
    this.store.setState({ [this.slice]: this.newValue });
  }

  undo(): void {
    this.store.setState({ [this.slice]: this.oldValue });
  }

  redo(): void {
    this.execute();
  }
}
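A hedged usage sketch of the pattern in action. The `Cell` and `CommandHistory` helpers below are hypothetical stand-ins for an atom and an undo/redo stack, not part of any specific library:

```typescript
// A tiny mutable cell standing in for an atom
class Cell<T> {
  constructor(private value: T) {}
  get(): T { return this.value; }
  set(v: T): void { this.value = v; }
}

interface Cmd { execute(): void; undo(): void; }

class SetCellCommand<T> implements Cmd {
  private oldValue!: T;
  constructor(private cell: Cell<T>, private newValue: T) {}
  execute(): void { this.oldValue = this.cell.get(); this.cell.set(this.newValue); }
  undo(): void { this.cell.set(this.oldValue); }
}

// Undo/redo stacks over executed commands; a new command clears redo
class CommandHistory {
  private done: Cmd[] = [];
  private undone: Cmd[] = [];
  run(cmd: Cmd): void { cmd.execute(); this.done.push(cmd); this.undone = []; }
  undo(): void { const c = this.done.pop(); if (c) { c.undo(); this.undone.push(c); } }
  redo(): void { const c = this.undone.pop(); if (c) { c.execute(); this.done.push(c); } }
}

const count = new Cell(0);
const history = new CommandHistory();
history.run(new SetCellCommand(count, 1));
history.run(new SetCellCommand(count, 2));
history.undo(); // count is back to 1
```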

Pros:

  • Explicit operation representation
  • Easy to extend with new commands
  • Macro support (command grouping)

Cons:

  • Object creation overhead
  • Complexity with async operations

2. Snapshot Pattern

Preserving full state copies at key moments:

interface Snapshot {
  id: string;
  timestamp: number;
  action?: string;
  state: Record<string, AtomState>;
  metadata: {
    label?: string;
    source?: 'auto' | 'manual';
  };
}

class SnapshotManager {
  private history: Snapshot[] = [];

  // deepClone and generateId are assumed helper functions
  constructor(private store: { getState(): Record<string, AtomState> }) {}

  capture(action?: string): Snapshot {
    const snapshot: Snapshot = {
      id: generateId(),
      timestamp: Date.now(),
      action,
      state: deepClone(this.store.getState()),
      metadata: { label: action },
    };

    this.history.push(snapshot);
    return snapshot;
  }
}
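A small library-free usage sketch of the same pattern. The JSON-based clone helper assumes plain, serializable state, and cloning happens both on capture and on restore so later mutations can't corrupt stored history:

```typescript
// Restoring from a full snapshot is a direct state replacement —
// no chain of diffs to replay.
type State = Record<string, unknown>;
const clone = (s: State): State => JSON.parse(JSON.stringify(s));

const history: { state: State }[] = [];
let current: State = { count: 0 };

function capture(): void {
  history.push({ state: clone(current) });
}

function restore(index: number): void {
  current = clone(history[index].state); // O(1) lookup, direct swap
}

capture();              // snapshot 0: { count: 0 }
current = { count: 5 };
capture();              // snapshot 1: { count: 5 }
restore(0);             // current is { count: 0 } again
```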

Pros:

  • Simple implementation
  • Fast restoration (direct state replacement)
  • Easy to serialize for export

Cons:

  • High memory consumption
  • Data duplication

3. Delta Pattern

Storing only changes between states:

interface DeltaSnapshot {
  id: string;
  type: 'delta';
  baseSnapshotId: string;
  changes: {
    [atomId: string]: {
      oldValue: any;
      newValue: any;
    };
  };
  timestamp: number;
}

class DeltaCalculator {
  computeDelta(before: Snapshot, after: Snapshot): DeltaSnapshot {
    const changes: Record<string, any> = {};

    for (const [key, atomState] of Object.entries(after.state)) {
      const oldValue = before.state[key]?.value;
      const newValue = atomState.value; // compare unwrapped values
      if (!deepEqual(oldValue, newValue)) {
        changes[key] = { oldValue, newValue };
      }
    }

    return {
      id: generateId(),
      type: 'delta',
      baseSnapshotId: before.id,
      changes,
      timestamp: Date.now(),
    };
  }
}

Pros:

  • Significant memory savings (up to 90% for small changes)
  • Precise change tracking
  • Ability to "apply" deltas

Cons:

  • Complex restoration (requires delta chain application)
  • Risk of "chain break" (if base snapshot is deleted)
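The restoration side that the first con refers to can be sketched as follows. This assumes the same shapes as the DeltaCalculator above, with deltas applied forward from a full base snapshot:

```typescript
// Applying a delta chain rebuilds a later state from a full snapshot.
type AtomState = { value: unknown };
type State = Record<string, AtomState>;
type Delta = { changes: Record<string, { oldValue: unknown; newValue: unknown }> };

function applyDelta(state: State, delta: Delta): State {
  const next: State = { ...state };
  for (const [key, change] of Object.entries(delta.changes)) {
    next[key] = { value: change.newValue }; // forward application
  }
  return next;
}

function restoreFromBase(base: State, deltas: Delta[]): State {
  return deltas.reduce(applyDelta, base);
}
```

Restoring state N therefore costs O(d) delta applications, which is why the hybrid approach below inserts periodic full snapshots to bound d.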

4. Hybrid Approach

Modern approach combining snapshots and deltas:

flowchart LR
    A[State Change] --> B{Full Snapshot<br/>Interval?}
    B -->|Yes| C[Create Full Snapshot]
    B -->|No| D[Compute Delta]
    C --> E[History Array]
    D --> E
    E --> F{Restore Request}
    F -->|Full| G[Direct Return]
    F -->|Delta| H[Apply Delta Chain]
    H --> I[Reconstructed State]

class HybridHistoryManager {
  private fullSnapshots: Snapshot[] = [];
  private deltaChain: Map<string, DeltaSnapshot> = new Map();

  // Every N changes, create a full snapshot
  private fullSnapshotInterval = 10;
  private changesSinceFull = 0;

  add(state: State): void {
    if (this.changesSinceFull >= this.fullSnapshotInterval) {
      // Create full snapshot
      const full = this.createFullSnapshot(state);
      this.fullSnapshots.push(full);
      this.changesSinceFull = 0;
    } else {
      // Create delta
      const base = this.getLastFullSnapshot();
      const delta = this.computeDelta(base, state);
      this.deltaChain.set(delta.id, delta);
      this.changesSinceFull++;
    }
  }

  restore(index: number): State {
    const full = this.getNearestFullSnapshot(index);
    const deltas = this.getDeltasBetween(full.index, index);

    // Apply deltas to full snapshot
    return deltas.reduce(
      (state, delta) => this.applyDelta(state, delta),
      full.state
    );
  }
}

When to use:

| Pattern | Use when... | Avoid when... |
| --- | --- | --- |
| Command | Complex operations, macros | Simple changes, async |
| Snapshot | Small states, need simplicity | Large states, frequent changes |
| Delta | Frequent small changes | Rare large changes |
| Hybrid | Universal case | Very simple apps |

State Storage Strategies

1. Full Snapshots

// Universal example for any library
function createFullSnapshot(store: Store): Snapshot {
  return {
    id: uuid(),
    // Note: a JSON round-trip drops functions, undefined, and Dates —
    // fine for plain serializable state; otherwise use a deep-clone helper
    state: JSON.parse(JSON.stringify(store.getState())),
    timestamp: Date.now(),
  };
}

// For Redux / Zustand
const snapshot = {
  state: {
    counter: { value: 5 },  // Redux slice
    user: { name: 'John' }  // Redux slice
  },
  timestamp: Date.now()
};

// For Jotai / Nexus State
const snapshot = {
  state: {
    'count-atom-1': { value: 5, type: 'atom' },
    'user-atom-2': { value: { name: 'John' }, type: 'atom' }
  },
  timestamp: Date.now()
};

Characteristics:

  • Memory: O(n × m), where n = snapshots, m = state size
  • Restoration: O(1) — direct replacement
  • Serialization: Simple

2. Deltas

// Universal example
function computeDelta(before: State, after: State): Delta {
  const changes: Record<string, Change> = {};

  for (const key of Object.keys(after)) {
    if (!deepEqual(before[key], after[key])) {
      changes[key] = {
        from: before[key],
        to: after[key],
      };
    }
  }

  return { changes, timestamp: Date.now() };
}

// Example: Redux slice
const delta = {
  changes: {
    'counter.value': { from: 5, to: 6 },
    'user.lastUpdated': { from: 1000, to: 2000 }
  }
};

// Example: Jotai atoms
const delta = {
  changes: {
    'count-atom-1': { from: 5, to: 6 }
  }
};

Characteristics:

  • Memory: O(n × k), where k = average change size (k << m)
  • Restoration: O(d) — applying d deltas
  • Serialization: Requires context (base snapshot)

3. Structural Sharing (Immutable.js & Immer)

Using immutable structures with shared references:

// Example with Immutable.js
import { Map } from 'immutable';

const state1 = Map({ count: 1, user: { name: 'John' } });
const state2 = state1.set('count', 2);

// state1 and state2 share the user object
// Only count changed

// For React + Immer (more popular approach)
import { produce } from 'immer';

const state1 = { count: 1, user: { name: 'John' } };
const state2 = produce(state1, draft => {
  draft.count = 2;
  // user remains the same reference
});

Characteristics:

Aspect         Immer (Proxy)               Immutable.js
Memory         O(n + m) best case          O(log n) (persistent data structures)
Restoration    O(1) with references        O(log n) for access
Requirements   Proxy API (ES2015+)         Specialized library
Compatibility  High (transparent objects)  Medium (special types)

Note: Characteristics may differ by implementation. For ClojureScript, Mori, and other persistent data structure libraries, complexity will vary.

4. Strategy Comparison

Strategy            Memory   Restoration   Complexity   Use Case
Full Snapshots      High     Fast          Low          Small states
Deltas              Low      Medium        Medium       Frequent small changes
Structural Sharing  Medium   Fast          High         Immutable states
Hybrid              Medium   Medium        High         Universal
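
The Hybrid row combines the two storages above: keep a full snapshot every N steps and deltas in between, so restoring any step replays at most N − 1 deltas. A toy sketch (HybridHistory and SNAPSHOT_EVERY are illustrative names, not from any library; it assumes keys are only added or updated, never deleted):

```typescript
type NumState = Record<string, number>;
type Entry =
  | { kind: 'snapshot'; state: NumState }
  | { kind: 'delta'; changes: NumState };

const SNAPSHOT_EVERY = 4; // tunable checkpoint interval

class HybridHistory {
  private entries: Entry[] = [];

  // Record the next state; prev is the previously recorded state.
  record(state: NumState, prev: NumState): void {
    if (this.entries.length % SNAPSHOT_EVERY === 0) {
      this.entries.push({ kind: 'snapshot', state: { ...state } });
    } else {
      const changes: NumState = {};
      for (const k of Object.keys(state)) {
        if (state[k] !== prev[k]) changes[k] = state[k];
      }
      this.entries.push({ kind: 'delta', changes });
    }
  }

  // Restore step i: walk back to the nearest checkpoint, replay deltas forward.
  restore(i: number): NumState {
    let base = i;
    while (this.entries[base].kind !== 'snapshot') base--;
    const snap = this.entries[base] as Extract<Entry, { kind: 'snapshot' }>;
    const state: NumState = { ...snap.state };
    for (let j = base + 1; j <= i; j++) {
      const d = this.entries[j] as Extract<Entry, { kind: 'delta' }>;
      Object.assign(state, d.changes);
    }
    return state;
  }
}
```

This trades a little restore time for a large memory saving versus full snapshots, which is why the table rates it Medium on both.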

What's Next?

In Part 2 ("Performance & Advanced Topics"), we'll cover:

  • Memory Optimization: Delta Snapshots, compression, smart cleanup
  • Navigation Algorithms: Undo/Redo, jumpTo, large history optimization
  • Transactionality: Rollback restoration, checkpoints
  • Performance Issues: Benchmarks, optimizations
  • Time-Travel as User-Facing Feature: From debugging to UX

🤔 Food for Thought

Which pattern would you choose for your project?

Think about your current project:

  • How often does state change?
  • What's the state size (small/medium/large)?
  • Do you need deep history (100+ steps)?

Share your choice in the comments!

To be continued... → Part 2: Performance & Advanced Topics

Resources

Libraries with Time-Travel Support

Immutable Data Structures

This is Part 1 of 3 in the Time-Travel Debugging article series.

Tags: #javascript #typescript #state-management #debugging #architecture #react #redux #performance

Permalink

The Power of Framing: Why BigConfig is Rebranding as a Package Manager

BigConfig began as a simple Babashka script designed to DRY up a complex Terraform project for a data platform. Since those humble beginnings, it has evolved through several iterations into a robust template and workflow engine. But as the tool matured, I realized that technical power wasn’t enough; the way it was framed was the true barrier to adoption.

The Language Barrier (and the Loophole)

BigConfig is powerful as a library, but I’ve faced a hard truth: very few developers will learn a language like Clojure just to use a library. However, history shows that developers will learn a new language if it solves a fundamental deployment problem.

People learned Ruby to master Homebrew; they learn Nix for reproducible builds. Meanwhile, tools like Helm force users to juggle the awkward marriage of YAML and Go templates—a “solution” many endure only because no better alternative exists. To get developers to cross the language barrier, you have to offer more than a tool; you have to offer a total solution.

The “Package Manager” Epiphany

I noticed a significant shift in engagement depending on how I framed the project. When I describe BigConfig as a library, it feels abstract—like “more work” added to a developer’s plate. When I introduce it as a package manager, the interest is immediate.

In the mind of a developer, a library is a component you have to manage. A package manager is the system that manages things for you. By shifting the perspective, BigConfig goes from being a “Clojure utility” to an “Infrastructure Orchestrator.”

How BigConfig Differs

Like Nix and Guix, BigConfig embraces a full programming language. However, it avoids the “two-language architecture” common in those ecosystems—where you often have a compiled language for the CLI and a separate interpreted one for the user.

BigConfig is Clojure all the way down (in the spirit of Emacs). This allows it to support three distinct environments seamlessly:

  1. The REPL: For interactive development and real-time exploration.
  2. The Shell: For traditional CLI workflows and CI/CD pipelines.
  3. The Library: For embedding directly into your own control planes or APIs.

Beyond the language, BigConfig introduces robust client-side coordination, featuring an Atlantis-style locking mechanism that uses GitHub tags to prevent developer collisions in shared environments.

Limitless Abstraction

The level of abstraction is where BigConfig truly shines. When you adopt the system, you aren’t locked into a rigid schema; you can adapt the entire engine to your specific needs. Complex tasks—like deploying the same architecture across different hyperscalers—are reduced from massive refactors to simply updating a property. It moves the conversation from how to deploy to what to deploy.

The Roadmap

The next phase is focused on expanding the ecosystem and making package discovery seamless:

  • Hyperscaler Support: Having already added DigitalOcean, Hetzner, and Oracle Cloud, I am now prioritizing AWS, Google Cloud, and Azure.
  • Application Packages: While the first “app”—a remote development environment—is a niche use case, I’m expanding into high-demand stacks like Airflow and Rama .
  • The Ecosystem: I am currently defining the formal package manifest and building a registry where users can discover, version, and publish their own infrastructure packages.

Conclusion

The evolution of BigConfig is a testament to the idea that the right abstraction is just as important as the right code. By reframing the tool from a utility you have to manage into a system that manages for you, we bridge the gap between complex cloud resources and developer productivity.

As we expand our hyperscaler support and formalize our package registry, the goal remains the same: to move infrastructure management away from the “how” and toward the “what.” Whether you are deploying a niche remote environment or a massive data stack like a Data Lake, BigConfig provides the language and the logic to make your infrastructure as versionable and reproducible as your software.

Would you like to have a follow-up on this topic? What are your thoughts? I’d love to hear your experiences.


Clojure Inheritance… Sort Of

Disclaimer

I want to start off by saying that I don't expect anyone to read this. However, neither OpenAI's nor Anthropic's models get this right on the first try, so I figure that writing about it might get picked up in future training runs.

Inheritance

A pattern that I sometimes want in Clojure is to implement the same function for multiple types (defined by either defrecord or deftype). A simple way to do this is to do the work in an external function, and then have every implementation call it.

For instance, say I want to get a list of feet from several animal types. I can create a protocol for this, with the function get-feet:

(defprotocol Footed
  (get-feet [animal] "Get a sequence of feet"))

Then, I may have a few different groups of animals, each sharing a number of feet. I can create a function for each of these groups:

(defn get-2-feet [] [:left :right])

(defn get-4-feet [] [:front-left :front-right
                     :back-left :back-right])

(defn get-6-feet [] [:front-left :front-right
                     :middle-left :middle-right
                     :back-left :back-right])

Then the different record types will call the function they need:

(defrecord Ape [name]
  Footed
  (get-feet [_] (get-2-feet)))

(defrecord Bird [name]
  Footed
  (get-feet [_] (get-2-feet)))

(defrecord Cat [name]
  Footed
  (get-feet [_] (get-4-feet)))

(defrecord Ant [name]
  Footed
  (get-feet [_] (get-6-feet)))

…and so on.

This works, but it is very unsatisfying. It also gets noisy if the protocol has more than one function.

Instead, it would be nice if we could implement the protocol once, and then inherit this in any type that needs that implementation. Clojure doesn't support inheritance like this, but it has something close.

Protocols

A Protocol in Clojure is a set of functions that an object has agreed to support. The language and compiler have special dispatch support around protocols, making their functions fast and easy to call. While many people know the specifics of protocols, that knowledge usually comes from exploration rather than documentation. I won't go into an exhaustive discussion of protocols here, but I will mention a couple of important aspects.

Whenever a protocol is created in Clojure, two things are created: the protocol itself, and a plain-old Java Interface. (ClojureScript also has protocols, but they don't create interfaces). The protocol is just a normal data structure, which we can see at a repl:

user=>  (defprotocol Footed
  (get-feet [animal] "Get a sequence of feet"))
Footed
user=> Footed
{:on user.Footed,
 :on-interface user.Footed,
 :sigs {:get-feet {:tag nil, :name get-feet, :arglists ([animal]), :doc "Get a sequence of feet"}},
 :var #'user/Footed,
 :method-map {:get-feet :get-feet},
 :method-builders {#'user/get-feet #object[user$eval143$fn__144 0x67001148 "user$eval143$fn__144@67001148"]}}

This describes the protocol and each of its associated functions. This is also the structure that gets modified by some of the various protocol extension macros. You can see how the :method-map refers to functions by their names, rewritten as keywords.

Of interest here is the reference to the interface user.Footed. I'm using a repl with the default user namespace. Because we are already in this namespace, that Footed interface name is being shadowed by the protocol object. But it is still there, and we can still do things with it.

Common Operations

Protocols are often "extended" onto new datatypes. This is a very flexible operation, and allows new behavior to be associated with any datatype, including those not declared in Clojure (for instance, new behavior could be added to a java.lang.String). This applies to Interfaces as well as Classes, which is something we can use here.

First of all, we want a new protocol/interface for each type of behavior that we want:

(defprotocol Feet2)
(defprotocol Feet4)
(defprotocol Feet6)

These protocols don't need functions, as they just serve to "mark" the objects that want to implement the desired behavior.

Next, we can extend the protocol with our functions onto the types described by each of these Interfaces:

(extend-protocol Footed
user.Feet2
(get-feet [_] [:left :right])
user.Feet4
(get-feet [_] [:front-left :front-right :back-left :back-right])
user.Feet6
(get-feet [_] [:front-left :front-right :middle-left :middle-right :back-left :back-right]))

Going back to the Footed protocol, we can see that it now knows about these implementations.

user=> Footed
{:on user.Footed,
 :on-interface user.Footed,
 :sigs {:get-feet {:tag nil, :name get-feet, :arglists ([animal]), :doc "Get a sequence of feet"}},
 :var #'user/Footed,
 :method-map {:get-feet :get-feet},
 :method-builders {#'user/get-feet #object[user$eval143$fn__144 0x67001148 "user$eval143$fn__144@67001148"]},
 :impls {
    user.Feet2 {:get-feet #object[user$eval195$fn__196 0x24fabd0f "user$eval195$fn__196@24fabd0f"]},
    user.Feet4 {:get-feet #object[user$eval199$fn__200 0x250b236d "user$eval199$fn__200@250b236d"]},
    user.Feet6 {:get-feet #object[user$eval203$fn__204 0x61f3fbb8 "user$eval203$fn__204@61f3fbb8"]}}}

Note how the :impls value now maps each of the extended interfaces to the attached functions.

A Comment on Identifiers

You might have noticed that I had to use the fully-qualified name for these interfaces due to the protocol name shadowing them. When a protocol is not in the same namespace, then it can be required, and referenced by its namespace, while the Interface can be imported from that namespace. For instance, a project that I've been working on recently has require/imports of:

(ns my.project
 (:require [quoll.rdf :as rdf])
 (:import [quoll.rdf IRI]))

In this example I am able to reference the protocol via rdf/IRI while the interface is just IRI.

Attaching

Now that the Footed protocol has been extended to each of these interfaces, the protocols associated with those interfaces can be attached to any type that wants that behavior.

Going back to our animals, we can do the same thing again, but this time without the stub functions that redirect to the common functionality:

(defrecord Ape [name] Feet2)
(defrecord Bird [name] Feet2)
(defrecord Cat [name] Feet4)
(defrecord Ant [name] Feet6)

Instances of these types will now pick up the implementations extended to these marker protocols:

(def magilla (Ape. "Magilla"))
(def big-bird (Bird. "Big"))
(def garfield (Cat. "Garfield"))
(def atom-ant (Ant. "Atom"))

user=> (get-feet magilla)
[:left :right]
user=> (get-feet big-bird)
[:left :right]
user=> (get-feet garfield)
[:front-left :front-right :back-left :back-right]
user=> (get-feet atom-ant)
[:front-left :front-right :middle-left :middle-right :back-left :back-right]

Final Declarations

After explaining so much of the mechanism, the code has been scattered widely across this post. Putting the declarations together, we have:

(defprotocol Footed (get-feet [_]))
(defprotocol Feet2)
(defprotocol Feet4)
(defprotocol Feet6)

(extend-protocol Footed
 user.Feet2
 (get-feet [_] [:left :right])
 user.Feet4
 (get-feet [_] [:front-left :front-right :back-left :back-right])
 user.Feet6
 (get-feet [_] [:front-left :front-right :middle-left :middle-right  :back-left :back-right]))

(defrecord Ape [name] Feet2)
(defrecord Bird [name] Feet2)
(defrecord Cat [name] Feet4)
(defrecord Ant [name] Feet6)

Wrap Up

Functional programming in Clojure is not generally served by having multiple types like this, but it does happen. While this is a trivial example, with only a single function on the protocol, the need for this pattern becomes apparent when protocols come with multiple functions.

I've called it inheritance, but that is only an analogy. It's not actually inheritance that we are applying here, but it does behave in a similar way.


core.async and Virtual Threads

core.async 1.9.847-alpha3 is now available. This release reverts the core.async virtual thread implementation added in alpha2, and provides a new implementation (ASYNC-272).

Threads must block while waiting on I/O operations to complete. "Parking" allows the platform to unmount and free the underlying thread resource while waiting. This allows users to write "normal" straight line code (without callbacks) while consuming fewer platform resources.

io-thread execution context

io-thread was added in a previous core.async release and is a new execution context for running both blocking channel operations and blocking I/O operations (which are not supported in go). Parking operations are not allowed in io-thread (same as the thread context).

io-thread uses the :io executor pool, which will now use virtual threads, when available. If used in Java without virtual threads (< 21), io-thread continues to run in a cached thread pool with platform threads.

With this change, all blocking operations in io-thread park without consuming a platform thread on Java 21+.

go blocks

Clojure core.async go blocks use an analyzer to rewrite code with inversion of control specifically for channel parking operations (the parking ops like <! and >!). Other blocking operations (the blocking <!! and >!! channel ops, or arbitrary I/O ops) are not allowed. Additionally, go blocks are automatically collected if the channels they depend on are collected (and parking can never progress).

The Java 21 virtual threads feature implements I/O parking in the Java platform itself - that capability is a superset of what go blocks provide, supporting all blocking I/O operations. Like regular threads (and unlike go blocks), virtual threads must terminate normally and will keep referenced resources alive until they do.

Due to this difference in semantics, go blocks are unchanged and continue to use the go analyzer and run on platform threads. If you wish to get the benefits and constraints of virtual threads, convert go to io-thread and parking ops to blocking ops.
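
As a hedged sketch of that conversion (channel and function names are illustrative; requires core.async 1.9.847-alpha3 or later on the classpath):

```clojure
(require '[clojure.core.async :as a :refer [go io-thread chan >! <! >!! <!!]])

;; Before: a go block using parking ops; blocking I/O is not allowed here.
(defn pump-go [in out]
  (go (loop []
        (when-some [v (<! in)]
          (>! out (inc v))
          (recur)))))

;; After: the same loop on io-thread, using blocking ops. Blocking I/O is
;; allowed here, and on Java 21+ the underlying virtual thread parks
;; instead of pinning a platform thread.
(defn pump-io [in out]
  (io-thread
    (loop []
      (when-some [v (<!! in)]
        (>!! out (inc v))
        (recur)))))
```

The loop body is unchanged; only the execution context and the op variants (parking vs. blocking) differ.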

Note: existing IOC compiled go blocks from older core.async versions are unaffected.

Executor factories

The clojure.core.async.executor-factory System property now need only provide Executor instances, not ExecutorService instances. This is a reduction in requirements, so it is backwards-compatible.

Additionally, the io-thread virtual thread Executor no longer holds references to virtual threads as it did in 1.9.829-alpha2.


How to start a Polylith project from scratch

A complete step-by-step guide to creating a project called cat (a workspace, in Polylith terms) with a filesystem component, a main base, and a cli project using the Polylith architecture.


What You Will Build

cat/                          ← workspace root
├── components/
│   └── filesystem/              ← reads a file and prints its content
├── bases/
│   └── main/                 ← entry point (-main function)
└── projects/
    └── cli/                  ← deployable artifact (uberjar)

Data flow: java -jar cli.jar myfile.txt → main/-main → filesystem/read-file → stdout


Prerequisites

Install the following before starting:

  • Java 21+ — verify with java -version
  • Clojure CLI — install from clojure.org/guides/getting_started, verify with clojure --version
  • git — verify with git --version; also configure user.name and user.email:
    git config --global user.name "Your Name"
    git config --global user.email "you@example.com"
    
  • poly tool — see below

Installing the poly tool

macOS:

brew install polyfy/polylith/poly

For other OS/platforms please refer to the official Installation doc.

Verify:

poly version

Step 1 — Create the Workspace

Run this outside any existing git repository:

poly create workspace name:cat top-ns:com.acme :commit

Move into the workspace:

cd cat

Your directory structure will look like:

cat/
├── .git/
├── .gitignore
├── bases/
├── components/
├── deps.edn
├── development/
│   └── src/
├── projects/
├── readme.md
└── workspace.edn

Enable auto-add in workspace.edn

Open workspace.edn and set :auto-add to true so that files generated by poly create commands are automatically staged in git:

{:top-namespace "com.acme"
 :interface-ns "interface"
 :default-profile-name "default"
 :dialects ["clj"]
 :compact-views #{}
 :vcs {:name "git"
       :auto-add true}       ;; <-- change this to true
 :tag-patterns {:stable "^stable-.*"
                :release "^v[0-9].*"}
 :template-data {:clojure-ver "1.12.0"}
 :projects {"development" {:alias "dev"}}}

Step 2 — Create the filesystem Component

poly create component name:filesystem

This creates:

components/filesystem/
├── deps.edn
├── src/
│   └── com/acme/filesystem/
│       └── interface.clj
└── test/
    └── com/acme/filesystem/
        └── interface_test.clj

2a. Write the interface (public API)

The interface namespace is the only file other bricks are allowed to call. Edit components/filesystem/src/com/acme/filesystem/interface.clj:

(ns com.acme.filesystem.interface
  (:require [com.acme.filesystem.core :as core]))

(defn read-file
  "Reads the file at `filename` and prints its content to stdout."
  [filename]
  (core/read-file filename))

2b. Write the implementation

Create the file components/filesystem/src/com/acme/filesystem/core.clj:

(ns com.acme.filesystem.core
  (:require [clojure.java.io :as io]))

(defn read-file
  "Reads the file at `filename` and prints its content to stdout."
  [filename]
  (let [file (io/file filename)]
    (if (.exists file)
      (println (slurp file))
      (println (str "Error: file not found — " filename)))))

2c. Register the component in the root deps.edn

Open the root ./deps.edn and add the filesystem component:

{:aliases {:dev {:extra-paths ["development/src"]
                 :extra-deps  {com.acme/filesystem {:local/root "components/filesystem"}
                               org.clojure/clojure {:mvn/version "1.12.0"}}}

           :test {:extra-paths ["components/filesystem/test"]}

           :poly {:main-opts ["-m" "polylith.clj.core.poly-cli.core"]
                  :extra-deps {polyfy/clj-poly {:mvn/version "0.3.32"}}}}}

Step 3 — Create the main Base

poly create base name:main

This creates:

bases/main/
├── deps.edn
├── src/
│   └── com/acme/main/
│       └── core.clj
└── test/
    └── com/acme/main/
        └── core_test.clj

3a. Write the base code

A base differs from a component in that it has no interface — it is the entry point to the outside world. Edit bases/main/src/com/acme/main/core.clj:

(ns com.acme.main.core
  (:require [com.acme.filesystem.interface :as filesystem])
  (:gen-class))

(defn -main
  "Entry point. Accepts a filename as the first argument and prints its content."
  [& args]
  (if-let [filename (first args)]
    (filesystem/read-file filename)
    (println "Usage: cat <filename>"))
  (System/exit 0))

Key points:

  • (:gen-class) tells the Clojure compiler to generate a Java class with a main method.
  • The base calls com.acme.filesystem.interface/read-file — never the core namespace directly.
  • System/exit 0 ensures the JVM terminates cleanly after running.

3b. Register the base in the root deps.edn

Add the main base alongside filesystem:

{:aliases {:dev {:extra-paths ["development/src"]
                 :extra-deps  {com.acme/filesystem {:local/root "components/filesystem"}
                               com.acme/main    {:local/root "bases/main"}
                               org.clojure/clojure {:mvn/version "1.12.0"}}}

           :test {:extra-paths ["components/filesystem/test"
                                "bases/main/test"]}

           :poly {:main-opts ["-m" "polylith.clj.core.poly-cli.core"]
                  :extra-deps {polyfy/clj-poly {:mvn/version "0.3.32"}}}}}

Step 4 — Create the cli Project

poly create project name:cli

This creates:

projects/cli/
└── deps.edn

4a. Register the project alias in workspace.edn

Open workspace.edn and add a cli alias to :projects:

:projects {"development" {:alias "dev"}
           "cli"         {:alias "cli"}}

4b. Wire the bricks into the project

Edit projects/cli/deps.edn to include the filesystem component, the main base, the uberjar entry point, and the build alias:

{:deps {com.acme/filesystem {:local/root "components/filesystem"}
        com.acme/main    {:local/root "bases/main"}
        org.clojure/clojure {:mvn/version "1.12.0"}}

 :aliases {:test    {:extra-paths []
                     :extra-deps  {}}

           :uberjar {:main com.acme.main.core}}}

Step 5 — Add Build Support

The poly tool does not include a build command — it leaves artifact creation to your choice of tooling. We will use Clojure tools.build.

5a. Add the :build alias to the root deps.edn

Your final root ./deps.edn should look like this:

{:aliases {:dev {:extra-paths ["development/src"]
                 :extra-deps  {com.acme/filesystem {:local/root "components/filesystem"}
                               com.acme/main    {:local/root "bases/main"}
                               org.clojure/clojure {:mvn/version "1.12.0"}}}

           :test {:extra-paths ["components/filesystem/test"
                                "bases/main/test"]}

           :poly {:main-opts ["-m" "polylith.clj.core.poly-cli.core"]
                  :extra-deps {polyfy/clj-poly {:mvn/version "0.3.32"}}}

           :build {:deps {io.github.clojure/tools.build {:mvn/version "0.9.6"}}
                   :ns-default build}}}

5b. Create build.clj at the workspace root

Create the file build.clj under the workspace root:

(ns build
  (:require [clojure.tools.build.api :as b]
            [clojure.java.io :as io]))

(defn uberjar
  "Build an uberjar for a given project.
   Usage: clojure -T:build uberjar :project cli"
  [{:keys [project]}]
  (assert project "You must supply a :project name, e.g. :project cli")
  (let [project     (name project)
        project-dir (str "projects/" project)
        class-dir   (str project-dir "/target/classes")
        ;; Create the basis from the project's deps.edn.
        ;; tools.build resolves :local/root entries and collects all
        ;; transitive :paths (i.e. each brick's "src" and "resources").
        basis       (b/create-basis {:project (str project-dir "/deps.edn")})
        ;; Collect every source directory declared across all bricks.
        ;; basis :classpath-roots contains the resolved paths.
        src-dirs    (filterv #(.isDirectory (java.io.File. %))
                             (:classpath-roots basis))
        main-ns     (get-in basis [:aliases :uberjar :main])
        _           (assert main-ns
                             (str "Add ':uberjar {:main <ns>}' alias to "
                                  project-dir "/deps.edn"))
        jar-file    (str project-dir "/target/" project ".jar")]
    (println (str "Cleaning " class-dir "..."))
    (b/delete {:path class-dir})
    (io/make-parents jar-file)
    (println (str "Compiling " main-ns "..."))
    (b/compile-clj {:basis     basis
                    :src-dirs  src-dirs
                    :class-dir class-dir})
    (println (str "Building uberjar " jar-file "..."))
    (b/uber {:class-dir class-dir
             :uber-file jar-file
             :basis     basis
             :main      main-ns})
    (println "Uberjar is built.")))

Step 6 — Validate the Workspace

Run the poly info command to see the current state of your workspace:

poly info

You should see both bricks (filesystem and main) listed, along with the cli project. Then validate the workspace integrity:

poly check

This should print OK. If there are errors, the command will describe what to fix.


Step 7 — Build the Uberjar

From the workspace root:

clojure -T:build uberjar :project cli

Expected output:

Compiling com.acme.main.core...
Building uberjar projects/cli/target/cli.jar...
Uberjar is built.

Step 8 — Run the CLI

Create a test file and run the app:

echo "Hello from Polylith!" > /tmp/hello.txt
java -jar projects/cli/target/cli.jar /tmp/hello.txt

Expected output:

Hello from Polylith!

Test the missing-file error path:

java -jar projects/cli/target/cli.jar /tmp/nonexistent.txt

Expected output:

Error: file not found — /tmp/nonexistent.txt

Test the no-argument path:

java -jar projects/cli/target/cli.jar

Expected output:

Usage: cat <filename>

Final Workspace Layout

cat/
├── bases/
│   └── main/
│       ├── deps.edn
│       └── src/com/acme/main/
│           └── core.clj              ← -main, calls filesystem/read-file
├── components/
│   └── filesystem/
│       ├── deps.edn
│       └── src/com/acme/filesystem/
│           ├── interface.clj         ← public API (read-file)
│           └── core.clj              ← implementation
├── projects/
│   └── cli/
│       ├── deps.edn                  ← wires filesystem + main, :uberjar alias
│       └── target/
│           └── cli.jar               ← generated artifact
├── build.clj                         ← tools.build script
├── deps.edn                          ← dev + test + poly + build aliases
└── workspace.edn                     ← top-ns, project aliases, vcs config

Key Concepts Summary

  • Workspace is the monorepo root containing all bricks; in this project, cat/
  • Component is a reusable building block with a public interface ns, such as filesystem
  • Base is an entry-point brick that bridges the outside world to components, like main
  • Project is a deployable artifact configuration that assembles bricks, like the cli
  • Interface is the only namespace other bricks may import from a component, like com.acme.filesystem.interface

Useful poly Commands

poly info          # overview of bricks and projects
poly check         # validate workspace integrity
poly test          # run all tests affected by recent changes
poly deps          # show dependency graph
poly libs          # show library usage
poly shell         # interactive shell with autocomplete

Going Further

  • Add more components — e.g. poly create component name:parser for argument parsing
  • Add tests — on a component level, add tests against the interface and not the implementation. You can have additional tests for the implementation to test internal functions etc. but use a different test file.
  • Tag stable releases — git tag stable-main after a clean poly test
  • CI integration — run poly check and poly test in your pipeline; tag as stable on success
  • Multiple projects — add another project that reuses the same components
  • Visit the official Polylith project documentation
  • Read the Polylith book
  • Ask questions and connect with other Polylith users on #polylith Slack channel.


Simple over easy for operations

Building a workflow engine for infrastructure operations is not trivial. Most people start with a simple mental model: a desired state and a sequence of functions that produce side effects. In Clojure, this looks like a simple thread-first macro:

(-> {}
    fn1
    fn2
    ...)

Your state {} is threaded through fn1 and fn2. However, real-world operations are rarely linear. They require complex branching, error handling, and conditional jumps (e.g., “if success, continue; otherwise, jump to cleanup”).

Wiring the Engine

To handle non-linear flows, we associate functions with qualified keywords (steps). Together with the next step, they form the “wiring”. You can override sequential execution by providing a next-fn to handle custom branching.

The core execution loop looks like this:

(loop [step first-step
       opts opts]
  (let [[f next-step] (wire-fn step step-fns)
        new-opts (f opts)
        [next-step next-opts] (next-fn step next-step new-opts)]
    (if next-step
      (recur next-step next-opts)
      next-opts)))
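
To make the shape concrete, here is a self-contained toy version of this engine, where the "wiring" is a plain map from step keyword to [f next-step] (all names here are illustrative, not BigConfig's API):

```clojure
(defn run-workflow
  "Toy engine: step-fns maps a step keyword to [f next-step];
   next-fn can override the default next step for branching."
  [step-fns next-fn first-step opts]
  (loop [step first-step
         opts opts]
    (let [[f next-step] (get step-fns step)
          new-opts (f opts)
          [next-step next-opts] (next-fn step next-step new-opts)]
      (if next-step
        (recur next-step next-opts)
        next-opts))))

;; A linear flow with a conditional jump: if :fail? is set when ::work
;; finishes, jump to ::cleanup instead of ending directly.
(def step-fns
  {::init    [#(assoc % :n 1) ::work]
   ::work    [#(update % :n inc) ::end]
   ::cleanup [#(assoc % :cleaned? true) ::end]
   ::end     [identity nil]})

(defn branch [step next-step opts]
  (cond
    (= step ::end) [nil opts]
    (and (= step ::work) (:fail? opts)) [::cleanup opts]
    :else [next-step opts]))

(run-workflow step-fns branch ::init {})
;; => {:n 2}
```

The state map is threaded through every step exactly as in the thread-first version, but next-fn now decides where control goes next.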

Workflow Example

Here is how we use this engine to create a client-side lock for Terraform using Git tags. The opts map represents our “World State”, shared across all functions.

We invoke it like this: (lock [] {}). The first argument is a list of middleware-style step functions, and the second is the starting state.

(->workflow {:first-step ::generate-lock-id
             :wire-fn (fn [step _]
                        (case step
                          ::generate-lock-id [generate-lock-id ::delete-tag]
                          ::delete-tag [delete-tag ::create-tag]
                          ::create-tag [create-tag ::push-tag]
                          ::push-tag [push-tag ::get-remote-tag]
                          ::get-remote-tag [(comp get-remote-tag delete-tag) ::read-tag]
                          ::read-tag [read-tag ::check-tag]
                          ::check-tag [check-tag ::end]
                          ::end [identity]))
             :next-fn (fn [step next-step opts]
                        (case step
                          ::end [nil opts]
                          ::push-tag (choice {:on-success ::end
                                              :on-failure next-step
                                              :opts opts})
                          ::delete-tag [next-step opts]
                          (choice {:on-success next-step
                                   :on-failure ::end
                                   :opts opts})))})

Debugging Made Simple

In many CI/CD systems, debugging is a nightmare of “print” statements and re-running 10-minute pipelines. Because Clojure data structures are immutable and persistent, we can use a debug macro provided by BigConfig and a “spy” function to inspect the state at every step.

(comment
  (debug tap-values
         (create [(fn [f step opts]
                    (tap> [step opts]) ;; "Spy" on every state change
                    (f step opts))]
                 {::bc/env :repl
                  ::tools/tofu-opts (workflow/parse-args "render")
                  ::tools/ansible-opts (workflow/parse-args "render")})))

Using tap>, you get the result “frozen in time”. You can render templates and inspect them without ever executing a side effect.

Solving the Composability Problem: Nested Options

Operations often require calling the same sub-workflow multiple times. If every workflow uses the same top-level keys, they clash. We solve this with Nested Options.

By using the workflow’s namespace as a key, we isolate state. However, sometimes a child needs data from a sibling (e.g., Ansible needs an IP address generated by Terraform). We use an opts-fn to map these values explicitly at runtime.

The specialized ->workflow* constructor uses the following next-fn to manage state isolation:

(fn [step next-step {:keys [::bc/exit] :as opts}]
  (if (steps-set step)
    (do
      (swap! opts* merge (select-keys opts [::bc/exit ::bc/err]))
      (swap! opts* assoc step opts))
    (reset! opts* opts))
  (cond
    (= step ::end) [nil @opts*]
    (> exit 0) [::end @opts*] ;; Error handling jump
    :else
    [next-step (let [[new-opts opts-fn]
                     (get step->opts-and-opts-fn next-step [@opts* identity])]
                 (opts-fn new-opts))]))

This logic ensures that if a step is a sub-workflow, its internal state is captured within the parent’s state under its own key. The opts-fn allows us to bridge the gap—for instance, pulling a Terraform-generated IP address into the Ansible configuration dynamically.

The Working Directory and the Maven Diamond Problem

In operations, you must render configuration files before invoking tools. If you compose multiple workflows, you run into the “Maven Diamond Problem”: two different parent workflows sharing the same sub-workflow. To prevent them from overwriting each other’s files, we use dynamic, hashed prefixes for working directories:

.dist/default-f704ed4d/io/github/amiorin/alice/tools/ansible

The hash f704ed4d is dynamic. If a workflow is moved or re-composed, the hash changes, ensuring total isolation during template rendering.
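One way to derive such a prefix (a sketch under stated assumptions; BigConfig's actual scheme is not shown in this post) is to hash the workflow's composition path, so that re-composing the workflow changes the directory:

```clojure
(defn work-dir
  "Sketch: derive an isolated working directory from a workflow's
   composition path. The hashing scheme here is an assumption, not
   BigConfig's actual implementation."
  [root composition-path tool-path]
  ;; %08x renders Clojure's 32-bit hash as exactly 8 hex characters.
  (format "%s/default-%08x/%s" root (hash composition-path) tool-path))

(work-dir ".dist" [:alice :ansible] "io/github/amiorin/alice/tools/ansible")
;; returns a path shaped like ".dist/default-<8 hex chars>/io/github/amiorin/alice/tools/ansible"
```

Because the hash is a pure function of the composition path, two parents sharing the same sub-workflow render into different directories and cannot overwrite each other's files.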

Conclusion: Simple over Easy

Tools like AWS Step Functions, Temporal, or Restate are powerful workflow engines, but for many operational tasks, they are not a good fit. BigConfig has an edge because it is local and synchronous where it counts. It turns infrastructure into a local control loop orchestrating multiple tools.

In the industry, “Easy” (using the same language as the backend, like Go) often wins over “Simple”. But Go lacks a REPL, immutable data structures, and the ability to implement a debug macro that allows for instantaneous feedback.

Infrastructure eventually becomes a mess of “duct tape and prayers” when the underlying tools aren’t built for complexity. If you choose Simple over Easy, Clojure is the best language for operations—even if you’re learning Clojure for the first time.

Would you like to have a follow-up on this topic? What are your thoughts? I’d love to hear your experiences.


Clojure Deref (Mar 10, 2026)

Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem (feed: RSS).

Clojure Data Science Survey

Do you use Clojure for Data Science? Please take the survey. Your responses will help shape the future of the Noj toolkit and the Data Science ecosystem in Clojure.

State of Clojure Survey Results

The results of the 2025 State of Clojure Survey are now available. Thank you to everyone who participated!

Also, a big thanks to the many folks in the community who helped make the survey possible by providing feedback, suggesting questions, and recruiting others to participate.

Check out the video discussion of the results. It includes many topics, such as: where Clojure is being used around the world, what was surprising, the experience level of the community, who Clojure attracts, how Clojure fits in with other languages, and just how much developers love Clojure.

Clojure Dev Call

On February 10, the Clojure team hosted our first Clojure Dev Call!

Watch the recording to hear what the team has been working on and what’s on the horizon. Stick around until the end to hear the community Q&A.

Clojurists Together: Call for Proposals

Clojurists Together has opened the Q2 2026 funding round for open-source Clojure projects. Applications will be accepted through March 19th.

Read the announcement for more details.

Upcoming Events

Blogs, articles, and news

Libraries and Tools

Debut release

  • tools.deps.edn - Reader for deps.edn files

  • cream - Fast starting Clojure runtime built with GraalVM native-image + Crema

  • gloat - Glojure AOT Tool

  • patcho - Patching micro lib for Clojure

  • coll-tracker - Track which keys and indices of a deep data structure are accessed.

  • inst - Clojure time library that always returns a #inst.

  • r11y - CLI tool for extracting URLs as Markdown

  • leinpad - launchpad for leiningen

  • bb-depsolve - Generic monorepo dependency sync, upgrade & reporting for babashka/Clojure

  • sqlatom - Clojure library that stores atoms in a SQLite database

  • ruuter - A zero-dependency, runtime-agnostic router.

  • briefkasten - A mail client that can sync and index with Datahike and Scriptum (Lucene).

  • zsh-clj-shell - Clojure (Babashka) shell integration for Zsh

  • icehouse - Icehouse tabletop game

  • neanderthal-blas-like - BLAS-like Extensions for Neanderthal, Fast Clojure Matrix Library

  • avatar-maker - avidrucker/avatar-maker

  • icd11-export - Turtle export of ICD-11

  • mycelium - Mycelium uses Maestro state machines and Malli contracts to define "The Law of the Graph," providing a high-integrity environment where humans architect and AI agents implement.

  • hyper - Reactive server-rendered web framework for Clojure

  • awesome-clojure-llm - Concise, curated resources for working with the Clojure Programming and LLM base coding agents

  • stratum - Versioned, fast and scalable columnar database.

  • any - Objects for smart comparison in tests.

  • sankyuu-template-clj - A clojure project utilizing lwjgl + assimp + opengl + imgui to render glTF models and MMD models.

  • epupp - A web browser extension that lets you tamper with web pages, live and/or with userscripts.

  • clj-yfinance - Fetch prices, historical OHLCV, dividends, splits, earnings dates, fundamentals, analyst estimates and options from Yahoo Finance. Pure Clojure + built-in Java 11 HttpClient, no API key, no Python.

  • ecbjure - Access ECB financial data from Clojure — FX conversion, EURIBOR, €STR, HICP, and the full SDMX catalogue

  • brepl-opencode-plugin - brepl integration for OpenCode - automatic Clojure syntax validation, auto-fix brackets, and REPL evaluation.

  • lalinea - linear algebra with dtype-next tensors

  • superficie - Surface syntax for Clojure to help exposition/onboarding.

  • kaven - A Clojure API for interacting with Maven repositories

  • igor - Constraint Programming for Clojure

Updates

  • tools.deps 0.29.1598 - Deps as data and classpath generation

  • clojure_cli 1.12.4.1618 - Clojure CLI

  • core.cache 1.2.263 - A caching library for Clojure implementing various cache strategies

  • core.memoize 1.2.281 - A manipulable, pluggable, memoization framework for Clojure

  • tools.cli 1.4.256 - Command-line processing

  • pathling 0.2.1 - Utilities for scanning and updating data structures

  • scoped 0.1.16 - ScopedValue in Clojure, with fallback to ThreadLocal

  • dompa 1.2.3 - A zero-dependency, runtime-agnostic HTML parser and builder.

  • cljfmt 0.16.0 - A tool for formatting Clojure code

  • persistent-sorted-set 0.4.119 - Fast B-tree based persistent sorted set for Clojure/Script

  • pocket 0.2.4 - filesystem-based caching of expensive computations

  • contajners 1.0.8 - An idiomatic, data-driven, REPL friendly clojure client for OCI container engines

  • hive-mcp 0.13.0 - MCP server for hive-framework development. A memory and agentic coordination solution.

  • basic-tools-mcp 0.2.1 - Standalone babashka MCP server wrapping clojure-mcp-light — delimiter repair, nREPL eval, cljfmt formatting as IAddon tools

  • bb-mcp 0.4.0 - Lightweight MCP server in Babashka (~50MB vs ~500MB JVM)

  • clj-kondo-mcp 0.1.1 - Standalone MCP server for clj-kondo static analysis (Babashka + JVM)

  • lsp-mcp 0.2.1 - Clojure LSP analysis MCP server — standalone babashka or JVM addon for hive-mcp

  • scc-mcp 0.1.1 - Standalone MCP server for scc code metrics

  • qclojure-braket 0.3.0 - AWS Braket backend for QClojure

  • statecharts 1.3.0 - A Statechart library for CLJ(S)

  • fulcro 3.9.3 - A library for development of single-page full-stack web applications in clj/cljs

  • tableplot 1-beta16 - Easy layered graphics with Hanami & Tablecloth

  • noj 2-beta21 - A clojure framework for data science

  • cljd-video-player 1.3 - A reusable ClojureDart video player package with optional background audio service

  • fulcro-spec 3.2.8 - A library that wraps clojure.test for a better BDD testing experience.

  • drawbridge 0.3.0 - An HTTP/HTTPS nREPL transport, implemented as a Ring handler.

  • yggdrasil 0.2.20 - Git-like, causal space-time lattice abstraction over systems supporting this memory model.

  • hirundo 1.0.0-alpha211 - Helidon 4.x - RING clojure adapter

  • kit 2026-02-18 - Lightweight, modular framework for scalable web development in Clojure

  • ty 0.3.3 - Clojurescript WebComponents library

  • clojure-lsp 2026.02.20-16.08.58 - Clojure & ClojureScript Language Server (LSP) implementation

  • neanderthal 0.61.0 - Fast Clojure Matrix Library

  • diamond-onnxrt 0.24.0 - Fast Clojure Machine Learning Model Integration

  • splint 1.23.1 - A Clojure linter focused on style and code shape.

  • metamorph.ml 1.3.0 - Machine learning functions based on metamorph and machine learning pipelines

  • aws-simple-sign 2.3.1 - A Clojure library for pre-signing S3 URLs and signing HTTP requests for AWS.

  • clojurecuda 0.27.0 - Clojure library for CUDA development

  • nrepl 1.6.0 - A Clojure network REPL that provides a server and client, along with some common APIs of use to IDEs and other tools that may need to evaluate Clojure code in remote environments.

  • inf-clojure 3.4.0 - Basic interaction with a Clojure subprocess from Emacs

  • calva 2.0.563 - Clojure & ClojureScript Interactive Programming for VS Code

  • clay 2.0.12 - A REPL-friendly Clojure tool for notebooks and datavis

  • clj-media 3.0-alpha.3 - Read, write, and transform audio and video with Clojure.

  • pp 2026-03-01.107 - Peppy pretty-printer for Clojure data.

  • rewrite-clj 1.2.52 - Rewrite Clojure code and edn

  • portfolio 2026.03.1 - Component-driven development for Clojure

  • transit-java 1.1.401-alpha - transit-format implementation for Java

  • transit-clj 1.1.354-alpha - transit-format implementation for Clojure

  • babashka 1.12.216 - Native, fast starting Clojure interpreter for scripting

  • babashka-sql-pods 0.1.5 - Babashka pods for SQL databases

  • clojure-mode 5.22.0 - Emacs support for the Clojure(Script) programming language

  • datalevin 0.10.7 - A simple, fast and versatile Datalog database

  • ridley 1.8.0 - A turtle graphics-based 3D modeling tool for 3D printing. Write Clojure scripts, see real-time 3D preview, export STL. WebXR support for VR/AR visualization.

  • deps-new 0.11.1 - Create new projects for the Clojure CLI / deps.edn

  • malli 0.20.1 - High-performance data-driven data specification library for Clojure/Script.

  • instaparse-bb 0.0.7 - Use instaparse from babashka

  • clojure.jdbc 0.9.2 - JDBC library for Clojure

  • get-port 0.2.0 - Find available TCP ports for your Clojure apps and tests.

  • plumcp 0.2.0-beta2 - Clojure/ClojureScript library for making MCP server and client

  • kmono 4.11.1 - The missing workspace tool for clojure tools.deps projects

  • proletarian 1.0.115 - A durable job queuing and worker system for Clojure backed by PostgreSQL or MySQL.

  • monkeyci 0.24.2 - Next-generation CI/CD tool that uses the full power of Clojure!

  • hulunote 1.1.0 - An open-source outliner note-taking application with bidirectional linking.

  • beichte 0.2.6 - Static purity and effect analysis for Clojure.

  • reitit 0.10.1 - A fast data-driven routing library for Clojure/Script

  • thneed 1.1.8 - An eclectic set of Clojure utilities that I’ve found useful enough to keep around.

  • eca 0.112.0 - Editor Code Assistant (ECA) - AI pair programming capabilities agnostic of editor


Just What IS Clojure, Anyway?

Look at this line of code:

processCustomerOrder(customer, orderItems)

Any developer with six months of experience knows roughly what that does. The name is explicit, the structure is familiar, the intent is readable. Now look at this:

(reduce + (map f xs))

The reaction most developers have is immediate and unfavourable. Parentheses everywhere. No obvious structure. It looks less like a programming language and more like a typographer's accident. The old joke writes itself: LISP stands for Lost In Stupid Parentheses.

That joke is, technically, a backronym. John McCarthy named it LISP as a contraction of LISt Processing when he created it in 1958. The sardonic expansion came later, coined by programmers who had opinions about the aesthetic choices involved. Those opinions have not mellowed with time.

And yet Clojure – a modern descendant of Lisp – ranked as one of the highest-paying languages in the Stack Overflow Developer Survey for several consecutive years around 2019. Developers walked away from stable Java and C# positions to build production systems in it. A Brazilian fintech used it to serve tens of millions of customers. Something requires explaining.

The ancestry: Lisp reborn

Clojure only makes sense against the background of Lisp, and Lisp only makes sense as what it actually was: not merely a programming language, but a direct implementation of mathematical ideas about computation.

McCarthy's 1958 creation introduced concepts that took the rest of the industry decades to absorb. Garbage collection, conditional expressions, functional programming, symbolic computation – all present in Lisp before most working developers today were born. Many programmers encounter Lisp's descendants daily without being aware of it.

The defining feature is the S-expression:

(+ 1 2)

Everything is written as a list. This is not merely a syntactic preference. Because code and data share the same underlying structure, a Lisp program can manipulate other programs directly. This property – homoiconicity – is the technical foundation of Lisp macros: code that generates and transforms other code at compile time, with a flexibility that few conventional infix languages match. It is the reason serious Lisp practitioners regard the syntax not as a historical curiosity but as a genuine technical advantage.
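To make homoiconicity concrete, here is a small sketch: an expression is an ordinary list that can be inspected and evaluated, and a macro is an ordinary function over such lists, run at compile time. (The `unless` macro below is a textbook toy, not part of clojure.core.)

```clojure
;; Code is data: a quoted expression is just a list.
(def expr '(+ 1 2))
(first expr)  ;; => +
(eval expr)   ;; => 3

;; A macro receives its arguments as unevaluated code and returns new code.
(defmacro unless [test then]
  (list 'if test nil then))

(unless false :reached)  ;; expands to (if false nil :reached) => :reached
```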

Lisp also, however, developed a reputation for producing work that individual experts could write brilliantly and teams could not maintain at all. The tension between expressive power and collective readability never fully resolved. Clojure inherits this tradition knowingly, and is aware of the cost.

What Clojure actually is

Rich Hickey created Clojure in 2007. His central design decision was not to build a new runtime from scratch but to attach Lisp to an existing ecosystem.

Layer            Technology
---------------  ---------------
Runtime          JVM
Libraries        Java ecosystem
Language model   Lisp

This host strategy gave Clojure immediate access to decades of mature Java libraries without needing to rebuild any of them. A Clojure developer can call Java code directly. The same logic drove two later variants: ClojureScript, which compiles to JavaScript and found real traction in teams already working with React, and ClojureCLR, which runs on .NET. Rather than fight the unwinnable battle of building its own ecosystem from scratch, Clojure attached itself to three of the largest ones that already existed.

Clojure does not attempt to displace existing ecosystems. It operates inside them.

Central to how Clojure development actually works is the REPL – Read–Eval–Print Loop. Rather than the standard write–compile–run–crash cycle, developers send code fragments to a running system and modify it live. Functions are redefined while the application continues executing. For experienced practitioners this is a material productivity difference: the feedback loop is short, and the distance between an idea and a tested result is small. Experienced Clojure developers report unusually low defect rates, a claim that is plausible given the constraints immutability places on the ways a programme can fail.
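A toy illustration of that loop: because a running system calls functions through vars, redefining a function takes effect immediately, without a restart.

```clojure
;; First definition, evaluated in a live session:
(defn greet [name] (str "Hello, " name))
(greet "Ada")  ;; => "Hello, Ada"

;; Requirements change; redefine in place while the app keeps running:
(defn greet [name] (str "Hej, " name "!"))
(greet "Ada")  ;; => "Hej, Ada!"
```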

The Hickey doctrine: simple versus easy

Hickey's 2011 Strange Loop talk Simple Made Easy is the philosophical engine behind every design choice in Clojure. It draws a distinction that most language design ignores.

Term     Meaning
-------  ----------------------------------------
Easy     Familiar; close to what you already know
Simple   Not intertwined; concerns kept separate

Most languages pursue easy. They aim to resemble natural language, minimise cognitive friction at the point of learning, and reduce the effort required to write the first working programme. This also means that languages favoured by human readers tend to be the hardest to write parsers and compilers for.

Clojure instead pursues simple. Its goal is to minimise tangled interdependencies in the resulting system, even at the cost of an unfamiliar surface. Writing parsers for Lisps is comparatively straightforward, at the cost of human readability.

Hickey's specific target is what he calls place-oriented programming: the treatment of variables as named locations in memory whose values change over time – mutability, in more formal terms. His argument is that conflating a value with a location generates incidental complexity at scale, particularly in concurrent systems. When you cannot be certain what a variable contains at a given moment, reasoning about a programme becomes difficult in proportion to the programme's size.

The design of Clojure follows directly from this diagnosis. Immutable data, functional composition, minimal syntax, and data structures in place of object hierarchies are all consequences of the same underlying position. The language may not feel easy. The resulting systems are intended to be genuinely simpler to reason about.

The real innovation: data and immutability

Clojure's core model is data-oriented. Rather than building class hierarchies, programmes pass simple structures through functions:

(assoc {:name "Alice" :age 30} :city "London")

This creates a new map. The original is untouched. That is the default behaviour across all of Clojure's data structures – values do not change; new versions are produced instead.

This is made practical by persistent data structures, which use structural sharing. When a new version of a data structure is produced, it shares most of its internal memory with the previous version rather than copying it entirely. The comparison that makes this intuitive for most developers: Git does not delete your previous commits when you push a new one. It stores only the difference, referencing unchanged content from before. Clojure applies the same principle to in-memory data.
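A short session makes both points visible: the original value is untouched, and the new version shares structure with it rather than copying.

```clojure
(def alice {:name "Alice" :age 30})
(def alice-in-london (assoc alice :city "London"))

alice            ;; => {:name "Alice", :age 30}                 original unchanged
alice-in-london  ;; => {:name "Alice", :age 30, :city "London"}

;; Structural sharing: unchanged parts are the same objects in memory.
(identical? (:name alice) (:name alice-in-london))  ;; => true
```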

The consequence for concurrency results directly from this. Race conditions require mutable shared state. If data cannot be mutated, the precondition for the most common class of concurrency bug does not exist. This was Clojure's most compelling practical argument during the multicore boom of the 2010s, when writing correct concurrent code had become a routine industrial concern rather than a specialist one. Clojure let developers eliminate that entire class of problem.
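The point can be demonstrated in a few lines: many threads reading the same immutable value need no locks, because there is nothing for them to race on.

```clojure
;; 100 threads sum the same vector concurrently. With no mutation,
;; there is no shared mutable state and no locks are needed.
(def xs (vec (range 1000)))

(->> (repeatedly 100 #(future (reduce + xs)))
     doall            ;; launch all futures first
     (map deref)      ;; then collect their results
     distinct)
;; => (499500) — every thread sees the same value
```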

The functional programming wave – and why easy beat rigorous

Between roughly 2012 and 2020, functional programming moved from academic discussion to genuine industry interest. The drivers were concrete: multicore processors created pressure to write concurrent code correctly; distributed data systems required reasoning about transformation pipelines rather than mutable state; and the sheer complexity of large-scale software made the promise of mathematical rigour appealing.

Clojure was among the most visible representatives of this movement, alongside Haskell, Scala, and F#. Conference talks filled. Engineering blogs ran long series on immutability and monads. For a period it seemed plausible that functional languages might displace the mainstream ones.

What actually happened was different. Mainstream languages absorbed the useful ideas and continued. And the majority of working programmers, it turned out, rarely needed to reason about threading and concurrency at all.

Java gained streams and lambdas in Java 8. JavaScript acquired map, filter, and reduce as first-class patterns, and React popularised unidirectional data flow. C# extended its functional capabilities across successive versions. Rust built immutability and ownership into its type system from the outset. The industry did not convert to functional programming – it extracted what it needed and kept the syntax it already knew.

A developer who can obtain most of functional programming's benefits inside a language they already know will rarely conclude that switching entirely is justified.

The deeper reason functional languages lost the mainstream argument is not technical. It is sociological. Python won because it is, in the most precise sense, the Visual Basic of the current era. That comparison is not an insult – Visual Basic dominated the 1990s because it made programming accessible to people who had no intention of becoming professional developers, and that accessibility produced an enormous, self-reinforcing community. Python did exactly the same thing for data scientists, academics, hobbyists, and beginners, and for precisely the same reason: it is easy to learn, forgiving of error, and immediately rewarding to write. Network effects took care of the rest. Libraries multiplied. Courses proliferated. Employers specified it. The ecosystem became self-sustaining.

Clojure is the antithesis of this process. It is a language for connoisseurs – genuinely, not dismissively. Its internal consistency is elegant, its theoretical foundations are sound, and developers who master it frequently describe it with something approaching aesthetic appreciation. Mathematical beauty, however, has never been a reliable route to mass adoption. Narrow appeal does not generate network effects. And Clojure, by design, operates as something of a lone wolf: it rides atop the JVM rather than integrating natively with the broader currents of modern computing – the web-first tooling, the AI infrastructure, the vast collaborative ecosystems built around Python and JavaScript. At a moment when the decisive advantages in software development come from connectivity, interoperability, and the accumulated weight of shared tooling, a language that demands a clean break from everything a developer already knows is swimming directly against the tide.

Compare this with Kotlin or TypeScript, both of which succeeded in part because they offered a graduated path. A developer new to Kotlin can write essentially Java-style code and improve incrementally. A developer new to TypeScript can begin with plain JavaScript and add types as confidence grows. Both languages have, in effect, a beginner mode. Clojure has no such thing. You either think in Lisp or you do not write Clojure at all.

Where Clojure succeeded

Despite remaining a specialist language, Clojure has real industrial presence.

The most prominent example is Nubank, a Brazilian fintech that reached a valuation of approximately $45 billion at its NYSE listing in December 2021. Nubank runs significant portions of its backend in Clojure, and in 2020 acquired Cognitect – the company that stewards the language. That acquisition was considerably more than a gesture; it was a statement of long-term commitment from an organisation operating at scale.

ClojureScript found parallel influence in the JavaScript ecosystem. The Reagent and re-frame frameworks attracted serious production use, demonstrating that the Clojure model could be applied to front-end development at scale and not merely to backend data pipelines.

The pattern that emerges from successful Clojure deployments is consistent: small, experienced teams working on data-intensive systems where correctness and concurrency matter more than onboarding speed. That is a narrow niche. It was also, not coincidentally, a well-paid one – for a time.

Verdict: the ideas won

Clojure did not become a mainstream language. By any measure of adoption – survey rankings, job advertisements, GitHub repositories – it remains firmly in specialist territory. Even F#, a functional rival with the full weight of Microsoft's backing, has not broken through.

But the argument Clojure made in 2007 has largely been vindicated. Immutability is now a design principle in Rust, Swift, and Kotlin. Functional composition is standard across modern JavaScript and C#. Data-oriented design has become an explicit architectural pattern in game development and systems programming. The industry did not adopt Clojure, but it has been grateful for Hickey's ideas and has quietly absorbed them.

What did not transfer was the syntax – and behind the syntax lay an economic problem that no philosophical vindication could resolve.

A CTO evaluating a language does not ask only whether it is technically sound. The questions are: how large is the available talent pool? How long does onboarding take? What happens when a key developer leaves? Clojure's answers to all three were uncomfortable.

There is a further cost that rarely appears in language comparisons. A developer with ten years of experience in Java, C#, or Python carries genuine accumulated capital: hard-won familiarity with idioms, libraries, failure modes, and tooling. Switching to a Lisp-derived language does not extend that knowledge – it resets it. Clojure keeps the JVM underneath but discards almost everything a developer has learned about how to structure solutions idiomatically. The ten-year veteran spends their first six months feeling like a junior again. Recursion replaces loops. Immutable pipelines replace stateful objects. The mental models that took years to build are, at best, partially transferable. That cost is real and largely invisible in adoption discussions, and it falls on precisely the experienced developers an organisation most wants to retain. Knowledge compounds most effectively when it is built upon incrementally. Clojure does not permit that. It demands a clean break, and most organisations and most developers are not willing to pay that price.

The high wages Clojure commanded were not, from a management perspective, a straightforward mark of quality. They were also a warning of risk. They reflected something less flattering than productivity: the classic dynamic of the expert who becomes indispensable by writing systems that only they can maintain. At its worst this approaches a form of institutional capture – a codebase so entangled with one person's idiom that replacing them becomes prohibitively expensive, something uncomfortably close to ransomware in its commercial effect.

That position has been further undermined by the rise of agentic coding tools. The practical value of writing in a mainstream language has quietly increased, because AI coding assistants are trained on the accumulated body of code that exists – and that body is overwhelmingly Python, JavaScript, Java, and C#. The effect is concrete: ask a capable model to produce a complex data transformation in Python and it draws on an enormous foundation of high-quality examples. Ask it to do the same in idiomatic Clojure and the results are less reliable, the suggestions thinner, the tooling shallower. A language's effective learnability in 2026 is no longer a matter only of human cognition; it is also a function of training density. Niche languages are niche in the training data too, and that gap compounds. The expert moat – already questionable on organisational grounds – is being drained from two directions at once.

Clojure's ideas spread quietly through the languages that absorbed them and left the parentheses behind. Its practitioners, once among the best-paid developers in the industry, now find that the scarcity premium they commanded rested partly on barriers that no longer hold.

The language was right about the future of programming. It simply will not be present when that future arrives.

So, just what is Clojure, anyway? It is a language that was correct about the most important questions in software design, arrived a decade before the industry was ready to hear the answers, and expressed those answers in a notation the industry was never willing to learn. That is not a small thing. It is also not enough.

This article is part of an ongoing series examining what programming languages actually are and why they matter.

Language   Argument
---------  ------------------------------
C          The irreplaceable foundation
Python     The approachable language
Rust       Safe systems programming
Clojure    Powerful ideas, niche language

Coming next: Zig, Odin, and Nim – three languages that think C's job could be done better, and have very different ideas about how.


Copyright © 2009, Planet Clojure. No rights reserved.
Planet Clojure is maintained by Baishampayan Ghose.
Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
Theme by Brajeshwar.