Deploying a Full-Stack Clojure Web App to Fly.io
A lightweight way to build, package, and ship a small Clojure project
This post walks through a small web development project using Clojure, covering everything from building the app to packaging and deploying it. It’s a collection of insights and tips I’ve learned from building my Clojure side projects but presented in a more structured format.
As the title suggests, we’ll be deploying the app to Fly.io. It’s a service that allows you to deploy apps packaged as Docker images on lightweight virtual machines.[1] My experience with it has been good: it’s easy to use and quick to set up. One downside of Fly is that it doesn’t have a free tier, but if you don’t plan on leaving the app deployed, it barely costs anything.
This isn’t a tutorial on Clojure, so I’ll assume you already have some familiarity with the language as well as some of its libraries.[2]
In this post, we’ll be building a barebones bookmarks manager as the demo app. Users can log in using basic authentication, view all bookmarks, and create a new bookmark. It’ll be a traditional multi-page web app, and the data will be stored in a SQLite database.
Here’s an overview of the project’s starting directory structure:
.
├── dev
│ └── user.clj
├── resources
│ └── config.edn
├── src
│ └── acme
│ └── main.clj
└── deps.edn
And here are the libraries we’re going to use. If you have some Clojure experience or have used Kit, you’re probably already familiar with all the libraries listed below.[3]
{:paths ["src" "resources"]
:deps {org.clojure/clojure {:mvn/version "1.12.0"}
aero/aero {:mvn/version "1.1.6"}
integrant/integrant {:mvn/version "0.11.0"}
ring/ring-jetty-adapter {:mvn/version "1.12.2"}
metosin/reitit-ring {:mvn/version "0.7.2"}
com.github.seancorfield/next.jdbc {:mvn/version "1.3.939"}
org.xerial/sqlite-jdbc {:mvn/version "3.46.1.0"}
hiccup/hiccup {:mvn/version "2.0.0-RC3"}}
:aliases
{:dev {:extra-paths ["dev"]
:extra-deps {nrepl/nrepl {:mvn/version "1.3.0"}
integrant/repl {:mvn/version "0.3.3"}}
:main-opts ["-m" "nrepl.cmdline" "--interactive" "--color"]}}}
I use Aero and Integrant for my system configuration (more on this in the next section), Ring with the Jetty adapter for the web server, Reitit for routing, next.jdbc for database interaction, and Hiccup for rendering HTML. From what I’ve seen, this is a popular “library combination” for building web apps in Clojure.[4]
The user namespace in dev/user.clj contains helper functions from Integrant-repl to start, stop, and restart the Integrant system.
(ns user
(:require
[acme.main :as main]
[clojure.tools.namespace.repl :as repl]
[integrant.core :as ig]
[integrant.repl :refer [set-prep! go halt reset reset-all]]))
(set-prep!
(fn []
(ig/expand (main/read-config)))) ;; we'll implement this soon
(repl/set-refresh-dirs "src" "resources")
(comment
(go)
(halt)
(reset)
(reset-all))
If you’re new to Integrant or other dependency injection libraries like Component, I’d suggest reading “How to Structure a Clojure Web”. It’s a great explanation of the reasoning behind these libraries. Like most Clojure apps that use Aero and Integrant, my system configuration lives in a .edn file. I usually name mine resources/config.edn. Here’s what it looks like:
{:server
{:port #long #or [#env PORT 8080]
:host #or [#env HOST "0.0.0.0"]
:auth {:username #or [#env AUTH_USER "john.doe@email.com"]
:password #or [#env AUTH_PASSWORD "password"]}}
:database
{:dbtype "sqlite"
:dbname #or [#env DB_DATABASE "database.db"]}}
In production, most of these values will be set using environment variables. During local development, the app will use the hard-coded default values. We don’t have any sensitive values in our config (e.g., API keys), so it’s fine to commit this file to version control. If there are such values, I usually put them in another file that’s not tracked by version control and include them in the config file using Aero’s #include reader tag.
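As a sketch of what that can look like (this secrets.edn file and its keys are hypothetical; Aero resolves #include paths relative to the including file):

;; resources/config.edn
{:server
 {:auth #include "secrets.edn"}}

;; resources/secrets.edn -- kept out of version control
{:username "john.doe@email.com"
 :password "password"}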
This config file is then “expanded” into the Integrant system map using the expand-key method:
(ns acme.main
(:require
[aero.core :as aero]
[clojure.java.io :as io]
[integrant.core :as ig]))
(defn read-config
[]
{:system/config (aero/read-config (io/resource "config.edn"))})
(defmethod ig/expand-key :system/config
[_ opts]
(let [{:keys [server database]} opts]
{:server/jetty (assoc server :handler (ig/ref :handler/ring))
:handler/ring {:database (ig/ref :database/sql)
:auth (:auth server)}
:database/sql database}))
The system map is created in code instead of being in the configuration file. This makes refactoring your system simpler as you only need to change this method while leaving the config file (mostly) untouched.[5]
My current approach to Integrant + Aero config files is mostly inspired by the blog post “Rethinking Config with Aero & Integrant” and Laravel’s configuration. The config file follows a similar structure to Laravel’s config files and contains the app configurations without describing the structure of the system. Previously, I had a key for each Integrant component, which led to the config file being littered with #ig/ref and more difficult to refactor.
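To make the expansion concrete: with the defaults above, read-config plus expand-key produce a system map roughly equivalent to this (shown as the code that would build it; the auth map is elided):

{:server/jetty {:port 8080
                :host "0.0.0.0"
                :auth {...}
                :handler (ig/ref :handler/ring)}
 :handler/ring {:database (ig/ref :database/sql)
                :auth {...}}
 :database/sql {:dbtype "sqlite"
                :dbname "database.db"}}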
Also, if you haven’t already, start a REPL and connect to it from your editor. Run clj -M:dev if your editor doesn’t automatically start a REPL. Next, we’ll implement the init-key and halt-key! methods for each of the components:
;; src/acme/main.clj
(ns acme.main
  (:require
   ;; ...
   [acme.handler :as handler]
   [acme.util :as util]
   [next.jdbc :as jdbc]
   [ring.adapter.jetty :as jetty]))
;; ...
(defmethod ig/init-key :server/jetty
[_ opts]
(let [{:keys [handler port]} opts
jetty-opts (-> opts (dissoc :handler :auth) (assoc :join? false))
server (jetty/run-jetty handler jetty-opts)]
(println "Server started on port " port)
server))
(defmethod ig/halt-key! :server/jetty
[_ server]
(.stop server))
(defmethod ig/init-key :handler/ring
[_ opts]
(handler/handler opts))
(defmethod ig/init-key :database/sql
[_ opts]
(let [datasource (jdbc/get-datasource opts)]
(util/setup-db datasource)
datasource))
The setup-db function creates the required tables in the database if they don’t exist yet. This works fine for database migrations in small projects like this demo app, but for larger projects, consider using libraries such as Migratus (my preferred library) or Ragtime.
(ns acme.util
(:require
[next.jdbc :as jdbc]))
(defn setup-db
[db]
(jdbc/execute-one!
db
["create table if not exists bookmarks (
bookmark_id text primary key not null,
url text not null,
created_at datetime default (unixepoch()) not null
)"]))
For the server handler, let’s start with a simple function that returns a “hi world” string.
(ns acme.handler
(:require
[ring.util.response :as res]))
(defn handler
[_opts]
(fn [req]
(res/response "hi world")))
Now all the components are implemented. We can check if the system is working properly by evaluating (reset) in the user namespace. This will reload your files and restart the system. You should see this message printed in your REPL:
:reloading (acme.util acme.handler acme.main)
Server started on port 8080
:resumed
If we send a request to http://localhost:8080/, we should get “hi world” as the response:
$
curl localhost:8080/
# hi world
Nice! The system is working correctly. In the next section, we’ll implement routing and our business logic handlers.
First, let’s set up a ring handler and router using Reitit. We only have one route, the index / route, which will handle both GET and POST requests.
(ns acme.handler
  (:require
   [reitit.ring :as ring]))

(declare index-page index-action) ;; implemented later in this post

(def routes
  [["/" {:get index-page
         :post index-action}]])
(defn handler
[opts]
(ring/ring-handler
(ring/router routes)
(ring/routes
(ring/redirect-trailing-slash-handler)
(ring/create-resource-handler {:path "/"})
(ring/create-default-handler))))
We’re including some useful extra handlers:

- redirect-trailing-slash-handler to resolve routes with trailing slashes,
- create-resource-handler to serve static files, and
- create-default-handler to handle common 40x responses.

If you remember the :handler/ring component from earlier, you’ll notice that it has two dependencies, database and auth. Currently, they’re inaccessible to our route handlers. To fix this, we can inject these components into the Ring request map using a middleware function.
;; ...
(defn components-middleware
[components]
(let [{:keys [database auth]} components]
(fn [handler]
(fn [req]
(handler (assoc req
:db database
:auth auth))))))
;; ...
The components-middleware function takes in a map of components and creates a middleware function that “assocs” each component into the request map.[6] If you have more components, such as a Redis cache or a mail service, you can add them here.
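Downstream handlers can then read these components straight off the request map. A hypothetical handler (not part of the demo app, and assuming ring.util.response is required as res) could look like this:

(defn whoami
  [req]
  ;; :auth was assoc'ed into the request by components-middleware
  (res/response (str "Logged in as " (get-in req [:auth :username]))))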
We’ll also need a middleware to handle HTTP basic authentication.[7] This middleware will check if the username and password from the request map match the values in the auth map injected by components-middleware. If they match, then the request is authenticated and the user can view the site.
(ns acme.handler
(:require
;; ...
[acme.util :as util]
[ring.util.response :as res]))
;; ...
(defn wrap-basic-auth
[handler]
(fn [req]
(let [{:keys [headers auth]} req
{:keys [username password]} auth
authorization (get headers "authorization")
correct-creds (str "Basic " (util/base64-encode
(format "%s:%s" username password)))]
(if (and authorization (= correct-creds authorization))
(handler req)
(-> (res/response "Access Denied")
(res/status 401)
(res/header "WWW-Authenticate" "Basic realm=protected"))))))
;; ...
A nice feature of Clojure is that interop with the host language is easy. The base64-encode function is just a thin wrapper over Java’s Base64.Encoder:
(ns acme.util
;; ...
(:import java.util.Base64))
(defn base64-encode
[s]
(.encodeToString (Base64/getEncoder) (.getBytes s)))
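Evaluating it in the REPL with the default credentials shows the string the middleware will expect after "Basic " in the Authorization header:

(base64-encode "john.doe@email.com:password")
;; => "am9obi5kb2VAZW1haWwuY29tOnBhc3N3b3Jk"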
Finally, we need to add them to the router. Since we’ll be handling form requests later, we’ll also bring in Ring’s wrap-params middleware.
(ns acme.handler
(:require
;; ...
[ring.middleware.params :refer [wrap-params]]))
;; ...
(defn handler
[opts]
(ring/ring-handler
;; ...
{:middleware [(components-middleware opts)
wrap-basic-auth
wrap-params]}))
We now have everything we need to implement the route handlers, or the business logic of the app. First, we’ll implement the index-page function, which renders a page with a form for adding a new bookmark and a list of all saved bookmarks:
(ns acme.handler
(:require
;; ...
[next.jdbc :as jdbc]
[next.jdbc.sql :as sql]))
;; ...
(defn template
[bookmarks]
[:html
[:head
[:meta {:charset "utf-8"}]
[:meta {:name "viewport"
:content "width=device-width, initial-scale=1.0"}]]
[:body
[:h1 "bookmarks"]
[:form {:method "POST"}
[:div
[:label {:for "url"} "url "]
[:input#url {:name "url"
:type "url"
:required true
:placeholder "https://en.wikipedia.org/"}]]
[:button "submit"]]
[:p "your bookmarks:"]
[:ul
(if (empty? bookmarks)
[:li "you don't have any bookmarks"]
(map
(fn [{:keys [url]}]
[:li
[:a {:href url} url]])
bookmarks))]]])
(defn index-page
[req]
(try
(let [bookmarks (sql/query (:db req)
["select * from bookmarks"]
jdbc/unqualified-snake-kebab-opts)]
(util/render (template bookmarks)))
(catch Exception e
(util/server-error e))))
;; ...
Database queries can sometimes throw exceptions, so it’s good to wrap them in a try-catch block. I’ll also introduce some helper functions:
(ns acme.util
(:require
;; ...
[hiccup2.core :as h]
[ring.util.response :as res])
(:import java.util.Base64))
;; ...
(defn prepend-doctype
  [s]
  (str "<!doctype html>" s))

(defn render
  [hiccup]
  (-> hiccup
      h/html
      str
      prepend-doctype
      res/response
      (res/content-type "text/html")))
(defn server-error
[e]
(println "Caught exception: " e)
(-> (res/response "Internal server error")
(res/status 500)))
render takes a hiccup form and turns it into a ring response, while server-error takes an exception, logs it, and returns a 500 response.
Next, we’ll implement the index-action function:
;; ...
(defn index-action
[req]
(try
(let [{:keys [db form-params]} req
value (get form-params "url")]
(sql/insert! db :bookmarks {:bookmark_id (random-uuid) :url value})
(res/redirect "/" 303))
(catch Exception e
(util/server-error e))))
;; ...
This is an implementation of a typical post/redirect/get pattern. We get the value from the URL form field, insert a new row in the database with that value, and redirect back to the index page. Again, we’re using a try-catch block to handle possible exceptions from the database query.
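You can also exercise this flow from the command line. Something like this should work, using curl’s -u flag to add the basic auth header with the default credentials from config.edn:

$
curl -u john.doe@email.com:password -d url=https://clojure.org -i localhost:8080/
# HTTP/1.1 303 See Other
# Location: /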
That should be all of the code for the controllers. If you reload your REPL and go to http://localhost:8080, you should see something that looks like this after logging in:
The last thing we need to do is to update the main function to start the system:
;; ...
(defn -main [& _]
(-> (read-config) ig/expand ig/init))
Now, you should be able to run the app using clj -M -m acme.main. That’s all the code needed for the app. In the next section, we’ll package the app into a Docker image to deploy to Fly.
While there are many ways to package a Clojure app, Fly.io specifically requires a Docker image. There are two approaches to doing this:

- Build an uberjar, then run it with a JVM inside the image.
- Run the app directly from source inside the image using the Clojure CLI.

Both are valid approaches. I prefer the first since its only dependency is the JVM. We’ll use the tools.build library to build the uberjar. Check out the official guide for more information on building Clojure programs. Since tools.build is a library, we can add it to our deps.edn file with an alias:
{;; ...
:aliases
{;; ...
:build {:extra-deps {io.github.clojure/tools.build
{:git/tag "v0.10.5" :git/sha "2a21b7a"}}
:ns-default build}}}
Tools.build expects a build.clj file in the root of the project directory, so we’ll need to create that file. This file contains the instructions to build artefacts, which in our case is a single uberjar. There are many great examples of build.clj files on the web, including from the official documentation. For now, you can copy and paste this file into your project.
(ns build
(:require
[clojure.tools.build.api :as b]))
(def basis (delay (b/create-basis {:project "deps.edn"})))
(def src-dirs ["src" "resources"])
(def class-dir "target/classes")
(defn uber
[_]
(println "Cleaning build directory...")
(b/delete {:path "target"})
(println "Copying files...")
(b/copy-dir {:src-dirs src-dirs
:target-dir class-dir})
(println "Compiling Clojure...")
(b/compile-clj {:basis @basis
:ns-compile '[acme.main]
:class-dir class-dir})
(println "Building Uberjar...")
(b/uber {:basis @basis
:class-dir class-dir
:uber-file "target/standalone.jar"
:main 'acme.main}))
To build the project, run clj -T:build uber. This will create the uberjar standalone.jar in the target directory. The uber in clj -T:build uber refers to the uber function from build.clj. Since the build system is a Clojure program, you can customise it however you like. If we try to run the uberjar now, we’ll get an error:
# build the uberjar
$
clj -T:build uber
# Cleaning build directory...
# Copying files...
# Compiling Clojure...
# Building Uberjar...
# run the uberjar
$
java -jar target/standalone.jar
# Error: Could not find or load main class acme.main
# Caused by: java.lang.ClassNotFoundException: acme.main
This error occurred because the Main class that is required by Java isn’t built. To fix this, we need to add the :gen-class directive in our main namespace. This will instruct Clojure to create the Main class from the -main function.
(ns acme.main
;; ...
(:gen-class))
;; ...
If you rebuild the project and run java -jar target/standalone.jar again, it should work perfectly. Now that we have a working build script, we can write the Dockerfile:
# install additional dependencies here in the base layer
# separate base from build layer so any additional deps installed are cached
FROM clojure:temurin-21-tools-deps-bookworm-slim AS base
FROM base AS build
WORKDIR /opt
COPY . .
RUN clj -T:build uber
FROM eclipse-temurin:21-alpine AS prod
COPY --from=build /opt/target/standalone.jar /
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "standalone.jar"]
It’s a multi-stage Dockerfile. We use the official Clojure Docker image as the layer to build the uberjar. Once it’s built, we copy it to a smaller Docker image that only contains the Java runtime.[8] By doing this, we get a smaller container image as well as a faster Docker build time because the layers are better cached.
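Before involving Fly, you can sanity-check the image locally, assuming you have Docker installed (the image tag here is arbitrary):

$
docker build -t acme .
$
docker run --rm -p 8080:8080 acme
# Server started on port 8080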
That should be all for packaging the app. We can move on to the deployment now.
First things first, you’ll need to install flyctl, Fly’s CLI tool for interacting with their platform. Create a Fly.io account if you haven’t already. Then run fly auth login to authenticate flyctl with your account.
Next, we’ll need to create a new Fly App:
$
fly app create
# ? Choose an app name (leave blank to generate one):
# automatically selected personal organization: Ryan Martin
# New app created: blue-water-6489
Another way to do this is with the fly launch command, which automates a lot of the app configuration for you. We have some steps to do that are not covered by fly launch, so we’ll be configuring the app manually. I also already have a fly.toml file ready that you can copy straight into your project.
# replace these with your app and region name
# run `fly platform regions` to get a list of regions
app = 'blue-water-6489'
primary_region = 'sin'
[env]
DB_DATABASE = "/data/database.db"
[http_service]
internal_port = 8080
force_https = true
auto_stop_machines = "stop"
auto_start_machines = true
min_machines_running = 0
[mounts]
source = "data"
destination = "/data"
initial_size = 1
[[vm]]
size = "shared-cpu-1x"
memory = "512mb"
cpus = 1
cpu_kind = "shared"
These are mostly the default configuration values with some additions. Under the [env] section, we’re setting the SQLite database location to /data/database.db. The database.db file itself will be stored in a persistent Fly Volume mounted on the /data directory, which is specified under the [mounts] section. Fly Volumes are similar to regular Docker volumes but are designed for Fly’s micro VMs.
We’ll need to set the AUTH_USER and AUTH_PASSWORD environment variables too, but not through the fly.toml file, as these are sensitive values. To securely set these credentials with Fly, we can set them as app secrets. They’re stored encrypted and will be automatically injected into the app at boot time.
$
fly secrets set AUTH_USER=hi@ryanmartin.me AUTH_PASSWORD=not-so-secure-password
# Secrets are staged for the first deployment
With this, the configuration is done and we can deploy the app using fly deploy:
$
fly deploy
# ...
# Checking DNS configuration for blue-water-6489.fly.dev
# Visit your newly deployed app at https://blue-water-6489.fly.dev/
The first deployment will take longer since it’s building the Docker image for the first time. Subsequent deployments should be faster due to the cached image layers. You can click on the link to view the deployed app, or you can also run fly open, which will do the same thing. Here’s the app in action:
If you made additional changes to the app or fly.toml, you can redeploy the app using the same command, fly deploy. The app is configured to auto stop/start, which helps to cut costs when there’s not a lot of traffic to the site. If you want to take down the deployment, you’ll need to delete the app itself using fly app destroy <your app name>.
This is an interesting topic in the Clojure community, with varying opinions on whether or not it’s a good idea. Personally I find having a REPL connected to the live app helpful, and I often use it for debugging and running queries on the live database.[9] Since we’re using SQLite, we don’t have a database server we can directly connect to, unlike Postgres or MySQL.
If you’re brave, you can even restart the app directly from the REPL without redeploying. It’s easy to get this wrong, which is why some prefer not to use it.
For this project, we’re gonna add a socket REPL. It’s very simple to add (you just need to set a JVM option), and unlike nREPL, it doesn’t require additional dependencies. Let’s update the Dockerfile:
# ...
EXPOSE 7888
ENTRYPOINT ["java", "-Dclojure.server.repl={:port 7888 :accept clojure.core.server/repl}", "-jar", "standalone.jar"]
The socket REPL will be listening on port 7888. If we redeploy the app now, the REPL will be started, but we won’t be able to connect to it. That’s because we haven’t exposed the service through the Fly proxy. We can do this by adding the socket REPL as a service in the [services] section of fly.toml.
However, doing this will also expose the REPL port to the public. This means that anyone can connect to your REPL and possibly mess with your app. Instead, what we want to do is to configure the socket REPL as a private service.
By default, all Fly apps in your organisation live in the same private network. This private network, called 6PN, connects the apps in your organisation through Wireguard tunnels (a VPN) using IPv6. Fly private services aren’t exposed to the public internet but can be reached from this private network. We can then use Wireguard to connect to this private network to reach our socket REPL.
Fly VMs are also configured with the hostname fly-local-6pn, which maps to its 6PN address. This is analogous to localhost, which points to your loopback address 127.0.0.1. To expose a service to 6PN, all we have to do is bind or serve it to fly-local-6pn instead of the usual 0.0.0.0. We have to update the socket REPL options to:
# ...
ENTRYPOINT ["java", "-Dclojure.server.repl={:port 7888,:address \"fly-local-6pn\",:accept clojure.core.server/repl}", "-jar", "standalone.jar"]
After redeploying, we can use the fly proxy command to forward the port from the remote server to our local machine.[10]
$
fly proxy 7888:7888
# Proxying local port 7888 to remote [blue-water-6489.internal]:7888
In another shell, run:
$
rlwrap nc localhost 7888
# user=>
Now we have a REPL connected to the production app! rlwrap is used for readline functionality, e.g. up/down arrow keys and vi bindings. Of course, you can also connect to it from your editor.
If you’re using GitHub, we can also set up automatic deployments on pushes/PRs with GitHub Actions. All you need is to create the workflow file:
name: Fly Deploy
on:
  push:
    branches:
      - main
  workflow_dispatch:
jobs:
  deploy:
    name: Deploy app
    runs-on: ubuntu-latest
    concurrency: deploy-group
    steps:
      - uses: actions/checkout@v4
      - uses: superfly/flyctl-actions/setup-flyctl@master
      - run: flyctl deploy --remote-only
        env:
          FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}
To get this to work, you’ll need to create a deploy token from your app’s dashboard. Then, in your GitHub repo, create a new repository secret called FLY_API_TOKEN with the value of your deploy token. Now, whenever you push to the main branch, this workflow will automatically run and deploy your app. You can also manually run the workflow from GitHub because of the workflow_dispatch option.
As always, all the code is available on GitHub. Originally, this post was just about deploying to Fly.io, but along the way I kept adding on more stuff until it essentially became my version of the user manager example app. Anyway, hope this post provided a good view into web development with Clojure. As a bonus, here are some additional resources on deploying Clojure apps:
The way Fly.io works under the hood is pretty clever. Instead of running the container image with a runtime like Docker, the image is unpacked and “loaded” into a VM. See this video explanation for more details. ↩︎
If you’re interested in learning Clojure, my recommendation is to follow the official getting started guide and join the Clojurians Slack. Also, read through this list of introductory resources. ↩︎
Kit was a big influence on me when I first started learning web development in Clojure. I never used it directly, but I did use their library choices and project structure as a base for my own projects. ↩︎
There’s no “Rails” for the Clojure ecosystem (yet?). The prevailing opinion is to build your own “framework” by composing different libraries together. Most of these libraries are stable and are already used in production by big companies, so don’t let this discourage you from doing web development in Clojure! ↩︎
There might be some keys that you add or remove, but the structure of the config file stays the same. ↩︎
“assoc” (associate) is a Clojure slang that means to add or update a key-value pair in a map. ↩︎
For more details on how basic authentication works, check out the specification. ↩︎
Here’s a cool resource I found when researching Java Dockerfiles: WhichJDK. It provides a comprehensive comparison on the different JDKs available and recommendations on which one you should use. ↩︎
Another (non-technically important) argument for live/production REPLs is just because it’s cool. Ever since I read the story about NASA’s programmers debugging a spacecraft through a live REPL, I’ve always wanted to try it at least once. ↩︎
If you encounter errors related to Wireguard when running fly proxy, you can run fly doctor, which will hopefully detect issues with your local setup and also suggest fixes for them. ↩︎
This post is about seven months late, but here are my takeaways from Advent of Code 2024. It was my second time participating, and this time I actually managed to complete it.[1] My goal was to learn a new language, Zig, and to improve my DSA and problem-solving skills.
If you’re not familiar, Advent of Code is an annual programming challenge that runs every December. A new puzzle is released each day from December 1st to the 25th. There’s also a global leaderboard where people (and AI) race to get the fastest solves, but I personally don’t compete in it, mostly because I want to do it at my own pace.
I went with Zig because I have been curious about it for a while, mainly because of its promise of being a better C and because TigerBeetle (one of the coolest databases now) is written in it. Learning Zig felt like a good way to get back into systems programming, something I’ve been wanting to do after a couple of chaotic years of web development.
This post is mostly about my setup, results, and the things I learned from solving the puzzles. If you’re more interested in my solutions, I’ve also uploaded my code and solution write-ups to my GitHub repository.
There were several Advent of Code templates in Zig that I looked at as a reference for my development setup, but none of them really clicked with me. I ended up just running my solutions directly using zig run for the whole event. It wasn’t until after the event ended that I properly learned Zig’s build system and reorganised my project.
Here’s what the project structure looks like now:
.
├── src
│ ├── days
│ │ ├── data
│ │ │ ├── day01.txt
│ │ │ ├── day02.txt
│ │ │ └── ...
│ │ ├── day01.zig
│ │ ├── day02.zig
│ │ └── ...
│ ├── bench.zig
│ └── run.zig
└── build.zig
The project is powered by build.zig, which defines several commands:

- zig build - Builds all of the binaries for all optimisation modes.
- zig build run - Runs all solutions sequentially.
- zig build run -Day=XX - Runs the solution of the specified day only.
- zig build bench - Runs all benchmarks sequentially.
- zig build bench -Day=XX - Runs the benchmark of the specified day only.
- zig build test - Runs all tests sequentially.
- zig build test -Day=XX - Runs the tests of the specified day only.

You can also pass the optimisation mode that you want to any of the commands above with the -Doptimize flag.
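For example, to run a single day’s solution with optimisations on (the day number here is arbitrary):

zig build run -Day=07 -Doptimize=ReleaseFast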
Under the hood, build.zig compiles src/run.zig when you call zig build run, and src/bench.zig when you call zig build bench. These files are templates that import the solution for a specific day from src/days/dayXX.zig. For example, here’s what src/run.zig looks like:
const std = @import("std");
const puzzle = @import("day"); // Injected by build.zig
pub fn main() !void {
var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
defer arena.deinit();
const allocator = arena.allocator();
std.debug.print("{s}\n", .{puzzle.title});
_ = try puzzle.run(allocator, true);
std.debug.print("\n", .{});
}
The day module imported is an anonymous import dynamically injected by build.zig during compilation. This allows a single run.zig or bench.zig to be reused for all solutions, which avoids repeating boilerplate code in the solution files. Here’s a simplified version of my build.zig file that shows how this works:
const std = @import("std");
pub fn build(b: *std.Build) void {
const target = b.standardTargetOptions(.{});
const optimize = b.standardOptimizeOption(.{});
const run_all = b.step("run", "Run all days");
const day_option = b.option(usize, "ay", ""); // Named "ay" so the flag reads `-Day` (user options get a -D prefix).
// Both are used in the full build.zig; discard them here so the simplified sketch compiles.
_ = run_all;
_ = day_option;
// Generate build targets for all 25 days.
for (1..26) |day| {
const day_zig_file = b.path(b.fmt("src/days/day{d:0>2}.zig", .{day}));
// Create an executable for running this specific day.
const run_exe = b.addExecutable(.{
.name = b.fmt("run-day{d:0>2}", .{day}),
.root_source_file = b.path("src/run.zig"),
.target = target,
.optimize = optimize,
});
// Inject the day-specific solution file as the anonymous module `day`.
run_exe.root_module.addAnonymousImport("day", .{ .root_source_file = day_zig_file });
// Install the executable so it can be run.
b.installArtifact(run_exe);
// ...
}
}
My actual build.zig has some extra code that builds the binaries for all optimisation modes.
This setup is pretty barebones. I’ve seen other templates do cool things like scaffold files, download puzzle inputs, and even submit answers automatically. Since I wrote my build.zig after the event ended, I didn’t get to use it while solving the puzzles. I might add these features to it if I decide to do Advent of Code again this year with Zig.
While there are no rules to Advent of Code itself, to make things a little more interesting, I set a few constraints and rules for myself:

- Each solution must be contained in a single file.
- Puzzle inputs must be embedded into the binary at compile time with @embedFile.
- No concurrency.
- All solutions must run within a strict time budget.[2]
- Hardcoding values derived from my puzzle input is allowed.

Most of these constraints are designed to push me to write clearer, more performant code. I also wanted my code to look like it was taken straight from TigerBeetle’s codebase (minus the assertions).[3] Lastly, I just thought it would make the experience more fun.
From all of the puzzles, here are my top 3 favourites:
Honourable mention:
During the event, I learned a lot about Zig and performance, and also developed some personal coding conventions. Some of these are Zig-specific, but most are universal and can be applied across languages. This section covers general programming and Zig patterns I found useful. The next section will focus on performance-related tips.
Zig’s flagship feature, comptime, is surprisingly useful. I knew Zig uses it for generics and that people do clever metaprogramming with it, but I didn’t expect to be using it so often myself.
My main use for comptime was to generate puzzle-specific types. All my solution files follow the same structure, with a DayXX function that takes some parameters (usually the input length) and returns a puzzle-specific type, e.g.:
fn Day01(comptime length: usize) type {
return struct {
const Self = @This();
left: [length]u32 = undefined,
right: [length]u32 = undefined,
fn init(input: []const u8) !Self {}
// ...
};
}
This lets me instantiate the type with a size that matches my input:
// Here, `Day01` is called with the size of my actual input.
pub fn run(_: std.mem.Allocator, is_run: bool) ![3]u64 {
// ...
const input = @embedFile("./data/day01.txt");
var puzzle = try Day01(1000).init(input);
// ...
}
// Here, `Day01` is called with the size of my test input.
test "day 01 part 1 sample 1" {
var puzzle = try Day01(6).init(sample_input);
// ...
}
This allows me to reuse logic across different inputs while still hardcoding the array sizes. Without comptime, I’d have to either create a separate function for each of my different inputs or dynamically allocate memory, because I can’t hardcode the array size.
I also used comptime to shift some computation to compile time to reduce runtime overhead. For example, on day 4, I needed a function to check whether a string matches either "XMAS" or its reverse, "SAMX". A pretty simple function that you can write as a one-liner in Python:
def matches(pattern, target):
return target == pattern or target == pattern[::-1]
Typically a function like this requires some dynamic allocation to create the reversed string, since the length of the string is only known at runtime.[4] For this puzzle, since the words to reverse are known at compile-time, we can do something like this:
fn matches(comptime word: []const u8, slice: []const u8) bool {
var reversed: [word.len]u8 = undefined;
@memcpy(&reversed, word);
std.mem.reverse(u8, &reversed);
return std.mem.eql(u8, word, slice) or std.mem.eql(u8, &reversed, slice);
}
This creates a separate function for each word I want to reverse.[5] Each function has an array with the same size as the word to reverse. This removes the need for dynamic allocation and makes the code run faster. As a bonus, Zig also warns you when this word isn’t compile-time known, so you get an immediate error if you pass in a runtime value.
A common pattern in C is to return special sentinel values to denote missing values or errors, e.g. -1, 0, or NULL. In fact, I did this on day 13 of the challenge:
// We won't ever get 0 as a result, so we use it as a sentinel error value.
fn count_tokens(a: [2]u8, b: [2]u8, p: [2]i64) u64 {
const numerator = @abs(p[0] * b[1] - p[1] * b[0]);
const denominator = @abs(@as(i32, a[0]) * b[1] - @as(i32, a[1]) * b[0]);
return if (numerator % denominator != 0) 0 else numerator / denominator;
}
// Then in the caller, skip if the return value is 0.
if (count_tokens(a, b, p) == 0) continue;
This works, but it’s easy to forget to check for those values, or worse, to accidentally treat them as valid results. Zig improves on this with optional types. If a function might not return a value, you can return ?T instead of T. This also forces the caller to handle the null case. Unlike C, null isn’t a pointer but a more general concept: Zig treats null as the absence of a value for any type, just like Rust’s Option<T>.
The count_tokens function can be refactored to:
// Return null instead if there's no valid result.
fn count_tokens(a: [2]u8, b: [2]u8, p: [2]i64) ?u64 {
const numerator = @abs(p[0] * b[1] - p[1] * b[0]);
const denominator = @abs(@as(i32, a[0]) * b[1] - @as(i32, a[1]) * b[0]);
return if (numerator % denominator != 0) null else numerator / denominator;
}
// The caller is now forced to handle the null case.
if (count_tokens(a, b, p)) |n_tokens| {
// logic only runs when n_tokens is not null.
}
Zig also has a concept of error unions, where a function can return either a value or an error. In Rust, this is Result<T, E>. You could also use error unions instead of optionals for count_tokens; Zig doesn’t force a single approach. I come from Clojure, where returning nil for an error or missing value is common.
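For comparison, here’s a minimal sketch of what an error-union version could look like (the NoSolution error name is made up):

const TokenError = error{NoSolution};

fn count_tokens(a: [2]u8, b: [2]u8, p: [2]i64) TokenError!u64 {
    const numerator = @abs(p[0] * b[1] - p[1] * b[0]);
    const denominator = @abs(@as(i32, a[0]) * b[1] - @as(i32, a[1]) * b[0]);
    return if (numerator % denominator != 0) TokenError.NoSolution else numerator / denominator;
}

// The caller then handles the error with `catch` or propagates it with `try`:
const n_tokens = count_tokens(a, b, p) catch continue;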
This year had a lot of 2D grid puzzles (arguably too many). A common feature of grid-based algorithms is the out-of-bounds check. Here’s what it usually looks like:
fn dfs(map: [][]u8, position: [2]i8) u32 {
const x, const y = position;
// Bounds check here.
if (x < 0 or y < 0 or x >= map.len or y >= map[0].len) return 0;
if (map[x][y] == .visited) return 0;
map[x][y] = .visited;
var result: u32 = 1;
for (directions) |direction| {
result += dfs(map, position + direction);
}
return result;
}
This is a typical recursive DFS function. After doing a lot of this, I discovered a nice trick that not only improves code readability, but also its performance. The trick here is to pad the grid with sentinel characters that mark out-of-bounds areas, i.e. add a border to the grid.
Here’s an example from day 6:
Original map: With borders added:
************
....#..... *....#.....*
.........# *.........#*
.......... *..........*
..#....... *..#.......*
.......#.. -> *.......#..*
.......... *..........*
.#..^..... *.#..^.....*
........#. *........#.*
#......... *#.........*
......#... *......#...*
************
You can use any value for the border, as long as it doesn’t conflict with valid values in the grid. With the border in place, the bounds check becomes a simple equality comparison:
const border = '*';
fn dfs(map: [][]u8, position: [2]i8) u32 {
const x, const y = position;
if (map[x][y] == border) { // We are out of bounds
return 0;
}
// ...
}
This is much more readable than the previous code. Plus, it’s also faster since we’re only doing one equality check instead of four range checks.
That said, this isn’t a one-size-fits-all solution. This only works for algorithms that traverse the grid one step at a time. If your logic jumps multiple tiles, it can still go out of bounds (except if you increase the width of the border to account for this). This approach also uses a bit more memory than the regular approach as you have to store more characters.
This could also go in the performance section, but I’m including it here because the biggest benefit I get from using SIMD in Zig is the improved code readability. Because Zig has first-class support for vector types, you can write elegant and readable code that also happens to be faster.
If you’re not familiar with vectors, they are a special collection type used for Single instruction, multiple data (SIMD) operations. SIMD allows you to perform computation on multiple values in parallel using only a single CPU instruction, which often leads to some performance boosts.[6]
I mostly use vectors to represent positions and directions, e.g. for traversing a grid. Instead of writing code like this:
next_position = .{ position[0] + direction[0], position[1] + direction[1] };
You can represent position and direction as 2-element vectors and write code like this:
next_position = position + direction;
This is much nicer than the previous version!
Day 25 is another good example of a problem that can be solved elegantly using vectors:
var result: u64 = 0;
for (self.locks.items) |lock| { // lock is a vector
for (self.keys.items) |key| { // key is also a vector
const fitted = lock + key > @as(@Vector(5, u8), @splat(5));
const is_overlap = @reduce(.Or, fitted);
result += @intFromBool(!is_overlap);
}
}
Expressing the logic as vector operations makes the code cleaner since you don’t have to write loops and conditionals as you typically would in a traditional approach.
The tips below are general performance techniques that often help, but like most things in software engineering, “it depends”. These might work 80% of the time, but performance is often highly context-specific. You should benchmark your code instead of blindly following what other people say.
This section would’ve been more fun with concrete examples, step-by-step optimisations, and benchmarks, but that would’ve made the post way too long. Hopefully I’ll get to write something like that in the future.[7]
Whenever possible, prefer static allocation. Static allocation is cheaper since it just involves moving the stack pointer vs dynamic allocation which has more overhead from the allocator machinery. That said, it’s not always the right choice since it has some limitations, e.g. stack size is limited, memory size must be compile-time known, its lifetime is tied to the current stack frame, etc.
If you need to do dynamic allocations, try to reduce the number of times you call the allocator. The number of allocations you do matters more than the amount of memory you allocate. More allocations mean more bookkeeping, synchronisation, and sometimes syscalls.
A simple but effective way to reduce allocations is to reuse buffers, whether they’re statically or dynamically allocated. Here’s an example from day 10. For each trail head, we want to create a set of trail ends reachable from it. The naive approach is to allocate a new set every iteration:
for (self.trail_heads.items) |trail_head| {
var trail_ends = std.AutoHashMap([2]u8, void).init(self.allocator);
defer trail_ends.deinit();
// Set building logic...
}
What you can do instead is to allocate the set once before the loop. Then, each iteration, you reuse the set by emptying it without freeing the memory. For Zig’s std.AutoHashMap, this can be done using the clearRetainingCapacity method:
var trail_ends = std.AutoHashMap([2]u8, void).init(self.allocator);
defer trail_ends.deinit();
for (self.trail_heads.items) |trail_head| {
trail_ends.clearRetainingCapacity();
// Set building logic...
}
If you use static arrays, you can also just overwrite existing data instead of clearing it.
A step up from this is to reuse multiple buffers. The simplest form of this is to reuse two buffers, i.e. double buffering. Here’s an example from day 11:
// Initialize two hash maps that we'll alternate between.
var frequencies: [2]std.AutoHashMap(u64, u64) = undefined;
for (0..2) |i| frequencies[i] = std.AutoHashMap(u64, u64).init(self.allocator);
defer for (0..2) |i| frequencies[i].deinit();
var id: usize = 0;
for (self.stones) |stone| try frequencies[id].put(stone, 1);
for (0..n_blinks) |_| {
var old_frequencies = &frequencies[id % 2];
var new_frequencies = &frequencies[(id + 1) % 2];
id += 1;
defer old_frequencies.clearRetainingCapacity();
// Do stuff with both maps...
}
Here we have two maps to count the frequencies of stones across iterations. Each iteration builds up new_frequencies with the values from old_frequencies. Doing this reduces the number of allocations to just 2 (the number of buffers). The tradeoff here is that it makes the code slightly more complex.
A common performance tip is to have “mechanical sympathy”: understand how your code is processed by your computer. An example of this is to structure your data so it works better with your CPU. For example, keep related data close in memory to take advantage of cache locality.
Reducing the size of your data helps with this. Smaller data means more of it can fit in cache. One way to shrink your data is through bit packing. This depends heavily on your specific data, so you’ll need to use your judgement to tell whether this would work for you. I’ll just share some examples that worked for me.
The first example is in day 6 part two, where you have to detect a loop, which happens when you revisit a tile from the same direction as before. To track this, you could use a map or a set to store the tiles and visited directions. A more efficient option is to store this direction metadata in the tile itself.
There are only four tile types, which means you only need two bits to represent the tile types as an enum. If the enum size is one byte, here’s what the tiles look like in memory:
.obstacle -> 00000000
.path -> 00000001
.visited -> 00000010
.exit -> 00000011
As you can see, the upper six bits are unused. We can store the direction metadata in the upper four bits. One bit for each direction. If a bit is set, it means that we’ve already visited the tile in this direction. Here’s an illustration of the memory layout:
direction metadata tile type
┌─────┴─────┐ ┌─────┴─────┐
┌────────┬─┴─┬───┬───┬─┴─┬─┴─┬───┬───┬─┴─┐
│ Tile: │ 1 │ 0 │ 0 │ 0 │ 0 │ 0 │ 1 │ 0 │
└────────┴─┬─┴─┬─┴─┬─┴─┬─┴───┴───┴───┴───┘
up bit ─┘ │ │ └─ left bit
right bit ─┘ down bit
If your language supports struct packing, you can express this layout directly:[8]
const Tile = packed struct(u8) {
const TileType = enum(u4) { obstacle, path, visited, exit };
up: u1 = 0,
right: u1 = 0,
down: u1 = 0,
left: u1 = 0,
tile: TileType,
// ...
}
Doing this avoids extra allocations and improves cache locality. Since the directions metadata is colocated with the tile type, all of them can fit together in cache. Accessing the directions just requires some bitwise operations instead of having to fetch them from another region of memory.
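Setting a direction bit is then just a field write on the packed struct. A minimal sketch, assuming a Direction enum like the one used in the next section:

fn mark_visited(tile: *Tile, direction: Direction) void {
    switch (direction) {
        .up => tile.up = 1,
        .right => tile.right = 1,
        .down => tile.down = 1,
        .left => tile.left = 1,
    }
}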
Another way to do this is to represent your data using alternate number bases. Here’s an example from day 23. Computers are represented as two-character strings made up of only lowercase letters, e.g. "bc", "xy", etc. Instead of storing this as a [2]u8 array, you can convert it into a base-26 number and store it as a u16.[9]
Here’s the idea: map 'a' to 0, 'b' to 1, up to 'z' as 25. Each character in the string becomes a digit in the base-26 number. For example, "bc" ([2]u8{ 'b', 'c' }) becomes the base-10 number 28 (1 × 26 + 2 = 28). If we represent this using the base-64 character set, it becomes 12 ('b' = 1, 'c' = 2).
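The conversion itself is just a multiply and an add. A sketch (this encode helper is mine, not from the solution):

fn encode(name: [2]u8) u16 {
    return @as(u16, name[0] - 'a') * 26 + (name[1] - 'a');
}

// encode("bc".*) == 1 * 26 + 2 == 28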
While they take the same amount of space (2 bytes), a u16 has some benefits over a [2]u8: it can be compared or hashed as a single integer, and it can be used directly as an index into an array or bitset.
I won’t explain branchless programming here; Algorithmica explains it way better than I can. While modern compilers are often smart enough to compile away branches, they don’t catch everything. I still recommend writing branchless code whenever it makes sense. It also has the added benefit of reducing the number of codepaths in your program.
Again, since performance is very context-dependent, I’ll just show you some patterns I use. Here’s one that comes up often:
if (is_valid_report(report)) {
result += 1;
}
Instead of the branch, cast the bool into an integer directly:
result += @intFromBool(is_valid_report(report));
Another example is from day 6 (again!). Recall that to know if a tile has been visited from a certain direction, we have to check its direction bit. Here’s one way to do it:
fn has_visited(tile: Tile, direction: Direction) bool {
    switch (direction) {
        .up => return tile.up == 1,
        .right => return tile.right == 1,
        .down => return tile.down == 1,
        .left => return tile.left == 1,
    }
}
This works, but it introduces a few branches. We can make it branchless using bitwise operations:
fn has_visited(tile: Tile, direction: Direction) bool {
const int_tile = std.mem.nativeToBig(u8, @bitCast(tile));
const mask = direction.mask();
const bits = int_tile & 0xf0; // Keep only the direction bits (the upper nibble)
return bits & mask == mask;
}
While this is arguably cryptic and less readable, it does perform better than the switch version.
The final performance tip is to prefer iterative code over recursion. Recursive functions bring the overhead of allocating stack frames. While recursive code is more elegant, it’s also often slower unless your language’s compiler can optimise it away, e.g. via tail-call optimisation. As far as I know, Zig doesn’t have this, though I might be wrong.
Recursion also has the risk of causing a stack overflow if the execution isn’t bounded. This is why code that is mission- or safety-critical avoids recursion entirely. It’s in TigerBeetle’s TIGERSTYLE and also NASA’s Power of Ten.
Iterative code can be harder to write in some cases, e.g. DFS maps naturally to recursion, but most of the time it is significantly faster, more predictable, and safer than the recursive alternative.
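As an illustration, here’s a minimal sketch of the earlier recursive DFS rewritten with an explicit stack. The border, visited, and directions constants are assumed from the earlier snippets, and the stack bound is arbitrary:

const Vec2 = @Vector(2, i8);

fn dfs_iterative(map: [][]u8, start: Vec2) u32 {
    var stack: [1024]Vec2 = undefined; // Fixed-size stack; 1024 is an assumed bound.
    var top: usize = 0;
    var result: u32 = 0;

    stack[top] = start;
    top += 1;

    while (top > 0) {
        top -= 1;
        const pos = stack[top];
        const x: usize = @intCast(pos[0]);
        const y: usize = @intCast(pos[1]);
        if (map[x][y] == border) continue; // The padded border marks out-of-bounds.
        if (map[x][y] == visited) continue;
        map[x][y] = visited;
        result += 1;
        for (directions) |direction| {
            stack[top] = pos + direction;
            top += 1;
        }
    }
    return result;
}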
I ran benchmarks for all 25 solutions in each of Zig’s optimisation modes. You can find the full results and the benchmark script in my GitHub repository. All benchmarks were done on an Apple M3 Pro.
As expected, ReleaseFast produced the best result with a total runtime of 85.1 ms. I’m quite happy with this, considering the two constraints that limited the number of optimisations I could do to the code. You can see the full benchmarks for ReleaseFast in the table below:
| Day | Title | Parsing (µs) | Part 1 (µs) | Part 2 (µs) | Total (µs) |
|---|---|---|---|---|---|
| 1 | Historian Hysteria | 23.5 | 15.5 | 2.8 | 41.8 |
| 2 | Red-Nosed Reports | 42.9 | 0.0 | 11.5 | 54.4 |
| 3 | Mull it Over | 0.0 | 7.2 | 16.0 | 23.2 |
| 4 | Ceres Search | 5.9 | 0.0 | 0.0 | 5.9 |
| 5 | Print Queue | 22.3 | 0.0 | 4.6 | 26.9 |
| 6 | Guard Gallivant | 14.0 | 25.2 | 24,331.5 | 24,370.7 |
| 7 | Bridge Repair | 72.6 | 321.4 | 9,620.7 | 10,014.7 |
| 8 | Resonant Collinearity | 2.7 | 3.3 | 13.4 | 19.4 |
| 9 | Disk Fragmenter | 0.8 | 12.9 | 137.9 | 151.7 |
| 10 | Hoof It | 2.2 | 29.9 | 27.8 | 59.9 |
| 11 | Plutonian Pebbles | 0.1 | 43.8 | 2,115.2 | 2,159.1 |
| 12 | Garden Groups | 6.8 | 164.4 | 249.0 | 420.3 |
| 13 | Claw Contraption | 14.7 | 0.0 | 0.0 | 14.7 |
| 14 | Restroom Redoubt | 13.7 | 0.0 | 0.0 | 13.7 |
| 15 | Warehouse Woes | 14.6 | 228.5 | 458.3 | 701.5 |
| 16 | Reindeer Maze | 12.6 | 2,480.8 | 9,010.7 | 11,504.1 |
| 17 | Chronospatial Computer | 0.1 | 0.2 | 44.5 | 44.8 |
| 18 | RAM Run | 35.6 | 15.8 | 33.8 | 85.2 |
| 19 | Linen Layout | 10.7 | 11,890.8 | 11,908.7 | 23,810.2 |
| 20 | Race Condition | 48.7 | 54.5 | 54.2 | 157.4 |
| 21 | Keypad Conundrum | 0.0 | 1.7 | 22.4 | 24.2 |
| 22 | Monkey Market | 20.7 | 0.0 | 11,227.7 | 11,248.4 |
| 23 | LAN Party | 13.6 | 22.0 | 2.5 | 38.2 |
| 24 | Crossed Wires | 5.0 | 41.3 | 14.3 | 60.7 |
| 25 | Code Chronicle | 24.9 | 0.0 | 0.0 | 24.9 |
A weird thing I found when benchmarking is that for day 6 part two, ReleaseSafe actually ran faster than ReleaseFast (13,189.0 µs vs 24,370.7 µs). Their outputs are the same, but for some reason ReleaseSafe is faster even with the safety checks still intact.
The Zig compiler is still very much a moving target, so I don’t want to dig too deep into this, as I’m guessing this might be a bug in the compiler. This weird behaviour might just disappear after a few compiler version updates.
Looking back, I’m really glad I decided to do Advent of Code and followed through to the end. I learned a lot of things. Some are useful in my professional work, some are more like random bits of trivia. Going with Zig was a good choice too. The language is small, simple, and gets out of your way. I learned more about algorithms and concepts than the language itself.
Besides what I’ve already mentioned earlier, I also picked up plenty of smaller lessons along the way.
Some of my self-imposed constraints and rules ended up being helpful. I can still (mostly) understand the code I wrote a few months ago. Putting all of the code in a single file made it easier to read since I don’t have to context switch to other files all the time.
However, some of them did backfire a bit, e.g. the two constraints that limited how I could optimise my code. Another one is the “hardcoding allowed” rule. I used a lot of magic numbers, which helped to improve performance, but I didn’t document them, so after a while I didn’t even remember how I got them. I’ve since gone back and added explanations in my write-ups, but next time I’ll remember to at least leave comments.
One constraint I’ll probably remove next time is the no concurrency rule. It’s the biggest contributor to the total runtime of my solutions. I don’t do a lot of concurrent programming, even though my main language at work is Go, so next time it might be a good idea to use Advent of Code to level up my concurrency skills.
I also spent way more time on these puzzles than I originally expected. I optimised and rewrote my code multiple times. I also rewrote my write-ups a few times to make them easier to read. This is by far my longest side project yet. It’s a lot of fun, but it also takes a lot of time and effort. I almost gave up on the write-ups (and this blog post) because I didn’t want to explain my awful day 15 and day 16 code. I ended up taking a break for a few months before finishing it, which is why this post is published in August lol.
Just for fun, here’s a photo of some of my notebook sketches that helped me visualise my solutions. See if you can guess which days these are from:
So… would I do it again? Probably, though I’m not making any promises. If I do join this year, I’ll probably stick with Zig. I had my eyes on Zig since the start of 2024, so Advent of Code was the perfect excuse to learn it. This year, there aren’t any languages in particular that caught my eye, so I’ll just keep using Zig, especially since I have a proper setup ready.
If you haven’t tried Advent of Code, I highly recommend checking it out this year. It’s a great excuse to learn a new language, improve your problem-solving skills, or just to learn something new. If you’re eager, you can also do the previous years’ puzzles as they’re still available.
One of the best aspects of Advent of Code is the community. The Advent of Code subreddit is a great place for discussion. You can ask questions and also see other people’s solutions. Some people also post really cool visualisations like this one. They also have memes!
I failed my first attempt horribly with Clojure during Advent of Code 2023. Once I reached the latter half of the event, I just couldn’t solve the problems with a purely functional style. I could’ve pushed through using imperative code, but I stubbornly chose not to and gave up… ↩︎
The original constraint was that each solution must run in under one second. As it turned out, the code was faster than I expected, so I increased the difficulty. ↩︎
TigerBeetle’s code quality and engineering principles are just wonderful. ↩︎
You can implement this function without any allocation by mutating the string in place or by iterating over it twice, which is probably faster than my current implementation. I kept it as-is as a reminder of what comptime can do. ↩︎
As a bonus, I was curious what this looks like compiled, so I listed all the functions in the binary in GDB and found:
72: static bool day04.Day04(140).matches__anon_19741;
72: static bool day04.Day04(140).matches__anon_19750;
It does generate separate functions! ↩︎
Well, not always. The number of SIMD instructions depends on the machine’s native SIMD size. If the length of the vector exceeds it, Zig will compile it into multiple SIMD instructions. ↩︎
Here’s a nice post on optimising day 9’s solution with Rust. It’s a good read if you’re into performance engineering or Rust techniques. ↩︎
One thing about packed structs is that their layout is dependent on the system endianness. Most modern systems are little-endian, so the memory layout I showed is actually reversed. Thankfully, Zig has some useful functions to convert between endianness, like std.mem.nativeToBig, which makes working with packed structs easier. ↩︎
Technically, you can store 2-digit base-26 numbers in a u10, as there are only 26² = 676 possible numbers. Most systems usually pad values by byte size, so a u10 will still be stored as a u16, which is why I just went straight for it. ↩︎
Behind every Nubank solution, there’s a chair. And behind every chair, there’s a lot more than you might imagine.
Here, every area plays an essential role in building something extraordinary together. But do we really know what the other person’s job is like?
We invited Nubankers from different teams to describe how they imagine the routine of other areas – and then heard directly from those professionals about what their day-to-day is really like. A meeting of perceptions that shows the strength of our connection.
What defines a customer’s experience when they open the app? What does it take to guarantee security on a global scale? How does our customer service team handle delicate situations? What challenges arise behind the scenes of a new feature?
Here are some of those stories – and maybe you’ll even picture yourself living one of them.
When you make a purchase with our card, it feels simple. But for those working at Nubank, that process involves cutting-edge technology, fast decision-making, and a lot of collaboration.
Vitoria, a Staff Software Engineer, works in a crucial area: the infrastructure behind our credit card. Her team maintains and improves the systems that power billing, transactions, interest application, and the payment cycle. These systems operate silently, but must run with absolute precision.
She compares her role to that of an architect: “My job is to think about the structure. What needs to be in place so everything else can run securely and reliably?” That structure doesn’t work in isolation. The card is directly connected to other products, requiring a systemic view and fine-tuned coordination across technical teams.
Even though she’s not on the front lines with customers, Vitoria knows her work impacts people’s lives every day, and now at an even greater scale. “Our goal is to ensure credit is offered responsibly. It’s not just about approving a transaction; it’s about understanding how it affects someone’s life.”
In her six years at Nubank, Vitoria has worked across different areas of the card business. She started as a mid-level software engineer and is now at level 7 in the IC (individual contributor) track, leading projects, mentoring others, and inspiring career paths like her own.
Long before a customer opens the Nubank app, decisions are made, flows are designed, and hypotheses are tested. The simple experience you see is powered by a complex machine in the background, and that’s where people like André and Guilherme come in.
How do you make dozens of products, features, and customer profiles coexist seamlessly in a single app? That’s one of the questions guiding André, a Product Manager at Nubank. His team owns the app’s home screen, the first thing millions of customers see every day. But his work starts long before any visual component exists.
“Our challenge is to keep the structure modular, smart, and scalable, where every decision balances customer needs, business goals, and technical feasibility,” he explains.
The goal is to make the home screen work for everyone: from customers who only use it to pay bills to those who explore every feature; from brand-new users to long-time clients. That means thinking strategically about how to organize information, which products to highlight, and how to adapt navigation to constant changes.
This integrated, technical view helps anticipate problems, align priorities, and make better decisions – ensuring that the final experience is simple, intuitive, and relevant for every user.
Guilherme, a Business Analyst, works on the team responsible for the app’s first screens and all the communications customers receive, from product suggestions to feature announcements. One of his proudest projects is the Purple Loop, an orchestration system that uses machine learning models to predict what each person needs most at a given moment and deliver the most relevant recommendation.
To make this happen, Guilherme tracks performance metrics, A/B tests, and satisfaction scores daily. His mission is to ensure every decision connects to the customer’s real experience.
The project is the result of a big collaboration between data, engineering, marketing, and business teams, and it perfectly reflects the kind of challenge that motivates him. “We have the freedom to test new ideas and work with amazing people. That makes all the difference.”
Planning is only part of the job. Anticipating the customer experience also means looking at risks, analyzing data, and making sure operations are safe. Jhony, who leads the Internal Audit platform in Brazil, sums it up well: “We’re Nubank’s third line of defense. We step in after everyone else to conduct an independent review and make sure everything is running as it should.”
But it’s not about waiting for something to go wrong – the focus is on acting early. With continuous auditing, Jhony’s team monitors risks in real time, cross-checking data with intelligence tools to spot issues and opportunities for improvement.
“We’re here to make sure controls are working, but also to help the business grow responsibly,” he says. This strategic, data-driven perspective turns audit into a true partner for the customer experience.
Behind every chat reply, there’s a whole team making sure the customer experience is built on care, empathy, and trust.
Aline started as an Xpeer and has sat in nearly every seat within the customer service area. Each role gave her more perspective, more strategic insight, and a clearer sense of what makes truly great service.
Today, as a team lead, she organizes workflows, sets goals, tracks results, and above all, develops her people. “I always say my job is to take care of the people who take care of people. When we build a culture of trust and listening, the customer experience naturally improves.”
Data that drives real-time action
At Nubank, every tap in the app is smarter than it looks. That’s because behind almost every interaction, there’s a machine learning model running in real time, helping flag suspicious behavior or offer personalized credit limits, for example.
Flávio, a Data Scientist on the Machine Learning team, works deep in the background. “It might not seem like it, but almost every customer action in the app is guided by AI models we build here.”
Fraud detection. Default risk. Security threats. All of these are handled by models developed by Flávio’s team. These models learn from user behavior, feed on real data, and help Nubank make better decisions.
But accuracy isn’t enough – it has to scale. With over 100 million customers and an avalanche of decisions happening every second, performance and speed are as critical as intelligence. “What used to take 100 milliseconds to process a few years ago can’t take 100 times longer today,” Flávio explains.
Working in data science at Nubank means combining large-scale impact, the freedom to innovate, and top-level technical challenges.
If this inspires you, imagine being part of it
Each of these stories shows a bit of what it’s like to work at Nubank: collaborating across countries, turning real problems into concrete solutions, and growing alongside the challenges.
The truth is, no process is ever final. No idea is set in stone. And that leaves room for people who want to build with autonomy, responsibility, and real impact on millions of lives.
If this is the kind of challenge that drives you, explore our opportunities and join us.
The post What’s it like to work at Nubank? Inside the process of turning ideas into solutions appeared first on Building Nubank.
Hello, everyone!
Eight months ago, I embarked on the journey of being a Lead Software Engineer at Nubank. Coming from a world where Kotlin and Go were my main tools, diving into Clojure was a paradigm shift. Today, I want to share a bit of that experience, showing through code the differences and what makes Clojure such a fascinating language to work with.
We’ll explore three simple problems, each solved in all three languages.
It all starts here. Even the most basic syntax gives us a hint of each language’s philosophy.
Go:
Focused on simplicity and robust tooling. Everything is explicit.
package main
import "fmt"
func main() {
fmt.Println("Olá, Mundo!")
}
Kotlin:
Modern, concise, and interoperable with Java. The syntax is familiar to anyone coming from the OO world.
fun main() {
println("Olá, Mundo!")
}
Clojure:
Here comes the first “strangeness” that turns into a charm. The LISP syntax, with parentheses and prefix notation (function argument), treats code as data. It’s simple and incredibly powerful.
(println "Olá, Mundo!")
Quick analysis: Right away, Clojure’s concision stands out. The absence of ceremony like a package declaration or a main function for a simple script already shows the focus on getting straight to the point.
This is an everyday scenario: we have a list of sales and want to compute the total per product. This is where Clojure’s functional approach really shines.
Say we have this data:
[{"produto": "A", "valor": 10}, {"produto": "B", "valor": 20}, {"produto": "A", "valor": 5}]
Go:
In Go, we would do this imperatively, initializing a map and iterating over the list to accumulate the values. It’s efficient, but verbose.
package main
import "fmt"
type Venda struct {
Produto string
Valor int
}
func main() {
vendas := []Venda{
{"A", 10},
{"B", 20},
{"A", 5},
}
totalPorProduto := make(map[string]int)
for _, v := range vendas {
totalPorProduto[v.Produto] += v.Valor
}
fmt.Println(totalPorProduto)
// Output: map[A:15 B:20]
}
Kotlin:
Kotlin offers a rich, functional collections API, making the code more expressive and less error-prone.
data class Venda(val produto: String, val valor: Int)
fun main() {
val vendas = listOf(
Venda("A", 10),
Venda("B", 20),
Venda("A", 5)
)
val totalPorProduto = vendas
.groupBy { it.produto }
.mapValues { entry ->
entry.value.sumOf { it.valor }
}
println(totalPorProduto)
// Output: {A=15, B=20}
}
Clojure:
In Clojure, data transformation is the heart of the language. The code is a composition of functions, resulting in a clear, elegant data “pipeline”.
(def vendas
[{:produto "A" :valor 10}
{:produto "B" :valor 20}
{:produto "A" :valor 5}])
(def total-por-produto
(->> vendas
(group-by :produto)
(map (fn [[produto lista-vendas]]
[produto (reduce + (map :valor lista-vendas))]))
(into {})))
(println total-por-produto)
; Output: {"A" 15, "B" 20}
Quick analysis: While Go is explicit and manual, Kotlin and Clojure show the power of functional abstractions. The Clojure solution, with the ->> (thread-last) macro, describes the flow perfectly: take the sales, group them by :produto, then map each group to compute its sum, and finally pour everything into a map. It reads like a recipe.
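Since ->> is just a macro, the REPL can show the nesting it produces; a quick sketch:
(macroexpand-1 '(->> vendas (group-by :produto) (into {})))
;; => (into {} (group-by :produto vendas))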
Handling shared state is a central challenge in concurrent systems. Each language has its own approach. Let’s simulate 1,000 “processes” incrementing a counter.
Go:
Goroutines and channels are the first-class citizens for concurrency in Go. For shared mutable state, we use a mutex to guarantee safety.
package main
import (
"fmt"
"sync"
)
func main() {
var contador int
var wg sync.WaitGroup
var mu sync.Mutex
for i := 0; i < 1000; i++ {
wg.Add(1)
go func() {
defer wg.Done()
mu.Lock()
contador++
mu.Unlock()
}()
}
wg.Wait()
fmt.Println("Contador final:", contador)
// Output: Contador final: 1000
}
Kotlin:
Coroutines are Kotlin’s answer to lightweight concurrency. For shared state, we can use Java’s atomic types or a coroutine-specific Mutex.
import kotlinx.coroutines.*
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock
val mutex = Mutex()
var contador = 0
fun main() = runBlocking {
val jobs = List(1000) {
launch(Dispatchers.Default) {
mutex.withLock {
contador++
}
}
}
jobs.forEach { it.join() }
println("Contador final: $contador")
// Output: Contador final: 1000
}
Clojure:
Clojure embraces immutability and provides simple, powerful constructs for managing state when it is unavoidable. The atom is perfect for uncoordinated shared state. The swap! function guarantees atomic updates.
(def contador (atom 0))
(defn incrementar []
(swap! contador inc))
(let [processos (repeatedly 1000 #(future (incrementar)))]
(doseq [p processos] (deref p))) ; Wait for all of them to finish
(println "Contador final:" @contador)
; Output: Contador final: 1000
Quick analysis: All three languages solve the problem safely, but Clojure’s approach is notably cleaner. There are no visible manual locks in our business code. The complexity of concurrency is abstracted away by the atom and the swap! function, making the code simpler to read and write.
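Under the hood, swap! relies on compare-and-set!: read the current value, apply the function, and retry if another thread got there first, which is why the update function must be side-effect free. A rough sketch of the equivalent manual loop:
(defn my-swap! [a f]
  (loop []
    (let [old-val @a
          new-val (f old-val)]
      ;; succeeds only if nobody changed the atom in between; otherwise retry
      (if (compare-and-set! a old-val new-val)
        new-val
        (recur)))))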
Working with Go and Kotlin gave me a solid foundation in efficient, well-typed systems. But immersing myself in Clojure at Nubank taught me to love simplicity, immutability, and the power of functional programming.
The ability to shape code as a sequence of data transformations, and to handle concurrency so elegantly, doesn’t just make development faster; it makes it more enjoyable. It’s a language that invites us to think about the problem differently and, in my opinion, in a much more direct and powerful way.
What about you? Have you ever had a similar experience learning a new language that changed how you think? I’d love to hear about it!
For a while now, every personal code project I start ends up using three technologies: Nix, Nushell, and Just. In this article I want to introduce these three tools and share how I use them to set up a project’s dependencies, write scripts, and define its development tasks. I’ll assume you know how to use your terminal, that you work in a Unix-like environment, and that you know how to program.
Let’s start with dependency management using Nix.
Nix is a huge and very versatile ecosystem, but what interests us in this context is that we can use it to declare all the utilities needed to run a project, compile it, lint it, and so on, without using containers. It’s compatible with almost any Unix-like environment, that is, Linux, macOS, and Windows under WSL.
Specifically, what I use is something called a “flake”. This is a file named flake.nix placed at the project root, where we declare the packages the project needs.
Let’s start with a simple example:
{
inputs = {
nixpkgs.url = "github:nixos/nixpkgs/nixpkgs-unstable";
};
outputs = {nixpkgs, ...}: let
pkgs = import nixpkgs {system = "aarch64-darwin";};
in {
devShell."aarch64-darwin" = pkgs.mkShell {
buildInputs = [
pkgs.elmPackages.elm
pkgs.elmPackages.elm-format
pkgs.elmPackages.elm-test
pkgs.just
pkgs.nushell
];
};
};
}
With this, we’ve defined an environment containing some tools for an Elm project, plus the Nushell and Just binaries, which I’ll get to below. As written, this will only work on a Mac with an Apple Silicon processor. We’ll develop this example further later to understand it better and make it more versatile, but for now let’s see what having this file buys us.
If we run the command nix develop, we enter a Bash shell containing all the tools declared in our flake.nix. To get out, just run the exit command, and we’ll see that the tools are no longer available; they are completely local to the project.
$ nix develop
$ which elm
/nix/store/6hx8g6k7ihgaqvy1i0ydiy7v13s04pf4-elm-0.19.1/bin/elm
$ exit
exit
$ which elm
elm not found
The first time, a flake.lock file is generated, which we can check into Git (or your preferred VCS) to pin the exact versions of these dependencies. Then some cached binaries are downloaded, or the dependencies are compiled from source. But that happens only the first time, or whenever you change the dependencies, since everything is stored in Nix’s global “store”.
You can skip this section if it doesn’t interest you. Here I’ll explain a little of what it takes to make the example above work for you, since I didn’t want to complicate it with certain details.
First, note that this “flakes” functionality is, at the time of writing, considered experimental and therefore not enabled by default: it has to be configured.
If you don’t have Nix installed yet, I recommend using the Determinate Nix Installer. It’s not the official one, but it’s better in several respects: among other things, it comes with flakes enabled by default, works better on macOS, and also includes an easy way to uninstall. Run the script below in your terminal to install it (or copy the script from the address linked above). Careful: when it asks whether to install “Determinate Nix”, say no. That’s a fork, and what we want is to install official Nix.
curl --proto '=https' --tlsv1.2 -sSf -L https://install.determinate.systems/nix | sh -s -- install
If, on the other hand, you already have Nix installed and flakes aren’t working, you need to add a bit of configuration. Open or create the file ~/.config/nix/nix.conf and put this in it:
experimental-features = flakes nix-command
With everything configured correctly, here’s an adjusted version of the flake.nix file:
{
inputs = {
nixpkgs.url = "github:nixos/nixpkgs/nixpkgs-unstable";
flake-utils.url = "github:numtide/flake-utils";
};
outputs = {
nixpkgs,
flake-utils,
...
}:
flake-utils.lib.eachDefaultSystem (
system: let
pkgs = import nixpkgs {system = system;};
in {
devShell = pkgs.mkShell {
buildInputs = [
pkgs.elmPackages.elm
pkgs.elmPackages.elm-format
pkgs.elmPackages.elm-test
pkgs.just
pkgs.nushell
];
};
}
);
}
What we did was declare a new entry under inputs named flake-utils, and further down use the eachDefaultSystem function it provides, which helps us configure a development shell for multiple platforms at once. Everything else is the same.
To add other packages, you can find them in the Nixpkgs search. Once you find the name of what you want, you can add it to the list in your flake.
And that’s all the essentials you need to get started, though of course there are many details you’ll discover on your own. As I said at the beginning, Nix is a very big world to explore. In fact, if Nix seems too complex, there are tools that use Nix under the hood but provide a more intuitive interface for this use case; I recommend taking a look at Devbox, Flox, and Devenv.
Nushell is an alternative to Bash or zsh. The difference is that instead of following POSIX guidelines and interacting only with plain text, we work with structured data. Basically, it’s a shell and language that replaces the need for Bash, grep, sed, awk, jq, curl, and many other command-line utilities frequently used to process data.
It still matches our expectations as users of Bash and the like up to a point. For example, in a Nushell shell the ls command works as you’d expect:
$ ls
╭───┬────────────┬──────┬───────┬─────────────╮
│ # │ name │ type │ size │ modified │
├───┼────────────┼──────┼───────┼─────────────┤
│ 0 │ flake.lock │ file │ 569 B │ 8 hours ago │
│ 1 │ flake.nix │ file │ 405 B │ 7 hours ago │
╰───┴────────────┴──────┴───────┴─────────────╯
But this output isn’t merely text. What we’ve actually gotten is a table with rows and columns. Look at the kind of thing we can do with Nushell alone:
$ ls | where name =~ '[.]lock$' | get 0.name | open
::: | from json | get nodes.nixpkgs.locked.lastModified
::: | $in * 1_000_000_000 | into datetime
Tue, 5 Aug 2025 11:35:34 +0000 (2 days ago)
Obviously this example is purely demonstrative, but notice: we’re listing files, filtering by a regex, taking the name of the first one in the list, reading the file’s contents, interpreting it as JSON, fetching a numeric value inside the JSON, multiplying to convert seconds to nanoseconds, and finally turning that into a date.
It’s so handy that I use it as my default shell. But for the purposes of this article, what matters is its usefulness for writing simple, readable scripts that transform files, start services, fetch data from the internet, and more. The files use the .nu extension.
Let’s look at a small example. This is a script that updates an exclusion list to keep AI robots from visiting, with data we download from the GitHub repository of the ai.robots.txt project.
# Some constants.
let aiRobotsTxtBaseUrl = "https://raw.githubusercontent.com/ai-robots-txt/ai.robots.txt/refs/heads/main"
let startMarkerLine = "# Start ai.robots.txt"
let endMarkerLine = "# End ai.robots.txt"
# A small function to simplify the code further down.
def splitLines [] {
split row "\n"
}
# Function that processes the data for a specific file, since there
# are two we want to update.
def updateFile [$filename] {
# Read the local file and keep it in the variable as a list of
# lines.
let localLines = open $"./public/($filename)" | splitLines
# The same for the remote file.
let updateLines = http get $"($aiRobotsTxtBaseUrl)/($filename)" | splitLines
# The local file has lines marking the beginning and the end of the
# content we want to update, delimited by the `$startMarkerLine` and
# `$endMarkerLine` constants above. Based on those markers, we split
# the list to recover the content we want to keep, which comes
# before and after them within the file.
let firstSplit = $localLines | split list $startMarkerLine
let linesBeforeUpdate = $firstSplit | get 0
let secondSplit = $firstSplit | get 1 | split list $endMarkerLine
let linesAfterUpdate = $secondSplit | get 1
# Insert the lines from the remote file in the middle, concatenating
# everything into a single list.
let updatedLines = (
$linesBeforeUpdate
++ [$startMarkerLine]
++ $updateLines
++ [$endMarkerLine]
++ $linesAfterUpdate
)
# Overwrite the old file with the new lines.
$updatedLines | str join "\n" | save --force $"./public/($filename)"
}
# Use the function defined above to process two files.
updateFile ".htaccess"
updateFile "robots.txt"
I don’t know about you, but I think the result is extremely readable, concise code. It’s easy to write, and it comes “batteries included”, as they say, with support for processing many file formats. And when there’s something Nushell can’t do, you can reach for any command-line tool, exactly as we would in a Bash script.
The Nushell language takes the pipes (|) of Unix shells and combines them with immutable data structures. It’s a fairly functional language, dynamically but strictly typed, meaning that if a function receives a value of an unexpected type, execution fails with a very explicit message, like this one:
$ ["hola", "cómo", "estás"] | date to-timezone "UTC"
Error: nu::parser::input_type_mismatch
× Command does not support list<string> input.
╭─[entry #6:1:31]
1 │ ["hola", "cómo", "estás"] | date to-timezone "UTC"
· ────────┬───────
· ╰── command doesn't support list<string> input
╰────
Just is a tool with a very modest scope: letting you define repeatable tasks for a project. Typically you define a “build” task to compile the code, a “dev” task to run it locally, or a “test” task to run the tests. With Just, that works through the commands just build, just dev, and just test, whose tasks you define yourself in a file named justfile.
So far nothing new, I’m sure; it’s the same thing you’d do with Make or npm run, for example. And it’s true, Just doesn’t offer a novel paradigm. The only thing it offers is simplicity.
Just’s model is Make. I don’t know about you, but I’ve written plenty of makefiles to define tasks for a project. The problem is that Make has a lot of rough edges for this purpose. For example, if you define a test task to run your tests, and it turns out you have a test folder in the same directory, the task won’t run. That’s because Make is built for compiling things, where the task name is the name of a file, and when the file or directory already exists, the task simply isn’t executed.
The reason I like Just is that it’s like a Make built for running tasks, with all those rough edges polished. The syntax of a justfile is like that of a makefile, but a little nicer. For example, you can indent using spaces instead of tabs, and you can add documentation comments. You can also list tasks with just --list.
Here’s an example. This file is named justfile.
port := "1237"
[private]
default:
just --list
# Compiles files.
build: install
rm -rf dist
pnpm exec vite build
# Starts the development server.
develop: install qr
pnpm exec vite --port {{port}} --host
[private]
install:
pnpm install
[private]
qr:
#!/usr/bin/env nu
let ip = sys net | where name == "en0" | get 0.ip | where protocol == "ipv4" | get 0.address
let url = $"http://($ip):{{port}}"
qrtool encode -t ansi256 $url
print $url
On line 4 I defined the first task, which I named default because it’s the one that runs by default if you just type just. It can be any task, but I like the first task to serve for viewing the list of available tasks, which is why I made it “hidden” using the [private] modifier. Here’s how it looks:
$ just
just --list
Available recipes:
build # Compiles files.
develop # Starts the development server.
Pretty nice, right? Notice that the comments written above each task serve as documentation for that task, and that they show up when you use the --list command.
Other notable aspects:
- The port variable is interpolated with {{port}} in two places.
- The install task is defined as a prerequisite of two others, so when I call, for example, just build, the commands in install run first.
- The qr task begins with the line #!/usr/bin/env nu, which specifies the program that should execute that task’s commands. In this case, I wrote a small Nushell script. You could use Python if you wanted, or Clojure via Babashka, or any other language here. This feature is extremely useful.
That was my introduction to the tools I use to set up my projects. There’s one more tool I didn’t mention: direnv, which makes working with Nix environments much more pleasant. Maybe I’ll write something about it another day, we’ll see.
I hope you found this useful, and let me know if you find anything incorrect or confusing.
On one hand, I always wanted to know if people are interested in my blog posts; on the other hand, I always disliked analytics services because of how creepy they are with their tracking.
Recently, I decided to make my own analytics service that collects only the minimum amount of data that I need. I built a simple web service that can only receive JSON objects and save them into a SQLite database, along with a timestamp. Then I vibe-coded a simple JS script that sends the current page URL with a referrer when it loads. In this post, I’ll share how I analyzed this data using Reveal.
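For concreteness, here’s a minimal sketch of what such an ingestion endpoint could look like as a Ring handler with next.jdbc; the stats table and the handler shape are my assumptions, not necessarily the author’s actual code:
(require '[next.jdbc :as jdbc])

;; Hypothetical handler: store the raw JSON body in a `stats` table,
;; letting SQLite stamp the insert time.
(defn collect-handler [db request]
  (jdbc/execute! db
    ["INSERT INTO stats (event, created_at) VALUES (?, datetime('now'))"
     (slurp (:body request))])
  {:status 204 :headers {}})
A 204 response keeps the endpoint silent, which is all a fire-and-forget analytics beacon needs.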
As a bonus, in the end I’ll share a snippet that, in just 10 lines of code, allows you to view the dependency graph of any tools.deps-based project along with sizes of the dependencies — something every Clojure developer can find useful regardless of the type of project they are working on!
But let’s start with Graphviz.
Graphviz is a data visualization software that uses a simple text language, e.g., digraph { a -> b }
. It’s a good fit if you want to view… well… graphs.
I used it when I wanted to see where people come to my site from, and how they move around it. This had to be shown as a graph, since just running a query and viewing the results is not enough to see the patterns. But I had to start with a query. Fun fact: SQLite supports JSON as a native data type, so I can look up fields inside free-form JSON values in a query. Slightly simplified, it looks like this:
SELECT
json_extract(event, '$.referrer'),
json_extract(event, '$.url'),
COUNT(*)
FROM
stats
WHERE
json_valid(event) -- because bots just like to submit garbage...
AND json_extract(event, '$.referrer') != ''
GROUP BY
json_extract(event, '$.referrer'),
json_extract(event, '$.url')
Then, once I got referrer + url + count tuples, I massaged them into a Graphviz description:
(str "digraph { rankdir=LR; node[shape=record];"
(->> (db/execute! db referrer-map-query)
(map (fn [[from to count]]
(str (pr-str from) "->" (pr-str to) " [label=\"" count "\"]")))
(str/join "\n"))
"}")
Since I don’t have a need to build web pages with dashboards, it was enough to use Reveal’s Graphviz viewer. I only needed to create a graphviz description string, select the graphviz
action in Reveal, and voila:
Turns out, not only do people actually visit the blog post I shared on Reddit, but some people are actually interested in seeing more and open other pages! This motivated me to update the about page, which previously only had a single sentence with my name — turns out it was actually visited 😬
Another fun fact I discovered after running analytics for a while — ChatGPT sometimes refers users to non-existing pages on my site:
While Graphviz is an essential tool for showing graphs, there is another widely used data visualizer: Vega. Vega can do data grouping by itself, so the SQL query is even simpler:
SELECT
json_extract(event, '$.url') as url, created_at
FROM
stats
WHERE
json_valid(event)
Then came the time to view the results using Reveal’s Vega viewer:
{:fx/type vlaaad.reveal/vega
:spec {:mark {:type "bar" :tooltip true}
:encoding {:x {:field "created_at" :type "temporal" :timeUnit "yearmonthdatehours"}
:y {:field "url" :aggregate "count"}
:color {:field "url" :legend {:orient "bottom" :columns 5}}}}
:data (db/execute! db timed-url-query)}
Such a map can be viewed in Reveal using view
action:
I find Vega very cool; you can do very different visualizations with very small configuration maps. The downside of Vega is a very complex grammar, which is much harder to remember than the Graphviz one, though in practice, LLMs can produce Vega descriptions if you ask them what you want.
It’s very convenient when your REPL is capable of doing more than just showing text. While the examples shared above could be considered something from the realm of “data analysis” and not day-to-day programming, there is still a need sometimes to visualize data with graphs and/or charts. For example, to analyze dependencies of your project! If you use tools.deps, you can write the following code in your REPL to create a dependency graph:
(str "digraph { rankdir=LR; node[shape=record]; nodesep=0.1; ranksep=0.2;"
(->> (clojure.java.basis/current-basis)
:libs
(map (fn [[lib {:keys [dependents paths]}]]
(let [lib-id (pr-str (str lib))
kbs (int (/ (reduce + 0 (map (comp java.io.File/.length java.io.File/new) paths)) 1000))]
(str lib-id "[label=" (pr-str (str lib "\n" kbs " KB")) "]\n"
(clojure.string/join "\n" (map #(str (pr-str (str %)) "->" lib-id) dependents))))))
(clojure.string/join "\n"))
"}")
Yep, just 10 lines to reimplement tools.deps.graph with Reveal:
Summer came and went, and while we all soaked up the sun, hit the beaches, and indulged in ice cream, our favorite programming languages had their own unique ways of relaxing (or not relaxing at all). Here’s how they spent their well-earned break:
Clojure: The Zen Beachgoer
Clojure showed up at the beach with nothing but a single, immutable towel and a book on functional meditation. While everyone else built sandcastles, Clojure calmly explained how sand particles were just lazy sequences. “The ocean is a persistent data structure,” they muttered, sipping their purely functional coconut water.
JavaScript: The Chaotic Road Tripper
JavaScript attempted to visit five countries in three days. Their GPS kept glitching, their convertible randomly transformed into a pickup truck, and at one point, they accidentally booked a hotel in the wrong timezone. But somehow, they made it work—sort of. “I used a framework for this!” they said, as their sunscreen expired mid-application.
Python: The Chill Camper
Python planned the perfect camping trip—minimal setup, maximum enjoyment. They brought a well-documented checklist, a bug-free tent, and a grill that automatically adjusted heat based on meat thickness. “Indentation matters, even in hammocks,” they joked, while others struggled with tangled sleeping bags.
Java: The Overprepared Traveler
Java packed 12 suitcases, each with a 10-step unpacking process. “You need an interface just to open my sunscreen,” they said, handing out 300-page vacation guides. Their beach umbrella had 15 layers of abstraction, and by the time they finished setting up, summer was nearly over.
Scala: The Philosophical Hiker
Scala went on a solo hiking trip but spent most of the time debating whether the mountain was an object or a functional construct. “The trail is monadic,” they declared, attempting to climb it both in OOP and FP style at once. They came back with blisters—but also with a deeply satisfied mind.
C: The DIY Survivalist
C refused to book a hotel and instead built a cabin from scratch using only a pocket knife and sheer willpower. No electricity? No problem. They wrote their own sunlight-to-energy converter in assembly. “Back in my day, summer was real summer,” they grumbled while hand-carving a canoe.
Go: The Fast Kayaker
Go kept things simple with a no-nonsense kayaking trip. While others debated routes, Go just started paddling. “Concurrency is key,” they said, smoothly gliding through the water. By lunchtime, they had already kayaked, fished, and set up a picnic—all without a single dependency.
Elixir: The Party Yacht Captain
Elixir rented a yacht and threw an unforgettable party. The music never stopped, the drinks kept flowing, and the boat stayed afloat—even when JavaScript tried steering. “The party must go on!” they cheered, as the sunset turned everything a vibrant, functional pink.
PHP: The Enthusiastic Tourist
PHP hit every tourist trap within a 100-mile radius, snapping selfies with questionable filters. Their vacation album was full of blurry landmarks, random emojis, and captions like “Summer = true!!” But somehow, they had the most fun—even if their plane ticket was booked in PHP 5.
The End of Vacation
As summer faded, Clojure meditated on the impermanence of sunshine, Java finally finished unpacking, and C built a time machine to extend the season. Python shared a perfectly chilled watermelon, Ruby edited the group photos, and Go was already planning for next year.
So tell us… which language’s vacation most resembled yours?
The post How Our Favorite Programming Languages Spent Summer Vacation appeared first on Flexiana.
Step-by-step development of a depth-first search, using tree-seq, to solve a classic puzzle.
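As a taste of the technique, tree-seq by itself already yields nodes in depth-first order; a tiny sketch (not taken from the linked post):
(tree-seq vector? seq [1 [2 [3 4]] 5])
;; => ([1 [2 [3 4]] 5] 1 [2 [3 4]] 2 [3 4] 3 4 5)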
Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem (feed: RSS).
Clojure/Conj 2025: Early Bird Tickets Available Now!
Clojure South 2025: Tickets On Sale and Call for Proposals open
Macroexpand 2025: Currently inviting speakers and contributors
Coding limit and offset in Clojure. Just look how terse the code is!!!! - Clojure Diary
SDR with Clojure is fun - Juan Monetta
Harnessing the power of Java in Clojure - Clojure Diary
STM Meets gRPC: An Unexpected Marriage - Peter Bengtson
Macroexpand 2025 by Scicloj – Clojure Civitas - Daniel Slutsky
Clojure and D-Bus - Robbie Huffman
Understanding not just Clojure’s comp function by re-implementing it - Aditya Athalye
The power of the :deps alias - Sean Corfield
Managing a Project’s Tool Dependencies with Nix (and direnv) - Gary Verhaegen
New releases and tools this week:
visualization-mcp-server - Visualization MCP Server
cursive 2025.2 - Cursive: The IDE for beautiful Clojure code
kmono - The missing workspace tool for clojure tools.deps projects
kubb - Compact Kubernetes configuration with Babashka
pomegranate 1.2.25 - A sane Clojure API for Maven Artifact Resolver + dynamic runtime modification of the classpath
pretty 3.6.2 - Library for helping print things prettily, in Clojure - ANSI fonts, formatted exceptions
monkeyci - Next-generation CI/CD tool that uses the full power of Clojure!
eca 0.24.2 - Editor Code Assistant (ECA) - AI pair programming capabilities agnostic of editor
qclojure 0.10.0 - A functional quantum computer programming library for Clojure with backend protocols, simulation backends and visualizations.
deps-new 0.10.1 - Create new projects for the Clojure CLI / deps.edn
pedestal 0.8.0-rc-1 - The Pedestal Server-side Libraries
fulcro fulcro-3.9.0-rc5 - A library for development of single-page full-stack web applications in clj/cljs
cursive 2025.2.1-eap1 - Cursive: The IDE for beautiful Clojure code
Code
;; power_of_java_in_clojure.clj
(ns power-of-java-in-clojure
(:import [java.time LocalDate]))
(. LocalDate now)
(. LocalDate (of 2025 7 5))
(. LocalDate (of 2025 7 10))
(def start-date (. LocalDate (of 2025 7 5)))
(def end-date (. LocalDate (of 2025 7 10)))
;; (range start-date end-date) ;; doesn't work
(.plusDays (. LocalDate now) 1)
(defn inc-day [date-time]
(.plusDays date-time 1))
;; (range start-date end-date inc-day) ;; doesn't work
(defn sequence-of-days []
(iterate inc-day (. LocalDate now)))
(take 7 (sequence-of-days))
(defn next-n-days
"Returns a sequence of the next n days.
If n is not supplied, returns a sequence of the next 7 days.
**Examples:**
```clojure
(next-n-days 5)
(next-n-days)
```
"
([n]
(take n (sequence-of-days)))
([]
(next-n-days 7)))
(next-n-days)
(next-n-days 5)
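As an aside, the (. LocalDate now) interop form used above is the underlying special form; the more idiomatic sugar is equivalent:
(LocalDate/now)          ;; same as (. LocalDate now)
(LocalDate/of 2025 7 5)  ;; same as (. LocalDate (of 2025 7 5))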
When printing, please avoid println invocations with more than one argument, for example:
(defn process [x]
(println "processing item" x))
Above, we have two items passed into the function, not one. This style can let you down when processing data in parallel.
Let’s run this function with a regular map as follows:
(doall
(map process (range 10)))
The output looks fair:
processing item 0
processing item 1
processing item 2
processing item 3
processing item 4
processing item 5
processing item 6
processing item 7
processing item 8
processing item 9
Replace map with pmap, which is a semi-parallel method of processing. Now the output goes nuts:
(doall
 (pmap process (range 10)))
processing itemprocessing item 10
processing item processing item8 7
processing item 6
processing itemprocessing item 4
processing item 3
processing item 2
5
processing item
9
Why?
When you pass more than one argument to the println function, it doesn’t print them at once. Instead, it sends them to the underlying java.io.Writer instance one at a time, in a cycle. Under the hood, each .write Java invocation is synchronized, so no one can interfere while a certain chunk of characters is being printed.
But when multiple threads print something in a cycle, they do interfere. For example, one thread prints “processing item” and before it prints “1”, another thread prints “processing item”. At this moment, you have “processing itemprocessing item” on your screen.
Then, the first thread prints “1” and, since it’s the last argument to println, it adds \n at the end. Now the second thread prints “2” with a line break at the end, so you see this:
processing itemprocessing item
1
2
The more cores and threads your computer has, the more entangled the output becomes.
This kind of mistake happens often. People do complex things in a map function, like querying a database or fetching data from an API. They forget that pmap can speed such cases up by as much as ten times. But unfortunately, all prints invoked with two or more arguments get entangled.
There are two things to remember. The first is not to use println with more than one argument. For two or more, use printf as follows:
(defn process [x]
(printf "processing item %d%n" x))
Above, the %n sequence stands for a platform-specific line-ending character (or a sequence of characters on Windows). Let’s check it out:
(doall
 (pmap process (range 10)))
processing item 0
processing item 2
processing item 1
processing item 4
processing item 3
processing item 5
processing item 6
processing item 8
processing item 9
processing item 7
Although the order of the numbers is random due to the parallel nature of pmap, each line stays consistent.
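If you’d rather keep println-style output, another option (my addition, not from the post) is to build the whole line, newline included, as a single string, so the writer receives it in one synchronized write:
(defn process [x]
  ;; one argument, one .write call, no interleaving
  (print (str "processing item " x "\n"))
  (flush))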
One may say “just use logging”, but too often setting up logging is another pain: add clojure.tools.logging, add log4this, add log4that, put logging.xml on the classpath, and so on.
The second thing: for IO-heavy computations, consider pmap over map. It takes one extra “p” character but can complete the task up to ten times faster. Amazing!
I want to simulate an orbiting spacecraft using the Jolt Physics engine (see sfsim homepage for details). The Jolt Physics engine solves difficult problems such as gyroscopic forces, collision detection with linear casting, and special solutions for wheeled vehicles with suspension.
The integration method of the Jolt Physics engine is the semi-implicit Euler method. The following formula shows how speed v and position x are integrated for each time step:
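In standard notation, with acceleration $a(x)$ and time step $\Delta t$, that update is:
$$v_{n+1} = v_n + a(x_n)\,\Delta t, \qquad x_{n+1} = x_n + v_{n+1}\,\Delta t$$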
The gravitational acceleration by a planet is given by:
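$$a(x) = -\frac{G M}{\lvert x \rvert^{3}}\,x$$
Here $G$ is the gravitational constant, $M$ the planet's mass, and $x$ the position relative to the planet's centre.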
To test orbiting, one can set the initial conditions of the spacecraft to a perfect circular orbit:
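A standard choice (the particular axes here are my assumption) is to place the craft at distance $R$ from the centre and give it the circular orbital speed perpendicular to the position vector:
$$x_0 = (R,\, 0,\, 0), \qquad v_0 = \left(0,\, \sqrt{\tfrac{G M}{R}},\, 0\right)$$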
The orbital radius R was set to the Earth radius of 6378 km plus 408 km (the height of the ISS). The Earth mass was assumed to be 5.9722e+24 kg. For increased accuracy, the Jolt Physics library was compiled with the option -DDOUBLE_PRECISION=ON.
A full orbit was simulated using different values for the time step. The following plot shows the height deviation from the initial orbital height over time.
When examining the data, one can see that the integration method returns close to the initial height after one orbit. The orbital error of the Euler integration method looks like a sine wave. Even for a small time step of dt = 0.031 s, the maximum orbit deviation is 123.8 m. The following plot shows that for increasing time steps, the maximum error grows linearly.
For time lapse simulation with a time step of 16 seconds, the errors will exceed 50 km.
A possible solution is to use Runge Kutta 4th order integration instead of symplectic Euler. The 4th order Runge Kutta method can be implemented using a state vector consisting of position and speed:
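$$y = \begin{pmatrix} x \\ v \end{pmatrix}$$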
The derivative of the state vector consists of speed and gravitational acceleration:
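$$\frac{dy}{dt} = f(y, t) = \begin{pmatrix} v \\ a(x) \end{pmatrix}$$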
The Runge Kutta 4th order integration method is as follows:
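$$\begin{aligned} k_1 &= f(y_n,\, 0), \\ k_2 &= f\!\left(y_n + \tfrac{\Delta t}{2}\,k_1,\ \tfrac{\Delta t}{2}\right), \\ k_3 &= f\!\left(y_n + \tfrac{\Delta t}{2}\,k_2,\ \tfrac{\Delta t}{2}\right), \\ k_4 &= f\!\left(y_n + \Delta t\,k_3,\ \Delta t\right), \end{aligned}$$
$$y_{n+1} = y_n + \frac{\Delta t}{6}\left(k_1 + 2k_2 + 2k_3 + k_4\right)$$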
The Runge Kutta method can be implemented in Clojure as follows:
(defn runge-kutta
"Runge-Kutta integration method"
{:malli/schema [:=> [:cat :some :double [:=> [:cat :some :double] :some] add-schema scale-schema] :some]}
[y0 dt dy + *]
(let [dt2 (/ ^double dt 2.0)
k1 (dy y0 0.0)
k2 (dy (+ y0 (* dt2 k1)) dt2)
k3 (dy (+ y0 (* dt2 k2)) dt2)
k4 (dy (+ y0 (* dt k3)) dt)]
(+ y0 (* (/ ^double dt 6.0) (reduce + [k1 (* 2.0 k2) (* 2.0 k3) k4])))))
The following code can be used to test the implementation:
(def add (fn [x y] (+ x y)))
(def scale (fn [s x] (* s x)))
(facts "Runge-Kutta integration method"
(runge-kutta 42.0 1.0 (fn [_y _dt] 0.0) add scale) => 42.0
(runge-kutta 42.0 1.0 (fn [_y _dt] 5.0) add scale) => 47.0
(runge-kutta 42.0 2.0 (fn [_y _dt] 5.0) add scale) => 52.0
(runge-kutta 42.0 1.0 (fn [_y dt] (* 2.0 dt)) add scale) => 43.0
(runge-kutta 42.0 2.0 (fn [_y dt] (* 2.0 dt)) add scale) => 46.0
(runge-kutta 42.0 1.0 (fn [_y dt] (* 3.0 dt dt)) add scale) => 43.0
(runge-kutta 1.0 1.0 (fn [y _dt] y) add scale) => (roughly (exp 1) 1e-2))
The Jolt Physics library allows applying impulses to the spacecraft. The idea is to use Runge Kutta 4th order integration to get an accurate estimate of the speed and position of the spacecraft after the next time step. One can apply an impulse before running an Euler step so that the position after the Euler step matches the Runge Kutta estimate. A second impulse is then used after the Euler time step to also make the speed match the Runge Kutta estimate. Given the initial state (x(n), v(n)) and the desired next state (x(n+1), v(n+1)) (obtained from Runge Kutta), the formulas for the two impulses are as follows:
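Reading these off the matching-scheme implementation below (the first impulse fixes the position after the Euler step, the second fixes the speed):
$$\Delta v_1 = \frac{x_{n+1} - x_n}{\Delta t} - v_n, \qquad \Delta v_2 = \left(v_{n+1} - v_n\right) - \Delta v_1$$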
The following code shows the implementation of the matching scheme using two speed changes in Clojure:
(defn matching-scheme
"Use two custom acceleration values to make semi-implicit Euler result match a ground truth after the integration step"
[y0 dt y1 scale subtract]
(let [delta-speed0 (scale (/ 1.0 ^double dt) (subtract (subtract (:position y1) (:position y0)) (scale dt (:speed y0))))
delta-speed1 (subtract (subtract (:speed y1) (:speed y0)) delta-speed0)]
[delta-speed0 delta-speed1]))
The following plot shows the height deviations observed when using Runge Kutta integration.
The following plot of maximum deviation shows that the errors are much smaller.
Although the accuracy of the Runge Kutta matching scheme is higher, a loss of 40 m of height per orbit is undesirable. Inspecting the Jolt Physics source code reveals that the double-precision setting affects position vectors but is not applied to speed and impulse vectors. To test whether double precision speed and impulse vectors would increase the accuracy, a test implementation of the semi-implicit Euler method with Runge Kutta matching scheme was used. The following plot shows that the orbit deviations are now much smaller.
The updated plot of maximum deviation shows that using double precision the error for one orbit is below 1 meter for time steps up to 40 seconds.
I am currently looking into building a modified Jolt Physics version which uses double precision for speed and impulse vectors. I hope that I will get the Runge Kutta 4th order matching scheme to work so that I get an integrated solution for numerically accurate orbits as well as collision and vehicle simulation.
Update:
Jorrit Rouwé has informed me that he currently does not want to add support for double precision speed values. He also has more detailed information about using Jolt Physics for space simulation on his website.
I have managed to get a prototype working using the moving coordinate system approach. One can perform the Runge Kutta integration using double precision coordinates and speed vectors with the Earth at the centre of the coordinate system. The Jolt Physics integration then happens in a coordinate system which is at the initial position and moving with the initial speed of the spaceship. The first impulse of the matching scheme is applied and then the semi-implicit Euler integration step is performed using Jolt Physics with single precision speed vectors and impulses. Then the second impulse is applied. Finally the position and speed of the double precision moving coordinate system are incremented using the position and speed value of the Jolt Physics body. The position and speed of the Jolt Physics body are then reset to zero and the next iteration begins.
The following plot shows the height deviations observed using this approach:
The maximum errors for different time steps are shown in the following plot:
In this engaging remote episode, we are joined by Vincent Cantin, a seasoned developer with a fascinating journey from game development to web development and Clojure. Vincent's career began over 20 years ago when his Game Boy Advance emulator project...
Updated 2025-08-14 for Clojure CLI 1.12.1.1561, which added the basis function.
Most of us who use the Clojure CLI are familiar with the -M (main), -X (exec), and -T (tool) options, and may have used clojure -X:deps tree at some point to figure out version conflicts in our dependencies. The :deps alias can do a lot more, so let's take a look!