OSS updates May and June 2025

In this post I&aposll give updates about open source I worked on during May and June 2025.

To see previous OSS updates, go here.

Sponsors

I&aposd like to thank all the sponsors and contributors that make this work possible. Without you the below projects would not be as mature or wouldn&apost exist or be maintained at all! So a sincere thank you to everyone who contributes to the sustainability of these projects.

gratitude

Current top tier sponsors:

Open the details section for more info about sponsoring.

Sponsor info

If you want to ensure that the projects I work on are sustainably maintained, you can sponsor this work in the following ways. Thank you!

Updates

Here are updates about the projects/libraries I&aposve worked on in the last two months, 19 in total!

  • babashka: native, fast starting Clojure interpreter for scripting.

    • Bump edamame (support old-style #^ metadata)
    • Bump SCI: fix satisfies? for protocol extended to nil
    • Bump rewrite-clj to 1.2.50
    • 1.12.204 (2025-06-24)
    • Compatibility with clerk&aposs main branch
    • #1834: make taoensso/trove work in bb by exposing another timbre var
    • Bump timbre to 6.7.1
    • Protocol method should have :protocol meta
    • Add print-simple
    • Make bash install script work on Windows for GHA
    • Upgrade Jsoup to 1.21.1
    • 1.12.203 (2025-06-18)
    • Support with-redefs + intern (see SCI issue #973
    • #1832: support clojure.lang.Var/intern
    • Re-allow init as task name
    • 1.12.202 (2025-06-15)
    • Support clojure.lang.Var/{get,clone,reset}ThreadBindingFrame for JVM Clojure compatibility
    • #1741: fix taoensso.timbre/spy and include test
    • Add taoensso.timbre/set-ns-min-level! and taoensso.timbre/set-ns-min-level
    • 1.12.201 (2025-06-12)
    • #1825: Add Nextjournal Markdown as built-in Markdown library
    • Promesa compatibility (pending PR here)
    • Upgrade clojure to 1.12.1
    • #1818: wrong argument order in clojure.java.io/resource implementation
    • Add java.text.BreakIterator
    • Add classes for compatibility with promesa:
      • java.lang.Thread$Builder$OfPlatform
      • java.util.concurrent.ForkJoinPool
      • java.util.concurrent.ForkJoinPool$ForkJoinWorkerThreadFactory
      • java.util.concurrent.ForkJoinWorkerThread
      • java.util.concurrent.SynchronousQueue
    • Add taoensso.timbre/set-min-level!
    • Add taoensso.timbre/set-config!
    • Bump fs to 0.5.26
    • Bump jsoup to 1.20.1
    • Bump edamame to 1.4.30
    • Bump taoensso.timbre to 6.7.0
    • Bump pods: more graceful error handling when pod quits unexpectedly
    • #1815: Make install-script wget-compatible (@eval)
    • #1822: type should prioritize :type metadata
    • ns-name should work on symbols
    • :clojure.core/eval-file should affect *file* during eval
    • #1179: run :init in tasks only once
    • #1823: run :init in tasks before task specific requires
    • Fix resolve when *ns* is bound to symbol
    • Bump deps.clj to 1.12.1.1550
    • Bump http-client to 0.4.23
  • SCI: Configurable Clojure/Script interpreter suitable for scripting

    • 0.10.47 (2025-06-27)
    • Security issue: function recursion can be forced by returning internal keyword as return value
    • Fix #975: Protocol method should have :protocol var on metadata
    • Fix #971: fix satisfies? for protocol that is extended to nil
    • Fix #977: Can&apost analyze sci.impl.analyzer with splint
    • 0.10.46 (2025-06-18)
    • Fix #957: sci.async/eval-string+ should return promise with :val nil for ns form rather than :val <Promise>
    • Fix #959: Java interop improvement: instance method invocation now leverages type hints
    • Bump edamame to 1.4.30
    • Give metadata :type key priority in type implementation
    • Fix #967: ns-name should work on symbols
    • Fix #969: ^:clojure.core/eval-file metadata should affect binding of *file* during evaluation
    • Sync sci.impl.Reflector with changes in clojure.lang.Reflector in clojure 1.12.1
    • Fix :static-methods option for class with different name in host
    • Fix #973: support with-redefs on core vars, e.g. intern. The fix for this issue entailed quite a big refactor of internals which removes "magic" injection of ctx in core vars that need it.
    • Add unchecked-set and unchecked-get for CLJS compatibility
  • clerk: Moldable Live Programming for Clojure

    • Make clerk compatible with babashka
  • quickblog: light-weight static blog engine for Clojure and babashka

    • 0.4.7 (2025-06-12)
    • Switch to Nextjournal Markdown for markdown rendering The minimum babashka version to be used with quickblog is now v1.12.201 since it comes with Nextjournal Markdown built-in.
    • Link to previous and next posts; see "Linking to previous and next posts" in the README (@jmglov)
    • Fix flaky caching tests (@jmglov)
    • Fix argument passing in test runner (@jmglov)
    • Add --date to api/new. (@jmglov)
    • Support Selmer template for new posts in api/new; see Templates > New posts in README. (@jmglov)
    • Add &aposlanguage-xxx&apos to pre/code blocks
    • Fix README.md with working version in quickstart example
    • Fix #104: fix caching with respect to previews
    • Fix #104: document :preview option
  • edamame: configurable EDN and Clojure parser with location metadata and more

    • 1.4.31 (2025-06-25)
    • Fix #124: add :imports to parse-ns-form
    • Fix #125: Support #^:foo deprecated metadata reader macro (@NoahTheDuke)
    • Fix #127: expose continue value that indicates continue-ing parsing (@NoahTheDuke)
    • Fix #122: let :auto-resolve-ns affect syntax-quote
    • 1.4.30
    • #120: fix :auto-resolve-ns failing case
  • squint: CLJS syntax to JS compiler

    • #678: Implement random-uuid (@rafaeldelboni)
    • v0.8.149 (2025-06-19)
    • #671: Implement trampoline (@rafaeldelboni)
    • Fix #673: remove experimental atom as promise option since it causes unexpected behavior
    • Fix #672: alias may contain dots
    • v0.8.148 (2025-05-25)
    • Fix #669: munge refer-ed + renamed var
    • v0.8.147 (2025-05-09)
    • Fix #661: support throw in expression position
    • Fix #662: Fix extending protocol from other namespace to nil
    • Better solution for multiple expressions in return context in combination with pragmas
    • Add an ajv example
  • clj-kondo: static analyzer and linter for Clojure code that sparks joy.

    • #2560: NEW linter: :locking-suspicious-lock: report when locking is used on a single arg, interned value or local object
    • #2555: false positive with clojure.string/replace and partial as replacement fn
    • 2025.06.05
    • #2541: NEW linter: :discouraged-java-method. See docs
    • #2522: support :config-in-ns on :missing-protocol-method
    • #2524: support :redundant-ignore on :missing-protocol-method
    • #2536: false positive with format and whitespace flag after percent
    • #2535: false positive :missing-protocol-method when using alias in method
    • #2534: make :redundant-ignore aware of .cljc
    • #2527: add test for using ns-group + config-in-ns for :missing-protocol-method linter
    • #2218: use ReentrantLock to coordinate writes to cache directory within same process
    • #2533: report inline def under fn and defmethod
    • #2521: support :langs option in :discouraged-var to narrow to specific language
    • #2529: add :ns to &env in :macroexpand-hook macros when executing in CLJS
    • #2547: make redundant-fn-wrapper report only for all cljc branches
    • #2531: add :name data to :unresolved-namespace finding for clojure-lsp
  • sci.configs: A collection of ready to be used SCI configs.

  • scittle: Execute Clojure(Script) directly from browser script tags via SCI

  • nbb: Scripting in Clojure on Node.js using SCI

    • 1.3.204 (2025-05-15)
    • #389: fix regression caused by #387
    • 1.3.203 (2025-05-13)
    • #387: bump import-meta-resolve to fix deprecation warnings on Node 22+
    • 1.3.202 (2025-05-12)
    • Fix nbb nrepl server for Deno
    • 1.3.201 (2025-05-08)
    • Deno improvements for loading jsr: and npm: deps, including react in combination with reagent
    • #382: prefix all node imports with node:
  • quickdoc: Quick and minimal API doc generation for Clojure

    • v0.2.5 (2025-05-01)
    • Fix #32: fix anchor links to take into account var names that differ only by case
    • v0.2.4 (2025-05-01)
    • Revert source link in var title and move back to <sub>
    • Specify clojure 1.11 as the minimal Clojure version in deps.edn
    • Fix macro information
    • Fix #39: fix link when var is named multiple times in docstring
    • Upgrade clj-kondo to 2025.04.07
    • Add explicit org.babashka/cli dependency
  • Nextjournal Markdown

    • 0.7.186
    • Make library more GraalVM native-image friendly
    • 0.7.184
    • Consolidate utils in nextjournal.markdown.utils
    • 0.7.181
    • Hiccup JVM compatibility for fragments (see #34)
    • Support HTML blocks (:html-block) and inline HTML (:html-inline) (see #7)
    • Bump commonmark to 0.24.0
    • Bump markdown-it to 14.1.0
    • Render :code according to spec into <pre> and <code> block with language class (see #39)
    • No longer depend on applied-science/js-interop
    • Accept parsed result in ->hiccup function
    • Expose nextjournal.markdown.transform through main nextjournal.markdown namespace
    • Stabilize API and no longer mark library alpha
  • babashka.nrepl-client

    • Add :responses key with raw responses
  • speculative

    • Add spec for even?
  • http-client: babashka&aposs http-client

    • 0.4.23 (2025-06-06)
    • #75: override existing content type header in multipart request
    • Accept :request-method in addition to :request to align more with other clients
    • Accept :url in addition to :uri to align more with other clients
  • unused-deps: Find unused deps in a clojure project

    • This is a brand new project!
  • fs - File system utility library for Clojure

    • #147: fs/unzip should allow selective extraction of files (@sogaiu)
    • #145: fs/modified-since works only with ms precision but should support the precision of the filesystem
  • cherry: Experimental ClojureScript to ES6 module compiler

    • Fix cherry.embed which is used by malli
  • deps.clj: A faithful port of the clojure CLI bash script to Clojure

    • Released several versions catching up with the clojure CLI

Other projects

These are (some of the) other projects I&aposm involved with but little to no activity happened in the past month.

Click for more details - [CLI](https://github.com/babashka/cli): Turn Clojure functions into CLIs! - [process](https://github.com/babashka/process): Clojure library for shelling out / spawning sub-processes - [html](https://github.com/borkdude/html): Html generation library inspired by squint's html tag - [instaparse-bb](https://github.com/babashka/instaparse-bb): Use instaparse from babashka - [sql pods](https://github.com/babashka/babashka-sql-pods): babashka pods for SQL databases - [rewrite-edn](https://github.com/borkdude/rewrite-edn): Utility lib on top of - [rewrite-clj](https://github.com/clj-commons/rewrite-clj): Rewrite Clojure code and edn - [pod-babashka-go-sqlite3](https://github.com/babashka/pod-babashka-go-sqlite3): A babashka pod for interacting with sqlite3 - [tools-deps-native](https://github.com/babashka/tools-deps-native) and [tools.bbuild](https://github.com/babashka/tools.bbuild): use tools.deps directly from babashka - [http-server](https://github.com/babashka/http-server): serve static assets - [bbin](https://github.com/babashka/bbin): Install any Babashka script or project with one comman - [qualify-methods](https://github.com/borkdude/qualify-methods) - Initial release of experimental tool to rewrite instance calls to use fully qualified methods (Clojure 1.12 only0 - [neil](https://github.com/babashka/neil): A CLI to add common aliases and features to deps.edn-based projects.
- [tools](https://github.com/borkdude/tools): a set of [bbin](https://github.com/babashka/bbin/) installable scripts - [sci.nrepl](https://github.com/babashka/sci.nrepl): nREPL server for SCI projects that run in the browser - [babashka.json](https://github.com/babashka/json): babashka JSON library/adapter - [squint-macros](https://github.com/squint-cljs/squint-macros): a couple of macros that stand-in for [applied-science/js-interop](https://github.com/applied-science/js-interop) and [promesa](https://github.com/funcool/promesa) to make CLJS projects compatible with squint and/or cherry. - [grasp](https://github.com/borkdude/grasp): Grep Clojure code using clojure.spec regexes - [lein-clj-kondo](https://github.com/clj-kondo/lein-clj-kondo): a leiningen plugin for clj-kondo - [http-kit](https://github.com/http-kit/http-kit): Simple, high-performance event-driven HTTP client+server for Clojure. - [babashka.nrepl](https://github.com/babashka/babashka.nrepl): The nREPL server from babashka as a library, so it can be used from other SCI-based CLIs - [jet](https://github.com/borkdude/jet): CLI to transform between JSON, EDN, YAML and Transit using Clojure - [pod-babashka-fswatcher](https://github.com/babashka/pod-babashka-fswatcher): babashka filewatcher pod - [lein2deps](https://github.com/borkdude/lein2deps): leiningen to deps.edn converter - [cljs-showcase](https://github.com/borkdude/cljs-showcase): Showcase CLJS libs using SCI - [babashka.book](https://github.com/babashka/book): Babashka manual - [pod-babashka-buddy](https://github.com/babashka/pod-babashka-buddy): A pod around buddy core (Cryptographic Api for Clojure). - [gh-release-artifact](https://github.com/borkdude/gh-release-artifact): Upload artifacts to Github releases idempotently - [carve](https://github.com/borkdude/carve) - Remove unused Clojure vars - [4ever-clojure](https://github.com/oxalorg/4ever-clojure) - Pure CLJS version of 4clojure, meant to run forever! - [pod-babashka-lanterna](https://github.com/babashka/pod-babashka-lanterna): Interact with clojure-lanterna from babashka - [joyride](https://github.com/BetterThanTomorrow/joyride): VSCode CLJS scripting and REPL (via [SCI](https://github.com/babashka/sci)) - [clj2el](https://borkdude.github.io/clj2el/): transpile Clojure to elisp - [deflet](https://github.com/borkdude/deflet): make let-expressions REPL-friendly! - [deps.add-lib](https://github.com/borkdude/deps.add-lib): Clojure 1.12's add-lib feature for leiningen and/or other environments without a specific version of the clojure CLI

Permalink

Introducing Purple Rockets: Your path to making an impact at Nu Mexico

Since arriving in Mexico, Nu has established itself as one of the main drivers of change in the country’s financial sector. By offering simple, transparent products built around real people’s needs, we’re creating a new relationship with money and delivering tangible value in our customers’ financial lives.

This fast-paced growth has opened the door for professionals who want to be part of something bigger: helping us transform a constantly evolving market. To support this movement, we’re launching Purple Rockets, our talent acceleration program for early-career professionals who already have some hands-on experience.

The program includes tracks focused on intensive learning and work on high-impact projects. If you’re looking for a purpose-driven environment where you can grow and take on meaningful challenges, this could be your next journey.

What is Purple Rockets and why it matters

Purple Rockets is a structured talent acceleration program designed for early-career professionals who have gained some industry experience and are eager to fast-track their career growth.

Rather than being a conventional entry-level program, Purple Rockets is designed to empower professionals by giving them real responsibilities, continuous mentorship, and exposure to critical projects, preparing them for accelerated growth within Nu.

The initiative aligns perfectly with Nu’s strategic mission in Mexico and Latin America, ensuring the development of a pipeline of highly skilled professionals ready to contribute immediately and significantly to Nu’s long-term goals.

Roles and profiles we’re looking for

Purple Rockets seeks passionate, proactive professionals ready to further their career paths. Key positions include:

  • Software Engineers: Passionate about technology and innovation, ready to build scalable products.
  • Product Managers: Able to envision and drive product development strategies effectively.
  • Product Operations: Strategic thinkers who enhance operational efficiency and product performance.
  • Business Analysts: Experts in data-driven decision-making, translating analytics into actionable insights.

Check the details of the positions and hiring requirements in the job post available here.

Hiring process

Purple Rockets has a structured selection process designed to identify high-potential candidates who are aligned with our values. Here’s what to expect:

  • Online test: Assesses logical reasoning and basic technical knowledge.
  • Group exercise: An opportunity to observe collaboration, active listening, and team-based problem solving.
  • Interview with the recruiting team: Focused on your journey so far, your motivations, and cultural fit.
  • Interview with Nu leaders: The final stage, where you’ll meet with managers from your area of interest.

We’re looking for curious, proactive people who are committed to continuous learning and ready to grow with us.

Why join Purple Rockets?

Joining Nu means becoming part of a dynamic, transformative organization that prioritizes innovation, inclusion, and impact. At Nu, you’ll:

  • Drive growth with technical and business acumen: You’ll be strongly encouraged to combine your technical knowledge with business language, learning to navigate the organization and influence key aspects of the company’s exponential growth. This program is designed for tech professionals who want to apply their skills to broader business challenges.
  • Work in a diverse, inclusive environment: We value different perspectives and foster an inclusive atmosphere.
  • Innovate and impact lives: You’ll contribute directly to innovative solutions, significantly impacting customers’ lives in Mexico and across Latin America.

Program Structure

The Purple Rockets program is structured into three distinct phases:

Launchpad & Ignition

This initial two-month onboarding phase immerses participants fully into Nu’s culture and working environment. It starts with a three-week, intensive, in-person cultural immersion, followed by five weeks of specialized, function-specific training. 

Participants receive personalized mentorship and guidance from experienced professionals, preparing them comprehensively for future challenges.

Orbital mission

This ten-month phase provides substantial hands-on experience, enabling participants to tackle real-world problems across various teams through rotations every three to five months. Regular mentorship from seasoned leaders, combined with structured performance evaluations, ensures continuous learning and development.

Moonwalk

In these final six months, participants take on greater responsibilities and autonomy in their roles. This phase is designed for you to solidify your skills and demonstrate your full potential, and it’s where you’ll prove your progress and demonstrate your mastery, showcasing your readiness to function independently as a key contributor at Purple Rockets. Your performance during this acceleration phase will play an important role in shaping your growth and future opportunities within the company as a full-time employee.

Ready to make an impact?

We encourage you to apply even if your background doesn’t perfectly match every listed criterion. At Nu, your potential, passion, and creativity matter most.

Ready to make an impact? Check the details of the positions and hiring requirements in the job post available here.

The post Introducing Purple Rockets: Your path to making an impact at Nu Mexico appeared first on Building Nubank.

Permalink

The economic inevitability of AI

This week, our Apropos episode is on Thursday. We invite Peter Strömberg to the show!

The Clojure/conj CFP is open! If you’re looking to speak, go apply! I’ll be there giving a workshop on domain modeling in Clojure.


The economic inevitability of AI

The current hype around AI triggers our cognitive biases and makes it hard to take an objective look. It’s just hard to be certain about anything. On what timeline will AI take our jobs? What threats does AI pose to culture and civilization? Will AI help us manage and mitigate climate change or will it exacerbate it? There are a lot of difficult questions. But there’s one thing I’m fairly confident about, and I think it’s worth taking the time to clear the air about it. I’m certain of the economic inevitability of the AI industry.

The future is clear. It has replayed itself many times in the history of business, and certainly in the history of the computing industry. The future is this: Some company (probably OpenAI but we can’t be sure) will capture a significant percentage of the revenue of white-collar labor. Just like Amazon now captures a significant portion of online commerce, just like AWS takes a cut of most startup’s revenue, the winner of the AI race will extract a fee from most every white collar worker. Like licenses for Microsoft Office or Windows, companies will pay for access to AIs they believe will make their employees more productive.

Companies will gladly ask their employees to use AI if it speeds up their work. Businesses want to be competitive, so as other companies gain efficiencies from their use of AI, they’ll need to do it, too, to keep up. It’s a rational choice. I’d ask my company to do it, too, if I were the CTO. Eventually, most employees doing knowledge work will be using AI. And the employers will be paying licenses to OpenAI (or whoever wins).

We live in a world where winner-takes-all is the natural course of things. Slight advantages compound. The big company can acquire, undercut, or outlast the small company. Eventually, the market sorts out the winner with ~80% market share. The rest of the companies fight over the remaining 20%.

In short, one company will have 80% market share, and the market is AI accelerators for knowledge work. Microsoft had “A computer on every desk.” OpenAI will have “An AI on every desk.”

Once the 80% winner is found, the enshittification can begin. These companies are fueled by massive investments. They can sell services well below cost for years. OpenAI forecasts losing $44 billion in the next four years. But in 2029, they can start experimenting with how to turn all those customers, now dependent on their services, into profitable clients, likely by raising prices, degrading service, or inserting ads. OpenAI will create a few trillionaires, many billionaires, and uncountable millionaires on their ascent.

If you’re looking to make money, hitch yourself to the rise of these AI giants. You could be one of the millionaires by being one of the companies renting the shovels during this gold rush.

History is repeating. Personal computers were supposed to liberate us. But what we didn’t realize when the digital dream of personal freedom was still alive was that the companies who amassed more computers and more data would be more free than others. The computer giants donated computers to school.

Similarly, AI is currently sold as a liberating force. It is “democratizing” programming. It’s giving everyone a cheap therapist. They are more reliable companions. But we know that the productivity gains and the huge profits will accrue to the employers and to the AI companies most of all. OpenAI and the others are making products that are sure to become the next essential tool for thought work. Employers will demand that we use them. OpenAI will take rent on all of the work we do. And it will become enshittified. With that inevitability out of the way, we can talk about the more nuanced ideas in the next emails, including some positive ones!

Permalink

Macros Explained for Java Developers

If you’re a Java dev, you’ve probably used or heard of Project Lombok, Jakarta Bean Validation (JSR 380), AutoValue, MapStruct, or Immutables. They all help reduce boilerplate and add declarative magic to your code.
And I’m sure you’ve come across the term “macro”, usually explained in some academic or cryptic way. But here’s the thing: these libraries are simulating macro-like behavior — just without true macro support.

What Are Macros Anyway?

In languages like Lisp or Clojure, macros are compile-time programs that transform your code before it runs. They let you:

  • Rewrite or generate code
  • Build new control structures
  • Create entire domain-specific languages

They're basically code that writes code — giving you full control of the compiler pipeline.

Java’s “Macro” Workarounds

Java doesn’t support macros. Instead, it uses annotation processors and code generation tools:

  • Lombok’s @data → generates constructors, getters, and equals()/hashCode()
  • Jakarta Bean Validation (@min, @notblank) → declarative validation
  • AutoValue → immutable value types
  • MapStruct → type-safe mappers (my personal favorite)
  • Immutables → generates immutable types with builders
  • Spring Validation → framework-driven validation

These are powerful tools — but they can’t create new syntax or change how Java works at its core. They're still working within the language, not extending it.

What Real Macros Look Like

In Clojure, you can define a new data structure and its validator in a single macro:

lisp
(defmacro defvalidated
  [name fields validations]
  `(do
     (defrecord ~name ~fields)
     (defn ~(symbol (str "validate-" name)) [~'x]
       (let [errors# (atom [])]
         ~@(for [[field rule] validations]
             `(when-not (~rule (~field ~'x))
                (swap! errors# conj ~(str field " failed validation"))))
         @errors#))))

Usage:

lisp
(defvalidated User
  [name age]
  {name not-empty
   age #(>= % 18)})

(validate-User (->User "" 15))
;; => ["name failed validation" "age failed validation"]

No annotations. No libraries. No ceremony.
Just your own language feature, built with a macro.

TL;DR

Java’s toolchain simulates macro-like behavior through annotations and codegen. But if you want to invent language, write less boilerplate, and build smarter abstractions — macros in languages like Clojure or Racket offer the real deal.

Java gives you a powerful toolkit. Macros give you the power to build your own.

Inspired by Paul Graham's essay collection "Hackers & Painters"

Permalink

Understanding Clojure reduce better

Code

(defn add [a b]
  (+ a b))

(reduce add [1 2 3 4 5])

(reduce add 1 [2 3 4 5])

(reduce add 3 [3 4 5])

;; (reduce add [3 3 4 5])

(reduce add 3 [3 4 5])

(reduce add 6 [4 5])

(reduce add 10 [5])

(remove #(= % :a) [:a :b :c :d])

(defn remove-value [seq value]
  (remove #(= % value) seq))

(remove-value [:a :b :c :d] :b)

(reduce remove-value [:a :b :c :d] [:b :d])

(reduce remove-value [:a :b :c :d] [:b])

(reduce remove-value [:a :c :d] [:d])

(reduce remove-value [:a :c] [])

Permalink

June 2025 Short-Term Project Updates Q2 2025 Projects

This is the June project update for four of our Q2 2025 Funded Projects. (Reports for the other two are on a different schedule). A brief summary of each project is included to provide overall context.

Brandon Ringe: CALVA
A new REPL output view for Calva, which is a webview in VS Code. The webview will allow us to add more rich features to the output webview, while also providing better performance.

Bozhidar Batsov: CIDER
Provide continued support for CIDER, nREPL and the related libraries (e.g. Orchard, cidernrepl, etc) and improve them in various ways.

Jeaye Wilkerson: Jank
Build jank’s seamless C++ interop system.

Siyoung Byun: SciCloj Building Bridges to New Clojure Users Scicloj aims to improve the accessibility of Clojure for individuals working with data, regardless of their programming backgrounds. The project aims to develop standardized templates to encourage greater consistency across the documentation of existing Scicloj ecosystem libraries, making those libraries more robust and user-friendly.

CALVA: Brandon Ringe

Q2 2025 $9K. Report 2. Published June 16, 2025.

Since the last project update, several improvements have been made to Calva’s new output view.

One of those improvements is a significant performance boost with high frequency output, such as logging to stdout hundreds or thousands of times within a minute. I realized that replicant (and rendering libraries like it), which is an awesome library, is not well-suited for this kind of use case. So I switched to using the DOM API directly, which is much faster and more efficient for this purpose.

Here’s a list of the changes to the output view since the last project update:


CIDER: Bozhidar Batsov

Q2 2025 $9K. Report 2. Published June 17, 2025.
The last month was another strong one for CIDER and friends, which featured many releases across the different tools and libraries. Below are some of the highlights:

  • inf-clojure 3.3
    • This was the first inf-clojure release in 3 years
    • It adds support for clojure-ts-mode and also features some fixes and small improvements
  • clojure-mode 5.20
    • Adds clojuredart-mode and jank-mode for better support of ClojureDart and Jank
    • Also features a couple of small fixes
  • clojure-ts-mode 0.5 was the biggest news in the past month for several reasons
    • It uses an experimental version of the Clojure Tree-sitter grammar that allows us to deal better with Clojure metadata
    • It features support for font-locking embedded JavaScript code in ClojureScript (and C++ code in Jank)
    • Introduces the clojure-ts-extra-def-forms customization option to specify additional defn-like forms that should be treated as definitions (as it’s hard to reliably infer those)
    • It features some simple built-in completion powered by Tree-sitter
      • I blogged about the concept here
  • CIDER also saw some work, but no new release. You can expect in the next release:
    • better support for clojure-ts-mode
    • more improvements to the Inspector
    • user manual (documentation) improvements

On top of this, I’ve been reviewing the data from CIDER’s survey, improving the CI setups of various projects, providing user support and working on some ideas about restructuring the documentation of CIDER and nREPL to be more approachable. The process for streamlining (slimming down) CIDER, cider-nrepl and orchard is ongoing as well. (e.g. we’ve stopped bundling print engines with cider-nrepl and now those have to be supplied by the users)

clojure-ts-mode is shaping up really nicely and has mostly matched the features of clojure-mode. On top of this it does a few things better than clojure-mode, so I’m optimistic that we’ll mark it as ready for general usage before the end of the year. We’ve expanded the design docs recently and I think they can be useful both to potential contributors and to other people looking to create Tree-sitter based major modes. I’m also working in the background on a couple of article for best practices when working with Tree-sitter in Emacs.

Thanks to Clojurists Together for their continued support of my OSS work! You rock!


Jank: Jeaye Wilkerson

Q2 2025 $9K. Report 2. Published June 17, 2025.

Thank you!

Hi folks! Thanks so much for the sponsorship this quarter. Clojurists Together is my largest form of income this year, which makes this even more special.

Seamless interop

In the past month, I’ve implemented:

  • Many, many tests for the new interop functionality
  • Calling global C++ functions
  • Calling C++ member functions
  • Accessing C++ member variables
  • C++ reference semantics, including automatic referencing and dereferencing
  • Dozens of C++ operators

You can find all of the details in my latest blog post, here.

Next up

In the final month of this quarter, I aim to expand the test suite, fix the remaining known bugs, add manual memory management support, better template support, full destructor support, and finally the portability fixes necessary to make all of this work on both macOS and Linux.


SciCloj Building Bridges to New Clojure Users: Siyoung Byun

Q2 2025 $2K. Report 1. Published June 25, 2025.

Work Completed, In Progress and Further Plans

CFD Python into Clojure Project

I initiated the CFD Python into Clojure project, which translates computational fluid dynamics (CFD) learning steps from Python into Clojure. The project also includes a currently evolving notebook page that shows the learning steps progressed so far for the project.
This project showcases interactive, real-world scientific computing examples with an initiative to diversify the set of data science use cases available in Clojure.

I am working toward completing and polishing the project as a featured example of Clojure’s capability in handling scientific computing and numerical simulation use cases. I am planning to discuss the results and experiences at a future SciCloj Light Conference.

Conference Presentation at SciCloj Light #1

I presented a talk at SciCloj Light #1, highlighting the progress (implementing one-dimensional CFD using Burgers' equation), simulations, and future directions from the CFD Python into Clojure initiative. The talk demonstrated how Clojure can serve as a powerful tool in data science and data analysis.

Conference Reflections and Scicloj Community Video

I co-authored and released a video summarizing the SciCloj Light #1 conference experience, along with publishing a written conference survey to gather feedback and reflect on future SciCloj Light conferences with Daniel Slutsky. The video discusses the preparation process, key takeaways, and future directions for Scicloj as a community-driven initiative and as an individual contributor.

Community Outreach Initiative for Broader Participation

I initiated and participated in a community discussion on Clojurian Zulip to encourage Scicloj contributors and users to present at external data science conferences and share their experiences using Clojure and Scicloj libraries for their data science work. This outreach aims to amplify Clojure’s visibility in broader scientific and data communities to obtain more attention and bring newcomers to the community.

Documentation Improvements for Noj

I joined as one of the maintainers of Noj, an out-of-the-box Clojure library designed to simplify data science workflows. Currently, I’ve been focusing on improving the library’s introductory documentation, setup guides, and use cases in notebooks to make it more welcoming and useful to newcomers.
There are already good use cases, ideas, and documentation in the library, so my main goal is to make those resources more coherent, consistent, and organized to be easily searchable. Once we have well-tailored documentation in place for Noj, we hope to roll out the same documentation template for the rest of the Scicloj libraries.

Library Documentation Review and Feedback Collection

I am actively gathering feedback on gaps and pain points in the documentation of various Scicloj libraries through the Clojurian Zulip channel.

Improving “Getting Started” Experiences

I am continuing to improve the beginner documentation and setup processes across Scicloj libraries, with a goal of creating a smoother onboarding experience for users with diverse backgrounds.

Organizing ‘macroexpand’ Gatherings

I will co-organize a series of regular online community meetups called macroexpand gatherings. These are aimed at welcoming both new and existing Clojure users and will foster communication across communities working on professional data projects (data engineering, analysis, visualization, etc.) and identify shared challenges and opportunities for collaboration. By hosting a space to discuss the current status of the community/individuals and challenges together, we hope to prioritize our todos and initiatives better and create actionable items to move forward.

Permalink

Case Study: Reagent With Macro Help

The question came up whether something like the grove << fragment macro could apply to any of the CLJS react wrappers? I said yes, and didn’t really want to get into it. Curious as I am though that question nagged at me and I wanted to know for myself. So, since I’m preaching about Lessons learned and not really wanting to talk about shadow-grove, let use talk about reagent for a bit. I don’t want to put reagent in a bad light or anything, I want to highlight how much optimization potential is left within easy reach.

Given that I have already written the actual << macro, how hard could it be to translate this to reagent? To do an actual proper implementation I can’t really say. Something basic that would satisfy my basic testing setup? Not that hard. Of course, this isn’t nearly as comprehensive as the React Compiler would do, and neither is it supposed to be. I just wanted to test whether this actually provides any kind of speedup at all.

TL;DR: Yep, it gets faster.

Reagent Baseline

I have to emphasize that this setup isn’t anything close to how you are actually supposed to use reagent. This is intentionally naive and probably breaking some rules. I wanted this to measure against React 19 and I don’t even know if that is officially supported by reagent. I did the same testing methodology as my previous post. Some basic hiccup, turned into react elements which then get rendered by react. I was shocked how slow this is during development. Like, 90ms without any slowdowns. Which emphasizes the fact that creating very large hiccup structures is bad. I know I should have used components and not actually do everything as on big hiccup structure. But this was only during dev, release actually looks much more reasonable. Reaching about ~1.6ms with no slowdowns, but what about 20x slowdown?

Trace

Due to react not rendering when I tell it to, but rather delaying itself the trace is actually split in two. One is the actual reagent/as-element call, which converts the hiccup to react elements. And then a bit later actually react rendering it.

Trace

So ~31ms, but with a bit of variance. Seems the go between 20-40ms, in rather unpredictable ways.

Regardless, as always feel free to get your own numbers via the demo page.

Reagent With Macro Help

Same as in my earlier post, the only real change was wrapping all hiccup forms in the << macro I wrote. This macro does two things.

First, it tries to find completely constant parts of hiccup that it can pull out and “memoize”. In the example code that for example the entire :div holding the :svg, as well as the :thead bit. These only get created once, so the hiccup->react element conversion is only done once, and react finds identical? elements and I assume skips some work.

Second, using the same strategy the macro in shadow-grove uses it analyzes the hiccup and looks for dynamic parts. So, for example from this bit:

[:tr
 [:td {:class $col-num}
  (:id row)]
 [:td {:class $col-num}
  (format-temp (:target row))]
 ...]

It identifies the $col-num as “constant”, since it is a def in the local namespace, but (:id row) and (format-temp (:target row)) are code blocks. Code needs to actually run, so instead it basically creates references to these things. Simplified it generates a function for this instead, that roughly looks like

(def memo-fn1
  (make-memo-fn
    (fn [[a1 a2]]
      (as-element
        [:tr
         [:td {:class $col-num} a1]
         [:td {:class $col-num} a2]
         ...]))))

In the place of the actual fragment, it then emits a call to a helper function, which will eventually call the above function, but also memoize it based on the passed data vector.

(use-memo-fn memo-fn1 [(:id row) (format-temp (:target row))])

The two helper functions you can find here. Admittedly naive and all full of bugs, but to be quite honest I don’t care about react at all, so the only goal here was to find out if this all can actually apply to reagent with any meaningful benefits.

Running this variant again with a 20x slowdown looks something like this:

Trace

Again, react still runs when it wants, and as you can see there is now some as-element conversion happening as part of the react expansion. You’d have this normally anyway due to the introduction of some intermediate “components”, but it makes it a bit harder to distinguish where actual time is spent now. No longer has clear boundaries between reagent/react, but that is fine, you’d never have that in the first place if not for my deliberate test setup.

Trace

Either way that is ~13ms, again with quite a bit of variance, but still a measurable difference. Not too bad for a couple of hours of macro tinkering.

Conclusion

Should you use this? Absolutely not. This was only to satisfy my own curiosity. I’m almost certain that some of the other CLJS wrappers do something similar or smarter already. It has been many years since I actually used react myself and I have no interest in turning this into a proper library, or actually optimizing this to its full potential. I’m satisfied to have some evidence that these ideas actually do apply.

Again, I know that I could have gone with a few simple components and probably get about the same or better speedup with that. The goal was testing whether introducing a macro can help “fix” my naive approach.

If done properly you could do what React Compiler does completely within a macro, with no extra help from the compiler at all. And quite honestly anyone using react via CLJS probably should. Otherwise, future comparisons between CLJS and React-Compiler-compiled JS will look embarrassing, even more than they currently already do. Maybe I’m wrong and the React Compiler will actually be usable with CLJS in some way, but I have my doubts, especially since I have no clue how I’d integrate it with shadow-cljs.

Happy to discuss this with anyone equally performance obsessed. ;)

Permalink

Convergence to Normal Distribution, independent of original distribution

We mentioned last time that the result of combining more and more events will approach the normal distribution, regardless of the shape of the original event distribution. Let’s try to demonstrate that visually.

Our previous definition of a random event is an example of a uniform distribution:

(defn event []
  (rand))
(defn event-sample-dataset [event-fn sample-count]
  {:index       (range sample-count)
   :event-value (repeatedly sample-count event-fn)})
(def uniform-ds (event-sample-dataset event 100000))
(defn histogram [ds]
  (-> ds
      (tc/dataset)
      (plotly/layer-histogram
       {:=x :event-value
        :=histnorm "count"
        :=histogram-nbins 40})
      (plotly/layer-point)))
(histogram uniform-ds)

If we combine several of these distributions, watch the shape of the distribution:

(defn avg [nums]
  (/ (reduce + nums) (count nums)))
(defn combined-event [number-of-events]
  (avg (repeatedly number-of-events event)))
(histogram (event-sample-dataset #(combined-event 2) 100000))
(histogram (event-sample-dataset #(combined-event 5) 100000))
(histogram (event-sample-dataset #(combined-event 20) 100000))

Let’s try this again with a different shape of distribution:

(defn triangle-wave [x]
  (-> x (- 0.5) (Math/abs) (* 4.0)))
(-> (let [xs (range 0.0 1.01 0.01)
          ys (mapv triangle-wave xs)]
      (tc/dataset {:x xs :y ys}))
    (plotly/layer-point
     {:=x :x
      :=y :y}))

Generating samples from this distribution is more complicated than I initially expected. This warrants a follow-up, but for now I’ll just link to my source for this method: Urban Operations Research by Richard C. Larson and Amedeo R. Odoni, Section 7.1.3 Generating Samples from Probability Distributions (see “The rejection method”).

(defn sample-from-function [f x-min x-max y-min y-max]
  (loop []
    (let [x (+ x-min (* (rand) (- x-max x-min)))
          y (+ y-min (* (rand) (- y-max y-min)))]
      (if (<= y (f x))
        x
        (recur)))))
(defn event []
  (sample-from-function triangle-wave 0.0 1.0 0.0 2.0))
(def triangle-wave-ds (event-sample-dataset event 100000))
(histogram triangle-wave-ds)

Let’s combine several of these distributions:

(histogram (event-sample-dataset #(combined-event 2) 100000))
(histogram (event-sample-dataset #(combined-event 5) 100000))
(histogram (event-sample-dataset #(combined-event 20) 100000))

I find these visuals surprisingly powerful because you can see the original distribution “morph” into this characteristic shape.

The normal distribution holds a unique place in mathematics and in the world itself: whenever you combine multiple independent and identically-distributed events, the result will converge to the normal distribution as the number of combined events increases.

Permalink

Clojure Deref (June 27, 2025)

Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem (feed: RSS).

The Clojure/conj 2025 Call for Presentations is open now until July 27! We are seeking proposals for both 30 minute sessions and 10 minute lightning talks.

Libraries and Tools

New releases and tools this week:

  • eca - Editor Code Assistant (ECA) - AI pair programming capabilities in any editor

  • brainfloj - A BrainFlow wrapper for Clojure

  • scittlets - A repository of Scittle libraries

  • clojure-mcp 0.1.5-alpha - Clojure MCP

  • bud - A minimalist ClojureScript DOM library with precise, signal-driven reactivity for single-page applications

  • tufte 3.0.0 - Simple performance monitoring library for Clojure/Script

  • beeld - Get the metadata associated with an image. Also contains image utilities: filesize, scale, etc.

  • splint 1.20.0 - A Clojure linter focused on style and code shape.

  • edamame 1.4.31 - Configurable EDN/Clojure parser with location metadata

  • clojobuf 0.2.1 - A clojure(script) library that dynamically interprets protobuf files (.proto) and use the resultant schemas to encode/decode plain clojure(script) map into/from protobuf binaries.

  • rubberbuf 0.3.2 - A clojure(script) library to parse protobuf definition (.proto) into abstract syntax tree (AST).

  • calva-power-tools 0.0.10 - A VS Code Extension providing commands for extra powerful things in the Clojure ecosystem

  • ruuter 1.3.5 - A system-agnostic, zero-dependency router

  • honeyeql 1.0.6 - HoneyEQL is a Clojure library enables you to query database using the EDN Query Language.

  • replicant 2025.06.21 - A data-driven rendering library for Clojure(Script) that renders hiccup to DOM or to strings.

  • scicloj.ml.xgboost 6.4.0 - A xgboost plugin for scicloj.ml

  • qclojure 0.3.0 - A functional quantum computer programming library for Clojure with backend protocols, simulation backends and visualizations.

  • overarch 0.39.0 - Overarch provides a data model for the holistic description of a software system, opening multiple use cases on the model data. Supports C4 and UML diagram generation with PlantUML.

  • calva-backseat-driver 0.0.14 - VS Code AI Agent Interactive Programming. Tools for CoPIlot and other assistants. Can also be used as an MCP server.

  • polylith 0.2.22 - A tool used to develop Polylith based architectures in Clojure.

  • pretty 3.4.1 - Library for helping print things prettily, in Clojure - ANSI fonts, formatted exceptions

  • aleph 0.9.0 - Asynchronous streaming communication for Clojure - web server, web client, and raw TCP/UDP

  • babashka 1.12.204 - Native, fast starting Clojure interpreter for scripting

  • conjtest 0.3.0 - Run tests against common configuration file formats using Clojure!

  • nvd-clojure 5.1.0 - National Vulnerability Database dependency checker for Clojure projects

  • sci 0.10.47 - Configurable Clojure/Script interpreter suitable for scripting and Clojure DSLs

Permalink

Accelerating maps with join-with

We designed ClojureDart maps to be more regular and deterministic than Clojure’s, making it easier to optimize batch operations.

For a while, that potential sat untapped, until a couple of months ago (yeah this issue sat in draft for too long).

Today we are going to talk maps, accelerating merge and introducing a new function, a real killer: join-with.

We’re currently available for new work — short or long term.
You can work with either one of us, or both, depending on what you need. From a quick brain-pick session to a full prototype, MVP, or long-term project: we’d love to hear what you’re working on.

Just hit reply or get in touch!

When we’re not helping clients, we’re working on ClojureDart and Paktol — the positive spending tracker where money goes up 📈, we’ve just released a major new feature (anticipations) which makes it a complete and cohesive product.

Maps implementations differences

Clojure maps are conceptually tries with a 32 fan out. (The clever parts are how we navigate this trie using a hash and how we store sparse children arrays.)

Doing boolean operations on tries is conceptually easy: for a given child slot in two nodes, process the matching subtrees recursively, possibly short-circuiting on identical nodes or nils.

Clojure maps

In practice Clojure hash maps have three kinds of nodes: bitmap nodes, array nodes, collision nodes. Plus a node may need to be compared to an inline key-value pair. This makes a lot of cases to account for — and that's without counting on the fact that small maps are not hash maps but array maps.

ClojureDart maps

For ClojureDart I wanted to have a simpler hash map implementation:

  • No internal polymorphism: it has a single type of nodes: bitmap nodes (they are even reused for collision nodes — a collision node is just a node at maximum depth).
  • History independence: for a given keyset the layout is deterministic.
  • Nodes store the number of items they contain (this speeds up merging disjoint maps a lot).
  • Determining if a node is editable (owned by the transient) is done in a stateless way.

ClojureDart literal maps

What about small maps? In ClojureDart they are hash maps too, however we have one trick: literal maps for which keys are stable (strings, symbols, numbers, booleans etc.) have their layout determined at compilation time (by a macro).

This means that for many small maps (where there's no collision on the lowest hash code 5 bits — Birthday paradox applies: 50% of 7 entries maps) we allocate an array containing the interleaved keys and values in the right order (determined at compile time) with the right bitmap (determined at compile time). Making it almost as cheap as allocating an array map.

If your map requires two nodes or more they will be predetermined in the same way.

If not all your keys have safe hash code, then the infringing keys will be added at runtime using normal transient association.

Fast structural merge

The first function to benefit from ClojureDart simpler maps was merge — or conj between maps, as it's often overlooked that (conj {:a 1} {:b 2 :c 3}) is legit.

What does it mean for merge to be accelerated? First it means that (merge small-map big-map) is as fast as (merge big-map small-map).

It also means that a lot of hash computation and bit-twiddling and pointer traversal is avoided.

Together with predetermined literal maps this has some surprising results: (conj m {:a x :b y}) is now the fastest way to associate two keys on a map!

Generalizing merge-with

Instead of accelerating merge-with, we accelerated a new function join-with which is "merge-with with too many functions".

(join-with combine a b)
(join-with combine fa a b)
(join-with combine fa fb a b)
(join-with combine fa fb where? a b)

join-with can express any boolean operation on maps (and sets) combined with transformation and filtering, in a single lockstep walk of both maps.

(combine va vb) combines values when k belongs to maps a and b.

(fa va) is the value when k belongs only to map a.

(fb vb) is the value when k belongs only to map b.

where? is a predicate applied to values returned by combine, fa or fb to determine if the entry should be included in the result.

Any of combine, fa or fb can be nil to exclude the corresponding keysets. identity is a special value for fa and fb: it's recognized and treated more efficiently than (fn [x] x).

What can join-with do?

Obviously you can express merge-with:

(defn merge-with [f a b]
  (join-with f identity identity a b))

More surprisingly update-vals:

(defn update-vals [f m]
  (join-with nil f m {}))

Despite update-vals taking only one map, the more efficient traversal and building of the output is still a gain!

Another surprise is select-keys: when the keys to preserve are given by a set (instead of a vector)

(defn select-keys* [m ks]
  (assert (set? ks))
  (join-with (fn [v _] v) m ks)))

You can implement diff, patch, merge3 and so on:

(defn diff [from to]
  (join-with
    (fn [from to]
      (when-not (= from to)
        [:<> from to]))
    (fn [from] [:- from])
    (fn [to] [:+ to])
    some?
    from to))

(defn patch [m diffs]
  (join-with
    (fn [v [op a b]]
      (assert (#{:- :<>} op) "unexpected op")
      (assert (= a v) "unexpected value")
      (case op
        :<> b
        :- diffs)) ; reused as sentinel as it can contain itself
      identity
      (fn [_] (assert false "unexpected op"))
      (fn [x] (not (identical? diffs x)))
    m
    diffs))

(defn merge3
  [resolve ancestor child1 child2]
  (patch ancestor
    (merge-with
      (fn [d1 d2]
        (if (= d1 d2)
          d1
          (resolve d1 d2)))
      (diff ancestor child1)
      (diff ancestor child2))))

The protocol behind

Behind join-with there's -join-with in IJoinable.

-join-with was interesting to design: like equals or equiv it expects another argument "compatible" with this. We are at the limits of single dispatch. Anyway the solution we chose is to hard code compatibilities (hash maps and hash sets are compatible but this can't be extended) and to have -join-with to return nil on incompatible arguments so that slower path or exceptions could be thrown.

Conclusion

join-with is a very versatile function and we'd like to extend it to sorted collections. However we would also like to explore other ways to process several maps at once, like a sequence or iterator which allows to seek up to another sequence or iterator in a compatible collection (here compatible meaning iteration order).

Permalink

Evolving Nubank’s Architecture Toward Thoughtful Platformization

Nubank Engineering Meetup #12 offered valuable insights into the company’s technical evolution — how we’ve reshaped our platforms, systems, and career paths over the years. The conversation was led by Bel Gonçalves, now a Principal Engineer at Nubank, who shared the lessons learned over nearly a decade building technology at the company.

More than an inspiring personal story, the meetup gave attendees a deep dive into the topics that shape our daily engineering work — from architectural decisions to the development of large-scale technical career structures.

From a single product to a platform-based architecture

Nubank’s original architecture was simple, like many early-stage startups: built for a single product (credit card) in a single country (Brazil). But as we expanded by adding financial services, entering new markets, and navigating diverse regulations, this structure had to evolve.

The answer was platformization. The challenge was to build systems flexible enough to support different products, across multiple countries, with unique requirements, without rewriting everything from scratch each time. This meant clearly separating product-specific logic such as localized business rules from reusable components such as authorization engines and card and account management.

By making services more parameterizable, we accelerated development, reduced redundancy, and maintained the resilience needed to operate at scale. The journey included tough decisions, such as extracting critical code from legacy services in production, rewriting foundational components, and avoiding overengineering by balancing generality with simplicity.

Platforms that power scalability

One of the clearest examples of this shift was the creation of the card platform, which decoupled debit and credit functionalities from specific products or geographies. What used to be handled by a single service like the former CreditCardAccounts that as the name suggested, was specific to credit card, was restructured and endup derivating other services more flexible and capable to adapt to different realities, such as Brazil’s combo card and Mexico’s single cards for debit and credit.

Another critical milestone was the evolution of our Authorizer system, responsible for real-time transaction approvals. As one of the most sensitive parts of our operation, its migration from physical data centers to the cloud required low latency and high availability, especially to maintain communication with partners like Mastercard. This project required not only technical excellence but also meticulous planning to avoid any disruption to customers.

Standardization as the backbone of consistent engineering

To support engineering scalability, Nubank adopted a consistent approach rooted in standardization. All teams work with Clojure, build microservices, and favor asynchronous communication. This shared foundation encourages reuse, lowers cognitive load, and enables more predictable architectural evolution.

Our use of both Clojure and the Datomic database, both rooted in functional programming, also reflects a focus on safety and predictability. Immutability, for example, is not just a design choice—it’s a necessity to prevent harmful outcomes from incorrect system states.

This level of consistency helps teams replicate proven patterns and best practices across different contexts, accelerating product and market expansion.

The technical career path at Nubank

As our architecture has matured, so too has our technical career framework. The path includes clear milestones: engineer, senior, lead, staff, senior staff, principal, and finally, distinguished engineer. Each level brings increasing responsibility, not just in code, but in system-wide and strategic influence.

Unlike traditional models that nudge engineers into management roles, Nubank supports the growth of deep technical careers. Engineers can specialize in a given technology or take on broader roles, becoming cross-team technical leaders, especially in products or platforms with many stakeholders.

We also encourage movement between tracks. Experience in people leadership, for instance, can add perspective and empathy to those returning to hands-on technical work, strengthening business understanding and collaboration skills along the way.

Engineering at the intersection of tech, product, and business

In our cross-functional environment, engineering goes far beyond code. Engineers are involved in product decisions, help shape go-to-market strategies, and openly discuss trade-offs with stakeholders from other disciplines. Collaboration with data, design, and business teams is part of our daily rhythm, improving both product quality and creative thinking.

This collaborative model means engineers need not only technical depth, but also strong communication, active listening, and negotiation skills.

Culture, trust, and inclusion as core pillars

The architectural structure of Nubank is not just built on services and platforms — it’s built on people. Teams are the core unit of our company, and collaboration is the most essential skill. Behind every critical system, there’s a trusted network where different voices, backgrounds, and ways of thinking come together.

Building strong and diverse teams is part of our culture — and that’s why it’s a strategic priority for us. In initiatives like the creation of the NuCel team, we actively seek to build teams made up of people with different abilities, experiences, and perspectives across functions like engineering, product, design, and more.

Environments like this lead to more complete, empathetic, and relevant solutions for the people who use our products.

Balance and ownership in a high-complexity environment

With over 100 million customers, a growing product portfolio, and operations in multiple countries, pressure and complexity are part of our daily challenges. To manage this, our engineering team relies on mature processes, transparent communication, and a culture of autonomy with accountability.

Planning cycles balance short- and long-term goals. Product timelines are co-developed with engineering, with technical feasibility, resource constraints, and risk trade-offs always in the equation. It is common to adjust scope or renegotiate deadlines, always with a focus on delivering value sustainably.

A culture anchored in learning

If one principle guides everything we do at Nubank Engineering, it is continuous learning. Whether it is tackling a massive refactor, launching a new platform, or navigating the next career step, the mindset is always to stay curious and stay adaptable.

It is not just about mastering a tech stack or leading high-impact projects. It is about being where innovation happens, even when that means stepping out of your comfort zone.

The post Evolving Nubank’s Architecture Toward Thoughtful Platformization appeared first on Building Nubank.

Permalink

Convergence of Random Events

Life is full of random events.

We learn that multiple coin flips are “independent events” – no matter whether the past flip was heads or tails, the next flip is 50/50. (So why do they show the last few results at the routlette table? Hint: Don’t play routlette.) We learn that about half of babies are male and half female, so chances are 50/50 that your new little sibling will be a boy or a girl.

I found the answer to “Of my 8 children, what are the chances that 4 are girls and 4 are boys?” counterintuitive. The central limit theorem is crucial to intuition around this question.

When I initially encountered the Monte Hall problem, the correct answer wasn’t obvious or intuitive, but the mathemetical explanation is surprisingly understandable. We’ll try here to make the central limit theorem more understandable as well.

Start with a single random event – value drawn from [0.0, 1.0)

(rand)
0.9886041928663704

One way to combine random events is to take the average:

(defn avg [nums]
  (/ (reduce + nums) (count nums)))
(avg [0.0 1.0])
0.5

Let’s try taking the average of several events together:

(avg [(rand) (rand)])
0.7292787761221967
(avg [(rand) (rand) (rand)])
0.589531217833212

This is getting repetitive. We can make the computer repeat for us:

(avg (repeatedly 3 rand))
0.10569763169170669

The more events that you average, the closer the result comes to 0.5:

(avg (repeatedly 30 rand))
0.44297901532961975
(avg (repeatedly 300 rand))
0.5110507366501269

Let’s try taking several events together:

(defn event []
  (rand))
(event)
0.7142853995858464
(defn combined-event [number-of-events]
  (avg (repeatedly number-of-events event)))
(combined-event 1)
0.41033077432275
(combined-event 2)
0.16845716166263663
(combined-event 5)
0.47833224214781217

Let’s look at a series of multiple of these combined event

(repeatedly 5 #(combined-event 2))
(0.2869927499112073
 0.325419134389749
 0.8052879179372888
 0.45651672580589336
 0.41946344610872915)
(repeatedly 5 #(combined-event 5))
(0.47605788151311035
 0.6321172410778608
 0.31661487800071697
 0.3732065122410523
 0.635432147489243)
(repeatedly 5 #(combined-event 10))
(0.3421395777865136
 0.5879882875235527
 0.5896511361921144
 0.40028887826740167
 0.4330045961579912)

As we combine a larger number of events, the values cluster more closely to the middle of the original distribution.

And regardless of the shape of the original event distribution, the result of combining more and more events will approach the normal distribution – it’s a unique function toward which these combinations always converge.

This is true for both continuous variables (like (rand)) or discrete variables (like dice (rand-nth [1 2 3 4 5 6])), and it’s true even for oddly shaped distributions. When you combine enough of them, they take on the character of the bell-shaped curve.

Learn More at 3Blue1Brown - But what is the Central Limit Theorem?

Permalink

Build and Deploy Web Apps With Clojure and FLy.io

This post walks through a small web development project using Clojure, covering everything from building the app to packaging and deploying it. It’s a collection of insights and tips I’ve learned from building my Clojure side projects but presented in a more structured format.

As the title suggests, we’ll be deploying the app to Fly.io. It’s a service that allows you to deploy apps packaged as Docker images on lightweight virtual machines.[1] My experience with it has been good, it’s easy to use and quick to set up. One downside of Fly is that it doesn’t have a free tier, but if you don’t plan on leaving the app deployed, it barely costs anything.

This isn’t a tutorial on Clojure, so I’ll assume you already have some familiarity with the language as well as some of its libraries.[2]

Project Setup

In this post, we’ll be building a barebones bookmarks manager for the demo app. Users can log in using basic authentication, view all bookmarks, and create a new bookmark. It’ll be a traditional multi-page web app and the data will be stored in a SQLite database.

Here’s an overview of the project’s starting directory structure:

.
├── dev
│   └── user.clj
├── resources
│   └── config.edn
├── src
│   └── acme
│       └── main.clj
└── deps.edn

And the libraries we’re going to use. If you have some Clojure experience or have used Kit, you’re probably already familiar with all the libraries listed below.[3]

deps.edn
{:paths ["src" "resources"]
 :deps {org.clojure/clojure               {:mvn/version "1.12.0"}
        aero/aero                         {:mvn/version "1.1.6"}
        integrant/integrant               {:mvn/version "0.11.0"}
        ring/ring-jetty-adapter           {:mvn/version "1.12.2"}
        metosin/reitit-ring               {:mvn/version "0.7.2"}
        com.github.seancorfield/next.jdbc {:mvn/version "1.3.939"}
        org.xerial/sqlite-jdbc            {:mvn/version "3.46.1.0"}
        hiccup/hiccup                     {:mvn/version "2.0.0-RC3"}}
 :aliases
 {:dev {:extra-paths ["dev"]
        :extra-deps  {nrepl/nrepl    {:mvn/version "1.3.0"}
                      integrant/repl {:mvn/version "0.3.3"}}
        :main-opts   ["-m" "nrepl.cmdline" "--interactive" "--color"]}}}

I use Aero and Integrant for my system configuration (more on this in the next section), Ring with the Jetty adaptor for the web server, Reitit for routing, next.jdbc for database interaction, and Hiccup for rendering HTML. From what I’ve seen, this is a popular “library combination” for building web apps in Clojure.[4]

The user namespace in dev/user.clj contains helper functions from Integrant-repl to start, stop, and restart the Integrant system.

dev/user.clj
(ns user
  (:require
   [acme.main :as main]
   [clojure.tools.namespace.repl :as repl]
   [integrant.core :as ig]
   [integrant.repl :refer [set-prep! go halt reset reset-all]]))

(set-prep!
 (fn []
   (ig/expand (main/read-config)))) ;; we'll implement this soon

(repl/set-refresh-dirs "src" "resources")

(comment
  (go)
  (halt)
  (reset)
  (reset-all))

Systems and Configuration

If you’re new to Integrant or other dependency injection libraries like Component, I’d suggest reading “How to Structure a Clojure Web”. It’s a great explanation about the reasoning behind these libraries. Like most Clojure apps that use Aero and Integrant, my system configuration lives in a .edn file. I usually name mine as resources/config.edn. Here’s what it looks like:

resources/config.edn
{:server
 {:port #long #or [#env PORT 8080]
  :host #or [#env HOST "0.0.0.0"]
  :auth {:username #or [#env AUTH_USER "john.doe@email.com"]
         :password #or [#env AUTH_PASSWORD "password"]}}

 :database
 {:dbtype "sqlite"
  :dbname #or [#env DB_DATABASE "database.db"]}}

In production, most of these values will be set using environment variables. During local development, the app will use the hard-coded default values. We don’t have any sensitive values in our config (e.g., API keys), so it’s fine to commit this file to version control. If there are such values, I usually put them in another file that’s not tracked by version control and include them in the config file using Aero’s #include reader tag.

This config file is then “expanded” into the Integrant system map using the expand-key method:

src/acme/main.clj
(ns acme.main
  (:require
   [aero.core :as aero]
   [clojure.java.io :as io]
   [integrant.core :as ig]))

(defn read-config
  []
  {:system/config (aero/read-config (io/resource "config.edn"))})

(defmethod ig/expand-key :system/config
  [_ opts]
  (let [{:keys [server database]} opts]
    {:server/jetty (assoc server :handler (ig/ref :handler/ring))
     :handler/ring {:database (ig/ref :database/sql)
                    :auth     (:auth server)}
     :database/sql database}))

The system map is created in code instead of being in the configuration file. This makes refactoring your system simpler as you only need to change this method while leaving the config file (mostly) untouched.[5]

My current approach to Integrant + Aero config files is mostly inspired by the blog post “Rethinking Config with Aero & Integrant” and Laravel’s configuration. The config file follows a similar structure to Laravel’s config files and contains the app configurations without describing the structure of the system. Previously I had a key for each Integrant component, which led to the config file being littered with #ig/ref and more difficult to refactor.

Also, if you haven’t already, start a REPL and connect to it from your editor. Run clj -M:dev if your editor doesn’t automatically start a REPL. Next, we’ll implement the init-key and halt-key! methods for each of the components:

src/acme/main.clj
;; src/acme/main.clj
(ns acme.main
  (:require
   ;; ...
   [acme.handler :as handler]
   [acme.util :as util])
   [next.jdbc :as jdbc]
   [ring.adapter.jetty :as jetty]))
;; ...

(defmethod ig/init-key :server/jetty
  [_ opts]
  (let [{:keys [handler port]} opts
        jetty-opts (-> opts (dissoc :handler :auth) (assoc :join? false))
        server     (jetty/run-jetty handler jetty-opts)]
    (println "Server started on port " port)
    server))

(defmethod ig/halt-key! :server/jetty
  [_ server]
  (.stop server))

(defmethod ig/init-key :handler/ring
  [_ opts]
  (handler/handler opts))

(defmethod ig/init-key :database/sql
  [_ opts]
  (let [datasource (jdbc/get-datasource opts)]
    (util/setup-db datasource)
    datasource))

The setup-db function creates the required tables in the database if they don’t exist yet. This works fine for database migrations in small projects like this demo app, but for larger projects, consider using libraries such as Migratus (my preferred library) or Ragtime.

src/acme/util.clj
(ns acme.util 
  (:require
   [next.jdbc :as jdbc]))

(defn setup-db
  [db]
  (jdbc/execute-one!
   db
   ["create table if not exists bookmarks (
       bookmark_id text primary key not null,
       url text not null,
       created_at datetime default (unixepoch()) not null
     )"]))

For the server handler, let’s start with a simple function that returns a “hi world” string.

src/acme/handler.clj
(ns acme.handler
  (:require
   [ring.util.response :as res]))

(defn handler
  [_opts]
  (fn [req]
    (res/response "hi world")))

Now all the components are implemented. We can check if the system is working properly by evaluating (reset) in the user namespace. This will reload your files and restart the system. You should see this message printed in your REPL:

:reloading (acme.util acme.handler acme.main)
Server started on port  8080
:resumed

If we send a request to http://localhost:8080/, we should get “hi world” as the response:

$ curl localhost:8080/
# hi world

Nice! The system is working correctly. In the next section, we’ll implement routing and our business logic handlers.

Routing, Middleware, and Route Handlers

First, let’s set up a ring handler and router using Reitit. We only have one route, the index / route that’ll handle both GET and POST requests.

src/acme/handler.clj
(ns acme.handler
  (:require
   [reitit.ring :as ring]))

(def routes
  [["/" {:get  index-page
         :post index-action}]])

(defn handler
  [opts]
  (ring/ring-handler
   (ring/router routes)
   (ring/routes
    (ring/redirect-trailing-slash-handler)
    (ring/create-resource-handler {:path "/"})
    (ring/create-default-handler))))

We’re including some useful middleware:

  • redirect-trailing-slash-handler to resolve routes with trailing slashes,
  • create-resource-handler to serve static files, and
  • create-default-handler to handle common 40x responses.

Implementing the Middlewares

If you remember the :handler/ring from earlier, you’ll notice that it has two dependencies, database and auth. Currently, they’re inaccessible to our route handlers. To fix this, we can inject these components into the Ring request map using a middleware function.

src/acme/handler.clj
;; ...

(defn components-middleware
  [components]
  (let [{:keys [database auth]} components]
    (fn [handler]
      (fn [req]
        (handler (assoc req
                        :db database
                        :auth auth))))))
;; ...

The components-middleware function takes in a map of components and creates a middleware function that “assocs” each component into the request map.[6] If you have more components such as a Redis cache or a mail service, you can add them here.

We’ll also need a middleware to handle HTTP basic authentication.[7] This middleware will check if the username and password from the request map matche the values in the auth map injected by components-middleware. If they match, then the request is authenticated and the user can view the site.

src/acme/handler.clj
(ns acme.handler
  (:require
   ;; ...
   [acme.util :as util]
   [ring.util.response :as res]))
;; ...

(defn wrap-basic-auth
  [handler]
  (fn [req]
    (let [{:keys [headers auth]} req
          {:keys [username password]} auth
          authorization (get headers "authorization")
          correct-creds (str "Basic " (util/base64-encode
                                       (format "%s:%s" username password)))]
      (if (and authorization (= correct-creds authorization))
        (handler req)
        (-> (res/response "Access Denied")
            (res/status 401)
            (res/header "WWW-Authenticate" "Basic realm=protected"))))))
;; ...

A nice feature of Clojure is that interop with the host language is easy. The base64-encode function is just a thin wrapper over Java’s Base64.Encoder:

src/acme/util.clj
(ns acme.util
   ;; ...
  (:import java.util.Base64))

(defn base64-encode
  [s]
  (.encodeToString (Base64/getEncoder) (.getBytes s)))

Finally, we need to add them to the router. Since we’ll be handling form requests later, we’ll also bring in Ring’s wrap-params middleware.

src/acme/handler.clj
(ns acme.handler
  (:require
   ;; ...
   [ring.middleware.params :refer [wrap-params]]))
;; ...

(defn handler
  [opts]
  (ring/ring-handler
   ;; ...
   {:middleware [(components-middleware opts)
                 wrap-basic-auth
                 wrap-params]}))

Implementing the Route Handlers

We now have everything we need to implement the route handlers or the business logic of the app. First, we’ll implement the index-page function which renders a page that:

  1. Shows all of the user’s bookmarks in the database, and
  2. Shows a form that allows the user to insert new bookmarks into the database
src/acme/handler.clj
(ns acme.handler
  (:require
   ;; ...
   [next.jdbc :as jdbc]
   [next.jdbc.sql :as sql]))
;; ...

(defn template
  [bookmarks]
  [:html
   [:head
    [:meta {:charset "utf-8"
            :name    "viewport"
            :content "width=device-width, initial-scale=1.0"}]]
   [:body
    [:h1 "bookmarks"]
    [:form {:method "POST"}
     [:div
      [:label {:for "url"} "url "]
      [:input#url {:name "url"
                   :type "url"
                   :required true
                   :placeholer "https://en.wikipedia.org/"}]]
     [:button "submit"]]
    [:p "your bookmarks:"]
    [:ul
     (if (empty? bookmarks)
       [:li "you don't have any bookmarks"]
       (map
        (fn [{:keys [url]}]
          [:li
           [:a {:href url} url]])
        bookmarks))]]])

(defn index-page
  [req]
  (try
    (let [bookmarks (sql/query (:db req)
                               ["select * from bookmarks"]
                               jdbc/unqualified-snake-kebab-opts)]
      (util/render (template bookmarks)))
    (catch Exception e
      (util/server-error e))))
;; ...

Database queries can sometimes throw exceptions, so it’s good to wrap them in a try-catch block. I’ll also introduce some helper functions:

src/acme/util.clj
(ns acme.util
  (:require
   ;; ...
   [hiccup2.core :as h]
   [ring.util.response :as res])
  (:import java.util.Base64))
;; ...

(defn preprend-doctype
  [s]
  (str "<!doctype html>" s))

(defn render
  [hiccup]
  (-> hiccup h/html str preprend-doctype res/response (res/content-type "text/html")))

(defn server-error
  [e]
  (println "Caught exception: " e)
  (-> (res/response "Internal server error")
      (res/status 500)))

render takes a hiccup form and turns it into a ring response, while server-error takes an exception, logs it, and returns a 500 response.

Next, we’ll implement the index-action function:

src/acme/handler.clj
;; ...

(defn index-action
  [req]
  (try
    (let [{:keys [db form-params]} req
          value (get form-params "url")]
      (sql/insert! db :bookmarks {:bookmark_id (random-uuid) :url value})
      (res/redirect "/" 303))
    (catch Exception e
      (util/server-error e))))
;; ...

This is an implementation of a typical post/redirect/get pattern. We get the value from the URL form field, insert a new row in the database with that value, and redirect back to the index page. Again, we’re using a try-catch block to handle possible exceptions from the database query.

That should be all of the code for the controllers. If you reload your REPL and go to http://localhost:8080, you should see something that looks like this after logging in:

Screnshot of the app

The last thing we need to do is to update the main function to start the system:

src/acme/main.clj
;; ...

(defn -main [& _]
  (-> (read-config) ig/expand ig/init))

Now, you should be able to run the app using clj -M -m acme.main. That’s all the code needed for the app. In the next section, we’ll package the app into a Docker image to deploy to Fly.

Packaging the App

While there are many ways to package a Clojure app, Fly.io specifically requires a Docker image. There are two approaches to doing this:

  1. Build an uberjar and run it using Java in the container, or
  2. Load the source code and run it using Clojure in the container

Both are valid approaches. I prefer the first since its only dependency is the JVM. We’ll use the tools.build library to build the uberjar. Check out the official guide for more information on building Clojure programs. Since it’s a library, to use it we can add it to our deps.edn file with an alias:

deps.edn
{;; ...
 :aliases
 {;; ...
  :build {:extra-deps {io.github.clojure/tools.build 
                       {:git/tag "v0.10.5" :git/sha "2a21b7a"}}
          :ns-default build}}}

Tools.build expects a build.clj file in the root of the project directory, so we’ll need to create that file. This file contains the instructions to build artefacts, which in our case is a single uberjar. There are many great examples of build.clj files on the web, including from the official documentation. For now, you can copy+paste this file into your project.

build.clj
(ns build
  (:require
   [clojure.tools.build.api :as b]))

(def basis (delay (b/create-basis {:project "deps.edn"})))
(def src-dirs ["src" "resources"])
(def class-dir "target/classes")

(defn uber
  [_]
  (println "Cleaning build directory...")
  (b/delete {:path "target"})

  (println "Copying files...")
  (b/copy-dir {:src-dirs   src-dirs
               :target-dir class-dir})

  (println "Compiling Clojure...")
  (b/compile-clj {:basis      @basis
                  :ns-compile '[acme.main]
                  :class-dir  class-dir})

  (println "Building Uberjar...")
  (b/uber {:basis     @basis
           :class-dir class-dir
           :uber-file "target/standalone.jar"
           :main      'acme.main}))

To build the project, run clj -T:build uber. This will create the uberjar standalone.jar in the target directory. The uber in clj -T:build uber refers to the uber function from build.clj. Since the build system is a Clojure program, you can customise it however you like. If we try to run the uberjar now, we’ll get an error:

# build the uberjar
$ clj -T:build uber
# Cleaning build directory...
# Copying files...
# Compiling Clojure...
# Building Uberjar...

# run the uberjar
$ java -jar target/standalone.jar
# Error: Could not find or load main class acme.main
# Caused by: java.lang.ClassNotFoundException: acme.main

This error occurred because the Main class that is required by Java isn’t built. To fix this, we need to add the :gen-class directive in our main namespace. This will instruct Clojure to create the Main class from the -main function.

src/acme/main.clj
(ns acme.main
  ;; ...
  (:gen-class))
;; ...

If you rebuild the project and run java -jar target/standalone.jar again, it should work perfectly. Now that we have a working build script, we can write the Dockerfile:

Dockerfile
# install additional dependencies here in the base layer
# separate base from build layer so any additional deps installed are cached
FROM clojure:temurin-21-tools-deps-bookworm-slim AS base

FROM base as build
WORKDIR /opt
COPY . .
RUN clj -T:build uber

FROM eclipse-temurin:21-alpine AS prod
COPY --from=build /opt/target/standalone.jar /
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "standalone.jar"]

It’s a multi-stage Dockerfile. We use the official Clojure Docker image as the layer to build the uberjar. Once it’s built, we copy it to a smaller Docker image that only contains the Java runtime.[8] By doing this, we get a smaller container image as well as a faster Docker build time because the layers are better cached.

That should be all for packaging the app. We can move on to the deployment now.

Deploying with Fly.io

First things first, you’ll need to install flyctl, Fly’s CLI tool for interacting with their platform. Create a Fly.io account if you haven’t already. Then run fly auth login to authenticate flyctl with your account.

Next, we’ll need to create a new Fly App:

$ fly app create
# ? Choose an app name (leave blank to generate one): 
# automatically selected personal organization: Ryan Martin
# New app created: blue-water-6489

Another way to do this is with the fly launch command, which automates a lot of the app configuration for you. We have some steps to do that are not done by fly launch, so we’ll be configuring the app manually. I also already have a fly.toml file ready that you can straight away copy to your project.

fly.toml
# replace these with your app and region name
# run `fly platform regions` to get a list of regions
app = 'blue-water-6489' 
primary_region = 'sin'

[env]
  DB_DATABASE = "/data/database.db"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = "stop"
  auto_start_machines = true
  min_machines_running = 0

[mounts]
  source = "data"
  destination = "/data"
  initial_sie = 1

[[vm]]
  size = "shared-cpu-1x"
  memory = "512mb"
  cpus = 1
  cpu_kind = "shared"

These are mostly the default configuration values with some additions. Under the [env] section, we’re setting the SQLite database location to /data/database.db. The database.db file itself will be stored in a persistent Fly Volume mounted on the /data directory. This is specified under the [mounts] section. Fly Volumes are similar to regular Docker volumes but are designed for Fly’s micro VMs.

We’ll need to set the AUTH_USER and AUTH_PASSWORD environment variables too, but not through the fly.toml file as these are sensitive values. To securely set these credentials with Fly, we can set them as app secrets. They’re stored encrypted and will be automatically injected into the app at boot time.

$ fly secrets set AUTH_USER=hi@ryanmartin.me AUTH_PASSWORD=not-so-secure-password
# Secrets are staged for the first deployment

With this, the configuration is done and we can deploy the app using fly deploy:

$ fly deploy
# ...
# Checking DNS configuration for blue-water-6489.fly.dev
# Visit your newly deployed app at https://blue-water-6489.fly.dev/

The first deployment will take longer since it’s building the Docker image for the first time. Subsequent deployments should be faster due to the cached image layers. You can click on the link to view the deployed app, or you can also run fly open which will do the same thing. Here’s the app in action:

The app in action

If you made additional changes to the app or fly.toml, you can redeploy the app using the same command, fly deploy. The app is configured to auto stop/start, which helps to cut costs when there’s not a lot of traffic to the site. If you want to take down the deployment, you’ll need to delete the app itself using fly app destroy <your app name>.

Adding a Production REPL

This is an interesting topic in the Clojure community, with varying opinions on whether or not it’s a good idea. Personally I find having a REPL connected to the live app helpful, and I often use it for debugging and running queries on the live database.[9] Since we’re using SQLite, we don’t have a database server we can directly connect to, unlike Postgres or MySQL.

If you’re brave, you can even restart the app directly without redeploying from the REPL. You can easily go wrong with it, which is why some prefer to not use it.

For this project, we’re gonna add a socket REPL. It’s very simple to add (you just need to add a JVM option) and it doesn’t require additional dependencies like nREPL. Let’s update the Dockerfile:

Dockerfile
# ...
EXPOSE 7888
ENTRYPOINT ["java", "-Dclojure.server.repl={:port 7888 :accept clojure.core.server/repl}", "-jar", "standalone.jar"]

The socket REPL will be listening on port 7888. If we redeploy the app now, the REPL will be started but we won’t be able to connect to it. That’s because we haven’t exposed the service through Fly proxy. We can do this by adding the socket REPL as a service in the [services] section in fly.toml.

However, doing this will also expose the REPL port to the public. This means that anyone can connect to your REPL and possibly mess with your app. Instead, what we want to do is to configure the socket REPL as a private service.

By default, all Fly apps in your organisation live in the same private network. This private network, called 6PN, connects the apps in your organisation through Wireguard tunnels (a VPN) using IPv6. Fly private services aren’t exposed to the public internet but can be reached from this private network. We can then use Wireguard to connect to this private network to reach our socket REPL.

Fly VMs are also configured with the hostname fly-local-6pn, which maps to its 6PN address. This is analogous to localhost, which points to your loopback address 127.0.0.1. To expose a service to 6PN, all we have to do is bind or serve it to fly-local-6pn instead of the usual 0.0.0.0. We have to update the socket REPL options to:

Dockerfile
# ...
ENTRYPOINT ["java", "-Dclojure.server.repl={:port 7888,:address \"fly-local-6pn\",:accept clojure.core.server/repl}", "-jar", "standalone.jar"]

After redeploying, we can use the fly proxy command to forward the port from the remote server to our local machine.[10]

$ fly proxy 7888:7888
# Proxying local port 7888 to remote [blue-water-6489.internal]:7888

In another shell, run:

$ rlwrap nc localhost 7888
# user=>

Now we have a REPL connected to the production app! rlwrap is used for readline functionality, e.g. up/down arrow keys, vi bindings. Of course you can also connect to it from your editor.

Deploy with GitHub Actions

If you’re using GitHub, we can also set up automatic deployments on pushes/PRs with GitHub Actions. All you need is to create the workflow file:

.github/workflows/fly.yaml
name: Fly Deploy
on:
  push:
    branches:
      - main
  workflow_dispatch:

jobs:
  deploy:
    name: Deploy app
    runs-on: ubuntu-latest
    concurrency: deploy-group
    steps:
      - uses: actions/checkout@v4
      - uses: superfly/flyctl-actions/setup-flyctl@master
      - run: flyctl deploy --remote-only
        env:
          FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}

To get this to work, you’ll need to create a deploy token from your app’s dashboard. Then, in your GitHub repo, create a new repository secret called FLY_API_TOKEN with the value of your deploy token. Now, whenever you push to the main branch, this workflow will automatically run and deploy your app. You can also manually run the workflow from GitHub because of the workflow_dispatch option.

End

As always, all the code is available on GitHub. Originally, this post was just about deploying to Fly.io, but along the way I kept adding on more stuff until it essentially became my version of the user manager example app. Anyway, hope this article provided a good view into web development with Clojure. As a bonus, here are some additional resources on deploying Clojure apps:


  1. The way Fly.io works under the hood is pretty clever. Instead of running the container image with a runtime like Docker, the image is unpacked and “loaded” into a VM. See this video explanation for more details. ↩︎

  2. If you’re interested in learning Clojure, my recommendation is to follow the official getting started guide and join the Clojurians Slack. Also, read through this list of introductory resources. ↩︎

  3. Kit was a big influence on me when I first started learning web development in Clojure. I never used it directly, but I did use their library choices and project structure as a base for my own projects. ↩︎

  4. There’s no “Rails” for the Clojure ecosystem (yet?). The prevailing opinion is to build your own “framework” by composing different libraries together. Most of these libraries are stable and are already used in production by big companies, so don’t let this discourage you from doing web development in Clojure! ↩︎

  5. There might be some keys that you add or remove, but the structure of the config file stays the same. ↩︎

  6. “assoc” (associate) is a Clojure slang that means to add or update a key-value pair in a map. ↩︎

  7. For more details on how basic authentication works, check out the specification. ↩︎

  8. Here’s a cool resource I found when researching Java Dockerfiles: WhichJDK. It provides a comprehensive comparison on the different JDKs available and recommendations on which one you should use. ↩︎

  9. Another (non-technically important) argument for live/production REPLs is just because it’s cool. Ever since I read the story about NASA’s programmers debugging a spacecraft through a live REPL, I’ve always wanted to try it at least once. ↩︎

  10. If you encounter errors related to Wireguard when running fly proxy, you can run fly doctor which will hopefully detect issues with your local setup and also suggest fixes for them. ↩︎

Permalink

AI, Lisp, and Programming

On the last Apropos, we welcomed Christoph Neumann to talk about his new role as the Clojure Developer Evangelist at Nubank. It’s very exciting that the role is in such great hands.

Our next guest is Peter Strömberg. Peter is known as PEZ online. He is the creator of Calva, the Clojure plugin for VS Code. He’s going to demo some REPL-Driven Development with Calva.


AI, Lisp, and Programming

I have a Masters of Science in Computer Science, with a specialty in Artificial Intelligence. With the way AI salaries are these days, you’d think I pull seven figures. Alas, my degree is from 2008. At that time, people wondered why I wanted to go into a field that had been in a sorry state since the 1980s. Wouldn’t I enjoy something more lucrative like security or databases?

But I liked the project of AI. AI, to me, was an exploration of our own intelligence—how we thought, solved problems, and perceived the world—to better understand how we can get a machine to do it. It was a kind of reverse-engineering of the human mind. By building our own minds, we could understand ourselves.

It’s clear that Lisp and AI have been linked since very early in the two fields’ existences. John McCarthy, the inventor of Lisp, also coined and defined the term artificial intelligence. Lisp was traditionally closely associated with the study of AI. Lisp has been a generator of programming language ideas that might seem normal now, but were considered weird or costly at the time. Here’s a partial list:

  1. Garbage collection

  2. Structured conditionals

  3. First-class and higher-order functions

  4. Lexical scoping

While it’s clear that Lisp has been influential on programming languages (even after being ridiculed for the same features languages borrow), what is not so clear is how much AI has been an influence on programming practice. Until recently, it was kind of a joke that AI had been around since the 1950s but hadn’t produced any real results. The other side of the joke is that once something works, it’s just considered programming and not artificial intelligence.

Artificial intelligence has produced so many riches in the programming world. Here is a partial list:

  1. Compilers (which used to be called “automatic programming”)

  2. Tree search

  3. Hash tables

  4. Constraint satisfaction

  5. Rules engines

  6. Priority queues

Let me put it directly: Seeking to understand human thought has been fruitful for software engineering. My interest in Lisp is interlinked with my interest in AI. AI has always been synonymous with “powerful programming techniques.” And I have always seen myself as part of that thread, however small my contribution might be.

The link between AI, Lisp, and programming was so strong 30 years ago that Peter Norvig started the preface of Paradigms of Artificial Intelligence Programming with these words:

This book is concerned with three related topics: the field of artificial intelligence, or AI; the skill of computer programming; and the programming language Common Lisp.

source

In 2008, when I graduated, Google could barely categorize images. Identifying a cat in a photo was still considered a hard problem. Many attempts had been made, but none could get close to human-level. In 2014, I heard about a breakthrough called “deep learning”. It was using the scale of the internet and the vast parallelism of GPUs to make huge neural networks trained on millions of images to break accuracy records. It was working. And it was completely uninteresting to me.

Okay, not really completely uninteresting. It tickled my interest in building new things. I could see how being able to identify cats (or other objects) reliably could be useful. But I saw in this nothing of the project for understanding ourselves. Instead, it was much like what happened at Intel.

Nobody really likes the Intel architecture. It’s not that great. But once Intel got a slight lead in marketshare, it could ride Moore’s Law. Instead of looking for a better architecture, invest your time instead in scaling the transistor down and scaling the number of transistors up. Even the worst architecture will get faster. And Intel can cement their lead by investing in better manufacturing processes. Their dominance wound up lasting decades. But computer architecture has languished relative to the growth of the demand for computing.

The same effect is at play in neural networks: Instead of investing in understanding of how thought works, just throw more processing and more training at bigger networks. With enough money to fund the project, your existing architectures, scaled up, will do better.

These are oversimplifications. There were undoubtedly many minor and some major architectural breakthroughs that helped Intel keep pace with Moore’s Law. Likewise, there have been similar architectural breakthroughs in neural networks, including convolutions and transformers. But the neural network strategy is dominated by scale—more training data, more neurons, more FLOPS.

My whole point is that my research into the history of the field of AI has somewhat inoculated me against the current hype. I don’t think AI will “replace all humans”. And I don’t think AGI (artificial general intelligence) is defined well enough to be a real goal. So where does that leave us? How is AI going to transform programming? Where will all of this end up?

Artificial intelligence has always been a significant part of the leading edge of programming. And its promise has always been far ahead of its ability. In the next few issues, I want to explore what this current hype wave of AI means for us. I don’t like where I see AI going. But I also want to give apply some optimism to it because I think a lot of the consequences are inevitable. The world is being changed, and we will have to live in that new world.

Permalink

What The Heck Just Happened?

Also titled “Lesson learned in CLJS Frontends” …

I have been doing frontend development for well over two decades now, and over that time I have used a lot of different frontend solutions. For over a decade I have been into ClojureScript and I don’t think I’m ever moving on from that. The language is just beautiful and the fact that I can use the same language on the server is just perfect. I’ll only talk about CLJS here though, since the frontend just runs JS, which we compile to.

I like optimizing things, so I’ll frequently try new things when I encounter new ideas. One the most impactful was when react came onto the scene. Given its current state I obviously wasn’t the only one. I’d say in general the CLJS community adopted it quite widely. The promise of having the UI represented as a function of your state was a beautiful idea. (render state) is just too nice to not like. However, we very quickly learned that this doesn’t scale beyond toy examples. The VDOM-representation this render call returns needs to be “diffed”, as in compared to the previous version whenever any change is made. This becomes expensive computationally very quickly.

In this post I kinda want to document what I (and others) have learned over the years, and where my own journey currently stands.

What The Heck Just Happened?

Answering this question becomes the essential question the rendering library of choice has to answer. It only sees two snapshots and is supposed to find the needle in a possibly large pile of data. So it has to ask and answer this a lot.

Say you have an input DOM element and its value should be directly written into some other div. The most straightforward way is to do this directly in JS:

const input = document.getElementById("input");
const target = document.getElementById("target");

input.addEventListener("input", function(e) {
   target.textContent = input.value;
});

Plus a <input type="text" id="input"> and <div id="target"></div> somewhere. I’ll call this the baseline. The most direct way to get the result we want. There is not a single “What the Heck Just Happened?” asked, the code is doing exactly what it needs in as little work was possible.

Of course, frontend development is never this simple and this style of imperative code often leads to very entangled hard to maintain messes. Thus enter react, or any library of that kind, that instead has you write a declarative representation of the current desired UI state. I’ll use CLJS and hiccup from here. The actual rendering library used is almost irrelevant to the rest of this post. The problems are inherent in the approach.

So, for CLJS this might look something like:

(defn render [state]
  [:<>
   [:input {:type "text" :on-input handle-input}]
   [:div (:text state)]])

I’ll very conveniently ignore where state comes from or how its handled. Just assume it is a CLJS map and the handle-input is a function adds the current input value under that :text key and whatever mechanism is used just calls render again with the updated map.

So, what the rendering library of choice now has to decipher is two different snapshots in time.

Before:

[:<>
 [:input {:type "text" :on-input handle-input}]
 [:div "foo"]]

After:

[:<>
 [:input {:type "text" :on-input handle-input}]
 [:div "foo!"]]

It has to traverse these two snapshots and find the difference. So, for this dumb example it has to check 3 vectors, 2 keywords and a map with 2 key value pairs and our actually changed String. We basically went from 1 operation in the pure JS example to I don’t even know how many. The = operation in CLJS is well-defined and pretty fast, but actually often more than one operation. For simplicity’s sake lets say 1 though. So, in total we are now at 12 times asking “What The Heck Just Happened?”.

Point being that this very quickly gets out of hand. I’m going to assume there are going to be hundreds or even thousands of DOM elements. An easy way to do a count how many DOM elements your favorite “webapp” has is running document.body.querySelectorAll("*").length in the devtools console. I did this for my shadow-cljs github repo and get 2214 in incognito. For reasons I can’t even see it goes up to 2647 when logged in. Amazon Prime Video Homepage is 5517 for me. You see how quickly this goes up. Of course not every app is going to have that many elements, but it also isn’t uncommon.

The cost isn’t only the “diffing”. You also have to create that Hiccup in the first place. Please note that it is pretty much irrelevant which what representation is used. React Elements have to do the same work. However, for libraries that convert from hiccup to react at runtime (e.g. reagent) you also have to pay the conversion cost. Hence, why there are many libraries that try to do much of this at compile time via macros.

Death by a thousand cuts. This all adds up to become significant.

Enter The Trade-Offs

Staying within the pure (render state) model sure would be nice, but if you ask me it just doesn’t scale. Diffing thousands of elements to “find” one change is just bonkers. Modern Hardware is insanely fast, but I’d rather not waste all that power.

I’m by no means saying that everyone should always render everything on the client. Sometimes it is just best to have the server spit out some HTML and be done with it. Can’t be faster than no diff ever right? Well, we want some dynamic content, so we have to get there somehow. I covered strategies for dealing with server-side content in my previous series (1, 2, 3).

This is all about frontend and pretty much SPA territory, where this kinda of scaling begins to matter. We can employ various techniques to speed the client up significantly.

Memoization

One of the simplest and also most impactful things is making sure things stay identical? (or === in JS terms). That is the beauty of the CLJS datastructures. If they are identical? we know they didn’t change. JS objects not so much, but I won’t go into that here.

So, modifying the example above we can just extract the input element, since it never changes.

(def input
  [:input {:type "text" :on-input handle-input}])
   
(defn render [state]
  [:<>
   input
   [:div (:text state)]])

Our rendering library of choice now finds an identical? hiccup vector, whereas before it had to do a full = check. This removes 6 “WTHJH?” questions. Quite a significant drop. Again, probably not the exact number, but I hope you get the idea.

This technique is called memoization, where you have a function that returns an identical result when called with the same arguments. The goal being to reduce the amount of things we have to compare. Often just comparing a few values instead of the expanded “virtual tree”, i.e. avoid running as much code as possible. I skipped the function part here to make that example easier to follow. Same principle though.

Components

Components are basically the next level of memoization. Let’s say we turn our input into a “component”, using a simplistic CLJS defn for now.

(defn input []
  [:input {:type "text" :on-input handle-input}])
   
(defn render [state]
  [:<>
   [input]
   [:div (:text state)]])

So, our rendering library of choice will now encounter a hiccup vector with a function as its first element. On the first render run it’ll just call that and get the expanded virtual tree. On subsequent runs it can check whether that fn is identical? as well, and since it doesn’t take any extra arguments can just skip calling it completely. Again bypassing “a lot of work”.

There is a lot more to components of course, but I first want to highlight one problem they do not solve. render received the state argument. Let’s assume that for some reason input needs it as well.

(defn render [state]
  [:<>
   [input state]
   [:div (:text state)]])

Here the rendering lib will call (input state) basically. But it has no clue what part of state is actually used by input, it only knows that state changed, so it has to call input even though it still might expand to the exact same tree. Even the input “component” needs additional tools and care to avoid just generating the hiccup from scratch. Hence, in react terms to get the optimal performance you are supposed to useMemo a lot. Which isn’t exactly “fun” and very far away from our (render state) ideals.

We could change it so that we only pass the needed data to input, but that requires knowledge of the implementation, and that isn’t always straightforward.

Trying To Find Balance

In the end the real goal is finding a good balance between developer productivity/convenience and runtime performance. A solution that nobody can work on is no good, but neither is something that is unusably slow for the end user. Do not underestimate how slow things can get, especially if you measure against the not very-latest high-tech devices. Try 4x Slowdown in Chrome and the results may shock you.

My Journey

I have been working on my own rendering lib for a rather long time now. I have never documented or even talked about it much. Quite honestly I wrote this for myself and I consider this an active journey. It works and is good enough for my current needs, but far from finished.

One Conclusion I have come to is that the “push” model of just passing the whole state into the root doesn’t scale. Instead, it is better to have each component pull the data it needs from some well-defined datasource. And then have that datasource notify the component if the data it requested was changed. Then allowing the component to kick off an update from its place in the tree, instead of from the root.

Another superpower CLJS has is macros. So, leveraging those more can give substantial gains. I have the fragment macro (<<), which can statically analyze the hiccup structure and not only bypass the hiccup creation completely. It can also just generate the code to directly apply the update.

Changing our render to

(defn render [state]
  (<< [:input {:type "text" :on-input handle-input}])
      [:div (:text state)]]))

This still looks very hiccup-ish, even though it doesn’t create hiccup at all. But during macro expansion this is very trivial to tell that only (:text state) can possibly change here. This gets very close the initial 1 operation JS example. Heck if you help the macro a little more it becomes that literal 1 operation.

(defn render [{:keys [^string text] :as state}]
  (<< [:input {:type "text" :on-input handle-input}])
      [:div text]]))

But that level of optimization is really unnecessary. It sure was fun to build the implementation though. For the people that care, the trick here is that the update impl can just set the .textContent property if it can determine that all children are “strings”. So, it even works for stuff like [:div "Hello, " text "!"].

Also, an optimization what the React Compiler only wishes it could achieve. Not that I have any hope that it would ever understand CLJS code to begin with.

Conclusion

The point is that there are so many optimizations to find, that the challenge isn’t finding them. The challenge is building a coherent “framework” that can scale from tiny to very large, all while writing “beautiful” code.

This post is already getting too long, so I intend to write more detailed posts about my experiences later. I don’t even want to talk about my specific library. Most of the stuff I built I didn’t come up with in the first place. They are just “Lessons learned” by the greater community over the years. All of those techniques have been done before, I just adapted them to CLJS. A lot of them you could translate 1:1 and use them with react or whatever else.

Some of them however you cannot. The fragment macro could be adapted for react, but it could never reach the level of performance it has in shadow-grove because react can only do the “virtual DOM diffing” and is not extensible in that area. shadow-grove doesn’t have a “virtual DOM”, as in there is no 1:1 mapping of virtual and actual DOM elements. A fragment may handle many elements, like it did in the render example above.

There is this sentiment that due to LLMs having seen so much react code, that there is no value in building something that is not react. I think that is utter nonsense. There is however the argument of “ecosystem value”, where react clearly wins and is very hard to compete with. You kinda have to be a maniac like me and accept that you have to write your own implementation for stuff a lot, instead of just using readily available 3rd party libraries.

Again, all I wish to highlight with this post, and all my posts really, is that we should talk about Trade-Offs more. A lot of stuff on the Internet just talks about why they are good, and never mentions what it will “cost” you.

I’m very aware that my Trade-Offs would not be correct for most people, but everyone still should be aware of the ones they are making by committing to Solution X. Talking about them might get us close to a coherent story for CLJS, rather than the fragmented react-heavy landscape it currently is.

Building on top of something built by a part-time hobby enthusiast isn’t a good business strategy, so I really do not want this to be about shadow-grove, rather some of the ideas behind it and how we could maybe apply them more broadly.

I wrote a follow-up post to back all this up with some numbers and some insights into how I arrived at my conclusions.

Permalink

Copyright © 2009, Planet Clojure. No rights reserved.
Planet Clojure is maintained by Baishamapayan Ghose.
Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
Theme by Brajeshwar.