On Extensibility

by Laurence Chen

For a long time, I had a misunderstanding about Clojure: I always thought that the extensibility Clojure provides was just about macros. Not only that, but many articles I read about Lisp emphasized how Lisp’s macros were far more advanced than those in other languages while also extolling the benefits of macros—how useful it is for a language to be extensible. However, what deeply puzzled me was that the Clojure community takes a conservative approach to using macros. Did this mean that Clojure was less extensible than other Lisp languages?

Later, I realized that my misunderstanding had two aspects:

First, Clojure has at least two extension mechanisms: Macros and Protocols. When using Clojure, we often do not clearly distinguish between core syntax and core library functions. In other words, if the extensible parts are core library functions, the experience for future users is almost the same as if core syntax itself had been extended. More concretely, many parts of Clojure’s core library are built with Protocol/Interface constructs. These serve as predefined extension points, which means core library functionality can also be extended.

Second, I used to mix up “extensibility” and “extensibility mechanisms.” I always focused on “Oh, I discovered another language, database, or software that supports plugins. That’s great! It has an extensibility mechanism, so it can be extended.” However, an extensibility mechanism is just a means to achieve extensibility. But what exactly is extensibility? What problems does it solve, and what benefits does it bring? I never really thought through these questions.

Here is a proposed definition of extensibility:

Given a module or a segment of code developed under certain external assumptions: when those assumptions change, the module’s behavior can be enhanced or altered through a predefined mechanism, without modifying the existing code, allowing it to adapt to new requirements.


According to this definition, the benefits of extensibility are:

  • Cost savings. If no modifications are needed, there is no need to worry about breaking existing functionality or regression issues.
  • Reduced complexity. The ability to extend or modify a module’s behavior through predefined mechanisms eliminates the need to copy entire modules and make modifications, saving a significant amount of code.
  • Empowering users. Even though the module has already been developed, it can still be modified. This is particularly useful when module developers and users belong to different organizations or teams, as it provides great flexibility, allowing users to self-serve.

Next, let’s look at some real-world examples to better understand extensibility in practice.

Macro

Let’s first examine some common built-in Macros:

  • ->: Transforms linear syntax into nested parentheses, effectively introducing a new DSL (domain-specific language); see the example after this list.
  • comment: Ignores a block of code.
  • with-open: Grants a block of code access to a resource that is automatically closed when leaving the block.
  • with-redefs: Temporarily redefines global variables within a block of code.
  • with-in-str: Temporarily binds *in* to a specific StringReader within a block of code.
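
To make the first item concrete, here is what -> does to a small pipeline (a minimal sketch using plain arithmetic):

(-> 5
    (+ 3)
    (* 2)
    inc)
;; macroexpands into the nested form (inc (* (+ 5 3) 2)) ;=> 17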

Macros can be roughly categorized into two types:

  • Non with-style Macros
  • With-style Macros

Non with-style Macros

These Macros typically accept arguments in the form of & body, internally parsing body and transforming its evaluation strategy.

For example, consider core.async/go:

;; assumes (require '[clojure.core.async :refer [go <! timeout]])
(go
  (println "Before")
  (<! (timeout 1000))
  (println "After"))

The go Macro transforms body into a state machine to execute it asynchronously. It doesn’t just wrap a block of code but actually rewrites it.

The code passed as an argument to these Macros often introduces new syntax or semantics, effectively extending the Clojure language itself by adding new DSLs.

With-style Macros

In contrast, some Macros accept arguments in the form of a b c & body and internally reference ~@body. These Macros do not dissect the statements inside body; instead, they inject additional processing before or after body executes. Because they preserve the original structure of body, they are particularly suited for resource management, context setting, and similar scenarios.
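
As a minimal sketch of this shape (a hypothetical macro, not taken from any library mentioned here), consider a with-timing Macro that injects work before and after body without inspecting it:

(defmacro with-timing [label & body]
  `(let [start#  (System/nanoTime)
         result# (do ~@body)]
     (println ~label "took" (/ (- (System/nanoTime) start#) 1e6) "ms")
     result#))

;; usage
(with-timing "query"
  (Thread/sleep 50)
  :done)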

The embedkit library contains an inspiring with-style Macro that treats authentication state as a form of context.

  • with-refresh-auth: Refreshes the authentication state and retries the request if it fails with a 401 error.
(defmacro with-refresh-auth [conn & body]
  `((fn retry# [conn# retries#]
      (Thread/sleep (* 1000 retries# retries#)) ; backoff
      (try
        ~@body
        (catch clojure.lang.ExceptionInfo e#
          (if (and (= 401 (:status (ex-data e#)))
                   (instance? clojure.lang.Atom conn#)
                   (< retries# 4))
            (do
              (reset! conn# (connect @conn#))
              (retry# conn# (inc retries#)))
            (throw e#)))))
    ~conn
    0))

;; Use with-refresh-auth to wrap the do-request
(defn mb-request [verb conn path opts]
  (with-refresh-auth conn
    (do-request (request conn verb path opts))))

Protocol

Many Clojurians, including myself, have struggled to grasp when to use Protocols—they can feel abstract and difficult to apply. The best explanation I’ve found is from ask.clojure.org:

Protocol functions are better as SPI to hook in implementations than in API as functions consumers call directly.

If, like me, you don’t immediately grasp what an SPI (service provider interface) is, studying the buddy-auth library can help.

buddy-auth is a commonly used authentication library for web applications. Users can extend it by adding new authentication mechanisms without modifying its source code.

To define an authentication mechanism, one must implement the IAuthentication Protocol using reify.

For example, http-basic-backend is a basic authentication mechanism that implements IAuthentication:

(defn http-basic-backend
  [& [{:keys [realm authfn unauthorized-handler] :or {realm "Buddy Auth"}}]]
  {:pre [(ifn? authfn)]}
  (reify
    proto/IAuthentication
    (-parse [_ request]
      (parse-header request))
    (-authenticate [_ request data]
      (authfn request data))
 ...
)

When using buddy-auth, the wrap-authentication middleware is added to the Ring handler. This middleware ultimately calls proto/-parse and proto/-authenticate.
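
For instance, a user could plug in their own mechanism by reifying the same Protocol. Here is a hedged sketch: the x-api-key header and the token->user lookup function are invented for illustration, but the -parse/-authenticate shape is the same as above.

(defn api-key-backend
  [token->user]
  (reify
    proto/IAuthentication
    (-parse [_ request]
      (get-in request [:headers "x-api-key"]))
    (-authenticate [_ request data]
      (token->user data))))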

Strategy Pattern

Looking at this design, you might think, “Isn’t this just the Strategy Design Pattern?” Indeed, in this pattern, the Strategy corresponds to the service provider interface, allowing the authentication mechanism to be swapped in buddy-auth without modifying any of its code.

Summary

| Extensibility Mechanism         | Macro (non with-style)   | Macro (with-style)       | Protocol                                      |
| Extends                         | Clojure language itself  | Code passed to the Macro | Modules                                       |
| Behavior Modification Mechanism | Parses and rewrites body | Wraps body               | Swap in a different implementation            |
| Predefined Extension Points     | None                     | None                     | Requires Protocols in the module design       |
| Degree of Extensibility         | High                     | Low                      | Medium (only specified parts are replaceable) |

If we were to name these three mechanisms:

  • Non with-style Macros: syntax-rewriting extension
  • With-style Macros: contextual extension
  • Protocols: replace-based extension

Replace-based extension is relatively easy to grasp and common across programming languages. Contextual extension, while involving meta-programming, remains accessible. Syntax-rewriting extension, on the other hand, fundamentally alters the language itself, making it the domain of compilation experts.

Clojure provides excellent extensibility, offering diverse mechanisms that allow extensions at the language, code block, core library, and user-defined module levels. If you want to elevate your programming skills, consider how to design software for extensibility—it will make your software feel more Clojurish.

Note:

  • In this article, you can consider “module” and “library” as synonymous. To me, a library is simply a published module.
  • “Interface” and “Protocol” can also be regarded as synonymous. While there are subtle differences between them, there is no distinction in their usage within this article.

    Tracking memory usage with clj-memory-meter.trace

    Automatic memory management is probably the JVM's biggest selling point. You don't need to remember to clean up the stuff you've allocated — the garbage collector will take care of it for you. You can "leak" memory if you leave live references to allocated objects (e.g. store objects in static fields), making them unreclaimable by the GC. Thus, as long as you don't do that, you can treat memory as infinite and never run out of it. Right?

    Definitely not. Memory is a limited resource, and you can't fill it with more data than its capacity allows if all of that data is needed at the same time. So, it becomes crucial not only to know how much memory your data structures occupy but also the access patterns to those data structures. Does the algorithm process one item at a time or expect the whole list to be in memory at once? Is the result gradually written to disk, or is it accumulated in memory first? These questions may not matter when the data size is small and doesn't ever get close to memory limits. But when they do...


    Why Clojure?

    This is about a 17 minute read. Feel free to pause and come back to it later.

    Clojure is not one of the handful of "big" mainstream languages. This means that sometimes people are surprised that we are all in on Clojure. Why go against the grain? Why make it harder for yourself by building on niche technology?

    Gaiwan is mostly known as a Clojure consultancy, but we don't consider ourselves as being defined by Clojure. Rather, we are a group of experienced technologists (10+ years of industry experience on average) who are deliberate and intentional about the technologies we build upon. Rather than choosing tech that is fashionable, or that has the biggest marketing budget, we choose tech that gives us the highest leverage. Tech that allows a small team like ours to be maximally productive, to maintain velocity as systems grow, and that allows us to keep overall complexity low. Right now, that tech is Clojure.

    In this article I want to cover some of the reasons why that is. In the first place I'm writing this for engineers or technical leaders who are trying to decide if Clojure is worth investing time in. It should for the most part also be understandable by business leaders who want to understand the business benefits of building on Clojure.

    The reasons I'll outline below fall into three main categories:

    • Developer productivity: Clojure development is interactive, low ceremony, and high leverage. Clojure developers are happy developers that can ship quickly.
    • Long-term maintainability: the Clojure language and ecosystem are mature and stable, with a culture of stability that no other language ecosystem I'm aware of can match. This lets you build high-quality systems that last, while keeping maintenance costs down.
    • Culture of Ideas: while not a benefit of the language per se, adopting Clojure means you become part of a community which actively explores ideas from the past and present, academia and industry, to find better ways of building software. Clojure will challenge you in the best way possible.

    (hello 'clojure)

    Clojure is a language in the Lisp family (also styled LISP). Lisp was conceived in the 1950s as a theoretical model for reasoning about computability, similar to the Turing machine or Lambda Calculus. It soon turned out that this theoretical model also made an excellent practical language to program in, one with a high conceptual elegance. The Lisp syntax has a one-to-one correspondence with the syntax tree data structure used to represent it, which provides several benefits compared to languages with more ad-hoc grammars. This notably made it the language of choice for AI applications during the previous big AI boom.

    Interest in Lisp languages has waxed and waned over time. Over the past decade Clojure has come to prominence. The main Clojure implementation is built on top of Java's underlying machinery (the JVM), and incorporates several modern innovations in programming language design, including a complete set of well performing functional ("immutable") data types, and first class concurrency primitives. While Clojure forms a small language community and ecosystem compared to the major languages people are familiar with, it has done remarkably well for a language with no major corporate backing, and with a syntax and appearance that can seem wholly alien to people steeped in imperative curly-bracket languages or ML variants.

    A host of alternative implementations exist or are under development, including ClojureScript (compile-to-js), ClojureCLR (targeting Microsoft's .NET), Babashka (a fast-booting interpreter for scripting, compiled to native using GraalVM), and Jank (native compilation), which provides reach and leverage. Clojure knowledge will transfer to multiple contexts and circumstances, and will give you access to multiple large open source ecosystems. This article takes as its reference the JVM implementation, but much of it is true for the other variants as well, with some nuance.

    What follows are some of the reasons why we find Clojure the most compelling programming language offering that exists today.

    Interactive Development

    Programming is a constant cycle of writing code, and validating said code. Without a feedback mechanism it is near impossible to write anything but the most trivial program and still be confident that it does what it's supposed to do.

    These feedback loops come in many flavors. At its most basic, people simply run their script-style programs over and over. For interactive programs they might click through its (web) UI, maybe putting some print or logging calls in the code to better see what is going on. Unit testing provides a more rigorous and repeatable feedback loop. Compilers, linters, and other analysis tools can provide a different kind of validation, a coarse grained assessment that a program is at least structurally sound. These cycles take from seconds to hours, and generally necessitate a context switch, from the editor to a terminal, UI, or CI, and back.

    Short, quick feedback cycles are preferable over long, slow feedback cycles, and this feedback cycle speed is one of the biggest predictors of a programmer's productivity. Without quick and early feedback, you end up in a slow write/debug cycle, where as the cycles get slower, you end up spending ever more time debugging, compared to the time spent writing code.

    All of the mentioned validation techniques are available in the Clojure world as well, with for instance sophisticated tooling for unit and property-based testing. At the heart of Clojure development however lies the practice of interactive development.

    Before a single letter is written, the Clojure programmer starts up the Clojure runtime, which is connected to their editor. From here a program is "grown" by writing/running small pieces of it, and seeing the result (or error), directly and immediately, without leaving the editor.

    Here the Lisp syntax is a great help, since it provides a Lego-block like uniform syntax that makes it easy to "take a program apart", in a sense, executing individual bits or large composite pieces, merely by placing the cursor in the right spot.

    It's hard to overstate the impact of this style of interactive development, as it provides the quickest feedback cycle, and thus most immediate feedback possible. You will also see this referred to as "REPL driven development", which obscures its true power. Many programming languages have a REPL (also referred to as a console or command line interface) somewhere "over there", in a terminal emulator or browser devtools. Few allow you to execute arbitrary pieces of code "over here", right where you are writing them, as you are writing them, against a complete running system.

    And this is only the tip of the iceberg, as this ability to connect to a running system and manipulate it has more far reaching consequences. It provides an ad-hoc inspection, debugging, and manipulation interface to any Clojure program running in any environment.

    Culture of Stability

    When choosing Clojure, you don't just get a piece of powerful tech. You also become part of a community of practice, with its own notions and dogma. Even more than the tech itself it's this community of practice that really makes the choice for Clojure so compelling, and teams that adopt the tech in isolation without engaging with the wider culture and community sell themselves short. They would have been better off not choosing Clojure at all.

    One strong cultural tenet is a commitment to stability and backwards compatibility. This starts with the core language, where breaking changes are virtually unheard of, despite regular releases of improvements and extensions. This has become a deeply ingrained value in the open source ecosystem surrounding the language as well, and stands in sharp contrast with almost every other modern programming ecosystem, where a certain amount of churn — change for the sake of change — is taken for granted. This churn is a waste of resources that is hard to overstate, the global cost of which has to be measured in billions, and it's wholly avoidable.

    Not so in Clojure, where it's normal to upgrade to the latest version of the language, and of other project dependencies, as a matter of course. You simply carry on with your day. You get the benefits of bug fixes, security, and performance improvements, without having to rewrite parts of your code base, or wonder what hidden subtle bugs have been introduced by breaking changes that land even in point releases, often without being documented.

    I imagine at this point some eyebrows may be raised sceptically. Isn't change necessary to allow for progress? This shows a confusion between stability and stagnation. In software it is absolutely possible to have progress, to do new things, or improve existing things, without breaking the things that are already there. We live and breathe this every day.

    Information Systems / Knowledge Representation

    In the space of web and business applications in particular we write programs that deal with information about the world. Gathering, accessing, and processing of information, facts, is at the heart of what we do, and yet it's staggering how poorly many mainstream languages perform in this area. Either they provide data representation and manipulation primitives that are needlessly low-level, or they insist on a statically typed worldview leading to parochial, snowflake APIs that defy abstraction and higher level manipulation, or both.

    Clojure's functional data structures and core set of data manipulation functions make information truly first class. Clojure is dynamically typed, and idiomatically follows the open world assumption. RDF, the data modeling framework originally developed for the Semantic Web, has an outsized influence in the community. This is visible in the preference for triple stores/graph databases, notably Datomic. It's also visible in the language itself, where namespaced keywords are preferred, providing fully qualified identifiers for attributes that can be assigned context-free semantics.

    This isn't as heavy a lift as it may sound. A Clojure map with namespaced keywords is no more complex than a bit of JSON, but it can carry precise semantics without out-of-band contextualization, and it can be safely extended with additional information without risking naming conflicts.
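
    A tiny illustration (the :acme.* namespaces are invented for this example):

    {:acme.user/id      42
     :acme.user/email   "ada@example.com"
     :acme.billing/plan :pro}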

    Small composable functions over immutable data

    This is another aspect that is cultural as much as it is technical. Clojure is not a purely functional language, and it's easy to translate Java, Ruby, or C code directly into Clojure. But an idiomatic Clojure program looks very different from an idiomatic Java program, consisting for the most part of pure functions over immutable data.

    Immutable data provides value semantics (as opposed to reference or identity semantics), and pure functions compute a result value based purely on a tuple of input values, without having an influence on, or being influenced by, the world outside of the function. (Like reading/writing global data, or causing side effects).

    This leads to several corollaries.

    Concurrency Handling

    Contemporary computing is inherently concurrent, and has been for close to 20 years. We have dealt with the limits of Moore's law by stacking processors with ever more cores, and our programs have had to keep up.

    Clojure helps with this in the first place by emphasizing immutability. Operations which involve mutable memory locations introduce timing and ordering dependencies, which need to be carefully controlled when introducing parallelization. A pure data-in data-out transformation on the other hand can always run safely, regardless of what else is going on.

    But programs do need to maintain state over time. For this the JVM has had excellent concurrency primitives since java.util.concurrent shipped in Java 5, but using them correctly still requires the care of an expert. Clojure provides higher level abstractions on top of these that provide specific concurrency and correctness guarantees. Atoms are the most commonly used ones, providing serialization of read-then-write style operations, through Compare-and-Set (CAS) combined with automatic retries. Refs provide Software Transactional Memory (STM), Agents provide serialization of updates which are applied asynchronously, Futures provide a fork-and-join interface backed by a thread pool. These all rely on Clojure's functional data structures (including immutable queues), providing elegant thread-safe abstractions that can be used easily without shooting yourself in the foot.
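
    A small sketch of the most common of these, the atom (the names are invented for illustration):

    (def hits (atom 0))          ; a shared reference to an immutable value

    (defn record-hit! []
      (swap! hits inc))          ; CAS + automatic retry; safe from any number of threads

    @hits                        ; dereference to read the current value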

    For data processing or event-driven systems there is core.async, available as a library maintained by the core team, providing Communicating Sequential Processes (CSP), similar to Go's goroutines, and comparable to actor systems as found in Erlang/Elixir, or in Scala's Akka.

    Of course you don't have to use these higher level abstractions (see also Move up and down the trade-off/abstraction ladder), the lower level primitives are still available, including concurrent queues, atomic references, locks and semaphores, various types of thread pools, all the way down to manual thread locking and marking of synchronized critical sections, for when you do need that fine-grained control.

    Local reasoning

    There's only so much even the most gifted programmer can keep in their frame of mind at any given time. The more context that needs to be considered to assess the impact of a change, the harder it becomes to confidently and correctly make that change. This curve is a hockey stick: things go from easy to hard to impossible quickly the more distinct pieces of code and state need to be considered at the same time to understand what a program is doing.

    The fact that most of a Clojure program consists of pure functions means that one only needs to understand what the inputs for a given function are to understand the function's full behavior.

    Another aspect that helps here is that Clojure generally avoids polymorphism. There is no superclass implementing part of the behavior, you don't need to know the runtime type of objects to understand which implementation is being invoked. There are only concrete functions in namespaces. It's been said that in object oriented programming everything happens somewhere else. In Clojure there is much less of this kind of indirection, making navigating around a code base to understand control flow straightforward.

    Of course you can write code that has this property in other languages, when taking sufficient care. But in non-functional languages this often means going against the grain of the language, and adopting a coding style that is not considered common or idiomatic. Other functional languages do promote this kind of purity, but lack some of the other benefits outlined in this article.

    This local reasoning, together with Lisp's Lego-block-like uniform internal structure, makes it easy to refactor and evolve a code base. When refactoring the programmer improves a code base by changing its structure and organization, without changing its behavior. This can be quite challenging, since there might be implicit dependencies between different parts of the code base, through shared mutable state. Clojure encourages having a small amount of imperative code handling mutable state, separated from the otherwise purely functional code base. This makes both sides easier to develop and test, and provides some confidence that changes won't have unintended side-effects.

    Ease of testing

    When working with functional code, whether during interactive programming or in a unit test, validating that a piece of your program works as expected is a matter of pushing values in and seeing which values come out. There is no careful setup and teardown of state, and loading of fixtures, no stubbing out communication channels or delicately managing timing requirements, all common sources of the dreaded flakiness in tests. No code is easier to test than purely functional code.

    This also opens the door to higher leverage techniques like Property Based Testing, also known as Generative Testing, where a random sequence of ever more complex input values is fed into the program, to find values that violate certain known properties or invariants, followed by a crucial shrinking phase, so the programmer is presented with a minimal exemplar of the unsupported edge case.
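
    A minimal sketch of the idea with test.check (assumes the org.clojure/test.check dependency; the property is invented for illustration):

    (require '[clojure.test.check :as tc]
             '[clojure.test.check.generators :as gen]
             '[clojure.test.check.properties :as prop])

    ;; property: reversing a sequence twice yields the original
    (tc/quick-check 100
      (prop/for-all [v (gen/vector gen/int)]
        (= v (reverse (reverse v)))))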

    Clojure has no unique claim to these techniques, in fact Property Based Testing originated in the Haskell community, and QuickCheck-inspired libraries are available for most major languages now. It does however synergize with some of the other benefits outlined, especially the emphasis on simple, immutable data structures, and with the interactive style of development.

    Positive self selection for hiring candidates

    "But what about hiring?" When you use any language that isn't in the top 3 of currently most popular languages, you will get this question. JavaScript programmers are counted in the millions, Clojure programmers in the tens of thousands. How will you ever find the required talent?

    It seems like a logical question, but it's overly focused on the supply side. Yes, there are fewer Clojure developers, but there are also fewer Clojure jobs. It's not useful to look at absolute numbers, you need to consider the balance between the two. Anecdotally this balance seems to be ok. From what we've seen companies looking for Clojure talent are generally able to find people, and developers looking for jobs are able to get hired.

    In specific locales the story may be different. In smaller cities there might be no Clojure programmers at all. If you are intent on hiring locally, that is certainly a factor to consider. Even in bigger cities, if you are looking at hiring a lot (dozens to hundreds of people), this will be a factor. In either case you may have to find other suitable candidates, and train them into the specifics of Clojure. Nubank famously has trained hundreds of Brazilian developers to pick up Clojure out of necessity, but they describe it as a positive experience, for the company and the developers.

    In either case, whether you're hiring people with the requisite Clojure experience, or training people up, what we hear over and over again is that the quality of applicants for a Clojure job is higher than when hiring say for JavaScript or Python. You may get only a handful of CVs instead of a few hundred, but they'll be quality CVs. Remember that Clojure is a community of ideas. It attracts people who think deeply about their craft, who are interested in finding better ways to do things, who are keen on learning advanced, somewhat alien-looking technologies. What we find is that both people who have studied Clojure in their own time, and people who are drawn in by the prospect of learning Clojure on the job, tend to be curious and open minded problem solvers. Exactly the kind of people you'd want to have on your team.

    This is of course all very anecdotal, which is all we have to go on in the absence of large scale studies. We leave it to the reader to decide if they find these claims credible, or at least plausible. What I can say from working with dozens of Clojure teams over the years is that while hiring is a concern that's frequently voiced by people not (yet) doing Clojure, I have rarely heard it expressed as a major problem by teams actually doing Clojure.

    Move up and down the trade-off/abstraction ladder

    Clojure is, quite decisively, a high-level language. Idiomatically, code is concise and expressive, with little ceremony or incidental complexity, in large part thanks to the functional (immutable) data structures, and accompanying data manipulation API.

    Clojure's data structures perform very well for functional (immutable) data structures, but you still pay a cost for the convenience and guarantees they provide. Clojure's maps and vectors are internally represented as trees (Hash array mapped tries to be precise), and there is a certain amount of path copying involved in every update. When done in bulk this puts pressure on the garbage collector.

    Clojure also provides seamless interop with Java types (see the section below on Host Interop), using runtime reflection, and automatically boxing/unboxing primitives, if necessary. This all comes at a cost.

    For everyday applications this cost is negligible, and easily justifiable given the ease of use you get in return. Used well, functional data structures let you write smarter algorithms, so you do less work, offsetting some of these costs. But there are certainly use cases where this style of programming is not suitable. If you are writing a game graphics engine, doing realtime signal processing, or doing anything else that could be described as number crunching, then you want to get down to the metal.

    The good thing is that you can get down to the metal, without leaving your familiar environment. Providing some type hints to the compiler can eliminate runtime reflection and boxed math. You can work with contiguous arrays of primitive types, amenable to L1/L2 caching in the CPU. Optimized numeric vector/matrix types are available as libraries, including GPU backed.
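
    As a sketch of what that type hinting looks like (the function itself is invented for illustration), hints on the argument and the return value let the compiler emit primitive, non-reflective math:

    (defn mean ^double [^doubles xs]
      (/ (areduce xs i acc 0.0 (+ acc (aget xs i)))
         (alength xs)))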

    Contrast this with other high level languages where when the needs get high, you may be forced to switch to native extensions in C or Rust. In Clojure instead of this dichotomy you get a sliding scale. Maybe you have an event loop that needs to be able to handle high loads. A bit of profiling, type hinting, and sprinkles of interop may be all you need. At the end of the day Clojure runs as Java bytecode, which gets optimized on the fly (JIT compilation) by the JVM. You may be surprised how much you can squeeze out of that event loop with minimal changes.

    And even when doing this kind of lower level coding, you still get access to Clojure's excellent metaprogramming support, to handle some of the drudgery for you. Which brings us to the next point.

    Move up and down the metaprogramming ladder

    It's been commented on a few times that Clojure is a Lisp. What makes it a Lisp isn't (just) the superficial stuff of where the parentheses go. It's the fact that in a very real sense code is a data structure. It's like JSON, if JSON was designed to represent programs in a readable way. Instead of JavaScript's objects, arrays, strings, and so forth, Clojure code is represented as nested lists, with symbols to represent functions, variables, and reserved keywords. (When used as a JSON-like data format, this syntax is known as EDN.)

    What's unique about Lisp is that facilities for converting between a string and a data representation of code are built into the language (known as the Reader and Writer), as well as facilities to evaluate such data structures as code, or, in the case of Clojure, compile and run them as JVM bytecode.
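
    A quick illustration of that round trip:

    (read-string "(+ 1 2)")        ;=> (+ 1 2), a list of a symbol and two numbers
    (eval (read-string "(+ 1 2)")) ;=> 3
    (pr-str {:a [1 2 3]})          ;=> "{:a [1 2 3]}"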

    Macros allow the programmers to extend the syntax, essentially augmenting the compiler, by writing functions that transform this code-as-data, and this is probably the most well-known example of Lisp metaprogramming. But it's not the only option available, given these building blocks. The cultural trend in Clojure is to use macros sparingly, reserving them for key high leverage constructs, since macros are opaque and difficult to debug. They also make life harder for tools that do static analysis.

    Instead, in the Clojure open source ecosystem in particular, there's a trend towards data-driven interfaces, where instead of providing concrete functions and macros, an API is provided which takes a data structure, usually some combination of nested vectors and maps, and lets that drive the library's behavior. Examples are HTTP routing, HTML and CSS generation, data validation and coercion, and many more.
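
    For instance, HTTP routes in this style are just data (a sketch in the spirit of such libraries; the handler names are invented):

    (def routes
      [["/users"     {:get    :handlers/list-users}]
       ["/users/:id" {:get    :handlers/get-user
                      :delete :handlers/delete-user}]])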

    Superficially and syntactically the distinction is small, but the leverage gained is significant. Behavior is now driven through data, rather than invoked directly, and data, information, can be generated and manipulated. In fact, Clojure excels at this, as we pointed out earlier.

    You now have the full power of the language to create dynamic and adaptive systems. You can transform this data specification to deal with cross-cutting concerns, or make it end-user editable by storing it in a database, which is in turn trivial because you have the Clojure Reader and Writer available at runtime.

    Again this is a sliding scale, where programs and programmers will generally start out on the concrete and verbatim end of the spectrum, and stepping down a rung into metaprogramming territory when called for.

    It makes Clojure particularly suitable for highly dynamic and simulation-type systems, which can be reconfigured or rewired at runtime to exhibit new behaviors. In general these techniques provide a high amount of leverage, empowering people to do much more with the same tools and libraries, without being beholden to the library's author to support their specific use case a priori.

    Host (Java) Interop

    Modern applications are more glue than substance. We take a language's standard library, a few hundred open source libraries, a dozen SaaS APIs, and a handful of off the shelf components like databases and message queues, then add a bit of code on top to make it all work together. For application programmers (as opposed to system programmers) the bulk of their work is calling into APIs written by others, and wiring them together.

    This means it matters a lot which open source ecosystem you have access to. By leveraging the JVM and providing excellent interop capabilities, Clojure can leverage the millions of packages available on Maven Central, Java's package repository, and the biggest single open source package repository in the world.

    It does so with little to no ceremony. Clojure is concise compared to Java, and the interactive programming facilities make it easy to explore APIs, and quickly wire them together. It's not controversial to say that in this kind of exploratory glue programming Clojure beats Java hands down.

    With ClojureScript all the same arguments can be made for JavaScript and the NPM package repository.

    Culture of Ideas

    At the end of the day does your choice of language really matter that much? Teams and companies can be successful in virtually any language, and conversely no language can stop a well intentioned engineer from creating a huge mess. A sharp blade does not make you a master chef, and in the wrong hands may do more harm than good. And Clojure certainly has a few sharp edges. The language attempts very little hand-holding, expecting the programmer to know what they are doing. While strides have been made to improve the onboarding and learning experience, it can still feel like a trial by fire, especially with insufficient mentoring. This does lead to people becoming reasonably proficient, but still missing out on a lot of Clojure's benefits.

    Indeed we've come across a good few Clojure code bases of questionable merit. Often these are written by teams with a different language background, say Java or Python, who adopted Clojure's syntax, but failed to steep themselves in the ideas and idioms of Clojure's community of practice, resulting in a Lisp-flavored pidgin.

    On the other hand those who do embrace this culture of ideas will find they gain a more refined mental framework for reasoning about software design, one which transfers remarkably well to other languages and ecosystems.

    We've pointed out a few ways already in which the appeal of Clojure is at least in part cultural, rather than merely technical. Much of Clojure's relative success, despite the lack of major corporate backing, is due to Rich Hickey's conference talks, in which he explores the ideas that influenced the design of Clojure and Datomic, as well as his own insights distilled from decades in the industry. Similarly at Clojure conferences talks tend to explore ideas, revisit influential papers, or share experiences, rather than simply presenting libraries and tools.

    Fundamentally Clojure's community is one which isn't afraid to second-guess itself. Here you find professionals working at the outer edge of their capabilities, always striving to learn and to find better ways of building software together, rather than merely coasting along. I am deeply grateful I can be part of it.


    Clojure Is Awesome!!! [PART 12]

    (ns chain-of-responsibility
      (:require [clojure.pprint :as pp]))
    
    ;; === Request Processing Chain ===
    (defprotocol RequestHandler
      (handle-request [this request])
      (set-next [this handler]))
    
    ;; === Authentication Handler ===
    (defrecord AuthenticationHandler [next-handler]
      RequestHandler
      (handle-request [this request]
        (if-let [auth-token (:auth-token request)]
          (if (= auth-token "valid-token")
            (if next-handler
              (handle-request next-handler 
                             (assoc request :authenticated true))
              (assoc request :authenticated true))
            {:error "Invalid authentication token"})
          {:error "Missing authentication token"}))
    
      (set-next [_ handler]
        (->AuthenticationHandler handler)))
    
    ;; === Authorization Handler ===
    (defrecord AuthorizationHandler [next-handler]
      RequestHandler
      (handle-request [this request]
        (if (:authenticated request)
          (if (contains? (:roles request) :admin)
            (if next-handler
              (handle-request next-handler 
                             (assoc request :authorized true))
              (assoc request :authorized true))
            {:error "Insufficient permissions"})
          (if next-handler
            (handle-request next-handler request)
            request)))
    
      (set-next [_ handler]
        (->AuthorizationHandler handler)))
    
    ;; === Validation Handler ===
    (defrecord ValidationHandler [next-handler]
      RequestHandler
      (handle-request [this request]
        (if (and (:data request)
                 (map? (:data request))
                 (every? string? (vals (:data request))))
          (if next-handler
            (handle-request next-handler 
                           (assoc request :validated true))
            (assoc request :validated true))
          {:error "Invalid request data format"}))
    
      (set-next [_ handler]
        (->ValidationHandler handler)))
    
    ;; === Logging Handler ===
    (defrecord LoggingHandler [next-handler]
      RequestHandler
      (handle-request [this request]
        (println "\nProcessing request:")
        (pp/pprint (dissoc request :handler))
        (let [response (if next-handler
                        (handle-request next-handler request)
                        request)]
          (println "\nResponse:")
          (pp/pprint response)
          response))
    
      (set-next [_ handler]
        (->LoggingHandler handler)))
    
    ;; === Cache Handler ===
    (def request-cache (atom {}))
    
    (defrecord CacheHandler [next-handler]
      RequestHandler
      (handle-request [this request]
        (if-let [cached (@request-cache (:id request))]
          (do
            (println "Cache hit for request:" (:id request))
            cached)
          (let [response (if next-handler
                          (handle-request next-handler request)
                          request)]
            (when (:id request)
              (swap! request-cache assoc (:id request) response))
            response)))
    
      (set-next [_ handler]
        (->CacheHandler handler)))
    
    ;; === Request Processing ===
    ;; Build the chain by nesting the handlers directly. (Chaining successive
    ;; `set-next` calls with `->` would not work here: each call returns a new
    ;; handler whose next-handler is only the argument just passed in, so the
    ;; previously attached handlers would be discarded.)
    (defn build-chain []
      (->LoggingHandler
       (->CacheHandler
        (->AuthenticationHandler
         (->AuthorizationHandler
          (->ValidationHandler nil))))))
    
    ;; === Example Usage ===
    (defn run-examples []
      (let [chain (build-chain)]
        (println "\n=== Valid Admin Request ===")
        (handle-request chain
                       {:id "req-1"
                        :auth-token "valid-token"
                        :roles #{:admin}
                        :data {"name" "John"
                              "action" "read"}})
    
        (println "\n=== Invalid Token ===")
        (handle-request chain
                       {:id "req-2"
                        :auth-token "invalid-token"
                        :roles #{:admin}
                        :data {"name" "John"}})
    
        (println "\n=== Missing Token ===")
        (handle-request chain
                       {:id "req-3"
                        :roles #{:admin}
                        :data {"name" "John"}})
    
        (println "\n=== Insufficient Permissions ===")
        (handle-request chain
                       {:id "req-4"
                        :auth-token "valid-token"
                        :roles #{:user}
                        :data {"name" "John"}})
    
        (println "\n=== Invalid Data ===")
        (handle-request chain
                       {:id "req-5"
                        :auth-token "valid-token"
                        :roles #{:admin}
                        :data {"name" 123}})
    
        (println "\n=== Cached Request ===")
        (handle-request chain
                       {:id "req-1"
                        :auth-token "valid-token"
                        :roles #{:admin}
                        :data {"name" "John"
                              "action" "read"}})))
    
    
    (run-examples)
    


    Enhancing engineering workflows with AI: a real-world experience

    Artificial Intelligence (AI) and Large Language Models (LLMs) are revolutionizing the tech industry, and at Nubank, we’re using these technologies to enhance engineering workflows across Brazil, Mexico, and Colombia. In a recent talk at Clojure Conj 2024, Carin Meier, Principal Software Engineer at Nubank, and Marlon Silva, Software Engineer at Nubank, shared how AI-powered tools are transforming how we work.

    Clojure Conj, a conference held since 2010, is a key event for the global Clojure community. It brings together developers and thought leaders to discuss the latest trends in Clojure programming. In 2024, it provided the perfect platform for Carin and Marlon to present how Nubank is integrating AI, including LLMs, to streamline our engineering processes.

    In this article, we’ll explore the main topics from the lecture, and how these AI tools are optimizing everything from code generation to team collaboration at Nubank—and how they could help your team too.

    What are Large Language Models (LLMs)?

    Before diving into our experiences, let’s start with a quick overview of what LLMs are and how they work.

    At a high level, LLMs like GPT-3 and GPT-4 are machine learning models trained on vast datasets to predict the next word (or token) in a sequence based on the context provided. They are designed to mimic human-like understanding and generation of language.

    For example, when you type a prompt like “Clojure is a lovely programming language that allows you to,” an LLM can predict and continue the sentence with something like “code a program in a pure functional style.” The model does this by drawing from patterns it has learned during training, where it encounters large amounts of code and documentation, allowing it to generate meaningful sentences in response.

    However, LLMs are not perfect. They require experimentation to understand their potential, especially when it comes to generating code in specific programming languages like Clojure, a language that doesn’t have as much public training data compared to more mainstream languages like Python or JavaScript.

    The power of benchmarking: testing LLMs for Clojure

    To understand whether LLMs could truly enhance engineering workflows, we needed to test their capabilities. At Nubank, we selected a few models and applied them to generate Clojure code. While many existing benchmarks showed impressive results for languages like Python and JavaScript, we were curious how well these models would perform for Clojure, which has its own unique syntax and concepts.

    Initially, we used a tool called the MultiPL-E (Multi-Programming Language Evaluation of Large Language Models of code) Benchmarking Tool. This open-source tool allows us to test the quality of code generated by LLMs based on a set of predefined problems, like those in the HumanEval and MBPP datasets.

    With this tool, we were now able to put our Clojure code generation capabilities to the test. Thanks to invaluable support from Alex Miller, a prominent figure in the Clojure community and a vital part of Nubank’s operations, we integrated Clojure into MultiPL-E and started comparing it alongside Python and JavaScript. 

    At first, we didn't apply any special fine-tuning or engineering tricks; we simply wanted to observe the raw potential of the latest models (including open-source projects like Llama3 and private GPT variants from OpenAI) in producing production-ready code. Unsurprisingly, Clojure lagged a bit behind Python and JavaScript at first—likely a reflection of the smaller corpus of Clojure code used to train most LLMs—but the surprise was how close the results actually turned out to be.

    With each new release—GPT-3.5, GPT-4, GPT-4o, o1-preview, o1, and beyond—we’ve observed the gap shrink further. It’s encouraging to see Clojure gain ground so quickly, and it gives us hope for a future where the disparity between languages all but disappears. As more models are trained on increasingly diverse datasets, we expect to see Clojure’s performance match Python’s and JavaScript’s. 

    The open-source community and ongoing efforts like MultiPL-E are making strides to improve support and visibility for functional languages, and we’re excited about what this means for developers who rely on Clojure every day.

    The lesson here? Don’t be afraid to experiment. Try various models and see how they align with your specific use cases. The performance of these models can vary significantly depending on your needs.

    Building flexible tools for Engineering teams

    One of the key takeaways from our journey with LLMs is the importance of building flexible and extensible tools. The world of AI is moving so fast that we can’t predict exactly what our engineers will need in the next month, let alone a year.

    At Nubank, we’ve embraced this uncertainty. We’ve designed tools that are small, modular, and easy to adapt as new developments emerge. A good example of this is Roxy, a local proxy that facilitates the use of LLMs in a regulated environment.

    Roxy is designed to ensure that any interaction with LLMs adheres to compliance and security regulations. Rather than building a complex solution tailored to a specific use case, we created a thin, flexible interface that engineers can use in a variety of ways. This approach allowed us to quickly adapt as new requirements or opportunities arose.

    The key takeaway here is that teams shouldn’t over-engineer their tools. They should create something simple that can grow and evolve alongside technology.

    Fostering a community for sharing AI insights

    In any fast-moving field, collaboration is key. At Nubank, we’ve found that creating a community of practice—what we call guilds—has been invaluable. These are internal user groups where we share experiences, discuss challenges, and brainstorm ways to leverage new tools like LLMs effectively.

    By gathering on a regular basis, we ensure that everyone stays up-to-date on the latest AI advancements and gets a chance to provide feedback. This has helped us continually improve our tools and techniques for integrating LLMs into engineering workflows.

    If you’re working with AI or any new technology, consider fostering your own community. It’s a great way to keep learning and stay ahead of the curve.

    Can LLMs help us think?

    While many people worry that AI will replace human thinking, we believe that LLMs can actually enhance our thinking—if used correctly. For example, LLMs can help engineers and product managers think critically, ask better questions, and approach problems from new angles.

    Something that we’ve found useful is using AI to guide us in identifying the root cause of a problem, rather than just providing the answer. For instance, if we’re faced with a performance issue in a microservice, we might prompt the LLM with a question like, “How can I best frame a solution for a microservice that runs slow on an IO operation?”

    The idea isn’t to ask for an answer right away but to use the LLM to help us structure our thinking. By using LLMs this way, we can dig deeper into the problem and come up with better solutions.

    In another example, Marlon used this method to craft a product report. He asked the LLM to assume the role of a product manager and help him structure a report for upper management on the benchmark of LLM models for Clojure. The result was a report that exceeded expectations and impressed the product manager.

    A look into the future: the power of autonomous AI agents

    As AI evolves, the idea of autonomous agents that can write code and solve problems on their own is becoming more of a reality. We’ve explored some early-stage tools, like Open Hands, which use LLMs to assist with tasks like data analysis.

    In a recent demo, we tasked Open Hands with performing a data analysis on the Iris dataset using Clojure. The agent autonomously planned, wrote, and executed the code, demonstrating how LLMs can assist engineers in tasks that would typically require more time and effort. While the technology is still in its early stages, we’re excited by the possibilities it presents.

    Devin, an autonomous AI software engineer developed by Cognition Labs, is another example of how AI is transforming software development. Devin has been instrumental in helping us migrate our massive ETL system with over 6 million lines of code. 

    By automating repetitive tasks like refactoring and code migration, Devin enabled Nubank to complete a project initially estimated to take over 18 months with a thousand engineers in just weeks, achieving a 12-fold increase in efficiency and significant cost savings. 

    Looking ahead

    As AI continues to evolve, it’s clear that Large Language Models are not just tools for automating tasks—they are essential for enhancing developer workflows. By integrating LLMs into Nubank’s engineering processes, we’ve seen firsthand how they can boost productivity, foster creativity, and bridge gaps between technical and business teams. 

    And, as we continue to explore and refine our AI solutions, we encourage other organizations to experiment and build flexible, extensible tools that adapt to the fast-moving world of AI. The future of engineering is here, and with LLMs, the possibilities are endless.

    Learn more about what we shared on this topic in the recording of our Clojure Conj 2024 talk.



    Revisiting 'Clojure Don'ts : concat

    Nostalgia city

    I've recently started maintaining a Clojure codebase that hasn't been touched for over a decade - all Clojure devs that built and maintained it are long gone. It's using java8, Clojure 1.6 and libs like korma and noir - remember those? Contrary to the prevailing Clojure lore, upgrading Clojure will not be just a matter of changing version numbers in the lein project.clj.

    I find one of the most dated aspects of the project is the laziness. I only use laziness as an explicit choice and have done so for many years. Laziness is a feature I find I rarely need, but is sometimes just the right fit.

    A lot of the original Clojure collection functions are lazy and it is still common to see new code written with them - I think because they are still seen as an idiomatic default, rather than a conscious choice. Non-lazy versions like mapv and filterv came later and transducers later still, but of course the old functions must continue to work as before.

    Investigating a bug in the codebase led me back to this great blog post, Clojure Don'ts: Concat, also written around a decade ago. The rest of this post will discuss that post, so if you haven't, please read that first (and ofc the rest of the 'Don'ts' series is also good).

    Revisiting the post

    I had first read the post many years ago and had forgotten the details. I guess the main thing I remembered was 'don't use concat' - which is maybe a good heuristic, but it misses the main point, which could be phrased as: build lazy sequences starting from the outside. I'll explain the 'outside' thing further on.

    Reading it again, I had to go over it a couple of times to fully understand it - if it was crystal clear to you then you've no need to read on. To check your understanding - answer this: what difference would it make (wrt overflow) to change the order of the args to concat in the build-result function?

    Following is my attempt to make the post's message even clearer.

    The post mentions that seq realises the collection and causes the overflow. Just in case it is not clear, seq does not in general realise lazy collections in their entirety; it just realises the first element.

    To demonstrate that, have a look at the following, which is like range but where the numbers in the sequence descend to one:

    
    (defn range-descending [x]
      (when (pos? x)
        (lazy-seq
          (cons x (range-descending (dec x))))))
    
    (let [_ (seq (range-descending 4000))]
      nil) ; => ok, no overflow
    
    

    This is what one might call an outside-in lazy sequence. As the sequence is generated, one might picture it like this:

    (4000, LazySeq-obj)
    (4000, 3999, LazySeq-obj)
    (4000, 3999, 3998, LazySeq-obj)
    ...
    

    Calling seq on the collection, only the first element is realized, so no overflow.

    The equivalent to the way concat was used in the original post would be more like this:

      (defn range-descending-ohno [x]
        (when (pos? x)
          (lazy-seq
            (conj (range-descending-ohno (dec x)) x))))
    

    Now visualising the sequence generation, it would look more like this:

    (conj LazySeq-obj 4000)
    (conj (conj LazySeq-obj 3999) 4000)
    ...
    (conj `...` (conj nil 1) `...` 4000)    
    

    Now when calling seq (as in (seq (range-descending-ohno 4000))), the whole sequence needs to be realised for seq to get to the first element (4000 in the example). As the post says: seq has to recurse through them until it finds an actual value. One might call this an inside-out lazy sequence.

    Conclusion

    The original post concludes "Don’t use lazy sequence operations in a non-lazy loop", which I would update to add: don't use laziness at all unless required.

    If you do decide to use laziness, avoid building sequences inside-out - this might be in your direct usage of e.g. lazy-seq, or hiding in plain sight in your usage of clojure.core functions such as concat.

    Further Reading

    • The inside-out lazy seq topic is also covered in Clojure Brain Teasers if you want more pictures and explanation (in the Boom Goes the Dynamite chapter).
    • Clojure's Deadly Sin is a very well-considered and comprehensive look into the problems of laziness in Clojure.

    Permalink

    Pathom3 Instrumentation

    In this article I will explain how to get performance insights into your Pathom3 resolvers by using Tufte. My aim is to show a very basic example of how it can be done, without doing a deep dive on any of the topics.

    Pathom

    If you are unfamiliar with Pathom, its docs define it as "a Clojure/script library to model attribute relationships". In essence, Pathom allows you to create a graph of related keywords and query it using the EDN Query Language (EQL). It supports read and write operations using resolvers and mutations. The "magic" of it is that it produces an interface which abstracts away function calling by handling all the graph traversal internally when responding to EQL requests. What does that mean? A short example should suffice:

    ;; create a few resolvers to model related attributes
    (pco/defresolver all-items
      "Takes no input and outputs `:all-items` with their `:id`."
      []
      {::pco/output [{:all-items [:id]}]}
      {:all-items
       [{:id 1}
        {:id 2}
        {:id 3}]})
    
    (pco/defresolver fetch-v
      "Takes an `:id` and outputs its `:v`."
      [{:keys [id]}]
      (Thread/sleep 300)
      {:v (* 10 id)})
    
    ;; query the graph for some data
    (p.eql/process
     (pci/register [all-items fetch-v])
     ;; ask for the `:v` attribute of `:all-items`
     [{:all-items [:v]}])
    ; => {:all-items [{:v 10} {:v 20} {:v 30}]}
    

    Source: Pathom3 docs on Batch Resolvers.
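
    For reference, this snippet and the ones that follow assume roughly these namespace aliases (a hedged sketch of the requires; check the Pathom3 and Tufte docs for the canonical forms):

    (ns my-app.pathom-tufte-example
      (:require [com.wsscode.pathom3.connect.indexes :as pci]
                [com.wsscode.pathom3.connect.operation :as pco]
                [com.wsscode.pathom3.connect.planner :as pcp]
                [com.wsscode.pathom3.connect.runner :as pcr]
                [com.wsscode.pathom3.interface.eql :as p.eql]
                [com.wsscode.pathom3.plugin :as p.plugin]
                [taoensso.tufte :as tufte]))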

    As you can see, once the graph is established, you only need to tell Pathom what you want, not how to get it. As long as there is enough data to satisfy the input requirements of some initial resolver, its output can be used as input to whatever other resolver(s) need to be used in order to satisfy the entire request. Pathom will continue traversing the graph using whatever data it has at each point in order to get all the requested attributes. An elaborate chain of function calls is reduced to a single EQL expression.

    While this does offer developers a great deal of power, one trade-off is that it becomes a little bit harder to understand exactly what your program is doing when you send your query to the Pathom parser. The above example creates a very simple graph without much mystery, but real applications often include a large number of resolvers, often with multiple paths for getting certain attributes.

    Tufte

    Tufte is useful for understanding what happens when you send a query to your Pathom parser. From the Tufte example in its repo's README, the basic usage is like this:

    (tufte/profile ; Profile any `p` forms called during body execution
      {} ; Profiling options; we'll use the defaults for now
      (dotimes [_ 5]
        (tufte/p :get-x (get-x))
        (tufte/p :get-y (get-y))))
    

    In plain English, we need to use p to wrap individual expressions and profile to wrap a set of p expressions to profile them together.

    Profiling Pathom Queries

    To put it together, we need to understand one last piece: Pathom Plugins. Plugins allow developers to extend Pathom's functionality by wrapping specific parts of its internal execution process with arbitrary extension code. The various places you can add wrapping are identified by keywords. In our case, we want to wrap individual resolver calls with p and the entire process (which may call many resolvers) with profile. The keywords for these extension points are:

    • ::pcr/wrap-resolve for individual resolvers
    • ::p.eql/wrap-process-ast for the entire process

    NOTE: this article is specifically for Pathom's EQL interface.

    With this knowledge, we can create some extension functions and register the plugin:

    (defn tufte-resolver-wrapper
      "Wrap a Pathom3 resolver call in `tufte/p`."
      [resolver]
      (fn [env input]
        (let [resolver-name (-> (get-in env [::pcp/node ::pco/op-name])
                                (name)
                                (keyword))
              identifier (str "resolver: " resolver-name)]
          (tufte/p identifier (resolver env input)))))
    
    (defn tufte-process-wrapper
      "Wrap a Pathom3 process in `tufte/profile`."
      [process-ast]
      (fn [env ast] (tufte/profile {} (process-ast env ast))))
    
    (p.plugin/defplugin tufte-profile-plugin
      {::p.plugin/id `tufte-profile-plugin
       ::pcr/wrap-resolve tufte-resolver-wrapper
       ::p.eql/wrap-process-ast tufte-process-wrapper})
    

    The last step is to include this plugin in Pathom's environment when processing a query:

    ;; Add handler to print results to *out*
    (tufte/add-basic-println-handler! {})
    
    (p.eql/process
    ;; Only the first form is new, everything else is as before.
     (-> (p.plugin/register tufte-profile-plugin)
         (pci/register [all-items fetch-v]))
     [{:all-items [:v]}])
     ; => {:all-items [{:v 10} {:v 20} {:v 30}]}
    

    Tufte Results - no batch

    If you follow along with the Batch Resolvers docs linked above, you can see how to optimize such a situation to avoid the N+1 query and the extra 600ms of processing time it causes. Let's replace the fetch-v resolver with its batch version and profile it again:

    (pco/defresolver batch-fetch-v
      "Takes a _batch_ of `:id`s and outputs their `:v`."
      [items]
      {::pco/input  [:id]
       ::pco/output [:v]
       ::pco/batch? true}
      (Thread/sleep 300)
      (mapv #(hash-map :v (* 10 (:id %))) items))
    
    (p.eql/process
      (-> (p.plugin/register tufte-profile-plugin)
          (pci/register [all-items #_fetch-v batch-fetch-v]))
      [{:all-items [:v]}])
    ; => {:all-items [{:v 10} {:v 20} {:v 30}]}
    

    Tufte Results - batch

    Comparing results, we can see the processing time saved by the batch version, exactly how much time was spent in each resolver, and which resolvers were called. Again, this is a very simplified example. In a real-world scenario you may end up calling a large number of resolvers to produce the result, so having Tufte's stats at hand can be very useful.

    Pathom Viz

    As a final note, I want to point out that Pathom has its own tool for gaining such insights. It's called Pathom Viz and provides an excellent visual interface that shows everything you get from the above and more. It's a great tool and I use it often. Using Tufte as I've outlined above is an alternative lightweight approach that I've found useful.

    Wrapping Up

    In this article I covered a basic introduction to Pathom, its extension points and how to integrate it with Tufte in order to get performance and execution insights. Nothing groundbreaking here, but I did a quick search and didn't find any similar content, so hopefully this helps someone in the future.

    You can find the complete working example code in my fnguy-examples repo.

    Permalink

    Taming LLM Responses with Instaparse

    It started with a simple goal: integrate an LLM model. Little did I know this would lead us down a rabbit hole for parsing challenges that would fundamentally change how we handle LLM outputs.

    The Promise and the Pain

    Like many developers, our journey began with a straightforward vision: use LLMs to generate UI operations for our no-code platform. The plan seemed simple - have the model return JSON structures describing UI components, their properties, and how they should be manipulated.

    Our initial schema looked promising:

    [{
      "type": "append:node",
      "context": {
        "engine": "string",
        "workspace": "string"
      },
      "data": {
        "source": [{
          "id": "string",
          "componentName": "string",
          "props": {
            "data-content-editable": "content",
            "class": "string",
            "content": "string"
          }
        }],
        "target": {
          "id": "string",
          "componentName": "string",
          "props": {}
        }
      }
    }]

    We wrote comprehensive prompts, carefully explained our component hierarchy, and felt confident about our approach. Then reality struck.

    The Pain Points

    Our testing phase revealed several critical issues:

    1. JSON formatting significantly increased response latency
    2. Not all models supported JSON mode consistently
    3. Even with JSON mode enabled, sometimes LLMs would respond with incomplete JSON.
    4. The performance impact was unacceptable for real-time applications

    The Regex Temptation

    I'll admit it - my first instinct was to reach for regex. After all, how hard could it be to match some curly braces and square brackets?

    ;; I actually wrote this. I'm not proud of it.
    (re-find #"\{[^}]+\}" llm-response)

    I can feel you laughing right now. If you've ever tried to parse JSON with regex, you know exactly where this is going - a path of madness, unmaintainable code, and edge cases that haunted my dreams.

    Instaparse - The Game Changer

    Instead of fighting with regex, I decided to write a proper grammar to parse JSON-like structures embedded in text.

    Here&aposs the complete solution I developed:

    1. The Grammar Definition

    First, I defined a grammar that could handle JSON embedded within normal text:

    (ns json-extractor.core
      (:require [instaparse.core :as insta]
                [clojure.edn :as edn]))
    
    
    (def json-parser
      (insta/parser
        "text = (not-json | json)*

         <not-json> = #'[^{\\[]+|[{\\[](?![\"\\s\\[{])'

         json = object | array

         <value> = object | array | string | number | boolean | null

         object = <'{'> <ws> (pair (<','> <ws> pair)*)? <ws> <'}'>

         array = <'['> <ws> (value (<','> <ws> value)*)? <ws> <']'>

         pair = string <ws> <':'> <ws> value

         string = <'\"'> #'[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*' <'\"'>

         number = #'-?(?:0|[1-9]\\d*)(?:\\.\\d+)?(?:[eE][+-]?\\d+)?'

         boolean = 'true' | 'false'

         null = 'null'

         ws = #'\\s*'"))

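    To make the transform rules below easier to follow, here is roughly the shape of the raw parse tree for a trivial input (a hedged illustration; rules wrapped in <> in the grammar are hidden and do not appear as tags):

    (json-parser "{\"a\": 1}")
    ;; => [:text [:json [:object [:pair [:string "a"] [:number "1"]]]]]
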
    2. Validation Layer

    Once parsed, I needed to ensure the structures were valid:

    (defn valid-json-structure? [x]
      (or (map? x)
          (and (sequential? x)
               (every? (fn [item]
                        (or (number? item)
                            (string? item)
                            (boolean? item)
                            (nil? item)
                            (valid-json-structure? item)))
                      x))))

    3. Transform Rules

    (def transform-map
      {:string identity
       :number (fn [n]
                (try
                  (edn/read-string n)
                  (catch Exception _
                    n)))
       :boolean #(= % "true")
       :null (constantly nil)
       :pair vector
       :object (fn [& pairs]
                (try
                  (reduce (fn [acc [k v]]
                           (assoc acc (keyword k) v))
                         {}
                         pairs)
                  (catch Exception _
                    nil)))
       :array (fn [& items]
               (try
                 (vec (remove nil? items))
                 (catch Exception _
                   nil)))
       :json identity
       :text (fn [& items]
              (->> items
                   (remove nil?)
                   (filter valid-json-structure?)))})

    4. JSON String Detection

    Before parsing, we need to find potential JSON strings in the text:

    (defn find-all-json-like-strings
      "Find potential JSON objects/arrays in text using balanced delimiter matching"
      [text]
      (let [results (atom [])
            len (count text)]
        (loop [i 0
               stack []
               start -1]
          (if (< i len)
            (let [c (nth text i)
                  ;; are we at the opening delimiter of a new candidate?
                  opening? (and (empty? stack) (= start -1) (#{\{ \[} c))
                  stack' (cond
                           (#{\{ \[} c)
                           (conj stack c)

                           (and (= (peek stack) \{) (= c \}))
                           (pop stack)

                           (and (= (peek stack) \[) (= c \]))
                           (pop stack)

                           :else
                           stack)]
              (cond
                opening?
                (recur (inc i) stack' i)

                ;; the outermost delimiter just closed: record the candidate
                (and (> start -1) (seq stack) (empty? stack'))
                (do
                  (swap! results conj (subs text start (inc i)))
                  (recur (inc i) stack' -1))

                :else
                (recur (inc i) stack' start)))
            ;; unterminated candidate running to the end of the text
            (when (> start -1)
              (swap! results conj (subs text start len)))))
        @results))

    5. Putting It All Together

    Finally, I combined everything into two main functions:

    (defn parse-single-json
      "Parse a single JSON string"
      [text]
      (try
        (let [result (json-parser text)]
          (when-not (insta/failure? result)
            (let [transformed (insta/transform transform-map result)
                  transformed (if (sequential? transformed)
                               (first transformed)
                               transformed)]
              (when (valid-json-structure? transformed)
                transformed))))
        (catch Exception e
          (tap> {:exception e :text text})
          nil)))
    
    (defn extract-json
      "Extract all valid JSON structures from text"
      [text]
      (->> (find-all-json-like-strings text)
           (map parse-single-json)
           (filterv some?)))

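    Here is a quick REPL-style check of the whole pipeline on a made-up LLM reply (assuming the functions above are loaded; the printed key order may vary):

    (extract-json "Sure, here you go: {\"status\": \"ok\", \"count\": 3} - anything else?")
    ;; => [{:status "ok", :count 3}]
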
    Learn from my mistakes

    1. Write tests from the start.
    2. Don't modify the grammar without thorough testing
    3. Don't assume all LLM responses will contain valid JSON
    4. Don't skip the validation step, even if parsing succeeds
    5. Don't try to parse extremely large JSON structures in one go

    When dealing with LLMs, robust parsing isn't just nice to have - it's essential for building reliable AI systems.

    See It In Action

    Our Auxtool Agent now streams UI operations in real-time, applying them as they arrive from the LLM. This creates a fluid, interactive experience where you can watch your UI being built dynamically as the model generates responses.


    Demo Vade Auxtool Building Landing Page

    Permalink

    Clojure Deref (Feb 14, 2025)

    Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem (feed: RSS). Thanks to Anton Fonarev for link aggregation.

    Libraries and Tools

    New releases and tools this week:

    Permalink

    Deploying ML models in Clojure

    Kira Howe's 2024 article about the current state of ML in Clojure prominently features the Tribuo library by Oracle Labs and the Clojure wrapper for Tribuo. Tribuo integrates XGBoost, the ONNX runtime, and TensorFlow-Java. However, the TensorFlow bindings for Java look a bit verbose (see e.g. the MNIST example).

    Another approach is to train the model in Python, export it to the ONNX format, and then use the ONNX runtime directly to perform inference in Clojure. There is a recent tutorial on using ONNX models from Clojure. However, it only deals with tabular data.

    Training

    The following example uses PyTorch to train a traditional CNN classifier on the well-known MNIST dataset (the dataset can be obtained here). The implementation performs the following steps:

    • A class for reading MNIST images and labels is implemented.
    • A CNN model using two convolutional layers and two fully connected layers is implemented and dropout regularization is applied.
    • The training and test data is loaded as batches.
    • The cross entropy loss function and an Adam optimizer are instantiated. Note that learning rate and dropout are hyperparameters which need to be tuned.
    • The training loop performs prediction, loss computation, backpropagation, and optimization step.
    • The test loop accumulates and displays the prediction accuracy on the test set.
    • After 25 epochs, the model is exported to the ONNX format.

    import numpy as np
    import torch
    from torch import nn
    from torch import onnx
    from torch.nn import functional as F
    from torch.utils.data import DataLoader, Dataset
    
    
    class MNISTData(Dataset):
    
        def __init__(self, images_file_name, labels_file_name):
            """Read MNIST images and labels from specified files"""
            super(MNISTData, self).__init__()
            # Read images (skip magic, length, height, and width integers)
            self.images = np.fromfile(images_file_name, dtype=np.uint8)[16:].reshape(-1, 28, 28)
            # Read labels (skip magic and length integer)
            self.labels = np.fromfile(labels_file_name, dtype=np.uint8)[8:]
    
        def __len__(self):
            """Return the number of images (or labels) in the dataset"""
            return len(self.labels)
    
        def __getitem__(self, idx):
            """Return the image and label at the specified index"""
            image = torch.from_numpy(self.images[idx]).to(torch.float) / 255.0
            label = torch.zeros(10)
            label[self.labels[idx]] = 1
            return image, label
    
    
    class MNISTNet(nn.Module):
    
        def __init__(self):
            """Construct network with 2 convolutional layers and 2 fully connected layers"""
            super(MNISTNet, self).__init__()
            self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
            self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
            self.conv2_drop = nn.Dropout2d(p=0.2)
            self.fc1 = nn.Linear(320, 50)
            self.fc2 = nn.Linear(50, 10)
    
        def forward(self, x):
            """Perform forward pass of network"""
            x = x.view(-1, 1, 28, 28)
            x = F.relu(F.max_pool2d(self.conv1(x), 2))
            x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
            x = x.view(-1, 320)
            x = F.relu(self.fc1(x))
            x = F.dropout(x, p=0.2, training=self.training)
            x = self.fc2(x)
            return F.softmax(x, dim=1)
    
    
    def main():
        train_data = MNISTData('data/train-images-idx3-ubyte', 'data/train-labels-idx1-ubyte')
        test_data = MNISTData('data/t10k-images-idx3-ubyte', 'data/t10k-labels-idx1-ubyte')
    
        train_loader = DataLoader(train_data, batch_size=64)
        test_loader = DataLoader(test_data, batch_size=64)
    
        model = MNISTNet()
        loss = nn.CrossEntropyLoss()
        # Adam optimizer
        optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    
        for epoch in range(25):
            for x, y in train_loader:
                pred = model(x)
                l = loss(pred, y)
                optimizer.zero_grad()
                l.backward()
                optimizer.step()
    
            correct = 0
            total = 0
            for x, y in test_loader:
                pred = model(x).argmax(dim=1)
                correct += (pred == y.argmax(dim=1)).sum().item()
                total += len(y)
            print('Accuracy: {}'.format(correct / total))
    
        # Save model as ONNX
        torch.onnx.export(model,
                          (torch.randn((1, 1, 28, 28), dtype=torch.float),),
                          'mnist.onnx',
                          input_names=['input'],
                          output_names=['output'])


    # Run training when the file is executed as a script
    if __name__ == '__main__':
        main()

    Inference

    The model file mnist.onnx can now be used for inference in Clojure. The deps.edn file specifies the ONNX runtime and the cljfx library:

    {:deps {com.microsoft.onnxruntime/onnxruntime {:mvn/version "1.20.0"}
            cljfx/cljfx {:mvn/version "1.9.3"}}
     :paths ["."]
     :aliases {:infer {:main-opts ["-m" "infer.core"]}}}

    The infer.clj file contains the code to run the inference on the model. The code contains the following functions for inference:

    • read-digit - Read a 28*28 gray-scale byte block from the MNIST dataset
    • feature-scaling - Scale byte features to [0, 1] floating-point range. Note that Clojure byte arrays contain signed values which need to be converted to unsigned values!
    • argmax - Return the index of the maximum value of a one-dimensional probability vector.
    • inference - Convert a byte array to an ONNX tensor with batch size and number of channels being 1, run inference, and return the argmax of the probability vector.

    Furthermore, the digit->image function uses the idea shown in James Thompson’s Gist to convert a byte array to a JavaFX image in order to display it. The remaining code displays a small JavaFX GUI showing random images from the MNIST test data and the inference result.

    (ns infer.core
        (:require [clojure.java.io :as io]
                  [cljfx.api :as fx])
        (:import [java.io ByteArrayOutputStream ByteArrayInputStream]
                 [java.nio FloatBuffer]
                 [javafx.application Platform]
                 [ai.onnxruntime OrtEnvironment OrtSession OnnxTensor]))
    
    (def environment (OrtEnvironment/getEnvironment))
    
    (def mnist (-> environment (.createSession "mnist.onnx")))
    
    (defn read-digit
      "Read a 28*28 gray-scale byte block from the MNIST dataset."
      [n]
      (with-open [in (io/input-stream "data/t10k-images-idx3-ubyte")]
        (.skip in (+ 16 (* n 28 28)))
        (.readNBytes in (* 28 28))))
    
    (defn byte->ubyte
      "Convert byte to unsigned byte"
      [b]
      (if (>= b 0) b (+ b 256)))
    
    (defn feature-scaling
      "Scale features to [0, 1] range"
      [digit]
      (float-array (map #(/ (byte->ubyte %) 255.0) digit)))
    
    (defn argmax
      "Return the index of the maximum value in the array"
      [arr]
      (first
        (reduce (fn [[result maximum] [index value]] (if (> value maximum) [index value] [result maximum]))
                [0 (first arr)]
                (map vector (range) arr))))
    
    (defn inference
      "Run inference on a digit image"
      [digit]
      (let [scaled        (feature-scaling digit)
            input-buffer  (FloatBuffer/wrap scaled)
            inputs        {"input" (OnnxTensor/createTensor environment input-buffer (long-array [1 1 28 28]))}
            outputs       (.run mnist inputs)
            output-tensor (.get (.get outputs "output"))
            output-buffer (.getFloatBuffer output-tensor)
            result        (float-array 10)]
        (.get output-buffer result)
        (argmax result)))
    
    (defn digit->image
      "Convert a 28*28 byte array to JavaFX image"
      [data]
      (let [image  (java.awt.image.BufferedImage. 28 28 java.awt.image.BufferedImage/TYPE_BYTE_GRAY)
            raster (.getRaster image)
            out    (ByteArrayOutputStream.)]
        (.setDataElements raster 0 0 28 28 data)
        (javax.imageio.ImageIO/write image "png" out)
        (.flush out)
        (javafx.scene.image.Image. (ByteArrayInputStream. (.toByteArray out)))))
    
    (def app-state (atom {:index (rand-int 10000)}))
    
    (defn event-handler
      "Update application state with random index"
      [& args]
      (swap! app-state update :index (fn [_] (rand-int 10000))))
    
    (defn display-image
      "Image display for cljfx GUI"
      [{:keys [image]}]
      {:fx/type :image-view
       :fit-width 256
       :fit-height 256
       :image image})
    
    (defn next-button
      "Next button for cljfx GUI"
      [_]
      {:fx/type :button
       :text "Next"
       :on-action event-handler})
    
    (defn root
      "Main window for cljfx GUI"
      [{:keys [index]}]
      (let [digit  (read-digit index)
            result (inference digit)]
        {:fx/type :stage
         :showing true
         :title "MNIST"
         :scene {:fx/type :scene
                 :root {:fx/type :v-box
                        :padding 3
                        :spacing 5
                        :children [{:fx/type display-image :image (digit->image digit)}
                                   {:fx/type :h-box
                                    :padding 3
                                    :spacing 5
                                    :children [{:fx/type next-button}
                                               {:fx/type :label :text (str "result = " result)}]}]}}}))
    
    (def renderer
      "Renderer for cljfx GUI"
      (fx/create-renderer
       :middleware (fx/wrap-map-desc assoc :fx/type root)))
    
    (defn -main [& args]
      (Platform/setImplicitExit true)
      (fx/mount-renderer app-state renderer))

    Here is a screenshot of the inference GUI:

    inference GUI screenshot

    GPU usage

    For the MNIST example a CPU is sufficient for training and inference. For larger models one needs to use a GPU.

    In PyTorch one can use the .to method to move models and tensors to the GPU. For inference in Clojure, one needs to install onnxruntime_gpu instead of onnxruntime. Furthermore, one needs to select a GPU device when creating a session:

    ; ...
    (def device-id 0)
    (def options (OrtSession$SessionOptions.))
    (.addCUDA options device-id)
    (def environment (OrtEnvironment/getEnvironment))
    
    (def mnist (-> environment (.createSession "mnist.onnx" options)))
    ; ...

    Conclusion

    The ONNX runtime allows you to train models using PyTorch and deploy them in Clojure applications. There are also TensorFlow-Java bindings; however, they are more verbose. Hopefully the Clojure Tribuo bindings will eventually provide a more concise API for implementing and training ML models.

    When using byte arrays in Clojure to represent images, one needs to convert the values to unsigned bytes in order to obtain correct results. In the example we also used feature scaling for faster convergence during training.

    Also see github.com/wedesoft/clojure-onnx for source code.

    Enjoy!

    Permalink

    Release 2.1.0 of Clojure lib for AWS presigned URLs & requests

    With a little help, aws-simple-sign now supports PUT URLs and I have released a new version (2.1.0) of the library.

    I noticed a few issues with some of the examples in the README which are now also fixed.

    For those unfamiliar with aws-simple-sign, it generates presigned URLs for S3 objects and signs HTTP requests for AWS. It doesn’t require any dependencies (Java or otherwise) which is pretty handy for using it with Babashka.

    Enjoy 🚀

    Permalink

    Achieving High Throughput and Low Latency through Adaptive Asynchronous Transaction

    Effective Throughput

    In my previous post, I demonstrated that Datalevin performs complex queries faster than PostgreSQL. A common reaction is, "Oh, have you tested writing speed? I imagine that when there are indices for everything (as in the case of Datalevin), writing speed could be significantly slower." This post aims to address that write-speed concern.

    When discussing database write speed, throughput and latency are the two key performance metrics. Throughput is the number of transactions the system can process in a given amount of time—the higher the throughput, the better the system. Latency is the amount of time it takes to process a single transaction from start to finish. In durable databases, this refers to the time between when the transaction starts and when the data is committed and flushed to disk. The lower the latency, the better the system performs.

    Obviously, it is desirable to have both high throughput and low latency. However, achieving both simultaneously is often challenging.

    Throughput-Latency Trade-Off

    To improve throughput, database systems often use batching techniques (grouping many transactions together) to amortize the setup cost across multiple transactions. In particular, the number of expensive disk flush operations can be reduced dramatically through batching. However, waiting to accumulate a batch may increase the latency for individual transactions.

    Conversely, processing transactions immediately reduces the wait time for each transaction, lowering latency. Yet handling each transaction independently can prevent the system from fully utilizing available resources, potentially resulting in lower overall throughput.

    Asynchronous Transaction

    A well-implemented asynchronous transaction model has the potential to improve both throughput and latency simultaneously. Instead of processing a transaction immediately, an asynchronous transaction is placed in a queue—this provides an opportunity to batch it with other transactions, thereby improving overall throughput. Of course, it is still important to ensure that transactions do not wait too long in the queue, which could hurt latency.

    Some databases implement asynchronous commit. For example, PostgreSQL’s synchronous_commit parameter can be set to off. In this mode, the system returns transaction success as soon as the transaction is logically completed—before the generated WAL (Write-Ahead Log) records are actually written to disk. However, this implementation compromises durability and consistency guarantees. There is a short window between reporting transaction completion and when the transaction is truly committed. If the system crashes during this risk window, the changes made by that transaction may be lost. PostgreSQL allows users to control the duration of this risk window by setting wal_writer_delay.

    Adaptive Asynchronous Transaction

    Striking a balance between the advantages of asynchronous processing and the durability guarantees of transactions might seem challenging. In Datalevin, we aim to take advantage of asynchronous transactions without compromising durability. Moreover, having too many tuning parameters goes against our goal of excellent database usability—there is really no good reason for a user to worry about low-level implementation details.

    To achieve these goals, I have implemented an adaptive asynchronous transaction method in Datalevin.

    The idea is as follows. First, to maintain transaction durability, the asynchronous transaction method returns a Clojure future, which is only realized once the transaction is fully committed and the data is flushed to disk. The user can dereference the future to determine when the transaction is completed, or block until it is. Alternatively, the method optionally accepts a user-supplied callback function so that the user can be notified when the transaction is complete.
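
    From the caller's side it looks roughly like this (a hedged sketch: it assumes datalevin.core aliased as d, an open connection conn, and that the async entry point follows the familiar transact-async naming - check the Datalevin docs for the exact names and arities):

    (let [fut (d/transact-async conn [{:user/name "Alice"}])]
      ;; do other work while the write sits in the queue and gets batched ...
      @fut) ; dereferencing blocks until the data is durably flushed to disk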

    At first glance, these options might seem no different from a synchronous transaction: the user still has to wait until the commit is completed. However, the promise of asynchronous transactions is this: the wait will be shorter than with synchronous transactions. If we can achieve a shorter wait time, we have succeeded.

    How can we achieve a shorter wait time? As suggested above, one answer lies in batching transactions.

    The challenge now is to find an appropriate batch size that minimizes wait time. As argued earlier, setting a fixed batch size or preset time delay isn’t ideal. Instead, the batch size—or equivalently, the wait time in the queue—should depend on the system load.

    When the system is inundated with write requests, a larger batch size is useful for dispatching many requests at once, thus improving throughput. On the other hand, when the system is idle, a write request should be fulfilled immediately, thus minimizing latency. In other words, we want a dynamic batch size that adapts to system load.

    Another source of reduced wait time is improved concurrency. While waiting for an I/O operation to complete, other tasks can use the CPU. Proper design allows this overlapping of computation and I/O to effectively “hide” I/O wait times.

    The most direct way to achieve this is through a simple mechanism [1]: when processing a queued asynchronous write event, the system checks its event queue to see if there are other similar events waiting. If so, the system simply ignores the duplicate event and moves on to the next one. Otherwise, it submits a task to a worker thread pool to process the queued asynchronous transactions in a batch.
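
    To make the idea concrete, here is a toy sketch of that adaptive batching pattern (illustrative only, not Datalevin's actual code): writers enqueue work and get back a promise, and a worker drains whatever has accumulated and commits it as one batch, so batches grow under load and shrink to a single transaction when the system is idle.

    (ns adaptive-batch.sketch
      (:import [java.util ArrayList]
               [java.util.concurrent LinkedBlockingQueue]))

    (def ^:private queue (LinkedBlockingQueue.))

    (defn- commit-batch! [batch]
      ;; stand-in for the expensive durable commit (write + flush to disk)
      ;; whose cost batching amortises across transactions
      (Thread/sleep 5)
      (doseq [{:keys [tx-data result]} batch]
        (deliver result {:committed tx-data})))

    (defonce ^:private worker
      (future
        (loop []
          (let [first-item (.take queue)   ; block until there is work
                batch      (ArrayList.)]
            (.add batch first-item)
            (.drainTo queue batch)         ; grab everything else waiting
            (commit-batch! (vec batch)))
          (recur))))

    (defn transact-async
      "Queue tx-data; returns a promise realised once its batch is committed."
      [tx-data]
      (let [result (promise)]
        (.put queue {:tx-data tx-data :result result})
        result))

    Dereferencing the returned promise gives the same durability guarantee as a synchronous write, but an idle system commits a batch of one immediately, while a busy system commits many queued writes per flush.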

    This mechanism exhibits the adaptive behavior described above. Under a heavy load, the batch size increases; when the load is low, the batch size is small. The available system resources are effectively utilized to optimize both throughput and latency outcomes—without compromising durability.

    Metric

    In order to measure how well we have achieved our goal, we need a metric that combines throughput and latency. Combining these two metrics into a single number is challenging because they measure different aspects of performance and have different units. There may not be a perfect answer, but here are a few ideas.

    One simple composite is to compute a ratio such as Throughput / Latency. A higher value implies that the system is processing more transactions quickly (i.e., achieving high throughput and low latency). However, if throughput is measured in transactions per second and latency in seconds per transaction, the ratio has units of transactions squared per second squared, which makes its interpretation less obvious.

    One way to refine this formula is to introduce a target latency required by a particular use case. For example, for low-latency applications that expect sub-millisecond database transactions, the target latency might be 1 millisecond. For applications that can tolerate slightly higher latency, the target might be 10 milliseconds.

    With a target latency in place, we can define an effective “good throughput” for a given use case. One simple approach is to combine throughput with the fraction of operations finishing within that target latency. For instance, one might define a metric like this:

    Effective Throughput = Actual Throughput × (Target Latency / Actual Latency)
    

    This metric rewards systems that process many transactions and keep them within acceptable latency boundaries. If the latency exceeds the target, the effective throughput metric drops, reflecting a degraded user experience.
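
    As an illustrative calculation (numbers invented for clarity): a system sustaining 10,000 writes per second at an actual latency of 0.5 ms, measured against a 1 ms target, scores ET = 10,000 × (1 / 0.5) = 20,000, whereas the same throughput at an actual latency of 2 ms scores only 10,000 × (1 / 2) = 5,000.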

    Effective Throughput (ET) is a use-case–dependent metric that is simple to interpret. The higher the ET, the better the system. It is also easy to calculate, as its value is proportional to the target latency. For example, the ET for a 10-millisecond target is 10 times that for a 1-millisecond target. Thus, we really only need to calculate the ratio of throughput to latency (i.e., for a target latency of 1 unit) and then extrapolate to other target latencies. For the analysis below, we will adopt ET.

    Benchmark

    To address the write-speed concerns of Datalevin, we compare Datalevin with SQLite—the most widely used embedded database, renowned for its fast writes. Although Datalevin can be used in a client/server mode, for this benchmark we use it as an embedded database.

    Using the same dataset, we compare four transaction conditions: Datalevin Default, Datalevin Async, SQLite Default, and SQLite WAL. These are all fully durable transaction methods—a transaction is considered complete only after its data is fully flushed to disk.

    Database transaction benchmark performance is highly sensitive to hardware and the operating system; the numbers can vary widely between different machines. In this benchmark run, we used a machine with a 2016 Intel Core i7-6850K CPU @ 3.60GHz (6 cores), 64GB RAM, and a Samsung 860 EVO 1TB SSD. This machine performs around the middle of the pack in Today's CPU Benchmark, so it is reasonably representative.

    The code and detailed description of the benchmark can be found in our GitHub repository.

    Pure Write

    The benchmark has two tasks. The first is a pure write task, where each write consists of an entity (row) with two attributes (columns). Every 10,000 writes, we measure the throughput and latency. In addition to individual writes, we also measure performance when writing data in batches—testing batch sizes of 10, 100, and 1000.

    Using the Effective Throughput (ET) metric described above, the bar chart at the top of this post shows the pure write task results. (Note: the Y axis is logarithmic.)

    When the batch size is 1 (i.e., writing a single entity at a time), Datalevin Async achieves the best ET. It is several orders of magnitude higher than under other write conditions—not only is the raw throughput high (16,829.2 writes per second), the latency is low (0.1 milliseconds) as well. When writes are batched, the ET of Datalevin Async slightly decreases due to the latency increasing faster than throughput.

    Datalevin Default performance peaks at a batch size of 100; however, at that batch size, SQLite outperforms it.

    SQLite Default performs very poorly for individual writes, though SQLite WAL is much better. Both benefit significantly from increased batch sizes, with SQLite Default benefiting the most.

    In general, Datalevin's write performance is more stable and less sensitive to variations in batch size compared to SQLite. Because Datalevin performs indexing at transaction time, its write performance does not benefit as much from increased batch sizes.

    For an operational database, each transaction normally contains a small number of entities (or rows), as recommended by industry best practices [2]. In these online transaction processing (OLTP) workloads, Datalevin is expected to perform better than SQLite.

    For use cases involving the bulk loading of data, Datalevin provides init-db and fill-db functions that bypass the expensive transactional processes and are more appropriate.

    Mixed Read/Write

    After one million entities (rows) have been loaded, the second task alternates between reading a row and writing a row until one million reads and one million writes have been performed. For this task, we report the results using Linux’s time command.

    The first chart shows wallclock time, while the second shows user and system CPU time.

    Mixed Read/Write Wallclock Time
    Mixed Read/Write CPU Time

    For the mixed read/write task, Datalevin Default is much faster than SQLite Default, and Datalevin Async is much faster than SQLite WAL, while SQLite WAL outperforms Datalevin Default.

    Regarding CPU time, the differences among the various conditions are small, indicating that the underlying amount of work is not hugely different.

    Notice that in the three synchronous conditions—Datalevin Default, SQLite Default, and SQLite WAL—most of the time is spent waiting for I/O, with CPU times being relatively small compared to the wallclock time. Datalevin Async is different; its total CPU time (227.89 seconds) is actually greater than its wallclock time (111.04 seconds), indicating effective utilization of multicore processing and an apparent hiding of I/O wait times. This result confirms our hypothesized advantage of asynchronous transactions.

    Conclusion

    We can now answer the question about Datalevin's write speed: it performs better than SQLite under OLTP workloads. While the default synchronous write mode performs at a respectable level, Datalevin truly shines with asynchronous transactions, achieving both high throughput and low latency without compromising transaction durability.

    Reference

    [1] Nathan Marz, "2.5x Better Performance: Rama vs. MongoDB and Cassandra", April 2024. (https://blog.redplanetlabs.com/2024/04/25/better-performance-rama-vs-mongodb-and-cassandra/)

    [2] Gray, J., & Reuter, A. Transaction Processing: Concepts and Techniques. Morgan Kaufmann Publishers, 1993.

    Permalink

    Copyright © 2009, Planet Clojure. No rights reserved.
    Planet Clojure is maintained by Baishampayan Ghose.
    Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
    Theme by Brajeshwar.