Why work at Nu?

If you’re wondering what makes top-quality professionals choose to work at Nu, the answer lies in a unique environment that combines customer-centric innovation, a deep respect for individuals, and a dedication to embracing diverse perspectives. 

In this blog post, we’ll explore what makes Nubank stand out by sharing the personal experiences of three Nubankers—Eduardo, João and Vanessa. Each of them found more than just a job here; they discovered a place where meaningful work converges with career growth, collaboration with world-class peers, and a relentless focus on driving positive change. 

Keep reading to learn how these testimonials illustrate the values and opportunities that define Nubank as a leading destination for those seeking both professional fulfillment and a lasting impact.

Eduardo’s leap of faith, with bold vision

When Eduardo Nince, Senior Product Ops Lead, first joined Nubank, he admits it was “a leap into the unknown.” Yet in his words, “I couldn’t have chosen a better place to work.” Drawn by “an audacious idea—that a small startup could disrupt the big banks in Brazil,” Eduardo quickly discovered that Nubank’s passion for innovation is more than a tagline; it’s a driving force that continues to reshape financial services and empower over 100 million customers.

Central to Eduardo’s experience is Nubank’s customer-obsessed culture. As he puts it, “The definition of goals is done with a focus on the customer,” whether that’s aiming to pay insurance claims in a single day or personally tracking real home-assistance visits. This top-down alignment ensures teams remain close to those they serve, turning bold ideas into products that truly make a difference.

Eduardo also emphasizes Nubank’s respect for its people. He contrasts it with past roles where weekend work was the default: “Nubank always tries to understand how the team is feeling, to change the scope if needed”. This empathy flows through daily support systems, from lightning-fast IT help to timely leadership responses. Together, these interactions form a culture where high expectations coexist with genuine care.

Rounding out Eduardo’s story is Nubank’s diverse, high-caliber talent. He highlights teammates from around the world, including industry veterans from tech giants like Google and Microsoft: “Only at Nubank would I be able to work with a guy like Fausto Ibarra, VP and General Manager, Insurance, at Nu”. Such exposure, combined with continuous feedback and development, makes Nubank not just an exceptional place to work, but also a powerful career reference that stands out—both in Brazil and abroad.

João’s path to the Purple Future

When João Augusto Lanjoni, Software Engineer, announced on LinkedIn that he was “very happy and proud to share the start of a new journey by joining Nubank,” he was stepping into what he calls “a unique opportunity and a privilege” to help transform financial services. After presenting at one of Nubank’s Software Engineering Meetups—and experiencing firsthand the passion surrounding functional programming—he found himself drawn to a culture driven by innovation and relentless customer focus. Today, João channels that same energy into building solutions that empower millions of people to take control of their finances.

A key aspect of João’s day-to-day work lies in Nubank’s unwavering customer-centric mindset. Although some of his projects happen behind the scenes—like those on Platform and Payments teams—João sees how every line of code ultimately serves end-users. “The customer doesn’t directly see what these teams do, but… if Nu is expanding more and more, it’s thanks to the platform team that makes it all happen,” he explains. By constantly refining performance, usability, and accessibility, João and his colleagues ensure that even the most invisible parts of the system deliver a tangible impact for customers.

Beyond the technology, João highlights how respect for people shapes his experience at Nubank. Rather than pushing engineers to work weekends or ignore burnout risks, leadership provides freedom to discuss challenges openly. “Nu cares a lot about the quality of life of its engineers,” he shares, pointing to the company’s generous benefits, long-term career planning, and extended paternity leave policy as proof that employee well-being remains a top priority.

João also finds inspiration in Nubank’s world-class talent. A longtime Clojure enthusiast, he regularly collaborates with top minds—including key contributors to Clojure and Datomic—who are just “a Slack message away.” While it was daunting at first, working alongside such experts soon became a motivating force. He credits their mentorship for improving his skills and expanding his career horizons, reminding him that even at a high level, everyone is approachable and invested in each other’s growth.

In João’s view, Nubank’s rapid growth and technical complexity make it an unparalleled place for personal and career development. Handling the demands of more than 100 million customers exposes engineers to challenges that few other companies can offer. “Nu is a big tech reference”, he notes, underscoring the lasting impact of working here. For João, building the Purple Future means not only helping millions manage their finances more confidently—but also growing into a more skilled, more driven professional every step of the way.

Vanessa’s Talent Acquisition journey at Nu

When Vanessa Paladini, Head of Talent Acquisition, joined Nubank in 2020, in an ambitious leap that she calls her “best decision ever,” many of the key moments in Nu’s history had not yet happened. Nearly five years later, she has witnessed hyper-growth, an IPO, the launch of multiple products, and mergers & acquisitions—all while helping to hire and develop countless Nubankers. Vanessa remains inspired by how this bold environment continually surpasses expectations and reshapes the future of financial services, proving that taking risks can lead to extraordinary rewards.

Central to Vanessa’s role is attracting and nurturing top talent—all in service of delivering exceptional products and experiences for Nubank customers. “We need to ensure that we have the best people to build complete products, with different visions,” she says. Her team’s work is guided by the same customer-centric mindset driving every corner of the company. Whether scouting AI specialists abroad or advising different business units on team structures, Vanessa and her colleagues focus on aligning skilled professionals with Nubank’s mission to innovate for hundreds of millions of users.

Equally important is how Nubank empowers its people. For Vanessa, it’s a place that values those willing to “take risks and have the courage to take on big challenges”. She points to her own trajectory—hiring executives, pivoting to support various areas, and ultimately leading the Talent Acquisition team—as an example of how Nu encourages growth at every stage. “The more willing you are and the more flexible you are to change, the more you’ll be recognized,” she adds, noting that the company’s fast pace comes with abundant opportunities to learn, evolve, and make a real impact.

Vanessa also highlights the density of world-class professionals at Nubank—many of whom she personally helped bring on board. “We have a very large intellectual capital of talent,” she explains, emphasizing that diverse viewpoints and constructive debate fuel innovation. Her team, in turn, constantly challenges hiring processes to seek out less obvious backgrounds and cultivate fresh thinking. “Innovation goes through this diversity”, she says, stressing that including people with unique experiences drives better decisions for the company and its customers alike.

Ultimately, Vanessa believes Nubank’s reputation for fostering top talent extends far beyond its offices. She sees it as an exceptional environment—one where professionals can shape the future of financial services while “making the extraordinary happen, literally!” Whether you’re an engineer, a designer, or part of her own Talent Acquisition team, Vanessa’s story demonstrates that coming to Nubank means embracing change, fueling personal growth, and taking part in a mission that resonates on a global scale.

Conclusion

As seen through the stories of João, Eduardo, and Vanessa, Nubank isn’t just a financial services company—it’s a thriving community built on customer focus, mutual respect, and an unwavering commitment to personal and professional development. 

By fostering diverse teams and encouraging each individual to tap into a genuine sense of purpose, Nubank offers a launchpad for meaningful innovation and continuous self-improvement. 

For those wondering “Why should I choose to work at Nubank?”, these testimonials speak volumes: it’s a place where passion, possibility, and transformative ideas come together to shape both careers and the future of financial services.

The post Why work at Nu? appeared first on Building Nubank.


Zillions of one-line functions

I was thankful for jump-to-definition and jump-to-references, each bound to a keystroke in my IDE. But I was reaching the limits of my mental stack. I must have been 10 calls deep before it became hard to keep track of where I was. After about 20, I realized I should have kept notes. I basically had to start over.

I eventually fixed the bug in this existing codebase, but I was exhausted. Tracing the call stack to understand what was going on and where the bug was taxed my mental capacity. When I looked back over the code I traced, from the entry point to the final call where the bug was, I was a little annoyed with what I found—many useless one-liners.

Many of the functions were one-liners. Many of those did nothing more than call two functions on their argument like this:

(defn- foo [a]
  (baz (bar a)))

Another set of the one-liners added a conditional check, like this:

(defn- maybe-foo [a]
  (when a (foo a)))

And then there was a final set that recombined multiple arguments:

(defn- foo [a b]
  (bar (transform b) a))

This isn’t an exhaustive set. I bring up these three types merely to give a flavor of what I was looking at.

Now, you might also be surprised to learn that many of these one-line functions were only used in one place each. Luckily, they were mostly defined as private (using defn-), so they weren’t advertising to any code outside the namespace to use them.

It got me thinking about these one-liners. In this case, they definitely made the code harder to read. Seriously, does this function add anything:

(defn- datetime? [x]
  (= "datetime" (type-of x)))

You might say, “yes,” that it is giving a cleaner interface to the value. Why muddy your code with “magic strings” and have to “manually” call type-of each time? I know how good it feels to pull out these little helpers, especially when I’m in flow, and especially when I’m writing the code. When I’m in flow, I can mentally hold a much richer graph of function calls (each with their own complex contracts). Each little helper captures some tiny sliver of meaning—and the graph feels more meaningful because of it.

But that mental graph fades quickly. When I return to the code, I cannot see what’s going on anymore. I have to do a depth-first search, often dozens of calls deep, to figure out what’s going on. In general, these one-liners are not worth it. They’re great for writeability but not for readability. For example, datetime? is really great for writing a one-line filter:

(filter datetime? columns)

It feels so good when we write these lines that we forget to take a look at the bigger picture.

Writing my book about domain modeling has given me a much deeper appreciation of why these functions don’t work. In theory, they are good. They’re simple. They do one thing. They’re short. They’re each easy to understand. But the real killer for them is that they don’t help recover the design. In fact, they usually obscure it.

In my new drafts, I’ve been working with the idea that an important function of software design is to make sure the model is evident in the code. This tiny example might not seem to obscure the public function type-of, but consider that there are three levels of indirection between the entry point and the final call to type-of. You wouldn’t know it’s an important function without digging through the call tree to the bottom.

But this doesn’t damn all single-line functions. Being able to do stuff on a single line is also a signal that you’ve got the model right. Things just easily compose together. So when does it make sense?

Well, like all design decisions, the answer is really complex and context-dependent. I can’t give you hard and fast rules. But there are signals to look for.

Is the function non-obvious? That is, could the implementation be meaningfully different? Is this part of your business’s secret sauce? For instance, this one-line function is probably worth keeping:

(defn coffee-price [coffee]
  (+ (size-price (size coffee)) (add-ins-price (add-ins coffee))))

We can imagine a different coffee shop calculating the price of their coffees differently. This function captures what our company means by the price of a coffee. It’s probably worth having that written down in one place and given a name.

Is the function used in multiple places? If it is, it’s probably an important concept that belongs to your domain. Instead of obscuring things, it’s actually illuminating meaning.

Does the name of the function add meaning? The name of the function anchors the function in human meaning. If the function simply restates what the body of the function states (like datetime? does), it’s probably not worth it. Even the function type-of might have problems in this area:

(defn type-of [x]
  (:type-of x))

Uh-oh. Not much meaning added there, is there? That one is probably not worth it, either.
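To make the trade-off concrete, here is a sketch comparing the helper with a call site that simply inlines it. The definitions and data shape below are hypothetical stand-ins mirroring the post’s examples:

```clojure
;; Minimal stand-ins for the post's functions (hypothetical data shape):
(defn type-of [x] (:type-of x))
(defn datetime? [x] (= "datetime" (type-of x)))

(def columns [{:name :created-at :type-of "datetime"}
              {:name :amount :type-of "number"}])

;; With the helper:
(filter datetime? columns)
;; => ({:name :created-at, :type-of "datetime"})

;; Inlined, the call site says the same thing without the extra hop:
(filter #(= "datetime" (type-of %)) columns)
```

Both read fine at the call site; the difference is that the inlined version costs the reader no jump-to-definition.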

Besides not restating the body, the name should also be phrased in domain terms. If you can’t come up with a domain term (hint: ask why the function should exist), it’s probably not a good function. For example, it might be fun to write a function like this:

(defn ->latte [coffee]
  (add-add-in coffee :milk))

It converts a coffee to a latte. But wait, is that actually a thing? Do people come in asking to convert coffees to lattes? Or do they just order a latte? The barista knows, and they say, no conversion. That’s not a thing. But latte is the most popular beverage, so it would be nice to have a shortcut in the UI to add a latte to an order. Bam! That’s meaningful:

(defn make-latte []
  (add-add-in (make-coffee) :milk))

Still a one-liner, but meaningful since it is helpful to the barista.

I’m reminded of John Ousterhout’s “deep” vs “shallow” modules. He also rails against having lots of tiny functions that do very little. And his idea has the ring of truth. But this rule of thumb felt a little too style-focused for me. It focused too much on the code (size of the interface vs complexity of the implementation) and not enough on the domain (does the interface make the model recoverable?). I am seeking a similar law that relates correctly designed code to the domain.

I’m still reeling from the spelunking through code, trying to make sense of these functions. While small, composable pieces usually signal good design, too many layers of indirection can harm code readability. We need to choose layers that really matter—the ones that illuminate the model. Is this just the way codebases get as they are refactored over time? I don’t think so. I hope that over time, codebases become clearer as the programmers learn to express the model better in code. All we need to do is to step back and see a bigger picture.


Create a Server Driven CLI from your REST API

This article is written by Rahul Dé, a VP of Site Reliability Engineering at Citi and creator/maintainer of popular tools like babashka, bob, and now climate. All opinions expressed are his own.

APIs, specifically REST APIs, are everywhere, and OpenAPI is pretty much a standard. Accessing them via various means is a fairly regular thing that a lot of us do often, and when it comes to the CLI (Command Line Interface), languages like Go and Rust are quite popular choices for building one. These languages are mostly statically typed in nature, favouring a closed-world approach of knowing all the types and paths at compile time to be able to produce a lean and efficient binary for ease of deployment and use.

Like with every engineering choice, there are trade-offs. The one here is the loss of dynamism: we see a lot of bespoke tooling in these languages doing fundamentally the same thing, namely making HTTP calls and giving users a better experience than making those calls themselves. The need to know all the types and paths beforehand causes these maintenance issues:

  • Spec duplication: the paths, the schemas etc. need to be replicated on the client side. For example, when using the popular Cobra lib for Go, one must tell it all the possible types beforehand.
  • Tighter coupling of client and server: as we have to know each of the paths and methods that a server expects, we essentially replicate the same thing when making the requests, creating a tight coupling which is susceptible to breakage when the API changes. An API is a product with its own versioning; kubectl, for example, only supports certain versions of Kubernetes, and similarly the podman and docker CLIs.
  • Servers can't influence the client: ironically, given the previous point, since we have replicated the server spec on the client side we effectively have a split brain: changes on the server, like needing a new parameter, need to be copied over to the client.

All of this put together increases the maintenance overhead, and it's especially true for complex tooling like kubectl.

Using standards

I work primarily on the infra side of this, namely Platform and Site Reliability Engineering, which means other developers are my users, and this cascading effect of an API breakage is quite painful. There are ways to work around this issue, and from my experience, being spec-first seems to offer the best balance of development and maintenance velocities.

I am quite a big fan of being spec-first, mainly for the following reasons:

  • The API spec is the single source of truth: this is what your users see, not your code. Make it a first-class citizen like your users; the code should follow the spec and not the other way round.
  • It keeps all the servers and clients in sync automatically, with less breakage.
  • It keeps a nice separation between the business logic (the API handler code) and the infra, thereby allowing developers to focus on what's important.

Another project of mine, Bob, can be seen as an example of spec-first design. All its tooling follows that idea, and its CLI inspired Climate. A lot of Bob uses Clojure, a language that I cherish and whose ideas make me think better in every other place too.

Codegen

Although codegen is one of the ways to be spec-first, I personally don't subscribe to the approach of generating code:

  • Introduces another build step adding complexity and more layers of debugging.
  • Makes the build more fragile in keeping up with tooling and language changes.
  • The generated code comes with its own opinions and is often harder to change/mould to our needs.
  • It is static code in the end; it can't do much at runtime.

Prior art

  • restish: Inspired some of the ideas behind Climate. It is a project with the different goal of being a fully automatic CLI for an OpenAPI REST API, and it is a bit hard to use as a lib.
  • navi: A server-side, spec-first library I wrote for Clojure, which inspired the handler mechanism in Climate.

What is Climate?

Taking all of the above into consideration, along with the fact that Go is one of the most widely used languages for CLIs, Climate was built to address these issues.

As the name implies, it's your mate or sidekick when building CLIs in Go, with the intention of:

  • Keeping the REST API boilerplate away from you.
  • Keeping the CLI code always in sync with the changes on the server.
  • Bootstrapping at runtime without any code changes.
  • Decoupling you from the API machinery, allowing you to focus on just the handlers, the business logic, and things that may not be part of the server calls.
  • Doing just enough to take the machinery out and no more: making the calls for you, for instance, is business logic.

How does it work?

Every OpenAPI3 schema consists of one or more Operations, each with an OperationId. An Operation is a combination of an HTTP path, a method, and some parameters.

Overall, Climate works with these operations at its core. It:

  • Parses them from the YAML or JSON file.
  • Transforms each one into a corresponding Cobra command, using hints from the server.
  • Transforms each of the parameters into a Flag of the right type.
  • Builds a grouped Cobra command tree and attaches it to the root command.
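The first step above can be sketched with nothing but the standard library. This is not climate's actual code; it is a hypothetical, minimal subset of an OpenAPI document and a small parser, just enough to show the (path, method, operationId) triples that the command tree is built from:

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Hypothetical, minimal subset of an OpenAPI v3 document. climate itself
// parses full YAML/JSON specs; this only sketches the first step.
const spec = `{
  "paths": {
    "/add/{n1}/{n2}": {
      "get":  {"operationId": "AddGet"},
      "post": {"operationId": "AddPost"}
    },
    "/health": {
      "get": {"operationId": "HealthCheck"}
    }
  }
}`

type operation struct {
	OperationID string `json:"operationId"`
}

// listOps returns one "METHOD PATH -> operationId" line per operation,
// sorted for stable output. Each such pair is what becomes a Cobra command.
func listOps(raw string) []string {
	var doc struct {
		Paths map[string]map[string]operation `json:"paths"`
	}
	if err := json.Unmarshal([]byte(raw), &doc); err != nil {
		panic(err)
	}
	var ops []string
	for path, methods := range doc.Paths {
		for method, op := range methods {
			ops = append(ops, fmt.Sprintf("%s %s -> %s", method, path, op.OperationID))
		}
	}
	sort.Strings(ops)
	return ops
}

func main() {
	for _, op := range listOps(spec) {
		fmt.Println(op)
	}
}
```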

Servers influencing the CLI

Climate allows the server to influence the CLI behaviour by using OpenAPI's extensions. This is the secret of Climate's dynamism. Influenced by some of the ideas behind restish, it uses the following extensions as of now:

  • x-cli-aliases: A list of strings which would be used as the alternate names for an operation.
  • x-cli-group: A string to allow grouping subcommands together. All operations in the same group would become subcommands in that group name.
  • x-cli-hidden: A boolean to hide the operation from the CLI menu. Same behaviour as a hidden Cobra command: it's still present and expects a handler.
  • x-cli-ignored: A boolean to tell climate to omit the operation completely.
  • x-cli-name: A string to specify a different name. Applies to operations and request bodies as of now.

Type checking

As of now, only the primitive types are supported:

  • boolean
  • integer
  • number
  • string

More support for types like collections and composite types is planned. These are subject to the limitations of what Cobra can do out of the box and what makes sense from a CLI perspective. There are sensible defaults: for request bodies, the implicit type is string, which handles most cases. These types are converted to Flags with the appropriate type-checking functions, and values are correctly coerced, or errors reported, when invoked.
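The coercion step can be sketched with the standard library's strconv package. This is not climate's implementation, just the shape of the idea: each supported OpenAPI primitive maps to a parse function, and a bad value surfaces as an error at invocation time, which is the same effect the typed Cobra flags give:

```go
package main

import (
	"fmt"
	"strconv"
)

// coerce maps an OpenAPI primitive type name to a parsed Go value,
// returning an error for unparsable input or unknown types.
func coerce(openapiType, raw string) (any, error) {
	switch openapiType {
	case "boolean":
		return strconv.ParseBool(raw)
	case "integer":
		return strconv.Atoi(raw)
	case "number":
		return strconv.ParseFloat(raw, 64)
	case "string":
		return raw, nil
	default:
		return nil, fmt.Errorf("unsupported type %q", openapiType)
	}
}

func main() {
	v, err := coerce("integer", "42")
	fmt.Println(v, err) // 42 <nil>

	_, err = coerce("integer", "foo")
	fmt.Println(err != nil) // true
}
```

This mirrors the sample session further below, where `--n2 foo` fails with a strconv.ParseInt error.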

Check out Wendy as a proper example of a project built with Climate.

Usage

This assumes an installation of Go 1.23+ is available.

go get github.com/lispyclouds/climate

Given a spec:

openapi: "3.0.0"

info:
  title: My calculator
  version: "0.1.0"
  description: My awesome calc!

paths:
  "/add/{n1}/{n2}":
    get:
      operationId: AddGet
      summary: Adds two numbers
      x-cli-name: add-get
      x-cli-group: ops
      x-cli-aliases:
        - ag

      parameters:
        - name: n1
          required: true
          in: path
          description: The first number
          schema:
            type: integer
        - name: n2
          required: true
          in: path
          description: The second number
          schema:
            type: integer
    post:
      operationId: AddPost
      summary: Adds two numbers via POST
      x-cli-name: add-post
      x-cli-group: ops
      x-cli-aliases:
        - ap

      requestBody:
        description: The numbers map
        required: true
        x-cli-name: nmap
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/NumbersMap"
  "/health":
    get:
      operationId: HealthCheck
      summary: Returns Ok if all is well
      x-cli-name: ping
  "/meta":
    get:
      operationId: GetMeta
      summary: Returns meta
      x-cli-ignored: true
  "/info":
    get:
      operationId: GetInfo
      summary: Returns info
      x-cli-group: info

      parameters:
        - name: p1
          required: true
          in: path
          description: The first param
          schema:
            type: integer
        - name: p2
          required: true
          in: query
          description: The second param
          schema:
            type: string
        - name: p3
          required: true
          in: header
          description: The third param
          schema:
            type: number
        - name: p4
          required: true
          in: cookie
          description: The fourth param
          schema:
            type: boolean

      requestBody:
        description: The requestBody
        required: true
        x-cli-name: req-body

components:
  schemas:
    NumbersMap:
      type: object
      required:
        - n1
        - n2
      properties:
        n1:
          type: integer
          description: The first number
        n2:
          type: integer
          description: The second number

Load the spec:

model, err := climate.LoadFileV3("api.yaml") // or climate.LoadV3 with []byte

Define a cobra root command:

rootCmd := &cobra.Command{
    Use:   "calc",
    Short: "My Calc",
    Long:  "My Calc powered by OpenAPI",
}

Handlers and Handler Data:

Define one or more handler functions of the following signature:

func handler(opts *cobra.Command, args []string, data climate.HandlerData) error {
    slog.Info("called!", "data", fmt.Sprintf("%+v", data))
    err := doSomethingUseful(data)

    return err
}

Handler Data

As of now, each handler is called with the Cobra command it was invoked with, the args, and an extra climate.HandlerData.

This can be used to query the params from the command, mostly in a type-safe manner:

// to get all the int path params
for _, param := range data.PathParams {
    if param.Type == climate.Integer {
        value, _ := opts.Flags().GetInt(param.Name)
        _ = value // use the typed value here
    }
}

Define the handlers for the necessary operations. These map to the operationId field of each operation:

handlers := map[string]Handler{
    "AddGet":      handler,
    "AddPost":     handler,
    "HealthCheck": handler,
    "GetInfo":     handler,
}

Bootstrap the root command:

err := climate.BootstrapV3(rootCmd, *model, handlers)

Continue adding more commands and/or execute:

// add more commands not from the spec

rootCmd.Execute()

Sample output:

$ go run main.go --help
My Calc powered by OpenAPI

Usage:
  calc [command]

Available Commands:
  completion  Generate the autocompletion script for the specified shell
  help        Help about any command
  info        Operations on info
  ops         Operations on ops
  ping        Returns Ok if all is well

Flags:
  -h, --help   help for calc

Use "calc [command] --help" for more information about a command.

$ go run main.go ops --help
Operations on ops

Usage:
  calc ops [command]

Available Commands:
  add-get     Adds two numbers
  add-post    Adds two numbers via POST

Flags:
  -h, --help   help for ops

Use "calc ops [command] --help" for more information about a command.

$ go run main.go ops add-get --help
Adds two numbers

Usage:
  calc ops add-get [flags]

Aliases:
  add-get, ag

Flags:
  -h, --help     help for add-get
      --n1 int   The first number
      --n2 int   The second number

$ go run main.go ops add-get --n1 1 --n2 foo
Error: invalid argument "foo" for "--n2" flag: strconv.ParseInt: parsing "foo": invalid syntax
Usage:
  calc ops add-get [flags]

Aliases:
  add-get, ag

Flags:
  -h, --help     help for add-get
      --n1 int   The first number
      --n2 int   The second number

$ go run main.go ops add-get --n1 1 --n2 2
2024/12/14 12:53:32 INFO called! data="{Method:get Path:/add/{n1}/{n2}}"

Conclusion

Climate results from my experiences of being at the confluence of many teams developing various tools, which proved the need to keep specifications at the centre of things. With this, it hopefully inspires others to adopt such approaches; even with static tooling like Go, it's still possible to make flexible things which keep the users at the forefront.


On Extensibility

by Laurence Chen

For a long time, I had a misunderstanding about Clojure: I always thought that the extensibility Clojure provides was just about macros. Not only that, but many articles I read about Lisp emphasized how Lisp’s macros were far more advanced than those in other languages while also extolling the benefits of macros—how useful it is for a language to be extensible. However, what deeply puzzled me was that the Clojure community takes a conservative approach to using macros. Did this mean that Clojure was less extensible than other Lisp languages?

Later, I realized that my misunderstanding had two aspects:

First, Clojure has at least two extension mechanisms: Macros and Protocols. When using the Clojure language, we often do not clearly distinguish between core syntax and core library functions. In other words, if the extensible parts are core library functions, the experience for future users is almost the same as extending core syntax. More specifically, many parts of Clojure’s core library are constructed using Protocol/Interface syntax. This syntax serves as the predefined extension points in the core library, meaning that core library functionality can also be extended.

Second, I used to mix up “extensibility” and “extensibility mechanisms.” I always focused on “Oh, I discovered another language, database, or software that supports plugins. That’s great! It has an extensibility mechanism, so it can be extended.” However, an extensibility mechanism is just a means to achieve extensibility. But what exactly is extensibility? What problems does it solve, and what benefits does it bring? I never really thought through these questions.

Here is a proposed definition of extensibility:

Given a module or a segment of code developed based on certain external assumptions, when these external assumptions change, without modifying the existing code, the behavior of the module can be enhanced or altered through a predefined mechanism, allowing it to adapt to new requirements.

Extensibility Definition

According to this definition, the benefits of extensibility are:

  • Cost savings. If no modifications are needed, there is no need to worry about breaking existing functionality or regression issues.
  • Reduced complexity. The ability to extend or modify a module’s behavior through predefined mechanisms eliminates the need to copy entire modules and make modifications, saving a significant amount of code.
  • Empowering users. Even though the module has already been developed, it can still be modified. This is particularly useful when module developers and users belong to different organizations or teams, as it provides great flexibility, allowing users to self-serve.

Next, let’s look at some real-world examples to better understand extensibility in practice.

Macro

Let’s first examine some common built-in Macros:

  • ->: Transforms linear syntax into nested parentheses, effectively introducing a new DSL (domain-specific language).
  • comment: Ignores a block of code.
  • with-open: Grants a block of code access to a resource that is automatically closed when leaving the block.
  • with-redefs: Temporarily redefines global variables within a block of code.
  • with-in-str: Temporarily binds *in* to a specific StringReader within a block of code.
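For instance, with-redefs makes the "temporarily redefines" behaviour easy to see. Here fetch-price is a hypothetical stand-in for a function that calls out to a remote service:

```clojure
(defn fetch-price [] 100) ; imagine this calls a remote service

;; Inside the block, the var is rebound:
(with-redefs [fetch-price (fn [] 42)]
  (fetch-price))
;; => 42

;; Outside the block, the original definition is back:
(fetch-price)
;; => 100
```

This is why with-redefs is popular in tests: the rebinding is scoped to the block and undone afterwards.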

Macros can be roughly categorized into two types:

  • Non with-style Macros
  • With-style Macros

Non with-style Macros

These Macros typically accept arguments in the form of & body, internally parsing body and transforming its evaluation strategy.

For example, consider core.async/go:

(go
  (println "Before")
  (<! (timeout 1000))
  (println "After"))

The go Macro transforms body into a state machine to execute it asynchronously. It doesn’t just wrap a block of code but actually rewrites it.

The code passed as an argument to these Macros often introduces new syntax or semantics, effectively extending the Clojure language itself by adding new DSLs.

With-style Macros

In contrast, some Macros accept arguments in the form of a b c & body and internally reference ~@body. These Macros do not dissect the statements inside body; instead, they inject additional processing before or after body executes. Because they preserve the original structure of body, they are particularly suited for resource management, context setting, and similar scenarios.
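To make the shape concrete, here is a minimal with-style macro of my own, a hypothetical with-timing (not from any library): it never dissects body, it only injects work before and after it.

```clojure
;; A hypothetical with-style macro: wraps body without inspecting it.
(defmacro with-timing [label & body]
  `(let [start#  (System/nanoTime)
         result# (do ~@body)]
     (println ~label "took" (/ (- (System/nanoTime) start#) 1e6) "ms")
     result#))

;; body runs unchanged; timing is injected around it
(with-timing "sum"
  (reduce + (range 1000)))
;; prints the elapsed time and returns 499500
```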

The embedkit library contains an inspiring with-style Macro that treats authentication state as a form of context.

  • with-refresh-auth: Refresh the authentication state and retry the request if the request fails with a 401 error.
(defmacro with-refresh-auth [conn & body]
  `((fn retry# [conn# retries#]
      (Thread/sleep (* 1000 retries# retries#)) ; backoff
      (try
        ~@body
        (catch clojure.lang.ExceptionInfo e#
          (if (and (= 401 (:status (ex-data e#)))
                   (instance? clojure.lang.Atom conn#)
                   (< retries# 4))
            (do
              (reset! conn# (connect @conn#))
              (retry# conn# (inc retries#)))
            (throw e#)))))
    ~conn
    0))

;; Use with-refresh-auth to wrap the do-request
(defn mb-request [verb conn path opts]
  (with-refresh-auth conn
    (do-request (request conn verb path opts))))

Protocol

Many Clojurians, including myself, have struggled to grasp when to use Protocols—they can feel abstract and difficult to apply. The best explanation I’ve found is from ask.clojure.org:

Protocol functions are better as SPI to hook in implementations than in API as functions consumers call directly.

If, like me, you don’t immediately grasp what an SPI (service provider interface) is, studying the buddy-auth library can help.

buddy-auth is a commonly used authentication library for web applications. Users can extend it by adding new authentication mechanisms without modifying its source code.

To define an authentication mechanism, one must implement the IAuthentication Protocol using reify.

For example, http-basic-backend is a basic authentication mechanism that implements IAuthentication:

(defn http-basic-backend
  [& [{:keys [realm authfn unauthorized-handler] :or {realm "Buddy Auth"}}]]
  {:pre [(ifn? authfn)]}
  (reify
    proto/IAuthentication
    (-parse [_ request]
      (parse-header request))
    (-authenticate [_ request data]
      (authfn request data))
 ...
)

When using buddy-auth, the wrap-authentication middleware is added to the Ring handler. This middleware ultimately calls proto/-parse and proto/-authenticate.
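This is exactly the extension point: plugging in a new mechanism only requires another reify of the same Protocol. Below is a sketch of a hypothetical token-based backend; the names header-token-backend, "x-api-token", and :user are mine for illustration, not part of buddy-auth:

```clojure
(require '[buddy.auth.protocols :as proto])

;; A user-defined authentication mechanism: no buddy-auth source modified.
(defn header-token-backend [valid-token]
  (reify
    proto/IAuthentication
    (-parse [_ request]
      ;; extract the candidate credential from the request
      (get-in request [:headers "x-api-token"]))
    (-authenticate [_ request token]
      ;; return an identity on success, nil on failure
      (when (= token valid-token)
        {:user :api-client}))))

;; (wrap-authentication handler (header-token-backend "secret"))
```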

Strategy Pattern

Looking at this design, you might think, “Isn’t this just the Strategy Design Pattern?” Indeed, in this pattern, Strategy corresponds to the Service Provider Interface, allowing authentication mechanisms to be swapped in buddy-auth without modifying any code.

Summary

| Extensibility Mechanism | Macro (non with-style) | Macro (with-style) | Protocol |
|---|---|---|---|
| Extends | Clojure language itself | Code passed to the Macro | Modules |
| Behavior modification mechanism | Parses and rewrites body | Wraps body | Design and replace |
| Predefined extension points | None | None | Requires Protocols in the module design |
| Degree of extensibility | High | Low | Medium (only specified parts are replaceable) |

If we were to name these three mechanisms:

  • Non with-style Macros: syntax-rewriting extension
  • With-style Macros: contextual extension
  • Protocols: replace-based extension

Replace-based extension is relatively easy to grasp and common across programming languages. Contextual extension, while involving meta-programming, remains accessible. Syntax-rewriting extension, on the other hand, fundamentally alters the language itself, making it the domain of compilation experts.

Clojure provides excellent extensibility, offering diverse mechanisms that allow extensions at the language, code block, core library, and user-defined module levels. If you want to elevate your programming skills, consider how to design software for extensibility—it will make your software feel more Clojurish.

Note:

  • In this article, you can consider “module” and “library” as synonymous. To me, a library is simply a published module.
  • “Interface” and “Protocol” can also be regarded as synonymous. While there are subtle differences between them, there is no distinction in their usage within this article.

    Tracking memory usage with clj-memory-meter.trace

    Automatic memory management is probably the JVM's biggest selling point. You don't need to remember to clean up the stuff you've allocated; the garbage collector will take care of it for you. You can "leak" memory if you leave live references to allocated objects (e.g. store objects in static fields), making them unreclaimable by the GC. Thus, as long as you don't do that, you can treat memory as infinite and never run out of it. Right?

    Definitely not. Memory is a limited resource, and you can't fill it with more data than its capacity allows if all of that data is needed at the same time. So it becomes crucial to know not only how much memory your data structures occupy but also the access patterns to those data structures. Does the algorithm process one item at a time or expect the whole list to be in memory at once? Is the result gradually written to disk, or is it accumulated in memory first? These questions may not matter when the data size is small and doesn't ever get close to memory limits. But when they do...


    Why Clojure?

    This is about a 17 minute read. Feel free to pause and come back to it later.

    Clojure is not one of the handful of "big" mainstream languages. This means that sometimes people are surprised that we are all in on Clojure. Why go against the grain? Why make it harder for yourself by building on niche technology?

    Gaiwan is mostly known as a Clojure consultancy, but we don't consider ourselves defined by Clojure. Rather, we are a group of experienced technologists (10+ years of industry experience on average) who are deliberate and intentional about the technologies we build upon. Rather than choosing tech that is fashionable, or that has the biggest marketing budget, we choose tech that gives us the highest leverage. Tech that allows a small team like ours to be maximally productive, to maintain velocity as systems grow, and that allows us to keep overall complexity low. Right now, that tech is Clojure.

    In this article I want to cover some of the reasons why that is. In the first place I'm writing this for engineers or technical leaders who are trying to decide if Clojure is worth investing time in. It should, for the most part, also be understandable by business leaders who want to understand the business benefits of building on Clojure.

    The reasons I'll outline below fall into three main categories:

    • Developer productivity: Clojure development is interactive, low ceremony, and high leverage. Clojure developers are happy developers that can ship quickly.
    • Long-term maintainability: the Clojure language and ecosystem are mature and stable, with a culture of stability that no other language ecosystem I'm aware of can match. This lets you build high-quality systems that last, while keeping maintenance costs down.
    • Culture of Ideas: while not a benefit of the language per se, adopting Clojure means you become part of a community which actively explores ideas from the past and present, academia and industry, to find better ways of building software. Clojure will challenge you in the best way possible.

    (hello 'clojure)

    Clojure is a language in the Lisp family (also styled LISP). Lisp was conceived in the 1950s as a theoretical model for reasoning about computability, similar to the Turing machine or Lambda Calculus. It soon turned out that this theoretical model also made an excellent practical language to program in, one with a high conceptual elegance. The Lisp syntax has a one-to-one correspondence with the syntax tree data structure used to represent it, which provides several benefits compared to languages with more ad-hoc grammars. This notably made it the language of choice for AI applications during the previous big AI boom.

    Interest in Lisp languages has waxed and waned over time. Over the past decade Clojure has come to prominence. The main Clojure implementation is built on top of Java's underlying machinery (the JVM), and incorporates several modern innovations in programming language design, including a complete set of well-performing functional ("immutable") data types, and first-class concurrency primitives. While Clojure forms a small language community and ecosystem compared to the major languages people are familiar with, it has done remarkably well for a language with no major corporate backing, and with a syntax and appearance that can seem wholly alien to people steeped in imperative curly-bracket languages or ML variants.

    A host of alternative implementations exist or are under development, including ClojureScript (compile-to-JS), ClojureCLR (targeting Microsoft's .NET), Babashka (a fast-booting interpreter for scripting, compiled to native using GraalVM), and Jank (native compilation), which provides reach and leverage. Clojure knowledge will transfer to multiple contexts and circumstances, and will give you access to multiple large open source ecosystems. This article takes as its reference the JVM implementation, but much of it is true for the other variants as well, with some nuance.

    What follows are some of the reasons why we find Clojure the most compelling programming language offering that exists today.

    Interactive Development

    Programming is a constant cycle of writing code and validating said code. Without a feedback mechanism it is near impossible to write anything but the most trivial program and still be confident that it does what it's supposed to do.

    These feedback loops come in many flavors. At the most basic, people simply run their script-style programs over and over. For interactive programs they might click through the (web) UI, maybe putting some print or logging calls in the code to better see what is going on. Unit testing provides a more rigorous and repeatable feedback loop. Compilers, linters, and other analysis tools can provide a different kind of validation: a coarse-grained assessment that a program is at least structurally sound. These cycles take from seconds to hours, and generally necessitate a context switch, from the editor to a terminal, UI, or CI, and back.

    Short, quick feedback cycles are preferable to long, slow ones, and this feedback cycle speed is one of the biggest predictors of a programmer's productivity. Without quick and early feedback, you end up in a slow write/debug cycle, where, as the cycles get slower, you end up spending ever more time debugging compared to the time spent writing code.

    All of the mentioned validation techniques are available in the Clojure world as well, with for instance sophisticated tooling for unit and property-based testing. At the heart of Clojure development however lies the practice of interactive development.

    Before a single letter is written, the Clojure programmer starts up the Clojure runtime, which is connected to their editor. From here a program is "grown" by writing/running small pieces of it, and seeing the result (or error), directly and immediately, without leaving the editor.

    Here the Lisp syntax is a great help, since it provides a Lego-block like uniform syntax that makes it easy to "take a program apart", in a sense, executing individual bits or large composite pieces, merely by placing the cursor in the right spot.

    It's hard to overstate the impact of this style of interactive development, as it provides the quickest feedback cycle, and thus the most immediate feedback possible. You will also see this referred to as "REPL-driven development", which obscures its true power. Many programming languages have a REPL (also referred to as a console or command line interface) somewhere "over there", in a terminal emulator or browser devtools. Few allow you to execute arbitrary pieces of code "over here", right where you are writing them, as you are writing them, against a complete running system.

    And this is only the tip of the iceberg, as this ability to connect to a running system and manipulate it has more far reaching consequences. It provides an ad-hoc inspection, debugging, and manipulation interface to any Clojure program running in any environment.

    Culture of Stability

    When choosing Clojure, you don't just get a piece of powerful tech. You also become part of a community of practice, with its own notions and dogma. Even more than the tech itself, it's this community of practice that really makes the choice for Clojure so compelling, and teams that adopt the tech in isolation, without engaging with the wider culture and community, sell themselves short. They would have been better off not choosing Clojure at all.

    One strong cultural tenet is a commitment to stability and backwards compatibility. This starts with the core language, where breaking changes are virtually unheard of, despite regular releases of improvements and extensions. This has become a deeply ingrained value in the open source ecosystem surrounding the language as well, and stands in sharp contrast with almost every other modern programming ecosystem, where a certain amount of churn (change for the sake of change) is taken for granted. This churn is a hard-to-overstate waste of resources, the global cost of which has to be measured in billions, and it's wholly avoidable.

    Not so in Clojure, where it's normal to upgrade to the latest version of the language, and of other project dependencies, as a matter of course. You simply carry on with your day. You get the benefits of bug fixes, security patches, and performance improvements without having to rewrite parts of your code base, or wonder what hidden subtle bugs have been introduced by breaking changes that, even in point releases, are often not even documented.

    I imagine at this point some eyebrows may be raised sceptically. Isn't change necessary to allow for progress? This shows a confusion between stability and stagnation. In software it is absolutely possible to have progress, to do new things, or improve existing things, without breaking the things that are already there. We live and breathe this every day.

    Information Systems / Knowledge Representation

    In the space of web and business applications in particular, we write programs that deal with information about the world. Gathering, accessing, and processing information, facts, is at the heart of what we do, and yet it's staggering how poorly many mainstream languages perform in this area. Either they provide data representation and manipulation primitives that are needlessly low-level, or they insist on a statically typed worldview leading to parochial, snowflake APIs that defy abstraction and higher-level manipulation, or both.

    Clojure's functional data structures and core set of data manipulation functions make information truly first class. Clojure is dynamically typed, and idiomatically follows the open-world assumption. RDF, the data modeling framework originally developed for the Semantic Web, has an outsized influence in the community. This is visible in the preference for triple stores/graph databases, notably Datomic. It's also visible in the language itself, where namespaced keywords are preferred, providing fully qualified identifiers for attributes that can be assigned context-free semantics.

    This isn't as heavy a lift as it may sound. A Clojure map with namespaced keywords is no more complex than a bit of JSON, but it can carry precise semantics without out-of-band contextualization, and it can be safely extended with additional information without risking naming conflicts.
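    A small sketch of what this looks like in practice (the :bank/... and :analytics/... attribute names are invented for illustration):

    ```clojure
    ;; Two teams can contribute attributes to the same map without conflicts,
    ;; because each keyword carries its own namespace.
    (def account
      {:bank/id      42
       :bank/balance 100.0M
       :analytics/id "a-9f3"})   ; does not clash with :bank/id

    (:bank/id account)      ; => 42
    (:analytics/id account) ; => "a-9f3"
    ```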

    Small composable functions over immutable data

    This is another aspect that is cultural as much as it is technical. Clojure is not a purely functional language, and it's easy to translate Java, Ruby, or C code directly into Clojure. But an idiomatic Clojure program looks very different from an idiomatic Java program, consisting for the most part of pure functions over immutable data.

    Immutable data provides value semantics (as opposed to reference or identity semantics), and pure functions compute a result value based purely on a tuple of input values, without having an influence on, or being influenced by, the world outside of the function. (Like reading/writing global data, or causing side effects).

    This leads to several corollaries.

    Concurrency Handling

    Contemporary computing is inherently concurrent, and has been for close to 20 years. We have dealt with the limits of Moore's law by stacking processors with ever more cores, and our programs have had to keep up.

    Clojure helps with this in the first place by emphasizing immutability. Operations which involve mutable memory locations introduce timing and ordering dependencies, which need to be carefully controlled when introducing parallelization. A pure data-in data-out transformation on the other hand can always run safely, regardless of what else is going on.

    But programs do need to maintain state over time. For this the JVM has had excellent concurrency primitives since java.util.concurrent shipped in Java 5, but using them correctly still requires the care of an expert. Clojure provides higher-level abstractions on top of these that provide specific concurrency and correctness guarantees. Atoms are the most commonly used ones, providing serialization of read-then-write style operations through compare-and-set (CAS) combined with automatic retries. Refs provide Software Transactional Memory (STM), Agents provide serialization of updates which are applied asynchronously, and Futures provide a fork-and-join interface backed by a thread pool. These all rely on Clojure's functional data structures (including immutable queues), providing elegant thread-safe abstractions that can be used easily without shooting yourself in the foot.
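    Atoms are worth a tiny illustration, since they are the abstraction you will reach for most often. swap! takes a pure function and applies it via compare-and-set, retrying automatically if another thread updated the atom in between:

    ```clojure
    ;; An atom serializes read-then-write updates via CAS plus retry.
    (def counter (atom 0))

    (swap! counter inc)   ; => 1
    (swap! counter + 10)  ; => 11
    @counter              ; => 11
    ```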

    For data processing or event-driven systems there is core.async, available as a library maintained by the core team, providing Communicating Sequential Processes (CSP), similar to Go's goroutines, and comparable to actor systems as found in Erlang/Elixir, or in Scala's Akka.

    Of course you don't have to use these higher-level abstractions (see also Move up and down the trade-off/abstraction ladder); the lower-level primitives are still available, including concurrent queues, atomic references, locks and semaphores, various types of thread pools, all the way down to manual thread locking and marking of synchronized critical sections, for when you do need that fine-grained control.

    Local reasoning

    There's only so much even the most gifted programmer can keep in their frame of mind at any given time. The more context that needs to be considered to assess the impact of a change, the harder it becomes to confidently and correctly make that change. This curve is a hockey stick: things go from easy to hard to impossible quickly as more distinct pieces of code and state need to be considered at the same time to understand what a program is doing.

    The fact that most of a Clojure program consists of pure functions means that one only needs to understand the inputs to a given function to understand the function's full behavior.

    Another aspect that helps here is that Clojure generally avoids polymorphism. There is no superclass implementing part of the behavior; you don't need to know the runtime type of objects to understand which implementation is being invoked. There are only concrete functions in namespaces. It's been said that in object-oriented programming everything happens somewhere else. In Clojure there is much less of this kind of indirection, making navigating around a code base to understand control flow straightforward.

    Of course you can write code that has this property in other languages, when taking sufficient care. But in non-functional languages this often means going against the grain of the language, and adopting a coding style that is not considered common or idiomatic. Other functional languages do promote this kind of purity, but lack some of the other benefits outlined in this article.

    This local reasoning, together with Lisp's Lego-block-like uniform internal structure, makes it easy to refactor and evolve a code base. When refactoring, the programmer improves a code base by changing its structure and organization, without changing its behavior. This can be quite challenging, since there might be implicit dependencies between different parts of the code base, through shared mutable state. Clojure encourages having a small amount of imperative code handling mutable state, separated from the otherwise purely functional code base. This makes both sides easier to develop and test, and provides some confidence that changes won't have unintended side effects.

    Ease of testing

    When working with functional code, whether during interactive programming or in a unit test, validating that a piece of your program works as expected is a matter of pushing values in and seeing which values come out. There is no careful setup and teardown of state, no loading of fixtures, no stubbing out of communication channels or delicate managing of timing requirements, all common sources of the dreaded flakiness in tests. No code is easier to test than purely functional code.

    This also opens the door to higher-leverage techniques like Property Based Testing, also known as Generative Testing, where a random sequence of ever more complex input values is fed into the program, to find values that violate certain known properties or invariants, followed by a crucial shrinking phase, so the programmer is presented with a minimal exemplar of the unsupported edge case.

    Clojure has no unique claim to these techniques, in fact Property Based Testing originated in the Haskell community, and QuickCheck-inspired libraries are available for most major languages now. It does however synergize with some of the other benefits outlined, especially the emphasis on simple, immutable data structures, and with the interactive style of development.

    Positive self selection for hiring candidates

    "But what about hiring?" When you use any language that isn&apost in the top 3 of currently most popular languages, you will get this question. JavaScript programmers are counted in the millions, Clojure programmers in the tens of thousands. How will you ever find the required talent?

    It seems like a logical question, but it's overly focused on the supply side. Yes, there are fewer Clojure developers, but there are also fewer Clojure jobs. It's not useful to look at absolute numbers; you need to consider the balance between the two. Anecdotally, this balance seems to be fine. From what we've seen, companies looking for Clojure talent are generally able to find people, and developers looking for jobs are able to get hired.

    In specific locales the story may be different. In smaller cities there might be no Clojure programmers at all; if you are intent on hiring locally, that is certainly a factor to consider. Even in bigger cities, if you are looking at hiring a lot (dozens to hundreds of people), this will be a factor. In either case you may have to find other suitable candidates and train them in the specifics of Clojure. Nubank famously has trained hundreds of Brazilian developers to pick up Clojure out of necessity, but they describe it as a positive experience, for the company and the developers.

    In either case, whether you're hiring people with the requisite Clojure experience or training people up, what we hear over and over again is that the quality of applicants for a Clojure job is higher than when hiring, say, for JavaScript or Python. You may get only a handful of CVs instead of a few hundred, but they'll be quality CVs. Remember that Clojure is a community of ideas. It attracts people who think deeply about their craft, who are interested in finding better ways to do things, who are keen on learning advanced, somewhat alien-looking technologies. What we find is that both people who have studied Clojure in their own time and people who are drawn in by the prospect of learning Clojure on the job tend to be curious and open-minded problem solvers. Exactly the kind of people you'd want to have on your team.

    This is of course all very anecdotal, which is all we have to go on in the absence of large-scale studies. We leave it to the reader to decide if they find these claims credible, or at least plausible. What I can say from working with dozens of Clojure teams over the years is that while hiring is a concern that's frequently voiced by people not (yet) doing Clojure, I have rarely heard it expressed as a major problem by teams actually doing Clojure.

    Move up and down the trade-off/abstraction ladder

    Clojure is, quite decisively, a high-level language. Idiomatically, code is concise and expressive, with little ceremony or incidental complexity, in large part thanks to the functional (immutable) data structures and the accompanying data manipulation API.

    Clojure's data structures perform very well for functional (immutable) data structures, but you still pay a cost for the convenience and guarantees they provide. Clojure's maps and vectors are internally represented as trees (hash array mapped tries, to be precise), and there is a certain amount of path copying involved in every update. When done in bulk this puts pressure on the garbage collector.

    Clojure also provides seamless interop with Java types (see the section below on Host Interop), using runtime reflection, and automatically boxing/unboxing primitives, if necessary. This all comes at a cost.

    For everyday applications this cost is negligible, and easily justifiable given the ease of use you get in return. Used well, functional data structures let you write smarter algorithms, so you do less work, offsetting some of these costs. But there are certainly use cases where this style of programming is not suitable. If you are writing a game graphics engine, doing realtime signal processing, or doing anything else that could be described as number crunching, then you want to get down to the metal.

    The good thing is that you can get down to the metal, without leaving your familiar environment. Providing some type hints to the compiler can eliminate runtime reflection and boxed math. You can work with contiguous arrays of primitive types, amenable to L1/L2 caching in the CPU. Optimized numeric vector/matrix types are available as libraries, including GPU backed.
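    As a small example of that sliding scale, a single type hint is often enough to turn a reflective call into a direct method call (with *warn-on-reflection* making the difference visible); the function names here are invented for illustration:

    ```clojure
    (set! *warn-on-reflection* true)

    ;; Without a hint the compiler falls back to runtime reflection:
    (defn len-reflective [s] (.length s))     ; triggers a reflection warning

    ;; With a hint the compiler emits a direct method call:
    (defn len-hinted [^String s] (.length s)) ; no reflection
    ```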

    Contrast this with other high-level languages, where, when the performance needs grow, you may be forced to switch to native extensions in C or Rust. In Clojure, instead of this dichotomy you get a sliding scale. Maybe you have an event loop that needs to be able to handle high loads. A bit of profiling, type hinting, and a sprinkle of interop may be all you need. At the end of the day Clojure runs as Java bytecode, which gets optimized on the fly (JIT compilation) by the JVM. You may be surprised how much you can squeeze out of that event loop with minimal changes.

    And even when doing this kind of lower-level coding, you still get access to Clojure's excellent metaprogramming support, to handle some of the drudgery for you. Which brings us to the next point.

    Move up and down the metaprogramming ladder

    It's been commented on a few times that Clojure is a Lisp. What makes it a Lisp isn't (just) the superficial matter of where the parentheses go. It's the fact that, in a very real sense, code is a data structure. It's like JSON, if JSON had been designed to represent programs in a readable way. Instead of JavaScript's objects, arrays, strings, and so forth, Clojure code is represented as nested lists, with symbols to represent functions, variables, and reserved keywords. (When used as a JSON-like data format, this syntax is known as EDN.)

    What's unique about Lisp is that facilities for converting between a string and a data representation of code are built into the language (known as the Reader and Writer), as well as facilities to evaluate such data structures as code, or, in the case of Clojure, compile and run them as JVM bytecode.

    Macros allow programmers to extend the syntax, essentially augmenting the compiler, by writing functions that transform this code-as-data, and this is probably the most well-known example of Lisp metaprogramming. But it's not the only option available, given these building blocks. The cultural trend in Clojure is to use macros sparingly, reserving them for key high-leverage constructs, since macros are opaque and difficult to debug. They also make life harder for tools that do static analysis.

    Instead, in the Clojure open source ecosystem in particular, there's a trend towards data-driven interfaces, where instead of providing concrete functions and macros, an API is provided which takes a data structure, usually some combination of nested vectors and maps, and lets that drive the library's behavior. Examples are HTTP routing, HTML and CSS generation, data validation and coercion, and many more.
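    A sketch of the idea, using only core functions; the hiccup-style vectors and the :class transformation are illustrative, not tied to any specific library:

    ```clojure
    ;; Behavior described as plain data: nested vectors a library could render.
    (def menu
      [[:a {:href "/"} "Home"]
       [:a {:href "/docs"} "Docs"]])

    ;; Because it is data, cross-cutting concerns become ordinary transformations:
    (map (fn [[tag attrs text]]
           [tag (assoc attrs :class "nav") text])
         menu)
    ;; => ([:a {:href "/", :class "nav"} "Home"]
    ;;     [:a {:href "/docs", :class "nav"} "Docs"])
    ```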

    Superficially and syntactically the distinction is small, but the leverage gained is significant. Behavior is now driven through data, rather than invoked directly, and data, information, can be generated and manipulated. In fact, Clojure excels at this, as we pointed out earlier.

    You now have the full power of the language to create dynamic and adaptive systems. You can transform this data specification to deal with cross-cutting concerns, or make it end-user editable by storing it in a database, which is in turn trivial because you have the Clojure Reader and Writer available at runtime.

    Again this is a sliding scale, where programs and programmers will generally start out on the concrete and verbatim end of the spectrum, and step down a rung into metaprogramming territory when called for.

    It makes Clojure particularly suitable for highly dynamic and simulation-type systems, which can be reconfigured or rewired at runtime to exhibit new behaviors. In general these techniques provide a high amount of leverage, empowering people to do much more with the same tools and libraries, without being beholden to the library's author to support their specific use case a priori.

    Host (Java) Interop

    Modern applications are more glue than substance. We take a language's standard library, a few hundred open source libraries, a dozen SaaS APIs, and a handful of off-the-shelf components like databases and message queues, then add a bit of code on top to make it all work together. For application programmers (as opposed to systems programmers) the bulk of the work is calling into APIs written by others, and wiring them together.

    This means it matters a lot which open source ecosystem you have access to. By leveraging the JVM and providing excellent interop capabilities, Clojure can tap into the millions of packages available on Maven Central, Java's package repository and the biggest single open source package repository in the world.

    It does so with little to no ceremony. Clojure is concise compared to Java, and the interactive programming facilities make it easy to explore APIs and quickly wire them together. It's not controversial to say that in this kind of exploratory glue programming, Clojure beats Java hands down.

    With ClojureScript all the same arguments can be made for JavaScript and the NPM package repository.

    Culture of Ideas

    At the end of the day, does your choice of language really matter that much? Teams and companies can be successful in virtually any language, and conversely no language can stop a well-intentioned engineer from creating a huge mess. A sharp blade does not make you a master chef, and in the wrong hands may do more harm than good. And Clojure certainly has a few sharp edges. The language attempts very little hand-holding, expecting the programmer to know what they are doing. While strides have been made to improve the onboarding and learning experience, it can still feel like a trial by fire, especially with insufficient mentoring. This does lead to people becoming reasonably proficient while still missing out on a lot of Clojure's benefits.

    Indeed we've come across a good few Clojure codebases of questionable merit. Often these are written by teams with a different language background, say Java or Python, who adopted Clojure's syntax but failed to steep themselves in the ideas and idioms of Clojure's community of practice, resulting in a Lisp-flavored pidgin.

    On the other hand, those who do embrace this culture of ideas will find they gain a more refined mental framework for reasoning about software design, one which transfers remarkably well to other languages and ecosystems.

    We've pointed out a few ways already in which the appeal of Clojure is at least in part cultural, rather than merely technical. Much of Clojure's relative success despite the absence of major corporate backing is due to Rich Hickey's conference talks, in which he explores the ideas that influenced the design of Clojure and Datomic, as well as his own insights distilled from decades in the industry. Similarly, talks at Clojure conferences tend to explore ideas, revisit influential papers, or share experiences, rather than simply presenting libraries and tools.

    Fundamentally, Clojure's community is one which isn't afraid to second-guess itself. Here you find professionals working at the outer edge of their capabilities, always striving to learn and to find better ways of building software together, rather than merely coasting along. I am deeply grateful I can be part of it.

    Permalink

    Clojure Is Awesome!!! [PART 12]

    (ns chain-of-responsibility
      (:require [clojure.pprint :as pp]))
    
    ;; === Request Processing Chain ===
    (defprotocol RequestHandler
      (handle-request [this request])
      (set-next [this handler]))
    
    ;; === Authentication Handler ===
    (defrecord AuthenticationHandler [next-handler]
      RequestHandler
      (handle-request [this request]
        (if-let [auth-token (:auth-token request)]
          (if (= auth-token "valid-token")
            (if next-handler
              (handle-request next-handler 
                             (assoc request :authenticated true))
              (assoc request :authenticated true))
            {:error "Invalid authentication token"})
          {:error "Missing authentication token"}))
    
      (set-next [_ handler]
        (->AuthenticationHandler handler)))
    
    ;; === Authorization Handler ===
    (defrecord AuthorizationHandler [next-handler]
      RequestHandler
      (handle-request [this request]
        (if (:authenticated request)
          (if (contains? (:roles request) :admin)
            (if next-handler
              (handle-request next-handler 
                             (assoc request :authorized true))
              (assoc request :authorized true))
            {:error "Insufficient permissions"})
          (if next-handler
            (handle-request next-handler request)
            request)))
    
      (set-next [_ handler]
        (->AuthorizationHandler handler)))
    
    ;; === Validation Handler ===
    (defrecord ValidationHandler [next-handler]
      RequestHandler
      (handle-request [this request]
        (if (and (:data request)
                 (map? (:data request))
                 (every? string? (vals (:data request))))
          (if next-handler
            (handle-request next-handler 
                           (assoc request :validated true))
            (assoc request :validated true))
          {:error "Invalid request data format"}))
    
      (set-next [_ handler]
        (->ValidationHandler handler)))
    
    ;; === Logging Handler ===
    (defrecord LoggingHandler [next-handler]
      RequestHandler
      (handle-request [this request]
        (println "\nProcessing request:")
        (pp/pprint (dissoc request :handler))
        (let [response (if next-handler
                        (handle-request next-handler request)
                        request)]
          (println "\nResponse:")
          (pp/pprint response)
          response))
    
      (set-next [_ handler]
        (->LoggingHandler handler)))
    
    ;; === Cache Handler ===
    (def request-cache (atom {}))
    
    (defrecord CacheHandler [next-handler]
      RequestHandler
      (handle-request [this request]
        (if-let [cached (@request-cache (:id request))]
          (do
            (println "Cache hit for request:" (:id request))
            cached)
          (let [response (if next-handler
                          (handle-request next-handler request)
                          request)]
            (when (:id request)
              (swap! request-cache assoc (:id request) response))
            response)))
    
      (set-next [_ handler]
        (->CacheHandler handler)))
    
    ;; === Request Processing ===
    ;; set-next returns a copy whose next handler is the given one, so the
    ;; chain must be assembled tail-first; ->> threads the chain built so
    ;; far in as the next handler of each new link. (Threading with -> here
    ;; would overwrite each link's successor instead of appending.)
    (defn build-chain []
      (->> (->ValidationHandler nil)
           (set-next (->AuthorizationHandler nil))
           (set-next (->AuthenticationHandler nil))
           (set-next (->CacheHandler nil))
           (set-next (->LoggingHandler nil))))
    
    ;; === Example Usage ===
    (defn run-examples []
      (let [chain (build-chain)]
        (println "\n=== Valid Admin Request ===")
        (handle-request chain
                       {:id "req-1"
                        :auth-token "valid-token"
                        :roles #{:admin}
                        :data {"name" "John"
                              "action" "read"}})
    
        (println "\n=== Invalid Token ===")
        (handle-request chain
                       {:id "req-2"
                        :auth-token "invalid-token"
                        :roles #{:admin}
                        :data {"name" "John"}})
    
        (println "\n=== Missing Token ===")
        (handle-request chain
                       {:id "req-3"
                        :roles #{:admin}
                        :data {"name" "John"}})
    
        (println "\n=== Insufficient Permissions ===")
        (handle-request chain
                       {:id "req-4"
                        :auth-token "valid-token"
                        :roles #{:user}
                        :data {"name" "John"}})
    
        (println "\n=== Invalid Data ===")
        (handle-request chain
                       {:id "req-5"
                        :auth-token "valid-token"
                        :roles #{:admin}
                        :data {"name" 123}})
    
        (println "\n=== Cached Request ===")
        (handle-request chain
                       {:id "req-1"
                        :auth-token "valid-token"
                        :roles #{:admin}
                        :data {"name" "John"
                              "action" "read"}})))
    
    
    (run-examples)
    

    Permalink

    Enhancing engineering workflows with AI: a real-world experience

    Artificial Intelligence (AI) and Large Language Models (LLMs) are revolutionizing the tech industry, and at Nubank, we’re using these technologies to enhance engineering workflows across Brazil, Mexico, and Colombia. In a recent talk at Clojure Conj 2024, Carin Meier, Principal Software Engineer at Nubank, and Marlon Silva, Software Engineer at Nubank, shared how AI-powered tools are transforming how we work.

    Clojure Conj, a conference held since 2010, is a key event for the global Clojure community. It brings together developers and thought leaders to discuss the latest trends in Clojure programming. In 2024, it provided the perfect platform for Carin and Marlon to present how Nubank is integrating AI, including LLMs, to streamline our engineering processes.

    In this article, we’ll explore the main topics from the lecture, and how these AI tools are optimizing everything from code generation to team collaboration at Nubank—and how they could help your team too.

    What are Large Language Models (LLMs)?

    Before diving into our experiences, let’s start with a quick overview of what LLMs are and how they work.

    At a high level, LLMs like GPT-3 and GPT-4 are machine learning models trained on vast datasets to predict the next word (or token) in a sequence based on the context provided. They are designed to mimic human-like understanding and generation of language.

    For example, when you type a prompt like “Clojure is a lovely programming language that allows you to,” an LLM can predict and continue the sentence with something like “code a program in a pure functional style.” The model does this by drawing from patterns it has learned during training, where it encounters large amounts of code and documentation, allowing it to generate meaningful sentences in response.
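To make the predict-the-next-token framing concrete, here is a toy bigram counter. It is nothing like a real LLM's neural network (the corpus and names are invented), but the prediction shape is the same: given context, emit the most likely next item.

```clojure
(require '[clojure.string :as str])

;; Toy "training": count adjacent word pairs in a tiny corpus.
(def corpus
  (str/split "clojure is a lovely language and clojure is fun" #" "))

(def bigrams (frequencies (partition 2 1 corpus)))

;; Toy "inference": the word most frequently seen after w.
(defn next-word [w]
  (->> bigrams
       (filter (fn [[[a _] _]] (= a w)))
       (sort-by val >)
       ffirst   ; the winning [w follower] pair
       second)) ; its follower

(next-word "clojure") ; => "is"
```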

    However, LLMs are not perfect. They require experimentation to understand their potential, especially when it comes to generating code in specific programming languages like Clojure, a language that doesn’t have as much public training data compared to more mainstream languages like Python or JavaScript.

    The power of benchmarking: testing LLMs for Clojure

    To understand whether LLMs could truly enhance engineering workflows, we needed to test their capabilities. At Nubank, we selected a few models and applied them to generate Clojure code. While many existing benchmarks showed impressive results for languages like Python and JavaScript, we were curious how well these models would perform for Clojure, which has its own unique syntax and concepts.

    Initially, we used a tool called the MultiPL-E (Multi-Programming Language Evaluation of Large Language Models of code) Benchmarking Tool. This open-source tool allows us to test the quality of code generated by LLMs based on a set of predefined problems, like those in the HumanEval and MBPP datasets.

    With this tool, we were now able to put our Clojure code generation capabilities to the test. Thanks to invaluable support from Alex Miller, a prominent figure in the Clojure community and a vital part of Nubank’s operations, we integrated Clojure into MultiPL-E and started comparing it alongside Python and JavaScript. 

    At first, we didn't apply any special fine-tuning or engineering tricks; we simply wanted to observe the raw potential of the latest models (including open-source projects like Llama3 and private GPT variants from OpenAI) in producing production-ready code. Unsurprisingly, Clojure initially lagged a bit behind Python and JavaScript, likely a reflection of the smaller corpus of Clojure code used to train most LLMs, but the surprise was how close the results actually were.

    With each new release—GPT-3.5, GPT-4, GPT-4o, o1-preview, o1, and beyond—we’ve observed the gap shrink further. It’s encouraging to see Clojure gain ground so quickly, and it gives us hope for a future where the disparity between languages all but disappears. As more models are trained on increasingly diverse datasets, we expect to see Clojure’s performance match Python’s and JavaScript’s. 

    The open-source community and ongoing efforts like MultiPL-E are making strides to improve support and visibility for functional languages, and we’re excited about what this means for developers who rely on Clojure every day.

    The lesson here? Don’t be afraid to experiment. Try various models and see how they align with your specific use cases. The performance of these models can vary significantly depending on your needs.

    Building flexible tools for Engineering teams

    One of the key takeaways from our journey with LLMs is the importance of building flexible and extensible tools. The world of AI is moving so fast that we can’t predict exactly what our engineers will need in the next month, let alone a year.

    At Nubank, we’ve embraced this uncertainty. We’ve designed tools that are small, modular, and easy to adapt as new developments emerge. A good example of this is Roxy, a local proxy that facilitates the use of LLMs in a regulated environment.

    Roxy is designed to ensure that any interaction with LLMs adheres to compliance and security regulations. Rather than building a complex solution tailored to a specific use case, we created a thin, flexible interface that engineers can use in a variety of ways. This approach allowed us to quickly adapt as new requirements or opportunities arose.

    The key takeaway here is that teams shouldn’t over-engineer their tools. They should create something simple that can grow and evolve alongside technology.

    Fostering a community for sharing AI insights

    In any fast-moving field, collaboration is key. At Nubank, we’ve found that creating a community of practice—what we call guilds—has been invaluable. These are internal user groups where we share experiences, discuss challenges, and brainstorm ways to leverage new tools like LLMs effectively.

    By gathering on a regular basis, we ensure that everyone stays up-to-date on the latest AI advancements and gets a chance to provide feedback. This has helped us continually improve our tools and techniques for integrating LLMs into engineering workflows.

    If you’re working with AI or any new technology, consider fostering your own community. It’s a great way to keep learning and stay ahead of the curve.

    Can LLMs help us think?

    While many people worry that AI will replace human thinking, we believe that LLMs can actually enhance our thinking—if used correctly. For example, LLMs can help engineers and product managers think critically, ask better questions, and approach problems from new angles.

    Something that we’ve found useful is using AI to guide us in identifying the root cause of a problem, rather than just providing the answer. For instance, if we’re faced with a performance issue in a microservice, we might prompt the LLM with a question like, “How can I best frame a solution for a microservice that runs slow on an IO operation?”

    The idea isn’t to ask for an answer right away but to use the LLM to help us structure our thinking. By using LLMs this way, we can dig deeper into the problem and come up with better solutions.

    In another example, Marlon used this method to craft a product report. He asked the LLM to assume the role of a product manager and help him structure a report for upper management on the benchmark of LLM models for Clojure. The result was a report that exceeded expectations and impressed the product manager.

    A look into the future: the power of autonomous AI agents

    As AI evolves, the idea of autonomous agents that can write code and solve problems on their own is becoming more of a reality. We’ve explored some early-stage tools, like Open Hands, which use LLMs to assist with tasks like data analysis.

    In a recent demo, we tasked Open Hands with performing a data analysis on the Iris dataset using Clojure. The agent autonomously planned, wrote, and executed the code, demonstrating how LLMs can assist engineers in tasks that would typically require more time and effort. While the technology is still in its early stages, we’re excited by the possibilities it presents.

    Devin, an autonomous AI software engineer developed by Cognition Labs, is another example of how AI is transforming software development. Devin has been instrumental in helping us migrate our massive ETL system with over 6 million lines of code. 

    By automating repetitive tasks like refactoring and code migration, Devin enabled Nubank to complete a project initially estimated to take over 18 months with a thousand engineers in just weeks, achieving a 12-fold increase in efficiency and significant cost savings. 

    Looking ahead

    As AI continues to evolve, it’s clear that Large Language Models are not just tools for automating tasks—they are essential for enhancing developer workflows. By integrating LLMs into Nubank’s engineering processes, we’ve seen firsthand how they can boost productivity, foster creativity, and bridge gaps between technical and business teams. 

    And, as we continue to explore and refine our AI solutions, we encourage other organizations to experiment and build flexible, extensible tools that adapt to the fast-moving world of AI. The future of engineering is here, and with LLMs, the possibilities are endless.

    Learn more about what we shared on this topic in the video below:

    The post Enhancing engineering workflows with AI: a real-world experience appeared first on Building Nubank.

    Permalink

    Revisiting 'Clojure Don'ts: concat'

    Nostalgia city

    I've recently started maintaining a Clojure codebase that hasn't been touched for over a decade - all the Clojure devs who built and maintained it are long gone. It's using Java 8, Clojure 1.6 and libs like korma and noir - remember those? Contrary to the prevailing Clojure lore, upgrading Clojure will not be just a matter of changing version numbers in the lein project.clj.

    I find one of the most dated aspects of the project is the laziness. I only use laziness as an explicit choice and have done so for many years. Laziness is a feature I find I rarely need, but is sometimes just the right fit.

    A lot of the original Clojure collection functions are lazy and it is still common to see new code written with them - I think because they are still seen as an idiomatic default, rather than a conscious choice. Non-lazy versions like mapv and filterv came later and transducers later still, but of course the old functions must continue to work as before.

    Investigating a bug in the codebase led me back to a great blog post, Clojure Don'ts: Concat, also written around a decade ago. The rest of this post will discuss that post, so if you haven't read it yet, please do (and of course the rest of the 'Don'ts' series is also good).

    Revisiting the post

    I had first read the post many years ago and had forgotten the details - I guess the main thing I remembered was 'don't use concat', which is maybe a good heuristic but misses the main point, which could be phrased as: build lazy sequences starting from the outside - I'll explain the outside thing further on.

    Reading it again, I had to go over it a couple of times to fully understand it - if it was crystal clear to you then you've no need to read on. To check your understanding, answer this: what difference would it make (with respect to overflow) to change the order of the args to concat in the build-result function?

    Following is my attempt to make the post's message even clearer.

    The post mentions that seq realises the collection and causes the overflow. Just in case it is not clear: seq does not, in general, realise lazy collections in their entirety; it just realises the first element.

    To demonstrate that, have a look at the following, which is like range but with the numbers in the sequence descending to one:

    
    (defn range-descending [x]
      (when (pos? x)
        (lazy-seq
          (cons x (range-descending (dec x))))))
    
    (let [_ (seq (range-descending 4000))]
      nil) ; => ok, no overflow
    
    

    This is what one might call an outside-in lazy sequence. As the sequence is generated, one might picture it like this:

    (4000, LazySeq-obj)
    (4000, 3999, LazySeq-obj)
    (4000, 3999, 3998, LazySeq-obj)
    ...
    

    Calling seq on the collection, only the first element is realized, so no overflow.

    The equivalent to the way concat was used in the original post would be more like this:

      ;; note: recurses into itself, so each conj nests inside the previous one
      (defn range-descending-ohno [x]
        (when (pos? x)
          (lazy-seq
            (conj (range-descending-ohno (dec x)) x))))
    

    Now visualising the sequence generation, it would look more like this:

    (conj LazySeq-obj 4000)
    (conj (conj LazySeq-obj 3999) 4000)
    ...
    (conj (conj ... (conj nil 1) ...) 4000)
    

    Now when calling seq (as in (seq (range-descending-ohno 4000))), the whole sequence needs to be realised for seq to get to the first element (4000 in the example). As the post says: seq has to recurse through them until it finds an actual value. One might call this an inside-out lazy sequence.

    Conclusion

    The original post concludes with 'Don't use lazy sequence operations in a non-lazy loop', which I would update to add: don't use laziness at all unless required.

    If deciding to use laziness, avoid building sequences inside-out - this might be in your direct usage of e.g. lazy-seq or hiding in plain sight in your usage of e.g. clojure.core functions such as concat.
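As a sketch of that advice (build-result-lazy here echoes the shape from the original post, not its exact code), accumulating eagerly into a vector avoids the nested lazy wrappers entirely:

```clojure
;; Inside-out: every step wraps another lazy concat around the front,
;; so reaching the first element must recurse through all of them.
(defn build-result-lazy [n]
  (reduce (fn [acc i] (concat acc [i])) [] (range n)))

;; Eager alternative: conj onto a vector; nothing lazy accumulates.
(defn build-result-eager [n]
  (reduce conj [] (range n)))

(first (build-result-eager 100000)) ; => 0
;; (first (build-result-lazy 100000)) ; => StackOverflowError
```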

    Further Reading

    • The inside-out lazy seq topic is also covered in Clojure Brain Teasers if you want more pictures and explanation (in Boom Goes the Dynamite chapter).
    • Clojure's Deadly Sin is a very well-considered and comprehensive look into the problems of laziness in Clojure.

    Permalink

    Pathom3 Instrumentation

    In this article I will explain how to get performance insights into your Pathom3 resolvers by using Tufte. My aim is to show a very basic example of how it can be done, without doing a deep dive on any of the topics.

    Pathom

    If you are unfamiliar with Pathom, its docs define it as "a Clojure/script library to model attribute relationships". In essence, Pathom allows you to create a graph of related keywords and query it using the EDN Query Language (EQL). It supports read and write operations using resolvers and mutations. The "magic" of it is that it produces an interface which abstracts away function calling by handling all the graph traversal internally when responding to EQL requests. What does that mean? A short example should suffice:

    ;; create a few resolvers to model related attributes
    (pco/defresolver all-items
      "Takes no input and outputs `:all-items` with their `:id`."
      []
      {::pco/output [{:all-items [:id]}]}
      {:all-items
       [{:id 1}
        {:id 2}
        {:id 3}]})
    
    (pco/defresolver fetch-v
      "Takes an `:id` and outputs its `:v`."
      [{:keys [id]}]
      (Thread/sleep 300)
      {:v (* 10 id)})
    
    ;; query the graph for some data
    (p.eql/process
     (pci/register [all-items fetch-v])
     ;; ask for the `:v` attribute of `:all-items`
     [{:all-items [:v]}])
    ; => {:all-items [{:v 10} {:v 20} {:v 30}]}
    

    Source: Pathom3 docs on Batch Resolvers.

    As you can see, once the graph is established, you only need to tell Pathom what you want, not how to get it. As long as there is enough data to satisfy the input requirements of some initial resolver, its output can be used as input to whatever other resolver(s) need to be used in order to satisfy the entire request. Pathom will continue traversing the graph using whatever data it has at each point in order to get all the requested attributes. An elaborate chain of function calls is reduced to a single EQL expression.

    While this does offer developers a great deal of power, one trade-off is that it becomes a little bit harder to understand exactly what your program is doing when you send your query to the Pathom parser. The above example creates a very simple graph without much mystery, but real applications often include a large number of resolvers, often with multiple paths for getting certain attributes.

    Tufte

    Tufte is useful for understanding what happens when you send a query to your Pathom parser. From the Tufte example in its repo's README, the basic usage is like this:

    (tufte/profile ; Profile any `p` forms called during body execution
      {} ; Profiling options; we'll use the defaults for now
      (dotimes [_ 5]
        (tufte/p :get-x (get-x))
        (tufte/p :get-y (get-y))))
    

    In plain English, we need to use p to wrap individual expressions and profile to wrap a set of p expressions to profile them together.

    Profiling Pathom Queries

    To put it together, we need to understand one last piece: Pathom Plugins. Plugins allow developers to extend Pathom's functionality by wrapping specific parts of its internal execution process with arbitrary extension code. The various places you can add wrapping are identified by keywords. In our case, we want to wrap individual resolver calls with p and the entire process (which may call many resolvers) with profile. The keywords for these extension points are:

    • ::pcr/wrap-resolve for individual resolvers
    • ::p.eql/wrap-process-ast for the entire process

    NOTE: this article is specifically for Pathom's EQL interface.

    With this knowledge, we can create some extension functions and register the plugin:

    (defn tufte-resolver-wrapper
      "Wrap a Pathom3 resolver call in `tufte/p`."
      [resolver]
      (fn [env input]
        (let [resolver-name (-> (get-in env [::pcp/node ::pco/op-name])
                                (name)
                                (keyword))
              identifier (str "resolver: " resolver-name)]
          (tufte/p identifier (resolver env input)))))
    
    (defn tufte-process-wrapper
      "Wrap a Pathom3 process in `tufte/profile`."
      [process-ast]
      (fn [env ast] (tufte/profile {} (process-ast env ast))))
    
    (p.plugin/defplugin tufte-profile-plugin
      {::p.plugin/id `tufte-profile-plugin
       ::pcr/wrap-resolve tufte-resolver-wrapper
       ::p.eql/wrap-process-ast tufte-process-wrapper})
    

    The last step is to include this plugin in Pathom's environment when processing a query:

    ;; Add handler to print results to *out*
    (tufte/add-basic-println-handler! {})
    
    (p.eql/process
    ;; Only the first form is new, everything else is as before.
     (-> (p.plugin/register tufte-profile-plugin)
         (pci/register [all-items fetch-v]))
     [{:all-items [:v]}])
     ; => {:all-items [{:v 10} {:v 20} {:v 30}]}
    

    Tufte Results - no batch

    If you follow along with the Batch Resolvers docs linked above, you can see how to optimize such a situation to avoid the N+1 query and the extra 600ms of processing time it causes. Let's replace the fetch-v resolver with its batch version and profile it again:

    (pco/defresolver batch-fetch-v
      "Takes a _batch_ of `:id`s and outputs their `:v`."
      [items]
      {::pco/input  [:id]
       ::pco/output [:v]
       ::pco/batch? true}
      (Thread/sleep 300)
      (mapv #(hash-map :v (* 10 (:id %))) items))
    
    (p.eql/process
      (-> (p.plugin/register tufte-profile-plugin)
          (pci/register [all-items #_fetch-v batch-fetch-v]))
      [{:all-items [:v]}])
    ; => {:all-items [{:v 10} {:v 20} {:v 30}]}
    

    Tufte Results - batch

    Comparing results, we can see the processing time saved by the batch version, exactly how much time was spent in each resolver, and which resolvers were called. Again, this is a very simplified example. In a real-world scenario you may end up calling a large number of resolvers to produce the result, so having Tufte's stats at hand can be very useful.

    Pathom Viz

    As a final note, I want to point out that Pathom has its own tool for gaining such insights. It's called Pathom Viz and provides an excellent visual interface that shows everything you get from the above and more. It's a great tool and I use it often. Using Tufte as I've outlined above is an alternative lightweight approach that I've found useful.

    Wrapping Up

    In this article I covered a basic introduction to Pathom, its extension points and how to integrate it with Tufte in order to get performance and execution insights. Nothing groundbreaking here, but I did a quick search and didn't find any similar content, so hopefully this helps someone in the future.

    You can find the complete working example code in my fnguy-examples repo.

    Permalink

    Pathom3 Instrumentation

    In this article I will explain how to get performance insights into your Pathom3 resolvers by using Tufte. My aim is to show a very basic example of how it can be done, without doing a deep dive on any of the topics.

    Pathom

    If you are unfamiliar with Pathom, its docs define it as "a Clojure/script library to model attribute relationships". In essence, Pathom allows you to create graph of related keywords and query it using the EDN Query Language (EQL). It supports read and write operations using resolvers and mutations. The "magic" of it is that it produces an interface which abstracts away function calling by handling all the graph traversal internally when responding to EQL requests. What does that mean? A short example should suffice:

    ;; create a few resolvers to model related attributes
    (pco/defresolver all-items
      "Takes no input and outputs `:all-items` with their `:id`."
      []
      {::pco/output [{:all-items [:id]}]}
      {:all-items
       [{:id 1}
        {:id 2}
        {:id 3}]})
    
    (pco/defresolver fetch-v
      "Takes an `:id` and outputs its `:v`."
      [{:keys [id]}]
      (Thread/sleep 300)
      {:v (* 10 id)})
    
    ;; query the graph for some data
    (p.eql/process
     (pci/register [all-items fetch-v])
     ;; ask for the `:v` attribute of `:all-items`
     [{:all-items [:v]}])
    ; => {:all-items [{:v 10} {:v 20} {:v 30}]}
    

    Source: Pathom3 docs on Batch Resolvers.

    As you can see, once the graph is established, you only need to tell Pathom what you want, not how to get it. As long as there is enough data to satisfy the input requirements of some initial resolver, its output can be used as input to whatever other resolver(s) need to be used in order to satisfy the entire request. Pathom will continue traversing the graph using whatever data it has at each point in order to get all the requested attributes. An elaborate chain of function calls is reduced to a single EQL expression.

    While this does offer developers a great deal of power, one trade-off is that it becomes a little bit harder to understand exactly what your program is doing when you send your query to the Pathom parser. The above example creates a very simple graph without much mystery, but real applications often include a large number of resolvers, often with multiple paths for getting certain attributes.

    Tufte

    Tufte is useful for understanding what happens when you send a query to your Pathom parser. From the Tufte example in its repo's README, the basic usage is like this:

    (tufte/profile ; Profile any `p` forms called during body execution
      {} ; Profiling options; we'll use the defaults for now
      (dotimes [_ 5]
        (tufte/p :get-x (get-x))
        (tufte/p :get-y (get-y))))
    

    In plain English, we need to use p to wrap individual expressions and profile to wrap a set of p expressions to profile them together.

    Profiling Pathom Queries

    To put it together, we need to understand one last piece: Pathom Plugins. Plugins allow developers to extend Pathom's functionality by wrapping specific parts of its internal execution process with arbitrary extension code. The various places you can add wrapping are identified by keywords. In our case, we want to wrap individual resolver calls with p and the entire process (which may call many resolvers) with profile. The keywords for these extension points are:

    • ::pcr/wrap-resolve for individual resolvers
    • ::p.eql/wrap-process-ast for the entire process

    NOTE: this article is specifically for Pathom's EQL interface.

    With this knowledge, we can create some extension functions and register the plugin:

    (defn tufte-resolver-wrapper
      "Wrap a Pathom3 resolver call in `tufte/p`."
      [resolver]
      (fn [env input]
        (let [resolver-name (-> (get-in env [::pcp/node ::pco/op-name])
                                (name)
                                (keyword))
              identifier (str "resolver: " resolver-name)]
          (tufte/p identifier (resolver env input)))))
    
    (defn tufte-process-wrapper
      "Wrap a Pathom3 process in `tufte/profile`."
      [process-ast]
      (fn [env ast] (tufte/profile {} (process-ast env ast))))
    
    (p.plugin/defplugin tufte-profile-plugin
      {::p.plugin/id `tufte-profile-plugin
       ::pcr/wrap-resolve tufte-resolver-wrapper
       ::p.eql/wrap-process-ast tufte-process-wrapper})
    

    The last step is to include this plugin in Pathom's environment when processing a query:

    ;; Add handler to print results to *out*
    (tufte/add-basic-println-handler! {})
    
    (p.eql/process
     ;; Only the first form is new, everything else is as before.
     (-> (p.plugin/register tufte-profile-plugin)
         (pci/register [all-items fetch-v]))
     [{:all-items [:v]}])
    ; => {:all-items [{:v 10} {:v 20} {:v 30}]}
    

    Tufte Results - no batch

    If you follow along with the Batch Resolvers docs linked above, you can see how to optimize such a situation to avoid the N+1 query and the extra 600ms of processing time it causes. Let's replace the fetch-v resolver with its batch version and profile it again:

    (pco/defresolver batch-fetch-v
      "Takes a _batch_ of `:id`s and outputs their `:v`."
      [items]
      {::pco/input  [:id]
       ::pco/output [:v]
       ::pco/batch? true}
      (Thread/sleep 300)
      (mapv #(hash-map :v (* 10 (:id %))) items))
    
    (p.eql/process
      (-> (p.plugin/register tufte-profile-plugin)
          (pci/register [all-items #_fetch-v batch-fetch-v]))
      [{:all-items [:v]}])
    ; => {:all-items [{:v 10} {:v 20} {:v 30}]}
    

    Tufte Results - batch

    Comparing the results, we can see the processing time saved by the batch version, exactly how much time was spent in each resolver, and which resolvers were called. Again, this is a very simplified example. In a real-world scenario you may end up calling a large number of resolvers to produce the result, so having Tufte's stats at hand can be very useful.

    Pathom Viz

    As a final note, I want to point out that Pathom has its own tool for gaining such insights. It's called Pathom Viz, and it provides an excellent visual interface that shows everything you get from the above and more. It's a great tool and I use it often. Using Tufte as I've outlined above is a lightweight alternative that I've found useful.

    Wrapping Up

    In this article I gave a basic introduction to Pathom and its extension points, and showed how to integrate it with Tufte to get performance and execution insights. Nothing groundbreaking here, but a quick search didn't turn up any similar content, so hopefully this helps someone in the future.

    You can find the complete working example code in my fnguy-examples repo.


    Taming LLM Responses with Instaparse


    It started with a simple goal: integrate an LLM. Little did I know this would lead us down a rabbit hole of parsing challenges that would fundamentally change how we handle LLM outputs.


    The Promise and the Pain

    Like many developers, our journey began with a straightforward vision: use LLMs to generate UI operations for our no-code platform. The plan seemed simple: have the model return JSON structures describing UI components, their properties, and how they should be manipulated.

    Our initial schema looked promising:

    [{
      "type": "append:node",
      "context": {
        "engine": "string",
        "workspace": "string"
      },
      "data": {
        "source": [{
          "id": "string",
          "componentName": "string",
          "props": {
            "data-content-editable": "content",
            "class": "string",
            "content": "string"
          }
        }],
        "target": {
          "id": "string",
          "componentName": "string",
          "props": {}
        }
      }
    }]

    We wrote comprehensive prompts, carefully explained our component hierarchy, and felt confident about our approach. Then reality struck.

    The Pain Points

    Our testing phase revealed several critical issues:

    1. JSON formatting significantly increased response latency.
    2. Not all models supported JSON mode consistently.
    3. Even with JSON mode enabled, LLMs would sometimes respond with incomplete JSON.
    4. The performance impact was unacceptable for real-time applications.

    The Regex Temptation

    I'll admit it - my first instinct was to reach for regex. After all, how hard could it be to match some curly braces and square brackets?

    ;; I actually wrote this. I'm not proud of it.
    (re-find #"\{[^}]+\}" llm-response)
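    A quick REPL check (my own illustration) shows the core problem: the character class stops at the first closing brace it can, so any nested object comes back truncated:

```clojure
;; [^}]+ cannot cross a closing brace, so the match ends at the inner
;; object's `}` and drops the outer one -- the result is invalid JSON:
(re-find #"\{[^}]+\}" "{\"a\": {\"b\": 1}}")
;; => "{\"a\": {\"b\": 1}"
```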

    I can feel you laughing right now. If you've ever tried to parse JSON with regex, you know exactly where this is going - a path of madness, unmaintainable code, and edge cases that haunted my dreams.

    Instaparse - The Game Changer

    Instead of fighting with regex, I decided to write a proper grammar to parse JSON-like structures embedded in text.

    Here's the complete solution I developed:

    1. The Grammar Definition

    First, I defined a grammar that could handle JSON embedded within normal text:

    (ns json-extractor.core
      (:require [instaparse.core :as insta]
                [clojure.edn :as edn]))
    
    
    (def json-parser
      (insta/parser
        "text = (not-json | json)*

         <not-json> = #'[^{\\[]+|[{\\[](?![\"\\s\\[{])'

         json = object | array

         <value> = object | array | string | number | boolean | null

         object = <'{'> <ws> (pair (<','> <ws> pair)*)? <ws> <'}'>

         array = <'['> <ws> (value (<','> <ws> value)*)? <ws> <']'>

         pair = string <ws> <':'> <ws> value

         string = <'\"'> #'[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*' <'\"'>

         number = #'-?(?:0|[1-9]\\d*)(?:\\.\\d+)?(?:[eE][+-]?\\d+)?'

         boolean = 'true' | 'false'

         null = 'null'

         ws = #'\\s*'"))

    2. Validation Layer

    Once parsed, I needed to ensure the structures were valid:

    (defn valid-json-structure? [x]
      (or (map? x)
          (and (sequential? x)
               (every? (fn [item]
                        (or (number? item)
                            (string? item)
                            (boolean? item)
                            (nil? item)
                            (valid-json-structure? item)))
                      x))))

    3. Transform Rules

    (def transform-map
      {:string identity
       :number (fn [n]
                (try
                  (edn/read-string n)
                  (catch Exception _
                    n)))
       :boolean #(= % "true")
       :null (constantly nil)
       :pair vector
       :object (fn [& pairs]
                (try
                  (reduce (fn [acc [k v]]
                           (assoc acc (keyword k) v))
                         {}
                         pairs)
                  (catch Exception _
                    nil)))
       :array (fn [& items]
               (try
                 (vec (remove nil? items))
                 (catch Exception _
                   nil)))
       :json identity
       :text (fn [& items]
              (->> items
                   (remove nil?)
                   (filter valid-json-structure?)))})
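    To see what the transform step is doing, here is a minimal stand-in for `insta/transform` — my own sketch using `clojure.walk`, so it runs without Instaparse, and the `tree`/`rules`/`transform` names are invented for this example. It walks a hiccup-style parse tree bottom-up and applies a rule whenever a node's tag has one:

```clojure
(require '[clojure.walk :as walk])

;; A hand-built tree shaped like a parse of {"a": 1}.
(def tree [:object [:pair [:string "a"] [:number "1"]]])

;; Minimal stand-in for `insta/transform`: apply a rule whenever a
;; vector is tagged with a keyword we have a rule for.
(def rules
  {:string identity
   :number (fn [n] (Long/parseLong n))
   :pair   vector
   :object (fn [& pairs]
             (into {} (map (fn [[k v]] [(keyword k) v]) pairs)))})

(defn transform [rules tree]
  (walk/postwalk
    (fn [node]
      (if (and (vector? node) (contains? rules (first node)))
        (apply (get rules (first node)) (rest node))
        node))
    tree))

;; Leaves are rewritten first, so by the time :object runs, its
;; children are already plain ["key" value] pairs:
(transform rules tree)
;; => {:a 1}
```

    The real `transform-map` above works the same way, just with more defensive error handling around each rule.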

    4. JSON String Detection

    Before parsing, we need to find potential JSON strings in the text:

    (defn find-all-json-like-strings
      "Find potential JSON objects/arrays in text using balanced delimiter matching"
      [text]
      (let [results (atom [])
            len (count text)]
        (loop [i 0
               stack []
               start -1]
          (if (< i len)
            (let [c (nth text i)
                  stack' (cond
                           (and (empty? stack) (#{\{ \[} c))
                           (conj stack c)

                           (and (= (peek stack) \{) (= c \}))
                           (pop stack)

                           (and (= (peek stack) \[) (= c \]))
                           (pop stack)

                           (#{\{ \[} c)
                           (conj stack c)

                           :else
                           stack)]
              (cond
                ;; A top-level opening delimiter starts a new candidate.
                (and (empty? stack) (= start -1) (#{\{ \[} c))
                (recur (inc i) stack' i)

                ;; The previous character closed the candidate, so it ends
                ;; at index i (exclusive). If `c` itself opens a new
                ;; structure, start tracking it immediately so back-to-back
                ;; JSON values are not dropped.
                (and (empty? stack) (> start -1))
                (do
                  (swap! results conj (subs text start i))
                  (recur (inc i) stack' (if (#{\{ \[} c) i -1)))

                :else
                (recur (inc i) stack' start)))
            (when (> start -1)
              (swap! results conj (subs text start len)))))
        @results))

    5. Putting It All Together

    Finally, I combined everything into two main functions:

    (defn parse-single-json
      "Parse a single JSON string"
      [text]
      (try
        (let [result (json-parser text)]
          (when-not (insta/failure? result)
            (let [transformed (insta/transform transform-map result)
                  transformed (if (sequential? transformed)
                               (first transformed)
                               transformed)]
              (when (valid-json-structure? transformed)
                transformed))))
        (catch Exception e
          (tap> {:exception e :text text})
          nil)))
    
    (defn extract-json
      "Extract all valid JSON structures from text"
      [text]
      (->> (find-all-json-like-strings text)
           (map parse-single-json)
           (filterv some?)))

    Learn from my mistakes

    1. Write tests from the start.
    2. Don't modify the grammar without thorough testing.
    3. Don't assume all LLM responses will contain valid JSON.
    4. Don't skip the validation step, even if parsing succeeds.
    5. Don't try to parse extremely large JSON structures in one go.

    When dealing with LLMs, robust parsing isn't just nice to have - it's essential for building reliable AI systems.

    See It In Action

    Our Auxtool Agent now streams UI operations in real-time, applying them as they arrive from the LLM. This creates a fluid, interactive experience where you can watch your UI being built dynamically as the model generates responses.

    [Video, 3:44: Demo of the Vade Auxtool building a landing page]


    Clojure Deref (Feb 14, 2025)

    Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem (feed: RSS). Thanks to Anton Fonarev for link aggregation.

    Libraries and Tools

    New releases and tools this week:


    Copyright © 2009, Planet Clojure. No rights reserved.
    Planet Clojure is maintained by Baishampayan Ghose.
    Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
    Theme by Brajeshwar.