Sr. Software Engineer - Data Solutions at BroadPeak Partners
$120,000 - $165,000 per year
About BroadPeak
BroadPeak’s mission is to enable business users to drive data management, delightfully enabling their firms to gain a competitive edge in analytics. We focus on commodity trading enterprises that have ever-increasing amounts of data to manage, need to move fast, and must deliver sustainable solutions.
We are passionate about the problem: enabling our customers to collect, curate, and make sense of their data.
Job Description
We are looking to add a Senior Software Engineer to our team. As we expand our suite of cloud-based data solutions, this opportunity calls for significant experience with data warehouse technologies. We are also looking for industry experience in applying machine learning models in the financial domain.
Our engineers don’t just write code: we’re deeply involved in systems architecture, technology selection, UI/UX, and ensuring customers’ success. We strive every day to deliver elegant, scalable solutions to real problems.
An ideal candidate brings their unique experience, interest in the problem, and eagerness to learn, even if you don’t yet have all these skills or technologies on your resume.
How You’ll Contribute
Work closely with engineers, analysts, product, infrastructure, support, and customers to understand needs and deliver solutions.
Design applications, services, and whole systems.
Identify, evaluate, and select technologies (libraries, cloud services, etc).
Full-stack development.
Troubleshoot and optimize systems in production.
Skills/Technologies
Data: SQL RDBMS (RDS PostgreSQL, SQL Server, etc) and data warehouse (Redshift, Snowflake, BigQuery, or otherwise) experience required; Kinesis/Kafka/Elasticsearch/DynamoDB/Redis/etc experience is a plus.
Functional programming: Experience with Clojure preferred; interest in and experience with FP (Scala, Rust, Common Lisp, Haskell, ML, etc) is essential.
Java and the JVM: Clojure/Java interop, JDBC, JMS, etc
Performance Optimization: Profiling and optimizing JVM-based applications.
Computer Science and Software Engineering fundamentals: data structures, space/speed tradeoffs, throughput vs latency, etc.
Domain experience working with complex data including, but not limited to, financial and trading information, large data sets, and high volume processes is a plus.
Bonus: Experience in analyzing financial data sets using machine learning techniques, including detection and ranking techniques, time-series analysis of order books and prices, and the productionization, monitoring, and governance of machine learning models.
At BroadPeak, we measure goals and delivery, not time spent at a desk. Depending on the position offered, base salary, bonus, equity, and other forms of compensation may be provided as part of a total compensation package, in addition to medical and financial benefits.
BroadPeak is an equal opportunity employer and committed to creating a welcoming environment for all of our applicants. We do not discriminate based upon race, religion, color, national origin, gender, sexual orientation, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.
Job Type: Full-time
Pay: $120,000.00 - $165,000.00 per year
Benefits:
401(k)
401(k) matching
Dental insurance
Flexible spending account
Health insurance
Health savings account
Life insurance
Paid time off
Professional development assistance
Vision insurance
Compensation Package:
Bonus opportunities
Schedule:
Monday to Friday
Experience:
Clojure: 3 years (Preferred)
Machine Learning for financial data (Nice to have)
This article is intended as the first part of an introductory series on Functional Programming. I will use Clojure, a language built on this paradigm, which is very different from the traditional Object Orientation widely used in industry and taught in traditional universities. If you are open-minded about a new programming language and believe that a new syntax and paradigm will take you to new horizons, this article is for you! I will also make some comparisons with languages historically considered object-oriented (such as Java).
Right from the start, let me say that Functional Programming has a lot to offer. With it, you can write more robust code, less prone to errors, and expand the way you think. For example, immutability, a common concept in this paradigm, minimizes the chance of running into defects caused by state being manipulated in unexpected places. A great advantage of learning Functional Programming is being able to exploit parallelism, which I will cover later. Writing parallelizable code in a functional language is much easier: the absence of side effects in a program's functions allows those functions to run in any order, and in parallel. Functional programming pushes us to always think about building functions without side effects.
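As a first taste of what immutability looks like in practice, here is a minimal Clojure sketch (the names numbers and more-numbers are my own):

```clojure
;; conj returns a NEW vector; the original value is never modified
(def numbers [1 2 3])
(def more-numbers (conj numbers 4))

numbers        ;; [1 2 3] (unchanged)
more-numbers   ;; [1 2 3 4]
```

Because numbers can never change after it is created, no other part of the program can alter it behind our backs.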
Why Clojure?
This language has interesting features that help us stay focused on what is inherent to the language and to Functional Programming: a syntax that is simple and very different from the most popular languages, and a dynamic type system, which lets us avoid thinking in Object-Oriented terms for a while. Clojure is a Lisp dialect. While most mainstream languages, such as Java, Golang, and C#, derive from Algol (like C), Clojure and its relatives belong to the Lisp family. Lisp introduced the idea that basic data structures can be represented using only mathematical functions, combined with the concept of code as data. After the initial "shock", you realize the syntax is extremely simple, with few symbols or reserved keywords.
Clojure runs on the Java Virtual Machine, which allows programs written in it to use libraries written in Java! That said, you need to install the Java Development Kit (JDK) before installing Clojure itself. Just follow the step-by-step guide in the official documentation on the Clojure website.
First Clojure code (and the REPL)
Something that may seem strange at first is that in Clojure we don't debug code in the traditional way. The REPL is an interactive environment where we write code and it is evaluated immediately, giving very fast feedback. It is ideal for small snippets and for validating ideas.
REPL is an acronym for read, evaluate, print, and loop: it reads our commands, evaluates them, prints the result on screen, and repeats the process.
Programming in the REPL is quite common among people familiar with Clojure. Of course, at some point our code ends up in files and gets packaged, but the REPL provides a very practical environment for experimentation and testing. Some people replace test-driven development with constant use of the REPL, but I don't recommend that: it is always better to create test cases to guarantee the consistency and quality of the code, regardless of the language. If you currently struggle with TDD (Test-Driven Development), learning to develop in Clojure is an excellent opportunity to practice it.
To see this REPL in practice, let's use the clj command-line tool. Just open the terminal and type clj or clojure, and an interactive prompt will open, waiting for your input: user=>. This means we can already write code! You can leave this context by pressing Ctrl+D.
First contact with Clojure
Here I'll take a slightly different approach to the language. Clojure code reminded me a lot of RPN (Reverse Polish Notation). RPN is basically a method of writing mathematical expressions in which the operators (+, -, *) are placed after the numbers (operands), eliminating the need for parentheses. An example of its use: "The calculation engine converts the formula (10 + 5) * 2 into the RPN expression 10 5 + 2 * in order to process it." It is typically used in back-end systems, such as calculators and rule engines, to evaluate complex formulas for interest, fees, or credit limits quickly and without ambiguity. (Clojure is actually the mirror image: it uses plain prefix Polish notation, with the operator coming first.)
Entering the REPL and typing (+ 1 2), the result, as expected, is 3. But the syntax looks quite different from the languages we're used to. Here's the explanation:
The expression inside the parentheses is a list made up of +, 1, and 2. The first element of the list is always a function to be invoked, and the remaining elements are arguments to that function. The same applies to the other arithmetic operations:
(* 2 3)
(/ 2 2)
(- 0 2)
For a slightly more complex operation, it looks like this:
(* 2 (+ 3 3))
;; this is a comment in Clojure and we will
;; use it to show what the functions return
;; 12
This example is where Clojure shows another syntactic difference, a fruit of its Lisp heritage. Mathematically, of course, the addition is executed first. The difference, however, is that the language requires the parentheses, leaving no room for doubt about what precedes what. Code is always evaluated from the inside out.
And how do we concatenate Strings? Like this: (str "Hello, " "world!")
We can also check whether two Strings are equal:
(= "Hello" "Hi")
;; false
(= "Hi" "Hi")
;; true
It's important to note that, unlike in many languages, = in Clojure is a function that checks whether two things are equal. It is not an assignment operator of the kind normally used to create variables.
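For contrast, a quick sketch of how a name is actually bound to a value in Clojure (the name greeting is my own choice):

```clojure
;; binding is done with def (or let), not with =
(def greeting "Hi")

;; = only compares; it never assigns
(= greeting "Hi")
;; true
```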
And how do we check whether a number is even?
(even? 2)
;; true
And whether a number is a multiple of 3?
(= 0 (mod 9 3))
;; true
In this last example, we take the remainder of dividing 9 by 3 and use the result as the second argument to the equality function, =.
Our own functions
Starting with the basics, let's create a function that takes a name and says "oi" (hi). To create functions, we do the following:
(defn oi [nome]
(str "Oi, " nome "!"))
Namespaces
When the function is created, you can see the output: #'user/oi
This means that something named oi has just been created, and it lives in the default namespace, user. Namespaces in Clojure represent the same idea as in other languages, such as packages in Java: a way of grouping functions. The combination of the namespace and the function name forms the function's identifier.
The + function, for example, lives in the clojure.core namespace, so its identifier is clojure.core/+.
Since the clojure.core namespace is available by default, the + function is always available.
Functions in other namespaces need to be required in our code before they can be used.
With the function created, let's invoke it with the desired name:
(oi "zé")
;; "Oi, zé!"
defn indicates that we are creating a function.
Then we give it a name (in this case, oi).
Next comes the argument list, surrounded by [ and ]. In this case we have a single argument, so it is [nome].
Then comes what is actually executed: the String concatenation. Note that we don't need to declare what will be returned; the last expression is the return value.
The String-concatenation expression has a peculiarity: it is applied to 3 arguments. str is one of several functions that accept an arbitrary number of arguments.
And what would a function that checks whether a number is a multiple of 3 look like?
if looks a lot like a function too, but it is actually what is called a special form: a built-in primitive of Clojure.
if takes 3 arguments. The first is a test that returns true or false; the other two represent what to execute depending on the result of the test. If it is true, the second argument is evaluated and returned; otherwise, the third argument is evaluated and returned.
It's important to note that only nil and false are considered falsy in condition checks. Other values, such as 0 and the empty String, which are falsy in many other languages, evaluate as true!
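Putting the pieces together, here is a sketch answering the earlier question (the function name multiplo-de-3? is my own):

```clojure
(defn multiplo-de-3? [numero]
  (if (= 0 (mod numero 3))
    true
    false))

(multiplo-de-3? 9)
;; true
(multiplo-de-3? 10)
;; false
```

The if here is redundant, since (= 0 (mod numero 3)) is already a boolean; it is spelled out only to show the three arguments of if.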
Consulting the documentation
Whenever you want to learn more about some Clojure feature, you can look up its documentation right inside the REPL!
To do that, use the doc function: (doc if)
Now, when there are more than two branches, the cond macro can help. It takes pairs of conditions and expressions. Here's an example:
(defn saldo [valor]
(cond (< valor 0) "negativo"
(> valor 0) "positivo"
:else "zero"))
In this example, when the value is less than zero, the first test returns true right away, and the expression that follows it, "negativo", is evaluated (in this case it does nothing special, it just returns the String "negativo"). When the value is greater than zero, the first test ((< valor 0)) fails, returning false, but the next test returns true, and its corresponding expression is evaluated. Finally, when the value is exactly 0, the first two checks return false, the last check (:else) succeeds, and "zero" is returned.
What does this :else mean? Nothing special, really. Since anything other than nil and false is considered true, anything we put in place of :else would work as a way of making "zero" the default return. If you want, try it with values such as 1 or "milhão". Well, not literally anything: numbers (even 0), Strings (even empty ones), characters, collections, and so on all work. :else is just a convention adopted by many Clojure programmers, which I'm following here.
Conclusion
Phew! I think we can now handle slightly more complex problems. Here we learned how to interact with the Clojure REPL, call functions, create our own functions, and work with conditionals. In the next article, we'll solve a very common programming problem: Fizz-Buzz. You'll see how succinct the solution becomes in Clojure.
The following function is used to create screenshots for this article.
We read the pixels, write them to a temporary file using the STB library and then convert it to an ImageIO object.
In the fragment shader we use the pixel coordinates to output a color ramp.
The uniform variable iResolution will later be set to the window resolution.
Note: It is beyond the topic of this talk, but you can set up a Clojure function to test an OpenGL shader function by using a probing fragment shader and rendering to a one pixel texture.
Please see my article Test Driven Development with OpenGL for more information!
Creating vertex buffer data
To provide the shader program with vertex data we are going to define just a single quad consisting of four vertices.
First we define a macro and use it to define convenience functions for converting arrays to LWJGL buffer objects.
Now we use the function to set up the VAO, VBO, and IBO.
(def vao (setup-vao vertices indices))
The data of each vertex is defined by 3 floats (x, y, z).
We need to specify the layout of the vertex buffer object so that OpenGL knows how to interpret it.
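The setup described above might be sketched as follows. This is a hedged reconstruction, not the article's actual code: the names (vertices, indices, setup-vao), the exact quad coordinates, and the use of BufferUtils instead of the article's conversion macro are all assumptions, and running it requires an active OpenGL context created via LWJGL.

```clojure
(import '[org.lwjgl BufferUtils]
        '[org.lwjgl.opengl GL11 GL15 GL20 GL30])

;; a single quad: four vertices, 3 floats (x, y, z) each
(def vertices (float-array [-1.0 -1.0 0.0
                             1.0 -1.0 0.0
                             1.0  1.0 0.0
                            -1.0  1.0 0.0]))

;; two triangles forming the quad
(def indices (int-array [0 1 2 2 3 0]))

(defn setup-vao [vertices indices]
  (let [vao (GL30/glGenVertexArrays)
        vbo (GL15/glGenBuffers)
        ibo (GL15/glGenBuffers)
        vertex-buffer (doto (BufferUtils/createFloatBuffer (count vertices))
                        (.put ^floats vertices) (.flip))
        index-buffer  (doto (BufferUtils/createIntBuffer (count indices))
                        (.put ^ints indices) (.flip))]
    (GL30/glBindVertexArray vao)
    (GL15/glBindBuffer GL15/GL_ARRAY_BUFFER vbo)
    (GL15/glBufferData GL15/GL_ARRAY_BUFFER vertex-buffer GL15/GL_STATIC_DRAW)
    (GL15/glBindBuffer GL15/GL_ELEMENT_ARRAY_BUFFER ibo)
    (GL15/glBufferData GL15/GL_ELEMENT_ARRAY_BUFFER index-buffer GL15/GL_STATIC_DRAW)
    ;; attribute 0: 3 floats per vertex, tightly packed (stride = 3 * 4 bytes)
    (GL20/glVertexAttribPointer 0 3 GL11/GL_FLOAT false (* 3 Float/BYTES) 0)
    (GL20/glEnableVertexAttribArray 0)
    {:vao vao :vbo vbo :ibo ibo}))
```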
We can now use vector math to subsample the faces and project the points onto a sphere by normalizing the vectors and multiplying with the moon radius.
In order to introduce lighting we add ambient and diffuse lighting to the fragment shader.
We use the ambient and diffuse lighting from the Phong shading model.
The ambient light is a constant value.
The diffuse light is calculated using the dot product of the light vector and the normal vector.
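The ambient and diffuse terms described above can be sketched as a fragment shader, stored in a Clojure string as is common with LWJGL. The variable names (light, normal, fragColor) and the constant ambient value are my own assumptions, not the article's:

```clojure
(def fragment-shader-source "
#version 130
in vec3 normal;       // interpolated surface normal
uniform vec3 light;   // unit vector pointing towards the light source
out vec4 fragColor;
void main()
{
  float ambient = 0.1;                                      // constant ambient term
  float diffuse = max(dot(light, normalize(normal)), 0.0);  // Lambertian diffuse term
  float intensity = ambient + diffuse;
  fragColor = vec4(intensity, intensity, intensity, 1.0);
}")
```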
Goddamn, things have been busy lately. Mostly personal stuff rather than tech stuff, so you won't get to hear about the majority of it unless you know me in real life. And even then, let's be honest, it's more boring than interesting. The work/tech stuff I've been interested in has to do with docker compose and 3D printing.
# (see if you've got anything unexpected being captured by docker networking)
ip -4 addr | awk '/inet /{print $2}'
docker network ls
docker network inspect bridge | grep -i subnet -A2
# (if so, edit /etc/docker/daemon.json to set non-overlapping pools; add
#   "default-address-pools": [
#     { "base": "172.80.0.0/12", "size": 24 }
#   ]
# to the top level, then restart docker)
sudo systemctl restart docker
docker network prune -f
# (continue about your docker composing business)
This is more of a PSA than a progress report. The actual project isn't published yet, so I'm not going to scoop the research team, but suffice it to say that it involves running docker compose a lot for testing/local development purposes. And it turns out that thanks to the idiosyncrasies of Docker networking, the docker daemon sometimes captures local IP ranges. If you get somewhat unlucky, it might capture IPs belonging to websites you want to visit, at which point you won't be able to.
If you're extremely unlucky, it'll capture 192.168.0.*. If this happens, your symptom will be the sudden and inexplicable lack of connectivity to anything on your local network. And, I realize this doesn't apply to many people, but if you have a locally running GPU cluster for some odd reason, you won't be able to access it and you won't really know why.
The solution is to edit your /etc/docker/daemon.json so that it has the top-level key "default-address-pools" set to something like
[
{
"base": "172.80.0.0/12",
"size": 24
}
]
If you find yourself doing a lot of docker compose calls, and don't want your network to be borked as a result, do the same.
These things are both really fun and really useful. The main ones I've got running right now are a Creality Ender3 and an SV08. There are definitely more polished products out there, but, as you know, I'm an Emacs user. Which telegraphs an almost OCD-level of desire for control of my affordances, and a maniacal drive to tinker and experiment. So I'm naturally going to go for the less polished, but more open-source-friendly options.
The SV08 is running perfectly stock right now; that is, Sovol's stock. Apparently it didn't use to come with regular klipper? I seem to have all the control that implies, so I guess that changed somewhere along the line since it launched. The Ender3 has been seriously messed with. It's really prone to various failures (clogs, touch module errors, jams, etc), which means I've gotten really comfortable with the process of disassembling it and putting it back together properly. I'm setting up a RasPi/klipper setup for it as we speak, so I'll be able to let you know how that works in a bit, and I'll be setting up a webcam to go with it at approximately the same time.
In the meantime, what I can definitely recommend is:
if you have the spare cash to throw at your hobby, definitely spring for a quick-swap hotend. It doesn't actually accelerate nozzle swaps as far as I can tell, but it does keep you from needing to do any hot tightening, and it makes the output more consistent. The stock Ender3 hotend comes with a short nozzle which then gets fed through a PTFE tube that goes through the heatblock, and this was the cause of at least three of the clogs I've encountered so far. Unless you're doing what I did and are deliberately trying to rack up troubleshooting experience, probably just go for it.
if you're doing the sort of work I'm doing with this thing, mostly utilitarian prototyping and not sculptures, then you'll also probably want to spring for a wider nozzle. Right now I'm running a 0.6mm rather than the stock 0.4mm, and while the layer lines are much more noticeable, I can get prints out much faster. Because my workflow is 1. magic -> 2. virtually evaluate a prototype -> 3. print -> 4. test the physical prototype -> 5. if not good, tweak it and go to 3, else ship it, a ~30% reduction in print time tightens my loop and lets me get more pieces out into the real world. If you are a sculpture printer, you should probably invest in a better printer for your purposes. If you have lab space rather than livingroom space, you should probably look at resin rather than FDM printers.
get an open toolhead cover. I recommend this one, but you might need to clip it to fit depending on which version of the Ender you're running. The reason for this is kinda dumb; about 70% of the issues I have with this printer are to do with the touch sensor, and having an open toolhead cover means I've been able to resolve those without partly disassembling it.
Apparently, installing klipper on an Ender is the single best upgrade you can make to one of these things. I didn't have the balls to do it while it was my only printer, since the entire production line would then be out of commission. Now that I've got a second workhorse, I'm giving it a shot. Once this is done, I'll also be giving this mod a serious shot. Load-cell probing seems like it'd be more accurate, less error-prone, and more reliable than dealing with the finicky touch sensor, but it might be a moot point since klipper apparently has better error recovery mechanisms available for this situation.
The frontier models have been getting better and better at coding help. ChatGPT in particular is now dangerously competent when dealing with the more common languages. I think I still edge it out in LISP/Clojure coding, but it's in the same weight class in Python and JS. It definitely still needs some background assumptions to be made. Every time I've asked it to code something from scratch, I get back a giant pile of spaghetti, but if I architect the app before asking it to fill in the blanks, I get really good results back. I'm hoping to do something about this, because I'd really like to be able to point it at a project and just have it make serious progress in my stead rather than in concert with me.
Also, relatedly, there is now a surprisingly long list of tasks for which "ChatGPT, do the thing" fails miserably but "ChatGPT, write a python program that takes foo as input and does the thing" gives you a pretty good first cut at the real solution. And many-shot prompts do even better than that. There are many things that I wasn't particularly expecting to get mundane utility for yet, but that have been more or less solved for me in my day-to-day life.
That's about everything interesting floating around in my head right now. Wish me luck, and as always, I'll let you know how it goes.
I’ve been programming in Clojure for the last five years.
I don’t write much about it here, largely because I use Clojure at work and rarely for hobby projects, so I don’t have much to share.
Even today, the post will be more about Clojure tooling, rather than Clojure itself.
Today, most developers expect the language they work with to have a certain amount of specific tools, like a language server, build system, etc.
These tools usually help work on a project with less friction.
For instance, most languages require a language server to provide things like autocomplete, jumping to definition, and linting.
However, when I do work on a hobby project, I usually prefer a more distraction-free environment.
Count me old-school, but I kinda like the simplicity of just having you, your text editor, and the code.
I don’t really use these things even with other languages.
Of course, Clojure itself doesn’t need such tooling that much, because there’s the REPL, which already acts like your language-specific tooling.
I use Emacs, and it has excellent Clojure support thanks to nREPL and CIDER - it provides most of the features that a language server can, albeit in a bit different way.
However, even with nREPL and CIDER, I don’t really use most of the features, maybe only the goto definition thing.
Another thing provided by language servers is linting, but since Clojure is a dynamic language and it has a REPL, I already evaluate code all the time, so the REPL usually gives me all errors, and I don’t really need static linting that much either.
That goes for hobby projects.
However, when working in a team on a big project, that’s a different story.
For the last five years, I’ve been working on medium-sized projects, and the main difference for me is that there’s a lot of code I didn’t write myself.
A language server helps here, because not only does it provide linting of such code, it also lets me avoid loading all of the code into the REPL for CIDER-specific features to kick in.
Also, not everyone in those projects used a language server, so every now and then the linter pointed out some potential problems.
Recently, I switched jobs, and now I work on a different team, on a much bigger project.
My work setup didn’t change - basically, my Emacs config was ready as is to start working on a new project, or that’s what I thought.
However, I noticed that things didn’t work out as planned.
I’m a firm believer that developers should use the slowest hardware they can get their hands on for developing.
If your hardware is slow, you’re urged to write optimal code and, subsequently, create tools that work on that hardware.
This is where things started to go haywire.
You see, this is a big project, with a lot of files.
Usually, that’s not a problem, however, this time it was.
In Clojure, we have Clojure-LSP as the language server to use.
I have a pretty old laptop.
Here are my specs:
Model: Lenovo IdeaPad S540
RAM: 16 GB (+ 8 GB ZRAM swap)
CPU: AMD Ryzen™ 7 3750H
GPU: AMD Radeon™ Vega 10
It was plenty for anything I did in the past, for hobby projects, that is.
At my previous workplace, I had a work laptop that had 24GB of RAM and a slightly faster CPU.
Not that I really needed the extra eight GB of RAM; I never saw usage go above 16 GB while working on a single project.
Sometimes I did work on several projects at the same time, and then it surely helped, but it was rarely the case.
At the current job, I hit SWAP all the time, and it started to bother me a lot lately.
The reason, as you might have guessed, is Clojure-LSP.
When started in the project I’m working on, it alone takes up around 8-11 GB of RAM.
Right now, as I write this post, my editor, Firefox, and a messenger already take up 4.5 GB of RAM, so adding Clojure-LSP to this mix will by itself approach my limit.
And then I start the REPL, which takes another 2-3 GB, and we're in SWAP territory.
I think it’s obvious that there’s no way my laptop can handle this load without becoming sluggish.
RAM isn’t the only bottleneck here, CPU usage spikes up a lot, too, and the temperature is around 75 degrees and up constantly.
So I decided to change this.
I’m not alone in this, unfortunately; my colleagues also suffer from Clojure-LSP being a resource hog.
And the real problem is that they have even faster machines than mine with more RAM.
So even if I upgrade my laptop or buy a new one, it won’t help that much.
So, as an experiment, we decided to disable Clojure-LSP and go back to a simpler setup.
Disabling Clojure-LSP
I use the lsp-mode package, so disabling Clojure-LSP was as easy as commenting out the hook that starts the language server, but I went further and removed lsp-mode completely, as I don’t have use for it other than for Clojure.
But now, I have no linting, which I’d like to have since this is a complex project.
The go-to linter in Clojure world is clj-kondo, so I added flymake-kondor1, since I use flymake and not flycheck.
Fortunately, Clojure-LSP uses clj-kondo internally, so the linting configuration is the same.
I was expecting that linting such a big project would still eat a lot of RAM, however, for some reason, there wasn’t any major spike in RAM usage when just using clj-kondo.
Linting works fast, and Emacs no longer freezes every now and then.
I guess communication with the language server is much more taxing than simple parsing of stdout.
However, linting isn’t the only thing provided by Clojure-LSP.
Bringing back refactoring
Two main things I noticed I rely on with a language server are symbol renaming and finding references.
Both of these tasks can be handled with refactor-nrepl, however, it’s a bit finicky.
Symbol renaming
Before we begin, let me tell you why you might want this feature as part of the tool that does code analysis.
Sure, it’s one thing to rename a symbol inside a single namespace, but when you want to rename it across the project, it gets tricky.
One way of doing it is to use grep, and utilize the Emacs capabilities to edit the buffer created by grep directly.
While it works, it’s not as precise as a language server, because the same symbol can be written differently depending on the file.
For instance, when renaming a namespaced keyword, you can encounter a problem that the keyword is written as ::foo in the file you’re editing, but as :fully.qualified.namespace.name/foo or ::namespace-alias/foo depending on how the namespace was used.
Sure, you can write a regular expression for grep in the first case, but not really for the second one.
Here’s an example: imagine we have several namespaces in our project (can be in different files, can be in a single file like here):
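The example code did not survive in this copy, so here is a hedged reconstruction of the kind of layout meant; the namespace and symbol names are my own, not the author's:

```clojure
(ns project.multimethods)

;; ::foo in this namespace reads as :project.multimethods/foo
(defmulti handle ::foo)

(ns project.consumer
  (:require [project.multimethods :as mm]))

;; the same keyword, written via a namespace alias
(def k ::mm/foo)

;; and again, fully qualified with no alias
(def k2 :project.multimethods/foo)

(ns project.unrelated)

;; a different keyword that a naive regexp would also match:
;; ::foo here reads as :project.unrelated/foo
(def other ::foo)
```

All three spellings in the first two namespaces denote the same keyword, while the last one only looks similar.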
It’s a bit verbose, but I tried to make as short of an example that shows all possible ways to use a symbol.
Here, if you’re going to rename the ::foo keyword, you can’t really grep with "::foo|:project.multimethods/foo|::[^/]+/foo", because it will also find ::unrelated/foo, which is unrelated.
Sure, you can write a pretty generic regular expression, and then meticulously find all relevant symbols in the grep buffer, and rename them using multiple cursors, or the query-replace feature, but it’s a bit much.
Same goes for renaming functions - if you’re trying to rename foo from project.utils, grepping can find foo in the project.unrelated namespace.
So grep is not a suitable alternative to Clojure-LSP, as it isn’t capable of doing semantic analysis.
Since we’ve disabled Clojure LSP, we need a different tool to handle this task.
Thankfully, there’s the refactor-nrepl project that provides refactoring features via the nREPL integration that we’re using.
It has a lot of features, and it works well enough for our project.
Well enough, because refactor-nrepl is a bit finicky.
Problems with refactor-nrepl
First of all, refactor-nrepl works.
Most of the time, that is.
I think one of the reasons why Clojure LSP used so much RAM was that it did all of the possible analysis in the background.
The reason I think it’s true is that it did the renaming almost instantaneously.
When renaming a symbol with refactor-nrepl, I often get a timeout error with CIDER - renaming is a blocking operation, and CIDER tries not to block the editor for too long in some cases.
On a large project, renaming a symbol takes a lot of time to fetch all symbol occurrences alone.
I guess that’s the trade-off.
Perhaps I can configure the timeout to be a bit longer, but we’ll see.
Another feature of Clojure LSP I relied on was finding usages.
Refactor nREPL gives that in the form of finding references, which, again, works, but is susceptible to the same timeout problem.
And I’m not sure if it is as precise as Clojure LSP’s one.
CIDER itself also has a feature for finding usages, but it is also finicky in its own way.
I suppose that, since Clojure-LSP appeared, fewer people use refactor-nrepl today than before, and the overall pace of refactor-nrepl development may have slowed because of it.
Looking at GitHub graphs, refactor-nrepl development started around 2015, and major activity stopped around 2019, precisely when Clojure-LSP was created.
Figure 1: refactor-nrepl
Figure 2: clojure-lsp
It's a bit of a shame if Clojure-LSP was responsible for slowing down refactor-nrepl's advancement, but thankfully it is still maintained and works.
It’s good to have alternatives, and putting all eggs into one basket was never a good way of doing things.
With refactor-nrepl, most of the things I used Clojure LSP for are available to me again, but I have to do further testing, because I found some instances where it fails.
Maybe I need to configure a bunch of settings.
Or maybe you’ll see another post with me going back to Clojure LSP in a few months/weeks.
Who knows!
Not a bashing on Clojure LSP
Contrary to what this post may seem like, this was not intended as bashing on Clojure LSP developers.
Clojure LSP is a good piece of tech, and certainly helps Clojure developers around the world.
The reason it is problematic to use on this particular project can be due to a lot of factors.
First, the project is enormous: lsp-mode reports that it wants to “watch” around 2300 directories.
Disabling file watchers helps, but by a small margin.
Second, the tech stack.
Clojure LSP is written in Clojure, which, while it makes sense, might not be the best choice.
Clojure is known to be not the best tool for writing utilities.
And while Clojure LSP is a server, and not a CLI utility with a fast lifecycle, it may still be sub-optimal.
Maybe, once Jank is ready, Clojure LSP could be rewritten in it, making it faster.
I don’t know.
A completely unrelated reason, in my opinion, is that Clojure itself is not the best target for LSP.
It’s a dynamic language, where we do a lot at runtime in the REPL.
Doing static analysis in such a system can be difficult, as there’s no longer one source of truth.
nREPL, bridging runtime and source code, while also having a lot of LSP features, seems like a much better fit for languages like Clojure.
nREPL appeared almost six years before the Language Server Protocol, and in my opinion it could have been a far better protocol for developing language tooling, especially since it is also language-agnostic.
Anyway, replacing Clojure LSP with plain clj-kondo brought back the joy of writing Clojure, so I’m happy again.
What’s up with the name? Why not just flymake-kondo? ↩︎
Wherein we describe the significant enhancements to typename syntax and resolution in ClojureCLR effective with version 1.12.3-alpha2.
TL;DR
Several significant improvements have been made to typename syntax and resolution in ClojureCLR.
Discover and automatically load assemblies that the application depends on and assemblies that are part of the .NET runtime, so that types in those assemblies can be found without explicit assembly loading. You will never write (assembly-load "System.Text.Json") again; if you execute (System.Text.Json.JsonSerializer/Serialize ...), the assembly will be loaded automatically if not already loaded.
You can define type aliases for any type, including generic types.
You can use type aliases at the top level or embedded as generic type parameters.
You can use the built-in Clojure primitive type names such as int, long, shorts, etc. as generic type parameters.
In many places, you no longer need to include the arity of the generic type in the name.
ClojureCLR typename syntax and resolution: status quo ante
I wrote previously about typename resolution in ClojureCLR in Are you my type?. That post described the strategies used to look up types by name, and some of the tradeoffs involved.
Two prominent pain points in dealing with typenames and resolving them are:
the need to call assembly-load or related functions to load assemblies before types in them can be resolved, when the assemblies of interest could be automatically discovered and loaded.
the need to use fully namespace-qualified names, explicit generic parameter counts, and other syntactic burdens when referring to types.
The first is self-explanatory. For the second, a poster child of the problem is
Why must we write System.Int64 instead of just int?
Why must we write System.String instead of just String? We can just write String when used directly as a type hint. Why not here?
Why must we write Dictionary`2 instead of just Dictionary? We have the context to infer the arity of the generic type definition.
One has to make direct reference by name to underlying platform types in various places in Clojure (JVM or CLR) code. Type hints to avoid reflection are one example. There are some places where a string can be used instead of a symbol to refer to a type, but often a symbol must be used. This presented a problem for ClojureCLR, given the complexity of CLR type names.
For a variety of reasons, I decided to use the syntax of fully-qualified type names used by the CLR itself. This is the syntax used by methods such as Type.GetType() and Assembly.GetType().
A non-trivial problem with that choice: the syntax uses characters such as backquotes, commas, and square brackets that are not valid in Clojure symbols. So I had to come up with a way to write a symbol using characters that the Lisp reader would not normally accept. (Any alternative syntax likely would have had the same problem.)
Other Lisps have solutions to this problem. I decided to use a simplified version of the symbol syntax used in Common Lisp. This is the |-quoting used by the ClojureCLR Lisp reader. Read about it in Reader extension: |-quoting.
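The idea behind |-quoting is easy to sketch. Here is an illustrative Python version (our own code, not ClojureCLR's reader): everything between the pipes becomes the symbol's name verbatim, so characters a Lisp reader would normally reject can appear in a typename symbol.

```python
def read_pipe_quoted(token: str) -> str:
    """|-quoting sketch: the text between the pipes is taken verbatim as
    the symbol's name, allowing characters like ` , [ ] that the reader
    would otherwise reject."""
    if not (token.startswith("|") and token.endswith("|") and len(token) >= 2):
        raise ValueError("not a |-quoted token")
    return token[1:-1]
```

The resulting name can then be handed unchanged to the platform's type-lookup machinery.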
The type resolution code in ClojureCLR passes the name of this symbol directly to methods such as Type.GetType() and Assembly.GetType().
Unfortunately, this syntax is not very pleasant to write.
Aliases in the before world
Why not just use import and ns declarations to define type aliases?
Namespaces already supply a mechanism for mapping symbols to types.
Namespaces are mappings from simple (unqualified) symbols to Vars and/or Classes.
Vars can be interned in a namespace, using def or any of its variants,
in which case they have a simple symbol for a name and a reference to their containing namespace,
and the namespace maps that symbol to the same var.
A namespace can also contain mappings from symbols to vars interned in other namespaces by using refer or use,
or from symbols to Class objects by using import. [ Emphasis mine. Reference: Namespaces ]
Use of the symbol-to-type map is embedded all over the Clojure interpreter/compiler code. I’ve written a little about this:
Every namespace comes pre-loaded with a set of type aliases for all the public types in the System namespace in assemblies that are loaded during Clojure initialization. This is why you have been able to write
(.ToLower ^String s)
There is an entry in the namespace map associating String with the type System.String. That mapping is found when the type hint ^String is processed.
Clojure provides a mechanism for users to define type aliases: import. Though one can call import directly, it is more commonly encountered in :import clauses in ns declarations. import and (ns ... (:import ...)) can do some of what we want, but is not tied into the underlying CLR type resolution mechanism. For example, you can write:
to introduce aliases FileInfo for System.IO.FileInfo, etc.
And these will work standalone.
(defn f [file-info] (.FullName ^FileInfo file-info))
But, before the changes described below, this would not work:
|System.Collections.Generic.List`1[FileInfo]|
The underlying CLR typename resolution algorithm is not aware of Clojure aliases; it requires fully-qualified names. Instead of FileInfo, you must write System.IO.FileInfo.
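To make concrete what "integrating aliases into typename resolution" means, here is a rough Python sketch (the alias table and the simplistic tokenizer are ours, not ClojureCLR's code): alias identifiers inside a typename are expanded to their fully-qualified forms before the name is handed to a Type.GetType-style lookup.

```python
import re

# Hypothetical alias table; in ClojureCLR the real mappings live in the
# namespace's symbol-to-type map.
ALIASES = {"FileInfo": "System.IO.FileInfo", "int": "System.Int32"}

def expand_aliases(typename: str) -> str:
    """Expand alias identifiers inside a CLR-style typename.
    Identifiers are runs of letters, digits, dots, and underscores;
    backquotes, brackets, and commas are left untouched."""
    return re.sub(r"[A-Za-z_][A-Za-z0-9_.]*",
                  lambda m: ALIASES.get(m.group(0), m.group(0)),
                  typename)
```

With this in place, a name like List`1[FileInfo] resolves the same as its fully-qualified spelling.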
In addition, import does not do what is needed for generic types. Though you can do
you would get an error about defining a second alias for List`1.
Let’s turn to the improvements.
Improved assembly discovery
There are assemblies that should be checked and automatically loaded when looking up types. These include assemblies that the application itself directly depends on (they might not be loaded yet) and assemblies that are part of the .NET runtime, such as System.Collections.Concurrent.dll.
The new version of RT.classForName() (see the aforementioned Are you my type?) automatically discovers and loads these assemblies. It uses the DependencyContext class to find the entry assembly’s dependencies. It also discovers assemblies that are part of the shared .NET runtime via AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES").
Note that the latter is not available in .NET Framework 4.6.2. I added a build of ClojureCLR for .NET Framework 4.8.1 which does have access to AppContext.GetData. If you are on Framework and want the improved typename resolution, use that build.
The algorithm uses several heuristics to identify assemblies that should be inspected. There might be some improvements in these heuristics in the future. But the current version seems to work well in practice. One user reported on the #clr channel on Clojurians Slack that they were able to remove the following assembly-load calls in one of their files.
I’ll describe each of the three improvements in turn.
Type aliases
There are two aspects: providing a way to define type aliases that fixes the problems with import, and integrating type aliasing into the typename resolution mechanism.
To define aliases, I have introduced a new function and a new macro. They have the same functionality, but the macro does not require quoting. The following are equivalent:
The second argument must evaluate to a Type object. With type aliasing fully incorporated into typename resolution, you can now write:
(alias-type List |System.Collections.Generic.List`1|)  ;; List maps to the generic type definition
(alias-type IntList |List[int]|)                       ;; aliases List and int are both recognized
(def il (IntList/new))                                 ;; You can use an alias where a type is expected
(defn f [^IntList xs] ...)
The backquote-arity suffix is required; it is the true name of the type. In this circumstance, we do not have a context to compute the arity.
I considered ways to provide that context. C# would allow this reference
Dictionary<,>
But I’m not sure that is really better than Dictionary`2. And I very much prefer
|System.Func`6|
to
|System.Func<,,,,,>|
The need to provide backquote-arity suffixes only occurs when referring to the generic type definition itself. When you provide type arguments, the arity can be inferred, as in
|System.Collections.Generic.List[int]|
A note on the built-in special types
Clojure provides special handling for names identifying built-in primitive numeric types: int, long, shorts, etc.
ClojureCLR adds a few for primitive numeric types that are unique to ClojureCLR, such as uint and sbytes.
Note that these do not use the type alias mechanism. They are special-cased to be recognized only in certain places, such as the processing of type hints. Consider:
(alias-type Long System.Int64)  ;; establish an alias

Long  ;; => System.Int64 -- evaluates to a type
long  ;; => #object[core$long 0x28993d0 "clojure.core$long"] -- the Clojure function 'long'

(defn f [^long x] ...)  ;; interprets ^long as a type hint for System.Int64
                        ;; on the JVM, this is a type hint for the primitive numeric type 'long'
The typename resolution code will recognize int and friends only when they appear as generic type parameters. Of course, the pre-existing usage in type hints is unaffected.
Note: I tried to allow expressions like |int*[]| to be used at the top level, but something deep down in the compiler had a problem with that. I decided it wasn’t worth the effort to find a solution, for now. You can still use |System.Int32*[]| or define an alias.
Inferring generic type arity
When in C# you write
Dictionary<String, List<long>> mylist = [];
(assuming appropriate using directives), the compiler does a lot of work for you behind the scenes. It knows first of all that for Dictionary it is looking for a generic type with two type parameters. For that reason, it looks for a type named Dictionary`2: generic type definition names must have a suffix of a backquote and the arity of the generic.
However, to allow such overloading on generic arity at the source
language level, CLS Rule 43 is defined to map generic type names to unique CIL names. That Rule
states that the CLS-compliant name of a type C having one or more generic parameters, shall have a
suffix of the form `n, where n is a decimal integer constant (without leading zeros) representing the
number of generic parameters that C has. [Source: ECMA-335, Common Language Infrastructure (CLI), Partition II, section 9]
Looking through the namespaces mentioned in usings, the C# compiler finds System.Collections.Generic.Dictionary`2. Similarly, for List`1.
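The naming rule itself is simple enough to sketch. Here is an illustrative Python version (function names are ours) of CLS Rule 43 plus the arity inference that the compiler performs when type arguments are supplied:

```python
def cil_generic_name(base: str, arity: int) -> str:
    """CLS Rule 43: a generic type's CIL name is the source name plus a
    backquote and the number of generic parameters (no suffix for arity 0)."""
    return base if arity == 0 else f"{base}`{arity}"

def infer_cil_name(base: str, type_args: list) -> str:
    """When type arguments are supplied, the arity (and thus the backquote
    suffix) can be inferred from their count, as the C# compiler does."""
    return cil_generic_name(base, len(type_args))
```

So Dictionary with two type arguments resolves against the CIL name Dictionary`2, and List with one against List`1.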
Our enhancements to typename resolution in ClojureCLR now allow you to omit the arity when generic type arguments are supplied. However, you still must supply the backquote-arity in the name when no generic type arguments are given, i.e., when you are referring to the generic type definition itself.
Nested types and generic type definitions – Beware!
CLR supports nested classes. Referring to these is very straightforward – unless generics are involved. For a simple case such as
namespace MyNamespace;
public class Outer
{
public class Inner { }
}
you could refer to MyNamespace.Outer and MyNamespace.Outer+Inner. You could import MyNamespace.Outer and then refer just to Outer+Inner. All fine and dandy.
Now consider:
namespace MyNamespace;
public class GenParent<T1, T2>
{
public class Child
{
public class GrandChild<T3>
{
public class GreatGrandChild<T4, T5>
{
}
}
}
}
Working in C#, you could refer to any of these (assuming using MyNamespace; is in effect):
GenParent<,> // Generic type definition
GenParent<int, string> // constructed generic type
GenParent<,>.Child // nested type -- this is also a generic type definition
GenParent<,>.Child.GrandChild<> // nested type -- constructed generic type
GenParent<int, string>.Child.GrandChild<double> // constructed generic type
// etc.
However, if you were to print the fully-qualified name of any of these types, you would see the backquote-arity suffixes, e.g., MyNamespace.GenParent`2+Child+GrandChild`1.
We have most of the flexibility of the C# syntax in ClojureCLR, except: if you have generic type definitions, as opposed to constructed generic types (those with type arguments supplied), you must provide the backquote-arity suffixes for all generic type definitions in the nesting hierarchy. For the C# examples given above:
GenParent<,> // Generic type definition
GenParent<int, string> // constructed generic type
GenParent<,>.Child // nested type -- this is also a generic type definition
GenParent<,>.Child.GrandChild<> // nested type -- constructed generic type
GenParent<int, string>.Child.GrandChild<double> // constructed generic type
// etc.
we can write in ClojureCLR:
|MyNamespace.GenParent`2|                                     ;; Generic type definition, backquote-arity required
|MyNamespace.GenParent[int,String]|                           ;; No need for `2 here
|MyNamespace.GenParent`2+Child|                               ;; Nested generic type definition; must provide `2
|MyNamespace.GenParent`2+Child+GrandChild`1|                  ;; Nested generic type definition; must provide `2 and `1
|MyNamespace.GenParent[int,String]+Child+GrandChild[double]|  ;; No need for `2 or `1 here
If you introduce type aliases, the same rules apply.
(alias-type GP |MyNamespace.GenParent`2|)
(alias-type GPC |GP+Child|)  ;; we know the arity from the alias for GP
The comma after System.String indicates that an assembly name follows. However, if you are supplying an assembly name for a generic type parameter, you need the brackets: the top-level form is typename,assembly-specifier (note: no brackets around this), while the type argument is written as

[typename,assembly-specifier]

(Note the brackets.)
I hope you never have to deal with nested generic types and assembly-qualified names.
Conclusion
I hope this has been helpful in understanding the improvements to typename syntax and resolution in ClojureCLR. I think these changes make type referencing more pleasant to use and easier to understand.
At Nubank, we are constantly pushing the boundaries of how Artificial Intelligence can help us better understand our customers’ financial journeys. Our previous posts have detailed how we leverage transformer-based foundation models to convert sequences of transaction data into powerful embeddings, enabling us to better meet the financial needs of our customers at the right time [1, 4]. We have explored the interface that translates raw transaction data into embeddings that our models can understand [2], discussed the nuances of fine-tuning these models for specific tasks [3], and demonstrated how we optimize user narratives by thoughtfully selecting and representing transaction features and sources [5].
However, the aforementioned journey of optimizing user narratives is continuous. As we highlighted in our previous posts, choosing which information from a transaction to include and how to represent it matters, especially given the limited context length of our transformer architectures. Today, we dive deeper into a crucial aspect of transaction data: the timestamp of when the transaction happened. How we encode the “when” of a transaction can significantly impact a foundation model’s ability to understand a customer’s financial state and predict future behaviors.
In the remainder of this blog post, we first discuss the challenges with using absolute timestamps. Then, we propose a different approach that uses time deltas to represent the time information, detailing the design process and key decisions. Lastly, we present the experimental design and results that validate this new approach on a real business problem.
The Challenge with Absolute Timestamps
Initially, when representing transactions, our token-level models encoded absolute timestamps represented by special tokens for <MONTH>, <DAY>, and <WEEKDAY> for each transaction. While straightforward, this approach presented several challenges for a foundation model designed to build user representations potentially spanning long periods of time. The figure below reiterates the existing transaction tokenization procedure used by our models [2,4].
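As a sketch of that absolute-timestamp scheme (the exact token spelling here is ours, in the spirit of the post's <MONTH>/<DAY>/<WEEKDAY> special tokens, not Nubank's actual tokenizer):

```python
from datetime import date

WEEKDAYS = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
            "FRIDAY", "SATURDAY", "SUNDAY"]

def absolute_tokens(d: date) -> list:
    """Illustrative absolute-time special tokens for one transaction date.
    Every transaction gets a month, day-of-month, and weekday token."""
    return [f"<MONTH:{d.month}>", f"<DAY:{d.day}>",
            f"<WEEKDAY:{WEEKDAYS[d.weekday()]}>"]
```

Note that these tokens are identical no matter when the sequence is scored, which is exactly the "no notion of now" problem discussed next.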
For example, consider a scenario where a customer becomes inactive for an extended period, perhaps a year, and then resumes activity. If the model solely relies on absolute timestamps, the embeddings generated at any point during this inactivity period would remain identical. More specifically, the model lacks a “notion of now”. This insensitivity to inactivity periods means the embeddings might not accurately reflect the customer’s current behavior, which is an aspect inherently captured by traditional machine learning features that are calculated over time windows relative to a “score date” (e.g., 1 month, 3 months, 6 months).
Furthermore, absolute timestamp encodings can lead to models overfitting to specific date periods or combinations of <MONTH><DAY><WEEKDAY> and other transaction attributes, especially if the training data covers less than a full year, or if the target has strong seasonalities. This limits the model’s ability to generalize effectively during inference, particularly for out-of-time (OOT) data.
Introducing Time Deltas: A Relative Approach
To address the limitations of absolute time encodings, we hypothesized that representing the timestamp information as a “time delta,” or the “age” of the transaction relative to the score date (the “now”), would be more effective. This approach allows embeddings to reflect periods of inactivity and better capture the recency and relevance of past transactions.
As with other transaction features, we implemented this by designing a special token. More specifically, we implemented this by quantizing time deltas into distinct buckets, similar to how we handle transaction amounts. These buckets are then represented by their own special tokens, such as:
<TIMEDELTA:1-DAY-OR-LESS>
<TIMEDELTA:BETWEEN-1-AND-2-DAYS>
<TIMEDELTA:BETWEEN-2-AND-3-DAYS>
…
<TIMEDELTA:BETWEEN-1-AND-2-MONTHS>
<TIMEDELTA:BETWEEN-2-AND-3-MONTHS>
…
<TIMEDELTA:ABOVE-2-YEARS> (with a chosen truncation cap)
Importantly, there are two hyperparameters we must choose. Firstly, the granularity/scale of the time deltas must be selected. Secondly, we must define a threshold where time deltas are truncated. In the above example, the time delta truncation threshold was set to two years. Therefore, in this case, any transaction that is greater than two years from the score date is truncated to: <TIMEDELTA:ABOVE-2-YEARS>. In the following section, we explore setting these parameters by analysing the distribution of time deltas in our data.
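The quantization itself can be sketched in a few lines. This is a simplified day-granularity version (the bucket edges here are illustrative; the post's real buckets mix day-, week-, and month-scale edges), showing both hyperparameters: the edges and the truncation cap.

```python
import bisect

def timedelta_token(age_days: float,
                    edges=(1, 2, 3, 7, 14, 30, 90, 365, 730)) -> str:
    """Quantize a transaction's age (days before the score date) into a
    special token. Ages beyond the last edge (the cap, two years here)
    are truncated into a single overflow bucket."""
    if age_days > edges[-1]:
        return "<TIMEDELTA:ABOVE-2-YEARS>"
    i = bisect.bisect_left(edges, age_days)
    if i == 0:
        return "<TIMEDELTA:1-DAY-OR-LESS>"
    return f"<TIMEDELTA:BETWEEN-{edges[i - 1]}-AND-{edges[i]}-DAYS>"
```

Choosing the edges tuple and the cap is precisely the tuning exercise described in the next section.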
Defining the Time Delta Horizon and Granularity
As mentioned, an important step to effectively use the time delta special tokens is sensibly defining the maximum time delta truncation threshold. For example, selecting a cap that’s too small risks losing valuable information. Conversely, an overly large cap can introduce an excessive number of special tokens, which may be undertrained if their occurrence is rare during the training phase.
By plotting the cumulative distribution of transaction temporal window sizes (the time between the oldest and most recent transactions in a sequence) for our training dataset, we observed that nearly 97% of sequences contained transactions up to two years old. Based on this, we decided to start by using two years as the time delta cap for our encoding. Next, we need to choose a granularity for the time delta buckets. We experimented with two different bucket strategies:
Default strategy: More granular buckets for recent data (up to 3 months), then monthly buckets. This included edges like [0, 1, 2, …, 13, 14, 21, 30, 45, 60, 90, 120, …, 330, 365, …, max_age] days.
Less granular buckets: Merging some buckets for transactions aged between 1-2 weeks, to assess if we could discard age granularity for slightly older transactions. Its edges were [0, 1, 2, …, 6, 7, 14, 21, 30, 45, 60, 90, 120, …, 330, 365, …, max_age] days.
Using the default strategy, we plotted the histogram of the time-delta buckets comparing the distributions on the train, validation and test datasets. We can see that the distributions are consistent for the 3 dataset splits, which is a positive sign for generalization in the out-of-time period. The less granular strategy has a similar distribution.
Experimental Design and Results
To rigorously test our hypothesis that a relative time representation is better than an absolute one, we pre-trained four foundation model variants on the same dataset using the next token prediction task. Then, we fine-tuned each foundation model variant on a downstream task using a labeled dataset for a business problem. The variants were:
1. Baseline: Uses DAY, MONTH, WEEKDAY special tokens for absolute timestamp encoding.
2. Relative Time-Delta (REL): Uses only the relative time-delta encoding with the default bucket strategy.
3. Relative Time-Delta, Less Granular (REL-LOW): Uses only the relative encoding with the less granular bucket strategy.
4. Relative Time-Delta + Absolute Encoding (REL+ABS): Combines the relative time-delta with the baseline’s absolute encoding.
To make the distinction between these variants clear, we will explore an example of how each encodes a set of transactions. Let’s consider a user who has the following 4 transactions (with date, description and value):
30/08/2025: Supermarket, R$300,00
22/08/2025: Streaming subscription, R$30,00
22/07/2025: Streaming subscription, R$30,00
10/02/2023: Gas station, R$200,00
Then, using a score date of 31/08/2025 00:00:00, we would get the following tokens for the time representations:
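As a quick check of the deltas involved (our own arithmetic, using Python's datetime, not Nubank's pipeline):

```python
from datetime import date

score_date = date(2025, 8, 31)          # the example's score date
transactions = [date(2025, 8, 30),      # Supermarket
                date(2025, 8, 22),      # Streaming subscription
                date(2025, 7, 22),      # Streaming subscription
                date(2023, 2, 10)]      # Gas station

# Age of each transaction in days relative to the score date.
deltas = [(score_date - t).days for t in transactions]
# deltas == [1, 9, 40, 933]; the gas-station transaction (933 days)
# exceeds the two-year cap and is truncated to <TIMEDELTA:ABOVE-2-YEARS>.
```

The baseline variant would instead emit the same absolute month/day/weekday tokens for these dates regardless of the score date.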
After pre-training and fine-tuning each of the variants, we evaluated the four models on a test set containing data from a later time period, which more accurately reflects real-world production performance. The primary metric for evaluation was AUC. The Figure below shows the delta AUC versus the baseline variant.
Key Takeaways:
Significant AUC Lift with Relative Encoding: The relative time-delta encoding model achieved a 0.1 percentage point (pp) AUC lift compared to the absolute encoding baseline. While that might not sound like much, on a highly optimized model, this lift translates directly into business impact at scale. It is important to emphasize that we are not adding any new information to the model; the lift is obtained simply by better representing the temporal information.
Not Due to Context Length: Interestingly, the relative+absolute model variant demonstrated a similar AUC lift to the purely relative model. This is a crucial finding, as the relative encoding uses two fewer tokens per transaction, which is 15% more efficient in a context-length-constrained scenario. The fact that REL+ABS (which has a shorter effective context length than REL due to more tokens per transaction) still performs similarly suggests the AUC lift is genuinely due to the representation of time and not merely an extended context window.
Granularity Matters: The relative encoding with lower resolution performed worse than the other variants. This indicates that more granular time-delta information for transactions aged between one to two weeks is indeed valuable. This granularity is especially important for capturing the time passed between two transactions, which loses precision if transactions fall into wider buckets.
Improved Generalization Over Time: Driven by the positive results on the standard test set, we performed an extended evaluation on a test set covering a longer out-of-time period. Here, the relative time-delta model showed an even higher AUC lift of 0.2pp compared to the baseline. Furthermore, as shown in the Figure below, the metrics show a positive trend in the delta AUC (i.e., relative improvement over the baseline) vs the baseline as time passes, strongly supporting the hypothesis that relative encoding features generalize better in later time periods.
In this work, we found that how we represent the temporal information can significantly impact the foundation model’s ability to understand customer financial behavior. Encoding time as time deltas instead of absolute dates improved ROC-AUC by 0.2 percentage points (pp), while simultaneously reducing the number of tokens per transaction by about 15%, enabling longer transaction histories within the same token budget. These findings highlight a key principle: the way we design our data representation can have a substantial impact on model performance. The weaker results of the less granular time delta setting further underscore the importance of systematic experimentation and evaluation to achieve optimal results.
[4] Braithwaite, D. T., Cavalcanti, M., McEver, R. A., et al (2025). Your Spending Needs Attention: Modeling Financial Habits with Transformers. arXiv preprint arXiv:2507.23267.
2025 Annual Funding Report 4. Published September 4, 2025.
My goal with this funding in 2025 is to support Apple silicon (M-series CPUs) in Neanderthal
(and other Uncomplicate libraries where that makes sense and where it’s possible).
Having completed a decent Apple CPU engine for Neanderthal in the May-June period, I could continue building on the work on the Deep Diamond CPU engine for Apple hardware that I’d started.
I had already found out that Apple’s low-level APIs (BNNS) are not very well thought out (as I wrote in the last report), so I expected the march to be not so pleasant. And it wasn’t. No wonder large parts of it are already deprecated in favor of the Graph API (which does not replace its functionality, but is a completely alternative way of dealing with tensors and deep learning).
However, there was no room for quitting, since we need at least basic tensor functionality, regardless of DNN operations, so that we can later potentially integrate the Graph API and other tensor-based libraries. Besides that, the NDArray and Tensor parts of BNNS are not deprecated, so that’s the stuff we have to work with.
I will spare you the gory details, including lots of segfaults and WTFs, but in the end I managed to tame it, and even fit it into the existing deep-diamond/dnnl API, with backward compatibility (not including the convolution and rnn ops which I found not worth trying to tame at this stage).
Then, just at the end of the month, I even managed to tidy up an Apple-silicon-enabled Deep Diamond release (0.35.2, 0.35.3).
It’s in the Clojars!
I didn’t have the time and resources to advertise it widely right away, and the next day I already started working on the ONNX Runtime integration, so it’s still an Easter egg for the dedicated folks who actually read these reports. :)
I’ve already worked on Neanderthal, and made a release with assorted bugfixes and improvements.
It’s not glamorous work, working at these lower levels, but I see a bright future for Clojure tensors. Cheers!
Eric Dallo
2025 Annual Funding Report 4. Published September 8, 2025.
In these last two months I mainly focused on my recently created project, ECA, and its related projects. There were so many improvements and new features; the project has grown a lot, with lots of people using it!
ECA (Editor Code Assistant) is an open-source, free, standardized server written in Clojure that gives any editor AI features like Cursor, Continue, Claude and others.
0.2.0 - 0.43.1
There were so many releases I can’t just put the whole changelog here hehe, but the main highlights were:
Web page: A whole new site detailing the project, docs and features: https://eca.dev
New models: Anthropic (with subscription too), OpenAI, GitHub Copilot, Z.AI, OpenRouter, Azure and many others
Custom providers: It’s possible to configure custom providers for your models.
After testing other tools, improving a lot, and receiving positive feedback, I believe ECA Emacs offers the best Emacs tooling for AI development right now, which is great; there are still so many features to add!
There were mainly improvements in performance regarding clj-kondo bumps and some small fixes.
Also now we have a new custom linter clojure-lsp/cyclic-dependencies!
2025.08.15-15.37.37 - 2025.08.25-14.21.46
Docs
update neovim editor configuration for clojure lsp
General
New feature: Add clojure-lsp/cyclic-dependencies linter to detect cyclic dependencies between namespaces in the project.
Change clojure-lsp/cyclic-dependencies custom linter default level to be off until corner cases are fixed.
New optional :kondo-config-dir setting to configure clj-kondo execution.
Parallelize and log the time spent on built-in linters execution.
Fix #1851: Error when source files have non-ASCII characters in their path or name
Fix caching issue when :source-aliases changes. #2081
Fix emitting invalid messages if there’s an internal error.
Bump clj-kondo to 2025.07.28 considerably improving memory usage.
Editor
Avoid lint .lsp/stubs folder when starting.
Michiel Borkent
2025 Annual Funding Report 4. Published September 5, 2025.
In this post I’ll give updates about open source I worked on during July and August 2025.
I’d like to thank all the sponsors and contributors that make this work
possible. Without you the below projects would not be as mature or wouldn’t
exist or be maintained at all! So a sincere thank you to everyone who
contributes to the sustainability of these projects.
Although summer hit Europe and I made a train trip to Switzerland for some hiking with my wife, OSS activity continued in the borkiverse. 20 projects saw updates. As usual, babashka, SCI and clj-kondo saw the most activity.
One of the big things I’m looking forward to is speaking at Clojure Conj 2025. At the risk of sounding a bit pretentious, the title of my talk is “Making Tools Developers Actually Use”. Babashka started as a quirky interpreter “nobody had asked for”, but now many Clojure developers don’t want to live without it. Clj-kondo started out as a minimal proof-of-concept linter and is now a widely used tool in Clojurians’ everyday toolset, available even in Cursive today. In the talk I want to reflect on what makes a tool something developers (like myself) actually want to use. I’m excited about this opportunity and about my first time visiting the Conj (don’t ask me how I got the Clojure Conj cap in the photo above). Given the rest of the schedule, it’s something I wouldn’t want to miss.
For babashka, my main focus has been making it feel even more like regular Clojure. One example is the change in how non-daemon threads are handled. Previously, people sometimes had to add @(promise) to keep an httpkit server alive. Now babashka behaves like clojure -X in this regard: if you spawn non-daemon threads, the process waits for them. It looks like a small change, but it brings consistency with JVM Clojure, something I’m always aiming for with babashka. If you want the old behavior, you can still use --force-exit. While implementing this I hit an interesting bug with GraalVM and also found out that clojure -X sometimes stalls when using agents. Maybe more on this next time.
Another change that was introduced is that when code is evaluated through load-string or Compiler/load (which is the same thing in bb), vars like *warn-on-reflection* are bound. This fixes a problem with loading code in non-main threads. E.g. @(future (load-string "(set! *warn-on-reflection* true)")) would fail in previous versions of babashka. You might wonder why you would ever want to do this. Well, a similar thing happens when you execute babashka tasks in parallel and that’s where I ran into this problem.
SCI, the interpreter under the hood of babashka and several other projects, got some critical fixes as well. I detected one somewhat embarrassing bug when loading clojure+.hashp in babashka, which contained code like the following expression.
In the expression (alter-var-root #'config (constantly config)), the var #'config was mistaken for the local config, since SCI’s analyzer used a resolve-like function that also resolves locals. This fails horribly. In six years of SCI, this is the first time I encountered this bug. After fixing it, I noticed that babashka’s CI acted up. On every commit, babashka’s CI tests dozens of Clojure libraries by running their test suites, and I noticed that specter’s tests were failing. It turned out that one test had actually worked prior to the fix precisely because the SCI analyzer’s resolve returned a node that evaluated to a local value. But there was no way I could just leave that bug in, so I made a pull request to specter as well to set this straight. A new specter version was released that works both with older versions of babashka and the new one.
One other headscratcher in SCI was on the ClojureScript side of things and had to do with munging. In interop like (.-foo-bar #js {:foo-bar 1}) ClojureScript munges the field name in the interop form to foo_bar but in the object it stays "foo-bar". The munging of this name wasn’t applied in SCI as an oversight. So in SCI (and thus in nbb, joyride, scittle, etc.) the above expression would return 1 whereas in ClojureScript it would return nil. In contrast, (.-foo-bar #js {:foo_bar 1}) would return nil in SCI but 1 in CLJS. Although fixing this could mean a breaking change in SCI-based scripting environments I decided to align it with CLJS anyway, as switching between SCI and CLJS should not introduce these kinds of surprises.
Other SCI improvements were made around making better use of type hints on instance method interop.
And then there’s clj-kondo, the linter that is supposed to spark joy ✨, as far as a linter can do that in a developer’s life. Two new linters were added, including one that catches suspicious uses of locking. This linter was inspired by a similar rule in splint. Lots of smaller improvements were made, like sorting findings and imported files so that they are consistent across multiple runs using the --parallel option and across operating systems. And, as usual, there were bug fixes and false-positive prevention.
One happy improvement to scittle is that referencing a library introduced by a <script> tag has been made a lot easier. You can find the docs about that here. The tl;dr is that when a library registers itself as a global, you can now just use that global in :require: (require '["JSConfetti" :as confetti]).
Of course, none of this happens in isolation. I’m deeply grateful to the community and the sponsors who make this work sustainable: Clojurists Together, Roam Research, Nextjournal, Nubank, and many other companies and individuals. Every bit of support means I can keep refining these tools, fixing edge cases, and thinking about the long-term direction.
Here are updates about the projects/libraries I’ve worked on in the last two months in detail.
babashka: native, fast starting Clojure interpreter for scripting.
Bump clojure to 1.12.2
#1843: BREAKING (potentially): non-daemon thread handling change. Similar
to JVM clojure, babashka now waits for non-daemon threads to finish. This
means you don’t have to append @(promise) anymore when you spawn an
httpkit server, for example. For futures and agents, bb uses a thread pool
that spawns daemon threads, so that pool isn’t preventing an exit. This
behavior is similar to clojure -X. You can get back the old behavior where
bb always forced an exit and ignored running non-daemon threads with
--force-exit.
#1690: bind clojure.test/*test-out* to same print-writer as *out* in nREPL server
Add Compiler/demunge
Add clojure.lang.TaggedLiteral/create
Add java.util.TimeZone/setDefault
Add println-str
SCI: Var literal or special form gets confused with local of same name
#1852: (.getContextClassLoader (Thread/currentThread)) should be able to return results from babashka classpath
Bump deps.clj to 1.12.2.1565
Bind more vars like *warn-on-reflection* during load{string,reader} (same as JVM Clojure) so code can be loaded in threads other than the main thread
Fix babashka support by removing optimizations that only worked due to SCI bug
clojure-test-suite: Dialect-independent tests for clojure.core and others, focused on characterizing how Clojure JVM behaves so that other dialects can reach parity.
Added babashka to the test suite
Other projects
These are (some of the) other projects I’m involved with but little to no activity
happened in the past month.
quickdoc: Quick and minimal API doc generation for Clojure
unused-deps: Find unused deps in a clojure project
deps.add-lib: Clojure 1.12’s add-lib feature for leiningen and/or other environments without a specific version of the clojure CLI
Oleksandr Yakushev
2025 Annual Funding Report 4. Published September 6, 2025.
Hello friends! Here’s my update on July-August 2025 Clojurists Together work. It has been a slow summer for me, but I still made progress on several projects under my wing.
clj-async-profiler
I have been progressing towards a major 2.0.0 release for quite some time, and it will be out soon. The release will contain prominent changes, including a complete transition to the JFR file format for collected profiles (compatible with Java Flight Recorder), support for continuous profiling, and an exciting new flamegraph type (secret for now!). I’m also making sure that Flamebin will receive the same new features shortly after.
nREPL
We have just released nREPL 1.4, which supports configuring dynamic var values for the REPL environment and brings improvements to the load-file middleware. There are already further improvements in the pipeline, so expect a 1.5 release soon.
CIDER
Together with Bozhidar, we released CIDER 1.9 back in July, bringing the accumulated features and fixes to stable-version users. In the unstable version, I’ve fixed a couple of annoying bugs, namely:
ClojureScript node repl switches back to cljs.user namespace (fixed in piggieback 0.6.1).
Broken stacktrace response when C-c C-p throws an exception (#3827).
CIDER users will also benefit from the recent and upcoming nREPL improvements.
Peter Taoussanis
2025 Annual Funding Report 4. Published September 2, 2025.
A big thanks to Clojurists Together, Nubank, and other sponsors of my open source work! I realise that it’s a tough time for a lot of folks and businesses lately, and that sponsorships aren’t always easy 🙏
Hi folks! 👋 I’ll try to keep today’s update for July and August brief.
Recent work
Sente
(Sente is a realtime web comms library for Clojure/Script)
Sente v1.21.0-RC1 is out now. v1.21 is a big release with improved performance, improved reliability, improved logging (via Trove) and new high-speed binary serialization that supports Clojure’s rich data types.
The binary serialization is still marked as experimental to be safe, though I have it running in production myself. This is a nice improvement, especially for folks with larger payloads and/or with mobile users that might be sensitive to network speed/limits.
To try the new serialization: just give (taoensso.sente.packers.msgpack/get-packer) to your client and server constructor calls. No extra deps needed.
Truss
(Truss is a micro toolkit for Clojure/Script errors)
Truss v2.2.0 is out now with some usability improvements, and a new demo video to show what the library can do. In short: the goal is to help improve the Clojure/Script error experience. Almost all of my Clojure/Script code uses Truss in some way.
The video should be a decent starting point if you’re not familiar with Truss.
Trove and Telemere
Trove is a modern logging facade for Clojure/Script. Telemere is the successor to Timbre.
Trove v1.0 final and Telemere v1.1.0 are out now with some usability improvements.
How do these libraries relate?
Trove is intended for library authors that want to emit rich logging (incl. un/structured logging) without forcing their users to adopt any particular backend (like Telemere, Timbre, μ/log, tools.logging, SLF4J, etc.).
Telemere is intended for general users that want to emit rich logging (incl. un/structured logging) from their own applications.
Unstructured logging flattens everything to strings at the callsite; structured logging, in contrast, outputs data. It retains rich data types and (nested) structures throughout the logging pipeline from logging callsite → filters → middleware → handlers.
A data-oriented pipeline can make a huge difference - supporting easier filtering, transformation, and analysis. It’s also usually faster, since you only pay for serialization if/when you need it. In a lot of cases you can avoid serialization altogether if your final target (DB, etc.) supports the relevant types.
So the structured (data-oriented) approach is inherently more flexible, faster, and well suited to the tools and idioms offered by Clojure and ClojureScript.
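To make the structured-vs-unstructured distinction concrete, here is a tiny, generic Python illustration (not the Trove/Telemere API; all names here are hypothetical) of why a data-oriented pipeline makes filtering cheap:

```python
# Unstructured: the event is flattened to a string at the callsite;
# any downstream filtering means parsing text back apart.
unstructured = "user=42 action=login latency_ms=187"

# Structured: the event stays data end to end; filter/middleware/handler
# stages can work with the values directly, and serialization (if any)
# happens only at the final handler.
structured = {"user": 42, "action": "login", "latency_ms": 187}

def slow_logins(events, threshold_ms=150):
    """A 'filter' stage operating on rich data rather than strings."""
    return [e for e in events
            if e["action"] == "login" and e["latency_ms"] > threshold_ms]
```

With string logs, the same filter would need a regex and type coercion; with data, it is a plain comparison.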
Bayesian Optimization is a sophisticated machine learning technique that intelligently searches for
optimal solutions in complex, expensive-to-evaluate scenarios. Unlike traditional trial-and-error
approaches, it uses probabilistic models to make smart decisions about where to search next.
If you’ve ever been frustrated with how long it takes to test a new pricing model, a marketing
campaign, or an operational tweak, you’re not alone.
Most leaders know the pain: experiments are expensive, and waiting months for a clean read is not a
luxury fast-moving companies can afford.
That’s where Bayesian Optimization (BO) comes in. Think of it as the disciplined, data-driven way to
shortcut your path to better answers—without drowning in endless A/B tests or guesswork.
Why this matters for leaders like you
Let’s be clear: experimentation isn’t optional anymore. Markets are dynamic, customer expectations
are shifting, and competitors are always probing for advantage.
The challenge is how to learn quickly, responsibly, and with limited budgets. That’s exactly what
Bayesian Optimization solves for.
Save time, save money: Instead of burning cycles on every possible test, BO
learns from each result and narrows the search toward what actually works.
That means fewer wasted campaigns, faster decisions, and tighter control over experimentation spend.
Designed for uncertainty: Markets are noisy. Data is imperfect. Yet BO was built
to thrive in messy real-world conditions—where attribution is delayed, or where multiple levers
interact in unpredictable ways.
Scales with ambition: Whether you’re tuning your CAC, optimizing operational
throughput, or finding the right price point for a new product, BO grows with your scope.
A quick story (so this doesn’t stay abstract)
Imagine you’re a CMO with a $500k monthly media budget. You could try to split that budget evenly
across search, social, and display.
Or maybe you lean on your team’s instincts and go heavier on one channel. But the truth is, you don’t
really know the optimal mix until you test it.
A traditional A/B or multi-arm test could take months—burning millions—before you see results.
Meanwhile, your competitors are not standing still.
Now imagine approaching the same problem with Bayesian Optimization. You run a handful of diverse
test allocations, feed those results into the model,
and let the system recommend the next best allocation. Each iteration is smarter than the last, and
after just 6–12 cycles,
you’re close to a near-optimal media mix. What would’ve taken months and untold budget is now a
structured, fast-learning loop.
How does it actually work? (Plain English)
At its core, Bayesian Optimization treats your business system like a black box. You give it inputs
(e.g., channel budgets, discount levels, staffing ratios),
you observe outputs (CAC, revenue, SLA performance), and it builds a living statistical model that
connects the two.
From there, an “acquisition function” decides what experiment to run next—balancing between exploiting
what looks promising and exploring areas that are still uncertain.
The surrogate model: A fast-to-update statistical map (often Gaussian Processes
or tree-based models) that predicts outcomes,
including uncertainty. It’s not magic—it’s just a smarter way to interpolate and extrapolate from
the data you already have.
The acquisition function: The decision engine that picks the next trial.
Sometimes it plays it safe (exploitation), sometimes it takes a bold shot (exploration).
That balance is what makes BO so efficient.
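To make the surrogate/acquisition loop concrete, here is a minimal, self-contained Python sketch under stated assumptions: a Gaussian Process surrogate with an RBF kernel, Expected Improvement as the acquisition function, and an illustrative single-peaked revenue curve. All function names are hypothetical, not from any library:

```python
import numpy as np
from math import erf, sqrt, pi

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two 1-D point sets."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(X, y, Xs, length=1.0, noise=1e-6):
    """GP posterior mean and std at query points Xs given observations (X, y)."""
    K = rbf(X, X, length) + noise * np.eye(len(X))
    Ks = rbf(X, Xs, length)
    alpha = np.linalg.solve(K, y)
    mu = Ks.T @ alpha
    v = np.linalg.solve(K, Ks)
    # k(x, x) = 1 for this kernel, so the prior variance is 1 everywhere.
    var = np.clip(1.0 - np.sum(Ks * v, axis=0), 1e-12, None)
    return mu, np.sqrt(var)

def expected_improvement(mu, sigma, best):
    """EI balances exploiting high predicted mean vs exploring high uncertainty."""
    z = (mu - best) / sigma
    Phi = 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2.0)))  # standard normal CDF
    phi = np.exp(-0.5 * z ** 2) / sqrt(2.0 * pi)          # standard normal PDF
    return (mu - best) * Phi + sigma * phi

def bayes_opt(f, bounds, n_init=5, n_iter=10):
    """Evenly spaced initial samples, then iteratively sample the EI argmax."""
    X = np.linspace(bounds[0], bounds[1], n_init)
    y = np.array([f(x) for x in X])
    grid = np.linspace(bounds[0], bounds[1], 200)  # candidate next samples
    for _ in range(n_iter):
        yn = (y - y.mean()) / (y.std() + 1e-12)    # standardize targets
        mu, sigma = gp_posterior(X, yn, grid)
        x_next = grid[np.argmax(expected_improvement(mu, sigma, yn.max()))]
        X = np.append(X, x_next)
        y = np.append(y, f(x_next))
    return X[np.argmax(y)], y.max()

# Illustrative single-peaked revenue curve (peak at price 5, max revenue 30).
def revenue(p):
    return 30.0 * np.exp(-0.5 * ((p - 5.0) / 3.0) ** 2)

best_price, best_revenue = bayes_opt(revenue, (0.0, 9.0))
```

With only five initial evaluations plus ten EI-guided ones, the loop homes in on the price near the peak, rather than exhaustively gridding the whole range.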
Where you can apply this today
This isn’t a research toy—it’s already quietly powering optimization in industries from tech to
retail. Here are common C-suite use cases:
Marketing allocation: Split budgets across channels to maximize CAC efficiency
without over-investing in unproven tactics.
Pricing & promotions: Test fewer bundles or discount levels but still find
the sweet spot that maximizes profit.
Conversion optimization: Refine web flows or app journeys where a full factorial
test is impossible due to dimensionality.
Operational tuning: Decide staffing levels or logistics parameters that minimize
cost and SLA breaches.
Buy vs. build (and where we come in)
Here’s the truth: you don’t need a PhD-heavy lab to benefit from Bayesian Optimization. Many
platforms quietly use it under the hood already.
But the moment your objectives, constraints, or data sources are unique, an off-the-shelf black box
won’t cut it.
That’s where we come in: our solution brings the rigor of BO into your business context—integrated
with your stack, aligned to your KPIs, and framed with your guardrails.
Buy: Faster time-to-value, but limited flexibility and vendor-driven roadmaps.
Build with us: Tailored objectives, tighter integration, and a defensible
competitive edge.
The elevator pitch
Bayesian Optimization is not just another buzzword—it’s a way to accelerate learning, stretch
budgets, and make smarter, faster decisions in uncertain environments.
It builds a living model of your business dynamics, proposes the next smartest test, and helps you
converge to high-confidence answers in a fraction of the time of traditional methods.
For executives, that translates into faster ROI, tighter governance, and a competitive edge without
the chaos of endless, costly experiments.
Done right—with the right partner—it’s not just an optimization tool. It’s a decision advantage.
Bayesian Optimization in Action
We begin Bayesian Optimization with only a handful of initial price points (five evenly spaced
samples). At this stage, the model has very little knowledge of the revenue curve, so the
prediction line is rough, and the uncertainty band is wide.
NOTE: For the sake of this demo, we are working with a simple 2D dataset where
the x-axis is the price and the y-axis is the revenue.
revenue-fn: Single-peaked revenue function (a clear “hill”) that falls off smoothly
and symmetrically on both sides
(defn revenue-fn
  "Revenue is a function of price `p`"
  [p]
  (let [center 5.0  ;; peak at price = 5
        height 30.0 ;; maximum revenue
        width  3.0] ;; controls how wide the peak is
    (utils/round-n
     (* height (m/exp (* -0.5 (m/pow (/ (- p center) width) 2)))) 2)))
NOTE: This function is not known in practice; we use it only to simulate the y-axis revenue data via the revenue-fn above.
Now let’s run Bayesian Optimization on this training data and predict the price at which the revenue will be maximum.
(def result
  (bo-algo/bayes-opt
   ;; Initial price data
   initial-x
   ;; Initial revenue data
   initial-y
   ;; Function for simulating the experiment
   ;; (this function is unknown in real life)
   revenue-fn
   ;; Range to search the optimal price in
   [3 8]
   ;; Iterations
   10))
And let’s look at the Gaussian Process and Expected Improvement plots for each iteration.
Take a close look at the Next Sample point and observe how it improves, i.e. moves closer to the optimal value, after each iteration.
From the data returned by the BO process above in result, let’s look at just the predicted data.
Based on this information, we can say that the Bayesian Optimization technique has successfully maximized our revenue by predicting the best price.
Real-World Business Applications
Discover how leading
companies across industries are leveraging Bayesian Optimization to drive measurable business outcomes
💲
Dynamic Pricing Optimization
E-commerce & Retail
Challenge
Traditional pricing strategies fail to adapt to market conditions, competitor actions, and demand
fluctuations in real-time, leading to suboptimal revenue and market share.
Solution
Bayesian Optimization continuously tunes pricing parameters by learning from customer response
data, competitor pricing, and market conditions to maximize revenue while maintaining
competitiveness.
Results
Conversion Rate +15% — Higher purchase rates due to competitive pricing
Profit Margins +18% — Improved margins through intelligent pricing strategies
4-6 weeks to implement | Medium Complexity
🎯
PPC Budget Allocation
Digital Marketing
Challenge
Marketing teams struggle to optimally distribute PPC budgets across channels, keywords, and
demographics, often relying on gut feelings rather than data-driven decisions.
Solution
Bayesian Optimization automatically adjusts budget allocation across channels by learning which
combinations drive the highest ROI, continuously optimizing for conversion rates and cost
efficiency.
Results
Cost per Acquisition -32% — Reduced customer acquisition costs through
optimal bidding
ROAS +28% — Return on ad spend improved via intelligent allocation
Call Center Staffing Optimization
Challenge
Call centers face unpredictable demand patterns, making it difficult to maintain optimal staffing
levels that balance customer service quality with operational costs.
Solution
Bayesian Optimization predicts optimal staffing levels by learning from historical patterns,
seasonal trends, and real-time demand indicators to minimize wait times while controlling costs.
Results
Average Wait Time -45% — Significant reduction in customer waiting times
Staffing Costs -22% — Reduced overstaffing while maintaining service levels
Customer Satisfaction +35% — Higher CSAT scores due to improved service
delivery
6-8 weeks to implement | High Complexity
Conclusion
In this notebook, we demonstrated how Bayesian Optimization (BO) can be applied to maximize revenue
in a simulated pricing scenario. Starting with only a few initial price points, we built a Gaussian
Process model to learn the underlying revenue function and used an acquisition function to
intelligently select the next points to evaluate.
Through iterative updates, BO efficiently explored the price space, balancing exploration of unknown
regions with exploitation of known promising prices. This allowed us to converge quickly to the
optimal price point, achieving near-maximum revenue with far fewer experiments than a brute-force or
random search would require.
Key takeaways:
BO can optimize complex, expensive-to-evaluate functions with limited data.
It leverages probabilistic modeling (Gaussian Processes) to predict outcomes.
Acquisition functions like Expected Improvement guide efficient sampling.
In our example, BO successfully identified the price that maximized revenue, demonstrating its
power for real-world business optimization tasks.
Overall, this exercise shows that Bayesian Optimization is a practical and effective tool for
decision-making scenarios where experimentation is costly, allowing companies to maximize returns
while minimizing effort and expense.
Authors: Daniel Braithwaite, Arissa Yoshida, Rafael Celente, and Aman Gupta
In previous blog posts [1,2,3], we introduced Nubank’s approach for using transaction data-based foundation models to solve predictive problems [4]. These posts described how we formulate our transaction data for foundation models [2], pretrain these models, and finally finetune them (via joint fusion) for specific downstream tasks [3]. Importantly, we saw large improvements on tasks that are critical to Nubank. The most significant result was that the improvements were achieved not by using additional data sources, but rather by learning optimal transaction features as opposed to using handcrafted ones.
While powerful, these foundation models are computationally costly to train. At Nubank, we are always looking for ways to improve data efficiency to both reduce costs and build better-performing models. In this post, we explore how a novel optimizer, Muon [5], is helping us achieve these goals. Muon has recently received a significant amount of interest from the LLM research community, particularly for requiring fewer samples than AdamW (the de facto choice for most pre-training workloads) to reach a fixed pre-training quality.
The quality of our foundation models increases as a function of the amount of data used, up to and beyond 203M rows. For example, in Figure 1, we demonstrate how the test set AUC for one of our smaller models (24M parameters) scales as a function of the number of joint fusion data points. Even slight improvements, such as a 0.05% increase in AUC, are highly valuable because they could lead to millions of dollars in savings for Nubank. However, while the AUC improves, so does the training cost. Joint fusion [3] with 5M rows takes around 12 hours with 8 NVIDIA A100 GPUs, whereas with 40M rows it takes around 95 hours using the same 8 A100s.
Figure 1 – Model quality improves as a function of the dataset size
The aforementioned computational cost of training these models shows how important it is to use methods that improve data efficiency; conversely, better data efficiency also means we can achieve better performance within the same number of training steps. There are various methods for improving data efficiency, but in this blog post we explore using the Muon [5] optimizer to make our foundation model pre-training more data efficient. In turn, these improved foundation models will lead to cost savings and better product performance for Nubank’s customers.
The Muon optimizer [5] represents a significant shift from the long-dominant, heuristic-based approaches like AdamW, introducing a simple second-order optimization method derived from first principles. Specifically designed for the dense linear layers of neural networks, Muon’s core mechanism can be described as matrix-structured steepest descent with spectral norm regularization. Its fundamental operation involves “orthogonalizing” the gradient matrix for each weight layer by pushing all the singular values to be close to 1. This process preserves the directional information of the gradient while normalizing its magnitude across all directions, preventing the optimization from being dominated by a few noisy or less useful gradient components. This theoretically elegant concept is made practical through the use of the efficient Newton-Schulz iteration [6], which approximates the orthogonalization without the prohibitive computational cost of a full SVD computation.
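As a rough illustration of the orthogonalization step, here is a NumPy sketch using the classic cubic Newton-Schulz polynomial. Note this is an assumption-laden simplification: the production Muon optimizer uses a tuned quintic variant and applies this to momentum buffers, not raw gradients:

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=30):
    """Push the singular values of G toward 1 while keeping its singular
    vectors, i.e. preserve the gradient's directional information but
    equalize its magnitude across directions, without an explicit SVD.

    This is the classic cubic Newton-Schulz iteration; Muon itself uses a
    tuned quintic polynomial, but the principle is the same.
    """
    # Scale so the spectral norm is below sqrt(3), the convergence region;
    # the Frobenius norm upper-bounds the spectral norm.
    X = G / (np.linalg.norm(G) + 1e-7)
    for _ in range(steps):
        X = 1.5 * X - 0.5 * (X @ X.T @ X)
    return X

rng = np.random.default_rng(0)
G = rng.normal(size=(8, 5))          # a dense "gradient" matrix
O = newton_schulz_orthogonalize(G, steps=40)
s = np.linalg.svd(O, compute_uv=False)
# all singular values of O are now close to 1
```

The iteration converges to the polar factor of G (the nearest matrix with all singular values equal to 1), using only matrix multiplies, which is what makes it GPU-friendly compared to a full SVD.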
This principled design directly translates to substantial gains in both data and computational efficiency. Muon’s orthogonalized momentum updates allow for more stable and direct steps toward the loss minimum and enable the model to learn more from each token it processes. The efficiency gains are substantial from a computational standpoint. Scaling law experiments consistently demonstrate that Muon can achieve model quality comparable to that of an AdamW-trained counterpart while consuming only about half (~52%) of the training FLOPs, which corresponds to an approximate 2x improvement in computational efficiency [7,8].
To test our hypothesis that Muon can lead to better foundation models for Nubank, we pre-trained several 330M-parameter models on a 20M-sample dataset. We compared the performance of the Muon optimizer against the widely used AdamW optimizer across four different learning rates: 1e-4, 2e-4, 1e-3, and 2e-3. The figure below shows these results. Importantly, we see Muon converges significantly faster than AdamW, and converges to solutions with lower validation losses for each of the learning rate selections.
We can also combine all these runs into a single comparison, which shows the three best-performing models are Muon 1e-3, Muon 2e-3, and AdamW 1e-3. It is worth reiterating that the Muon runs converge faster than the best-performing AdamW run. These results confirm our hypothesis that Muon can train better foundation models. An important side note is that the next-token prediction losses are unusually low for language modeling because the specialized tokens used in our foundation models have a restricted potential vocabulary.
In this blog post, we demonstrate the advantages of integrating the Muon optimizer into Nubank’s foundation model pre-training pipeline. By adopting Muon, we have achieved faster convergence and superior model quality compared to the widely used AdamW optimizer, unlocking improvements in data and computational efficiency. These advancements directly translate into tangible benefits for Nubank: reduced training costs and enhanced product performance, ultimately delivering a better experience for our customers. Our findings confirm that sophisticated optimization techniques like Muon are crucial for pushing the boundaries of what is possible with large-scale foundation models, ensuring we continue to innovate efficiently and effectively.
[4] Braithwaite, D. T., Cavalcanti, M., McEver, R. A., et al (2025). Your Spending Needs Attention: Modeling Financial Habits with Transformers. arXiv preprint arXiv:2507.23267.
[5] Jordan, K., Jin, Y., Boza, V., You, J., Cesista, F., Newhouse, L., & Bernstein, J. (2024). Muon: An optimizer for hidden layers in neural networks. https://kellerjordan.github.io/posts/muon/
[6] Bernstein, J., & Newhouse, L. (2024). Old optimizer, new norm: An anthology. arXiv preprint arXiv:2409.20325.
[7] Shah, I., Polloreno, A. M., Stratos, K., Monk, P., Chaluvaraju, A., Hojel, A., … & Vaswani, A. (2025). Practical efficiency of muon for pretraining. arXiv preprint arXiv:2505.02222.
[8] Liu, J., Su, J., Yao, X., Jiang, Z., Lai, G., Du, Y., … & Yang, Z. (2025). Muon is scalable for LLM training. arXiv preprint arXiv:2502.16982.
Postgres has a great API for transferring data into and out of a database called COPY. What is special about it is that it supports three different formats: CSV, text, and binary. Both CSV and text are trivial: values are passed using their text representation; only quoting rules and separator characters differ.
The binary format is special in that values are not text: they’re passed exactly as they’re stored in Postgres. Thus, the binary format is more compact: it’s about 30% smaller than CSV or text. The same applies to performance: COPY-ing binary data back and forth takes about 15-25% less time.
To parse a binary dump, one must know its structure. This is what the library does: it knows how to parse such dumps. It supports most of the built-in Postgres types, including JSON(b). The API is simple and extensible.
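For context, the binary COPY layout is documented by Postgres: an 11-byte signature, a 32-bit flags field, a header extension area, then per-tuple field counts and big-endian field lengths. As a from-scratch illustration of that layout (this is not the library’s Clojure code; the Python names here are hypothetical), a minimal parser might look like:

```python
import struct
from io import BytesIO

SIGNATURE = b"PGCOPY\n\xff\r\n\x00"  # 11-byte fixed signature

def parse_copy_binary(stream):
    """Parse a PostgreSQL binary COPY dump into rows of raw byte fields.

    Field values are returned as raw bytes (akin to the library's
    :raw/:bytes mode), with None standing in for SQL NULL.
    """
    if stream.read(11) != SIGNATURE:
        raise ValueError("not a binary COPY dump")
    flags, ext_len = struct.unpack("!iI", stream.read(8))
    stream.read(ext_len)                # skip the header extension area
    rows = []
    while True:
        (nfields,) = struct.unpack("!h", stream.read(2))
        if nfields == -1:               # file trailer: we're done
            return rows
        row = []
        for _ in range(nfields):
            (length,) = struct.unpack("!i", stream.read(4))
            row.append(None if length == -1 else stream.read(length))
        rows.append(row)

# A synthetic two-column dump: an int2 value 7 and a NULL.
buf = SIGNATURE + struct.pack("!iI", 0, 0)
buf += struct.pack("!hih", 2, 2, 7) + struct.pack("!i", -1)
buf += struct.pack("!h", -1)
rows = parse_copy_binary(BytesIO(buf))
```

Decoding the raw bytes into typed values (int2, text, timestamps, …) is exactly the part the library’s vector-of-types API takes care of.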
Here and below: I use Taggie to render complex values like date &
time, byte arrays and so on. Really useful!
This is what is going on here: we parse a source pointing to a dump using the
parse function. A source might be a file, a byte array, an input stream and so
on – anything that can be coerced to an input stream using the
clojure.java.io/input-stream function.
Binary files produced by Postgres don’t describe their own structure. Unfortunately, there is no information about types, only data. You have to help the library traverse a binary dump by specifying a vector of types. The FIELDS variable declares the structure of the file. See below for the supported types.
API
There are two parsing functions, namely:
pg-bin.core/parse accepts any source and returns a vector of parsed
lines. This function is eager, meaning it consumes the whole source and
accumulates lines in a vector.
pg-bin.core/parse-seq accepts an InputStream and returns a lazy sequence
of parsed lines. It must be called under the with-open macro as follows:
If, for any reason, your dump contains a type the library is not aware of, or you’d like to examine its binary representation, specify :raw or :bytes. Each value will then be a byte array, and it’s up to you how to deal with those bytes:
Postgres is well known for its vast JSON capabilities, and sometimes the tables we dump have json(b) columns. Above, you saw that by default they’re parsed as plain strings. This is because there is no built-in JSON parser in Java, and I don’t want to tie this library to a particular JSON implementation.
But the library provides a number of macros to extend the underlying multimethods. With a line of code, you can enable parsing json(b) types with Cheshire, Jsonista, clojure.data.json, Charred, or JSam. This is how to do it:
The set-cheshire macro extends multimethods assuming you have Cheshire
installed. Now the parse function, when facing json(b) types, will decode them
properly.
The pg-bin.json namespace provides the following macros:
set-string: parse json(b) types as strings again;
set-cheshire: parse using Cheshire;
set-data-json: parse using clojure.data.json;
set-jsonista: parse using Jsonista;
set-charred: parse using Charred;
set-jsam: parse using JSam.
All of them accept optional parameters that are passed into the underlying
parsing function.
PG.Bin doesn’t introduce any JSON-related dependencies. Each macro assumes you have added the required library to the classpath.
Metadata
Each parsed line tracks its length in bytes, offset from the beginning of a file
(or a stream) and a unique index:
At the moment, the library only parses binary dumps. Writing them is possible
yet requires extra work. Ping me if you really need writing binary files.
Scenarios
Why use this library at all? Imagine you have to fetch a mas-s-s-ive chunk of rows from a database, say 2-3 million, to build a report. That might be an issue: you don’t want to saturate memory, nor do you want to paginate using LIMIT/OFFSET, as it’s slow. A simple solution is to dump the data you need into a file and process that. You won’t keep the database constantly busy, as you’re working with a dump! Here is a small demo:
(ns some.ns
  (:require [pg-bin.core :as copy]
            [pg-bin.json :as json]))

(defn make-copy-manager
  "Build an instance of CopyManager from a connection."
  ^CopyManager [^Connection conn]
  (new CopyManager (.unwrap conn BaseConnection)))

(let [conn (jdbc/get-connection data-source)
      mgr  (make-copy-manager conn)
      sql  "copy table_name(col1, col2...) to stdout with (format binary)"
      ;; you can use a query without parameters as well
      sql  "copy (select ... from ... where ...) to stdout with (format binary)"]
  (with-open [out (io/output-stream "/path/to/dump.bin")]
    (.copyOut mgr sql out)))

(with-open [in (io/input-stream "/path/to/dump.bin")]
  (let [lines (copy/parse-seq in [:int2 :text ...])]
    (doseq [line lines]
      ...)))
Above, we dump the data into a file and then process it. There is a way to
process lines on the fly using another thread. The second demo:
(let[conn(jdbc/get-connectiondata-source)mgr(make-copy-managerconn)sql"copy table_name(col1, col2...) to stdout with (format binary)"in(newPipedInputStream)started?(promise)fut;; a future to process the output(future(with-open[_in];; must close it afterward(deliverstarted?true);; must report we have started(let[lines(copy/parse-seqin[:int2:text...])](doseq[linelines];; process on the fly;; without touching the disk...))))];; ensure the future has started@started?;; drain down to the piped output stream(with-open[out(newPipedOutputStreamin)](.copyOutmgrsqlout))@fut;; wait for the future to complete)