Clojurists Together is having our fifth board elections, and our fifth annual members meeting.
Key dates
(All dates are EOD, in Pacific Time)
Board nominations close: Sept. 25, 2025
Voting opens: a few days after submissions close and after the board has nominated candidates
Voting closes: October 16, 2025
Annual members meeting: Oct. 29, 2025, 10 am Pacific time
Board Elections
As part of our commitment to transparency and community governance, Clojurists Together holds elections for board members. The Committee is responsible for governing the projects, selecting which projects are sponsored, administering the projects, and interacting with sponsors.
Committee members are elected for a two-year term. Each election cycle, half of our board seats come up for re-election. This year there are four seats available.
If you are interested in standing for election, please fill out this form by Sept. 25th, 2025, 5 pm Pacific Time. If you can’t access the form, contact us, and we can accept your nomination by email. Nominations are open to anyone; you don’t have to be a Clojurists Together member to stand for election. Our bylaws do require you to be a member if elected to the board, though we provide a stipend that offsets the cost of your membership.
You don’t have to have lots of experience with Clojure to apply. We want a committee made up of a cross-section of the Clojure community so that we have a wide range of perspectives when making decisions on which projects to fund.
If we get more than 12 candidates for board membership, the board will nominate no more than 12 of them. Our bylaws state:
The Board shall nominate no more than 12 candidates seeking board membership in any given election. In nominating candidates for Director positions and in choosing the number of candidates to nominate overall, the Board shall use reasonable efforts to maintain a Board composition consisting of at least: (1) 25% female Directors, (2) 25% non-Caucasian Directors, and (3) 35% from any category(ies) of persons (e.g., race, gender, ability) commonly considered to have suffered from discrimination at some time and then-currently under-represented in the technology industry, in each case as determined by the Board in its reasonable discretion.
The main responsibilities of a committee member are:
Participate in the general discussions of the month-to-month running of the program
Evaluate and vote on which open source projects to fund
Help in decision-making for the future plans of Clojurists Together
These responsibilities take roughly one hour per month, though there are peaks and troughs of activity as we go through our quarterly funding cycle. If you have more time to offer, there are lots more things that need developing, automating, designing, etc. It would be great to have you help out with those things, but we don’t want to exclude people from standing because they don’t have a lot of spare time.
Our bylaws require that we do not have more than two committee members from any one company. More than two people from a company can stand for election, but if more than two of them were to be elected, only the top two ranked candidates would be elected and the other seats would go to the next most highly ranked candidates from other companies. If you have any questions about this, please get in touch.
Elections will be held once the candidates are announced, and all Clojurists Together members will be eligible to vote.
Annual Members Meeting
We are also holding our fifth members meeting at 10 am Pacific time, October 29, 2025. This will be an opportunity for Clojurists Together to share information about 2025 to date, discuss future plans, present the new board members, and, most importantly, take questions from and engage with members.
More details will follow, including a videoconferencing link.
Please share this with anyone you think would be able to represent the interests of the Clojure community and Clojurists Together members. Thanks for your support of Clojurists Together, we appreciate it!
In 2017 I discovered the free-of-charge Orbiter 2016 space flight simulator, which was proprietary at the time, and it inspired me to develop a space flight simulator myself.
I prototyped some rigid body physics in C and later in GNU Guile and also prototyped loading and rendering of Wavefront OBJ files.
I used GNU Guile (a Scheme implementation) because it has a good native interface and of course it has hygienic macros.
Eventually I got interested in Clojure because unlike GNU Guile it has multi-methods as well as fast hash maps and vectors.
I finally decided to develop the game for real in Clojure.
I have been developing a space flight simulator in Clojure for almost 5 years now.
While using Clojure I have come to appreciate the immutable values and safe parallelism using atoms, agents, and refs.
In the beginning I decided to work on the hard parts first, which for me were 3D rendering of a planet, an atmosphere, shadows, and volumetric clouds.
I read the OpenGL Superbible to get an understanding of what functionality OpenGL provides.
When Orbiter was eventually open sourced and released under the MIT license here, I inspected the source code and discovered that about 90% of the code is graphics-related.
So starting with the graphics problems was not a bad decision.
Software dependencies
The following software is used for development.
The software libraries run on both GNU/Linux and Microsoft Windows.
In order to manage the different dependencies for Microsoft Windows, a separate Git branch is maintained.
Atmosphere rendering
For the atmosphere, Bruneton’s precomputed atmospheric scattering was used.
The implementation uses a 2D transmittance table, a 2D surface scattering table, a 4D Rayleigh scattering table, and a 4D Mie scattering table.
The tables are computed using several iterations of numerical integration.
Higher order functions for integration over a sphere and over a line segment were implemented in Clojure.
For example, integration over a ray in 3D space (using fastmath vectors) was implemented as follows:
(defn integral-ray
  "Integrate given function over a ray in 3D space"
  {:malli/schema [:=> [:cat ray N :double [:=> [:cat [:vector :double]] :some]] :some]}
  [{::keys [origin direction]} steps distance fun]
  (let [stepsize      (/ distance steps)
        samples       (mapv #(* (+ 0.5 %) stepsize) (range steps))
        interpolate   (fn interpolate [s] (add origin (mult direction s)))
        direction-len (mag direction)]
    (reduce add (mapv #(-> % interpolate fun (mult (* stepsize direction-len))) samples))))
Precomputing the atmospheric tables takes several hours even though pmap was used.
When sampling the multi-dimensional functions, pmap was used as a top-level loop and map was used for interior loops.
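As an illustration of that pattern, here is a minimal sketch (function and parameter names are illustrative, not from the project):

;; sample a 4D function into nested vectors: pmap parallelises the outermost
;; dimension, while the inner dimensions use ordinary mapv
(defn sample-4d
  [fun na nb nc nd]
  (vec (pmap (fn [a]
               (mapv (fn [b]
                       (mapv (fn [c]
                               (mapv (fn [d] (fun a b c d)) (range nd)))
                             (range nc)))
                     (range nb)))
             (range na))))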
Using java.nio.ByteBuffer the floating point values were converted to a byte array and then written to disk using a clojure.java.io/output-stream:
(defn floats->bytes
  "Convert float array to byte buffer"
  [^floats float-data]
  (let [n           (count float-data)
        byte-buffer (.order (ByteBuffer/allocate (* n 4)) ByteOrder/LITTLE_ENDIAN)]
    (.put (.asFloatBuffer byte-buffer) float-data)
    (.array byte-buffer)))

(defn spit-bytes
  "Write bytes to a file"
  {:malli/schema [:=> [:cat non-empty-string bytes?] :nil]}
  [^String file-name ^bytes byte-data]
  (with-open [out (io/output-stream file-name)]
    (.write out byte-data)))

(defn spit-floats
  "Write floating point numbers to a file"
  {:malli/schema [:=> [:cat non-empty-string seqable?] :nil]}
  [^String file-name ^floats float-data]
  (spit-bytes file-name (floats->bytes float-data)))
When launching the game, the lookup tables get loaded and copied into OpenGL textures.
Shader functions are used to lookup and interpolate values from the tables.
When rendering the planet surface or the spacecraft, the atmosphere essentially gets superimposed using ray tracing.
After rendering the planet, a background quad is rendered to display the remaining part of the atmosphere above the horizon.
Templating OpenGL shaders
It is possible to make programming with OpenGL shaders more flexible by using a templating library such as Comb.
A shader that defines multiple octaves of noise on top of a base noise function lives in resources/shaders/core/noise-octaves.glsl.
One can then, for example, define the function fbm_noise using octaves of the base function noise as follows:
(defnoise-octaves"Shader function to sum octaves of noise"(template/fn[method-namebase-functionoctaves](slurp"resources/shaders/core/noise-octaves.glsl"))); ...(deffbm-noise-shader(noise-octaves"fbm_noise""noise"[0.570.280.15]))
Planet rendering
To render the planet, NASA Bluemarble data, NASA Blackmarble data, and NASA elevation data were used.
The images were converted to a multi resolution pyramid of map tiles.
The following functions were implemented for color map tiles and for elevation tiles:
a function to load and cache map tiles of given 2D tile index and level of detail
a function to extract a pixel from a map tile
a function to extract the pixel for a specific longitude and latitude
The functions for extracting a pixel for a given longitude and latitude were then used to generate a cube map with a quad tree of tiles for each face.
For each tile, the following files were generated:
A daytime texture
A night time texture
An image of 3D vectors defining a surface mesh
A water mask
A normal map
Altogether 655350 files were generated.
Because the Steam ContentBuilder does not support a large number of files, each row of tile data was aggregated into a tar file.
The Apache Commons Compress library allows you to open a tar file to get a list of entries and then perform random access on the contents of the tar file.
A Clojure LRU cache was used to maintain a cache of open tar files for improved performance.
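A minimal sketch of this caching scheme, assuming clojure.core.cache and the TarFile random-access API from Commons Compress (names are illustrative, not from the project):

(require '[clojure.core.cache.wrapped :as cw])
(import 'org.apache.commons.compress.archivers.tar.TarFile)

;; keep at most 128 tar files open at a time
(def tar-cache (cw/lru-cache-factory {} :threshold 128))

(defn cached-tar
  "Return an open TarFile for path, opening it only on a cache miss."
  [^String path]
  (cw/lookup-or-miss tar-cache path (fn [p] (TarFile. (java.io.File. ^String p)))))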
At run time, a future is created, which returns an updated tile tree, a list of tiles to drop, and a path list of the tiles to load into OpenGL.
When the future is realized, the main thread deletes the OpenGL textures from the drop list, and then uses the path list to get the new loaded images from the tile tree, load them into OpenGL textures, and create an updated tile tree with the new OpenGL textures added.
The following functions to manipulate quad trees were implemented to realize this:
(defn quadtree-add
  "Add tiles to quad tree"
  {:malli/schema [:=> [:cat [:maybe :map] [:sequential [:vector :keyword]] [:sequential :map]] [:maybe :map]]}
  [tree paths tiles]
  (reduce (fn add-tile-to-quadtree [tree [path tile]] (assoc-in tree path tile))
          tree
          (mapv vector paths tiles)))

(defn quadtree-extract
  "Extract a list of tiles from quad tree"
  {:malli/schema [:=> [:cat [:maybe :map] [:sequential [:vector :keyword]]] [:vector :map]]}
  [tree paths]
  (mapv (partial get-in tree) paths))

(defn quadtree-drop
  "Drop tiles specified by path list from quad tree"
  {:malli/schema [:=> [:cat [:maybe :map] [:sequential [:vector :keyword]]] [:maybe :map]]}
  [tree paths]
  (reduce dissoc-in tree paths))

(defn quadtree-update
  "Update tiles with specified paths using a function with optional arguments from lists"
  {:malli/schema [:=> [:cat [:maybe :map] [:sequential [:vector :keyword]] fn? [:* :any]] [:maybe :map]]}
  [tree paths fun & arglists]
  (reduce (fn update-tile-in-quadtree [tree [path & args]] (apply update-in tree path fun args))
          tree
          (apply map list paths arglists)))
Other topics
Solar system
The astronomy code for getting the position and orientation of planets was implemented according to the Skyfield Python library.
The Python library in turn is based on the SPICE toolkit of the NASA JPL.
The JPL basically provides sequences of Chebyshev polynomials to interpolate positions of Moon and planets as well as the orientation of the Moon as binary files.
Reference coordinate systems and orientations of other bodies are provided in text files which consist of human and machine readable sections.
The binary files were parsed using Gloss (see Wiki for some examples) and the text files using Instaparse.
Jolt bindings
The required Jolt functions for wheeled vehicle dynamics and collisions with meshes were wrapped in C functions and compiled into a shared library.
The Coffi Clojure library (which is a wrapper for Java’s new Foreign Function & Memory API) was used to make the C functions and data types usable in Clojure.
For example the following code implements a call to the C function add_force:
(defcfn add-force
  "Apply a force in the next physics update"
  add_force [::mem/int ::vec3] ::mem/void)
Here ::vec3 refers to a custom composite type defined using basic types.
The memory layout, serialisation, and deserialisation for ::vec3 are defined as follows:
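A hypothetical sketch of such a definition, assuming three single-precision components, coffi's layout/serialization multimethods, and a fastmath constructor (all of these details are assumptions, not the project's actual code):

(require '[coffi.mem :as mem]
         '[fastmath.vector :as v])

;; assumed memory layout: three consecutive floats
(defmethod mem/c-layout ::vec3
  [_type]
  (mem/c-layout [::mem/array ::mem/float 3]))

(defmethod mem/serialize-into ::vec3
  [obj _type segment arena]
  ;; write the vector components into native memory
  (mem/serialize-into (vec obj) [::mem/array ::mem/float 3] segment arena))

(defmethod mem/deserialize-from ::vec3
  [segment _type]
  ;; read the three components back into a fastmath vector
  (apply v/vec3 (mem/deserialize-from segment [::mem/array ::mem/float 3])))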
The clj-async-profiler was used to create flame graphs visualising the performance of the game.
In order to discover boxed numerical operations resulting from missing type declarations, *unchecked-math* was set to :warn-on-boxed.
(set! *unchecked-math* :warn-on-boxed)
Furthermore, to get reflection warnings for Java calls without sufficient type declarations, *warn-on-reflection* was set to true.
(set! *warn-on-reflection* true)
To reduce garbage collector pauses, the ZGC low-latency garbage collector for the JVM was used.
The following section in deps.edn ensures that the ZGC garbage collector is used when running the project with clj -M:run:
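A minimal sketch of what that alias could look like (the main namespace is an assumption):

{:aliases
 {:run {:main-opts ["-m" "sfsim.core"]   ; hypothetical main namespace
        :jvm-opts  ["-XX:+UseZGC"]}}}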
The option to use ZGC is also specified in the Packr JSON file used to deploy the application.
Building the project
In order to build the map tiles, atmospheric lookup tables, and other data files using tools.build, the project source code was made available in the build.clj file using a :local/root dependency:
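The alias plausibly looks something like this (the project coordinate and the tools.build version are assumptions):

{:aliases
 {:build {:deps {io.github.clojure/tools.build {:mvn/version "0.10.5"}
                 ;; the project itself, so build.clj can require its namespaces
                 sfsim/sfsim {:local/root "."}}
          :ns-default build}}}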
Various targets were defined to build the different components of the project.
For example, the atmospheric lookup tables can be built by specifying clj -T:build atmosphere-lut on the command line.
The following section in the build.clj file was added to allow creating an “Uberjar” JAR file with all dependencies by specifying clj -T:build uber on the command-line.
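A sketch of such an uber target, with assumed paths and main namespace:

(ns build
  (:require [clojure.tools.build.api :as b]))

(def class-dir "target/classes")
(def uber-file "target/sfsim.jar") ; assumed artifact name

(defn uber [_]
  (let [basis (b/create-basis {:project "deps.edn"})]
    (b/copy-dir {:src-dirs ["src" "resources"] :target-dir class-dir})
    (b/compile-clj {:basis basis :class-dir class-dir})
    (b/uber {:class-dir class-dir
             :uber-file uber-file
             :basis     basis
             :main      'sfsim.core}))) ; assumed main namespace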
To create a Linux executable with Packr, one can then run java -jar packr-all-4.0.0.jar scripts/packr-config-linux.json where the JSON file has the following content:
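A plausible sketch based on Packr's documented config format (paths, JDK location, jar name, and main class are assumptions; note that Packr expects vmargs without the leading dash):

{
  "platform": "linux64",
  "jdk": "/path/to/jdk",
  "executable": "sfsim",
  "classpath": ["target/sfsim.jar"],
  "mainclass": "sfsim.core",
  "vmargs": ["XX:+UseZGC"],
  "output": "out-linux"
}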
In order to distribute the game on Steam, three depots were created:
a data depot with the operating system independent data files
a Linux depot with the Linux executable and Uberjar including LWJGL’s Linux native bindings
and a Windows depot with the Windows executable and an Uberjar including LWJGL’s Windows native bindings
When updating a depot, the Steam ContentBuilder command line tool creates and uploads a patch in order to preserve storage space and bandwidth.
Future work
Although the hard parts are mostly done, there are still several things to do:
control surfaces and thruster graphics
launchpad and runway graphics
sound effects
a 3D cockpit
the Moon
a space station
It would also be interesting to make the game moddable in a safe way (maybe by evaluating Clojure files in a sandboxed environment?).
Conclusion
You can find the source code on GitHub.
Currently there is only a playtest build, but if you want to get notified when the game gets released, you can wishlist it here.
We look at the implementation of functions in ClojureCLR and how evaluation/compilation translates a source code definition of a function to the underlying class representation. (The first of several posts on this topic.)
The universe, and everything
There are a number of interfaces and classes that are used within ClojureCLR (and this is parallel to ClojureJVM). We will cover all of these in detail:
IFn
The essence of a function is that you can invoke it on (zero or more) arguments.
If you want a class to represent a function, the minimum requirement is to implement the IFn interface.
(In ClojureJVM, IFn also extends the interfaces Callable and Runnable; no equivalents exist in ClojureCLR.) There are invoke overloads for zero to twenty arguments, and a final overload that takes arguments past the 20 count in a params array. The applyTo method is used to define the behavior of apply.
IFnArity
This interface is not in ClojureJVM. I implemented it in ClojureCLR to provide a way to interrogate a function about which arities it supports. It was required to deal with dynamic callsites. That’s going to take us a long way from what we need to understand here, so I’m going to ignore it. (Okay, I’m avoiding it partly because I don’t remember what purpose it serves. Homework for me.) For reference only:
public interface IFnArity
{
bool HasArity(int arity);
}
AFn
If you want something to use as a function in ClojureCLR, just create a class that implements IFn and create an instance of it. You can put that any place a function can go.
As a practical matter, you should avoid that. To implement IFn, you need to provide implementations for all of the invoke methods – most of which are just going to throw a NotImplementedException – as well as applyTo.
The abstract base class AFn supplies default implementations of all the invoke methods and applyTo. The default implementations throw an ArityException, derived from ArgumentException, that indicates the function does not support the arity of the invoke.
public abstract class AFn : IFn, IDynamicMetaObjectProvider, IFnArity
{
#region IFn Members
public virtual object invoke()
{
throw WrongArityException(0);
}
public virtual object invoke(object arg1)
{
throw WrongArityException(1);
}
public virtual object invoke(object arg1, object arg2)
{
throw WrongArityException(2);
}
// ...
}
AFn cleverly provides a default implementation of applyTo that uses the invoke methods to do the work. It does this by counting the number of arguments in the ISeq passed to applyTo, and calling the appropriate invoke method. If there are more than 20 arguments, it collects the first 20 into individual variables and the rest into an array, and calls the final invoke overload.
public virtual object applyTo(ISeq arglist)
{
return ApplyToHelper(this, Util.Ret1(arglist,arglist=null));
}
The call to Util.Ret1 is a trick required by the JVM: it returns the value of its first argument. The second argument is there to ensure that the arglist variable in the caller is set to null after the call. This is to help the garbage collector reclaim the memory used by the ISeq if it is no longer needed. As it turns out, this is not really necessary in the CLR, but I never got around to deleting it – it occurs in a lot of places, ApplyToHelper among them.
If you want to create a function class yourself, I highly recommend basing it on AFn.
As an example, in the test code for ClojureCLR.Next, I use F# object expressions to create functions to test things like reduce functionality:
let adder =
{ new AFn() with
member this.ToString() = ""
interface IFn with
member this.invoke(x, y) = (x :?> int) + (y :?> int) :> obj
}
AFunction
The Clojure (CLR+JVM) compiler does not generate AFn subclasses directly. It generates classes derived from AFunction (for functions that are not variadic) and RestFn (for variadic functions). A variadic function is one that has a & parameter amongst its parameter lists, as in the example below.
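A minimal illustration (not taken from the compiler source):

;; the & parameter makes this variadic, so the compiler derives its class
;; from RestFn rather than plain AFunction
(fn
  ([x y] (+ x y))
  ([x y & zs] (apply + x y zs)))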
AFunction is an abstract base class derived from AFn:
public abstract class AFunction : AFn, IObj, Fn, IComparer
{
// ...
}
It implements interface IObj, providing metadata attachment capability.
This is done not simply by having a metadata field, but by implementing the withMeta method to return an instance of an AFunction+MetaWrapper. The latter class has a field for the metadata and a pointer back to the originating function. I guess they wanted to save the space of the field for the metadata.
It implements interface IComparer so that an instance of the class can be passed to things like sort routines, comparers for hash tables, etc. The Compare method just calls the two-argument invoke:
public int Compare(object x, object y)
{
Object o = invoke(x, y);
if (o is Boolean)
{
if (RT.booleanCast(o))
return -1;
return RT.booleanCast(invoke(y, x)) ? 1 : 0;
}
return Util.ConvertToInt(o);
}
AFunction also includes some support for a method implementation cache used in implementing protocols. That’s for discussion at another time.
When creating a derived class of AFunction, the compiler defines overrides of the invoke methods for each arity defined in the function declaration.
RestFn
On to RestFn. Ah, RestFn. The source file is almost 5000 lines long. I still have nightmares. I took the JVM version and did a bunch of search/replace operations. (For the ClojureCLR.Next, written in F#, I wrote a program to generate it.)
Consider the following:
(fn
([x y] ... two-arg definition ... )
([x y z] ... three-arg definition ... )
([x y z & zs] ... rest-arg definition ... ))
This function should fail if called on zero or one arguments.
It should call the supplied definitions when called on two or three arguments. And it should apply the variant definition when called with four or more arguments.
A key notion here is the required arity of the function: the number of fixed parameters in the variadic case. For this example, the required arity is three. If we call invoke() or invoke(object arg1), arities not covered by any case, the call should fail.
RestFn uses a very clever trick to implement this functionality in a general way. By ‘clever’, I mean that every time I look at this code I have to take an hour to learn anew what it is actually doing. I hope by writing it down here, I can make this go faster in the future.
RestFn defines standard overrides for all the invoke overloads and applyTo. These are all defined in terms of other methods, the doInvoke methods, which are designed to take a final argument: an ISeq containing the rest args, if needed.
For the example, the compiler would provide overrides for invoke(object arg1, object arg2), invoke(object arg1, object arg2, object arg3), and doInvoke(object arg1, object arg2, object arg3, object args) that implement the code supplied in each case of the function definition. Ask yourself: what should be the default implementations for all the other invoke and doInvoke overloads?
For the default invoke implementations, the behavior is determined by the required arity:
If there are fewer arguments than the required arity, the call should fail.
If there are at least as many arguments as the required arity, the call should delegate to the doInvoke overload that takes the required number of fixed arguments plus the rest argument; for our example, doInvoke(arg1, arg2, arg3, args).
Thus, all the invoke overrides are variants on this theme:
This invoke is reached by a call in the code with four arguments: (f 1 2 3 4). If f had a clause taking four arguments, we would be calling f’s override of the four-argument invoke; since we reached the default, it has no such case. Let’s say f has a required arity of 2. We have enough arguments to supply the minimum, so we end up calling doInvoke(1, 2, [3, 4]), which f has overridden.
If, instead, f had a required arity of 12, then we would not have enough arguments. The default case kicks in and we throw an exception: wrong number of arguments.
The default implementations of doInvoke all return null.
They will never be called. We will only ever call an override of doInvoke, the one matching the required arity (+ 1 for the rest args).
I’ll leave applyTo as an exercise.
Static invocation
In the previous post C4: ISeq clarity, I touched upon the notion of static invocation of functions. Static invocation is an efficiency hack. It allows a call such as
(f 1 2 3)
to bypass the usual dynamic dispatch that looks up the current value of the Var #'f and instead links the code directly to an invocation of a static method on the class defined for f. For this to happen, the first requirement is that the function allows direct invocation, the constraints being:
The function is not nested inside of another function.
The function does not close over any lexical variables from an outer scope.
The function does not use the this variable.
When these conditions are met, for each invoke the function defines, there will be a staticInvoke method of the same arity with the actual function definition. The invoke just calls the staticInvoke of the same arity.
In the posts mentioned just above, I also touched upon primitive invocation. If one of the invoke overloads is typed so that its argument types and return type contain only ^long, ^double, or (the default) ^object type hints, and are not all ^object, then the class representing the function will implement an interface such as
public interface ODLLD
{
double invokePrim(object arg0, double arg1, long arg2, long arg3);
}
In invocations where we know the types of the arguments, we can avoid boxing by calling the invokePrim method directly.
The interface is named by the type of each argument plus the return type. ODLLD = four arguments, of types Object, double, long, and long, with a return type of double. These interfaces are in the clojure.lang.primifs namespace. One to four arguments are accommodated. If you care to count, that comes to 358 interfaces. (Eventually, I’d like to replace these with the corresponding Function interfaces. We do have real generics in C#. And in ClojureCLR.Next, I’d like to get rid of the restriction to just long and double primitives.)
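As a hypothetical Clojure-side illustration, here is a function whose type hints match the ODLLD interface above:

;; with these hints, the compiled function class would implement ODLLD:
;; (Object, double, long, long) -> double
(defn weighted ^double [o ^double d ^long i ^long j]
  (* d (+ i j)))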
And that pretty much covers the classes that implement functions in ClojureCLR. Except …
The mysterious Fn
If you look above carefully, you will note that AFunction (and hence, indirectly, RestFn) implements an interface named Fn. This is a marker interface – no methods:
public interface Fn
{
}
I have found only one use of Fn in the Clojure code. Over in core_print.clj you will find this:
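From memory of the JVM version (the CLR version is analogous):

(defmethod print-method clojure.lang.Fn [o, ^Writer w]
  (print-object o w))

So the marker’s one job is selecting how functions print.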
If you have all of that digested, code generation is not as hard as one might think.
Some complexity is handled in source code macroexpansion. If you have a defn with fancy arg destructuring, macroexpansion reduces it to a form with only simple parameters, as in the simplified sketch below (gensym names vary, and the real expansion of map destructuring includes a few extra wrapping forms).
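(macroexpand '(fn [{:keys [a b]}] (+ a b)))
;; => (fn* ([p__1]
;;           (let* [map__2 p__1
;;                  a (clojure.core/get map__2 :a)
;;                  b (clojure.core/get map__2 :b)]
;;             (+ a b))))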
fn* is the underlying primitive for all function definitions. There’s a little work regularizing syntax:
(fn* [x] ...)
is the same as
(fn* ([x] ...))
You’d have that nesting anyway if you had multiple arities overloaded.
There can be an optional name to use:
(fn* myfunc ([x]))
But after that, it’s pretty much just a bunch of method bodies to parse. The fn* parses into an FnExpr that contains a list of FnMethod instances, one for each arity overload. When we generate code, the FnExpr becomes a class, and the FnMethod instances each contribute a method (or several, if we have static or primitive hackery). And there we are.
I’m lying
The actual complexity involved in code-gen for FnExpr and FnMethod is daunting.
That’s going to take another post.
Our mission is to hasten the transition to universally accessible healthcare. We deliver on this mission by enabling innovators to bring cutting-edge software and AI to the healthcare market safely and quickly. We're regulated by the UK Government and European Commission to do so.
Our certification process is optimised for software and AI, facilitating a more efficient time to market, and the frequent releases needed to build great software. This ensures patients safely get the most up-to-date versions of life-changing technology.
Come help us bring the next generation of healthcare to the people who need it.
Our challenges
Product and engineering challenges go hand in hand at Scarlet. We know our mission can only be accomplished if we:
Build products and services that our customers love.
Streamline and accelerate complex regulatory processes without sacrificing quality.
Ensure that we always conduct excellent safety assessments of our customers’ products.
Continuously ship great functionality at a high cadence - and have fun while doing it.
Build and maintain a tech stack that embraces the complex domain in which we work.
Our engineering problems are plenty and we have chosen Clojure as the tool to solve them.
The team
The team is everything at Scarlet and we aspire to shape and nurture a team where every team member:
Really cares about our customers.
Works cross-functionally with engineers, product managers, designers, regulatory experts, and other stakeholders.
Collaborates on solving hard, important, real-world problems.
Helps bring out the best in each other and support each other’s growth.
Brings a diverse set of experiences, backgrounds, and perspectives.
Feels that what we do day-to-day is deeply meaningful.
We all have our fair share of experience working with startups, open source and various problem spaces. We wish to expand the team with ~2 more team members that can balance our strengths and weaknesses and help Scarlet build fantastic products.
We’re looking for ambitious teammates who have at least a few years of experience, have an insatiable hunger to learn, and want to do the most important work of their career!
How we work
Our ways of working are guided by a desire to perform at the highest level and do great work.
Flexible working: Remote-first with no fixed hours or vacation tracking.
Low/no scheduled meetings: Keep meetings to a minimum—no daily stand-ups or agile ceremonies.
Asynchronous collaboration: Have rich async discussions and flexible 1:1s as needed.
High trust and autonomy: Everyone solves problems; we are responsible for our choices and communicating them with our teammates.
Getting together: We meet a minimum of twice a year for a week at our offices in London.
Pick your tools: We believe in engineering excellence and trust you to use the tool set you feel most productive with.
About you
If this sounds exciting to you, we believe Scarlet may be a great fit and would love to hear from you!
We believe that the potential for a great fit is even higher if you have one or more of the following:
Professional Clojure experience.
Professional full-stack web development experience.
Previous experience in the health tech / regulatory space.
Endless curiosity, and a drive to understand why things are the way they are.
Superb written and verbal communication.
Living within +/- 2 hours of the UK timezone.
The interview process
Though the order may change, the interview steps are:
nREPL 1.4.0 is out! This month we celebrate 15 years since nREPL’s development started,
so you can consider this release part of the upcoming birthday celebrations.
So, what’s new?
Probably the highlight is the ability to pre-configure default values for dynamic
variables in all nREPL servers that are launched locally (either per project or
system-wide). The most useful application for this would be to enable
*warn-on-reflection* in all REPLs. To achieve this, create ~/.nrepl/nrepl.edn
with this content:
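Something along these lines (the exact option key is a sketch from memory; consult the nREPL docs if it doesn’t work):

{:dynamic-vars {clojure.core/*warn-on-reflection* true}}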
Now, any nREPL server started from any IDE will have *warn-on-reflection* enabled.
$ clojure -Sdeps "{:deps {nrepl/nrepl {:mvn/version \"1.4.0\"}}}" -m nrepl.cmdline -i
user=> #(.a %)
Reflection warning, NO_SOURCE_PATH:1:2 - reference to field a can't be resolved.
Note: nREPL doesn’t directly support XDG_CONFIG_HOME yet, but you can easily
override the default global config directory (~/.nrepl) with NREPL_CONFIG_DIR.
Another new feature is the ability to specify :client-name and :client-version when
creating a new nREPL session with the clone operator. This allows collecting information
about the clients used, which some organizations might find useful. (I know Nubank are
making use of this already)
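A clone request advertising the client might look like this (the name and version values are illustrative):

;; message sent by an nREPL client when creating a session
{:op "clone"
 :client-name "my-editor"
 :client-version "1.2.3"}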
One notable change in nREPL 1.4 is the removal of support for Clojure 1.7. Clojure 1.7 was released
way back in 2015 and I doubt anyone is using it these days, but we try to be extra conservative with the
supported Clojure versions and this is only the second time nREPL’s runtime requirements were bumped
in the 7 and a half years I’ve been the maintainer of the project. (early on I bumped the required Clojure from 1.2 to 1.7)
As usual, the release also features small bug fixes and internal improvements. One such improvement was
the introduction of matcher-combinators in our test suite (which was the main motivation to bid farewell to Clojure 1.7).
You can check out the release notes for a complete list of changes.
That’s all I have for you today. I hope we’ll have more exciting nREPL news to share by nREPL’s “official” birthday, October 8th.
Big thanks to everyone who has contributed to this release and to all the people supporting my Clojure OSS work!
In the REPL we trust! Keep hacking!
The ISeq analyzer is Compiler.AnalyzeSeq. It receives an ISeq, which will be of the form (op ...args...).
When this is called, we know that op is not a symbol whose name starts with “def”.
AnalyzeSeq first tries to macroexpand the form. If macroexpanding gives us back something other than what we started with, it just calls Compiler.Analyze on that new thing. Otherwise:
If op is nil: throw an exception.
If op is a Var, or a symbol that resolves to a Var, and that Var has :inline metadata with an entry for the correct number of arguments: invoke that entry (it should be an IFn) on the arguments and recursively analyze the result.
If op is a special form: call the corresponding special form parser (see below).
Otherwise: call the parser for InvokeExpr (also see below).
The compiler has a map from special form symbols to the parser to be used for that special form.
Here you go:
Special form op → Handler

case* → CaseExpr
def → DefExpr
deftype* → DefType.Parser, contained in NewInstanceExpr
do → BodyExpr
fn* → FnExpr
if → IfExpr
import* → ImportExpr
let* → LetExpr
letfn* → LetFnExpr
loop* → LetExpr
monitor-enter → MonitorEnterExpr
monitor-exit → MonitorExitExpr
new → NewExpr
quote → ConstantExpr
recur → RecurExpr
reify* → Reify.Parser, contained in NewInstanceExpr
set! → AssignExpr
throw → ThrowExpr
try → TryExpr
var → TheVarExpr
. → HostExpr
Some of the op names have an asterisk at the end.
These are the primitive forms that more advanced syntactic constructs macroexpand into.
For example, let has a lot of special handling for destructuring its bindings.
A let form will macroexpand into a let* that has only simple bindings, e.g. the sketch below (illustrative; the gensym name differs on every expansion):
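(macroexpand-1 '(let [[a b] pair] (+ a b)))
;; => (let* [vec__1 pair
;;           a (clojure.core/nth vec__1 0 nil)
;;           b (clojure.core/nth vec__1 1 nil)]
;;      (+ a b))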
Also, some operators you are unlikely to type directly. More commonly they come from reader macros, e.g.,
'x ; reads as (quote x)
#'x ; reads as (var x)
The invocation parser
The catch-all parser at the end of AnalyzeSeq is InvokeExpr.Parser.Parse. When called, we know the form to analyze looks like (f arg1 arg2 ...) and we know f is not a special form symbol, as detailed above. It might not be a symbol at all; we could have a form such as ((fn [x] (inc (* 2 x))) y). This parser does a lot of special-case analysis to determine the best type of AST node to create.
The first step is to call Compiler.Analyze on f. Call the resulting AST node fexpr.
The following special cases are handled:
instance?. There is a special type of AST node just for this case: InstanceOfExpr. (I don’t know why it gets its own node type.) The conditions for this are:
fexpr is a VarExpr
the Var is actually #'instance?
the form has exactly two arguments.
static invocation. The type of AST node to create is StaticInvokeExpr The conditions are:
fexpr is a VarExpr
the :direct-linking compiler option is set to true
we are not in an ‘evaluation context’ (more on that some other day).
the Var is not marked as dynamic, does not have metadata :redef = true, and does not have metadata :declared = true
primitive invocation. We create an AST node of type InstanceMethodExpr to invoke the .invokePrim method of the function. The conditions are:
fexpr is a VarExpr
the Var is bound to a class that has an invokePrim method with a matching number of arguments (determined by looking at the :arglists metadata on the Var)
we are not in an ‘evaluation context’ (more on that some other day).
We will discuss this in more detail in C4: Functional anatomy.
keyword invocation. When our form looks like (:keyword coll), we create an AST node of type KeywordInvokeExpr. The conditions are:
fexpr is a KeywordExpr
the form has exactly one argument
passthrough of StaticFieldExpr and StaticPropertyExpr. This is to deal with the so-called “static field bug that replaces a reference in parens with the field itself rather than trying to invoke the value in the field.” Think of it as dealing with (Int64/MaxValue) when you should be writing just Int64/MaxValue.
Dealing with QualifiedMethodExpr.
Conclusion
There are many devils hidden in the details of the many parsers mentioned above. There is no substitute for actually looking at each one in turn to understand their peculiarities. I hope the organization presented here makes that task less daunting. In addition, subsequent blog posts will provide overviews of some of the more complex pieces, such as function management and interop.
We believe that behind all technical solutions there must be a clear business problem to be solved. In this case, we were facing challenges related to the stability and efficiency of our monitoring ecosystem. As Nubank scaled rapidly, our existing log infrastructure began to show signs of pressure, especially in terms of cost predictability and scalability.
Because the logging platform is key to supporting all engineering teams during troubleshooting and incident mitigation, not having full control and visibility over your monitoring data is a serious problem. There is nothing worse than trying to debug a production problem and discovering you can’t see the logs of your application. In our case, we used to rely on an external solution to ingest and store our logs, and we had poor observability around it (ironic). Once we created metrics to understand the real situation, our analysis showed that a significant portion of logs weren’t being retained end-to-end, which limited our ability to act quickly in incident response scenarios.
On top of that, our contract was getting expensive (really expensive). The only way to mitigate our problems was buying more licenses (paying more), and there wasn’t a clear pricing model for us to plan our spending. If we had problems, we’d have to add more money; no predictability was possible. It got to the point that the team calculated we could hire Lionel Messi as a software engineer for the same amount we were paying for the external solution.
With this complex and exciting problem at hand, we decided to explore alternatives, and the most efficient one seemed to be creating our own platform. This way we would have total control over our data, ingestion pipeline, storage strategy and querying runtime.
How Nubank’s Log Infrastructure Used to Work
Before moving to an in-house solution, Nubank’s log infrastructure was very simple and tightly coupled to the previous vendor’s solution.
In short, every application log was sent directly to the vendor’s platform by its own forwarder, and many different, unknown internal sources sent data directly to the vendor’s API.
This architecture served Nubank well for many years, but some years ago, with our massive hyper-scale growth, we started to face its limitations, and the future with it became a concern.
The primary concerns and problems with this architecture and approach, as identified by the team, were:
Lack of observability: We didn’t have any observability over the ingestion and storage flow; if something went wrong, we had no trustworthy metrics about it.
High coupling: Many of our alerts and dashboards were defined directly in the vendor’s interface, all our data was stored within it, and we couldn’t change solutions or migrate away easily.
Lack of control: We didn’t have any way to filter, aggregate, route, or apply logic over incoming data.
High costs: The increasing cost of the logging stack was a constant concern for stakeholders, and the trend was that it would keep growing if we didn’t take action.
Coupled ingestion and querying processes: High load on ingestion directly impacted querying performance, and vice versa.
Divide and conquer
Developing an entire log platform from scratch is hard; at the time, we didn’t have anything built!
To be able to solve this problem, we divided the entire project into two major steps:
The Observability Stream: A complete streaming platform capable of ingesting and processing observability signals reliably and efficiently, decoupling us from the vendor’s solution and giving us full control over our data.
Query & Storage Platform: The platform that would store and make logs searchable, so that engineers could use it on daily troubleshooting tasks.
For both projects we had a different set of requirements and features that we needed to build, but there were three common requirements:
Reliable: The platform needed to be reliable even under high load or unexpected scenarios to support Nubank operations.
Scalable: Able to scale quickly when facing spikes in ingestion and usage, and over the long term to keep up with Nubank’s hyper-growth.
Cost efficient: Being cost efficient is always important at Nubank, and we needed a platform that would stay cost efficient over the long term, able to ingest and store all our generated data more cheaply than any vendor.
With a clear list of requirements and expectations we started the project, first focusing on the ingestion and processing, and then the querying and storage platform.
The Observability Stream
We decided to build the ingestion platform first. This allowed us to start the migration process without any major disruption to the developer experience while already decoupling the transaction environment from the observability environment. It also allowed us to gather metrics about our data to support better decision-making, especially during the storage platform development.
The observability stream was built with simplicity in mind, with a mix of open source projects and in-house developed systems.
To summarize, the ingestion architecture is composed of three distinct systems:
Fluent Bit: We opted for a lightweight, easily configurable, and efficient data collector and forwarder. This open-source, CNCF-backed project is a reliable industry standard for the task.
Data Buffer Service: The service responsible for handling all incoming data from forwarders and accumulating it into large chunks that proceed through the pipeline in a micro-batching architecture.
Filter & Process Service: An in-house, highly scalable system that filters and processes incoming data efficiently. This system is the core of our ingestion platform: it is easily extensible with new filter/processing logic as needed, and it is also responsible for collecting metrics from incoming data.
With the observability stream fully operational, we established a foundation of reliability and scalability for our log ingestion processes. This comprehensive system not only resolved our immediate needs for quality data intake but also provided us with invaluable insights into our logging activities. Furthermore, it decoupled our ingestion processes from the querying process, allowing for greater flexibility and the ability to easily change components when needed, a capability we lacked previously due to tight coupling.
Query & Log Platform
With a robust ingestion platform ensuring reliability and scalability, our next challenge was to develop a query and storage solution capable of effectively handling and retrieving this massive volume of log data.
We then needed a query engine to search all this data, and Trino was the choice for several reasons:
Partitioning: Trino’s partitioning feature was crucial and a key factor in our decision to adopt it. By segmenting data into manageable chunks, queries target only the relevant data subsets, improving response times and reducing resource usage.
AWS S3 as storage: By storing all our data on AWS S3, we guarantee high reliability in a cost-effective way; S3’s scalability is well suited to this massive amount of data and can keep scaling over the long term as Nubank grows.
To store the logs, the chosen format was Parquet. With it, we achieve the best search performance thanks to its columnar storage, while also achieving an average compression rate of 95%. This helps us reach the goal of storing all our data in the most effective way.
To generate all this Parquet, we built a highly scalable and extensible Parquet generator app capable of transforming the massive stream of data coming from the ingestion platform. The choice to build our own internal infrastructure for this also reflects our goal of having a cost-effective alternative that we can extend and adapt to Nubank’s needs.
With our query and log platform fully integrated and operational, we have successfully redefined how Nubank manages its log data. The strategic choice of Trino for querying, S3 for storage, and Parquet for data format ensures that our logs are not only efficiently stored but also readily accessible for analysis and troubleshooting. These innovations have not only resolved initial challenges but have also equipped Nubank with a powerful tool for future growth.
Final Thoughts
Since mid-2024, Nubank’s in-house logging platform has been the default for log storage and querying. It currently ingests 1 trillion logs daily, totaling 1 PB of data. With a 45-day retention period, it stores 45 PB of searchable data. The platform handles almost 15,000 queries daily, scanning 150 PB of data each day.
Nubank developed this in-house logging platform to achieve significant cost savings and operational efficiency, moving away from reliance on external vendors. This platform is designed to support all current and future operations, scaling efficiently while costing 50% less than market solutions, according to our benchmarks.
This approach also provides Nubank with unparalleled control and flexibility. It enables rapid iteration, custom feature development, and a deeper understanding of data flows, leading to improved analytics, troubleshooting, and security.
Challenging the status quo is a core Nubank value, and this ambition drove the creation of an entire log platform from scratch, leveraging a combination of open-source projects and in-house software development.
This tutorial explores how to construct and analyze p-adic structures using prefix trees (tries) in Clojure. We will generalize binary Morton codes to other prime bases (like p=3, p=5) to understand p-adic norms and their applications in data analysis.
Any sequence of data, such as a Morton code or spatial coordinates, can be broken down into a chain of its prefixes. This forms a natural hierarchy, where each step in the chain adds more specific information.
For a sequence [a, b, c, d], the prefix chain is: [[a], [a, b], [a, b, c], [a, b, c, d]]
This structure is essentially a linked list or a simple trie, which is the foundation for our analysis. 🧱
(defn build-prefix-chain
  "Builds a list of all prefixes for a given sequence."
  [sequence]
  (map #(vec (take % sequence)) (range 1 (inc (count sequence)))))

;; Example 💡
(let [morton-code [1 0 2 1]]
  (println "Sequence:" morton-code)
  (println "Prefix Chain:" (build-prefix-chain morton-code)))
;; Output:
;; Sequence: [1 0 2 1]
;; Prefix Chain: ([1] [1 0] [1 0 2] [1 0 2 1])
2. 🔀 Decomposing Data: Two Perspectives
We can analyze the hierarchical data in our prefix chains in two ways, analogous to Jordan and Cartan decompositions in algebra.
📊 A. Jordan-like Decomposition (Breadth-First)
This approach processes prefixes level by level, from shortest to longest. It's useful for analyzing data at progressive scales of detail.
(defn jordan-decomposition
  "Sorts prefixes by their length (breadth-first)."
  [prefix-chains]
  (sort-by count (distinct (apply concat prefix-chains))))

;; Example: Analyze all prefixes by depth level 📏
(let [sequences    [[1 0 2] [1 0 1] [2 0]]
      all-prefixes (map build-prefix-chain sequences)
      decomposed   (jordan-decomposition all-prefixes)]
  (clojure.pprint/pprint (group-by count decomposed)))
;; Output:
;; {1 [[1] [2]],
;;  2 [[1 0] [2 0]],
;;  3 [[1 0 2] [1 0 1]]}
🎯 B. Cartan-like Decomposition (Depth-First)
This approach processes the deepest (most specific) prefixes first. It's useful for focusing on fine-grained local details before considering the broader structure.
(defn cartan-decomposition
  "Sorts prefixes by length in descending order (depth-first)."
  [prefix-chains]
  (reverse (jordan-decomposition prefix-chains)))

;; Example: Focus on the most detailed prefixes first 🔍
(let [sequences    [[1 0 2] [1 0 1] [2 0]]
      all-prefixes (map build-prefix-chain sequences)
      decomposed   (cartan-decomposition all-prefixes)]
  (println decomposed))
;; Output:
;; ([1 0 1] [1 0 2] [2 0] [1 0] [2] [1])
3. 📐 P-adic Norms and Ultrametric Distance
The prefix structure directly leads to the concept of p-adic norms and ultrametric distance. The distance between two sequences is determined by the length of their longest common prefix.
If two sequences A and B share a prefix of length k, their ultrametric distance is p^(-k), where p is the base of the digits (e.g., p=2 for binary, p=3 for ternary). The longer the shared prefix, the closer they are. 🎯
(defn get-common-prefix-length
  "Finds the length of the common prefix between two sequences."
  [seq-a seq-b]
  (count (take-while true? (map = seq-a seq-b))))

(defn p-adic-distance
  "Calculates the p-adic distance between two sequences for a given base p."
  [p seq-a seq-b]
  (let [k (get-common-prefix-length seq-a seq-b)]
    (Math/pow p (- k))))

;; Example with p=3 ⚡
(let [p 3
      a [1 2 0 1]
      b [1 2 0 2]
      c [1 2 1 0]]
  (println (str "Common prefix length (a, b): " (get-common-prefix-length a b)))
  (println (str "Distance(a, b): " (p-adic-distance p a b)))  ; should be 3^-3 = 0.037
  (println (str "Common prefix length (a, c): " (get-common-prefix-length a c)))
  (println (str "Distance(a, c): " (p-adic-distance p a c)))) ; should be 3^-2 = 0.111
This distance function satisfies the strong triangle inequality, d(x, z) <= max(d(x, y), d(y, z)), which is the defining property of an ultrametric space. ✨
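A quick REPL check of this property with the vectors from the example above:

(let [p 3
      a [1 2 0 1]
      b [1 2 0 2]
      c [1 2 1 0]]
  (<= (p-adic-distance p a c)
      (max (p-adic-distance p a b)
           (p-adic-distance p b c))))
;; => true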
4. 🌍 Case Study: Clustering with Ternary (p=3) Morton Codes
Let's apply these ideas to cluster spatial data. Instead of using standard binary (p=2) Morton codes, we can use a ternary (p=3) system. This creates a different hierarchical grouping of the data.
The goal is to convert 3D coordinates into a 1D ternary Morton code, which preserves spatial locality. Then, we can use our p-adic distance to find clusters. 🗺️
;; A simplified function to interleave digits for a p-adic Morton code 🔢
(defn to-base-p [p precision n]
  (loop [num    n
         result ()]
    (if (or (zero? num) (= (count result) precision))
      (take precision (concat (repeat (- precision (count result)) 0) result))
      (recur (quot num p) (cons (rem num p) result)))))

(defn p-ary-morton-3d [x y z p precision]
  (let [x' (to-base-p p precision x)
        y' (to-base-p p precision y)
        z' (to-base-p p precision z)]
    (vec (interleave x' y' z'))))

;; Example: Use p=3 for clustering earthquake data 🌋
(let [p 3
      precision 4
      ;; Mock earthquake data (normalized coordinates)
      earthquakes [[10 12 5] [11 13 5] [25 26 20]]
      ;; Generate ternary Morton codes
      morton-codes (map #(p-ary-morton-3d (nth % 0) (nth % 1) (nth % 2) p precision) earthquakes)
      [morton-a morton-b morton-c] morton-codes]
  (println "Earthquake A Morton:" morton-a)
  (println "Earthquake B Morton:" morton-b)
  (println "Earthquake C Morton:" morton-c)
  ;; The first two earthquakes are spatially close 📍
  ;; Their Morton codes will share a longer prefix.
  (println "\nDistance A-B:" (p-adic-distance p morton-a morton-b))
  (println "Distance A-C:" (p-adic-distance p morton-a morton-c)))
;; By sorting data based on these Morton codes, we achieve
;; a spatially coherent ordering that can be used for efficient
;; clustering, neighbor searches, and indexing. 🚀
🎉 Conclusion
By viewing data sequences as prefix trees, we have built a practical foundation for understanding p-adic numbers and ultrametric spaces. This tutorial shows that we can go beyond binary systems and construct p-adic structures for any prime p, using them to decompose and analyze data in a hierarchical way.
This approach connects computational geometry with number theory, offering a powerful framework for spatial analysis in Clojure. 💎
fs 0.5.27 - File system utility library for Clojure
nrepl 1.4.0 - A Clojure network REPL that provides a server and client, along with some common APIs of use to IDEs and other tools that may need to evaluate Clojure code in remote environments.
eca 0.44.1 - Editor Code Assistant (ECA) - AI pair programming capabilities agnostic of editor
What does it mean, tripping around? Is it about round-tripping values between the REPL and the editor? Or about tripping over obstacles? In this post, I talk about both!
Round-tripping
In the context of REPL use, round-tripping means a particular property of printed data: the printed string representation of data, if evaluated, produces an equivalent data structure. For example, this map is round-trippable:
{:a 1 :b true "str" 0}
If you print it, you’ll get the same thing back. This is very useful because it speeds up development at the REPL: maps can be copied, saved, loaded, programmatically re-read, you get the point.
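You can verify the round trip at the REPL:

(= {:a 1 :b true "str" 0}
   (read-string (pr-str {:a 1 :b true "str" 0})))
;; => true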
Some things cannot be round-tripped. Take this function, for example:
;; normal Clojure REPL
user=> assoc
#object[clojure.core$assoc__5416 0x4a9486c0 "clojure.core$assoc__5416@4a9486c0"]
A function is not data; it cannot always be round-tripped exactly, so you get this. It makes sense to a degree, though I don’t like it. The utility of round-tripping during development far outweighs the drawback of some inaccuracy. For this reason, Reveal — Read Eval Visualize Loop for Clojure — always used this representation instead:
;; Reveal REPL output
assoc
=> clojure.core/assoc
Why did I use this representation? Two reasons:
It is round-trippable. I can copy a data structure with a function from the Reveal output pane into REPL, and it will evaluate to the same function without problem.
Due to syntax highlighting, it is visually distinct from the symbol clojure.core/assoc.
Did you notice #_0x4a9486c0 after the function name? This is a new addition to the Reveal function printer, available in Reveal 1.3.296. It fixes a problem I was tripping over from time to time.
Tripping over identity
Default Clojure representation of a function includes an important bit of information: the object’s identity hash code. Identity matters, and hiding it makes it harder to discover identity-related issues; for example:
When comparing objects for equality to determine if some computation has to be repeated, using a function as a part of a “cache entry” requires care. Yes, I have a custom partial implementation with equality semantics in production.
Using objects with unique identity as keys requires care.
One particular gotcha is regex: instances of java.util.regex.Pattern do NOT define value equality and hash code. This means using them as keys is dangerous. This is why Reveal also shows regexes with their identity:
Yes, this code is not even a duplicate key error:
{#"a|b" :a-or-b #"a|b" :a-or-b}
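A quick demonstration of the missing value equality:

(= #"a|b" #"a|b")
;; => false (each literal creates a distinct Pattern instance)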
You might ask, why use #_0xcafebabe to show identity? Well, that’s because it does not sacrifice round-tripping! #_ is a reader macro that ignores the next form, and 0xcafebabe is a valid, complete Clojure form. With this approach, you can both:
see identities of objects
copy them from the output pane to the editor, evaluate, and get (more or less) equivalent objects, whose identities, again, you can see.
More round-tripping with syntax highlighting
Syntax highlighting adds color — an extra dimension to printed data that allows for differentiating related things when the text is the same. Earlier, I showed how the symbol clojure.core/assoc and the function clojure.core/assoc use the same text, but different colors. But there is more! If we can use colors to differentiate symbols and functions, we can use them to differentiate objects and Clojure forms that produce such objects when evaluated. What kinds of objects? Refs! Futures! Files! Other stuff!
When dogfooding this feature, I found it important to use a separate color for parens, making them grey so they are not mistaken for collections (which also use parens — of yellow color). I think it’s very useful!
Tripping over namespaces (in Cursive)
In the final part of the post, I want to talk about using socket REPL in Cursive. I’ve been using it with Reveal for ages. One aspect in which a socket REPL is inferior to nREPL is automatic switching of a namespace to the current file. Cursive — a Clojure plugin for Intellij IDEA — only sends evaluated forms verbatim when using socket REPL. This means every time you switch between Clojure files in IDEA, you need to trigger a shortcut that will explicitly switch the ns so that sent forms will evaluate without errors. It’s annoying that I have to have this habit.
Had to have this habit. Turns out, since Cursive also sends file and line as form metadata, Reveal (or any other REPL implementation, really) can infer the right namespace for evaluation by inspecting the file content. The newest version of Reveal now supports this (under a flag, but enabled by default)! This means Reveal, when used as a socket REPL in IDEA, will now automatically evaluate forms in the right namespace — this greatly improves the experience!
This post walks through a small web development project using Clojure, covering everything from building the app to packaging and deploying it. It’s a collection of insights and tips I’ve learned from building my Clojure side projects but presented in a more structured format.
As the title suggests, we’ll be deploying the app to Fly.io. It’s a service that lets you deploy apps packaged as Docker images on lightweight virtual machines.[1] My experience with it has been good: it’s easy to use and quick to set up. One downside of Fly is that it doesn’t have a free tier, but if you don’t plan on leaving the app deployed, it barely costs anything.
This isn’t a tutorial on Clojure, so I’ll assume you already have some familiarity with the language as well as some of its libraries.[2]
In this post, we’ll be building a barebones bookmarks manager for the demo app. Users can log in using basic authentication, view all bookmarks, and create a new bookmark. It’ll be a traditional multi-page web app and the data will be stored in a SQLite database.
Here’s an overview of the project’s starting directory structure:
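It looks roughly like this (a sketch showing only the files we’ll touch in this post):

.
├── deps.edn
├── dev
│   └── user.clj
├── resources
│   └── config.edn
└── src
    └── acme
        ├── handler.clj
        ├── main.clj
        └── util.clj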
And the libraries we’re going to use. If you have some Clojure experience or have used Kit, you’re probably already familiar with all the libraries listed below.[3]
I use Aero and Integrant for my system configuration (more on this in the next section), Ring with the Jetty adaptor for the web server, Reitit for routing, next.jdbc for database interaction, and Hiccup for rendering HTML. From what I’ve seen, this is a popular “library combination” for building web apps in Clojure.[4]
The user namespace in dev/user.clj contains helper functions from Integrant-repl to start, stop, and restart the Integrant system.
dev/user.clj
(ns user
  (:require [acme.main :as main]
            [clojure.tools.namespace.repl :as repl]
            [integrant.core :as ig]
            [integrant.repl :refer [set-prep! go halt reset reset-all]]))

(set-prep! (fn [] (ig/expand (main/read-config)))) ;; we'll implement this soon

(repl/set-refresh-dirs "src" "resources")

(comment
  (go)
  (halt)
  (reset)
  (reset-all))
If you’re new to Integrant or other dependency injection libraries like Component, I’d suggest reading “How to Structure a Clojure Web”. It’s a great explanation of the reasoning behind these libraries. Like most Clojure apps that use Aero and Integrant, my system configuration lives in a .edn file. I usually name mine resources/config.edn. Here’s what it looks like:
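A sketch of such a config using Aero’s #env, #or, and #long reader tags (the exact keys and defaults here are assumptions based on what the app needs later):

resources/config.edn
{:server {:port #long #or [#env PORT 8080]}
 :database {:dbname #or [#env DB_DATABASE "database.db"]}
 :auth {:username #or [#env AUTH_USER "john.doe"]
        :password #or [#env AUTH_PASSWORD "password"]}}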
In production, most of these values will be set using environment variables. During local development, the app will use the hard-coded default values. We don’t have any sensitive values in our config (e.g., API keys), so it’s fine to commit this file to version control. If there are such values, I usually put them in another file that’s not tracked by version control and include them in the config file using Aero’s #include reader tag.
This config file is then “expanded” into the Integrant system map using the expand-key method:
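A sketch of what that can look like (the component keys match the ones used later in the post; Integrant 0.13+ provides the ig/expand-key multimethod):

(defmethod ig/expand-key :acme/config
  [_ config]
  {:database/sql {:dbname (get-in config [:database :dbname])}
   :handler/ring {:database (ig/ref :database/sql)
                  :auth (:auth config)}
   :server/http {:handler (ig/ref :handler/ring)
                 :port (get-in config [:server :port])}})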
The system map is created in code instead of being in the configuration file. This makes refactoring your system simpler as you only need to change this method while leaving the config file (mostly) untouched.[5]
My current approach to Integrant + Aero config files is mostly inspired by the blog post “Rethinking Config with Aero & Integrant” and Laravel’s configuration. The config file follows a similar structure to Laravel’s config files and contains the app configuration without describing the structure of the system. Previously I had a key for each Integrant component, which left the config file littered with #ig/ref tags and made it more difficult to refactor.
Also, if you haven’t already, start a REPL and connect to it from your editor. Run clj -M:dev if your editor doesn’t automatically start a REPL. Next, we’ll implement the init-key and halt-key! methods for each of the components:
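For example, the web server component might look roughly like this (a sketch; it assumes [ring.adapter.jetty :as jetty] in the ns form and the :server/http key from above):

(defmethod ig/init-key :server/http
  [_ {:keys [handler port]}]
  (println (str "Server started on port " port))
  ;; :join? false returns the Server object instead of blocking
  (jetty/run-jetty handler {:port port :join? false}))

(defmethod ig/halt-key! :server/http
  [_ server]
  (.stop server))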
The setup-db function creates the required tables in the database if they don’t exist yet. This works fine for database migrations in small projects like this demo app, but for larger projects, consider using libraries such as Migratus (my preferred library) or Ragtime.
src/acme/util.clj
(ns acme.util
  (:require [next.jdbc :as jdbc]))

(defn setup-db
  [db]
  (jdbc/execute-one!
   db
   ["create table if not exists bookmarks (
       bookmark_id text primary key not null,
       url text not null,
       created_at datetime default (unixepoch()) not null
     )"]))
For the server handler, let’s start with a simple function that returns a “hi world” string.
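A sketch of that as an Integrant component (the key name follows the config above):

(defmethod ig/init-key :handler/ring
  [_ _config]
  (fn [_request]
    {:status 200
     :headers {"content-type" "text/plain"}
     :body "hi world"}))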
Now all the components are implemented. We can check if the system is working properly by evaluating (reset) in the user namespace. This will reload your files and restart the system. You should see this message printed in your REPL:
:reloading (acme.util acme.handler acme.main)
Server started on port 8080
:resumed
If we send a request to http://localhost:8080/, we should get “hi world” as the response:
$ curl localhost:8080/
# hi world
Nice! The system is working correctly. In the next section, we’ll implement routing and our business logic handlers.
If you remember the :handler/ring from earlier, you’ll notice that it has two dependencies, database and auth. Currently, they’re inaccessible to our route handlers. To fix this, we can inject these components into the Ring request map using a middleware function.
The components-middleware function takes in a map of components and creates a middleware function that “assocs” each component into the request map.[6] If you have more components such as a Redis cache or a mail service, you can add them here.
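A minimal sketch of that middleware factory (the exact shape in my repo may differ):

(defn components-middleware
  [components]
  (fn [handler]
    (fn [request]
      ;; e.g. {:db ..., :auth ...} gets merged into every request map
      (handler (reduce-kv assoc request components)))))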
We’ll also need a middleware to handle HTTP basic authentication.[7] This middleware will check if the username and password from the request map match the values in the auth map injected by components-middleware. If they match, the request is authenticated and the user can view the site.
A nice feature of Clojure is that interop with the host language is easy. The base64-encode function is just a thin wrapper over Java’s Base64.Encoder:
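Something along these lines (java.util.Base64 is part of the JDK):

(defn base64-encode
  [^String s]
  (.encodeToString (java.util.Base64/getEncoder) (.getBytes s)))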
We now have everything we need to implement the route handlers or the business logic of the app. First, we’ll implement the index-page function which renders a page that:
Shows all of the user’s bookmarks in the database, and
Shows a form that allows the user to insert new bookmarks into the database
src/acme/handler.clj
(ns acme.handler
  (:require ;; ...
            [next.jdbc :as jdbc]
            [next.jdbc.sql :as sql]))

;; ...

(defn template
  [bookmarks]
  [:html
   [:head
    [:meta {:charset "utf-8"
            :name "viewport"
            :content "width=device-width, initial-scale=1.0"}]]
   [:body
    [:h1 "bookmarks"]
    [:form {:method "POST"}
     [:div
      [:label {:for "url"} "url "]
      [:input#url {:name "url"
                   :type "url"
                   :required true
                   :placeholder "https://en.wikipedia.org/"}]]
     [:button "submit"]]
    [:p "your bookmarks:"]
    [:ul
     (if (empty? bookmarks)
       [:li "you don't have any bookmarks"]
       (map (fn [{:keys [url]}]
              [:li [:a {:href url} url]])
            bookmarks))]]])

(defn index-page
  [req]
  (try
    (let [bookmarks (sql/query (:db req)
                               ["select * from bookmarks"]
                               jdbc/unqualified-snake-kebab-opts)]
      (util/render (template bookmarks)))
    (catch Exception e
      (util/server-error e))))

;; ...
Database queries can sometimes throw exceptions, so it’s good to wrap them in a try-catch block. I’ll also introduce some helper functions:
render takes a hiccup form and turns it into a ring response, while server-error takes an exception, logs it, and returns a 500 response.
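A sketch of the two helpers (assumes [hiccup2.core :as h] and [clojure.tools.logging :as log] in the acme.util ns form):

(defn render
  [hiccup-form]
  {:status 200
   :headers {"content-type" "text/html"}
   ;; hiccup2.core/html returns a RawString, so stringify it
   :body (str (h/html hiccup-form))})

(defn server-error
  [e]
  (log/error e)
  {:status 500
   :headers {"content-type" "text/plain"}
   :body "internal server error"})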
Next, we’ll implement the index-action function:
src/acme/handler.clj
;; ...

(defn index-action
  [req]
  (try
    (let [{:keys [db form-params]} req
          value (get form-params "url")]
      (sql/insert! db :bookmarks {:bookmark_id (random-uuid)
                                  :url value})
      (res/redirect "/" 303))
    (catch Exception e
      (util/server-error e))))

;; ...
This is an implementation of a typical post/redirect/get pattern. We get the value from the URL form field, insert a new row in the database with that value, and redirect back to the index page. Again, we’re using a try-catch block to handle possible exceptions from the database query.
That should be all of the code for the controllers. If you reload your REPL and go to http://localhost:8080, you should see something that looks like this after logging in:
The last thing we need to do is to update the main function to start the system:
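A minimal sketch, assuming read-config is the same function we wired into set-prep! earlier:

(defn -main
  [& _args]
  (-> (read-config)
      (ig/expand)
      (ig/init)))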
Now, you should be able to run the app using clj -M -m acme.main. That’s all the code needed for the app. In the next section, we’ll package the app into a Docker image to deploy to Fly.
While there are many ways to package a Clojure app, Fly.io specifically requires a Docker image. There are two approaches to doing this:
Build an uberjar and run it using Java in the container, or
Load the source code and run it using Clojure in the container
Both are valid approaches. I prefer the first since its only dependency is the JVM. We’ll use the tools.build library to build the uberjar. Check out the official guide for more information on building Clojure programs. Since it’s a library, to use it we can add it to our deps.edn file with an alias:
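A sketch of the alias (the version is an assumption; use the latest release):

;; deps.edn
{:aliases
 {:build {:deps {io.github.clojure/tools.build {:mvn/version "0.10.5"}}
          :ns-default build}}}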
Tools.build expects a build.clj file in the root of the project directory, so we’ll need to create that file. This file contains the instructions to build artefacts, which in our case is a single uberjar. There are many great examples of build.clj files on the web, including from the official documentation. For now, you can copy+paste this file into your project.
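If you don’t have one handy, a minimal build.clj along these lines should work (a sketch based on the official tools.build guide):

build.clj
(ns build
  (:require [clojure.tools.build.api :as b]))

(def class-dir "target/classes")
(def basis (delay (b/create-basis {:project "deps.edn"})))
(def uber-file "target/standalone.jar")

(defn clean [_]
  (b/delete {:path "target"}))

(defn uber [_]
  (clean nil)
  (b/copy-dir {:src-dirs ["src" "resources"]
               :target-dir class-dir})
  (b/compile-clj {:basis @basis
                  :ns-compile '[acme.main]
                  :class-dir class-dir})
  (b/uber {:class-dir class-dir
           :uber-file uber-file
           :basis @basis
           :main 'acme.main}))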
To build the project, run clj -T:build uber. This will create the uberjar standalone.jar in the target directory. The uber in clj -T:build uber refers to the uber function from build.clj. Since the build system is a Clojure program, you can customise it however you like. If we try to run the uberjar now, we’ll get an error:
# build the uberjar
$ clj -T:build uber
# Cleaning build directory...
# Copying files...
# Compiling Clojure...
# Building Uberjar...

# run the uberjar
$ java -jar target/standalone.jar
# Error: Could not find or load main class acme.main
# Caused by: java.lang.ClassNotFoundException: acme.main
This error occurred because the main class required by Java wasn’t built. To fix this, we need to add the :gen-class directive in our main namespace. This instructs Clojure to generate a Java class with a main method from the -main function.
src/acme/main.clj
(ns acme.main
  ;; ...
  (:gen-class))

;; ...
If you rebuild the project and run java -jar target/standalone.jar again, it should work perfectly. Now that we have a working build script, we can write the Dockerfile:
Dockerfile
# install additional dependencies here in the base layer
# separate base from build layer so any additional deps installed are cached
FROM clojure:temurin-21-tools-deps-bookworm-slim AS base

FROM base AS build
WORKDIR /opt
COPY . .
RUN clj -T:build uber

FROM eclipse-temurin:21-alpine AS prod
COPY --from=build /opt/target/standalone.jar /
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "standalone.jar"]
It’s a multi-stage Dockerfile. We use the official Clojure Docker image as the layer to build the uberjar. Once it’s built, we copy it to a smaller Docker image that only contains the Java runtime.[8] By doing this, we get a smaller container image as well as a faster Docker build time because the layers are better cached.
That should be all for packaging the app. We can move on to the deployment now.
First things first, you’ll need to install flyctl, Fly’s CLI tool for interacting with their platform. Create a Fly.io account if you haven’t already. Then run fly auth login to authenticate flyctl with your account.
$ fly app create
# ? Choose an app name (leave blank to generate one):
# automatically selected personal organization: Ryan Martin
# New app created: blue-water-6489
Another way to do this is with the fly launch command, which automates a lot of the app configuration for you. We have some steps to do that are not done by fly launch, so we’ll be configuring the app manually. I also already have a fly.toml file ready that you can copy straight into your project.
fly.toml
# replace these with your app and region name
# run `fly platform regions` to get a list of regions
app = 'blue-water-6489'
primary_region = 'sin'

[env]
DB_DATABASE = "/data/database.db"

[http_service]
internal_port = 8080
force_https = true
auto_stop_machines = "stop"
auto_start_machines = true
min_machines_running = 0

[mounts]
source = "data"
destination = "/data"
initial_size = 1

[[vm]]
size = "shared-cpu-1x"
memory = "512mb"
cpus = 1
cpu_kind = "shared"
These are mostly the default configuration values with some additions. Under the [env] section, we’re setting the SQLite database location to /data/database.db. The database.db file itself will be stored in a persistent Fly Volume mounted on the /data directory. This is specified under the [mounts] section. Fly Volumes are similar to regular Docker volumes but are designed for Fly’s micro VMs.
We’ll need to set the AUTH_USER and AUTH_PASSWORD environment variables too, but not through the fly.toml file as these are sensitive values. To securely set these credentials with Fly, we can set them as app secrets. They’re stored encrypted and will be automatically injected into the app at boot time.
$ fly secrets set AUTH_USER=hi@ryanmartin.me AUTH_PASSWORD=not-so-secure-password
# Secrets are staged for the first deployment
With this, the configuration is done and we can deploy the app using fly deploy:
$ fly deploy
# ...
# Checking DNS configuration for blue-water-6489.fly.dev
# Visit your newly deployed app at https://blue-water-6489.fly.dev/
The first deployment will take longer since it’s building the Docker image for the first time. Subsequent deployments should be faster due to the cached image layers. You can click on the link to view the deployed app, or you can also run fly open which will do the same thing. Here’s the app in action:
If you made additional changes to the app or fly.toml, you can redeploy the app using the same command, fly deploy. The app is configured to auto stop/start, which helps to cut costs when there’s not a lot of traffic to the site. If you want to take down the deployment, you’ll need to delete the app itself using fly app destroy <your app name>.
This is an interesting topic in the Clojure community, with varying opinions on whether or not it’s a good idea. Personally I find having a REPL connected to the live app helpful, and I often use it for debugging and running queries on the live database.[9] Since we’re using SQLite, we don’t have a database server we can directly connect to, unlike Postgres or MySQL.
If you’re brave, you can even restart the app directly from the REPL without redeploying. It’s easy to get this wrong, which is why some prefer not to use it.
For this project, we’re gonna add a socket REPL. It’s very simple to add (you just need to add a JVM option) and it doesn’t require additional dependencies like nREPL. Let’s update the Dockerfile:
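The only change is a JVM option in the ENTRYPOINT that tells Clojure to start a socket REPL (a sketch of the final line; the clojure.server.* system property is the standard mechanism):

# Start a socket REPL on port 7888 alongside the app.
ENTRYPOINT ["java", "-Dclojure.server.repl={:port 7888 :address \"0.0.0.0\" :accept clojure.core.server/repl}", "-jar", "standalone.jar"]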
The socket REPL will be listening on port 7888. If we redeploy the app now, the REPL will be started but we won’t be able to connect to it. That’s because we haven’t exposed the service through Fly proxy. We can do this by adding the socket REPL as a service in the [services] section in fly.toml.
However, doing this will also expose the REPL port to the public. This means that anyone can connect to your REPL and possibly mess with your app. Instead, what we want to do is to configure the socket REPL as a private service.
By default, all Fly apps in your organisation live in the same private network. This private network, called 6PN, connects the apps in your organisation through WireGuard tunnels (a VPN) using IPv6. Fly private services aren’t exposed to the public internet but can be reached from this private network. We can then use WireGuard to connect to this private network to reach our socket REPL.
Fly VMs are also configured with the hostname fly-local-6pn, which maps to its 6PN address. This is analogous to localhost, which points to your loopback address 127.0.0.1. To expose a service to 6PN, all we have to do is bind or serve it to fly-local-6pn instead of the usual 0.0.0.0. We have to update the socket REPL options to:
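That is, bind the server to fly-local-6pn via the :address option (a sketch of the updated ENTRYPOINT):

ENTRYPOINT ["java", "-Dclojure.server.repl={:port 7888 :address \"fly-local-6pn\" :accept clojure.core.server/repl}", "-jar", "standalone.jar"]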
After redeploying, we can use the fly proxy command to forward the port from the remote server to our local machine.[10]
$ fly proxy 7888:7888
# Proxying local port 7888 to remote [blue-water-6489.internal]:7888
In another shell, run:
$ rlwrap nc localhost 7888
# user=>
Now we have a REPL connected to the production app! rlwrap is used for readline functionality, e.g. up/down arrow keys, vi bindings. Of course you can also connect to it from your editor.
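If you want continuous deployment from GitHub, a minimal workflow along these lines works (the file name and action versions here are assumptions):

# .github/workflows/fly-deploy.yml
name: Fly Deploy
on:
  push:
    branches: [main]
  workflow_dispatch:
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: superfly/flyctl-actions/setup-flyctl@master
      - run: flyctl deploy --remote-only
        env:
          FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}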
To get this to work, you’ll need to create a deploy token from your app’s dashboard. Then, in your GitHub repo, create a new repository secret called FLY_API_TOKEN with the value of your deploy token. Now, whenever you push to the main branch, this workflow will automatically run and deploy your app. You can also manually run the workflow from GitHub because of the workflow_dispatch option.
As always, all the code is available on GitHub. Originally, this post was just about deploying to Fly.io, but along the way I kept adding on more stuff until it essentially became my version of the user manager example app. Anyway, hope this post provided a good view into web development with Clojure. As a bonus, here are some additional resources on deploying Clojure apps:
The way Fly.io works under the hood is pretty clever. Instead of running the container image with a runtime like Docker, the image is unpacked and “loaded” into a VM. See this video explanation for more details. ↩︎
Kit was a big influence on me when I first started learning web development in Clojure. I never used it directly, but I did use their library choices and project structure as a base for my own projects. ↩︎
There’s no “Rails” for the Clojure ecosystem (yet?). The prevailing opinion is to build your own “framework” by composing different libraries together. Most of these libraries are stable and are already used in production by big companies, so don’t let this discourage you from doing web development in Clojure! ↩︎
There might be some keys that you add or remove, but the structure of the config file stays the same. ↩︎
“assoc” (associate) is a Clojure slang that means to add or update a key-value pair in a map. ↩︎
For more details on how basic authentication works, check out the specification. ↩︎
Here’s a cool resource I found when researching Java Dockerfiles: WhichJDK. It provides a comprehensive comparison on the different JDKs available and recommendations on which one you should use. ↩︎
If you encounter errors related to WireGuard when running fly proxy, you can run fly doctor, which will hopefully detect issues with your local setup and suggest fixes for them. ↩︎
This post is about seven months late, but here are my takeaways from Advent of Code 2024. It was my second time participating, and this time I actually managed to complete it.[1] My goal was to learn a new language, Zig, and to improve my DSA and problem-solving skills.
If you’re not familiar, Advent of Code is an annual programming challenge that runs every December. A new puzzle is released each day from December 1st to the 25th. There’s also a global leaderboard where people (and AI) race to get the fastest solves, but I personally don’t compete in it, mostly because I want to do it at my own pace.
I went with Zig because I had been curious about it for a while, mainly because of its promise of being a better C and because TigerBeetle (one of the coolest databases around right now) is written in it. Learning Zig felt like a good way to get back into systems programming, something I’ve been wanting to do after a couple of chaotic years of web development.
This post is mostly about my setup, results, and the things I learned from solving the puzzles. If you’re more interested in my solutions, I’ve also uploaded my code and solution write-ups to my GitHub repository.
There were several Advent of Code templates in Zig that I looked at as a reference for my development setup, but none of them really clicked with me. I ended up just running my solutions directly using zig run for the whole event. It wasn’t until after the event ended that I properly learned Zig’s build system and reorganised my project.
The project is powered by build.zig, which defines several commands:
Build
zig build - Builds all of the binaries for all optimisation modes.
Run
zig build run - Runs all solutions sequentially.
zig build run -Day=XX - Runs the solution of the specified day only.
Benchmark
zig build bench - Runs all benchmarks sequentially.
zig build bench -Day=XX - Runs the benchmark of the specified day only.
Test
zig build test - Runs all tests sequentially.
zig build test -Day=XX - Runs the tests of the specified day only.
You can also pass the optimisation mode that you want to any of the commands above with the -Doptimize flag.
Under the hood, build.zig compiles src/run.zig when you call zig build run, and src/bench.zig when you call zig build bench. These files are templates that import the solution for a specific day from src/days/dayXX.zig. For example, here’s what src/run.zig looks like:
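Roughly like this (a simplified sketch; the exact result layout and printing in my repo differ):

const std = @import("std");
const day = @import("day"); // anonymous import injected by build.zig

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    // Each solution file exposes `run`; the [3]u64 result layout
    // (parse/part 1/part 2) is an assumption based on the signatures below.
    const results = try day.run(gpa.allocator(), true);
    std.debug.print("{any}\n", .{results});
}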
The day module imported is an anonymous import dynamically injected by build.zig during compilation. This allows a single run.zig or bench.zig to be reused for all solutions. This avoids repeating boilerplate code in the solution files. Here’s a simplified version of my build.zig file that shows how this works:
build.zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    const run_all = b.step("run", "Run all days");
    const day_option = b.option(usize, "ay", ""); // The `-Day` option

    // Generate build targets for all 25 days.
    for (1..26) |day| {
        const day_zig_file = b.path(b.fmt("src/days/day{d:0>2}.zig", .{day}));

        // Create an executable for running this specific day.
        const run_exe = b.addExecutable(.{
            .name = b.fmt("run-day{d:0>2}", .{day}),
            .root_source_file = b.path("src/run.zig"),
            .target = target,
            .optimize = optimize,
        });

        // Inject the day-specific solution file as the anonymous module `day`.
        run_exe.root_module.addAnonymousImport("day", .{ .root_source_file = day_zig_file });

        // Install the executable so it can be run.
        b.installArtifact(run_exe);

        // ...
    }
}
My actual build.zig has some extra code that builds the binaries for all optimisation modes.
This setup is pretty barebones. I’ve seen other templates do cool things like scaffold files, download puzzle inputs, and even submit answers automatically. Since I wrote my build.zig after the event ended, I didn’t get to use it while solving the puzzles. I might add these features if I decide to do Advent of Code again this year with Zig.
While there are no rules to Advent of Code itself, to make things a little more interesting, I set a few constraints and rules for myself:
The code must be readable.
By “readable”, I mean the code should be straightforward and easy to follow. No unnecessary abstractions. I should be able to come back to the code months later and still understand (most of) it.
Solutions must be a single file.
No external dependencies. No shared utilities module. Everything needed to solve the puzzle should be visible in that one solution file.
The total runtime must be under one second.[2]
All solutions, when run sequentially, should finish in under one second. I want to improve my performance engineering skills.
Parts should be solved separately.
This means: (1) no solving both parts simultaneously, and (2) no doing extra work in part one that makes part two faster. The aim of this is to get a clear idea of how long each part takes on its own.
No concurrency or parallelism.
Solutions must run sequentially on a single thread. This keeps the focus on the efficiency of the algorithm. I can’t speed up slow solutions by using multiple CPU cores.
No ChatGPT. No Claude. No AI help.
I want to train myself, not the LLM. I can look at other people’s solutions, but only after I have given my best effort at solving the problem.
Follow the constraints of the input file.
The solution doesn’t have to work for all possible scenarios, but it should work for all valid inputs. If the input file only contains 8-bit unsigned integers, the solution doesn’t have to handle larger integer types.
Hardcoding is allowed.
For example: size of the input, number of rows and columns, etc. Since the input is known at compile-time, we can skip runtime parsing and just embed it into the program using Zig’s @embedFile.
Most of these constraints are designed to push me to write clearer, more performant code. I also wanted my code to look like it was taken straight from TigerBeetle’s codebase (minus the assertions).[3] Lastly, I just thought it would make the experience more fun.
From all of the puzzles, here are my top 3 favourites:
Day 6: Guard Gallivant - This is my slowest day (in benchmarks), but also the one I learned the most from. Some of these learnings include: using vectors to represent directions, padding 2D grids, metadata packing, system endianness, etc.
Day 17: Chronospatial Computer - I love reverse engineering puzzles. I used to do a lot of these in CTFs during my university days. The best thing I learned from this day is the realisation that we can use different integer bases to optimise data representation. This helped improve my runtimes in the later days 22 and 23.
Day 21: Keypad Conundrum - This one was fun. My gut told me it could be solved greedily by always choosing the best move. It was right. Though I did have to scroll Reddit for a bit to figure out the step I was missing: you have to visit the farthest keypads first. This is also my longest solution file (almost 400 lines) because I hardcoded the best-moves table.
Honourable mention:
Day 24: Crossed Wires - Another reverse engineering puzzle. Confession: I didn’t solve this myself during the event. After 23 brutal days, my brain was too tired, so I copied a random Python solution from Reddit. When I retried it later, it turned out to be pretty fun. I still couldn’t find a solution I was satisfied with though.
During the event, I learned a lot about Zig and performance, and also developed some personal coding conventions. Some of these are Zig-specific, but most are universal and can be applied across languages. This section covers general programming and Zig patterns I found useful. The next section will focus on performance-related tips.
Zig’s flagship feature, comptime, is surprisingly useful. I knew Zig uses it for generics and that people do clever metaprogramming with it, but I didn’t expect to be using it so often myself.
My main use for comptime was to generate puzzle-specific types. All my solution files follow the same structure, with a DayXX function that takes some parameters (usually the input length) and returns a puzzle-specific type, e.g.:
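A condensed sketch of the shape (field and function names here are placeholders):

const std = @import("std");

// `n` is a comptime parameter that sizes the arrays, so each input
// gets its own fixed-size, statically-sized storage.
fn Day01(comptime n: usize) type {
    return struct {
        const Self = @This();

        values: [n]u32 = undefined,

        fn init(input: []const u8) !Self {
            var self = Self{};
            var it = std.mem.tokenizeScalar(u8, input, '\n');
            var i: usize = 0;
            while (it.next()) |line| : (i += 1) {
                self.values[i] = try std.fmt.parseInt(u32, line, 10);
            }
            return self;
        }
    };
}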
This lets me instantiate the type with a size that matches my input:
src/days/day01.zig
// Here, `Day01` is called with the size of my actual input.
pub fn run(_: std.mem.Allocator, is_run: bool) ![3]u64 {
    // ...
    const input = @embedFile("./data/day01.txt");
    var puzzle = try Day01(1000).init(input);
    // ...
}

// Here, `Day01` is called with the size of my test input.
test "day 01 part 1 sample 1" {
    var puzzle = try Day01(6).init(sample_input);
    // ...
}
This allows me to reuse logic across different inputs while still hardcoding the array sizes. Without comptime, I’d have to either write a separate function for each of my different inputs or dynamically allocate memory, since I couldn’t hardcode the array size.
I also used comptime to shift some computation to compile-time to reduce runtime overhead. For example, on day 4, I needed a function to check whether a string matches either "XMAS" or its reverse, "SAMX". A pretty simple function that you can write as a one-liner in Python:
example.py
def matches(pattern, target):
    return target == pattern or target == pattern[::-1]
Typically a function like this requires some dynamic allocation to create the reversed string, since the length of the string is only known at runtime.[4] For this puzzle, since the words to reverse are known at compile-time, we can do something like this:
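A sketch of the idea (not the exact code from my repo): the reversed pattern is computed once at compile time, so no runtime allocation is needed.

const std = @import("std");

// `pattern` must be comptime-known; Zig instantiates a separate
// function (with its own baked-in reversed copy) per pattern.
fn matches(comptime pattern: []const u8, target: []const u8) bool {
    const reversed = comptime blk: {
        var buf: [pattern.len]u8 = undefined;
        for (pattern, 0..) |c, i| buf[pattern.len - 1 - i] = c;
        break :blk buf;
    };
    return std.mem.eql(u8, target, pattern) or std.mem.eql(u8, target, &reversed);
}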
This creates a separate function for each word I want to reverse.[5] Each function has an array with the same size as the word to reverse. This removes the need for dynamic allocation and makes the code run faster. As a bonus, Zig also warns you when this word isn’t compile-time known, so you get an immediate error if you pass in a runtime value.
A common pattern in C is to return special sentinel values to denote missing values or errors, e.g. -1, 0, or NULL. In fact, I did this on day 13 of the challenge:
src/days/day13.zig
// We won't ever get 0 as a result, so we use it as a sentinel error value.
fn count_tokens(a: [2]u8, b: [2]u8, p: [2]i64) u64 {
    const numerator = @abs(p[0] * b[1] - p[1] * b[0]);
    const denominator = @abs(@as(i32, a[0]) * b[1] - @as(i32, a[1]) * b[0]);
    return if (numerator % denominator != 0) 0 else numerator / denominator;
}

// Then in the caller, skip if the return value is 0.
if (count_tokens(a, b, p) == 0) continue;
This works, but it’s easy to forget to check for those values, or worse, to accidentally treat them as valid results. Zig improves on this with optional types. If a function might not return a value, you can return ?T instead of T. This also forces the caller to handle the null case. Unlike C, null isn’t a pointer but a more general concept. Zig treats null as the absence of a value for any type, just like Rust’s Option<T>.
The count_tokens function can be refactored to:
src/days/day13.zig
// Return null instead if there's no valid result.
fn count_tokens(a: [2]u8, b: [2]u8, p: [2]i64) ?u64 {
    const numerator = @abs(p[0] * b[1] - p[1] * b[0]);
    const denominator = @abs(@as(i32, a[0]) * b[1] - @as(i32, a[1]) * b[0]);
    return if (numerator % denominator != 0) null else numerator / denominator;
}

// The caller is now forced to handle the null case.
if (count_tokens(a, b, p)) |n_tokens| {
    // Logic here only runs when n_tokens is not null.
}
Zig also has a concept of error unions, where a function can return either a value or an error; in Rust, this is Result<T, E>. You could also use error unions instead of optionals for count_tokens; Zig doesn’t force a single approach. Coming from Clojure, where returning nil for an error or missing value is common, optionals felt like the natural choice to me.
This year had a lot of 2D grid puzzles (arguably too many). A common feature of grid-based algorithms is the out-of-bounds check. Here’s what it usually looks like:
example.zig
fn dfs(map: [][]u8, position: [2]i8) u32 {
    const x, const y = position;

    // Bounds check here.
    if (x < 0 or y < 0 or x >= map.len or y >= map[0].len) return 0;
    if (map[x][y] == .visited) return 0;

    map[x][y] = .visited;
    var result: u32 = 1;
    for (directions) |direction| {
        result += dfs(map, position + direction);
    }
    return result;
}
This is a typical recursive DFS function. After doing a lot of this, I discovered a nice trick that not only improves code readability, but also its performance. The trick here is to pad the grid with sentinel characters that mark out-of-bounds areas, i.e. add a border to the grid.
You can use any value for the border, as long as it doesn’t conflict with valid values in the grid. With the border in place, the bounds check becomes a simple equality comparison:
example.zig
const border = '*';

fn dfs(map: [][]u8, position: [2]i8) u32 {
    const x, const y = position;
    if (map[x][y] == border) {
        // We are out of bounds.
        return 0;
    }
    // ...
}
This is much more readable than the previous code. Plus, it’s also faster since we’re only doing one equality check instead of four range checks.
That said, this isn’t a one-size-fits-all solution. This only works for algorithms that traverse the grid one step at a time. If your logic jumps multiple tiles, it can still go out of bounds (except if you increase the width of the border to account for this). This approach also uses a bit more memory than the regular approach as you have to store more characters.
This could also go in the performance section, but I’m including it here because the biggest benefit I get from using SIMD in Zig is the improved code readability. Because Zig has first-class support for vector types, you can write elegant and readable code that also happens to be faster.
If you’re not familiar with vectors, they are a special collection type used for Single instruction, multiple data (SIMD) operations. SIMD allows you to perform computation on multiple values in parallel using only a single CPU instruction, which often leads to some performance boosts.[6]
I mostly use vectors to represent positions and directions, e.g. for traversing a grid. Instead of writing code like this:
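That is, updating each coordinate by hand, something like:

next_position[0] = position[0] + direction[0];
next_position[1] = position[1] + direction[1];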
You can represent position and direction as 2-element vectors and write code like this:
example.zig
next_position = position + direction;
This is much nicer than the previous version!
Day 25 is another good example of a problem that can be solved elegantly using vectors:
src/days/day25.zig
var result: u64 = 0;
for (self.locks.items) |lock| { // lock is a vector
    for (self.keys.items) |key| { // key is also a vector
        const fitted = lock + key > @as(@Vector(5, u8), @splat(5));
        const is_overlap = @reduce(.Or, fitted);
        result += @intFromBool(!is_overlap);
    }
}
Expressing the logic as vector operations makes the code cleaner since you don’t have to write loops and conditionals as you typically would in a traditional approach.
The tips below are general performance techniques that often help, but like most things in software engineering, “it depends”. These might work 80% of the time, but performance is often highly context-specific. You should benchmark your code instead of blindly following what other people say.
This section would’ve been more fun with concrete examples, step-by-step optimisations, and benchmarks, but that would’ve made the post way too long. Hopefully I’ll get to write something like that in the future.[7]
Whenever possible, prefer static allocation. Static allocation is cheaper since it just involves moving the stack pointer vs dynamic allocation which has more overhead from the allocator machinery. That said, it’s not always the right choice since it has some limitations, e.g. stack size is limited, memory size must be compile-time known, its lifetime is tied to the current stack frame, etc.
If you need to do dynamic allocations, try to reduce the number of times you call the allocator. The number of allocations you do matters more than the amount of memory you allocate. More allocations mean more bookkeeping, synchronisation, and sometimes syscalls.
A simple but effective way to reduce allocations is to reuse buffers, whether they’re statically or dynamically allocated. Here’s an example from day 10. For each trail head, we want to create a set of trail ends reachable from it. The naive approach is to allocate a new set every iteration:
src/days/day10.zig
for (self.trail_heads.items) |trail_head| {
    var trail_ends = std.AutoHashMap([2]u8, void).init(self.allocator);
    defer trail_ends.deinit();
    // Set building logic...
}
What you can do instead is to allocate the set once before the loop. Then, each iteration, you reuse the set by emptying it without freeing the memory. For Zig’s std.AutoHashMap, this can be done using the clearRetainingCapacity method:
src/days/day10.zig
var trail_ends = std.AutoHashMap([2]u8, void).init(self.allocator);
defer trail_ends.deinit();

for (self.trail_heads.items) |trail_head| {
    trail_ends.clearRetainingCapacity();
    // Set building logic...
}
If you use static arrays, you can also just overwrite existing data instead of clearing it.
A step up from this is to reuse multiple buffers. The simplest form of this is to reuse two buffers, i.e. double buffering. Here’s an example from day 11:
src/days/day11.zig
// Initialize two hash maps that we'll alternate between.
var frequencies: [2]std.AutoHashMap(u64, u64) = undefined;
for (0..2) |i| frequencies[i] = std.AutoHashMap(u64, u64).init(self.allocator);
defer for (0..2) |i| frequencies[i].deinit();

var id: usize = 0;
for (self.stones) |stone| try frequencies[id].put(stone, 1);

for (0..n_blinks) |_| {
    var old_frequencies = &frequencies[id % 2];
    var new_frequencies = &frequencies[(id + 1) % 2];
    id += 1;
    defer old_frequencies.clearRetainingCapacity();
    // Do stuff with both maps...
}
Here we have two maps to count the frequencies of stones across iterations. Each iteration will build up new_frequencies with the values from old_frequencies. Doing this reduces the number of allocations to just 2 (the number of buffers). The tradeoff here is that it makes the code slightly more complex.
A common performance tip is to have “mechanical sympathy”: understand how your code is processed by your computer. An example of this is structuring your data so it works better with your CPU, e.g. keeping related data close in memory to take advantage of cache locality.
Reducing the size of your data helps with this. Smaller data means more of it can fit in cache. One way to shrink your data is through bit packing. This depends heavily on your specific data, so you’ll need to use your judgement to tell whether this would work for you. I’ll just share some examples that worked for me.
The first example is in day 6 part two, where you have to detect a loop, which happens when you revisit a tile from the same direction as before. To track this, you could use a map or a set to store the tiles and visited directions. A more efficient option is to store this direction metadata in the tile itself.
There are only four tile types, which means you only need two bits to represent the tile types as an enum. If the enum size is one byte, here’s what the tiles look like in memory:
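Something like this (the enum value names are placeholders):

const Tile = enum(u8) { floor = 0b00, obstacle = 0b01, guard = 0b10, visited = 0b11 };
// One byte per tile, but only the low two bits vary: 0b000000_TT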
As you can see, the upper six bits are unused. We can store the direction metadata in the upper four bits. One bit for each direction. If a bit is set, it means that we’ve already visited the tile in this direction. Here’s an illustration of the memory layout:
       direction metadata       tile type
         ┌──────┴──────┐         ┌──┴──┐
┌───────┬───┬───┬───┬───┬───┬───┬───┬───┐
│ Tile: │ 1 │ 0 │ 0 │ 0 │ 0 │ 0 │ 1 │ 0 │
└───────┴─┬─┴─┬─┴─┬─┴─┬─┴───┴───┴───┴───┘
  up bit ─┘   │   │   └─ left bit
right bit ────┘   └───── down bit
If your language supports struct packing, you can express this layout directly:[8]
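A sketch in Zig (the field order within the byte is an assumption; in a packed struct, the first field occupies the lowest bits):

const Tile = packed struct(u8) {
    kind: u2,           // tile type in the low two bits
    _unused: u2 = 0,
    left: bool = false, // one visited-direction flag per bit
    down: bool = false,
    right: bool = false,
    up: bool = false,   // ends up in the highest bit
};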
Doing this avoids extra allocations and improves cache locality. Since the directions metadata is colocated with the tile type, all of them can fit together in cache. Accessing the directions just requires some bitwise operations instead of having to fetch them from another region of memory.
Another way to do this is to represent your data using alternate number bases. Here’s an example from day 23. Computers are represented as two character strings made up of only lowercase letters, e.g. "bc", "xy", etc. Instead of storing this as a [2]u8 array, you can convert it into a base-26 number and store it as a u16.[9]
Here’s the idea: map 'a' to 0, 'b' to 1, and so on up to 'z' as 25. Each character in the string becomes a digit in the base-26 number. For example, "bc" ([2]u8{ 'b', 'c' }) becomes the base-10 number 28 (1 × 26 + 2 = 28). Written with base-26 digits, it is simply 12 ('b' = 1, 'c' = 2). A minimal encoder sketch follows the list below.
While they take the same amount of space (2 bytes), a u16 has some benefits over a [2]u8:
It fits in a single register, whereas you need two for the array.
Comparison is faster as there is only a single value to compare.
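As referenced above, a minimal encoder sketch (the function name is a placeholder):

fn encode(name: [2]u8) u16 {
    // 'a' => 0 ... 'z' => 25; two base-26 digits packed into one integer.
    return @as(u16, name[0] - 'a') * 26 + (name[1] - 'a');
}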
I won’t explain branchless programming here; Algorithmica explains it far better than I can. While modern compilers are often smart enough to compile away branches, they don’t catch everything. I still recommend writing branchless code whenever it makes sense. It also has the added benefit of reducing the number of codepaths in your program.
Again, since performance is very context-dependent, I’ll just show you some patterns I use. Here’s one that comes up often:
src/days/day02.zig
if (is_valid_report(report)) {
    result += 1;
}
Instead of the branch, cast the bool into an integer directly:
src/days/day02.zig
result += @intFromBool(is_valid_report(report));
Another example is from day 6 (again!). Recall that to know if a tile has been visited from a certain direction, we have to check its direction bit. Here’s one way to do it:
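A sketch of both the check and a branchless check-and-set (the bit layout follows the packed byte above; names are placeholders):

// Branch-free: test the direction bit, then set it unconditionally.
fn visit(tile: *u8, direction_bit: u8) bool {
    const seen = (tile.* & direction_bit) != 0; // no branch needed here
    tile.* |= direction_bit;                    // set the bit either way
    return seen; // callers can accumulate with @intFromBool(seen)
}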
The final performance tip is to prefer iterative code over recursion. Recursive functions bring the overhead of allocating stack frames. While recursive code is more elegant, it’s also often slower unless your language’s compiler can optimise it away, e.g. via tail-call optimisation. As far as I know, Zig doesn’t have this, though I might be wrong.
Recursion also has the risk of causing a stack overflow if the execution isn’t bounded. This is why code that is mission- or safety-critical avoids recursion entirely. It’s in TigerBeetle’s TIGERSTYLE and also NASA’s Power of Ten.
Iterative code can be harder to write in some cases, e.g. DFS maps naturally to recursion, but most of the time it is significantly faster, more predictable, and safer than the recursive alternative.
I ran benchmarks for all 25 solutions in each of Zig’s optimisation modes. You can find the full results and the benchmark script in my GitHub repository. All benchmarks were done on an Apple M3 Pro.
As expected, ReleaseFast produced the best result with a total runtime of 85.1 ms. I’m quite happy with this, considering the two constraints that limited the number of optimisations I can do to the code:
Parts should be solved separately - Some days can be solved in a single go, e.g. day 10 and day 13, which could’ve saved a few milliseconds.
No concurrency or parallelism - My slowest days are the compute-heavy days that are very easily parallelisable, e.g. day 6, day 19, and day 22. Without this constraint, I can probably reach sub-20 milliseconds total(?), but that’s for another time.
You can see the full benchmarks for ReleaseFast in the table below:
| Day | Title | Parsing (µs) | Part 1 (µs) | Part 2 (µs) | Total (µs) |
|----:|:------|-------------:|------------:|------------:|-----------:|
| 1 | Historian Hysteria | 23.5 | 15.5 | 2.8 | 41.8 |
| 2 | Red-Nosed Reports | 42.9 | 0.0 | 11.5 | 54.4 |
| 3 | Mull it Over | 0.0 | 7.2 | 16.0 | 23.2 |
| 4 | Ceres Search | 5.9 | 0.0 | 0.0 | 5.9 |
| 5 | Print Queue | 22.3 | 0.0 | 4.6 | 26.9 |
| 6 | Guard Gallivant | 14.0 | 25.2 | 24,331.5 | 24,370.7 |
| 7 | Bridge Repair | 72.6 | 321.4 | 9,620.7 | 10,014.7 |
| 8 | Resonant Collinearity | 2.7 | 3.3 | 13.4 | 19.4 |
| 9 | Disk Fragmenter | 0.8 | 12.9 | 137.9 | 151.7 |
| 10 | Hoof It | 2.2 | 29.9 | 27.8 | 59.9 |
| 11 | Plutonian Pebbles | 0.1 | 43.8 | 2,115.2 | 2,159.1 |
| 12 | Garden Groups | 6.8 | 164.4 | 249.0 | 420.3 |
| 13 | Claw Contraption | 14.7 | 0.0 | 0.0 | 14.7 |
| 14 | Restroom Redoubt | 13.7 | 0.0 | 0.0 | 13.7 |
| 15 | Warehouse Woes | 14.6 | 228.5 | 458.3 | 701.5 |
| 16 | Reindeer Maze | 12.6 | 2,480.8 | 9,010.7 | 11,504.1 |
| 17 | Chronospatial Computer | 0.1 | 0.2 | 44.5 | 44.8 |
| 18 | RAM Run | 35.6 | 15.8 | 33.8 | 85.2 |
| 19 | Linen Layout | 10.7 | 11,890.8 | 11,908.7 | 23,810.2 |
| 20 | Race Condition | 48.7 | 54.5 | 54.2 | 157.4 |
| 21 | Keypad Conundrum | 0.0 | 1.7 | 22.4 | 24.2 |
| 22 | Monkey Market | 20.7 | 0.0 | 11,227.7 | 11,248.4 |
| 23 | LAN Party | 13.6 | 22.0 | 2.5 | 38.2 |
| 24 | Crossed Wires | 5.0 | 41.3 | 14.3 | 60.7 |
| 25 | Code Chronicle | 24.9 | 0.0 | 0.0 | 24.9 |
A weird thing I found when benchmarking is that for day 6 part two, ReleaseSafe actually ran faster than ReleaseFast (13,189.0 µs vs 24,370.7 µs). Their outputs are the same, but for some reason ReleaseSafe is faster even with the safety checks still intact.
The Zig compiler is still very much a moving target, so I don’t want to dig too deep into this, as I’m guessing this might be a bug in the compiler. This weird behaviour might just disappear after a few compiler version updates.
Looking back, I’m really glad I decided to do Advent of Code and followed through to the end. I learned a lot of things. Some are useful in my professional work, some are more like random bits of trivia. Going with Zig was a good choice too. The language is small, simple, and gets out of your way. I learned more about algorithms and concepts than the language itself.
Besides what I’ve already mentioned earlier, here are some examples of the things I learned:
Some of my self-imposed constraints and rules ended up being helpful. I can still (mostly) understand the code I wrote a few months ago. Putting all of the code in a single file made it easier to read since I don’t have to context switch to other files all the time.
However, some of them did backfire a bit, e.g. the two constraints that limit how I can optimise my code. Another one is the “hardcoding allowed” rule. I used a lot of magic numbers, which helped performance, but I didn’t document them, so after a while I couldn’t even remember how I got them. I’ve since gone back and added explanations in my write-ups, but next time I’ll remember to at least leave comments.
One constraint I’ll probably remove next time is the no concurrency rule. It’s the biggest contributor to the total runtime of my solutions. I don’t do a lot of concurrent programming, even though my main language at work is Go, so next time it might be a good idea to use Advent of Code to level up my concurrency skills.
I also spent way more time on these puzzles than I originally expected. I optimised and rewrote my code multiple times, and I also rewrote my write-ups a few times to make them easier to read. This is by far my longest side project yet. It was a lot of fun, but it also took a lot of time and effort. I almost gave up on the write-ups (and this blog post) because I didn’t want to explain my awful day 15 and day 16 code. I ended up taking a break for a few months before finishing it, which is why this post is being published in August lol.
Just for fun, here’s a photo of some of my notebook sketches that helped me visualise my solutions. See if you can guess which days these are from:
So… would I do it again? Probably, though I’m not making any promises. If I do join this year, I’ll probably stick with Zig. I had my eyes on Zig since the start of 2024, so Advent of Code was the perfect excuse to learn it. This year, there aren’t any languages in particular that caught my eye, so I’ll just keep using Zig, especially since I have a proper setup ready.
If you haven’t tried Advent of Code, I highly recommend checking it out this year. It’s a great excuse to learn a new language, improve your problem-solving skills, or just to learn something new. If you’re eager, you can also do the previous years’ puzzles as they’re still available.
One of the best aspects of Advent of Code is the community. The Advent of Code subreddit is a great place for discussion. You can ask questions and also see other people’s solutions. Some people also post really cool visualisations like this one. They also have memes!
I failed my first attempt horribly with Clojure during Advent of Code 2023. Once I reached the later half of the event, I just couldn’t solve the problems with a purely functional style. I could’ve pushed through using imperative code, but I stubbornly chose not to and gave up… ↩︎
The original constraint was that each solution must run in under one second. As it turned out, the code was faster than I expected, so I increased the difficulty. ↩︎
You can implement this function without any allocation by mutating the string in place or by iterating over it twice, which is probably faster than my current implementation. I kept it as-is as a reminder of what comptime can do. ↩︎
As a bonus, I was curious as to what this looks like compiled so I listed all the functions in this binary in GDB and found:
Well, not always. The number of SIMD instructions depends on the machine’s native SIMD size. If the length of the vector exceeds it, Zig will compile it into multiple SIMD instructions. ↩︎
One thing about packed structs is that their layout is dependent on the system endianness. Most modern systems are little-endian, so the memory layout I showed is actually reversed. Thankfully, Zig has some useful functions to convert between endianness like std.mem.nativeToBig, which makes working with packed structs easier. ↩︎
Technically, you can store 2-digit base-26 numbers in a u10, as there are only 26² = 676 possible values (and 2¹⁰ = 1024). Most systems pad values to byte size, so a u10 would still be stored as a u16, which is why I just went straight for it. ↩︎