In my previous post I outlined my REPL based workflow for CLJ as well as CLJS. In this post I want to dive deeper into how you can extend/customize it to make it your own.
The gist of it all is having a dedicated src/dev/repl.clj file. This file serves as the entrypoint to start your development setup, so this is where everything related should go. It captures the essential start/stop cycle to quickly get your workflow going and restarting it if necessary.
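To make this concrete, here is a minimal sketch of what such a file can look like. The build id :main and the exact helper functions are placeholders for illustration, not part of any required convention; adapt them to your project:

```clojure
(ns repl
  (:require
    [shadow.cljs.devtools.api :as shadow]
    [shadow.cljs.devtools.server :as server]))

(defn start []
  ;; start the embedded shadow-cljs server
  (server/start!)
  ;; start watching a CLJS build; :main is a placeholder build id
  (shadow/watch :main)
  ::started)

(defn stop []
  ;; shut everything down, so start can be called again cleanly
  (server/stop!)
  ::stopped)

(defn go []
  (stop)
  (start))
```

From the REPL you then call `(repl/go)` whenever you want to restart the whole setup.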
Quick note: There is no rule that this needs to be in this file, or even that everyone working in a team needs to use the same file. Each team member could have their own file. This is all just Clojure code and there are basically no limits.
There are a couple of common questions that come up in shadow-cljs discussions every so often. I’ll give some examples of stuff I have done or recommendations I have made. I used to recommend npm-run-all, but have pretty much eliminated all my own uses of it in favor of this setup.
Example 1: Building CSS
Probably the most common question is how to process CSS in a CLJ/CLJS environment. Especially people from the JS side of the world often come with the expectation that the build tool (i.e. shadow-cljs) would take care of it. shadow-cljs does not support processing CSS, and likely never will. IMHO it doesn’t need to, you can just as well build something yourself.
I’ll use the code that builds the CSS for the shadow-cljs UI itself as a reference. You can find it in its src/dev/repl.clj.
So, when I start my workflow, it calls the build/css-release function. Which is just a defn in another namespace. This happens to be using shadow-css, but for the purposes of this post this isn’t relevant. It could be anything you can do in Clojure.
shadow-css does not have a built-in watcher, so in the next bit I’m using the fs-watch namespace from shadow-cljs. Basically it watches the src/main folder for changes in cljs,cljc,clj files, given that shadow-css generates CSS from Clojure (css ...) forms. When fs-watch detects changes it calls the provided function. In this case I don’t care about which file was updated and just rebuild the whole CSS. This takes about 50-100ms, so optimizing this further wasn’t necessary, although I did in the past and you absolutely can.
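In code, that section looks roughly like the following. This is a sketch based on the shadow-cljs UI’s own src/dev/repl.clj; build/css-release stands in for whatever function rebuilds your CSS:

```clojure
;; sketch of the watch setup; exact fs-watch options may differ,
;; check shadow-cljs' own src/dev/repl.clj for the real thing
(require '[clojure.java.io :as io]
         '[shadow.cljs.devtools.server.fs-watch :as fs-watch])

(defonce css-watch-ref (atom nil))

;; inside start: build the CSS once, then watch for changes
(build/css-release)

(reset! css-watch-ref
  (fs-watch/start
    {} ;; default options
    [(io/file "src" "main")]
    ["cljs" "cljc" "clj"]
    (fn [_updates]
      ;; which files changed doesn't matter here; rebuild everything
      (build/css-release))))
```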
fs-watch/start returns the watcher instance, which I’m storing in the css-watch-ref atom. The stop function will properly shut this down, to avoid ending up with this watch running multiple times.
When it is time to make an actual release I will just call the build/css-release as part of that process. That is the reason this is a dedicated function in a different namespace, I don’t want to be reaching into the repl ns for release related things, although technically there is nothing wrong with doing that. Just personal preference I guess.
Example 2: Running other Tasks
Well, this is just a repeat of the above. The pattern is always the same. Maybe you are trying to run tailwindcss instead? This has its own watch mode, so you can skip the fs-watch entirely. Clojure and the JVM have many different ways of running external processes. java.lang.ProcessBuilder is always available and quite easy to use from CLJ. The latest Clojure 1.12 release added a new clojure.java.process namespace, which might suit your needs too, and is also just a wrapper around ProcessBuilder.
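A minimal sketch of such a build/css-watch function, assuming the tailwindcss CLI is available via npx; the input/output paths are placeholders:

```clojure
(ns build
  (:import [java.lang ProcessBuilder ProcessBuilder$Redirect]))

(defn css-watch
  "Starts tailwindcss in watch mode, returns the java.lang.Process handle."
  ^Process []
  (-> (ProcessBuilder.
        ["npx" "tailwindcss"
         "-i" "./src/css/input.css"    ;; placeholder input path
         "-o" "./public/css/main.css"  ;; placeholder output path
         "--watch"])
      ;; forward the process output to the current JVM's stderr/stdout
      (.redirectError ProcessBuilder$Redirect/INHERIT)
      (.redirectOutput ProcessBuilder$Redirect/INHERIT)
      (.start)))
```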
So, this starts the tailwindcss command (example taken directly from tailwind docs). Those .redirectError/.redirectOutput calls make sure the output of the process isn’t lost and instead is written to the stderr/stdout of the current JVM process. We do not want to wait for this process to exit, since it just keeps running and watching the files. Integrating this into our start function then becomes
(reset! css-watch-ref
(build/css-watch))
The css-watch function returns a java.lang.Process handle, which we can later use to kill the process in our stop function.
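The corresponding cleanup could look like this, assuming the css-watch-ref atom holds the Process returned above:

```clojure
;; in stop: kill the external watch process so it doesn't leak
(defn stop []
  (when-some [^Process proc @css-watch-ref]
    (.destroy proc)   ;; request termination
    (.waitFor proc)   ;; wait for the actual exit
    (reset! css-watch-ref nil))
  ::stopped)
```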
You could then build your own build/css-release function, that uses the same mechanism but skips the watch. Or just run it directly from your shell instead.
Things to be aware of
It is quite possible to break your entire JVM, or to leak a lot of resources all over the place. Make sure you always properly clean up after yourself and do not just ignore this. Shutting down the JVM entirely will usually clean up, but I always consider this the “nuclear” option. I rely on stop to clean up everything; since actually starting the JVM is quite expensive, I want to avoid doing it. I often have my dev process running for weeks, and pretty much the only reason to ever restart it is when I need to change my dependencies.
Also make sure you actually see the produced output. Take the above tailwind process, for example: I want to see its output in case tailwind is showing me an error. Depending on how you start your setup, this may not be visible by default. If running over nREPL, for example, it won’t show up in the REPL. I actually prefer that, so I’ll always have this output visible in a dedicated terminal window on my side monitor.
Those are the two main reasons I personally do not like the “jack-in” process of most editors. My JVM process lives longer than my editor, and, at least in the past, some of the output wasn’t always visible by default. Could be that is a non-issue these days, just make sure to check.
This year I had the privilege of attending Clojure/Conj in Alexandria, VA. Alexandria is a beautiful city, with a museum seemingly on every corner, and abundant trees just starting to turn yellow for the fall. It has an efficient metro system and a safe, welcoming atmosphere.
The conference was held in the unique George Washington Masonic Memorial. The venue has an impressive theater with quirky side rooms and hallways. The walls and alcoves were adorned with historical artifacts. The unusual, segmented layout lent itself to longer breakout discussions, with perhaps a little less broad social mixing, as people were spread out in clusters. The conference started with a buzz of energy and settled into something of a serene atmosphere as attendees eased into the proceedings, happy to be there, meet up, and enjoy the experience.
Some themes were: visualization, tooling, collaboration, and quick responses to issues raised in libraries (Borkdude is everywhere!). It was also revealed that Babashka is used by 90% of Clojure survey respondents. Many discussions revolved around Portal, Cursive, Calva, and whitespace formatting. Many tool makers were taking the opportunity to brainstorm and collaborate closely.
Rich Hickey’s welcome address set a positive, thoughtful tone for the conference. All of the talks were excellent. If you only have time for one, I highly recommend watching "Science in Clojure: A Bird's Eye View" by Thomas, an engaging talk on scientific discovery and systematic innovation presented with a generous splash of humor. Make sure to catch the talks on ClojureTV.
The games night was a hit as always, with plenty of socializing, fun, and competitive strategies employed.
Conversations ranged from whitespace management in code formatting to the potential of brainwave interfaces. The Clojure community's diversity and creativity were on full display. I received lots of encouragement for Hummi.app; many people voiced their hope for a better diagramming app, were really interested in the details of what it does, and had great suggestions.
A big thank you to NuBank and the organizers for making this fantastic event possible.
What follows are my notes about the people I met and the things they were working on:
James Afterglow algorithmic light shows in Clojure.
Chase self taught Clojure as his first programming language. He chose Clojure to go straight to the good stuff. Previously an English literature teacher.
Samuel at Kevel working on advertisement serving. Glad to be working in Clojure, as it is a much more enjoyable challenge than his previous work in mechatronics hardware. Eager to see technology move forward and hopes that we can be more ambitious about creating amazing things.
Jeaye building Jank. Finding a lot of community support, and even more interest from outside the Clojure community.
Hanna coding Clojure 6 years at Nubank, enjoys the work and the team.
Francis is in Datom heaven at Nubank with more data than ever.
Tim S had a great take on the legend of bigfoot and why it was so popular on the west coast.
Hadil, long time Clojure user, first time Conj attendee looking forward to meeting David Nolan and Rich Hickey. Building a react native app in ClojureScript for matching yard work providers with customers. Starting to contribute to the Jank project.
Michael has been enjoying the Microsoft automatic graph layout. Search and replace would be great in a diagram app. Clerk style file based editing of a graph. Should the file be GraphQL or SVG, or maybe another source of truth? Observed astutely that Stu is a man of many disguises, he looks different every time he appears.
Anonymously overheard: “Borkdude just fixed my issue, I only submitted an issue 20 min ago” x2
> I heard several comments about library maintainers being responsive, and libraries being generally stable.
Chris O managing whitespace conflicts is not your job, let’s all move on.
> I’d love to not have to deal with whitespace when collaborating. It’s a pain, and unproductive.
Chris B meeting with many users of Portal. Would love to appear inline with Cursive's new inline evaluation output.
> I’d love to be able to display HTML inline, hopefully kindly visualizations as well.
Chris H has a book project underway.
> Joy of Clojure is a popular recommendation as a book that makes you think. Maybe a new edition is on the way?
Ivan on raising kids in the big city. My daughter on vacation asks “where are the tall buildings?”
Osei and Eli were deep in consultation with the Datomic team.
Toby is looking forward to a trip to Italy in April.
Lauren recently completed a GraphQL from REST refactoring.
Paul is solving sleep apnea at sovaSage. Their team and customer base is growing.
Kelvin works with Milt on a military training system.
> I remember meeting Milt at a Conj nearly a decade ago when he recommended I look into SVG when I was first thinking of making diagrams.
Colin inline Cursive evaluation and separate prompting plugin. IntelliJ code formatting is a constraint solver, is rule (3) a problem? Cljfmt kind of breaks it. Would like to standardize, but will need to re-implement according to the constraint solver. I hope Claude can help him get that done.
> Such a large impact on so many programmers
Pez is enjoying replicant and portfolio. postscript was the most dynamic LiSP. He hopes that Clojure formatting can standardize, and that he can promote a default in Calva.
> Great to see Pez and Colin together discussing how to implement rule 3. Both wanting to improve the compatibility of default formatting across editors.
David N is now at DRW, sifting through layers of Clojure archaeology. Oh that was popular, then this, then that. Seeing slices of history of when certain patterns were in favor.
> Nothing ever truly goes out of style in Clojure it seems.
Brandon is leveraging machine learning and making mindful investments.
Rich importance of ideas, fun, planting seeds of community outside the current borders. Go there and share. Enable creative people, optimists. Retired from Nubank, but still working on Clojure with the core team.
> Having the opportunity to shake Rich’s hand and hear his thoughts on community building was inspiring.
Aaron has a graph layout constraint system. Working on compute services at Equinix. Looking forward to visiting the Smithsonian National Air and Space Museum.
Dustin is building Electric Clojure with his team. They are available for consulting work to help companies develop applications using their technology to drastically reduce the code for client/server functionality.
> It’s amazing what they have achieved with Electric Clojure, I wish I had a commercial application to try it on. Such a principled approach focused on delivering value without incidental complexity.
Joe gave some insights on dealing with scale at Nubank, and the advantages of being able to preload data for queries.
Nacho is engineering manager at Splash addressing student debt and marginal loans.
> Nacho has a lot of empathy for those in need, this seems like a great fit.
Thomas’ talk - entertaining, informative and engaging, new testable knowledge, discovery, systematic exploration. Wolframite. Undeclared dependency on prayer. What quantum mechanics tells us about distributed systems. Dead plots, and the role Clay has in solving that. The trend of native Clojure libraries replacing python interop. Order vs Chaos - voluntary cooperation. Science in a nutshell. Emmy for symbolic math. Responsiveness of the community, generate me added Ince on request.
Designing experiments with mirrors and lasers. The laser connects everything. Can split the beam. Arrangement usually doesn’t matter except when refracting or polarizing. Similar to electric circuits?
Wesley commented on the absence of graph libraries in Clojure.
> Loom (no longer maintained) and Ubergraph are good, but many algorithms only have reference implementations in C and JavaScript.
Kira at Broadpeak (commodities trading) working with a heap of data. Building a grammar of graphics for ggplot style visualizations. Sits between very specific charts and basic rendering like SVG. Good for combining visualizations.
> ggplot uses functions that build data as a concise way to express visualizations.
Adam interested in using kindly-render as a way to embed visualizations in badspreadsheet.
Andrew thing.geom needs a new maintainer!
Lucas transpiling and resurrecting out-of-circulation games; job hunting for junior roles is difficult when your current job title is senior.
Cam is looking forward to Halloween. Had 700 trick or treaters last year, and now has a giant spider attraction set up in anticipation. When we find a bug in HoneySQL, Sean has already fixed it by the time we report it.
Sean recommended reading The Socratic Method (Ward Farnsworth). Rich recommended it in his talk last year Design in Practice, and Sean has found it useful for organizing thinking.
Sofia had read Socrates, and is going into a master's program in human-computer interaction through brainwaves, looking to improve the isolation of wave patterns.
> I hope somebody proposes a talk on the Socratic method next Conj
Quetzaly interested in joining Clojure Camp (https://clojure.camp/). This was the first time I’d heard of Clojure Camp so Sofia showed us how it works. It’s based around pairing mentors and mentees. You can choose how many times you’ll be paired, what times work for you, and what topics you are interested in. You opt in every week that you’d like to continue.
> So great to have this as a way for people to pair up and work together on something productive.
Torstein fish feed management and automation. Norway has 2 written languages Bokmål and Nynorsk, and overlaps languages with Swedish and Danish. There are many dialects. The Sámi people live in Norway, Sweden and Russia and have yet another language. Right now they cannot cross the Russian border. Learning more languages is a good thing, you learn to express more, be more creative. Nationalism and a drive to have one language has had the opposite, divisive effect.
> I am astonished at how complex the interactions of language and culture can be.
David Y building a tax accounting startup.
> Datascript with Firebase backend remains a popular stack, but relies on custom implementations.
Carin had a book recommendation “Slow Horses”. LLMs can help us think. She hopes the next generation uses them to improve thinking instead of replacing thinking.
> The next generation is growing up in such a different learning environment, I hope they can think better than we ever could.
Jay is working at Viasat on DHCP network management.
Clojure/Conj 2024 left me inspired and grateful for the connections I made. The conference highlighted the language's evolution, its vibrant community, and its potential to shape the future of technology. I'm excited to see where the Clojure journey leads us next!
Efficient time reporting is the baseline of trust. Doing it manually is a frustrating and time-consuming process. I’ll show you a way to do it with less pain, using ActivityWatch and some Babashka scripting.
Source of truth:
Who would know better what you are working on, and when, than your version control system? Whenever you call git, it always knows
your current project
the current branch name
the changes you have already made
etc.
And we are only one step away from knowing it too.
Introduction to ActivityWatch:
ActivityWatch is an open-source, automated time-tracking application. It’s designed to record how your time is spent across your different devices and applications. ActivityWatch uses buckets and events to keep track of user activities: AFK status, active windows, and, through browser extensions, the visited pages. On its website you will find several plugins and watchers for different tools and editors.
Babashka Scripts – Empowering Clojure:
Babashka is a scripting environment for Clojure, providing a fast-starting platform for scripting in this robust language. Babashka scripts bring the full power of Clojure into shell-like scripts without compromising on speed, making it an ideal candidate for quick automation tasks.
ActivityWatch RestAPI:
The REST API documentation of ActivityWatch is a good place to start, as is the API documentation served by the application itself. From these two sources you will be able to learn about buckets, events and heartbeats.
When you execute a git command in any directory, git knows whether that directory belongs to a git repository and where its root directory is. From your shell this command is
git rev-parse --show-toplevel
And to find out your current branch you execute
git branch --show-current
From Babashka you can execute these commands using the babashka.process library.
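For example, with babashka.process (bundled with Babashka); these helpers throw if run outside a git repository:

```clojure
(require '[babashka.process :refer [shell]]
         '[clojure.string :as str])

(defn git-root
  "Absolute path of the repository root for the current directory."
  []
  (-> (shell {:out :string} "git rev-parse --show-toplevel")
      :out
      str/trim))

(defn git-branch
  "Name of the currently checked out branch."
  []
  (-> (shell {:out :string} "git branch --show-current")
      :out
      str/trim))
```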
As you can see above, we already have the data (the current branch / feature we are working on, and the project in the form of the git root directory); all we have to do is store it in ActivityWatch. Once you have your bucket, it’s enough to send one event; after that you can keep tracking your git activities using only heartbeats.
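A sketch of that, using babashka.http-client and cheshire (both bundled with Babashka). The bucket id, event type, and pulsetime are my own choices, and the endpoints follow my reading of the ActivityWatch REST docs, so verify them against your locally served API documentation:

```clojure
(require '[babashka.http-client :as http]
         '[cheshire.core :as json])

;; default local ActivityWatch server; the bucket id is a hypothetical choice
(def bucket-url "http://localhost:5600/api/0/buckets/aw-watcher-git")

(defn ensure-bucket!
  "Creates the bucket; a non-2xx response just means it already exists."
  []
  (http/post bucket-url
             {:headers {"Content-Type" "application/json"}
              :body    (json/generate-string {:client "git-watcher"
                                              :type   "git.activity"})
              :throw   false}))

(defn heartbeat!
  "Records activity for the given project/branch; ActivityWatch merges
   identical events that arrive within the pulsetime window."
  [project branch]
  (http/post (str bucket-url "/heartbeat?pulsetime=120")
             {:headers {"Content-Type" "application/json"}
              :body    (json/generate-string
                         {:timestamp (str (java.time.Instant/now))
                          :duration  0
                          :data      {:project project :branch branch}})}))
```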
So far so good, we have everything for creating events in ActivityWatch, based on the actual directory. How to do it periodically, from the actual working directory? It’s easy. You shouldn’t do it by yourself. Leave it to your editor, which should execute your script when git should be used. In IntelliJ you can set up the git path to yourself:
I’m sure you can figure it out for your IDE too. You can set up an alias for your shell too. Like I did:
alias git='/home/g-krisztian/git/git-watcher/git.bb'
I owe you one more thing, the missing pieces like the main function and how to execute it.
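Here is a sketch of how those pieces might fit together. The path to the real git binary is an assumption, and git-root, git-branch, and heartbeat! are the helpers sketched earlier:

```clojure
#!/usr/bin/env bb
;; git.bb - wraps the real git binary and logs activity as a side effect
(require '[babashka.process :as p])

(defn -main [& args]
  ;; log the activity first, but never let tracking break git itself
  (try
    (heartbeat! (git-root) (git-branch))
    (catch Exception _ nil))
  ;; then forward all arguments to the real git binary
  (let [{:keys [exit]} (apply p/shell {:continue true} "/usr/bin/git" args)]
    (System/exit exit)))

(apply -main *command-line-args*)
```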
Babashka scripts offer an excellent approach to automating the logging of hours into ActivityWatch by utilizing data from Git. This approach not only automates the time-tracking process but minimizes manual entry. It brings time tracking into the developer’s world, and results in a well-detailed base for the actual time reporting, which I will write about next time.
We’ve got our first set of reports from developers working on short-term projects funded in Q3 2024. You’ll find a brief description of each project at the top of the page to provide some context – followed by current project updates.
Develop an alternative builder for Nix that uses Babashka / SCI instead of Bash. It provides a way to construct complex Nix derivations entirely in Babashka/SCI, eliminating the need to write any Bash code.
Also, I will be adding a Babashka writer to nixpkgs. Nixpkgs supports creating self-contained scripts, called “nix writers.” These were traditionally written in Bash, but recent versions of nixpkgs can also generate them in Python, Ruby, or Rust.
Create a new web application for clj-async-profiler that will allow users to host and share the generated flamegraphs. At the moment, even though flamegraphs are just self-contained HTML files, sending them to
somebody is a chore. The new service can make this much easier and offer extra features
like saving and sharing dynamic transforms on the fly. Additionally, I’d like to focus on the UI
side of clj-async-profiler - add new dynamic transforms, improve aesthetics and the UX.
For clj-java-decompiler, expand its preprocessing abilities to present clearer results to the user and integrate it more tightly with established Clojure IDEs like CIDER and Calva, which requires some groundwork.
Jank feels like Clojure now, with 92% syntax parity and nearly 40% clojure.core parity. But it only feels like Clojure to me because none of you are using it yet. My top priority is to change that. I’ll be working on building out jank’s nREPL server, which involves implementing bencode support, clojure.test, improving native interop, supporting pre-compiled binary modules, and ultimately adding AOT compilation support.
Continue development of Kushi, a foundation for building web UI with ClojureScript. Work this funding cycle will focus on finishing the new css transpilation pipeline, significant build system performance upgrades, and implementing a reimagined theming system.
This project (Constraints and Humanization) aims to drastically improve the expressivity of Malli schemas to help address current user feedback and enable future extensions. The basic idea is to add a constraint language to each schema to express fine-grained invariants and then make this constraint language compatible with validators/explainers/generators/etc so that Malli users can write high-level, precise schemas without resorting to partial workarounds. See prototype here: https://github.com/frenchy64/malli/pull/12
Check out Daniel’s video: https://www.youtube.com/watch?v=WO6mVURUky4. Scicloj is a Clojure group developing a stack of tools & libraries for data science.
Alongside the technical challenges, community building has been an essential part of its
efforts since the beginning of 2019. Our community-oriented goal is making the existing data-science stack easy to use through the maturing of the Noj library, mentioned below. In particular, we are working on
example-based documentation, easy setup, and recommended workflows for common tasks. All these, and the tools to support them, grow organically, driven by real-world use-cases. See updates for progress on Q3 projects and documentation.
Continue work on Standard Clojure Style - which is a “no config, runs everywhere, follows simple rules” formatter for Clojure code. More information about the genesis of the project can be found on Issue #1:
https://github.com/oakmac/standard-clojure-style-js/issues/1
Project Updates: Sept. and Oct. 2024
clj-Nix: José Luis Lafuente
Q3 2024 Report No. 1, Published Oct. 13, 2024
In this first half of the funding round, I made progress on two fronts:
Babashka writer on nixpkgs
I added a new writer, writeBabashka, to nixpkgs. It’s already merged and
currently available on the
nixpkgs and nixos unstable branch.
As you can see in the documentation, there are two versions, writeBabashka and
writeBabashkaBin. Both produce the same output script. The only difference is
that the latter places the script within a bin subdirectory. That’s a common
pattern in nixpkgs, for consistency with the convention of software packages
placing executables under bin.
Something I still want to do is create a repository with some examples of how to use the Babashka writer.
Nix Derivation Builder with Babashka
The build step of a Nix derivation is defined by a Bash script. I want to
provide an alternative builder written in Clojure (using Babashka).
I have a working prototype, but the API is still under development and may
change in the future. You can find the initial version on the bbenv branch:
clj-nix/extra-pkgs/bbenv
A pull request for early feedback is available here:
clj-nix PR #147
Here’s a glimpse of how it currently works. This example builds the
GNU Hello Project.
I’ve spent the first month of my Clojurists Together project polishing the user experience for clj-async-profiler. The profile viewer UI (the flamegraph renderer) received big improvements in navigation, ease of access, consistency, and overall look. As a bullet list:
Published one big release of clj-async-profiler (1.3.0) and two small releases (1.3.1 and 1.3.2). Most important changes:
Completely redesigned collapsible sidebar.
Better rendering performance and responsiveness.
New custom on-hover tooltip.
Fewer configuration options for better out-of-the-box experience.
Prepared multiple UI mockups for the flamegraph sharing website that I’m starting to work on in the second month.
Jank: Jeaye Wilkerson
Q3 2024 Report No. 1, Published Oct. 14, 2024
Hi everyone! It’s been a few months since the last update and I’m excited to
outline what’s been going on and what’s upcoming for jank, the native Clojure
dialect. Many thanks to Clojurists Together and my Github sponsors for the
support. Let’s get into it!
Heart of Clojure
In September, I flew from Seattle to Belgium to speak at Heart of Clojure. For
the talk, I wanted to dig deep into the jank details, so I created a walk-through
of implementing exception handling in jank. You can watch my talk here.
Announcement
Part of my Heart of Clojure talk was an announcement that, starting in January
2025, I’ll be quitting my job at EA to focus on jank full-time. Two years ago, I
switched from full-time to part-time at EA in order to have more time for jank.
Now, with the momentum we have, the interest I’ve gathered, and the motivation
backing this huge effort, I’m taking things all the way.
I don’t have funding figured out yet, though. It’s hard for companies to invest
in jank now when they’re not using it, when it’s not providing them value. So my
goal is to get jank out there and start creating value in the native Clojure
space. If using jank interests you and you want white glove support for
onboarding jank once it’s released, reach out to me.
Mentoring
On top of working on jank full-time, next year, I have joined the
SciCloj mentorship program
as a mentor and have two official mentees with whom I meet weekly (or at least
once every two weeks) in order to help them learn to be compiler hackers by
working on jank. This is in tandem with the other mentee I had prior to the
SciCloj program.
What’s so inspiring is that there were half a dozen interested people, who
either reached out to me directly or went through the application process, and
we had to pare down the list to just two for the sake of time. Each of those
folks wants to push jank forward and learn something along the way.
JIT compilation speeds
Now, jumping into the development work which has happened in the past few
months, it all starts with me looking into optimizing jank’s startup time. You
might think this is a small issue, given that jank needs more development
tooling, improved error messages, better Clojure library support, etc. However,
this is the crux of the problem.
jank is generating C++ from its AST right now. This has some great benefits,
particularly since jank’s runtime is implemented in C++. It allows us to take
advantage of C++’s type inference, overloading, templates, and virtual dispatch,
whereas we’d have none of those things if we were generating LLVM IR, machine
code, or even C.
However, JIT compiling C++ as our primary codegen comes with one big problem: C++
is one of the slowest-to-compile languages there is. As a concrete example, in
jank, clojure.core is about 4k (formatted) lines of jank code. This codegens
to around 80k (formatted) lines of C++ code. On my beefy desktop machine, it
takes 12 seconds to JIT compile all of that C++. This means that starting jank,
with no other dependencies than clojure.core, takes 12 seconds.
To be fair, all of this disappears in AOT builds, where startup time is more
like 50ms. But starting a REPL is something we do every day. If it takes 12
seconds now, how long will it take when you start a REPL for your company’s
large jank project? What if your machine is not as beefy? A brave user who
recently compiled jank for WSL reported that it took a minute to JIT compile
clojure.core for them.
So, this leads us to look for solutions. jank is already using a pre-compiled
header to speed up JIT compilation. Before abandoning C++ codegen, I wanted to
explore how we could pre-compile modules like clojure.core, too. Very
pleasantly, the startup time improvements were great. jank went from 12 seconds
to 0.3 seconds to start up, when clojure.core is pre-compiled as a C++20
module and then loaded in as a shared library.
There’s a catch, though. It takes 2 full minutes to AOT compile clojure.core
to a C++20 pre-compiled module. So, we’re back to the same problem. jank could
compile all of your dependencies to pre-compiled modules, but it may take 30
minutes to do so, even on a reasonable machine. For non-dependency code, your
own source code, jank could use a compilation cache, but you’ll still need to
pay the JIT compilation cost whenever you do a clean build, whenever you eval a
whole file from the REPL, etc.
Before digging deeper into this, I wanted to explore what things would look like
in a world where we don’t codegen C++.
LLVM IR
LLVM has support for JIT compiling its own intermediate representation (IR),
which is basically a high level assembly language. Compared to generating C++,
though, we run into some problems here:
Calling into C++ is tough, since C++ uses name mangling and working with C++ value
types involves non-trivial IR
We can’t do things like instantiate C++ templates, since those don’t exist
in IR land
So we need to work with jank at a lower level. As I was designing this, in my
brain, I realized that we just need a C API. jank has a C++ API, which is what
we’re currently using, but if we had a C API then we could just call into that
from assembly. Heck, if we can just write out the C we want, translating that to
assembly (or IR) is generally pretty easy. That’s what I did. I took an example
bit of Clojure code and I wrote out some equivalent C-ish code, using a made-up
API.
This was motivating. Furthermore, after two weekends, I have the LLVM IR codegen almost entirely done!
The only thing missing is codegen for closures (functions with captures) and try expressions, since those involve some extra work. I’ll give an example of how this looks, with exactly the IR we’re generating, before
LLVM runs any optimization passes.
Clojure
(let [a 1
b "meow"]
(println b a))
LLVM IR
; ModuleID = 'clojure.core-24'
source_filename = "clojure.core-24"

; Each C function we reference gets declared.
declare ptr @jank_create_integer(ptr)
declare ptr @jank_create_string(ptr)
declare ptr @jank_var_intern(ptr, ptr)
declare ptr @jank_deref(ptr)
declare ptr @jank_call2(ptr, ptr, ptr)

; All constants and vars are lifted into internal
; globals and initialized once using a global ctor.
@int_1 = internal global ptr 0
@string_2025564121 = internal global ptr 0
@0 = private unnamed_addr constant [5 x i8] c"meow\00", align 1
@var_clojure.core_SLASH_println = internal global ptr 0
@string_4144411177 = internal global ptr 0
@1 = private unnamed_addr constant [13 x i8] c"clojure.core\00", align 1
@string_4052785392 = internal global ptr 0
@2 = private unnamed_addr constant [8 x i8] c"println\00", align 1

; Our global ctor function. It boxes all our
; ints and strings while interning our vars.
define void @jank_global_init_23() {
entry:
  %0 = call ptr @jank_create_integer(i64 1)
  store ptr %0, ptr @int_1, align 8
  %1 = call ptr @jank_create_string(ptr @0)
  store ptr %1, ptr @string_2025564121, align 8
  %2 = call ptr @jank_create_string(ptr @1)
  store ptr %2, ptr @string_4144411177, align 8
  %3 = call ptr @jank_create_string(ptr @2)
  store ptr %3, ptr @string_4052785392, align 8
  %4 = call ptr @jank_var_intern(ptr %2, ptr %3)
  store ptr %4, ptr @var_clojure.core_SLASH_println, align 8
  ret void
}

; Our effecting fn which does the work of the actual code.
; Here, that just means derefing the println var and calling it.
define ptr @repl_fn_22() {
entry:
  %0 = load ptr, ptr @int_1, align 8
  %1 = load ptr, ptr @string_2025564121, align 8
  %2 = load ptr, ptr @var_clojure.core_SLASH_println, align 8
  %3 = call ptr @jank_deref(ptr %2)
  %4 = call ptr @jank_call2(ptr %3, ptr %1, ptr %0)
  ret ptr %4
}
There’s still more to do before I can get some real numbers for how long it
takes to JIT compile LLVM IR, compared to C++. However, I’m very optimistic. By
using a C API instead of our C++ API, handling codegen optimizations
like unboxing ends up being even more complex, but we also have even more power.
How this affects interop
Currently, jank has two forms of native interop (one in each direction):
A special native/raw form which allows embedding C++ within your jank code
The ability to require a C++ file as though it’s a Clojure namespace, where that
C++ code then uses jank’s runtime to register types/functions
When we’re generating C++, a native/raw just gets code-generated right into
place. However, when we’re generating IR, we can’t sanely do that without
involving a C++ compiler. This means that native/raw will need to go away, to
move forward with IR. However, I think this may be a good move. If we buy into
the second form of interop more strongly, we can rely on actual native source
files to reach into the jank runtime and register their types/functions. Then,
in the actual jank code, everything feels like Clojure.
This means that we still have a need for JIT compiling C++. Whenever you require
a module from your jank code, which is backed by a C++ file, that code is JIT
compiled. Generally, what the C++ code will do is register the necessary functions
into the jank runtime so that you can then drive the rest of your program
with jank code. I think this is a happy medium, where we still have the full
power of C++ at our disposal, but all of our jank code will result in IR, which
will JIT compile much more quickly than C++.
This means the answer to the question of C++ or IR is: why not both?
jank as THE native Clojure dialect
There’s another reason which leads me to explore LLVM IR within jank. While jank
is embracing modern C++, it doesn’t need to be so tightly coupled to it. By
using just the C ABI as our runtime library, everything can talk to jank. You
could talk to jank from Ruby, Lua, Rust, and even Clojure JVM. Just as
importantly, jank can JIT compile any LLVM IR, which means any language which
compiles on the LLVM stack can then be JIT compiled into your jank program.
Just as jank can load C++ files as required namespaces, seamlessly, so too could
it do the same for Rust, in the future. Furthermore, as the public interface for
jank becomes C, the internal representation and implementation can change
opaquely, which would also open the door for more Rust within the jank compiler.
In short, any native work you want to do in Clojure should be suited for jank.
Your jank code can remain Clojure, but you can package C, C++, and later
languages like Rust inside your jank projects and require them from your jank
code. The jank compiler and runtime will handle JIT compilation and AOT
compilation for you.
Community update
This has been a long update which hopefully created some more excitement for
jank’s direction. I want to wrap up with what the community has been up to
recently, though, since that alone warrants celebration.
Characters, scientific notation, and to_code_string
Saket has been improving jank’s runtime character
objects, which he originally implemented, to be more efficient and support
Unicode. He also recently added scientific notation for floating point values,
as well as an extension of jank’s object concept to support to_code_string,
which allows us to now implement pr-str.
At this point, Saket has the most knowledge of jank’s internals, aside from me,
so I’ve been giving him heftier tasks and he’s been super helpful.
More robust escape sequences
One of my SciCloj mentees, Jianling,
recently merged support for all of the ASCII escape sequences for jank’s
strings. Previously, we only had rudimentary support. Now he’s working on
support for hexadecimal, octal, and arbitrary radix literals, to further jank’s
syntax parity with Clojure.
Nix build
We have a newcomer to jank, Haruki, helping to
rework the build system and dependencies to allow for easy building with Nix!
There’s a draft PR here. I’m
excited for this, since I’m currently using NixOS and I need to do a lot of jank
dev in a distrobox for easy building. This will also help with stable CI builds
and ultimately getting jank into nixpkgs (the central package repo for Nix).
LLVM 19 support
The last JIT hard crash fix in LLVM is being backported to the 19.x branch,
which means we should be able to start using Clang/LLVM binaries starting with 19.2!
This is going to drastically simplify the developer experience and allow for
packaging jank using the system Clang/LLVM install. My
backport ticket
has been closed as complete, though the PR
into the 19.x branch is still open.
Summary
More people are working on jank now than ever have; I expect this number to
keep growing in the coming year. I’ll see you folks at the Conj and, after that,
in my next update during the holiday season, when I’ll have some final numbers
comparing jank’s startup times with LLVM IR vs C++, as well as some updates on
other things I’ve been tackling.
Kushi: Jeremiah Coyle
Q3 2024 Report No. 1, Published Oct. 15, 2024
Q3 Milestones
Thanks to the funding from Clojurists Together, the Q3 development of Kushi is aimed at achieving the following 3 milestones:
Finishing the new css transpilation API.
Reimplementing the build system for enhanced performance.
A reimagined theming system.
Progress
Milestone #1: Finishing the new css transpilation API.
Goals
Solidify the API design and implementation of Kushi’s CSS transpilation functionality.
Incorporate the use of lightningcss for CSS transformation (older browsers) and minification.
Refactor existing public functions for injecting stylesheets, google fonts, etc.
Progress: Complete. The majority of the time spent working on Kushi in the first half of the funding period was focused on implementing a new CSS transpilation pipeline. An updated set of public macros and supporting functions was designed and implemented around this. A broad test suite was written, which consists of 7 tests containing 54 assertions.
Next steps: When the work on the new build system (milestone #2) is complete, test this new API by upgrading existing UI work (such as Kushi’s interactive documentation site) that uses the current (soon to be previous) API, then adjust and refine implementation details as necessary.
Milestone #2: Reimplementing the build system for enhanced performance.
Goal: Reimplement the build system for performance, and eliminate the use of side-effecting macros.
Progress: 75% of the initial design, discovery, and experimentation phase is complete.
Next steps: I anticipate moving it into the implementation phase by the last week of October. Roughly half of the remaining 6 weeks of the Q3 period will be spent building this out.
Milestone #3: A reimagined theming system.
Goal: Based on learnings from using Kushi to build a lot of production UI over the last 2-3 years, redesign and implement a new theming system. This will involve harmonizing 3 principled subsystems:
Design tokens (a global system of CSS custom properties).
Utility classes.
Component-level data-attribute conventions.
Progress: 90% of the initial design, discovery, and experimentation phase is complete.
Next steps: I am hoping that enough progress will be made on the build system so that I can focus Kushi dev time on this new theming system for the last 3 weeks of November.
Details
All of the work related to Milestone #1 has been happening in a sandbox repo called kushi-css. Additional in-depth detail and documentation around this work can be found here. When all 3 of the above milestones are complete, this work will be integrated into the main kushi repo.
Malli: Ambrose Bonnaire-Sergeant
Q3 2024 Report No. 1, Published Oct. 18, 2024
Malli Constraints - Report 1
This is the first report of three in the project to extend Malli with constraints.
tl;dr: I gave a talk, started the implementation, and reflect on its successes and failures below.
Thanks to Tommi (Malli lead dev) for working with me to propose the project, and the encouragement
I received from the Malli community and my friends.
This is a long update that really helps me get my thoughts in order for such a complex project. Thanks for reading and feel free to email me any questions or comments.
Background
In this project, I proposed to extend the runtime verification library
Malli with constraints with the goal of making the library more expressive
and powerful.
With this addition, schemas can be extended with extra invariants (constraints) that must be
satisfied for the schema to be satisfied. Anyone can add a constraint to a schema.
Crucially, these extensions should work as seamlessly as if the author of the schema
added it themselves.
Before the project started, I had completed an extensive prototype that generated many
ideas. The authors of Malli were interested in integrating these ideas in Malli and this
project aims to have their final forms fit with Malli in collaboration with the Malli devs.
Talk
It had been several months since I had built the prototype of Malli constraints, so
I gave a talk at Madison Clojure which I live-streamed.
You can watch it here.
It was well received and very enjoyable to give. I’m thankful to the attendees for their
engagement and encouragement, and for checking in on my progress during the project.
In the talk, I motivate the need for more expressive yet reliable schemas, propose
a solution in the form of constraints, and sketch some of the design ideas for
making it extensible. I gave this talk a few days before the project started and I hit the ground running.
Design Goals (Constraints)
I’ve had many fruitful interactions with the Malli community and its developers,
and I have a good idea of what the project values. If this constraints project is to be
successful, it must check all the boxes as if it came straight from
the brain of Tommi (well, that’s my goal, Tommi is busy and has enjoyably
high standards). Given how deeply this project aims to integrate with
Malli, that attitude has definitely helped prune ideas (when was the
last time :string or :int changed its implementation? We’re
doing exactly that here).
There is a mundane but critical issue that Malli has been steadily
increasing its CLJS bundle size. I decided early on that my design
for constraints would be opt-in, so that the Malli devs can decide
whether it’s worth including by default. If adding constraints irreversibly
increased the CLJS bundle size to the point that Malli devs started worrying,
this project would be in jeopardy.
My prototype made constraints an entirely custom construct, unrelated
to the rest of Malli. It’s helpful to look at a related project under
similar circumstances: extending Malli to add sequence expressions
like :alt and :*. Sequence expressions
are a different abstraction than schemas, and yet they share many implementation
concepts, both even implementing m/Schema. Sequence expressions then
implement additional protocols for their characterizing operations.
I wanted to take inspiration from this: constraints should be like
schemas in their overlapping concepts, introducing new abstractions
only for differences.
I would like the constraints framework be merged incrementally, starting
with very simple constraints on the count and size of collections
and numbers. However, the framework itself should be fully realized
and be able to support much more complex constraints.
The last few goals are easy to list, but maximizing them all simultaneously
might be difficult, as they seem to be in deep tension.
Constraints should be fast, extensible, and robust.
It should be possible to achieve equivalent performance to a “hand-coded”
implementation of the constraint. It should be possible to implement as
many kinds of constraints as possible without having to change the constraint
framework itself. Constraints should have predictable, reliable semantics
that are congruent to the rest of Malli’s design.
Summary of goals:
control bundle size
backwards compatibility
equivalent performance
in tension with extensibility and compilation
extensibility
provide expressivity and power of primitives
robustness
think about Malli’s unwritten rules (properties round-trip)
first iteration should be fully realized but minimal
My first attempt at an idiomatic implementation of schema constraints
was completed in the first half of September. Since then it’s been
hammock-time pondering the results. I have surprisingly strong feelings
in both directions.
I go into more detail below.
Opt-in for CLJS Bundle size
I was able to separate the constraints framework from
malli.core so it can be opt-in to control CLJS bundle size.
The main code path adds several functions and a couple of protocols,
but the constraints themselves are loaded separately via an atom
that also lives in malli.core. This atom m/constraint-extensions
can be empty which will disable constraints, kicking in a backwards-compatibility mode for schemas
that migrated their custom properties to constraints (like :string’s :min and :max).
I went back and forth about whether to use a single global atom or
to add some internal mutable state to each schema that could be upgraded
to support constraints. In this implementation, I decided a global atom was
more appropriate for two reasons. First, registries can hold multiple copies
of the same schema but only one will “win”. We don’t want situations where
we extend a schema with constraints and then it gets “shadowed” by another
instance of the same schema, since that is functionally equivalent in all
other situations. Second, we already have an established way of extending schemas
to new operations in the form of a multimethod dispatching on m/type. I wanted
a similar experience where an entire class of extension is self-contained in one
global mutable variable.
Extending schemas with constraints is subtly different to many other kinds of
schema extensions, in that it is much finer grained. defmulti is appropriate
for defining generators or JSON Schema transformers where a schema extension
maps precisely to a function (defmethod), but extending constraints
is more like having a separate atom for each schema containing a map where
each field can itself be extended with namespaced keywords. A single global
atom containing a map from schemas to constraint configuration became the natural
choice (an atom of atoms is rarely a good idea).
Ultimately the constraint implementation is activated by calling
(malli.constraint/activate-base-constraints!).
Reusing existing abstractions
Constraints implement m/Schema and their m/IntoSchema’s live in the registry.
They differ from schemas in how they are constructed and print
(it depends which schema the constraint is attached to) so they have their own
equivalent to m/schema in malli.constraint/constraint.
As I outlined in my talk, it was important to have multiple ways to parse
the same constraint for maximum reuse. For example, [:string {:min 10}] and [:vector {:min 10}]
should yield equivalent constraints on count, while [:int {:min 10}] and [:float {:min 10}]
yield constraints on size. This is useful when solving constraints for generators
(malli.constraint.solver).
Extensibility
The new implementation converts the :min, :max, :gen/min, and :gen/max
properties on the :string schema to constraints. They are implemented
separately from :string itself in a successful test of the extensibility
of constraints.
malli.constraint/base-constraint-extensions contains the configuration
for these :string constraints, which consist of the :true, :and, and :count
constraints. There are several ways to attach :count constraints to a :string,
each of which has a corresponding parser. For example, a string with a count between
1 and 5 (inclusive) can be created via [:string {:min 1 :max 5}] or
[:string {:and [:count 1 5]}]. The :string :parse-properties :{min,max} configuration shows how
to parse the former and :string :parse-constraint :count the latter.
Performance
Extensibility and performance are somewhat at odds here. While it’s great that two
unrelated parties could define :min and :max in [:string {:min 1 :max 5}],
we are left with a compound constraint [:and [:count 1] [:count nil 5]] (for the :min
and :max properties, respectively). To generate an efficient validator for the overall constraint we
must simplify the constraint to [:count 1 5]. The difference in validators before
and after intersecting are #(and (<= 1 (count %)) (<= (count %) 5)) and #(<= 1 (count %) 5). Depending on the performance of count, choosing incorrectly could be a large regression in performance.
Constraints have an -intersect method to merge with another constraint
which :and calls when generating a validator. While we regain the performance of validation,
we pay an extra cost in having to create multiple constraints and then simplify them.
Robustness
My main concern is a little esoteric but worth considering. Malli has specific
expectations about properties that constraints might break, specifically that properties
won’t change if roundtripping a schema.
A constrained schema such as [:string {:min 1}] is really two schemas: :string
and [:count 1], the latter the result of the new -get-constraint method on
-simple-schema’s like :string. The problem comes when serializing this schema
back to the vector syntax: how do we know that [:count 1] should be serialized to
[:string {:min 1}] instead of [:string {:and [:count 1]}]? I don’t think
this is a problem for simple constraints like :min since we can just return
the same properties as we started with. There are several odd cases I’m not
sure what to do with.
Here -set-constraint removes all properties related to constraints (since we’re replacing
the entire constraint) and then must infer the properties to serialize the new constraint to.
In this case the constraint configuration in :string :unparse-properties ::count-constraint
chooses [:string {:min 2}], but its resemblance to the initial schema is coincidental
and might yield surprises.
The big task here is thinking about (future) constraints that contain schemas. For example,
you could imagine a constraint [:string {:edn :int}] that describes strings that
edn/read-string to integers. This is very similar to [:string {:registry {::a :int}}]
in that the properties of the schema are actually different before and after parsing the
schema (in this case, m/-property-registry is used to parse and unparse the registry).
Part of the rationale for using -get-constraint as the external interface for extracting
a constraint from a schema, treating each schema as having one constraint
instead of many small ones, is to simplify schema walking. Property registries don’t play
well with schema walking and it takes a lot of work to ensure schemas are walked correctly
(for example, ensuring a particular OpenAPI property is set on every schema, even those in
local registries). Walking schemas inside constraints will be more straightforward. To support constraints,
a schema will extend their -walk algorithm to automatically walk constraints with a separate
“constraint walker”, and constraints like :edn will revert to the original “schema walker”
to walk :int in [:string {:edn :int}]. This logic lives in malli.constraint/-walk-leaf+constraints.
This walking setup is also intended to cleanly handle refs nested inside schemas.
Having schemas in properties leaves us in a fragile place in terms of the consistency of schema
serialization. For example, after walking [:string {:edn :int}] to add an OpenAPI property
on each schema, we might end up with either of two serializations,
depending on the :unparse-property attached to :edn constraints under :string.
Or even more fundamentally, the properties of [:string {:edn :int}] become {:edn (m/schema :int)}
when parsed, but how do we figure out it was originally {:edn :int}? The current approach
(which is a consequence of treating each schema as having one constraint via -{get,set}-constraint)
depends on the unparser in :string :unparse-properties ::edn-constraint to guess correctly.
It is unclear how big of a problem this is. My fundamental worry is that schemas will not round-trip syntactically,
but is this a lot of worry about nothing? Plenty of schemas don’t round-trip syntactically at first, but stabilize
after the first trip, for example [:string {}] => :string => :string. The important
thing is that they are semantically identical. This is similar to what I propose for constraints:
deterministically attempt to find the smallest serialization for the constraint within
the properties. If inconsistencies occur, at best they might annoy some users; at worst,
they could make constraints incomprehensible (to humans) by restating them in technically equivalent ways.
Next
I need to resolve this roadblock of constraint serialization inconsistency. Is it a problem?
If it is, do I need to throw out the entire design and start again?
SciCloj: Daniel Slutsky
Q3 2024 Report No. 1, Published Oct. 3, 2024
The Clojurists Together organisation has decided to sponsor Scicloj community building for Q3 2024, as a project by Daniel Slutsky. This is the second time the project has been selected this year. Here is Daniel’s update for September.
Comments and ideas would help. :pray:
Scicloj is a Clojure group developing a stack of tools and libraries for data science. Alongside the technical challenges, community building has been an essential part of its efforts since the beginning of 2019. Our current main community-oriented goal is making the existing data-science stack easy to use through the maturing of the Noj library, mentioned below. In particular, we are working on example-based documentation, easy setup, and recommended workflows for common tasks.
All these, and the tools to support them, grow organically, driven by real-world use cases.
I serve as a community organizer at Scicloj, and this project was accepted for Clojurists Together funding in 2024 Q1 & Q3. I also receive regular funding from Nubank.
In this post, I am reporting on my involvement during September 2024, as well as the proposed goals for October.
I had 77 meetings during September. Most of them were one-on-one meetings for open-source mentoring or similar contexts.
All the projects mentioned below are done in collaboration with others. I will mention at least a few of the main people involved.
Scicloj is providing mentoring to Clojurians who wish to get involved in open-source. This initiative began in August and has been growing rapidly in September. This program is transforming Scicloj, and I believe it will influence the Clojure community as a whole.
We are meeting so many incredible people who are typically experienced, wise, and open-minded and have not been involved in the past. Making it all work is a special challenge. We have to embrace the uncertainty of working with people of varying availability and dynamically adapt to changes in the team. Building on our years-long experience in community building and open-source collaboration, we know we can support at least some of our new friends in finding impactful paths to contribute. We are already seeing some fruits of this work and still have a lot to improve.
47 people have applied so far. 34 are still active, and 10 have already made meaningful contributions to diverse projects.
I am coordinating the process, meeting all the participants, and serving as one of the mentors alongside generateme, Kira McLean, Adrian Smith, and Jeaye Wilkerson. The primary near-term goals are writing testable tutorials and docs for the Fastmath and Noj libraries. Quite a few participants will be working on parts of this core effort. A few other projects where people get involved are Clay, Kindly, Jank, and ggml.clj.
A few notable contributions were by Avicenna (mavbozo), who added a lot to the Fastmath documentation and tutorials; Jacob Windle, who added printing functionality to Fastmath regression models; Muhammad Ridho, who started working on portability of Emmy Viewers data visualizations; Lin Zihao, who improved Reagent support in the Kindly standard; Epidiah Ravachol, who worked on insightful tutorials for dtype-next array-programming; Oleh Sedletskyi, who started working on statistics tutorials; Ken Huang, who made various contributions to Clay; and Prakash Balodi, who worked on Tablecloth issues and started organizing the Scicloj weekly group (see below).
Noj is an entry point to data and science. It integrates a set of underlying libraries through a set of testable tutorials. Here, there were great additions by generateme and Carsten Behring, and I helped a bit with the integration.
generateme has made a big release of Fastmath version 3.0.0 alpha - a result of work in the last few months - which is affecting a few of the underlying libraries.
Carsten Behring has released new versions of a few of the machine learning libraries.
Carsten also made important changes to Noj in adding integration tests and automating the dev workflow.
I helped in gradually adapting and testing a few of the underlying libraries.
I helped initiate a few documentation chapters that are being written by new community members.
The real-world-data group is a space for people to share updates on their data projects at work.
Meeting #13 was dedicated to talk runs and discussions preceding the Heart of Clojure conference.
Meeting #14 was an interactive coding session of a data science tutorial.
Scicloj weekly
Together with Prakash Balodi, we initiated a new weekly meeting for new community members working on open-source projects.
Intentionally, we use a time slot which is more friendly to East and Central Asia time zones, unlike most Clojure meetups.
We have had three meetings so far, with 4, 15, and 6 participants.
Linear Algebra meetings
We organized a new group that will collaborate on implementing and teaching applied linear algebra algorithms in Clojure.
The first meeting actually took place on October 2nd, so we will update more in the next month.
Sami Kallinen represented Scicloj at Heart of Clojure with an incredible talk about data modelling. The talk was extremely helpful in exploring and demonstrating a lot of the new additions to the Scicloj stack.
I collaborated with Sami on preparing the talk and improving the relevant tools and libraries to support the process.
October 2024 goals
This is the tentative plan. Comments and ideas would be welcome.
Noj and Fastmath
Both these libraries will receive lots of attention in the form of (testable) tutorials and docs. I will be working with a few people on various chapters of that effort.
We will keep working on stabilizing the set of libraries behind Noj and improving the integration tests.
Open-source mentoring
We are expecting more participants to join.
I will keep working on supporting participants in new beginnings and ongoing projects.
Tableplot
Tableplot is a layered grammar of graphics library.
The goal for the coming few weeks is to bring it to beta stage and mostly improve the documentation.
Tooling
We will keep working on maturing kindly-render and refactoring Clay to use it internally.
Clay will be in active development for code quality, bugfixes, and user requests.
Clojure Conj
The coming Clojure Conj conference will feature a few Scicloj-related talks. At Scicloj, we have a habit of helping each other in talk preparations. We will do that as much as the speakers will find it helpful. We will also organize a couple more pre-conference meetings with speakers, as we did in August.
Standard Clojure Style: Chris Oakman
Q3 2024 Report No. 1, Published Oct. 14, 2024
Standard Clojure Style is a project to create a “follows simple rules, no config, runs everywhere” formatter for Clojure code.
tl;dr
project is usable for most codebases in its current state
many bugs fixed
I will be presenting Standard Clojure Style at Clojure/conj 2024
website is next
Update
As of v0.7.0, Standard Clojure Style is ready for most codebases
Give it a try!
Standard Clojure Style is fast: Shaun Lebron shared some benchmarking on Issue #77
Several adventurous Clojure developers have run Standard Clojure Style against their codebases and found bugs.
I have fixed most of the reported ones.
Seems like most new bugs are “small edge cases” as opposed to “large, fundamentally broken”
A big thank you to these developers and their helpful bug reports!
If you want to help test, please see the instructions in the README
I will be presenting Standard Clojure Style next week at Clojure/conj 2024 :tada:
Come say hello if you will be attending
I will also socialize the project at the conference
Next Up
I will continue work to stabilize the library and algorithm
I will work on a website to explain the project
There should be a “try it online” demo
Explanation of the formatting rules (what are the rules? and why?)
Something that teams can reference when they are deciding to adopt a formatter tool for their Clojure project
Earlier this year, the Python SDK had a major facelift. Now other Conductor SDKs are undergoing a significant rehaul, and we are thrilled to announce Java Client v4, featuring major design enhancements, performance improvements, and optimized dependencies.
In our Java Client v4, we’ve reduced the dependency footprint by improving its design. We’ve added filters, events, and listeners to remove direct dependencies. This design decision was made after careful consideration to make the client easier to use, extend, and maintain.
Read on to learn more!
Why are we doing this?
We’ve heard your feedback and are working to improve the developer experience. Orkes and the Conductor OSS community were managing two separate Java client/SDK projects, both of which, like many in our industry, had accumulated some technical debt. We decided it was the right time to address this debt.
Some of the key things we wanted to address immediately were:
One Conductor Java client (to rule them all)
The goal was to consolidate the two existing projects into a unified, more manageable solution, taking the strongest elements from each. This should translate into faster updates, better support, and a more cohesive development experience.
Dependency optimization
As part of code cleanup, we’ve removed several dependencies:
Dependencies on backend code—The previous OSS Java client and SDK projects were part of the Conductor OSS repo and depended on conductor-commons. Although this kept the backend/client models in sync, it also meant some backend-related code and dependencies were leaking to the client.
Dependencies on deprecated artifacts.
Dependencies on stuff you won’t be needing.
By removing hard-coded dependencies, users and contributors can extend the client without being locked into specific libraries or tools.
More modularity
We’ve restructured the project to increase modularity, making the client more flexible and easier to customize.
With this modular approach, you can integrate your preferred monitoring, logging, or discovery tools through events, listeners, and filters. This not only simplifies customization but also makes the codebase more maintainable and future-proof, empowering developers to build and scale their own extensions as needed.
Code cleanup/refactoring
With a cleaner codebase, future development should be faster and less error-prone, making it easier for community contributions as well.
Better examples
We've introduced a module within the project specifically for examples. While it's still a work in progress, this module will serve as a central resource for practical, real-world examples whether you're getting started or looking for advanced use cases.
Going forward, all Conductor Clients and SDKs will eventually be housed in the same conductor-clients directory in the conductor-oss/conductor repo. Head there to find the source code for the Java Client/SDK v4.
What’s new in Java Client v4?
1. Optimized dependencies
Java Client v4 introduces a more streamlined and efficient dependency set compared to the two projects it replaces.
We’ve removed all unused, deprecated, and unnecessary dependencies, significantly reducing classpath pollution. This optimization not only minimizes the risk of conflicts between libraries but should also improve overall performance and maintainability. By simplifying the dependency tree, v4 provides a cleaner and more lightweight client that is easier to work with and integrates more smoothly into your projects.
2. New TaskRunner
TaskRunner has been refactored. It replaces TaskPollExecutor, since both share the same core responsibility: managing the thread pool that workers use for polling, executing, and updating tasks.
With that, we've removed direct dependencies on Netflix Eureka and Spectator, introduced event-driven mechanisms, and added a PollFilter—a callback that determines whether polling should occur. Additionally, error handling and concurrency management have been improved.
If you’re using Eureka and Spectator, no need to worry—events and filters are provided for seamless integration with these great tools and libraries.
3. Extensibility using events, listeners, and filters
Java Client v4 introduces enhanced extensibility through events, listeners, and filters. These can be used for various purposes, including metrics tracking, logging, auditing, and triggering custom actions based on specific conditions.
For example, you can use a Lambda Function as a PollFilter to check the instance status as reported by Eureka. If the instance is marked as UP—meaning Eureka considers it healthy and available—the worker will proceed to poll for tasks.
Additionally, a listener can be registered to handle PollCompleted events. In this scenario, the listener logs event details and uses Prometheus to track the duration of the polling process, attaching the task type as a label for detailed metrics tracking. This approach not only adds flexibility but also improves observability and control over the client's behavior.
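As a sketch of the mechanism (the names below are illustrative stand-ins using plain `Predicate`/`Consumer`, not the client's real `PollFilter` and listener interfaces — check the v4 sources for those), a filter-plus-listener pipeline looks roughly like this:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

public class ExtensibilitySketch {
    // stand-in for the client's PollCompleted event
    record PollCompleted(String taskType, long durationMs) {}

    // one "poll cycle": consult the filter, then notify the listener
    static List<String> pollOnce(String instanceStatus) {
        List<String> observed = new ArrayList<>();

        // PollFilter-style callback: only poll while Eureka reports UP
        Predicate<String> pollFilter = status -> "UP".equals(status);

        // listener for PollCompleted events, e.g. feeding a metrics registry
        Consumer<PollCompleted> onPollCompleted =
                e -> observed.add(e.taskType() + ":" + e.durationMs() + "ms");

        if (pollFilter.test(instanceStatus)) {
            // ... poll and execute the task, then publish the event ...
            onPollCompleted.accept(new PollCompleted("hello_task", 42));
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println(pollOnce("UP"));
        System.out.println(pollOnce("DOWN"));
    }
}
```

The point is the shape: the worker loop asks the filter before polling and publishes events afterwards, so metrics and health checks plug in without the client depending on any particular library.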
The client also has some specialized interfaces like MetricsCollector, which is built on top of these events and listeners. We’ll be providing concrete implementations of Metrics Collectors soon.
4. OkHttp3 v4 — the right amount of features OOTB
OkHttp3 v4 is one of the most popular and well-regarded HTTP clients for Java. By upgrading to it, our Java Client/SDK v4 now supports HTTP/2 and gzip out of the box, allowing for faster HTTP requests and data transfers. While there are other excellent options, OkHttp was chosen for its simplicity, performance, and reliability.
With the OkHttp upgrade, we also decided to remove one abstraction layer, Jersey. Jersey is more feature-rich but also more heavyweight compared to a simple HTTP client like OkHttp. Some of these features (such as dependency injection and exception mappers) can be overkill if you just want to make basic HTTP requests.
5. Ease of migration from OSS to Orkes
The client promotes seamless integration between OSS Conductor and Orkes Conductor, empowering users with the flexibility to switch as their needs evolve, while maintaining support for the open-source community first.
The Orkes Client module simply extends the Conductor Client by adding authentication through a HeaderSupplier.
OSS users who have created workers with Client v4 but want to give Orkes Conductor a shot just need to add the orkes-conductor-client dependency to their project and instantiate the client with OrkesAuthentication as a HeaderSupplier. Switching back to OSS is as simple as removing that HeaderSupplier.
```java
var client = ConductorClient.builder()
        .basePath(BASE_PATH)
        .addHeaderSupplier(new OrkesAuthentication(KEY, SECRET))
        .build();

return new TaskClient(client); // Use this TaskClient with TaskRunner to initialize workers
```
We’ve started consolidating examples into a dedicated module, with improvements that cover key areas like authorization, managing workflow and task definitions, scheduling workflows, and more. While this module is still a work in progress, we’ll continuously add and refine examples to provide better guidance and cover real-world use cases.
Our goal is to enable developers to use our Client/SDK effectively and explore best practices as the module evolves.
Getting started with Java Client v4
Here’s how you can get started using Java Client v4:
import com.netflix.conductor.client.http.ConductorClient;
// … boilerplate or other code omitted
var client = new ConductorClient("http://localhost:8080/api");
With the dependency added, you can execute the following code to register the workflow:
```java
import com.netflix.conductor.sdk.workflow.def.WorkflowBuilder;
import com.netflix.conductor.sdk.workflow.def.tasks.SimpleTask;
import com.netflix.conductor.sdk.workflow.executor.WorkflowExecutor;

// … boilerplate or other code omitted

var executor = new WorkflowExecutor("http://localhost:8080/api");

var workflow = new WorkflowBuilder<Void>(executor)
        .name("hello_workflow")
        .version(1)
        .description("Hello Workflow!")
        .ownerEmail("examples@orkes.io")
        .add(new SimpleTask("hello_task", "hello_task_ref"))
        .build();

workflow.registerWorkflow(true, true);
executor.shutdown();
```
Step 5: Start a workflow
Now that you have registered a workflow, you can start it with code:
Note: If you want to run these code snippets in Orkes Conductor, remember to authenticate.
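The start snippet isn't included in this extract; with the v4 client it could look roughly like this (a sketch assuming the `WorkflowClient` and `StartWorkflowRequest` APIs — it needs a running Conductor server at the given base path):

```java
import com.netflix.conductor.client.http.ConductorClient;
import com.netflix.conductor.client.http.WorkflowClient;
import com.netflix.conductor.common.metadata.workflow.StartWorkflowRequest;

// … boilerplate or other code omitted

var client = new ConductorClient("http://localhost:8080/api");
var workflowClient = new WorkflowClient(client);

var request = new StartWorkflowRequest();
request.setName("hello_workflow");
request.setVersion(1);

// returns the id of the started workflow execution
var workflowId = workflowClient.startWorkflow(request);
System.out.println("Started workflow " + workflowId);
```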
What’s next?: Roadmap
We aim for full parity between OSS server models and SDKs, so stay tuned for more changes to Conductor’s Java SDK and improved documentation. To stay up-to-date with our progress, check out the Conductor OSS Project Board. To request features or report bugs, create a ticket in our GitHub repository or contact us on Slack.
We’ll also soon tackle other supported languages: JavaScript, C#, Go, and Clojure. In the meantime, happy coding! Don’t forget to follow the Conductor OSS project and support us by giving a ⭐.
Orkes Cloud is a fully managed and hosted Conductor service that can scale seamlessly to meet your needs. When you use Conductor via Orkes Cloud, your engineers don’t need to worry about setting up, tuning, patching, and managing high-performance Conductor clusters. Try it out with our 14-day free trial for Orkes Cloud.
Formal methods in software engineering are mathematical techniques used to specify, develop, and verify software systems. They provide a rigorous framework for ensuring software correctness, reliability, and safety, which is especially crucial in safety-critical industries like aerospace, automotive, and finance.
While formal methods may seem complex at first, practical tools and methods make them accessible for software engineers. Here’s how to get started with using formal methods effectively.
Why Use Formal Methods?
Traditional testing catches bugs, but it cannot prove the absence of errors. Formal methods, on the other hand, provide a mathematical foundation to verify system properties, ensuring that all possible scenarios have been considered. This can drastically reduce bugs, improve security, and enhance system stability.
Key Techniques and Tools
Model Checking
Model checking systematically explores the state space of a system to verify properties like deadlock-freedom or resource safety. A well-known model-checking tool is SPIN, which uses a high-level modeling language called PROMELA.
Example: To verify a concurrent system for deadlocks, you can model the system in PROMELA, define desired properties (e.g., “no deadlock”), and use SPIN to check if any violations occur.
Tip: Start small by modeling key components of your system, then gradually expand the model to include more features.
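To make the idea concrete without PROMELA, here is a toy explicit-state checker in Python for the classic two-lock deadlock (process 0 acquires lock A then B, process 1 acquires B then A). SPIN performs the same kind of exhaustive search, just with far better algorithms and a real modeling language:

```python
# State = (pc0, pc1, owner_a, owner_b); pc: 0 = wants first lock,
# 1 = wants second lock, 2 = holds both (will release), 3 = done.

def successors(state):
    pc0, pc1, a, b = state
    nxt = []
    # process 0: acquire A, acquire B, release both, done
    if pc0 == 0 and a is None:
        nxt.append((1, pc1, 0, b))
    elif pc0 == 1 and b is None:
        nxt.append((2, pc1, a, 0))
    elif pc0 == 2:
        nxt.append((3, pc1, None, None))
    # process 1: acquire B, acquire A, release both, done
    if pc1 == 0 and b is None:
        nxt.append((pc0, 1, a, 1))
    elif pc1 == 1 and a is None:
        nxt.append((pc0, 2, 1, b))
    elif pc1 == 2:
        nxt.append((pc0, 3, None, None))
    return nxt

def find_deadlocks(initial):
    """Exhaustively explore the state space, collecting stuck non-final states."""
    seen, stack, deadlocks = {initial}, [initial], set()
    while stack:
        state = stack.pop()
        succ = successors(state)
        if not succ and (state[0], state[1]) != (3, 3):
            deadlocks.add(state)  # no moves left, but not both finished
        for nxt in succ:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return sorted(deadlocks)

# the checker finds the state where each process holds its first lock
# and waits forever for the second
print(find_deadlocks((0, 0, None, None)))
```

Unlike a test suite, this visits every reachable interleaving, so the absence of reported deadlocks is a proof within the model, not just a sample.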
Theorem Proving
Theorem proving is a formal verification technique that uses mathematical logic to prove that a system meets its specification. Tools like Coq or Isabelle allow you to express system properties and prove them.
Example: In Coq, you define both your system (such as an algorithm) and a set of theorems that describe its correctness (e.g., sorting algorithms always produce a sorted list). Coq provides an interactive environment to construct formal proofs.
Tip: Break down complex proofs into smaller, manageable lemmas to ensure each part of the system functions correctly.
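To give the flavor in Lean (a close relative of Coq — syntax differs, but the workflow of stating a theorem about a definition and proving it is the same):

```lean
-- a tiny "system": doubling a natural number
def double (n : Nat) : Nat := n + n

-- its specification, proved once for all inputs
theorem double_spec (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```

Real developments compose thousands of such lemmas, each checked mechanically by the proof assistant.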
Abstract Interpretation
Abstract interpretation is a technique that approximates the behavior of a system to prove properties like the absence of runtime errors or invariants over program variables. Frama-C, for example, is a tool for static analysis and abstract interpretation of C programs.
Example: Use Frama-C to verify that array accesses in a program are always within bounds. This helps prevent runtime errors such as segmentation faults.
Tip: Use abstract interpretation to focus on specific properties like variable ranges, loop invariants, or memory safety. It is particularly useful in large-scale projects where manual analysis would be too time-consuming.
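A hand-rolled sketch of the idea — tracking each variable as an interval rather than as concrete values — shows how an analyzer can prove an index in bounds for every possible input at once (real tools like Frama-C implement this over full program semantics):

```python
# Minimal interval-domain sketch: abstractly execute a straight-line
# program and prove an array index is always within bounds.

def interval_add(x, y):
    # sum of two intervals
    return (x[0] + y[0], x[1] + y[1])

def interval_mod(x, n):
    # conservative abstraction: any non-negative value mod n lands in [0, n-1]
    return (0, n - 1)

# abstractly execute: i = user input in [0, 100]; i = i % 10; i = i + 2
i = (0, 100)
i = interval_mod(i, 10)      # i in [0, 9]
i = interval_add(i, (2, 2))  # i in [2, 11]

ARRAY_LEN = 12
in_bounds = 0 <= i[0] and i[1] < ARRAY_LEN  # arr[i] is safe for every input
print(i, in_bounds)
```

No concrete input is ever run; the analysis covers all 101 possible inputs in one pass, which is exactly why it scales where exhaustive testing does not.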
Tips for Using Formal Methods Effectively
Start Early in the Design Process
Formal methods are most effective when applied early in the software development lifecycle. Define formal specifications for your system during the requirements phase, which will guide the development process and verification efforts.
Integrate with Agile Methods
Contrary to popular belief, formal methods can be integrated into agile workflows. Tools like TLA+ are designed for lightweight specification and allow quick iterations. Use formal specifications as a living document, evolving them alongside your codebase.
Focus on Critical Components
Not every part of your system needs formal verification. Focus formal methods on critical components, such as security-sensitive modules or components with complex concurrency. This optimizes your resources while ensuring high-risk parts are correct.
Combine with Traditional Testing
While formal methods provide strong guarantees, combining them with traditional testing techniques (unit tests, integration tests) ensures a comprehensive verification approach. For example, use CBMC (a bounded model checker) to prove properties of C/C++ code and complement it with regression testing. Static analyzers such as clj-kondo for Clojure serve a similar purpose, catching common pitfalls before code execution.
Conclusion
Formal methods provide a mathematically rigorous way to improve software quality, safety, and reliability. Tools like SPIN, Coq, Frama-C, and TLA+ make formal verification accessible for practical use. By focusing on critical components, integrating them with agile practices, and combining formal methods with traditional testing, software engineers can harness their full potential for real-world software development.
ClojureScript + React is great for building rich UI experiences, but it is awful for landing pages because of the big bundle size of cljs.core and whatever libraries you include. This is why many devs choose simple static pages generated through hiccup, htmx, or other templating libraries.
But what happens if you have a full UI library for your react clojurescript app already defined that you want to use on your landing page?
Or if you want to do complex logic on landing pages like issuing requests, popping modals or use cool animation libraries? Can't you just make the landing page part of your application?
Yes, you can do that, however, the performance will be terrible:
You will first serve an empty html with the classic <div id="root" /> that react will take over
You will fetch a JavaScript file of (at least) 1–1.5 MB containing the logic for your entire SPA, just for the landing page
This will hurt your SEO score, since web crawlers use these metrics for ranking. In this blog, we will look at a solution that uses your existing ClojureScript code to build interactive, high-performance landing pages.
Results on my project
I have a website built with UIx and ClojureScript, here's the lighthouse score for the landing page which is simply part of the SPA:
Lighthouse score before server side rendering:
Here's the result after the optimisation:
Note: In all honesty, this score is not just because of SSR. I did some other tweaks, and we'll cover those later
Requirements
We will need:
A clojurescript react wrapper that supports server side rendering. We will use UIx because currently it's the fastest and supports SSR. You can also use rum but I haven't tested SSR with it and it doesn't use the latest react versions.
A ClojureScript compiler with code splitting — shadow-cljs — for compiling only the code you need for server rendered pages
A backend server to serve the rendered html. I'm using reitit, but any server that can return html will do. A nice convenience with reitit is that you can define your frontend (SPA-handled) routes and your backend routes in the same routes.cljc file
Setup
We will work with this file structure to better separate concerns:
```
.
├── clj                      # backend
│   └── saas
│       ├── layout.clj
│       └── routes.clj
├── cljc                     # common pages
│   └── saas
│       └── common
│           └── ui
│               └── pages
│                   └── landing.cljc
├── shadow-cljs.edn          # compiler
└── cljs                     # pure UI
    └── saas
        ├── core.cljs        # entry point for single page app
        └── ui
            ├── pages        # pure SPA
            │   └── dashboard.cljs
            └── ssr_pages    # SSR pages that are compiled separately
                └── landing.cljs
```
Server side rendering a landing page
Here are the steps to make a landing page server side rendered and interactive afterward:
1. Define your landing page in cljc
Note: UIx supports both clj & cljs. I advise writing as much of your UI library as possible in .cljc files so you can use them when server rendering static pages. All of shipclojure's UI library is written in cljc to solve this.
2. Write the backend layout
Let's write a layout.clj file where we define how to render html on the backend:
```clojure
(ns saas.layout
  (:require
    [hiccup2.core :as h]
    [hiccup.util :refer [raw-string]]
    [uix.core :refer [$]]
    [ring.util.http-response :refer [content-type ok]]
    [uix.dom.server :as dom.server])
  ;; `config` and the `landing` page namespace are also required
  ;; in the full file, omitted in this extract
  (:gen-class))

(defn page-script
  "Returns script tag for specific application page or the SPA

  Usage:
  (page-script :app)     ;; used for the SPA
  (page-script :landing) ;; used for server rendered landing page"
  [page-name]
  [:script {:src (str "/assets/js/" (name page-name) ".js")}])

(defn root-template
  [{:keys [title description inner script-name]}]
  (str
    "<!DOCTYPE html>"
    (h/html
      [:html.scroll-smooth
       [:head
        [:meta {:charset "UTF-8"}]
        [:meta {:name "viewport" :content "width=device-width,initial-scale=1"}]
        [:link {:rel "stylesheet" :href "/assets/css/compiled.css"}]
        [:title title]
        [:meta {:name "description" :content description}]]
       [:body.font-body.text-base
        [:noscript "This is a JavaScript app. Please enable JavaScript to continue."]
        [:div#root (raw-string inner)] ;; here's where the actual react code will run
        (when script-name
          (page-script script-name))]])))

(defn render-page
  "Render a html page

  params - map with following keys:
  - `:inner` - inner string representing page content. Usually obtained through `uix/render-to-string`
  - `:title` - title of the html page
  - `:description` - description of html page
  - `:script-name` - name of js script to be required. Used to hydrate server side rendered react pages"
  [params]
  (-> params
      (root-template)
      (ok)
      (content-type "text/html; charset=utf-8")))

(defn app-page
  "HTML page returned from the server when rendering the SPA.
  The SPA is not server side rendered."
  ([] (app-page {}))
  ([{:keys [title description]
     :or {title (:html/title config)
          description (:html/description config)}}]
   (render-page {:title title
                 :description description
                 :inner nil ;; we don't server side render the single page app
                 :script-name :app})))

(defn landing-page
  "Server rendered landing page for high SEO which is hydrated on the frontend"
  [_]
  (render-page {:title (:html/title config)
                :description (:html/description config)
                ;; the html will require <script src="/assets/js/landing.js"> for hydration
                :script-name :landing
                :inner (dom.server/render-to-string ($ landing/landing-page))}))
```
3. Tell your router what to do
Define the reitit routes and what to serve on each of them:
routes.clj
```clojure
(ns routes
  (:require [saas.layout :as layout]))

(defn routes
  []
  [""
   ["/" {:get (fn [req] (layout/landing-page req))}] ;; server side rendered landing
   ["/app" {:get (fn [req] (layout/app-page))}]      ;; pure client single page app
   ["/api"
    ...]])
```
4. Write your frontend that hydrates the landing page
We write a clojurescript file that defines code which will hydrate the server rendered html.
Full steps will be:
server renders static html
serves html to browser
browser requires landing.js
script hydrates react and makes page interactive
cljs/../landing.cljs
```clojure
(ns saas.ui.ssr-pages.landing
  (:require
    [uix.core :refer [$]]
    [uix.dom :as dom.client]
    ;; cljc
    [saas.common.ui.pages.landing :refer [landing-page]]))

;; Hydrate the page once the compiled javascript is loaded.
;; This can be checked with a `useEffect`
(defn render []
  (dom.client/hydrate-root
    (js/document.getElementById "root")
    ($ landing-page)))
```
5. Write a shadow-cljs config for only this landing.cljs
We can define the SSR pages as code split bundles
shadow-cljs.edn
```clojure
{:deps true
 :dev-http {8081 "resources/public"}
 :nrepl {:port 7002}
 :builds
 {:app
  {:target :browser
   :modules
   {;; shared can be react, uix, and the ui components
    :shared {:entries [saas.common.ui.core]}
    ;; module containing just the code needed for SSR
    :landing {:init-fn saas.ui.ssr-pages.landing/render
              :depends-on #{:shared}}
    ;; module for the SPA code
    :app {:init-fn saas.core/init
          ;; assuming the SPA also renders the "pages" above
          :depends-on #{:shared :landing}}}
   :output-dir "resources/public/assets/js"
   :asset-path "/assets/js"}}}
```
This will generate in resources/public/assets/js the following files:
app.js
shared.js
landing.js
This landing.js file will be loaded into our html. It will take our server-generated html landing page and hydrate it, i.e. make it interactive, so you can use useState (uix/use-state) and any other React good stuff.
6. Rinse & repeat
Say you want your /about page to have the same principle. No problem:
Common (cljc) :
```clojure
;; about page written in .cljc
(ns saas.common.ui.pages.about
  (:require [uix.core :as uix :refer [defui $]]))
;; `ui/button` comes from your own ui library

(defui about-page []
  (let [[count set-count] (uix/use-state 0)]
    ($ :div "This is a cool about page"
       ($ ui/button {:on-click #(set-count inc)} "+"))))
```
As you can see, we define each page in its own file so that we only load the minimum required dependencies to be compiled with shadow-cljs. This is exactly why React SSR frameworks adopted file-based routing: you only include the required dependencies in JavaScript land.
Backend (clj):
layout.clj
```clojure
;; layout.clj
(ns saas.layout
  (:require [saas.common.ui.pages.about :as about]))

...

(defn about-page
  "Server rendered about page for high SEO which is hydrated on the frontend"
  [_]
  (render-page {:title "About us"
                :description "Our cool story"
                ;; name of the compiled script to load in html (about.js)
                :script-name :about
                :inner (dom/render-to-string ($ about/about-page))}))
```
routes.clj
```clojure
(ns routes
  (:require [saas.layout :as layout]))

(defn routes
  []
  [""
   ["/" {:get (fn [req] (layout/landing-page req))}]   ;; server side rendered landing
   ["/about" {:get (fn [req] (layout/about-page req))}] ;; server side rendered about page
   ..
   ["/api"
    ...]])
```
Frontend (cljs):
ssr_pages/about.cljs
```clojure
(ns saas.ui.ssr-pages.about
  (:require
    [uix.core :refer [$]]
    [uix.dom :as dom.client]
    ;; cljc
    [saas.common.ui.pages.about :refer [about-page]]))

;; Hydrate the page once the compiled javascript is loaded.
;; This can be checked with a `useEffect`
(defn render []
  (dom.client/hydrate-root
    (js/document.getElementById "root")
    ($ about-page)))
```
shadow-cljs.edn
```clojure
{:deps true
 :dev-http {8081 "resources/public"}
 :nrepl {:port 7002}
 :builds
 {:app
  {:target :browser
   :modules
   {;; shared can be react, uix, and the ui components
    :shared {:entries [saas.common.ui.core]}
    ;; landing page code
    :landing {:init-fn saas.ui.ssr-pages.landing/render
              :depends-on #{:shared}}
    ;; about page code
    :about {:init-fn saas.ui.ssr-pages.about/render
            :depends-on #{:shared}}
    :app {:init-fn saas.core/init
          ;; assuming the SPA also renders the "pages" above
          :depends-on #{:shared :landing :about}}}
   :output-dir "resources/public/assets/js"
   :asset-path "/assets/js"}}}
```
And we're done!
Building the release
Running
npx shadow-cljs release app
Will compile all of the assets for the SSR pages and the SPA. What's even better, we can add a specific cache for the shared.js file and if somebody visits the landing page, they will have to download considerably less code when visiting the SPA. (Thank you Thomas Heller for helping me with a better config for compiling these separate files)
After this, it's just about making the clojure backend serve the static files. See reitit/create-resource-handler for details on adding this capability.
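As a sketch of that wiring with reitit (handler and namespace names here are assumed; check the reitit docs for the exact options):

```clojure
(ns saas.core
  (:require
    [reitit.ring :as ring]
    [saas.routes :as routes]))

(def app
  (ring/ring-handler
    (ring/router (routes/routes))
    (ring/routes
      ;; serves files from resources/public at the site root,
      ;; e.g. /assets/js/landing.js -> resources/public/assets/js/landing.js
      (ring/create-resource-handler {:path "/"})
      (ring/create-default-handler))))
```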
Further optimizations
As I mentioned above, it wasn't just SSR that boosted my score so high. I also added gzip middleware for the static assets so load times are further reduced. It had an impressive effect: 600 kB of JavaScript -> 154 kB.
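The middleware code isn't shown in this post; one way to do it (assuming a ring gzip library whose middleware lives at ring.middleware.gzip/wrap-gzip, such as ring-gzip) is a single wrap around the app:

```clojure
(ns saas.middleware
  (:require [ring.middleware.gzip :refer [wrap-gzip]]))

;; `handler` is your existing ring handler;
;; gzip every response, including the static js/css assets
(def app
  (-> handler
      wrap-gzip))
```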
shadow-cljs
We can further optimize the build by adding, for each page, a script tag like:
[:script "saas.ui.ssr_pages.landing.render();"] and changing (defn render [] ...) to (defn ^:export render [] ...) so the name is preserved after :advanced optimization
This is because if we use :init-fn in the shadow-cljs config, the :init-fn function is always called, no matter what. Changing the code this way, we control when to call render, and we allow sharing code from those modules in the future.
After writing this document, I realized all of this manual work can be put in a library that takes a folder with .cljc uix pages, a layout config and gives back the correct config for reitit and builds an internal shadow-cljs config a.k.a NextJS for clojure(script).
Send me an email if this is interesting for you and I'll continue to work on it.
A common question is how you’d use shadow-cljs in a fullstack setup with a CLJ backend.
In this post I’ll describe the workflow I use for pretty much all my projects, which often have CLJ backends with CLJS frontends. I’ll keep it generic, since backend and frontend stuff can vary and there are a great many options for which CLJ servers or CLJS frontends to use. All of them are fine to use with this pattern and should plug right in.
A common criticism of shadow-cljs is its use of npm. For some people this acts as a deterrent to even looking at shadow-cljs, since they don't want to infect their system with npm. I get it. I'm not the biggest fan of npm either, but what most people do not realize is that npm is entirely optional within shadow-cljs. It only becomes necessary once you want to install actual npm dependencies. But you could install those by running npm via Docker if you must. So, I'll try to write this so everyone can follow even without node/npm installed.
I also used this setup with Leiningen before; it works pretty much exactly the same. You just put your :dependencies into project.clj. Do not bother with any of project.clj's other features.
The Setup
The only requirements for any of this to work is a working deps.edn/tools.deps install, with a proper JVM version of course. I’d recommend JDK21+, but everything JDK11+ is fine.
Since the constraint here is to not use npm (or npx, which comes with it), we'll have to create some directories and files manually. For those not minding npx, there is the small npx create-cljs-project acme-app command to do this for us. But it is not too bad without it.
mkdir -p acme-app/src/dev
mkdir -p acme-app/src/main/acme
cd acme-app
src/main is where all CLJ+CLJS files go later. src/dev is where all development related files go. What you call these folders is really up to you. It could all go into one folder. It really doesn’t matter much, for me this is just habit at this point.
Next up we need to create our deps.edn file, which I’ll just make as minimal as it gets. I do not usually bother with :aliases at this point. That comes later, if ever.
Of course, you’ll add your actual dependencies here later, but for me its always more important to get the workflow going as fast as possible first.
I also recommend creating the shadow-cljs.edn file now, although this isn’t required yet.
{:deps true
:builds {}}
Starting the REPL
It is time for take off, so we need to start a REPL. The above setup has everything we need to get started.
For the CLJ-only crowd you run clj -M -m shadow.cljs.devtools.cli clj-repl, and in case you have npm you run npx shadow-cljs clj-repl. Either command after a bit of setup should drop you right into a REPL prompt. After potentially a lot of Downloading: ... you should see something like this:
shadow-cljs - server version: 2.28.18 running at http://localhost:9630
shadow-cljs - nREPL server started on port 60425
shadow-cljs - REPL - see (help)
To quit, type: :repl/quit
shadow.user=>
I do recommend connecting your editor now. shadow-cljs already started a fully working nREPL server for you. It is recommended to use the .shadow-cljs/nrepl.port file, which tells your editor which TCP port to connect to. Cursive has an option for this, as do most other editors I'd assume. You can use the prompt manually, but an editor with REPL support makes your life much easier.
REPL Setup
We have our REPL going now, but on its own it doesn't do much. So, to automate my actual workflow, I create my first CLJ file: src/dev/repl.clj.
The structure is always the same: I want one thing to start everything and something to stop everything. To complete my actual workflow I have created a keybinding in Cursive (my editor of choice) to send (require 'repl) (repl/go) to the connected REPL. This lets me restart my app by pressing a key. The ::started keyword is there as a safeguard; make sure it is always the last thing in the defn. We will be calling this via the REPL, so the return value of (repl/start) will be printed. Not strictly necessary, but returning some potentially huge objects can hinder the REPL workflow.
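As a minimal skeleton matching that description (the bodies are up to you):

```clojure
(ns repl)

(defn start []
  ;; start everything your workflow needs here
  ::started)

(defn stop []
  ;; stop/clean up everything started in `start`
  ::stopped)

(defn go []
  (stop)
  (start))
```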
What start/stop do is entirely up to your needs. I cannot possibly cover all possible options here, but maybe the file from shadow-cljs itself can give you an idea. It starts a watch for shadow-css to compile the CSS needed for the shadow-cljs UI. It is all just Clojure and Clojure functions.
The entire workflow can be customized for each project from here. Unfortunately, my projects that have a backend are not public, so I cannot share an actual example, but let me show a very basic example using ring-clojure with the Jetty Server.
Given that our initial deps.edn file didn’t have Jetty yet, we’ll CTRL+C the running clj process (or (System/exit 0) it from the REPL). Then we add the dependency and start again.
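Sketching that out (the post's actual files aren't shown in this extract; the adapter version is just an example, and the names mirror the repl.clj shown later in this post):

```clojure
;; deps.edn — add the Jetty adapter, e.g.
;; ring/ring-jetty-adapter {:mvn/version "1.12.2"}

;; src/main/acme/server.clj
(ns acme.server)

(defn my-handler [req]
  {:status 200
   :headers {"content-type" "text/html; charset=utf-8"}
   :body "Hello World!"})

;; the var the dev setup serves; middleware gets threaded in here later
(def handler my-handler)

;; src/dev/repl.clj
(ns repl
  (:require
    [acme.server :as srv]
    [ring.adapter.jetty :as jetty]))

(defonce jetty-ref (atom nil))

(defn start []
  (reset! jetty-ref
    (jetty/run-jetty #'srv/handler
      {:port 3000
       :join? false}))
  ::started)

(defn stop []
  (when-some [jetty @jetty-ref]
    (reset! jetty-ref nil)
    (.stop jetty))
  ::stopped)

(defn go []
  (stop)
  (start))
```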
clj -M -m shadow.cljs.devtools.cli clj-repl again, and then (require 'repl) (repl/go) (keybind FTW). Once that is done you should have a working http server at http://localhost:3000.
REPL Workflow
This is already mostly covered. The only reason to ever restart the REPL is when you change dependencies. Otherwise, with this setup you'll likely never have to sit through that slow REPL startup during regular work.
One essential missing piece for this workflow is of course how changes make it into the running system. Given that the REPL is the driver here, making a change to the handler fn (e.g. changing the :body string) and saving the file does not immediately load it. You can either load-file the entire file over the REPL or just eval the defn form, and the change should be visible when you repeat the HTTP request.
Cursive has the handy option to “Save + Sync all modified files” when pressing the repl/go keybind, or just a regular “Sync all modified files”, since often the stop/start cycle isn't necessary. This is super handy and all I need for this. Another option may be to use the new-ish clj-reload to do things on file save. I have not tried this, but it looks promising.
Either way, we want to do as much as possible over the REPL and if there are things to “automate” we put it into the start function, or possibly create more functions for us to call in the repl (or other) namespace.
Extending the HTTP Server
You’ll notice that the Jetty Server only ever responds with “Hello World!”, which of course isn’t all that useful. For CLJS to work we need to make it capable of serving files. For this we’ll use the ring-file middleware.
Now we may create a public/index.html file and the server will show it to us when loading http://localhost:3000. Don't bother too much with its contents for now.
Adding CLJS
At this point it is time to introduce CLJS into the mix. I'll only do a very basic introduction, since I cannot possibly cover all existing CLJS options. It doesn't really matter if you build a full Single Page Application (SPA) or something less heavy. The setup will always be the same.
First, we create the src/main/acme/frontend/app.cljs file.
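A minimal version is enough to prove the pipeline works (a sketch; the actual file contents aren't shown in this extract):

```clojure
(ns acme.frontend.app)

(defn init []
  (js/console.log "Hello World"))
```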
The default is to output all files into the public/js directory. So, this will create a public/js/main.js file once the build is started.
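The matching build config in shadow-cljs.edn could look like this (the :frontend build id is what we'll reference from the REPL; the :main module is what produces public/js/main.js):

```clojure
{:deps true
 :builds
 {:frontend
  {:target :browser
   :modules {:main {:init-fn acme.frontend.app/init}}}}}
```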
I wrote a more detailed post on how Hot Reload works. Everything is already set up for it; you basically just need the same start/stop logic here too and tell shadow-cljs about it via the :dev/after-load metadata tag. For the purposes of this post I'll keep it short.
You may start the build either via the shadow-cljs UI (normally at http://localhost:9630/builds) by just clicking “Watch”, or, since we are into automating things, you can modify your src/dev/repl.clj file.
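Sketched out, the modified src/dev/repl.clj looks like this:

```clojure
(ns repl
  (:require
    [shadow.cljs.devtools.api :as shadow]))

(defn start []
  ;; same as clicking "Watch" for the :frontend build in the UI
  (shadow/watch :frontend)
  ::started)
```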
The only lines added were the :require and the (shadow/watch :frontend). This does the same as clicking the “Watch” button in the UI. It isn't necessary to stop the watch in the stop fn; calling watch again will recognize that it is running and do nothing. Either way you should now have a public/js/main.js file. There will be more files in the public/js dir, but you can ignore them for now.
Next we’ll need the HTML to make use of this JS. Change public/index.html to this:
If you now open http://localhost:3000 in your browser you should see a blank page with Hello World printed in the console. Where you take this from here is up to you. You have a working CLJ+CLJS setup at this point.
CLJS REPL
The REPL by default is still a CLJ-only REPL. You may eval (shadow.cljs.devtools.api/repl :frontend) to switch that REPL session over to CLJS. Once done, all evals happen in the browser, assuming you have it open; otherwise you'll get a “No JS runtime.” error. To quit that REPL and get back to CLJ you can eval :cljs/quit.
I personally only switch to the CLJS REPL occasionally, since most of the time hot-reload is enough. You may also just open a second connection to have both a CLJ and a CLJS REPL available. That entirely depends on what your editor is capable of.
A few more Conveniences
Running clj -M -m shadow.cljs.devtools.cli clj-repl AND (require 'repl) (repl/go) is a bit more verbose than needed. We can change this to only clj -M -m shadow.cljs.devtools.cli run repl/start by adding one bit of necessary metadata to our start fn.
(defn start
  {:shadow/requires-server true} ;; this is new
  []
  (shadow/watch :frontend)
  (reset! jetty-ref
    (jetty/run-jetty #'srv/handler
      {:port 3000
       :join? false}))
  ::started)
Without this, shadow-cljs run assumes you just want to run the function and exit. In our case we want everything to stay alive, and the metadata tells shadow-cljs to do that. Doing this will lose the REPL prompt in the terminal though, so only do this if you have your editor properly set up.
One thing I personally rely on very much is the Inspect UI. It might be of use for you too. It is basically println on steroids, similar to other tools such as REBL or Portal. It is already all set up for CLJS and CLJ, so all you need is to open http://localhost:9630/inspect and tap> something from the REPL (or your code). Try adding a (tap> req) as the first line in acme.server/my-handler. If you then open http://localhost:3000/foo to trigger that handler, you should see the request show up in Inspect.
Getting To Serious Business
Please do not ever use the above setup to run your production server. Luckily, getting to something usable does not require all that much extra work. All we need is to amend our existing acme.server namespace like so:
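The amended namespace isn’t shown here; a minimal sketch of the relevant addition, assuming the handler from earlier (details may differ from the original):

```clojure
(ns acme.server
  (:require [ring.adapter.jetty :as jetty]))

(defn my-handler [req]
  {:status 200
   :headers {"content-type" "text/plain"}
   :body "Hello World"})

(defn handler [req]
  (my-handler req))

(defn -main [& args]
  ;; unlike the dev setup, :join? true blocks so the process stays alive
  (jetty/run-jetty #'handler {:port 3000 :join? true}))
```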
That gives us a -main function, which we can run directly via clj -M -m acme.server to get our “production-ready” server. This will start only that server and not the whole shadow-cljs development environment.
For CLJS you could run clj -M -m shadow.cljs.devtools.cli release frontend (or npx shadow-cljs release frontend) to get the production-optimized outputs. They are just static .js files, nothing else needed. Note that making a new CLJS release build does not require restarting the above CLJ server. It’ll just pick up the new files and serve them.
That is the most basic setup really. I personally do not bother with building uberjars or whatever anymore and just run via clj. But every project’s requirements are going to vary, and you can use things like tools.build to create them if needed.
Of course real production things will look a bit more complicated than the above, but all projects I have started like this.
Node + NPM
Since pure CLJS frontends are kinda rare, you’ll most likely want some kind of npm dependency at some point. I recommend installing node.js via your OS package manager and just using npm directly. Don’t worry too much if it’s slightly out of date. All you need is to run npm init -y in the project directory once to create the initial package.json file. After that, just npm install the packages you need, e.g. npm install react react-dom to cover the basics. You just .gitignore the node_modules folder entirely and keep package.json and package-lock.json version controlled like any other file.
If you are really hardcore about never allowing node on your system, you might be more open to running things via Docker. I frankly forgot what the exact command for this is, but if you know Docker you’ll figure it out. The images are well maintained, and you can just run npm in it. No need to bother with Dockerfile and such.
ChatGPT suggested: docker run -it -v $(pwd):/app -w /app node:20 npm install react react-dom. I don’t know if that is correct.
Either way, once the packages are installed shadow-cljs should be able to build them. No need to run any node beyond that point.
Conclusion
I hope to have shown how I work in an understandable format. Working this way via the REPL really is the ultimate workflow for me. Of course, I have only scratched the surface, but the point of all this is to grow from this minimal baseline. I never liked “project generators” that generate 500 different files with a bunch of stuff I might not actually need. Instead, I add what I need when I need it. The learning curve will be a bit higher in the beginning, but you’ll actually know what your system is doing from the start.
There may be instances where it is not possible to run shadow-cljs embedded into your CLJ REPL due to some dependency conflicts. But the entire workflow really doesn’t change all that much. You just run shadow-cljs as its own process. The main thing to realize is that the only thing CLJ needs to know about CLJS is where the produced .js files live and how to serve them. There is no other integration beyond that point. All files can be built just fine from the same repo. Namespaces already provide a very nice way to structure things.
I personally do not like the common “jack-in” workflow that is recommended by editors such as emacs/cider or vscode/Calva, and I do not know how that would work exactly. But I’m certain both have a “connect” option to connect to an existing nREPL server, like the one provided by shadow-cljs.
clong 1.4.3 - A wrapper for libclang and a generator that can turn c header files into clojure apis
kindly-advice 1-beta9 - a small library to advise Clojure data visualization and notebook tools how to display forms and values, following the kindly convention
lazytest 1.2.0 - A standalone BDD test framework for Clojure
noj 2-alpha9.2 - A Clojure framework for data science
When Clojure 1.12.0 was released in September, the release notes listed a lot of cool features, such as virtual threads, but one feature caught my eye in particular: support for Java Streams and Java functional interfaces.
That was just what I needed at work to port one of our internal libraries to Java, to make it easier to reuse from projects written in different JVM languages.
Little did I know that Java Streams suck.
The problem
We have a small library that implements queues with lazy sequences.
It has both producer and consumer parts, so several projects can reuse the same library, one being a producer, and the other being a consumer.
Today we’re interested in the consumer part.
One of the features this library has is that we can have several separate queues, and we can take items from all of them in a fixed order.
Each queue is sorted, so when we have multiple queues, we need to get elements in order and still in a lazy way.
Let’s look at some examples:
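The original examples aren’t reproduced here; based on the description below, they were along these lines (a sketch):

```clojure
;; two sorted, potentially infinite queues
(def odds  (iterate #(+ % 2) 1)) ;; (1 3 5 7 ...)
(def evens (iterate #(+ % 2) 0)) ;; (0 2 4 6 ...)
```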
It’s not that different from actual queues used in our library, as each sequence is potentially infinite and sorted.
What we need to get as a result is a single queue with items of both queues combined with retained ordering:
Note, that I’ve omitted some checks for things like empty sequences, and such for the sake of simplicity.
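The function itself isn’t shown here; a minimal sketch of such a lazy merge (empty-sequence checks omitted, as noted above):

```clojure
(defn merge-sorted [seqs]
  (lazy-seq
    ;; sort the queues by their first element, take the head of the
    ;; first queue, and recurse with that head removed
    (let [[[x & xs] & more] (sort-by first seqs)]
      (cons x (merge-sorted (cons xs more))))))

(take 8 (merge-sorted [(iterate #(+ % 2) 1)
                       (iterate #(+ % 2) 0)]))
;; => (0 1 2 3 4 5 6 7)
```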
In short, it evaluates like this:
Let’s say, we have [(1 3 5 7...) (0 2 4 6...)] as an input;
We sort it by the first element of each sequence: [(0 2 4 6...) (1 3 5 7...)];
We take the first element from the first queue: 0, and cons it to the recursive call to which we pass back all of the sequences, except we remove the first element from the first queue;
Since the function returns a lazy sequence the recursive call is effectively trampolined.
Of course, it won’t work if sequence items are in arbitrary order, but that’s not the case for our implementation.
Java Stream solution
Now, as I mentioned, I had to rewrite this library in Java, for it to be reused in other projects written in other JVM languages.
It is possible to use a Clojure library from, say, Scala, but it is quite tedious to do so.
We either have to AOT compile the library to a jar and provide a bunch of methods via gen-class.
Alternatively, it’s possible to load Clojure, compile the sources, and use it that way, but it introduces so many hoops to jump through that hardly anyone in our team would want to do it.
And the library in question is small, about 400 LOC with documentation strings and comments, so rewriting it in Java wouldn’t be that hard.
Or so I thought.
I’m not a Java programmer, and I have very limited knowledge of Java.
Thankfully, Java is a simple language, unless you wrap everything in an unnecessary amount of classes, use abstract factory builders, and so on.
This library, luckily, required none of those cursed patterns - it’s a single static class with no need for instantiation, with a bunch of pure methods.
So, knowing that Java has a Stream class, and looking at its interface I thought that I would be able to implement this library.
And in truth, it wasn’t a problem, until I got to the lazy merge sort part.
That’s when it started looking cursed.
First of all, a Stream is not a data structure - you can’t work with it as if it were data, you have to use pipelines, and then either consume it or pass it around.
Moreover, most examples use streams as an intermediate transformation step and return a collection, or suggest passing in a transformer instead of returning a stream, so I wonder where this is coming from:
Java APIs increasingly return Streams and are hard to consume because they do not implement interfaces that Clojure already supports, and hard to interop with because Clojure doesn’t directly implement Java functional interfaces.
Anyhow, I’ve created functions that return streams of items in the queue, much like the ones I showed above.
So it was time to implement the merge sort.
Let’s look at the skeleton of our function:
Stream.generate(Supplier) produces an infinite stream, generated by calling the supplier.
Basically, it’s what we need here, we can do our sorting inside the supplier.
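A tiny standalone example (not from the original post) of how Stream.generate behaves:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Stream;

public class GenerateDemo {
    public static void main(String[] args) {
        // Stream.generate builds an infinite stream by calling the supplier
        // on demand; limit() decides how many elements are ever produced.
        AtomicLong n = new AtomicLong(0);
        List<Long> first5 = Stream.generate(n::getAndIncrement)
                                  .limit(5)
                                  .toList();
        System.out.println(first5); // prints [0, 1, 2, 3, 4]
    }
}
```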
However, there’s a problem - Streams are not data structures.
And there’s no way to take one element without consuming the stream.
I mean, there’s stream.findFirst() but if we look at the documentation for it, we’ll see that:
Optional<T> findFirst()
Returns an Optional describing the first element of this stream, or an empty Optional if the stream is empty. If the stream has no encounter order, then any element may be returned. This is a short-circuiting terminal operation.
Returns:
an Optional describing the first element of this stream, or an empty Optional if the stream is empty
Throws:
NullPointerException - if the element selected is null
What is a short-circuiting terminal operation you ask?
A terminal operation may traverse the stream to produce a result or a side effect.
A short-circuiting terminal operation does the same, but even if presented with an infinite stream, it can finish in a finite time.
And after the terminal operation is performed, the stream is considered consumed, and can no longer be used.
But even if we could use findFirst without closing the stream, it wouldn’t be useful to us, because remember - we need to take the first element from each stream and sort the streams themselves.
But findFirst is a destructive operation: it consumes the stream it is called on.
In Clojure, sequences are immutable - all we can do is construct a new sequence if we wish to add items or take its tail if we need fewer items.
Thus first does nothing to the sequence in question, we can freely call it on any sequence, obtain an element, do stuff with it, and be done.
You can think of first like an iterator peek, where you look at what the next element is without advancing the iterator.
Thankfully, we can convert a Stream to an Iterator, with it staying lazy and potentially infinite.
Only, there’s no peek method in the base Iterator class in Java.
Oh well.
Well, we can always implement our own wrapper for the Iterator class:
package org.example;

import java.util.Iterator;

public class PeekingIterator<T> implements Iterator<T> {
    private final Iterator<T> iterator;
    private boolean peeked = false;
    private T peeked_item;

    public PeekingIterator(Iterator<T> it) { iterator = it; }

    public T next() {
        if (peeked) {
            peeked = false;
            T tmp = peeked_item;
            peeked_item = null;
            return tmp;
        } else {
            return iterator.next();
        }
    }

    // note: must report true while a peeked element is still pending
    public boolean hasNext() { return peeked || iterator.hasNext(); }

    public T peek() {
        if (!peeked && iterator.hasNext()) {
            peeked = true;
            peeked_item = iterator.next();
        }
        return peeked_item;
    }
}
Of course, we could use a dependency, but due to circumstances, we have to keep the amount of dependencies as low as possible.
Now, we can get back and implement our merge sort:
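The post’s actual implementation isn’t shown here. A self-contained sketch of the approach it describes - peeking iterators stored in a list, sorted by their heads inside the supplier - assuming all streams are infinite (empty-stream checks omitted, as in the Clojure version):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

public class LazyMerge {
    // Minimal peeking iterator, as in the class above.
    static class PeekingIterator<T> implements Iterator<T> {
        private final Iterator<T> it;
        private boolean peeked = false;
        private T head;
        PeekingIterator(Iterator<T> it) { this.it = it; }
        public boolean hasNext() { return peeked || it.hasNext(); }
        public T next() {
            if (peeked) { peeked = false; T tmp = head; head = null; return tmp; }
            return it.next();
        }
        public T peek() {
            if (!peeked && it.hasNext()) { peeked = true; head = it.next(); }
            return head;
        }
    }

    static <T extends Comparable<T>> Stream<T> mergeSorted(List<Stream<T>> streams) {
        List<PeekingIterator<T>> its = new ArrayList<>();
        for (Stream<T> s : streams) its.add(new PeekingIterator<>(s.iterator()));
        // Each supplier call sorts the iterators by their current head and
        // consumes the smallest element, so the generated stream stays sorted.
        return Stream.generate(() -> {
            its.sort((a, b) -> a.peek().compareTo(b.peek()));
            return its.get(0).next();
        });
    }

    public static void main(String[] args) {
        Stream<Integer> odds = Stream.iterate(1, n -> n + 2);  // 1 3 5 7 ...
        Stream<Integer> evens = Stream.iterate(0, n -> n + 2); // 0 2 4 6 ...
        System.out.println(mergeSorted(List.of(odds, evens)).limit(8).toList());
        // prints [0, 1, 2, 3, 4, 5, 6, 7]
    }
}
```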
The general idea is the same but the fact that we had to create a peeking iterator, and store it in an array is disturbing.
In Clojure, we manipulate lazy sequences as if they were ordinary data.
In Java, we don’t have any real data, so we have to make our own way of accessing it.
We have to create an intermediate array list to sort it when every item in the stream is generated.
The same happens in Clojure, of course, when we call sort-by - and the Clojure version is possibly worse: in Java we only create the array once and sort it in place, while in Clojure we create a new list every time.
JVM is good at collecting garbage though, and the rate at which this sequence is consumed is far greater than the time to clean up the garbage, but it is a thing to consider.
Java streams also don’t specify anything about the order and can be processed in parallel, so I’m not sure how my PeekingIterator would behave.
And all of that is simply because Java streams are a half-baked interface made in a rush, or at least it feels like that.
Yes, it supports data pipelines with its own implementation of map, filter, etc., however, it makes me appreciate Clojure even more because it has one implementation of map that works across everything (also, it has transducers).
In a more complete version of this Java library, we have to map over streams, transform them into iterators just to do some stuff, that is not implemented for streams, transform iterators back into streams, and so forth.
The Clojure version is much more straightforward, and concise.
In hindsight, I wish it was easier to use Clojure from other JVM languages.
It would save me the time it took to re-implement everything in Java for sure.
In the end, I hooked the old Clojure implementation to use the Java version as a core and retained the interface by converting streams to sequences via stream-seq!.
It passed all of the library tests, so I moved on.
Our 26-loc Datalog is naive. Nothing personal, it's a technical term: each iteration in saturate rederives all the facts already derived, plus hopefully some new ones. The last iteration is guaranteed to be 100% redundant since by definition it's the one which derives nothing new!
Let's engineer a bad case (not that it's difficult given how purposefully unsophisticated our code is):
The idea behind semi-naive evaluation is to not keep rederiving from the same facts at each iteration. So the rule is to only consider facts which can be derived by using at least one fresh fact (derived during the previous iteration).
Changes to saturate
The first step is to split facts in two: fresh facts (dfacts—the d stands for diff or delta) and old news (facts).
In the saturate loop, we initialize dfacts with the initial set of facts because at the start of the computation everything is fresh. We keep looping while dfacts' is not empty.
We will modify match-rule to only return facts derived by using at least one fresh fact. However we'll still have to post-process its returned values with (remove facts') just in case it accidentally rederives old facts.
Automatic differentiation with dual numbers feels like magic: you compute the original function with special numbers and you get both the original result but also the value of the derivative and without knowing the expression of the derivative!
For example let's say you want to compute x^2+x at 7, then you compute as usual but by writing 7+ε instead of 7 and simplifying using the magic property that ε^2=0. (A non-null number whose square is zero isn't more bizarre than a number whose square is -1...)
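The computation the paragraph above refers to, written out (using ε² = 0):

```latex
(7+\varepsilon)^2 + (7+\varepsilon)
  = 49 + 14\varepsilon + \varepsilon^2 + 7 + \varepsilon
  = 56 + 15\varepsilon
```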
Here we have computed both the value (56) of x^2+x for x=7 but we also computed the value (15) of the derivative of x^2+x (2x+1) without knowing the formula of the derivative!
The monster expression 🐲 above is a kind of derivative and we'd like to compute it without implementing it.
In the same way x is replaced by x + ε in dual numbers we are going to replace envs by [envs denvs] where envs are environments created using strictly matches over old facts and denvs environments where at least one fresh fact was matched against.
The original (for [fact facts env envs] ...) is thus going to be split into 4 versions: facts×envs, facts×denvs, dfacts×denvs and dfacts×envs. Only the first one contributes to the envs component; all the others, by having a d on envs or facts, contribute to the denvs component.
Last, we have to be careful to not eagerly compute envs, since that is exactly what we want to avoid: rederiving the old facts. We do so by delaying its actual computation with delay.
The set has been replaced by into #{} to make envs and denvs computations more similar but otherwise the envs component of the semi-naive version is the original envs computation in the naive version.
Changes to q
Not much to say except we pass an empty set as the facts parameter to match-rule.
Now it takes 350ms for 50 and 5s for 100! It's respectively 30 and 50 times faster: progress! 🎉
Conclusion
This datalog is maybe vaguely better, but it's still as minimal/useless as in our previous article: we have made the engine more efficient, but we haven't made the language more powerful.
See, if we try to compute Bart's siblings (reusing edb and rules from 26-loc Datalog):