Speed obsession in the game industry

When do we really need speed?

C++ became the standard language for games and graphics software a long time ago, and for good reason: real-time graphics and physics require high performance. Processing geometry, managing buffers, and computing matrices all take time.
But what about high-level logic: game mechanics, user interface, storage management, network requests? There, stability and safety matter much more than speed.

Responsibility distribution

We can implement performance-critical functions in a compiled language, such as C++, and call them from a program written in a dynamic language, such as Python.
But today we already have well-documented and easy-to-use libraries for Python (pygame, PyOpenGL, pyassimp, pybullet, NumPy) that are implemented primarily in C/C++ and provide functions for heavy calculations, and for physics and graphics in particular. We may never need to implement such libraries on our own.
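A minimal sketch of this split, using only the standard library so that it runs anywhere: in CPython the `math` module is implemented in C, so calling it from Python follows the same pattern as calling pygame or NumPy. The function names here are hypothetical and chosen for illustration.

```python
import math


def length_pure_python(vector):
    """Compute the Euclidean length with an interpreted Python loop."""
    total = 0.0
    for component in vector:
        total += component * component
    return total ** 0.5


def length_c_backed(vector):
    """Delegate the same computation to math.hypot (C code in CPython).

    math.hypot accepts any number of coordinates since Python 3.8.
    """
    return math.hypot(*vector)


vector = [3.0, 4.0, 12.0]
pure = length_pure_python(vector)
backed = length_c_backed(vector)
```

The high-level code stays in Python either way; only the inner numeric work moves to compiled code.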

Is C++ the only choice?

It is generally accepted that garbage-collected languages, such as Java or C#, are slower than C++ and do not really meet the requirements for heavy calculations. This is, of course, not true.
C++ may outperform Java or C# by 20-30% in some special cases, but when it comes to runtime abstractions, such as dynamic function dispatch, language interop, asynchronous tasks, and text or abstract collection management, Java and C# show much higher efficiency than C++.
Also, we can run our Python programs on the same runtime as Java or C#, using Jython or IronPython. This brings a lot of benefits, such as shared garbage-collected memory, a shared type system, and easy access to C# or Java libraries right out of the box. Nice dynamic languages such as Clojure and Groovy are implemented on top of Java; they have complete access to the Java Class Library and share the benefits mentioned above.

What really influences performance?

Today's personal computers are much faster than those of 15-20 years ago. Yet most desktop programs and games do not run as fast as expected (even though they are still mostly implemented in C/C++). Today we need good algorithms and effective approaches much more than raw language speed. A function with constant complexity in Python is preferable to a function with linear complexity in C. Painting 100 trees with 15 lines of Python code is preferable to painting 500 trees with 300 lines of C++ code.
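A small sketch of the complexity point, assuming the task is to sum the integers from 1 to n (a made-up example; the function names are hypothetical). The loop does work proportional to n in any language, while the closed form does a fixed amount of work.

```python
def sum_to_linear(n):
    """O(n): add the integers 1..n one by one."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_to_constant(n):
    """O(1): use the closed form n * (n + 1) / 2."""
    return n * (n + 1) // 2


linear = sum_to_linear(1000)
constant = sum_to_constant(1000)
```

For large enough n, the constant-time version in Python will beat the linear loop even if the loop is written in C.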

Care about the game, not the language

It does not really matter what language you use if you have not made any game yet, right?
Making a game in C++ is much more demanding and exhausting than doing the same in Python or Ruby. In the time it takes to make 1 game with C++, you could make 10 games with Python. In the time it takes to make 5 games with Python, you would make 0 games with C++.
Let us care about games and fun; otherwise, what is the point?


Podcast: Why Clojure with Martin Varela

Starting a podcast has been on my to-do list for a while, and I'm happy to announce that I've finally taken the first steps. My friend and colleague from Metosin, Martin Varela, was happy to jump in front of the camera to capture the following conver...


May & June 2024 Long-Term Project Updates

A huge thank you to our 2024 long-term developers for their amazing work in May and June. Check out their latest project updates!

Bozhidar Batsov: CIDER
Michiel Borkent: squint, babashka, neil, cherry, clj-kondo, and more
Toby Crawley: clojars-web
Thomas Heller: shadow-cljs, shadow-grove
Kira McLean: Scicloj Libraries, tcutils, Clojure Data Cookbook, and more
Nikita Prokopov: Humble UI, Datascript, AlleKinos, Clj-reload, and more
Tommi Reiman: Reitit 7.0, Malli, jsonista, and more
Peter Taoussanis: Carmine, Nippy, Telemere, and more

Bozhidar Batsov

This period was quite busy and productive for CIDER and friends. The highlights are 3 (!!!) CIDER releases and a couple of nREPL releases:

CIDER 1.14 is our most ambitious release since CIDER 1.8 (“Geneva”), which was released last autumn.

The single most notable user-visible change of this release is that CIDER is now more robust when evaluating and displaying large values. CIDER will no longer hang when C-x C-eing a big value in a source buffer or stepping over such a value with the CIDER debugger.

I’m guessing that many people will also appreciate the improvements we’ve made to flex completion (which is finally fully compliant with the Emacs completion API), the inspector and to the cider-cheatsheet functionality which was mostly redesigned.

nREPL 1.2 restores the ability to interrupt evaluation on JDK 20+ (see https://github.com/nrepl/nrepl/pull/318 for details) and CIDER 1.15 implements support for nREPL 1.2.

More interesting work is in progress, so I hope I’ll have another exciting report for all of you in a couple of months!


Michiel Borkent

In this post I’ll give updates about open source I worked on during May and June 2024. To see previous OSS updates, go here.

Sponsors

I’d like to thank all the sponsors and contributors that make this work possible. Without you, the below projects would not be as mature or wouldn’t exist or be maintained at all.

Current top tier sponsors:

If you want to ensure that the projects I work on are sustainably maintained, you can sponsor this work in the following ways. Thank you!

If you’re used to sponsoring through some other means which aren’t listed above, please get in touch.

On to the projects that I’ve been working on!

Updates

Here are updates about the projects/libraries I’ve worked on in May and June.

  • html: Html generation library inspired by squint’s html tag

    • A new library for HTML generation that is safe and performant, generates easy-to-understand code, and works the same across CLJ and CLJS.
  • babashka: native, fast starting Clojure interpreter for scripting. Released v1.3.191 with the following changes:

    • Fix #1688: use-fixtures should add metadata to *ns*
    • Fix #1692: Add support for ITransientSet and org.flatland/ordered-set
    • Bump org.flatland/ordered to 1.15.12.
    • Partially Fix #1695: --repl arg handling should consume only one arg (itself) (@bobisageek)
    • Partially Fix #1695: make *command-line-args* value available in the REPL (@bobisageek)
    • Fix #1686: do not fetch dependencies/invoke java for version, help, and describe options (@bobisageek)
    • #1696: add clojure.lang.DynamicClassLoader constructors (@bobisageek)
    • #1696: add clojure.core/*source-path* (points to the same sci var as *file*) (@bobisageek)
    • #1696: add clojure.main/with-read-known (@bobisageek)
    • #1696: add clojure.core.server/repl-read (@bobisageek)
    • #1696: make the cognitect-labs/transcriptor library work (@bobisageek)
    • #1700: catch exceptions from resolving symbolic links during bb.edn lookup (@bobisageek)
    • Support java.nio.channels.ByteChannel + several other related interop
    • Bump nrepl/bencode to 1.2.0
    • Bump babashka/fs
    • Bump org.babashka/http-client to 0.4.18
  • clj-kondo: static analyzer and linter for Clojure code that sparks joy.

    • Fix #2335: read causes side effect, thus not an unused value
    • Fix #2336: do and doto type checking (@yuhan0)
    • Fix #2322: report locations for more reader errors (@yuhan0)
    • Imports were copied to .clj-kondo/imports but weren’t picked up correctly. Thanks @frenchy64 for reporting the bug.
    • #2333: Add location to invalid literal syntax errors
    • #2323: New linter :redundant-str-call which detects unnecessary str calls. Off by default.
    • #2302: New linter: :equals-expected-position to enforce expected value to be in first (or last) position. See docs
    • #1035: Support SARIF output with --config {:output {:format :sarif}}
    • #2307: import configs to intermediate dir
    • #2309: Report unused for expression
    • #2315: Fix regression with unused JavaScript namespace
    • #2304: Report unused value in defn body
    • #2227: Allow :flds to be used in keys destructuring for ClojureDart
    • #2316: Handle ignore hint on protocol method
    • #2322: Add location to warning about invalid unicode character
    • #2319: Support :discouraged-var on global JS values, like js/fetch
  • squint: CLJS syntax to JS compiler

    • #536: HTML is not escaped in dynamic expression
    • #537: Fix not: wrap argument in parens
    • Return interop expression in function body
    • Prefer value from props map over explicit value
    • #html improvements, support :& for spreading props
    • #492: defclass static methods and fields
    • #526: Fix export of class name with dashes
    • #517: Preserve state over REPL evals
    • #513: Fix shuffle core function random distribution and performance
    • #517: Fix re-definition of class with defclass in REPL
    • #522: fix nil #html rendering issue
  • neil: A CLI to add common aliases and features to deps.edn-based projects.
    Released version 0.3.65 with the following changes:

    • #209: add newlines between dependencies
    • #185: throw on non-existing library
    • Bump babashka.cli
    • Fetch latest stable slipset/deps-deploy, instead of hard-coding (@vedang)
    • Several emacs package improvements (@agzam)
  • cherry: Experimental ClojureScript to ES6 module compiler

    • Fix #130: fix predefined :aliases for cherry.embed
    • Support IDeref, ISwap, IReset in deftype
  • clojure-mode: Clojure/Script mode for CodeMirror 6.

    • Fix #54: support slurping from within string literal
  • pottery: A clojure library to interact with gettext and PO/POT files

    • Contributed a few improvements to dealing with reader conditionals
  • nbb: Scripting in Clojure on Node.js using SCI

    • Fix cljs.pprint/print-table + with-out-str
    • Fixed cljs.test/testing macro to display strings correctly on test failure (@jaidetree)
  • CLI: Turn Clojure functions into CLIs!

    • Fix #98: internal options should not interfere with :restrict
  • deps.clj: A faithful port of the clojure CLI bash script to Clojure

    • Upgrade/sync with clojure CLI v1.11.3.1463

Other projects

There are many other projects I’m involved with but that had little to no activity in the past month. Check out the Other Projects section (more details) of my blog here to see a full list.

Published 30-06-2024.


Toby Crawley

May 2024

Commit Logs: clojars-web, infrastructure

June 2024

Commit Logs: clojars-web, infrastructure


Thomas Heller

Time was mostly spent on maintenance work and some bugfixes, as well as on helping people out via the typical channels (e.g. Clojurians Slack).

Current shadow-cljs version: 2.28.10 Changelog

Notable Updates

  • Reworked some of the shadow-grove internals and adjusted the shadow-cljs UI accordingly, fixing some old bugs in the process.
  • Updated some Grove Architecture docs: KV, Big Picture, defc deep dive and impl notes

Kira McLean

This is a summary of the open source work I’ve spent my time on throughout May and June, 2024. There were lots of small bug fixes and reports, driven by work on the Clojure Data Cookbook. This work was also the impetus for my initial release of tcutils, a library of utility functions for working with tablecloth datasets. I also had the wonderful opportunity to attend PyData London in June and found it really insightful and inspiring. Read on for more details.

Sponsors

This work is made possible by the generous ongoing support of my sponsors. I appreciate all of the support the community has given to my work and would like to give a special thanks to Clojurists Together and Nubank for providing me with lucrative enough grants that I can reduce my client work significantly and afford to spend more time on these projects.

If you find my work valuable, please share it with others and consider supporting it financially. There are details about how to do that on my GitHub sponsors page. On to the updates!

Ecosystem issue reports and bug fixes

Working on the cookbook these last couple of months turned up a few small issues in ecosystem libraries. The other developers of Clojure’s data science tools are such a pleasure to work with, it’s so rare and nice to have a distributed team of people capable of getting cool things built asynchronously. Here are some details of a few particular issues that came up:

Initial release of tcutils

In my explorations of other languages' tools for working with data I often come across nice utility functions that are super simple but have a big impact on the ergonomics of using the tools. I wanted to start bringing some of these convenience utilities to Clojure, so for now I’m putting them in tcutils. So far only a handful of helpers are implemented (lag, lead, cumsum, and clean-column-names). The goal is to eventually fill out more utilities that save people from having to dig into the documentation of half a dozen different libraries to figure out how to implement things like these. The goal is not to achieve feature parity or to exactly copy similar libraries, like pandas or dplyr, but rather to take inspiration from them and make our tools easier to use for people who are used to these conveniences.

Progress on Clojure Data Cookbook

I spent a lot of time on the Clojure Data Cookbook over these last two months. Notable progress includes:

  • The introductory chapters bear some resemblance now to the final form they’ll take.
  • The overall structure of the book is much more clear now.
  • I started the example analysis that will serve as the high-level introductory section of the book.
  • The publishing and deployment process is finally working.

It’s still very much in progress, but in the interest of transparency the work-in-progress version is available online now. It will continue to evolve and change as I fill out more and more of the chapters, but there’s enough of it available now to hopefully give a sense of the style and tone I’m going for. I also finally have the publishing workflow set up and it’s generating a nice-looking Quarto book, thanks to all of Daniel Slutsky’s amazing work on Clay and Quarto integration recently.

Progress on high-level goals

The high-level goal of my work in general remains to steward Clojure’s data science ecosystem to a state of maturity and flourishing so that data practitioners can use it to get real work done. Toward this end, I set up a project board to track progress toward what I see as the main components of this project.

Over the last couple of months, beginning with a prototype demoed at my London Clojurians talk in April, Daniel Slutsky has made tremendous progress on our goal of implementing a grammar of graphics in Clojure in the new hanamicloth library. The near-term goal is to stabilize the API of this library enough that it can be used to provide a user-friendly way to accomplish all of the simple data visualization tasks that are currently possible with our other tools. The long term goal is to take the lessons we learn from this library and build a JVM-only grammar of graphics library for doing data visualization “right” in Clojure.

The development and surrounding discussions of hanamicloth have also made me realize it would be useful to write an overview of the current state of dataviz options for Clojure and why we’re working on building something new. That’s on my list for the coming months, but lower priority than actual development work.

Impressions from PyData London

I got to attend PyData London this year thanks to a client of mine who was sponsoring the conference. I learned a lot and found the talks very interesting. My overall impression is that data science is maturing as a discipline, with more polished methods and robust theory backing up different approaches to data-related problems. With this maturation, though, comes higher expectations for production-ready, professional quality results. Most of the talks focused on high-level concerns like observability, scalability, and long-term stewardship of large open-source projects.

There are a lot of reasons why Python is just not ideal for building highly available, high-performance systems, and I really believe this is a good time to be building alternative tools for data science. Python is obviously entrenched as the current default language for working with data, but it is difficult and slow to write code that can take full advantage of modern hardware (because of the infamous global interpreter lock, reference counting, slow I/O, among other reasons). And to be fair, the Python community knows this. It’s why virtually all of the libraries that do the heavy lifting for data science in Python are actually implemented in C (numpy, pandas) or Rust (Polars, Pydantic), or are wrappers around C++ (PyTorch, TensorFlow, matplotlib) or Java (PySpark, Pydoop, confluent-kafka) libraries.

I think this provides a lot of insights into what data practitioners want. It’s clear that users want approachable, simple, human-readable interfaces for all of these tools, and that any new tool needs to interoperate with the rest of the ones currently in use. People are also tired of churn and are craving stability. I think Clojure has a lot to offer in all of these areas and is well placed to become more widely adopted for data science.

Ongoing work

My focus over the next two months will remain on the cookbook. My main goal is to finish the introductory chapter with the housing price analysis and to continue putting together the data import section with instructions and examples for all file formats that can reasonably be supported easily at this time.

I’ll continue to support and contribute to all of the ecosystem libraries I come across in my writings and analysis work in hopes of smoothing out all the rough edges I find.

Thanks for reading. I always love hearing from people who are interested in any of the things I’m working on. If that’s you, don’t hesitate to be in touch :)


Nikita Prokopov

Hi, I’m Nikitonsky and this is my open-source update for the past two months. Some good work was done on Humble UI (finally!), DataScript and new project — AlleKinos.de.

New project: AlleKinos.de, a no-nonsense movie showtimes site for all of Germany:

  • A simple view for all movie screenings in Germany, inspired by Bret Victor’s Magic Ink
  • Developed in Clojure, data stored in DataScript, hosted on application.garden
  • Includes many many small cities (up to 671 now!),
  • And all the cinemas that were reported missing before.

HumbleUI, Clojure Desktop UI framework:

  • A long-standing vdom branch that was dragging from January is finally merged!
  • All examples have been ported to the new style
  • Still lots of issues and design ideas to look at, but now I’ll be going through them in the main branch

DataScript, immutable database and Datalog query engine for Clojure, ClojureScript, and JS:

  • Fixed some OR queries broken in 1.6.4 (#468, closes #469)
  • Remove duplicate code #471
  • Document “partial db” during transaction #366
  • Stable sorting of sequences of various types #470
  • Correctly restore :max-tx from storage
  • Fixed tempid/upsert resolution when multiple tempids are added first (closes #472)
  • Allow upsert by implicit tuple when only tuple components are specified #473
  • Allow lookup-refs inside tuples used for lookup-refs #452
  • Code cleanup and formatting for the entire codebase

clj-reload, a smarter way to reload Clojure code:

  • Disabled parallel init/reload via lock #9
  • :only with regexp only reloads changed + new matching
  • added find-namespaces
  • clj-reload support has been merged into CIDER 1.14
  • Now possible to initialize without passing :dirs option, will use system classpath in that case

Clojure Sublimed, Clojure support for Sublime Text 4:

  • Fixed Socket REPL not working on Windows #95
  • Fixed a bug when Clojure Sublimed would not work right after first install #109

Sublime Executor, Sublime Text plugin to run any executable from your working dir:

  • Added executor_show_panel_on_output option

Blogging:

Talks:

Love,
Niki


Tommi Reiman

Started my 3 month sabbatical in June with a road-trip with the kids, a welcome reset! Now back home, learning and doing.
Refreshed my knowledge of the latest TypeScript, Zod and XState with the goal of pulling some of the good things into Clojure (into Malli + a fully XState-compatible FSM library). Also working on a template project with monorepo + malli + reitit, using Java 21 and virtual threads.

Library Releases

reitit 0.7.1 (active)

Fixing regression bugs from 0.7.0 + latest features via dependent libraries. Changelog here.

malli 0.16.2 (active)

Welcome Experimental Simplified Function Schemas!

[:-> :any] ; [:=> :cat :any]
[:-> :int :any] ; [:=> [:cat :int] :any]
[:-> [:cat :int] :any]  ; [:=> [:cat [:cat :int]] :any]
[:-> a b c d :any] ; [:=> [:cat a b c d] :any]

;; guard property
[:-> {:guard (fn [[[arg] ret]] ...)} :string :boolean]
; [:=> [:cat :string] :boolean [:fn (fn [[[arg] ret]] ...)]]

Also, small fixes and additions. Changelog here.

There is a big bunch of WIP work from myself and contributors waiting to be finished.

jsonista 0.3.9 (stable)

:do-not-fail-on-empty-beans option + updated dependencies, Changelog here

ring-http-response 0.9.4 (stable)

Teapots welcome! Changelog here.

spec-tools 0.10.7 (inactive)

Small fixes and improvements, Changelog here. If you are a user of spec-tools and want to help, feel free to ping me on Clojurians Slack, happy to take a new contributor here.

Something Else

Old abandoned Soviet-era sanatorium in Latvia.



Peter Taoussanis

A big thanks to Clojurists Together, Nubank, lambdaschmiede, and other sponsors of my open source work! I realise that it’s a tough time for a lot of folks and businesses lately, and that sponsorships aren’t always easy 🙏

2024 May - Jun

Hi folks! 👋

The last couple months have been light on big-ticket releases. Have been focused on maintenance, support, and groundwork for future releases. Output included:

Nippy and Carmine security releases

If you haven’t yet, please do try to update to the latest versions of Nippy and/or Carmine when possible:

These include a fix to address a security vulnerability described in more detail in Nippy’s release notes.

In short: Carmine uses Nippy for its serialization, and Nippy uses a Java compression library for its compression. Earlier releases of that Java library may be vulnerable when decompressing malicious data directly crafted by an attacker. The attack is believed to require arbitrary control of the data provided to Nippy for thawing.

Relevant posts were made to the Clojure subreddit, Clojurians Slack, and my X account.

Telemere

Work has continued on Telemere, my new structured logging and telemetry library for Clojure/Script.

There were numerous minor beta releases to address various issues that came up, and to polish sharp edges and documentation, etc.

Instead of detailing all that here, I’ll just point to the current release - v1.0.0-beta14. The latest beta release will always include a summary of all major recent changes.

I’m aiming to try out RC1 around the end of August, but won’t needlessly rush. I’d like the API to be completely stable after v1 final is out, so I’d rather go a bit slower now to get things right.

Big thanks to early adopters and testers for all the valuable feedback so far! 🙏

Carmine

Work has continued on Carmine v4. It’s quite an undertaking, but I’ve recently updated and merged the first parts of the new v4 core into mainline.

The current plan is for all the new stuff to live in a parallel taoensso.carmine-v4 namespace. This’ll make it easier for me to roll out the new work in stages, and get feedback from early adopters without negatively impacting existing users.

There’ll be a lot to say on Carmine v4, but that’ll come later.

Upcoming work

My current roadmap can always be found here, and it’s now also possible to vote to help guide my priorities.

Current objectives for July-August include:

  • Continued efforts on Telemere.
  • Hopefully release the final stable version of Tempel - my new data security framework for Clojure. Before the final release I’m planning to investigate support for MFA, extend the docs re: use with OpenID, OWASP, and make a few other last improvements. Originally had this planned for earlier, but rescheduled so that I could prioritise the Nippy security topic.

Cheers!
- Peter Taoussanis


Top-Down Imperative Clojure Architectures

When I first became interested in functional programming, a more experienced engineer told me: “you know functional programming doesn’t really amount to much more than procedural programming.” As I insisted on the benefits of map, filter and reduce, he simply shook his head. “You’re thinking in the small. Go look at a large real-world application.” It took some time for me to see what he meant. My preferred language, Clojure, is a functional language.


What’s the point? BigDecimal in review

More to the point: Where’s the point? Recently I had to dig into the BigDecimal implementation to fix a reported bug. Every time I have to look at the BigDecimal code, it is a journey of rediscovery. I’m going to write down a few things to save me some time in the future.

General Decimal Arithmetic Specification

All the implementations of BigDecimal that I looked at when I started working on this implemented some variation of the specification given in The General Decimal Arithmetic Specification. The specification is a bit … long. (74 pages.) And dense. But between the spec and the implementations, I got enough insight to proceed.

The GDAS has lots of added complexity. I was only trying to match the capabilities of java.math.BigDecimal. That implementation targeted a subset of the spec, more or less the X3.274 subset that is specified in an appendix of the GDAS. This subset shields us from some complexities: NaNs, infinite values, subnormal values, and negative zero. But I figured if it was good enough for Java, … .

What is a number?

The GDAS provides an abstract model for finite numbers. (I’m going to ignore the designated special cases.) A finite number is defined by three integer parameters:

  • sign: 0 for positive, 1 for negative
  • coefficient: an integer which is zero or positive.
  • exponent: a signed integer

(There is a lot of discussion about allowed values for these parameters and interactions of limits and ranges. You’re welcome to have a go at it.)

The numerical value of a finite number is given by the formula:

 value = (-1)^sign * coefficient * 10^exponent

In what follows, I’ll use the abstract representation for clarity. It will be notated as [sign,coeff,exp]. For convenience, I’ll often reduce this to [coeff,exp] and assume the coefficient is a signed integer value. For example, I might notate the number 123.45 as [0, 12345, -2] or [12345, -2], depending on the context. There should be no confusion.
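To make the formula concrete, here is a minimal Python sketch. It assumes Python's standard `decimal` module, which is based on the same specification and exposes the [sign, coeff, exp] triple directly; the helper names `numeric_value` and `to_decimal` are hypothetical.

```python
from decimal import Decimal
from fractions import Fraction


def numeric_value(sign, coefficient, exponent):
    """Evaluate (-1)^sign * coefficient * 10^exponent exactly."""
    return (-1) ** sign * coefficient * Fraction(10) ** exponent


def to_decimal(sign, coefficient, exponent):
    """Build a Decimal from the abstract [sign, coeff, exp] triple."""
    digits = tuple(int(d) for d in str(coefficient))
    return Decimal((sign, digits, exponent))


value = numeric_value(0, 12345, -2)          # exactly 123.45 as a Fraction
number = to_decimal(0, 12345, -2)            # Decimal('123.45')
triple = Decimal("123.45").as_tuple()        # sign=0, digits=(1, 2, 3, 4, 5), exponent=-2
```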

The first thing to understand is this:

This abstract definition deliberately allows for multiple representations of values which are numerically equal but are visually distinct (such as 1 and 1.00). [GDAS, p. 10]

What is 1? 1.00? Simple. The former is [1, 0] and the latter is [100, -2]. They have the same value. They differ in precision: the former has a precision of one digit; the latter has a precision of three.
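The same distinction can be observed with Python's `decimal` module, used here only as a stand-in for the abstract model; the `precision` helper is a hypothetical name that counts coefficient digits.

```python
from decimal import Decimal


def precision(number):
    """Count the digits in the coefficient of a finite Decimal."""
    return len(number.as_tuple().digits)


one = Decimal("1")                  # [1, 0]
one_point_zero_zero = Decimal("1.00")   # [100, -2]

same_value = one == one_point_zero_zero
same_representation = one.as_tuple() == one_point_zero_zero.as_tuple()
```

Equality compares numeric values, so `same_value` is true while `same_representation` is false.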

Conversions

The GDAS defines specific algorithms for converting an abstract representation to a string and a string to an abstract representation. The algorithms are not complicated, but a little longer than we need to get into. Here are some examples from the GDAS:

[0, 123,   0]   <=>    "123"
[1, 123,   0]   <=>    "-123"
[0, 123,   1]   <=>    "1.23E+3"
[0, 123,   3]   <=>    "1.23E+5"
[0, 123,  -1]   <=>    "12.3"
[0, 123,  -5]   <=>    "0.00123"
[0, 123, -10]   <=>    "1.23E-8"
[1, 123, -12]   <=>    "-1.23E-10"
[0,   0,   0]   <=>    "0"
[0,   0,  -2]   <=>    "0.00"
[0,   0,   2]   <=>    "0E+2"
[1,   0,   0]   <=>    "-0"         // except we won't have negative zero in our implementation
[0,   5,  -6]   <=>    "0.000005"
[0,  50,  -7]   <=>    "0.0000050"
[0,   5,  -7]   <=>    "5E-7"

You understand now why I will stick with the abstract representation. Implementing the reading and printing algorithms is a bit of fun.
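These conversions can be checked against an existing implementation. The sketch below assumes Python's `decimal` module, whose `str()` follows the specification's scientific-string conversion; unlike the implementation described here, Python does keep negative zero. The helper names are hypothetical.

```python
from decimal import Decimal

# A few of the [sign, coeff, exp] <=> string pairs listed above.
EXAMPLES = [
    ((0, 123, 0), "123"),
    ((1, 123, 0), "-123"),
    ((0, 123, 1), "1.23E+3"),
    ((0, 123, -5), "0.00123"),
    ((0, 123, -10), "1.23E-8"),
    ((0, 0, -2), "0.00"),
    ((0, 0, 2), "0E+2"),
    ((0, 50, -7), "0.0000050"),
    ((0, 5, -7), "5E-7"),
]


def to_decimal(sign, coefficient, exponent):
    """Build a Decimal from the abstract [sign, coeff, exp] triple."""
    digits = tuple(int(d) for d in str(coefficient))
    return Decimal((sign, digits, exponent))


def to_triple(text):
    """Parse a string back into a (sign, coeff, exp) triple."""
    sign, digits, exponent = Decimal(text).as_tuple()
    return (sign, int("".join(map(str, digits))), exponent)


printed_ok = all(str(to_decimal(*t)) == s for t, s in EXAMPLES)
parsed_ok = all(to_triple(s) == t for t, s in EXAMPLES)
```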

Contexts

The GDAS provides “arbitrary precision”; that is rarely what one wants. One can do the following:

(+ 100000000M 1E-20M) ;; => 100000000.00000000000000000001M

but one finds it hard to imagine a situation where 29 digits of precision are really necessary. But, you do you.

In case you would like to limit the precision, the GDAS provides a context object which can be used to control the precision and other parameters affecting arithmetic operations. We need only these:

  • precision: “An integer which must be positive (greater than 0). This sets the maximum number of significant digits that can result from an arithmetic operation.” [GDAS, p 13]
  • rounding: “A named value which indicates the algorithm to be used when rounding is necessary. Rounding is applied when a result coefficient has more significant digits than the value of precision; in this case the result coefficient is shortened to precision digits and may then be incremented by one (which may require a further shortening), depending on the rounding algorithm selected and the remaining digits of the original coefficient. The exponent is adjusted to compensate for any shortening.” [GDAS, p 13]

There are five rounding ‘algorithms’ (usually called rounding modes) that must be implemented. Again quoting from the GDAS:

  • round-down: (Round toward 0; truncate.) The discarded digits are ignored; the result is unchanged.
  • round-half-up: If the discarded digits represent greater than or equal to half (0.5) of the value of a one in the next left position then the result coefficient should be incremented by 1 (rounded up). Otherwise the discarded digits are ignored.
  • round-half-even: If the discarded digits represent greater than half (0.5) the value of a one in the next left position then the result coefficient should be incremented by 1 (rounded up). If they represent less than half, then the result coefficient is not adjusted (that is, the discarded digits are ignored). Otherwise (they represent exactly half) the result coefficient is unaltered if its rightmost digit is even, or incremented by 1 (rounded up) if its rightmost digit is odd (to make an even digit).
  • round-ceiling: (Round toward +∞.) If all of the discarded digits are zero or if the sign is 1 the result is unchanged. Otherwise, the result coefficient should be incremented by 1 (rounded up).
  • round-floor: (Round toward -∞.) If all of the discarded digits are zero or if the sign is 0 the result is unchanged. Otherwise, the sign is 1 and the result coefficient should be incremented by 1.
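The five modes are easy to compare side by side. This sketch assumes Python's `decimal` module, which names them ROUND_DOWN, ROUND_HALF_UP, ROUND_HALF_EVEN, ROUND_CEILING and ROUND_FLOOR; the `round_to` helper is a hypothetical name.

```python
from decimal import (
    Context,
    Decimal,
    ROUND_CEILING,
    ROUND_DOWN,
    ROUND_FLOOR,
    ROUND_HALF_EVEN,
    ROUND_HALF_UP,
)


def round_to(text, precision, mode):
    """Round the decimal given as a string to `precision` significant digits."""
    context = Context(prec=precision, rounding=mode)
    # Unary plus applies the context's precision and rounding to the operand.
    return context.plus(Decimal(text))


# Rounding 2.25 (exactly half way) to two significant digits.
down = round_to("2.25", 2, ROUND_DOWN)            # 2.2
half_up = round_to("2.25", 2, ROUND_HALF_UP)      # 2.3
half_even = round_to("2.25", 2, ROUND_HALF_EVEN)  # 2.2, because 2 is even

# Ceiling and floor depend on the sign, not on the size of the discarded part.
ceiling_neg = round_to("-2.29", 2, ROUND_CEILING)  # -2.2
floor_neg = round_to("-2.21", 2, ROUND_FLOOR)      # -2.3
```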

In Clojure, the dynamic Var named *math-context* is used to hold the current context. The default value of nil indicates unbounded precision and no rounding mode. The Numbers suite of arithmetic operations will use this context to determine the precision and rounding mode for the operation. The context can be set using the with-precision macro. For example:

 (with-precision 5 :rounding HalfUp 
    (+ 1000000000M 1E-20M))        ;; => 1.0000E+9M = [10000, 5]

Note: if :rounding is not specified, the default is HalfUp.

Some operations require an explicit context, typically when the result of an operation with no rounding does not have an exactly representable result. Division is the poster child.

 (/ 10M 2M)  ;; => 5M
 (/ 10M 3M)  ;; => throws ArithmeticException!  
             ;;    "Non-terminating decimal expansion; no exact representable decimal result."

 (with-precision 4 :rounding HalfUp
           (/ 10M 3M))                 ;; => 3.333M   
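For comparison, here is a sketch of the same computations in Python's `decimal` module, where a `Context` object plays the role of `*math-context*`. One difference to keep in mind: Python always has a finite precision (28 digits by default), so an inexact division rounds quietly unless the `Inexact` signal is trapped. The `divides_exactly` helper is a hypothetical name.

```python
from decimal import Context, Decimal, Inexact, ROUND_HALF_UP

# Counterpart of (with-precision 5 :rounding HalfUp (+ 1000000000M 1E-20M)).
ctx5 = Context(prec=5, rounding=ROUND_HALF_UP)
total = ctx5.add(Decimal("1000000000"), Decimal("1E-20"))

# Counterpart of (with-precision 4 :rounding HalfUp (/ 10M 3M)).
ctx4 = Context(prec=4, rounding=ROUND_HALF_UP)
quotient = ctx4.divide(Decimal(10), Decimal(3))

# Trapping Inexact makes a division that needs rounding raise an exception.
strict = Context(prec=4, rounding=ROUND_HALF_UP, traps=[Inexact])


def divides_exactly(a, b):
    """Return True when a / b needs no rounding at 4 digits of precision."""
    try:
        strict.divide(a, b)
        return True
    except Inexact:
        return False
```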

Basic arithmetic

The GDAS provides algorithms for the basic arithmetic operations. Some of them are rather involved. In fact, in my original implementation in C#, I have comments specifically noting places where I felt compelled to “port while looking”, i.e., I pretty much just straight translated the code from the OpenJDK implementation.

We can look at one operation, addition, to get a feel for how arithmetic computations are done, especially with regard to how the context comes into play for limiting precision and rounding.

Paraphrasing the GDAS (which combines the description of addition and subtraction – I’m subtracting the subtraction part):

  • The coefficient of the result is computed by adding the aligned coefficients of the two operands.
  • The aligned coefficients are computed by comparing the exponents of the operands:
    • If the exponents are equal, the aligned coefficients are the same as the original coefficients.
    • Otherwise, the aligned coefficient of the number with the larger exponent is multiplied by 10^n, where n is the absolute value of the difference between the exponents; the aligned coefficient of the other operand is the same as the original coefficient.
  • The result exponent is the minimum of the exponents of the two operands.

In other words, basically you do the equivalent of shifting in order to align the decimal points. Without talking about decimal points, just exponents.
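The alignment rule can be sketched in a few lines of Python, using plain (coefficient, exponent) tuples in place of a real BigDecimal type. The function name add_exact is hypothetical, and no rounding is applied.

```python
def add_exact(x, y):
    """Add two (coefficient, exponent) pairs with no rounding."""
    (xc, xe), (yc, ye) = x, y
    exp = min(xe, ye)
    # Scale each coefficient so both are expressed at the smaller exponent;
    # the operand that already has the smaller exponent is multiplied by 10**0.
    return (xc * 10 ** (xe - exp) + yc * 10 ** (ye - exp), exp)

# 1.25 + 3 = 4.25
print(add_exact((125, -2), (3, 0)))  # (425, -2)
```

Python integers are arbitrary precision, so they stand in for BigInteger here.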

An example, using our notation for numbers:

 [2751, -1] + [4356, 1] = [2751, -1] + [435600, -1] 
                        = [438351, -1]

Or in the way we usually do arithmetic:

     275.1
+  43560.0
----------
   43835.1

  • The result is then rounded to precision digits if necessary, counting from the most significant digit of the result.

Now, this is where you are going to get into trouble. Precision is how many digits you want to keep. Rounding with a context does not make it easy to say – give me just an integer result.

Given these definitions for our friend in the previous example:

(def d5 275.1M)
(def d6 4356E1M)

fill in the following table for evaluating:

(with-precision N (+ d5 d6))
N Result
10  
6  
5  
4  
3  
2  
1  
0  

I just went ahead and ran the code.

(map #(with-precision % (+ d5 d6)) '(10 6 5 4 3 2 1 0))
;; => (43835.1M 
;;     43835.1M 
;;     43835M 
;;     4.384E+4M 
;;     4.38E+4M 
;;     4.4E+4M 
;;     4E+4M
;;     43835.1M)

Or

N Result
10 43835.1
6 43835.1
5 43835
4 43840
3 43800
2 44000
1 40000
0 43835.1

(Remember that a precision of 0 means no precision limit.)
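If you want to double-check the table against an independent implementation, Python's decimal module gives the same answers. This is a sketch under one caveat: Python requires a precision of at least 1, so there is no row for 0 here. The helper add_with_precision is a made-up name, and ROUND_HALF_UP is chosen to match the HalfUp default mentioned earlier.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

d5 = Decimal("275.1")
d6 = Decimal("4356E1")

def add_with_precision(n):
    """Add d5 and d6 in a context with n digits of precision."""
    return Context(prec=n, rounding=ROUND_HALF_UP).add(d5, d6)

results = {n: str(add_with_precision(n)) for n in (10, 6, 5, 4, 3, 2, 1)}
for n, text in results.items():
    print(n, text)
# 10 43835.1
# 6 43835.1
# 5 43835
# 4 4.384E+4
# 3 4.38E+4
# 2 4.4E+4
# 1 4E+4
```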

What if you really want to round a result to get an integer? See below. But first, let’s write some code to at least get us through addition. It is easiest to discuss the rounding operation concretely.

It’s coding time

Let’s get a context type going first. We need an enum to cover the rounding modes.

type RoundingMode =
    | Up
    | Down
    | Ceiling
    | Floor
    | HalfUp
    | HalfDown
    | HalfEven
    | Unnecessary

The context is a record type.


[<Struct>]
type Context =
    { precision: uint32
      roundingMode: RoundingMode }

    // There are some standard contexts that can be used

    /// Standard precision for 32-bit decimal
    static member val Decimal32 =
        { precision = 7u
          roundingMode = HalfEven }
    /// Standard precision for 64-bit decimal
    static member val Decimal64 =
        { precision = 16u
          roundingMode = HalfEven }

    static member val Unlimited =
        { precision = 0u
          roundingMode = HalfUp }
    /// Default mode
    static member val Default =
        { precision = 9u
          roundingMode = HalfUp }

    // And some factory methods

    /// Create a Context with specified precision and roundingMode = HalfEven
    static member ExtendedDefault precision =
        { precision = precision
          roundingMode = HalfEven }

    /// Create a Context from the given precision and rounding mode
    static member Create(precision, roundingMode) =
        { precision = precision
          roundingMode = roundingMode }
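To get a feel for what these standard contexts do, here is a rough equivalent in Python's decimal module. The variable names decimal32, decimal64 and default are mine, chosen to echo the F# members above; the Python Context class itself carries more settings than our two-field record.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

# Precision and rounding mode copied from the F# members of the same name.
decimal32 = Context(prec=7, rounding=ROUND_HALF_EVEN)
decimal64 = Context(prec=16, rounding=ROUND_HALF_EVEN)
default = Context(prec=9, rounding=ROUND_HALF_UP)

one, two, three = Decimal(1), Decimal(2), Decimal(3)

third32 = decimal32.divide(one, three)
third64 = decimal64.divide(one, three)
two_thirds = default.divide(two, three)

print(third32)     # 0.3333333
print(third64)     # 0.3333333333333333
print(two_thirds)  # 0.666666667
```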

Now we can start implementing BigDecimal. For exposition purposes, I’ll present the code out of order. You’ll need to rearrange for an F# compilation to work.

We need three fields. The coefficient is a BigInteger and the exponent is an int. In addition, we lazily compute the precision of the number itself. We provide the precision in our private constructor. We supply 0, indicating not-yet-computed, much of the time, but if we know it at construction time we can supply it.

[<Sealed>]
type BigDecimal private (coeff, exp, precision) =

    // Precision

    // Constructor precision is shadowed with a mutable.
    // Value of 0 indicates precision not computed
    let mutable precision: uint = precision

    // Compute actual precision and cache it.
    member private _.GetPrecision() =
        match precision with
        | 0u -> precision <- Math.Max(ArithmeticHelpers.getBIPrecision (coeff), 1u)
        | _ -> ()

        precision

    // Public properties related to precision

    member this.Precision = this.GetPrecision()
    member _.RawPrecision = precision
    member this.IsPrecisionKnown = this.RawPrecision <> 0u

I’m going to skip that little helper method getBIPrecision for now. It deserves its own (short) post.
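In the meantime, the lazy-caching idea itself is easy to sketch in Python. This is not how getBIPrecision works; the digit count here uses the naive string-length route, and the class name Coefficient is invented for the illustration.

```python
class Coefficient:
    """Toy holder that computes its digit count lazily (0 means not yet computed)."""

    def __init__(self, value, precision=0):
        self.value = value
        self._precision = precision

    @property
    def precision(self):
        if self._precision == 0:
            # Count decimal digits of the magnitude; never report fewer than one.
            self._precision = max(len(str(abs(self.value))), 1)
        return self._precision

c = Coefficient(-123456789)
print(c.precision)  # 9
```

Passing a nonzero precision to the constructor skips the computation, which mirrors supplying a known precision at construction time.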

Now let’s take a look at addition. We’ll provide two versions, one taking a context and one not. If we don’t take a context, then there is no rounding involved. Just align and add the coefficients and use the exponent of the smaller one.

    member this.Add(y: BigDecimal) =
        let xc, yc, exp = BigDecimal.align this y in BigDecimal(xc + yc, exp, 0u)

We will define align to give us back the aligned coefficients and the smaller exponent from the two BigDecimals.

    /// Return the aligned coefficients and the smaller exponent
    static member private align (x: BigDecimal) (y: BigDecimal) =
        if y.Exponent > x.Exponent then 
            (x.Coefficient, BigDecimal.computeAlign y x, x.Exponent)
        elif x.Exponent > y.Exponent then 
            (BigDecimal.computeAlign x y, y.Coefficient, y.Exponent )
        else 
            (x.Coefficient, y.Coefficient, y.Exponent)

The computeAlign function is simple. It just multiplies the coefficient of the operand with the larger exponent by 10 raised to the difference in exponents. The operand with the larger exponent is the first argument.

    static member private computeAlign (big: BigDecimal) (small: BigDecimal) =
        let deltaExp = (big.Exponent - small.Exponent) |> uint
        big.Coefficient * ArithmeticHelpers.biPowerOfTen (deltaExp)

The biPowerOfTen function is a simple helper function to compute BigInteger powers of ten.

When contexts are involved, we have to deal with rounding:

    member this.Add(y: BigDecimal, c: Context) =
        let result = this.Add(y)

        if c.precision = 0u
           || c.roundingMode = RoundingMode.Unnecessary then
           // rounding not required
            result
        else
            BigDecimal.round result c

Rounding is required if the precision of the result is greater than the precision of the context.
The precision of the result is just the number of digits in its BigInteger coefficient. Suppose we have the BigDecimal value [123456789, -2] (= 1234567.89) and the context precision is 4. We need to reduce the coefficient to four digits, leaving us with either 1234 or 1235, depending on the rounding mode. We get the 1234 by dividing by a power of ten, the power being the difference in precision. Here the difference is 9 - 4 = 5, so we divide the coefficient by 100000 = 10^5. This yields 1234. Rounding up means adding 1, yielding 1235. Finally, to construct the result, we need the correct exponent. We divided by 100000, so we should multiply by the same amount; equivalently, increase the exponent by 5. The result is [1235, 3] or 1235000. In other words:

    static member private round (v: BigDecimal) c =
        let vp = v.GetPrecision()

        if (vp <= c.precision) then
            // No rounding required: precision is less than or equal to context precision
            v
        else
            // Rounding required
            let drop = vp - c.precision
            let divisor = ArithmeticHelpers.biPowerOfTen (drop)

            let rounded =
                BigDecimal.roundingDivide2 v.Coefficient divisor c.roundingMode

            // read below
            let exp =
                BigDecimal.checkExponentE ((int64 v.Exponent) + (int64 drop)) rounded.IsZero

            // check for the case where we had a 9999... that rounded up to 10000...

            if (ArithmeticHelpers.getBIPrecision(rounded) > c.precision && c.precision > 0u) then
                let newCoeff =  rounded / ArithmeticHelpers.biTen
                BigDecimal(newCoeff, exp+1, c.precision)
            else
                BigDecimal(rounded, exp, c.precision)

The call to checkExponentE is to ensure that the exponent is not too large. Our exponents are limited to the range of int32. The increment of the exponent might cause an overflow. We do the arithmetic in int64. checkExponentE will make sure it is in bounds (and also checks if we have a zero result, for which we can just set the exponent to 1). If the exponent is too large, it throws an exception.

The final conditional covers this warning in the GDAS:

When a result is rounded, the coefficient may become longer than the current precision. In this case the least significant digit of the coefficient (it will be a zero) is removed (reducing the precision by one), and the exponent is incremented by one.

An example: Context = (4, HalfUp). Number is [999996789, -2]. As in our previous example, we divide by 100000, yielding 9999. Rounding up gives 10000. The exponent is increased by 5, giving [10000, 3]. However, our precision is now five, not four. The code detects that our coefficient does not have the correct precision; in that case, we divide by 10 and increment the exponent. The result is [1000, 4].
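Both worked examples can be replayed with a small Python sketch of the same steps on (coefficient, exponent) pairs. It handles only the HalfUp mode, skips the exponent overflow check, and the function name round_half_up is made up; it is an illustration of the idea, not a port of the F# code.

```python
def round_half_up(coeff, exp, precision):
    """Round (coeff, exp) to at most `precision` digits; 0 means unlimited."""
    digits = len(str(abs(coeff)))
    if precision == 0 or digits <= precision:
        return coeff, exp

    drop = digits - precision
    divisor = 10 ** drop
    quotient, remainder = divmod(abs(coeff), divisor)
    if 2 * remainder >= divisor:  # discarded part is at least one half
        quotient += 1
    exp += drop

    # 9999... may have rounded up to 10000...: drop the trailing zero.
    if len(str(quotient)) > precision:
        quotient //= 10
        exp += 1

    return (-quotient if coeff < 0 else quotient), exp

print(round_half_up(123456789, -2, 4))  # (1235, 3)
print(round_half_up(999996789, -2, 4))  # (1000, 4)
```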

Where’s my integer?

Dig around in GDAS and you will note mention of functions round-to-integral-exact and round-to-integral-value. These are both defined as calls to the quantize function. The quantize text mentions that it used to be called rescale, with slightly different parameters. I decided to write Quantize and Rescale. (Some other implementations have a version of rescale; java.math.BigDecimal has a setScale method.)

    static member Quantize(lhs: BigDecimal, rhs: BigDecimal, mode: RoundingMode) =
        BigDecimal.Rescale(lhs, rhs.Exponent, mode)

The lhs is the value to be quantized or rescaled. The second argument provides the exponent. (Quantize takes the exponent from a BigDecimal value; Rescale gets the exponent directly.)

The GDAS has this to say about quantize:

[…] , quantize returns the number which is equal in value (except for any rounding) and sign to the first (left-hand) operand and which has an exponent set to be equal to the exponent of the second (right-hand) operand.

The coefficient of the result is derived from that of the left-hand operand. It may be rounded using the current rounding setting (if the exponent is being increased), multiplied by a positive power of ten (if the exponent is being decreased), or is unchanged (if the exponent is already equal to that of the right-hand operand).

Unlike other operations, if the length of the coefficient after the quantize operation would be greater than precision then an Invalid operation condition is raised. This guarantees that, unless there is an error condition, the exponent of the result of a quantize is always equal to that of the right-hand operand.

Let’s work through two cases, increasing vs. decreasing the exponent.

Suppose we call Rescale with [123456789, -4] (= 12345.6789) and desired exponent -1. Use mode HalfUp. We can guess what the result should be: 12345.7 = [123457, -1]. Now, we could figure out the code to do that directly but as it turns out, we can press round into service for us. We can view this as decreasing the precision. We just need to figure out what the new precision is. We can see it is 6 = 9 - 3 = the precision of the left-hand side minus the change in the exponent.

For the other direction, suppose we have [123456,-1] and we want to rescale to exponent -4. In other words, we go from 12345.6 to 12345.6000 = [123456000,-4]. No rounding will be needed. Just multiply by the appropriate power of 10.

And that is essentially the code below. A few minor tweaks to compute the precision of the result (since we know it) and a recursive call to Rescale to handle the 999… => 1000… problem.

    static member Rescale(lhs: BigDecimal, newExponent, mode): BigDecimal =

        let increaseExponent delta =
            // delta negative => increasing the exponent => we might have to round to a new precision
            let decrease = -delta |> uint
            let p = lhs.Precision

            let newPrecision = if p < decrease then 0u else p - decrease

            let r =
                lhs.Round
                    ({ precision = newPrecision
                       roundingMode = mode })

            if (r.Exponent = newExponent) then r else BigDecimal.Rescale(r, newExponent, mode)

        let decreaseExponent delta =
            // delta positive => decrease the exponent => multiply by 10^some power and don't underflow
            let newCoeff =
                lhs.Coefficient
                * ArithmeticHelpers.biPowerOfTen (delta)

            let oldPrec = lhs.Precision

            let newPrec =
                oldPrec + (if oldPrec = 0u then 0u else delta)

            BigDecimal(newCoeff, newExponent, newPrec)

        let delta =
            BigDecimal.checkExponentE ((int64 lhs.Exponent) - (int64 newExponent)) false

        if delta = 0 then lhs
        elif lhs.Coefficient.IsZero then BigDecimal(BigInteger.Zero, newExponent, 0u)
        elif delta < 0 then increaseExponent delta
        else decreaseExponent (uint delta)
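The two worked cases, plus the "give me an integer" question, can be cross-checked against Python's decimal module, whose quantize and to_integral_value methods follow the same specification. The variable names are illustrative only. The last part shows the Invalid operation condition quoted above, using a context whose precision is too small for the quantized coefficient.

```python
from decimal import Context, Decimal, InvalidOperation, ROUND_CEILING, ROUND_HALF_UP

# Increasing the exponent (-4 to -1) may round.
up = Decimal("12345.6789").quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
# Decreasing the exponent (-1 to -4) just pads with zeros.
down = Decimal("12345.6").quantize(Decimal("0.0001"))
# Rounding to an integer with an explicit mode.
integer = Decimal("43835.1").to_integral_value(rounding=ROUND_CEILING)
print(up, down, integer)  # 12345.7 12345.6000 43836

# Six digits are needed for 12345.7, but the context only allows four.
try:
    Decimal("12345.6789").quantize(Decimal("0.1"), context=Context(prec=4))
    too_long = False
except InvalidOperation:
    too_long = True
print(too_long)  # True
```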

And I think that’s it for the exposition.

You can stick around for a little background, if you like.

How we got here

I was inspired to write this post because of a post on the Clojurian Slack channel that there was a bug in converting a BigDecimal to an integer using rounding mode Ceiling. That got me to digging into the BigDecimal code, re-reading the GDAS, etc. It was a simple fix, but it took me a long time to get my head into the game; I hadn’t done any substantive work on this code in 14 years.

I thought writing some things down might help future me (or a future maintainer) should I venture this way again.

While writing this post, I found a few things hard to explain/justify and ended up making some code tweaks that got rid of an allocation and simplified a recursive call. A nice side-effect of posting.

But this raises the deeper question of how we got here: Why did I implement BigDecimal?

When Clojure first appeared, it supported all the primitive numeric types of Java and also the java.math.BigInteger and java.math.BigDecimal classes: The Lisp reader supports literals of those types (123N or 123.45M, respectively); the Numbers suite of arithmetical operations supports them via casts, arithmetic contagion, etc.

This presented a bit of a problem when porting to the CLR. At that time, there were no standard packages for these types in the Base Class Library (BCL). If ClojureCLR was going to provide support for arbitrary precision integers and decimals, the choices seemed to be either find some libraries to include or write my own. I chose the latter approach, in part because there seemed to be a definite distaste for including third-party libraries in the Clojure eco-system. Okay, also in part because I thought it would be fun.

So I looked around at some implementations of BigInteger packages, decided on what I wanted to provide – I needed to support at least the basic methods available in the Java version – and started coding. That was relatively straightforward. I could look at Microsoft.Scripting.Math.BigInteger (part of the IronPython project) and the java.math.BigInteger source code from OpenJDK. But mostly I relied on Donald Knuth’s The Art of Computer Programming, Volume 2. It was worth doing just for the excuse to read that book.

The coding was not terribly hard: We have a pretty intuitive feeling for integer arithmetic. If you can add two integers represented as sequences of digits in the range ‘0’ to ‘9’, how hard can it be to add two integers represented as sequences of ‘digits’ in the range ‘0’ to UInt32.MaxValue? (We use arrays of uint32 to represent values.) See? You’re almost there.

BigDecimal was another game entirely. This is not floating-point, so no inspiration there. Even the System.Decimal class in the CLR is of no help in this world. Fortunately, I found references to the GDAS early in the process.

I also looked at various implementations of BigDecimal in other languages. I think my primary sources were the OpenJDK, IronPython, and IronRuby implementations.

Each has its own approach. The OpenJDK implementation is similar to ours, except instead of the exponent, they have a scale field, which is the negative of the exponent. (Translating that is fraught. All the comparisons and adjustments are backwards.) They also have some efficiency hacks, such as a compact representation for when the coefficient is small enough to fit into a regular integer; I decided not to bother with the added complexity.

I’m not sure of the origin of the IronPython version – I think it comes from some older Python implementation. This package provides a much more complete implementation of GDAS. It has more of the signals (Inexact, Subnormal, etc.). The representation is [exponent, coefficient, sign, isSpecial], and it implements NaNs, infinities, and other special values. The following comment in the code indicated that I wasn’t going to get much help here:

    # Note that the coefficient, self._int, is actually stored as
    # a string rather than as a tuple of digits.  This speeds up
    # the "digits to integer" and "integer to digits" conversions
    # that are used in almost every arithmetic operation on
    # Decimals.  This is an internal detail: the as_tuple function
    # and the Decimal constructor still deal with tuples of
    # digits.

The IronRuby implementation also has more of the spec: NaNs and infinity, the overflow exception modes. The representation for a finite value is [sign, fraction, exponent].
The fraction is of type Fraction, implemented in the package. The representation here is unusual. An array of uint is used for the value, but rather than using the whole range in each ‘digit’, each uint will be in the range of 0 to 999999999. This makes translating to and from strings much simpler. It might be fun to play with a BigInteger implementation using this representation.

For ClojureCLR.Next, I decided to ditch my own clojure.lang.BigInteger and use System.Numerics.BigInteger instead. Less to maintain. So the F# version of BigDecimal uses System.Numerics.BigInteger for the coefficients.

And that’s the story.


Soundcljoud gets more cloudy

A logo of a face wearing a red hoodie with orange sunglasses featuring the Soundcloud logo

Last time on "Soundcljoud, or a young man's Soundcloud clonejure", I promised to clone Soundcloud, but then got bogged down in telling the story of my life and never got around to the actual cloning part. 😬

To be fair to myself, I did do a bunch of stuff to prepare for cloning, so now we can get to it with no further ado! (Skipping the ado bit is very out of character for me, I know. I'll just claim this parenthetical as my ado and thus fulfil your expectations of me as the most verbose writer in the Clojure community. You're welcome!)

Popping in a Scittle

If you've followed along with any of my other cloning adventures, you'll know where I'm going with this: straight to Scittle Town!

I'll start by creating a player directory and dropping a bb.edn into it:

{:deps {io.github.babashka/sci.nrepl
        {:git/sha "2f8a9ed2d39a1b09d2b4d34d95494b56468f4a23"}
        io.github.babashka/http-server
        {:git/sha "b38c1f16ad2c618adae2c3b102a5520c261a7dd3"}}
 :tasks {http-server {:doc "Starts http server for serving static files"
                      :requires ([babashka.http-server :as http])
                      :task (do (http/serve {:port 1341 :dir "public"})
                                (println "Serving static assets at http://localhost:1341"))}

         browser-nrepl {:doc "Start browser nREPL"
                        :requires ([sci.nrepl.browser-server :as bp])
                        :task (bp/start! {})}

         -dev {:depends [http-server browser-nrepl]}

         dev {:task (do (run '-dev {:parallel true})
                        (deref (promise)))}}}

In short, what's happening here is I'm setting up a Babashka project with a dev task that starts a webserver on port 1341 serving up the files in the public/ directory, starts an nREPL server on port 1339 that we can connect to with Emacs (or any inferior text editor of your choosing), and a websocket server on port 1340 that is connected to the nREPL server on one end and waiting for a ClojureScript app to connect to the other end.

Speaking of the public/ directory, I need a public/index.html file to serve up:

<!doctype html>
<html class="no-js" lang="">

<head>
    <meta charset="utf-8">
    <meta http-equiv="x-ua-compatible" content="ie=edge">
    <title>Soundcljoud</title>
    <meta name="description" content="">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <link rel="stylesheet" href="style.css">

    <script src="https://cdn.jsdelivr.net/npm/scittle@0.6.15/dist/scittle.js" type="application/javascript"></script>
    <script src="https://cdn.jsdelivr.net/npm/scittle@0.6.15/dist/scittle.promesa.js" type="application/javascript"></script>
    <script>var SCITTLE_NREPL_WEBSOCKET_PORT = 1340;</script>
    <script src="https://cdn.jsdelivr.net/npm/scittle@0.6.15/dist/scittle.nrepl.js"
        type="application/javascript"></script>
    <script type="application/x-scittle" src="soundcljoud.cljs"></script>
</head>

<body>
  <h1>Soundcljoud</h1>
  <div id="wrapper" style="display: none;">
    <div id="player">
      <div class="cover-image">
        <img src="" alt="" />
      </div>
      <div id="controls">
        <audio controls src=""></audio>
        <div id="tracks" style=""></div>
      </div>
    </div>
  </div>
</body>

</html>

The index.html file loads three JavaScript scripts:

  1. Scittle itself, which knows how to interpret ClojureScript scripts
  2. The Scittle Promesa plugin, which provides some niceties for dealing with promises
  3. The Scittle nREPL plugin, which will connect to that websocket server on port 1340 and complete the circuit that will allow us to REPL-drive our browser from Emacs (or the inferior text editor of your choosing)

Once this JavaScript is in place, index.html loads the soundcljoud.cljs ClojureScript file, which we'll come to in just a second.

For a (much) more detailed explanation, refer to the Popping in a Scittle section of my cljcastr, or a young man's Zencastr clonejure blog post.

The body of index.html is all about setting up a basic HTML page with this structure:

+----------------------+
| Soundcljoud          |
+-------+--------------+  <---
| Album | Audio player |      }
| cover +--------------+      } <div id="wrapper">
| image | Tracks list  |      }
+-------+--------------+  <---

Note that everything inside the wrapper div is hidden from the start:

  <div id="wrapper" style="display: none;">

We don't know anything about the album we want to display yet, and there's no point in showing a bunch of empty divs until we do.

Let's drop a public/style.css in as well:

body {
  font:
    1.2em Helvetica,
    Arial,
    sans-serif;
  margin: 20px;
  padding: 0;
}

img {
  max-width: 100%;
}

#wrapper {
  max-width: 960px;
  margin: 2em auto;
}

#controls {
  display: flex;
  flex-direction: column;
  gap: 5px;
}

#tracks {
  display: flex;
  flex-direction: column;
  gap: 3px;
}

@media screen and (min-width: 900px) {
  #wrapper {
    display: flex;
  }

  #player {
    display: flex;
    gap: 3%;
  }

  .cover-image {
    margin-right: 5%;
    max-width: 60%;
  }

  #controls {
    width: 25%;
  }
}

All of this stuff is about using screen real estate effectively. The first chunk of CSS applies universally, but the bit inside this:

@media screen and (min-width: 900px) {
  /* ... */
}

only applies to windows at least 900px wide. So our page defaults to a layout that's appropriate for phones (or really narrow browser windows), but then adjusts to move more content "above the fold" so you can probably see the entire UI without scrolling if you're viewing the page on a standard computer.

Now that we have all of the HTML and CSS plumbing in place, let's add a public/soundcljoud.cljs file to get started with some ClojureScripting:

(ns soundcljoud
  (:require [promesa.core :as p]))

Firing up the REPL

Before we can start REPL-driving, we need to put the key in the ignition and give it a right twist! In other words, we open up a terminal in the top-level player/ directory and invoke Babashka:

: jmglov@alhana; bb dev
Serving static assets at http://localhost:1341
nREPL server started on port 1339...
Websocket server started on 1340...

If we now connect to http://localhost:1341/, we'll be rewarded with a simple webpage:

Screenshot of a web browser window saying Soundcljoud

This by itself is of course monumentally boring, so let's inject some excitement into our lives by jumping into soundcljoud.cljs and pressing C-c l C (cider-connect-cljs), selecting localhost, port 1339, and nbb for the REPL type (assuming you're in Emacs; if you're using some other editor, perform the incantations necessary to connect your ClojureScript REPL to localhost:1339).

If everything went according to plan, you should see something like this in your terminal window:

:msg "{:versions
       {\"scittle-nrepl\"
        {\"major\" \"0\", \"minor\" \"0\", \"incremental\" \"1\"}},
       :ops
       {\"complete\" {}, \"info\" {}, \"lookup\" {}, \"eval\" {},
        \"load-file\" {}, \"describe\" {}, \"close\" {}, \"clone\" {},
        \"eldoc\" {}},
       :status [\"done\"],
       :id \"3\",
       :session \"3264dc1e-1b46-48a6-b11a-f606fea032b7\",
       :ns \"soundcljoud\"}"
:msg "{:value \"nil\",
       :id \"5\",
       :session \"3264dc1e-1b46-48a6-b11a-f606fea032b7\",
       :ns \"soundcljoud\"}"
:msg "{:status [\"done\"],
       :id \"5\",
       :session \"3264dc1e-1b46-48a6-b11a-f606fea032b7\",
       :ns \"soundcljoud\"}"

And something like this in your editor's REPL window:

;; Connected to nREPL server - nrepl://localhost:1339
;; CIDER 1.12.0 (Split)
;;
;; ClojureScript REPL type: nbb
;;
nil> 

Let's prove that it works by evaluating the buffer with C-c C-k (cider-load-buffer), adding a Rich comment, putting some ClojureScript in there that grabs our wrapper div, positioning our cursor at the end of the form, and evaluating that sucker with C-c C-v f c e (cider-pprint-eval-last-sexp-to-comment):

(ns soundcljoud
  (:require [promesa.core :as p]))

(comment

  (js/document.querySelector "#wrapper")
  ;; => #object[HTMLDivElement [object HTMLDivElement]]

)

We've proven that we can evaluate ClojureScript code in the running browser process from our REPL buffer, which is nifty for sure, but our page still bores us, and the result of evaluating that code is pretty useless:

#object[HTMLDivElement [object HTMLDivElement]]

Let's actually do something with the div we've pulled down, and whilst we're at it, provide a useful way of logging stuff:

(ns soundcljoud
  (:require [promesa.core :as p]))

(defn log
  ([msg]
   (log msg nil))
  ([msg obj]
   (if obj
     (js/console.log msg obj)
     (js/console.log msg))
   obj))

(comment

  (let [div (js/document.querySelector "#wrapper")]
    (set! (.-style div) "display: flex")
    (log "All is revealed!" div))
  ;; => #object[HTMLDivElement [object HTMLDivElement]]

)

Screenshot of a web browser window with an audio player

Fantastic! By using js/console.log (by the way, that js/ prefix is the way you instruct ClojureScript to do some JavaScript interop; it's basically saying "look for the next symbol in the top-level scope in JavaScript land"), we now get the fancy inspection tools in the browser's JavaScript console so we can expand parts of the object and drill down to see stuff we're interested in.

Now that we've established a baseline, we can get stuck in and do some real work. 💪🏻

Reading some RSS

Do you remember the MP3 files and RSS feed we prepared in the previous blog post? Let's plop those down in our public/ directory so we can access them from the webapp we're slowly constructing:

: jmglov@alhana; mkdir -p 'public/Garth Brooks/Fresh Horses'

: jmglov@alhana; cp /tmp/soundcljoud.12524185230907219576/*.{rss,mp3} !$

: jmglov@alhana; ls -1 !$
album.rss
'Garth Brooks - Cowboys and Angels.mp3'
'Garth Brooks - Ireland.mp3'
"Garth Brooks - It's Midnight Cinderella.mp3"
"Garth Brooks - Rollin'.mp3"
"Garth Brooks - She's Every Woman.mp3"
"Garth Brooks - That Ol' Wind.mp3"
'Garth Brooks - The Beaches of Cheyenne.mp3'
'Garth Brooks - The Change.mp3'
'Garth Brooks - The Fever.mp3'
'Garth Brooks - The Old Stuff.mp3'

Now that our files are in place, let's see about loading the RSS feed from ClojureScript:

(comment

  (def base-path "/Garth+Brooks/Fresh+Horses")
  ;; => #'soundcljoud/base-path

  (p/->> (js/fetch (js/Request. (str base-path "/album.rss")))
         (.text)
         (log "Fetched XML:"))
  ;; => #<Promise[~]>

)

In our console, we can see what we fetched:

Fetched XML: <?xml version='1.0' encoding='UTF-8'?>
<rss version="2.0"
     xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
     xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <atom:link
        href="http://localhost:1341/Garth+Brooks/Fresh+Horses/album.rss"
        rel="self"
        type="application/rss+xml"/>
    <title>Garth Brooks - Fresh Horses</title>
    <link>https://api.discogs.com/masters/212114</link>
    <pubDate>Sun, 01 Jan 1995 00:00:00 +0000</pubDate>
    <itunes:subtitle>Album: Garth Brooks - Fresh Horses</itunes:subtitle>
    <itunes:author>Garth Brooks</itunes:author>
    <itunes:image href="https://i.discogs.com/0eLXmM1tK1grkH8cstgDT6eV2TlL0NvgWPZBoyScJ_8/rs:fit/g:sm/q:90/h:600/w:600/czM6Ly9kaXNjb2dz/LWRhdGFiYXNlLWlt/YWdlcy9SLTY4NDcx/Ny0xNzE3NDU5MDIy/LTMxNjguanBlZw.jpeg"/>
    
    <item>
      <itunes:title>The Old Stuff</itunes:title>
      <title>The Old Stuff</title>
      <itunes:author>Garth Brooks</itunes:author>
      <enclosure
          url="http://localhost:1341/Garth+Brooks/Fresh+Horses/Garth+Brooks+-+The+Old+Stuff.mp3"
          length="5943424" type="audio/mpeg" />
      <pubDate>Sun, 01 Jan 1995 00:00:00 +0000</pubDate>
      <itunes:duration>252</itunes:duration>
      <itunes:episode>1</itunes:episode>
      <itunes:episodeType>full</itunes:episodeType>
      <itunes:explicit>false</itunes:explicit>
    </item>
   ... 
    <item>
      <itunes:title>Ireland</itunes:title>
      <title>Ireland</title>
      <itunes:author>Garth Brooks</itunes:author>
      <enclosure
          url="http://localhost:1341/Garth+Brooks/Fresh+Horses/Garth+Brooks+-+Ireland.mp3"
          length="6969472" type="audio/mpeg" />
      <pubDate>Sun, 01 Jan 1995 00:00:00 +0000</pubDate>
      <itunes:duration>301</itunes:duration>
      <itunes:episode>10</itunes:episode>
      <itunes:episodeType>full</itunes:episodeType>
      <itunes:explicit>false</itunes:explicit>
    </item>
    
  </channel>
</rss>

That looks quite familiar! That also looks like a bunch of text, which is not the nicest thing to extract data from. Luckily, that's a bunch of structured text, and more luckily, it's XML (XML is great, and don't let anyone tell you otherwise! And don't get me started on how we've reinvented XML but poorly with JSON Schema and all of this other nonsense we've built up around JSON because we realised that things like data validation are important when exchanging data between machines. 🤦🏼‍♂️), and most luckily of all, browsers know how to parse XML (which makes sense, as HTML and XML documents are exposed through the same DOM):

(defn parse-xml [xml-str]
  (.parseFromString (js/window.DOMParser.) xml-str "text/xml"))

(comment

  (p/->> (js/fetch (js/Request. (str base-path "/album.rss")))
         (.text)
         parse-xml
         (log "Fetched XML:"))
  ;; => #<Promise[~]>

)

Screenshot of a web browser window with an XML document in the JS console

Let's do the right thing and make a function out of this:

(defn fetch-xml [path]
  (p/->> (js/fetch (js/Request. path))
         (.text)
         parse-xml
         (log "Fetched XML:")))

Now that we know how to fetch and parse XML, let's see how to extract useful information from it. Looking at the log output, we can see that the parsed XML is of type #document, just like our good friend js/document (the current webpage that the browser is displaying). That's right, we have a Document Object Model, which means we can use all the tasty DOM functions we're used to, such as document.querySelector() to grab a node using a CSS selector.

Let's start with the album title:

(comment

  (p/let [title (p/-> (fetch-xml (str base-path "/album.rss"))
                      (.querySelector "title")
                      (.-innerHTML))]
    (set! (.-innerHTML (js/document.querySelector "h1")) title))
  ;; => #<Promise[~]>

)

Cool! We now see "Garth Brooks - Fresh Horses" as our page heading! Let's see about grabbing the album art next:

(comment

  (p/let [xml (fetch-xml (str base-path "/album.rss"))
          title (p/-> xml
                      (.querySelector "title")
                      (.-innerHTML))
          image (p/-> xml
                      (.querySelector "image")
                      (.getAttribute "href"))]
    (set! (.-innerHTML (js/document.querySelector "h1")) title)
    (set! (.-src (js/document.querySelector ".cover-image > img")) image)
    (set! (.-style (js/document.querySelector "#wrapper")) "display: flex;"))
  ;; => #<Promise[~]>

)

Screenshot of a web browser window with the album art for Fresh Horses and an audio player

Before we go any further, let's create some functions from this big blob of code. At the moment, we're complecting two things:

  1. Extracting data from the XML DOM
  2. Updating the HTML DOM to display the data

Let's do the functional programming thing and create a purely functional core and a mutable shell. Instead of extracting and updating, we'll create a function that transforms the XML DOM representation of an album into a ClojureScript representation:

(defn xml-get [el k]
  (-> el
      (.querySelector k)
      (.-innerHTML)))

(defn xml-get-attr [el k attr]
  (-> el
      (.querySelector k)
      (.getAttribute attr)))

(defn ->album [xml]
  {:title (xml-get xml "title")
   :image (xml-get-attr xml "image" "href")})

(defn load-album [path]
  (p/-> (fetch-xml path) ->album))

(comment

  (p/let [{:keys [title image] :as album} (load-album (str base-path "/album.rss"))]
    (set! (.-innerHTML (js/document.querySelector "h1")) title)
    (set! (.-src (js/document.querySelector ".cover-image > img")) image)
    (set! (.-style (js/document.querySelector "#wrapper")) "display: flex;"))
  ;; => #<Promise[~]>

)

Now that we have a nice ClojureScript data structure to represent our album, let's tackle the DOM mutations we need to do to display the album:

(defn get-el [selector]
  (if (instance? js/HTMLElement selector)
    selector  ; already an element; just return it
    (js/document.querySelector selector)))

(defn set-styles! [el styles]
  (set! (.-style el) styles))

(defn display-album! [{:keys [title image] :as album}]
  (let [header (get-el "h1")
        cover (get-el ".cover-image > img")
        wrapper (get-el "#wrapper")]
    (set! (.-innerHTML header) title)
    (set! (.-src cover) image)
    (set-styles! wrapper "display: flex;")
    album))

(comment

  (p/-> (load-album (str base-path "/album.rss")) display-album!)
  ;; => #<Promise[~]>

)

Tracking down the tracks

Displaying the album title and cover art is all well and good, but in order to complete our Soundcloud clone, we need some way of actually listening to the music on the album. If you recall, our RSS feed contains a series of <item> tags representing the tracks:

    <item>
      <itunes:title>The Old Stuff</itunes:title>
      <title>The Old Stuff</title>
      <itunes:author>Garth Brooks</itunes:author>
      <enclosure
          url="http://localhost:1341/Garth+Brooks/Fresh+Horses/Garth+Brooks+-+The+Old+Stuff.mp3"
          length="5943424" type="audio/mpeg" />
      <pubDate>Sun, 01 Jan 1995 00:00:00 +0000</pubDate>
      <itunes:duration>252</itunes:duration>
      <itunes:episode>1</itunes:episode>
      <itunes:episodeType>full</itunes:episodeType>
      <itunes:explicit>false</itunes:explicit>
    </item>
   ... 
    <item>
      <itunes:title>Ireland</itunes:title>
      <title>Ireland</title>
      <itunes:author>Garth Brooks</itunes:author>
      <enclosure
          url="http://localhost:1341/Garth+Brooks/Fresh+Horses/Garth+Brooks+-+Ireland.mp3"
          length="6969472" type="audio/mpeg" />
      <pubDate>Sun, 01 Jan 1995 00:00:00 +0000</pubDate>
      <itunes:duration>301</itunes:duration>
      <itunes:episode>10</itunes:episode>
      <itunes:episodeType>full</itunes:episodeType>
      <itunes:explicit>false</itunes:explicit>
    </item>

What we need from each item in order to display and play the track is:

  • Song title
  • Artist (for this album, all tracks are from Garth, but an album could be a compilation of songs by different artists, so let's grab the artist in case we later decide to display it)
  • Track number
  • URL of the source audio

Let's write an aspirational function that assumes it will be called with a DOM element representing an <item> and transforms it into a ClojureScript map, just as we did for the album itself:

(defn ->track [item-el]
  {:artist (xml-get item-el "author")
   :title (xml-get item-el "title")
   :number (-> (xml-get item-el "episode") js/parseInt)
   :src (xml-get-attr item-el "enclosure" "url")})

For the track number, we need to convert it to an integer, since the text content of an XML element is, well, text, and we'll want to sort our tracks numerically.
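
To see why the conversion matters, here's a minimal sketch using made-up track maps whose :number is still text. The raw-tracks data and the parse-number helper are illustrative assumptions, and parse-long (available in Clojure and ClojureScript 1.11 and later) stands in for js/parseInt so the snippet isn't tied to a browser:

```clojure
;; Hypothetical track data: :number is still the raw text from <itunes:episode>.
(def raw-tracks
  [{:title "Ireland" :number "10"}
   {:title "The Old Stuff" :number "1"}
   {:title "Cowboys and Angels" :number "2"}])

;; Sorting the raw text compares character by character,
;; so "10" lands before "2".
(mapv :number (sort-by :number raw-tracks))
;; => ["1" "10" "2"]

;; Hypothetical helper: parse the number first, then sort.
(defn parse-number [track]
  (update track :number parse-long))

(mapv :number (sort-by :number (map parse-number raw-tracks)))
;; => [1 2 10]
```

With only nine tracks we'd never notice the difference, but Fresh Horses has ten, so track 10 would sneak in right after track 1.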

Now that we have a function to convert an <item> into a track, let's plug that into our ->album function to add a list of tracks to the album:

(defn ->album [xml]
  {:title (xml-get xml "title")
   :image (xml-get-attr xml "image" "href")
   :tracks (->> (.querySelectorAll xml "item")
                (map ->track)
                (sort-by :number))})

OK, we have data representing a list of tracks, so we need to consider how we want to display it. If we cast our mind back to our HTML, we have a div where the tracks should go:

<body>
  ...
  <div id="wrapper" style="display: none;">
    <div id="player">
      ...
      <div id="controls">
        ...
        <div id="tracks" style=""></div>
      </div>
    </div>
  </div>
</body>

What we can do is create a <span> for each track, something like this:

<span>1. The Old Stuff</span>

Let's go ahead and write that function:

(defn track->span [{:keys [number artist title] :as track}]
  (let [span (js/document.createElement "span")]
    (set! (.-innerHTML span) (str number ". " title))
    span))

(comment

  (p/->> (load-album (str base-path "/album.rss"))
         :tracks
         first
         track->span
         (log "The first track is:"))
  ;; => #<Promise[~]>

)

In the JavaScript console, we see:

The first track is: <span>1. The Old Stuff</span>

This is cool, because the track->span function is still pure: there's no mutation occurring there. We have one and only one place that's doing mutation, and that's display-album!, which is where we can hook into our functional core and display the tracks. In order to do that, we'll take our list of tracks, turn them into a list of <span> elements, and then set them as the children of the #tracks div.

(defn set-children! [el children]
  (.replaceChildren el)
  (doseq [child children]
    (.appendChild el child))
  el)

(defn display-album! [{:keys [title image tracks] :as album}]
  (let [header (get-el "h1")
        cover (get-el ".cover-image > img")
        wrapper (get-el "#wrapper")]
    (set! (.-innerHTML header) title)
    (set! (.-src cover) image)
    (->> tracks
         (map track->span)
         (set-children! (get-el "#tracks")))
    (set-styles! wrapper "display: flex;")
    album))

(comment

  (p/-> (load-album (str base-path "/album.rss")) display-album!)
  ;; => #<Promise[~]>

)

Screenshot of a web browser window with the album art for Fresh Horses, an audio player, and a list of tracks

This is fantastic... if all we want to do is know what's on an album. But of course my initial problem was wanting to listen to Garth and not having a way to do that. Now I have written much Clojure and ClojureScript, and still cannot listen to Garth. 🤔

Play it again, Sam

Of course what I do have is an HTML <audio> element and an MP3 file with a source URL, and I bet if I can just put these two things together, my ears will soon be filled with the sweet sweet sounds of 90s country music.

Let's start out with the simplest thing we can do, which is to activate the first track on the album once it's loaded. Since display-album! returns the album, we can just add some code to the end of the pipeline:

(comment

  (def base-path "http://localhost:1341/Garth+Brooks/Fresh+Horses")
  ;; => #'soundcljoud/base-path

  (p/->> (load-album (str base-path "/album.rss"))
         display-album!
         :tracks
         first
         :src
         (set! (.-src (get-el "audio"))))
  ;; => #<Promise[~]>

)

As soon as we evaluate this code, the <audio> element comes to life, displaying a duration and activating the play button. Pressing the play button, we do in fact hear some Garth! 🎉

However, our UX is quite poor, since there's no visual representation of which track is playing. We can fix this by emboldening the active track:

(comment

  (p/let [{:keys [number src] :as track}
          (p/->> (load-album (str base-path "/album.rss"))
                 display-album!
                 :tracks
                 first)]
    (-> (get-el "#tracks")
        (.-children)
        seq
        (nth (dec number))
        (set-styles! "font-weight: bold;"))
    (set! (.-src (get-el "audio")) src))
  ;; => #<Promise[~]>

)

Screenshot of our UI with the first track highlighted and loaded in the audio element

Speaking of UX, though, one would imagine that they'd be able to change to a track by clicking on it. At the moment, clicking does nothing, but that's easy enough to fix by adding an event handler to our span for each track that activates the track. Let's create a function and shovel our track activating code in there:

(defn activate-track! [{:keys [number src] :as track}]
  (log "Activating track:" (clj->js track))
  (let [track-spans (seq (.-children (get-el "#tracks")))]
    (-> track-spans
        (nth (dec number))
        (set-styles! "font-weight: bold;")))
  (set! (.-src (get-el "audio")) src)
  track)

By the way, that clj->js function takes a ClojureScript data structure (in this case, our track map) and recursively transforms it into a JavaScript object so it can be printed nicely in the JS console.

OK, now that we have activate-track! as a function, we can use it in a click handler:

(defn track->span [{:keys [number title] :as track}]
  (let [span (js/document.createElement "span")]
    (set! (.-innerHTML span) (str number ". " title))
    (.addEventListener span "click" (partial activate-track! track))
    span))

(comment

  (p/-> (load-album (str base-path "/album.rss"))
        display-album!
        :tracks
        first
        activate-track!)
  ;; => #<Promise[~]>

)

Evaluating this code activates the first track on the album as before, and then clicking another track highlights it in bold and loads it into the <audio> element. That's good, but what isn't so good is that the first track stays bold. 😬

Luckily, there's an easy fix for this. All we need to do is reset the weight of all the track spans before bolding the active one in activate-track!:

(defn activate-track! [{:keys [number src] :as track}]
  (log "Activating track:" (clj->js track))
  (let [track-spans (seq (.-children (get-el "#tracks")))]
    (doseq [span track-spans]
      (set-styles! span "font-weight: normal;"))
    (-> track-spans
        (nth (dec number))
        (set-styles! "font-weight: bold;")))
  (set! (.-src (get-el "audio")) src)
  track)

Amazing!

Whilst we're ticking off UX issues, let's think about what should happen when our user clicks on a different track. At the moment, we load the track into the player and then the user has to click the play button to start listening to it. That is perfectly reasonable when first loading the album, but if I'm listening to a track and then select another one, I would kinda expect the new track to start playing automatically instead of me having to click play manually.

Let's see how we can do this. According to the HTMLMediaElement documentation, our <audio> element should have a paused attribute telling us whether playback is happening. Let's try it out:

(comment

  (p/-> (load-album (str base-path "/album.rss"))
        display-album!
        :tracks
        first
        activate-track!)
  ;; => #<Promise[~]>

  (-> (get-el "audio")
      (.-paused))
  ;; => true

)

Now if we click the play button and check the value of the paused attribute again:

(comment

  (-> (get-el "audio")
      (.-paused))
  ;; => false

)

Excellent! Now let's see how we programmatically start playing a newly loaded track. Referring back to the documentation, we discover an HTMLMediaElement.play() method. Let's try that out:

(comment

  (p/-> (load-album (str base-path "/album.rss"))
        display-album!
        :tracks
        second
        activate-track!)
  ;; => #<Promise[~]>

  (-> (get-el "audio")
      (.play))
  ;; => #<Promise[~]>

)

Evaluating this code results in "Cowboys and Angels" starting to play!

Now we can use what we've learned to teach activate-track! to start playing the track when appropriate:

(defn activate-track! [{:keys [number src] :as track}]
  (log "Activating track:" (clj->js track))
  (let [track-spans (seq (.-children (get-el "#tracks")))
        audio-el (get-el "audio")
        paused? (.-paused audio-el)]
    (doseq [span track-spans]
      (set-styles! span "font-weight: normal;"))
    (-> track-spans
        (nth (dec number))
        (set-styles! "font-weight: bold;"))
    (set! (.-src audio-el) src)
    (when-not paused?
      (.play audio-el)))
  track)

(comment

  (p/-> (load-album (str base-path "/album.rss"))
        display-album!
        :tracks
        first
        activate-track!)
  ;; => #<Promise[~]>

)

When the album loads, the first track is activated but doesn't start playing. Clicking on another track activates it but doesn't start playing it. However, if we click the play button and start listening to the active track, then click on another track, the new track is activated and immediately starts playing.

This, my friends, is some seriously good UX! Of course, we can improve it further.

Keep playing it, Sam

The next UX nit that we should pick is the fact that when a track ends, our poor user has to manually click on the next track and then manually click the play button just to keep listening to the album. This seems a bit mean of us, so let's see what we can do in order to be the nice people that we know we are, deep down inside.

Our good friend HTMLMediaElement has a bunch of events that tell us useful things about what's happening with the media, and one of these events is ended:

Fired when playback stops when end of the media (<audio> or <video>) is reached or because no further data is available.

This seems like it will fit the bill quite nicely. Hopping back in our hammock for a minute, we think about what should happen when the end of a track is reached:

  • The next track is activated and starts playing, unless
  • It's the last track on the album, in which case nothing should happen.

We can of course add an ended event listener to the <audio> element every time a new track is activated, but this is problematic because we would then want to remove the previous event listener, and it turns out that removing event listeners is a bit complicated. What if we instead had an event listener that knew what track was currently playing, where that track comes in the album, and what track (if any) is next? Then we'd only have to attach a listener once, right after we load the album. Let's think through how we could do that.

So far, we've been relying on the state of the DOM to tell us things like if the track is paused. A much more functional approach would be to control the state ourselves using immutable data structures and so on. A nice side effect of this (sorry, Haskell folks, Clojurists are just fine with uncontrolled side effects) is that it actually makes REPL-driven development easier as well! 🤯

Let's start by extracting a function to handle the tedium of loading the album, displaying it, and then activating the first track:

(defn load-ui! [dir]
  (p/->> (load-album (str dir "/album.rss"))
         display-album!
         :tracks
         first
         activate-track!))

Now that we have this, we'll define a top-level atom to hold the state, then update our load-ui! function to stuff the album into the atom once it's loaded:

(def state (atom nil))

(defn load-ui! [dir]
  (p/->> (load-album (str dir "/album.rss"))
         display-album!
         (assoc {} :album)
         (reset! state)
         :album
         :tracks
         first
         activate-track!))

What we're doing here is creating a map to hold the state, then assoc-ing the loaded album into the map under the :album key, then putting that map into the state atom with reset!, which returns the new value saved in the atom, which is the one we just put in there, which will look like this:

{:album
 {:title "Garth Brooks - Fresh Horses",
  :image "https://i.discogs.com/.../LTMxNjguanBlZw.jpeg",
  :tracks
  ({:artist "Garth Brooks",
    :title "The Old Stuff",
    :number 1,
    :src "http://localhost:1341/Garth+Brooks/Fresh+Horses/Garth+Brooks+-+The+Old+Stuff.mp3"}
   ...
   {:artist "Garth Brooks",
    :title "Ireland",
    :number 10,
    :src "http://localhost:1341/Garth+Brooks/Fresh+Horses/Garth+Brooks+-+Ireland.mp3"})}}

We'll then grab the album back out of the map and proceed as before to activate the first track. This is a little gross, but we'll clean it up as we go.

Oh yeah, and remember when I promised this would make debugging easier? Check this out:

(comment

  (load-ui! "http://localhost:1341/Garth+Brooks/Fresh+Horses")
  ;; => #<Promise[~]>

  @state
  ;; => {:album {:title "Garth Brooks - Fresh Horses",
  ;;             :image "https://i.discogs.com/.../LTMxNjguanBlZw.jpeg",
  ;;             :tracks
  ;;             ({:artist "Garth Brooks",
  ;;               :title "The Old Stuff",
  ;;               :number 1,
  ;;               :src "http://localhost:1341/Garth+Brooks/Fresh+Horses/Garth+Brooks+-+The+Old+Stuff.mp3"}
  ;;               ...
  ;;               {:artist "Garth Brooks",
  ;;                :title "Ireland",
  ;;                :number 10,
  ;;                :src "http://localhost:1341/Garth+Brooks/Fresh+Horses/Garth+Brooks+-+Ireland.mp3"})}}

)

That's right, we no longer have to rely on logging stuff to the JS console in our promise chains!

OK, but we haven't really changed anything other than making the load-ui! function more complicated. Let's add a little more to our state atom so we can actually tackle the problem of auto-advancing tracks. First, we'll add a :paused? key:

(defn load-ui! [dir]
  (p/->> (load-album (str dir "/album.rss"))
         display-album!
         (assoc {:paused? true} :album)
         (reset! state)
         :album
         :tracks
         first
         activate-track!))

(comment

  (load-ui! "http://localhost:1341/Garth+Brooks/Fresh+Horses")
  ;; => #<Promise[~]>

  @state
  ;; => {:paused? true, :album {...}}

)

Now let's add an event listener to the <audio> element that updates the state when the play button is pressed, doing a little cleanup of the load-ui! function whilst we're at it:

(defn load-ui! [dir]
  (p/let [album (load-album (str dir "/album.rss"))]
    (display-album! album)
    (reset! state {:paused? true, :album album})
    (->> album
         :tracks
         first
         activate-track!)
    (.addEventListener (get-el "audio") "play"
                       #(swap! state assoc :paused? false))))

(comment

  (load-ui! "http://localhost:1341/Garth+Brooks/Fresh+Horses")
  ;; => #<Promise[~]>

  (:paused? @state)
  ;; => true

  ;; Click the play button and...
  (:paused? @state)
  ;; => false

)

If you're not familiar with swap!, it takes an atom and a function which will be called with the current value of the atom, then sets the next value of the atom to whatever the function returns, just like update does for plain old maps. And also just like update, it has a shorthand form so that instead of writing this:

(swap! state #(assoc % :paused? false))

you can write this:

(swap! state assoc :paused? false)

in which case swap! will treat the arg after the atom as a function which will be called with the current value first, then the rest of the args to swap!. You can imagine that swap! is written something like this:

(defn swap!
  ([atom f]
   (reset! atom (f @atom)))
  ([atom f & args]
   (reset! atom (apply f @atom args))))

It's obviously not written like that, even though that would technically probably maybe work. In Clojure on the JVM, it's actually written like this:

(defn swap!
  "Atomically swaps the value of atom to be:
  (apply f current-value-of-atom args). Note that f may be called
  multiple times, and thus should be free of side effects.  Returns
  the value that was swapped in."
  {:added "1.0"
   :static true}
  ([^clojure.lang.IAtom atom f] (.swap atom f))
  ([^clojure.lang.IAtom atom f x] (.swap atom f x))
  ([^clojure.lang.IAtom atom f x y] (.swap atom f x y))
  ([^clojure.lang.IAtom atom f x y & args] (.swap atom f x y args)))

But you get the point.
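
If you'd like to convince yourself that the two forms really do the same thing, here's a quick sketch with a throwaway atom (demo-state is a made-up name, not part of our player):

```clojure
;; A throwaway atom standing in for our real state atom.
(def demo-state (atom {:paused? true}))

;; Long form: wrap the update in an anonymous function.
(swap! demo-state #(assoc % :paused? false))
;; => {:paused? false}

;; Shorthand: swap! passes the current value to assoc as its first argument.
(swap! demo-state assoc :paused? true)
;; => {:paused? true}

;; It works with any function; extra args follow the current value.
(swap! demo-state update :plays (fnil inc 0))
;; => {:paused? true, :plays 1}
```

Each call returns the value that was swapped in, which is handy for poking at state from the REPL.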

Aaaaaanyway, I seem to have digressed—which is firmly on brand for this blog, so I apologise for nothing!

But yeah, at this point, we're back to the functionality that we had before. If we click on a track whilst the player is paused, the new track is selected but doesn't start playing, and if we click on a new track whilst the player is playing, the player plays on by playing the new track. Got it?

However, activate-track! is still relying on the DOM to keep track of whether the player is paused. Let's fix this by checking the state atom instead:

(defn activate-track! [{:keys [number src] :as track}]
  (log "Activating track:" (clj->js track))
  (let [track-spans (seq (.-children (get-el "#tracks")))
        audio-el (get-el "audio")
        ;; Instead of this 👇
        ;; paused? (.-paused audio-el)
        ;; Do this! 👇
        {:keys [paused?]} @state]
      ;; ...
    )
  track)

Next, let's write a function to advance to the next track:

(defn advance-track! []
  (let [{:keys [active-track album]} @state
        {:keys [tracks]} album
        last-track? (= active-track (count tracks))]
    (when-not last-track?
      (activate-track! (nth tracks active-track)))))

Oops, this is relying on :active-track being present in the state atom. Let's put it there in activate-track!:

(defn activate-track! [{:keys [number src] :as track}]
  (log "Activating track:" (clj->js track))
  (let [track-spans (seq (.-children (get-el "#tracks")))
        audio-el (get-el "audio")
        {:keys [paused?]} @state]
    ;; ...
    )
  ;; Swappity swap swap! 👇
  (swap! state assoc :active-track number)
  track)

(comment

  (load-ui! "http://localhost:1341/Garth+Brooks/Fresh+Horses")
  ;; => #<Promise[~]>

  @state
  ;; => {:paused? true,
  ;;     :active-track 1,
  ;;     :album {...}}

)

Now we should be able to actually call advance-track! to, well, advance the track:

(comment

  (load-ui! "http://localhost:1341/Garth+Brooks/Fresh+Horses")
  ;; => #<Promise[~]>

  (:active-track @state)
  ;; => 1

  (advance-track!)
  ;; => {:artist "Garth Brooks", :title "Cowboys And Angels", :number 2, :src "http://localhost:1341/Garth+Brooks/Fresh+Horses/Garth+Brooks+-+Cowboys+and+Angels.mp3"}

  (:active-track @state)
  ;; => 2

)

We will have also seen the highlighted track change when we evaluated the (advance-track!) form! 🎉
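
One thing worth spelling out about advance-track!: track numbers start at 1 but nth counts from 0, so the number of the active track doubles as the index of the track after it. Here's that logic pulled out into a pure function as a sketch; next-track and sample-tracks are hypothetical names for illustration, not part of the player:

```clojure
(defn next-track
  "Returns the track following the 1-based `active-track`, or nil if
  `active-track` is the last one on the album."
  [tracks active-track]
  (when (< active-track (count tracks))
    ;; nth is 0-based, so index `active-track` holds track (inc active-track)
    (nth tracks active-track)))

;; Pretend the album only has three tracks.
(def sample-tracks
  [{:number 1 :title "Track one"}
   {:number 2 :title "Track two"}
   {:number 3 :title "Track three"}])

(:title (next-track sample-tracks 1))
;; => "Track two"

(next-track sample-tracks 3)
;; => nil
```

Keeping this bit pure means we can check the off-by-one at the REPL without loading an album or touching the DOM.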

Is this the end?

What we're building up to is of course the ability to play our album continuously. When one track ends, the next should begin. And our good friend <audio> has just what we need, in the form of the ended event. If we add one line of code to register advance-track! as the listener for the ended event:

(defn load-ui! [dir]
  (p/let [album (load-album (str dir "/album.rss"))]
    (display-album! album)
    (reset! state {:paused? true, :album album})
    (->> album
         :tracks
         first
         activate-track!)
    (.addEventListener (get-el "audio") "play"
                       #(swap! state assoc :paused? false))
    (.addEventListener (get-el "audio") "ended"
                       advance-track!)))

(comment

  (load-ui! "http://localhost:1341/Garth+Brooks/Fresh+Horses")
  ;; => #<Promise[~]>

  ;; Click ▶️ and witness the glory!
)

We win!

Screenshot of our UI with the last track highlighted and the console showing activating track for all previous tracks

Winners who have won before and know how to win will of course know that the best thing to do after winning is to stride triumphantly to the podium, receive your 🥇, wave to your adoring public, soak up the applause like warm sunshine on a July day (unless you're in the southern hemisphere, in which case the warm sunshine is best appreciated in December, unless you're close enough to the equator to appreciate warm sunshine whenever you damn well please, unless you're too close to the equator and that sunshine is too warm to appreciate because you're sweating like wild), and then head home, find a comfy chair and open a bottle of champagne or fizzy water or tasty whiskey or whatever.

I, of course, am no such winner, so instead of retiring to my comfy chair with a glass of Lagavulin, I want to jump ahead in a track, so I confidently reach for the audio control and click ahead in the timeline, and... nothing happens. WTF?

Reading more documentation, I discover that I can see the current time in seconds in the track by reading its currentTime property, and I can seek to an arbitrary time by setting currentTime, so let's give that a try, shall we? (Spoiler: we shall.)

(comment

  (.-currentTime (get-el "audio"))
  ;; => 37.010544

  (set! (.-currentTime (get-el "audio")) 50)
  ;; => nil

  ;; Why did my track start over? 🤬
  
  (.-currentTime (get-el "audio"))
  ;; => 2.006649

)

To make a long story short, this all boils down to how the browser actually implements seeking. When it first loads the audio track, it issues a request like this:

GET /Garth+Brooks/Fresh+Horses/Garth+Brooks+-+The+Old+Stuff.mp3 HTTP/1.1
Range: bytes=0-

and expects a response like this:

HTTP/1.1 206 Partial Content
Accept-Ranges: bytes
Content-length: 1024001
Content-Range: bytes 0-1024000/5943424
Content-Type: audio/mpeg
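
The Content-Range header packs three numbers into one string: the first and last byte of the chunk (both inclusive) and the total size of the file. As a rough sketch of how that could be unpacked, here's a hypothetical parse-content-range helper (not something our player or the browser exposes; it assumes parse-long from Clojure or ClojureScript 1.11 and later):

```clojure
(defn parse-content-range
  "Parses a header value like \"bytes 0-1024000/5943424\" into a map,
  or returns nil if it doesn't match that shape."
  [header]
  (when-let [[_ start end total] (re-matches #"bytes (\d+)-(\d+)/(\d+)" header)]
    (let [start (parse-long start)
          end (parse-long end)]
      {:start start
       :end end
       :total (parse-long total)
       ;; both ends are inclusive, hence the inc
       :chunk-size (inc (- end start))})))

(parse-content-range "bytes 0-1024000/5943424")
;; => {:start 0, :end 1024000, :total 5943424, :chunk-size 1024001}
```

Comparing :end against :total is how a client can tell there's more of the file left to ask for.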

The browser will then buffer the bytes it got back and make the track seekable within those bytes, as described here. You can peer under the hood by inspecting the buffered and seekable properties of the <audio> element:

audio.buffered.length; // returns 2
audio.buffered.start(0); // returns 0
audio.buffered.end(0); // returns 5
audio.buffered.start(1); // returns 15
audio.buffered.end(1); // returns 19

But if we do this in our player, we experience a deep feeling of melancholy:

(comment

  (let [b (-> (get-el "audio")
              (.-buffered))]
    [(.start b 0) (.end b 0)])
  ;; => [0 144.758]

  (-> (get-el "audio")
      (.-seekable)
      (.-length))
  ;; => 1

  (let [s (-> (get-el "audio")
              (.-seekable))]
    [(.start s 0) (.end s 0)])
  ;; => [0 0]

)

The buffering looks fine, but it seems that we can only seek between 0 seconds and 0 seconds in the track, which kinda explains why attempting to set currentTime to any number that isn't 0 results in seeking back to 0. 😭
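
To put that in code: if we model the seekable ranges as plain [start end] pairs of seconds, a seek can only land where some range covers the target time. This is a sketch of that reasoning with a hypothetical seekable? helper, not how the browser actually implements seeking:

```clojure
(defn seekable?
  "True if `t` (in seconds) falls inside one of `ranges`, a collection of
  [start end] pairs like the ones we read from the seekable property."
  [ranges t]
  (boolean (some (fn [[start end]] (<= start t end)) ranges)))

;; What our player reported: a single range from 0 to 0.
(seekable? [[0 0]] 50)
;; => false

(seekable? [[0 0]] 0)
;; => true

;; What we'd hope for on a 252-second track.
(seekable? [[0 252]] 50)
;; => true
```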

Seeking apparently only works if we get that blessed 206 Partial Content response from the webserver, so the browser knows how to make subsequent range requests to buffer more data, and unfortunately, the built-in babashka.http-server that we're using to serve up files in public/ responds like this:

HTTP/1.1 200 OK
Content-length: 5943424
Content-Type: audio/mpeg
Server: http-kit

No partial content?

Screenshot of a chef saying no seek for you, come back 1 year

We may attempt to fix this next time on "Soundcljoud, or a young man's Soundcloud clonejure", that is if there is a next time.

Part 1: Soundcljoud, or a young man's Soundcloud clonejure

June & July 2024 Short-Term Project Updates

We’ve got several updates to share from our Q2 2024 project developers. Check out the latest in their June and July Reports following the project list below.

clj-merge tool: Kurt Harriger
This project focuses on developing a git diff and merge tool for edn and clojure code with the aim of creating a git mergetool that can be used as a replacement for git’s default merge tool for clj(s) and edn files.

Compojure-api: Ambrose Bonnaire-Sergeant
This project will deploy the first new releases since 2019 (and include compojure-api 1.x, 2.0.0-alpha branch, ring-swagger), compojure-api/reitit migration tools, and Swagger 3.0.

Enjure: Janet A. Carr
This project focuses on MVP for the Enjure CLI tool and providing the ability to create new projects and view/controller templates as well as delete templates.

Jank: Jeaye Wilkerson
Jank’s library parity with clojure.core is around 20%. The next step is to fill out the language to make it feel more like Clojure, including lazy sequences, loop/recur, destructuring, symbol interning, and the for and doseq macros.

Lost in Lambduhhs Podcast: L. Jordan Miller
Rejuvenate and streamline production of the Lost In Lambduhhs Podcast, where the audience gets the opportunity to “meet the person behind the Github” - illuminating the personal narratives and insights of tech luminaries, giving them a platform to share their perspectives while promoting their library or tool.

 
 

Clj-merge: Kurt Harriger

Q2 2024 Report No. 2. Published July 1, 2024

Introduction

This tool aims to reduce unnecessary conflicts due to whitespace and syntax peculiarities by using a more semantic approach to diffing and merging. I’m grateful for the support from ClojuristsTogether and the invaluable feedback and support from the Clojure community.

Recent Progress

This month, I focused on the following improvements:

  • Bug Fixes: Several bugs were fixed to enhance the stability of the tool.
  • CI/CD Pipeline: A CI/CD pipeline was added to streamline the installation process and prevent additional regressions.
  • Error Reporting: Simplified error reporting to make it easier for users to provide useful feedback when the tool does not work as expected.

Due to an exceptionally busy schedule, progress on diff visualization and project promotion was limited.

Milestones Overview

The project was structured around several key milestones:

  1. Development of the MVP - Mostly complete
  2. Enhancement of diff handling and presentation - Ongoing
  3. Community engagement and feedback integration - Ongoing
  4. Performance optimization and cross-platform compatibility - Done

Milestone Progress

  1. Development of the MVP

    • Goals: To create a minimal viable product using editscript and rewrite-clj.
    • Recent Updates: Bug fixes were implemented to enhance the stability of the tool.
    • Status: Mostly Complete. From a technical perspective, I have been able to test the feasibility of the implementation and have learned a lot. It’s hard to say when this is “done”; I don’t quite feel ready to push for adoption until more work has been done on the diff visualization.
  2. Enhancement of Diff Handling and Presentation

    • Goals: To improve the readability and utility of diffs for developers.
    • Recent Updates: Limited progress on diff visualization due to time constraints.
    • Status: Much more work still needs to be done here.
  3. Community Engagement and Feedback Integration

    • Goals: To actively engage with the community to gather detailed feedback and real-world merge conflict examples.
    • Recent Updates: Simplified error reporting to facilitate better feedback.
    • Next Steps: Increase efforts to engage the community and aim to present at a Clojure meetup in the near future.
  4. Performance Optimization and Cross-Platform Compatibility

    • Goals: Simplify the installation process.
    • Recent Updates: A CI/CD pipeline was added to streamline the installation process.
    • Status: Done

Conclusion

Thank you for your support and contributions to the clj-mergetool project.


Compojure-api: Ambrose Bonnaire-Sergeant

Q2 2024 Report No. 3. Published July 8, 2024

Last month I successfully added support for compojure-api 1.x coercions in the 2.x branch. This is one of the last steps towards backwards-compatibility of 1.x code using the 2.x branch. I did not make any progress on this front this month, but the remaining steps are starting to crystallize, which I will talk a bit about here.

Implementation-wise, a 1.x coercion is detected with fn?, and implies the Schema backend (Spec support was added in 2.x). Such a coercion is a nested function with the shape request->field->schema->coercer, often implemented like (constantly nil) or (constantly {:body matcher}).
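To make that shape concrete, here is a minimal sketch in plain Clojure. The names int-matcher, legacy-coercion?, and coerce-body are hypothetical and are not part of compojure-api; the sketch only assumes that a map can stand in for the field->matcher step, as the (constantly {:body matcher}) form suggests.

```clojure
;; Hypothetical sketch of the request->field->schema->coercer shape.
;; None of these names come from compojure-api itself.

(defn int-matcher
  "schema -> coercer: returns a coercing fn for schemas it understands, else nil."
  [schema]
  (when (= schema :int)
    (fn [x] (if (string? x) (Long/parseLong x) x))))

;; request -> {field matcher}; a map acts as the field -> matcher function.
(def legacy-coercion (constantly {:body int-matcher}))

(defn legacy-coercion?
  "Functions are treated as 1.x coercions; keywords such as :schema are not fn?."
  [coercion]
  (fn? coercion))

(defn coerce-body
  "Walks request -> field -> schema -> coercer and applies the coercer if any."
  [coercion request schema value]
  (let [matcher (:body (coercion request))
        coercer (when matcher (matcher schema))]
    (if coercer (coercer value) value)))

(coerce-body legacy-coercion {:uri "/"} :int "42")   ;=> 42
(coerce-body (constantly nil) {:uri "/"} :int "42")  ;=> "42" (no coercion)
```

Note that fn? returns false for keywords even though keywords can be called as functions, which is what makes it usable as a discriminator here.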

Several insights during this 3-month project made backwards-compatibility particularly clean.

Once I grokked the main differences between 1.x and 2.x coercions at the end of month 2, I realized that my efforts to restore support for ring-middleware-format were misguided. I could instead translate 1.x coercions to muuntaja’s expected format.

There are two steps to this. The first was to add support for “legacy” coercions in 2.x’s coercion abstraction. This involved changing the coerce-request implementation in compojure.api.coercion.schema.

The second step is to translate ring-middleware-format’s :format options to muuntaja’s :formats. I have only done this for the default options, and currently any custom :format extensions do not work.

One wrinkle that is difficult to reconcile in all cases is that 2.x dropped implicit support for several coercion formats such as yaml. In order to maintain backwards compatibility with 1.x coercions, we want to ensure that yaml formats are supported by default for legacy coercions.

I have not completely solved this problem, but I identified that coercions are usually configured in terms of the api-defaults var, which is a map containing :format in 1.x and :formats in 2.x. In both branches, I introduced a breaking change, renaming api-defaults to api-defaults-v1 and api-defaults-v2, which use :format and :formats respectively. This might help decide whether to include yaml coercion by default, but requires more thought.

Finally, I removed any attempt to add ring-middleware-format support in the 2.x branch since I realized it was unnecessary.

recompojure

As part of this 3-month project, I am releasing recompojure, a library providing compojure-style macros that expand to reitit.

It does not have a stable release yet, but there are some interesting problems to solve.

I worked on recompojure for the first month of this project, but stopped after I ported it out of a corporate repo I prototyped it in. I realized that compojure-api has a different set of features than reitit, and decided that my time would be better served working on compojure-api itself.

One big difference between reitit and compojure-api is that reitit accepts a local configuration (opts) map, whereas compojure-api is extended using global state. To preserve this local configuration style, recompojure is structured as a macro-generating macro that is passed a top-level options map. An additional subtlety is that this options map is needed at compile time, when compojure-api does most of its work.

For example, the load-api call here defines all the compojure-api macros such as GET, POST, context, etc., but their extensions can (eventually) be centralized in the options var. This attempts to address a common concern when using compojure-api, where care must be taken to ensure extensions are loaded before any routing macros are expanded. This style should centralize extensions such that they are deterministic without further safeguards.

(ns com.recompojure.compojure-api1
  "Exposes the API of compojure.api.core v1.1.13 but compiling to reitit."
  (:require [com.recompojure.compojure-api1.impl :as impl]
            [clojure.set :as set]))

(def ^:private options {:impl :compojure-api1})

(impl/load-api `options)

From here, I would like to add compojure-api 2.x support, and fully take advantage of the implicit options map as described. The next big feature would be to reconcile compojure-api and reitit’s middleware support so that compojure-api-style applications can easily be translated to reitit via recompojure. In particular, most compojure-api apps use the api function to create an app, but recompojure does not yet support translating this to reitit.

Project Summary

This 3-month project had two main focuses.

The first half concentrated on performance of the 2.x branch of compojure-api and ensuring stable versions of security fixes were deployed.

My main goal for the second half of this 3-month project was to ease future maintenance of compojure-api by retiring the 1.x branch. That way, features need only be developed in the 2.x branch and can still be enjoyed by 1.x users.

This was much more challenging and onerous than I anticipated, and I would not have been able to invest time in this if Clojurists Together had not funded the project. My main activity was attempting to understand and compare two versions of the same project and reverse-engineer the evolution of the project.

I’d like to thank Clojurists Together for selecting this project for funding.


Enjure: Janet A. Carr

Q2 2024 Report. Published June 12, 2024

Progress has been good. I regularly stream Enjure development to my audience on Twitch, which seems to be bootstrapping Enjure’s Github stars.

Despite the progress, I intentionally expanded the scope of the project by implementing an HTTP router for Enjure. I’m not entirely sure that this was a wise decision, but it adheres to Enjure’s guiding philosophy. Enjure’s HTTP router is implemented with a Radix tree and supports path parameters, in pure Clojure. I’m hoping to come up with a scheme for query, body, path, and form coercion soon, but I haven’t decided on a scheme I like. The router lives in a dynamic var managed by Enjure and is updated by macros for defining pages and controllers.
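For readers who have not seen tree-based routing with path parameters, here is a rough sketch of the idea in pure Clojure. It is not Enjure's implementation: it uses a plain per-segment trie rather than a compressed radix tree, does no backtracking, and the names segments, add-route, and match-route are made up for illustration.

```clojure
(require '[clojure.string :as str])

(defn segments
  "Splits \"/users/42\" into (\"users\" \"42\")."
  [path]
  (remove str/blank? (str/split path #"/")))

(defn add-route
  "Stores handler in a nested map keyed by path segment.
  Segments starting with \":\" are stored under ::param."
  [tree path handler]
  (let [segs   (segments path)
        param? #(str/starts-with? % ":")
        ks     (map #(if (param? %) ::param %) segs)
        names  (mapv #(keyword (subs % 1)) (filter param? segs))]
    (assoc-in tree (concat ks [::leaf])
              {:handler handler :param-names names})))

(defn match-route
  "Walks the tree one segment at a time, preferring static segments
  over parameters. Returns nil when nothing matches."
  [tree path]
  (loop [node tree, segs (segments path), values []]
    (if-let [[seg & more] (seq segs)]
      (cond
        (contains? node seg)     (recur (node seg) more values)
        (contains? node ::param) (recur (node ::param) more (conj values seg))
        :else                    nil)
      (when-let [{:keys [handler param-names]} (::leaf node)]
        {:handler handler
         :path-params (zipmap param-names values)}))))

(def routes
  (-> {}
      (add-route "/users" :list-users)
      (add-route "/users/:user-id" :show-user)))

(match-route routes "/users/42")
;=> {:handler :show-user, :path-params {:user-id "42"}}
```

A real radix tree would additionally merge chains of single-child nodes and share common string prefixes, which this sketch leaves out for brevity.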

Enjure has several macros that enforce a similar convention for defining HTTP resources. Their purpose is to bring a resource’s routes, contract, coercion, and handling expressions together in a single namespace. Clojure web applications are often structured around several libraries. For example, consider an application using Reitit with Ring and next.jdbc with PostgreSQL. The application will likely have its routes in one namespace, its handlers in another, its business logic in another, and its data modification language (DML) in yet another, necessitating opening several source files to accomplish a small task. The routes, contracts, and handlers rarely change, and when they do, the changes are minute. Bringing these together cuts down on the cyclomatic complexity of developing web applications in Clojure. Thanks to homoiconicity, I can create constructs to help with exactly this:

;; Example from Enjure repo
(defpage user "/users/:user-id"
  [req]
  (let [{:keys [path-params]} req
        {:keys [user-id]} path-params]
    (format "<h1>Hello, %s</h1>" user-id)))

This “page” construct is simply a function var under the hood, but it also manages the routing-table var. There’s no middleware required to update the routing table upon REPL reload. Simply evaluating the buffer/namespace will change the routing table. defpage expects a string as its return value, as it’s largely tied to the content type text/html. In the future I hope to have other, similar view constructs to support other popular application MIME types.
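The general technique of a macro that both defines a function var and registers it in a routing table can be sketched as follows. This is only an illustration under assumptions: defpage-sketch, routing-table, and path-for are hypothetical names, and an atom is used here for simplicity where Enjure is described as using a dynamic var.

```clojure
;; Hypothetical sketch, not Enjure's actual defpage.
(defonce routing-table (atom {}))

(defmacro defpage-sketch
  "Defines `name` as an ordinary function and records path -> var
  in routing-table. Re-evaluating the form updates the table."
  [name path args & body]
  `(do
     (defn ~name ~args ~@body)
     (swap! routing-table assoc ~path (var ~name))
     (var ~name)))

(defpage-sketch user-page "/users/:user-id"
  [req]
  (format "<h1>Hello, %s</h1>" (get-in req [:path-params :user-id])))

(defn path-for
  "Reverse lookup: finds the path registered for a page var."
  [page-var]
  (some (fn [[path v]] (when (= v page-var) path)) @routing-table))

(user-page {:path-params {:user-id "ada"}})  ;=> "<h1>Hello, ada</h1>"
(path-for #'user-page)                       ;=> "/users/:user-id"
```

Because the table stores the var rather than the function value, redefining the page at the REPL is picked up without touching the table again.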

Similarly, there are actions, changes, and removals that correspond to the POST, PUT, and DELETE HTTP methods, respectively. Since Enjure places a high emphasis on convention, I’ll only show a simple example of a sign-in action for a user:

(defaction signin "/signin"
  [req]
  (let [{:keys [email password]} (:form-params req)]
    (if (check-db email password)
      (redirect pages/home :see-other) ;; redirects to whatever route the pages/home var has
      (pages/signin req))))            ;; otherwise re-render the sign-in page (a page var)

Since resources in Enjure are just function vars, they can be called directly, and also reverse-routed to using some of the response macros. In the above example, if the sign-in check passes, redirect sends the user to whatever route pages/home has declared. (Reverse-routing is largely for convenience and not mandatory; the redirect macro supports redirecting to static paths/URLs, Enjure resources like pages, and values from functions.)

Ideally, resources would interact with the database through a data model supported by the framework. My ideas for this are still experimental and can be found in the repository under the internal “frm” namespace. Currently, it’s some simple templating of basic queries by querying the information_schema in Postgres, and interning the query functions as vars. These are queries I’ve seen regularly over the years, and I’m sure that, once implemented, they will give developers a boost in productivity. Plus, there’s the added benefit of being decoupled from whatever mechanism I choose for supporting migrations/entities in Enjure (still TBD). However, this model does not alienate developers who opt to create more complex queries, as those are supported as well with next.jdbc.

Another idea I’ve been experimenting with is something I call the ReactiveRecord. ReactiveRecord uses software transactional memory to synchronize with the database, providing an in-memory DB representation. I think, given the functional interface provided by FRM above and the information schema data, it might be interesting, but I do believe this kind of transacting might be a faux pas or even dangerous, so more thought is needed here on my part.

All of this will ideally be controlled with the Enjure CLI. Enjure puts a heavy emphasis on reducing developer friction. Given a base installation of Clojure, installing Enjure should allow for the creation and management of Enjure projects very easily. I’ll admit this is an area I’ve been slacking on a bit, since I wanted to finish other core components first. As of writing this, the Enjure CLI has two basic commands: notes and help. The notes command searches a project for comments containing NOTES, FIXME, TODO, and HACK. The help command just prints out the help dialog. Soon enough the Enjure CLI will support creating and deleting resources, migrations, entities, dependencies, etc., as well as creating new projects. The Enjure CLI can already be installed to a user’s path as a CLI utility written entirely in Clojure.
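A comment scan of the kind the notes command performs could look roughly like the sketch below. This is not the Enjure CLI's code: find-notes and note-pattern are hypothetical names, and the sketch works on a source string rather than walking a project directory.

```clojure
(require '[clojure.string :as str])

;; Matches a Clojure comment (one or more semicolons) containing one of the tags.
(def note-pattern #";+.*\b(NOTES|FIXME|TODO|HACK)\b")

(defn find-notes
  "Returns a seq of {:line :tag :text} maps, one per tagged comment line."
  [source]
  (keep-indexed
   (fn [i line]
     (when-let [[_ tag] (re-find note-pattern line)]
       {:line (inc i) :tag tag :text (str/trim line)}))
   (str/split-lines source)))

(def sample
  "(ns demo)\n;; TODO: add docs\n(defn f [] 1) ; HACK workaround\n(println \"FIXME\")")

(map (juxt :line :tag) (find-notes sample))
;=> ([2 "TODO"] [3 "HACK"])
```

The last line of the sample is not reported because the tag appears in a string rather than in a comment. To scan real files, the same function could be applied to the result of slurp for each file returned by file-seq.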

Finally, documentation of the project has become my lowest priority and is definitely at risk. However, I’m not too concerned about the documentation faltering. In some sense, I’m a technical writer thanks to my blog, so I believe writing documentation for Enjure won’t be as challenging as the rest of the project.


Jank: Jeaye Wilkerson

Q2 2024 Report 3. Published June 30, 2024

Welcome back to another jank development update! For the past month, I’ve been pushing jank closer to production readiness primarily by working on multimethods and by debugging issues with Clang 19 (currently unreleased). Much love to Clojurists Together and all of my Github sponsors for their support this quarter.

Multimethods

I thought, going into this month, that I had a good idea of how multimethods work in Clojure. I figured we define a dispatch function with defmulti:

(defmulti sauce-suggestion ::noodle-type)

Then we define our catch-all method for handling types:

(defmethod sauce-suggestion :default [noodle]
  (println "You can't go wrong with some butter and garlic."))

Then we define some specializations for certain values which come out of our dispatch function.

(defmethod sauce-suggestion ::shell [noodle]
  (println "Cheeeeeeeese!"))

(defmethod sauce-suggestion ::flat-white-rice [noodle]
  (println "Hor fun gravy."))

Then, when you call the sauce-suggestion function, first the dispatch function is called and then the correct method is looked up and called.

(sauce-suggestion {::noodle-type ::shell})
Cheeeeeeeese!

(sauce-suggestion {::noodle-type ::spaghetti})
You can't go wrong with some butter and garlic.

This is as much as I knew. But wait, there’s more!

Hierarchies

It turns out that multimethods match dispatch values based on a couple of different hierarchies, too. If you’re matching actual class types, like String, you could have a method which is parameterized on Object and it will be a catch-all. So this would allow you to match on everything which inherits from IRenderable, for example, and then use that interface to render the object. I wasn’t concerned about this, since jank’s object model isn’t based on inheritance. I figured I could leave this whole feature out of multimethods.
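As a small illustration of class-based matching on the JVM (describe is a made-up multimethod, not something from jank), dispatching on class lets an Object method act as the catch-all while more specific classes win when they have their own method:

```clojure
;; Dispatch on the argument's class; isa? is used to match dispatch values.
(defmulti describe class)

(defmethod describe Object [_] "some object")
(defmethod describe String [_] "a string")

(isa? String Object)  ;=> true
(describe "hi")       ;=> "a string"
(describe 42)         ;=> "some object"
```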

However, it turns out that Clojure supports another form of hierarchies! Even crazier, we have full control over those hierarchies at run-time and we can build as many as we want. Check this out.

; We can classify spaghetti and penne as Italian.
; They will both be considered children of ::italian.
(derive ::spaghetti ::italian)
(derive ::penne ::italian)

; Then we can define a method based on the parent.
(defmethod sauce-suggestion ::italian [noodle]
  (println "Sugo al pomodoro."))

; This allows us to match multiple dispatch values in a
; deterministic and intuitive way.
(sauce-suggestion {::noodle-type ::penne})
Sugo al pomodoro.

There are a handful of related core functions for working with these hierarchies. jank now implements all of them.

  • make-hierarchy
  • isa?
  • parents
  • ancestors
  • descendants
  • derive
  • underive
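To show how these functions fit together, here is a short sketch in JVM Clojure using an explicit hierarchy value instead of the global one. The ::noodle parent is an extra level invented for this example.

```clojure
;; Build a standalone hierarchy; derive returns a new hierarchy each time.
(def noodles
  (-> (make-hierarchy)
      (derive ::spaghetti ::italian)
      (derive ::penne ::italian)
      (derive ::italian ::noodle)))

(isa? noodles ::penne ::noodle)    ;=> true (transitive)
(parents noodles ::penne)          ;=> #{::italian}
(ancestors noodles ::penne)        ;=> #{::italian ::noodle}
(descendants noodles ::italian)    ;=> #{::spaghetti ::penne}

;; underive removes a single parent/child relationship.
(def without-penne (underive noodles ::penne ::italian))
(isa? without-penne ::penne ::italian)  ;=> false
```

A multimethod can be told to use such a hierarchy by passing :hierarchy with a var holding it, for example (defmulti f dispatch-fn :hierarchy #'noodles).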

As I was implementing multimethods, I needed a few more core functions, so those were all implemented as well:

  • hash-set
  • disj
  • defmulti
  • alter-var-root
  • bound?
  • thread-bound?

Notably, this includes bound?, which required me to actually create a dedicated unbound var object so I could distinguish between unbound vars and vars holding nil.
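The distinction is easy to see in JVM Clojure, whose behavior jank is matching here (the var names are illustrative):

```clojure
(def unset-var)       ; declared, but no root value
(def nil-var nil)     ; bound, and the value happens to be nil

(bound? #'unset-var)  ;=> false
(bound? #'nil-var)    ;=> true
(nil? nil-var)        ;=> true
```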

Clang/LLVM 19

Most of my time this past month was not spent developing new features for jank, which is why I only have multimethods and 13 new functions to report. Instead, my time was spent trying to get jank ported over to the latest Clang/LLVM version, which will allow us to leave Cling behind. jank uses these for JIT compiling C++ code and upgrading to the upstream Clang will unlock huge performance wins, make compiling jank easier, and will allow for jank to follow the bleeding edge of the native JIT space. However, before we get there, we have a couple of bugs to get past.

Extern templates

The first bug, which was causing JIT linking issues, I reduced down to a simple test case involving an extern template which is linked either in the current process or in a loaded shared library. Clang will be unable to resolve the address of the definition of that function. As it happens, the fmt library uses this pattern to provide some optimized versions of certain templates. However, we can fortunately work around this, since fmt wraps those definitions in a FMT_HEADER_ONLY preprocessor flag. The relevant fmt source is here.

The process of narrowing this down from the entire jank runtime is cumbersome, ruling out chunks of code at a time while still trying to keep things compiling and correct.

Optimization crash

This is the blocking bug preventing jank from switching to Clang. It only happens in release builds, which also makes it harder to debug. This month, I traced the bug down from a crash in jank all the way to a minimal test case involving assignments with an implicit constructor. However, when testing whether or not the bug existed in Clang 18, I found that it indeed did not. This meant that it’s since been introduced in the yet unreleased Clang 19. So I bisected around 1300 commits, each time requiring a fresh Clang/LLVM compilation and taking ~30m. It was an entire day of all 32 cores on my machine being busy compiling, but fortunately I could script all of the hard work just using some bash. Bisecting allowed me to find the commit which introduced the issue. This has yet to be fixed and I don’t have the expertise to know what’s wrong with that commit, but I’ve provided a test case, pinged the relevant people, and now I’m hoping the real experts can come in for the save.

Clang status

Aside from those two issues, only one of them being a blocker, the port to Clang is ready. In debug builds (which avoid the second bug), jank can pass its full test suite using Clang 19. Even better, some early benchmarking has shown that Clang 19 is more than twice as fast as Cling when it comes to JIT compiling large amounts of generated C++ code (such as all of clojure.core). That will mean faster startup times and shorter REPL iteration loops.

What’s next?

Implementing multimethods identified a couple of issues related to certain sequence types in jank which I’m still investigating. Once those are sorted, I’ll continue working through the requirements to implement clojure.test, which is why I was implementing multimethods in the first place. From there, I can start testing my jank code using more jank code and the dogfooding cycle can really begin. Stay tuned, folks!


Lost in Lambduhhs Podcast: L. Jordan Miller

Q2 2024 Report 2. Published June 27, 2024.

I have made continued progress on my new podcast series, thanks to the support from Clojurists Together. Here are the key milestones I’ve achieved since my last update and my plans moving forward:

Theme Music and Audio Engineering

  • Created Theme Music and Audio Engineering Template: Developed the theme music and an audio engineering template to ensure a consistent and professional sound for each episode.

Riverside.fm Proficiency

  • Learned Riverside.fm Editing Software: Gained proficiency in using Riverside.fm’s editing software and created a workflow for efficiently editing audio.

Episode Releases

  • Released Two Episodes:
    • David Nolen
    • Arne Brasseur

Guest Coordination and Diversity Efforts

  • Gender Diversity Challenge: I am striving to ensure gender diversity in my episodes, which has been challenging.
  • Reached Out to Prospective Guests: Contacted three prospective guests, with two having returned my communications.

Recent Challenges

  • Scheduling Conflicts: Faced scheduling conflicts due to a death in my family followed by getting sick with strep throat. I am now on day 8 of recovering from the sickness and have recordings scheduled for next week.

Next Steps

  • Continue Outreach: Continue to reach out to schedule recordings, ensuring a diverse lineup of guests.
  • Timely Editing and Release: Edit and release episodes in a timely manner, promoting on Clojurians Slack, Clojure Weekly updates, LinkedIn, and Twitter.
  • Expand Promotion Channels: Create a Mastodon account to help promote the podcast.

Conclusion

Despite recent challenges, I am on track with my project timeline and excited about the content I am creating. I will continue to provide updates as I progress further.



Data Manipulation in Clojure Compared to R and Python

I spend a lot of time developing and teaching people about Clojure's open source tools for working with data. Almost everybody who wants to use Clojure for this kind of work is coming from another language ecosystem, usually R or Python. Together with Daniel Slutsky, I'm working on formalizing some of the common teachings into a course. Part of that is providing context for people coming from other ecosystems, including "translations" of how to accomplish data science tasks in Clojure.

As part of this development, I wanted to share an early preview in this blog post. The format is inspired by this great blog post I read a while ago comparing R and Polars side by side (where "R" here refers to the tidyverse, an opinionated collection of R libraries for data science, and realistically mostly dplyr specifically). I'm adding Pandas because it's among the most popular dataset manipulation libraries, and of course Clojure, specifically tablecloth, the primary data manipulation library in our ecosystem.

I'll use the same dataset as the original blog post, the Palmer Penguin dataset. For the sake of simplicity, I saved a copy of the dataset as a CSV file and made it available on this website. I will also refer to the data as a "dataset" throughout this post because that's what Clojure people call a tabular, column-major data structure, but it's the same thing that is variously referred to as a dataframe, data table, or just "data" in other languages. I'm also assuming you know how to install the packages required in the given ecosystems, but any necessary imports or requirements are included in the code snippets the first time they appear. Versions of all languages and libraries used in this post are listed at the end. Here we go!

Reading data

Reading data is straightforward in every language, but as a bonus we want to be able to indicate on the fly which values should be interpreted as "missing", whatever that means in the given libraries. In this dataset, the string "NA" means "missing", so we want to tell the dataset constructor this as soon as possible. Here's the comparison of how to accomplish that in various languages:

Tablecloth

(require '[tablecloth.api :as tc])

(def ds 
  (tc/dataset "https://codewithkira.com/assets/penguins.csv"))

Note that tablecloth interprets the string "NA" as missing (nil, in Clojure) by default.

R

In reality, in R you would get the dataset from the R package that contains the dataset. This is a fairly common practice in R. In order to compare apples to apples, though, here I'll show how to initialize the dataset from a remote CSV file, using the readr package's read_csv, which is part of the tidyverse:

library(tidyverse)

ds <- read_csv("https://codewithkira.com/assets/penguins.csv",
               na = "NA")

Pandas

import pandas as pd

ds = pd.read_csv("https://codewithkira.com/assets/penguins.csv")

Note that pandas has a fairly long list of values it considers NaN already, so we don't need to specify what missing values look like in our case, since "NA" is already in that list.

Polars

import polars as pl

ds = pl.read_csv("https://codewithkira.com/assets/penguins.csv",
                 null_values="NA")

Basic commands to explore the dataset

The first thing people usually want to do with their dataset is see it and poke around a bit. Below is a comparison of how to accomplish basic data exploration tasks using each library.

Operation               | tablecloth                               | dplyr
see first 10 rows       | (tc/head ds 10)                          | head(ds, 10)
see all column names    | (tc/column-names ds)                     | colnames(ds)
select column           | (tc/select-columns ds "year")            | select(ds, year)
select multiple columns | (tc/select-columns ds ["year" "sex"])    | select(ds, year, sex)
select rows             | (tc/select-rows ds #(> (% "year") 2008)) | filter(ds, year > 2008)
sort column             | (tc/order-by ds "year")                  | arrange(ds, year)


Operation               | pandas                 | polars
see first 10 rows       | ds.head(10)            | ds.head(10)
see all column names    | ds.columns             | ds.columns
select column           | ds[["year"]]           | ds.select(pl.col("year"))
select multiple columns | ds[["year", "sex"]]    | ds.select(pl.col("year", "sex"))
select rows             | ds[ds["year"] > 2008]  | ds.filter(pl.col("year") > 2008)
sort column             | ds.sort_values("year") | ds.sort("year")

Note there are some differences in how different libraries sort missing values, for example in tablecloth and polars they are placed at the beginning (so they're at the top when a column is sorted in ascending order and last when descending), but dplyr and pandas place them last (regardless of whether ascending or descending order is specified).

As you can see, these commands are all pretty similar, with the exception of selecting rows in tablecloth. This is a short-hand syntax for writing an anonymous function in Clojure, which is how rows are selected. Being a functional language, functions in Clojure are "first-class", which basically just means they are passed around as arguments willy-nilly, all over the place, all the time. In this case, the second argument to tablecloth's select-rows function is a predicate (a function that returns a boolean) that takes as its argument a dataset row as a map of column names to values. Don't worry, though, tablecloth doesn't process your entire dataset row-wise. Under the hood datasets are highly optimized to perform column-wise operations as fast as possible.
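If the #(...) shorthand is unfamiliar, the following sketch shows it next to the equivalent fn form. It uses plain Clojure maps and filter rather than a tablecloth dataset, and the sample rows are made up for illustration.

```clojure
;; Each row is a map of column name -> value.
(def rows
  [{"species" "Adelie" "year" 2007}
   {"species" "Gentoo" "year" 2009}])

;; #(...) defines an anonymous function; % is its single argument.
(filter #(> (% "year") 2008) rows)
;=> ({"species" "Gentoo", "year" 2009})

;; The same predicate written out in full. A map is itself a function
;; of its keys, so (row "year") looks up the "year" value.
(filter (fn [row] (> (row "year") 2008)) rows)
;=> ({"species" "Gentoo", "year" 2009})
```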

Here's an example of what it looks like to string a couple of these basic dataset exploration operations together, for example in this case to get the bill_length_mm of all penguins with body_mass_g below 3800:

Tablecloth

(-> ds
    (tc/select-rows #(and (% "body_mass_g")
                          (< (% "body_mass_g") 3800)))
    (tc/select-columns "bill_length_mm"))

Note that in tablecloth we have to explicitly omit rows where the value we're filtering by is missing, unlike in other libraries. This is because tablecloth actually uses nil (as opposed to a library-specific construct) to indicate a missing value, and in Clojure nil is not treated as comparable to numbers. If we were to try to compare nil to a number, we would get an exception instead of an answer. Clojure is fundamentally dynamically typed in that it only does type checking at runtime and bindings can refer to values of any type, but it is also strongly typed, as we see here, in the sense that it explicitly avoids implicit type coercion. For example, deciding whether 0 is greater or smaller than nil requires some assumptions, and these are intentionally not baked into the core of Clojure or into tablecloth as a library, as is the case in some other languages and libraries.
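The behavior is easy to reproduce in plain JVM Clojure without any dataset library. In this sketch the function names are illustrative only:

```clojure
(defn lighter-than-3800? [mass]
  (< mass 3800))

;; Guarding against nil first, as the tablecloth predicate above does.
(defn safe-lighter-than-3800? [mass]
  (and (some? mass) (< mass 3800)))

(lighter-than-3800? 3500)        ;=> true
(safe-lighter-than-3800? nil)    ;=> false

;; Comparing nil to a number throws instead of guessing an ordering.
(try
  (lighter-than-3800? nil)
  (catch Exception _ :threw))    ;=> :threw
```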

This example also introduces Clojure's "thread-first" macro. The -> arrow is like R's |> operator or the unix pipe, effectively passing the output of each function in the chain as input to the next. It comes in very handy for data processing code like this.
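A tiny example with numbers, independent of tablecloth, shows what the macro does: each form receives the previous result as its first argument. The symbols f and g in the macroexpand call are placeholders.

```clojure
;; Threaded form...
(-> 5
    (+ 3)
    (* 2))          ;=> 16

;; ...is rewritten at compile time into the nested form:
(* (+ 5 3) 2)       ;=> 16

;; macroexpand shows the rewrite without evaluating anything.
(macroexpand '(-> ds (f 1) (g 2)))
;=> (g (f ds 1) 2)
```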

Here is the equivalent operation in the other libraries:

dplyr

ds |>
    filter(body_mass_g < 3800) |>
    select(bill_length_mm)

Pandas

ds[ds["body_mass_g"] < 3800]["bill_length_mm"]

Polars

ds.filter(pl.col("body_mass_g") < 3800).select(pl.col("bill_length_mm"))

More advanced filtering and selecting

Here is what some more complicated data wrangling looks like across the libraries.

Select all columns except for one

Library    | Code
tablecloth | (tc/select-columns ds (complement #{"year"}))
dplyr      | select(ds, -year)
pandas     | ds.drop(columns=["year"])
polars     | ds.select(pl.exclude("year"))

Another property of functional languages in general, and especially Clojure, is that they really take advantage of the fact that a lot of things are functions that you might not be used to treating like functions. They also leverage function composition to simply combine multiple functions into a single operation.

For example, a set (indicated with the #{} syntax in Clojure) is also a function: called with an argument, it returns that argument if it is a member of the set (a truthy value) and nil otherwise. And complement is a function in clojure.core that effectively inverts the function given to it, so combined, (complement #{"year"}) means "every value that is not in the set #{"year"}", which we can then use as our predicate column selector function to filter out certain columns.
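These pieces can be tried at a REPL with nothing but clojure.core. The vector of column names below is a made-up stand-in for a dataset's columns.

```clojure
;; A set called as a function returns the member, or nil if absent.
(#{"year"} "year")                 ;=> "year"
(#{"year"} "sex")                  ;=> nil

;; complement turns that into a true/false "not in the set" predicate.
((complement #{"year"}) "year")    ;=> false
((complement #{"year"}) "sex")     ;=> true

;; Used as a filter over column names:
(filter (complement #{"year"}) ["species" "year" "sex"])
;=> ("species" "sex")
```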

Select all columns that start with a given string

Library    | Code
tablecloth | (tc/select-columns ds #(str/starts-with? % "bill"))
dplyr      | select(ds, starts_with("bill"))
pandas     | ds.filter(regex="^bill")
polars     | import polars.selectors as cs
           | ds.select(cs.starts_with("bill"))

Select only numeric columns

Library    | Code
tablecloth | (tc/select-columns ds :type/numerical)
dplyr      | select(ds, where(is.numeric))
pandas     | ds.select_dtypes(include='number')
polars     | ds.select(cs.numeric())

The keyword :type/numerical here is a magic keyword that tablecloth knows about and can accept as a column selector. The list of magic keywords that tablecloth knows about is not (yet) documented anywhere, but it is available in the source code.

Filter rows for range of values

Library    | Code
tablecloth | (tc/select-rows ds #(< 3500 (% "body_mass_g" 0) 4000))
dplyr      | filter(ds, between(body_mass_g, 3500, 4000))
pandas     | ds[ds["body_mass_g"].between(3500, 4000)]
polars     | ds.filter(pl.col("body_mass_g").is_between(3500, 4000))

Note here we handle the missing values in the body_mass_g column differently than above, by specifying a default value for the map lookup. We're explicitly telling tablecloth to treat missing values as 0 in this case, which can then be compared to other numbers. This is probably the better way to handle this case, but the method above works, too, plus it gave me the opportunity to soapbox about Clojure types for a moment.

Reshaping the dataset

Tablecloth

(tc/pivot->longer ds 
                  ["bill_length_mm" "bill_depth_mm"
                   "flipper_length_mm" "body_mass_g"]
                  {:target-columns "measurement" :value-column-name "value"})

dplyr

ds |>
    pivot_longer(cols = c(bill_length_mm, bill_depth_mm,
                          flipper_length_mm, body_mass_g),
                 names_to = "measurement",
                 values_to = "value")

Pandas

pd.melt(
    ds, 
    id_vars=ds.columns.drop(["bill_length_mm", "bill_depth_mm", 
                             "flipper_length_mm", "body_mass_g"]), 
    var_name="measurement",
    value_name="value"
)

Polars

ds.unpivot(
     index=set(ds.columns) - set(["bill_length_mm",
                                  "bill_depth_mm",
                                  "flipper_length_mm",
                                  "body_mass_g"]),
     variable_name="measurement",
     value_name="value")

Creating and renaming columns

Adding columns based on some other existing columns

There are many reasons you might want to add columns, and often new columns are combinations of other ones. Here's how you'd generate a new column based on the values in some other columns in each library:

Library    | Code
tablecloth | (require '[tablecloth.column.api :as tcc])
           | (tc/add-columns ds {"ratio" (tcc// (ds "bill_length_mm")
           |                                    (ds "flipper_length_mm"))})
dplyr      | mutate(ds, ratio = bill_length_mm / flipper_length_mm)
pandas     | ds["ratio"] = ds["bill_length_mm"] / ds["flipper_length_mm"]
polars     | ds.with_columns(
           |     (pl.col("bill_length_mm") /
           |      pl.col("flipper_length_mm")).alias("ratio")
           | )

Note that this is where the wheels start to come off if you're not working in a functional way with immutable data structures. Clojure data structures (including tablecloth datasets) are immutable, which is not the case in Pandas. The Pandas code above mutates the dataset in place, so as soon as you do any mutating operations like these, you now have to keep mental track of the state of your dataset, which can quickly lead to high cognitive overhead and lots of incidental complexity.

Renaming columns

Library    | Code
tablecloth | (tc/rename-columns ds {"bill_length_mm" "bill_length"})
dplyr      | rename(ds, bill_length = bill_length_mm)
pandas     | ds.rename(columns={"bill_length_mm": "bill_length"})
polars     | ds.rename({"bill_length_mm": "bill_length"})

Note that the Pandas rename call shown here returns a new DataFrame and leaves ds unchanged unless you pass inplace=True, so it only mutates the dataset if you ask it to. Also, manually specifying every column name transformation is one way to accomplish the task, but it can get tedious when you want to apply the same transformation to every column name, which is fairly common.

Transforming column names

Here's how you would upper case all column names:

Tablecloth

(tc/rename-columns ds :all str/upper-case)

dplyr

rename_with(ds, toupper)

Pandas

ds.columns = ds.columns.str.upper()

Polars

ds.select(pl.all().name.to_uppercase())

Like the other libraries, tablecloth's rename-columns accepts both types of arguments – a simple mapping of old -> new column names, or any column selector and any transformation function. For example, removing the units from each column name would look like this in each language:

Tablecloth

(tc/rename-columns ds #".+_(mm|g)" #(str/replace % #"(.+)_(mm|g)" "$1"))

dplyr

rename_with(ds, ~ str_replace(.x, "^(.+)_(mm|g)$", "\\1"))

Pandas

import re
ds.rename(columns=lambda x: re.sub(r"(.+)_(mm|g)$", r"\1", x))

Polars

ds = ds.rename({
    col: col.replace("_mm", "").replace("_g", "")
    for col in ds.columns
})

Grouping and aggregating

Grouping behaves somewhat unconventionally in tablecloth. Datasets can be grouped by a single column name or a sequence of column names, as in other libraries, but grouping can also be done using any arbitrary function. Grouping in tablecloth also returns a new dataset, similar to dplyr, rather than an abstract intermediate object (as in pandas and polars). Grouped datasets have three columns: the name of the group, the group id, and a column containing a new dataset of the grouped data. Once a dataset is grouped, the group values can be aggregated in a variety of ways. Here are a few examples, with comparisons between libraries:
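To make the "group by any function" idea concrete, here is a small stdlib-only sketch. It is not tablecloth's implementation. The `group_by` helper, the key names `name`, `group_id` and `data`, and the sample rows are all assumptions chosen to mirror the three-column shape described above:

```python
def group_by(rows, key_fn):
    """Group rows by the value of an arbitrary function of each row."""
    groups = {}
    for row in rows:
        groups.setdefault(key_fn(row), []).append(row)
    # One record per group: its name, a numeric id, and the grouped rows.
    return [
        {"name": name, "group_id": group_id, "data": data}
        for group_id, (name, data) in enumerate(groups.items())
    ]


# Made-up sample rows, not real penguin measurements.
penguins = [
    {"species": "Adelie", "body_mass_g": 3750},
    {"species": "Gentoo", "body_mass_g": 5000},
    {"species": "Adelie", "body_mass_g": 3250},
]

# Group by a computed property instead of an existing column.
by_weight_class = group_by(
    penguins, lambda row: "heavy" if row["body_mass_g"] >= 4000 else "light"
)
counts = {group["name"]: len(group["data"]) for group in by_weight_class}
```

Because the grouped result is ordinary data, aggregating it is just another transformation over the `data` entries.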

Summarizing counts

To get the count of each penguin by species:

Tablecloth

(-> ds
    (tc/group-by ["species"])
    (tc/aggregate {"count" tc/row-count}))

dplyr

ds |>
    group_by(species) |>
    summarise(count = n())

Pandas

ds.groupby("species").agg(count=("species", "count"))

Polars

ds.group_by("species").agg(pl.len().alias("count"))

Find the penguin with the lowest body mass by species

Tablecloth

(-> ds
    (tc/group-by ["species"])
    (tc/aggregate {"lowest_body_mass_g" #(->> (% "body_mass_g")
                                              tcc/drop-missing
                                              (apply tcc/min))}))

dplyr

ds |>
    group_by(species) |>
    summarize(lowest_body_mass_g = min(body_mass_g, na.rm = TRUE))

Pandas

ds.groupby("species").agg(
    lowest_body_mass_g=("body_mass_g", lambda x: x.min(skipna=True))
).reset_index()

Polars

ds.group_by("species").agg(
    pl.col("body_mass_g").min().alias("lowest_body_mass_g")
)

Conclusions

As you can see, all of these libraries are perfectly suitable for accomplishing common data manipulation tasks. Choosing a language and library can impact code readability, maintainability, and performance, though, so understanding the differences between available toolkits can help us make better choices.

Clojure's tablecloth emphasizes functional programming concepts and immutability, which can lead to more predictable and re-usable code, at the cost of adopting a potentially new paradigm. Hopefully this comparison serves not only as a translation guide, but also as an intro to the different philosophies underpinning these common data science tools.

Thanks for reading :)

Versions

The code in this post works with the following language and library versions:

Tool        Version
MacOS       Sonoma 14.5
JVM         21.0.2
Clojure     1.11.1
Tablecloth  7.021
R           4.4.1
Tidyverse   2.0.0
Python      3.12.3
Pandas      2.1.4
Polars      1.1.0

Clojure macros continue to surprise me

Clojure macros have two modes: avoid them at all costs/do very basic stuff, or go absolutely crazy.

Here’s the problem: I’m working on Humble UI’s component library, and I wanted to document it. While at it, I figured it could serve as an integration test as well—since I showcase every possible option, why not test it at the same time?

This is what I came up with: I write component code, and in the application, I show a table with the running code on the left and the source on the right:

It was important that code that I show is exactly the same code that I run (otherwise it wouldn’t be a very good test). Like a quine: hey program! Show us your source code!

Simple with Clojure macros, right? Indeed:

(defmacro table [& examples]
  (list 'ui/grid {:cols 2}
    (for [[_ code] (partition 2 examples)]
      (list 'list
        code (pr-str code)))))

This macro accepts the code as an AST and emits a pair: the AST itself (basically a no-op) and a string that the AST is serialized to.

This is what I consider to be a “normal” macro usage. Nothing fancy, just another day at the office.

Unfortunately, this approach reformats code: while in the macro, all we have is an already parsed AST (data structures only, no whitespaces) and we have to pretty-print it from scratch, adding indents and newlines.

I tried a couple of existing formatters (clojure.pprint, zprint, cljfmt) but wasn’t happy with any of them. The problem is tricky—sometimes a vector is just a vector, but sometimes it’s a UI component and shows the structure of the UI.

And then I realized that I was thinking inside the box all the time. We already have the perfect formatting—it’s in the source file!

So what if... No, no, it’s too brittle. We shouldn’t even think about it... But what if...

What if our macro read the source file?

Like, actually went to the file system, opened a file, and read its content? We already have the file name conveniently stored in *file*, and luckily Clojure keeps sources around.

So this is what I ended up with:

(defn slurp-source [file key]
  (let [content      (slurp (io/resource file))
        key-str      (pr-str key)
        idx          (str/index-of content key)
        content-tail (subs content (+ idx (count key-str)))
        reader       (clojure.lang.LineNumberingPushbackReader.
                       (java.io.StringReader.
                         content-tail))
        indent       (re-find #"\s+" content-tail)
        [_ form-str] (read+string reader)]
    (->> form-str
      str/split-lines
      (map #(if (str/starts-with? % indent)
              (subs % (count indent))
              %)))))

Go to a file. Find the string we are interested in. Read the first form after it as a string. Remove common indentation. Render. As a string.
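The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not a port of slurp-source: the balanced-parenthesis reader is naive (it ignores strings and comments), and the sample source text and the function names `read_first_form` and `source_after` are made up:

```python
def read_first_form(text):
    """Return the first balanced (...) form in text.

    Naive on purpose: parentheses inside strings or comments are not handled.
    """
    start = text.index("(")
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return text[start:i + 1]
    raise ValueError("unbalanced form")


def source_after(content, key):
    """Find key, read the first form after it, strip the common indentation."""
    tail = content[content.index(key) + len(key):]
    form = read_first_form(tail)
    before = tail[:tail.index("(")]
    # Indentation is whatever sits between the last newline and the form.
    indent = before.rsplit("\n", 1)[-1] if "\n" in before else ""
    return [
        line[len(indent):] if line.startswith(indent) else line
        for line in form.splitlines()
    ]


# Hypothetical source file content, held in a string to stay self-contained.
SOURCE = '''(table
  "button"
  (ui/button
    {:on-click f}
    "Click me"))
'''

lines = source_after(SOURCE, '"button"')
```

The first line of the form starts right at the opening parenthesis, so only the following lines lose their leading indentation.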

Voilà!

I know it’s bad. I know you shouldn’t do it. I know. I know.

But still. Clojure is the most fun I have ever had with any language. It lets you play with code like never before. Do the craziest, stupidest things. Read the source file of the code you are evaluating? Fetch code from the internet and splice it into the currently running program?

In any other language, this would’ve been a project. You’d need a parser, a build step... Here, it’s just ten lines of code in the vanilla language, no tooling or setup required.

Sometimes, a crazy thing is exactly what you need.

Going to the cinema is a data visualization problem

Do you like going to the cinema? I do. But I also like to know where I am going and which movie I am going to see. But how do you choose?

You can’t go to every cinema’s website. There are just too many. Of course, you might have a favorite one and always go to it, but you won’t know what you are missing out on.

Then, there are aggregators. The idea is good: gather everything that’s playing in cinemas right now in one place. Flight aggregators, but for movies.

Implementation, unfortunately, is not that good. As with any other website, the aggregator’s goal is to make you go through as many web pages as possible, do as many clicks as possible, and show you as many ads as possible.

Please use an ad blocker, this is unbearable

They even play a freaking TV ad in place of a movie trailer!

Information architecture can be weird too:

kino.de, auto-translated from German

Should I go to “Movies” or “Cinema Programme”? Should I select “Currently in Cinema” or “New in Cinema”?

So I decided to take matters into my own hands and build a cinema selection website I always dreamed of.

Meet allekinos.de:

So what is it?

It’s a website that shows every movie screening in every cinema across all of Germany.

And when I say EVERY screening, I mean it:

Every screening, every cinema, every movie. All in one long HTML table.

What else can it do?

Just filter. You can filter:

  • by city,
  • by city district (don’t want to travel too far),
  • by a particular cinema (maybe you have a favorite one),
  • by genre (want to see something with your kid but don’t know what),
  • or by movie (which cities is it still playing in?).

That’s it. That’s the site.

Oh, we also have a list of premieres so you would know what’s coming. But that’s it.

What about the interface?

There isn’t one. I mean, there is, of course, but I tried to make it as invisible as possible. There’s no logo. No menu. No footer. No pagination. No “See more”. No cookie banners (because no cookies). No ChatGPT/SEO generated bullshit. No ads, of course.

Why? Because people don’t care about that stuff. They care about function. And our UI is a pure function.

But how do I search?

Well, Ctrl+F, of course. We are too humble, too lazy, and too smart to try to compete with in-browser implementation.

Wait, what about page size?

It’s totally fine. I mean, for Berlin, for example, we serve 1.4 MB of HTML. 3 MB with posters. It’s fine.

Slack loads 50 MB (yes, MEGA bytes) to show you a list of 10 chats. AirBnB loads 15 MB, including 500 KB HTML, just to show 20 images. LinkedIn loads 1.5 MB of just HTML (37 MB total) for a fraction of the data we’re showing. So we are fine.

It’s kind of refreshing, actually, the kind of speed you get from a table with a thousand rows. It sounds like a lot, but it still feels faster than anything on the modern web.

What about mobile?

That is a good question. I am still thinking about it.

The table trick won’t work on mobile. So layout needs to be different, but I also want it to have the same information density as the desktop, which is tricky.

If you just make the table vertical, it’ll be too much to scroll even for people with the strongest fingers. Maybe I’ll figure something out one day.

What’s under the hood?

DataScript.

When I looked at the data, I realized it’s multidimensional: there are movies, they have genres, years, countries, languages, there are cinemas, which are located in districts, which are located in cities, then there are showings, which have day and time, and very possibly something else will come up later, too.

Now, I had no idea how that data would be accessed. Is the cinema part of the movie or is the movie part of the cinema? So I decided to make it all flat and put it into the database.

And it worked! It worked remarkably well. Because DataScript queries are plain data, I can build them on the fly:

(defn search [{:keys [city cinema district movie genre]}]
  (let [inputs   
        (cond-> [['$ db]]
          city     (conj ['?city     city])
          cinema   (conj ['?cinema   cinema])
          district (conj ['?district district])
          movie    (conj ['?movie    movie])
          genre    (conj ['?genre    genre]))
      
        where
        (cond-> [:where]
          city     (conj '(or
                            [?cinema :cinema/city ?city]
                            [?cinema :cinema/area ?city]))
          cinema   (conj '[?cinema :cinema/title ?cinema-title])
          district (conj '[?cinema :cinema/district ?district])
          movie    (conj '[?movie :movie/title ?movie-title])
          genre    (conj '[?movie :movie/genre ?genre]))]

    (apply ds/q
      (concat
        '[:find ?show ?date ?time ?url ?cinema ?version ?movie
          :keys  id    date  time  url  cinema  version  movie
          :in]
        (map first inputs)
        where 
        '[[?show    :show/cinema         ?cinema]
          [?show    :show/date           ?date]
          [?show    :show/time           ?time]
          [?show    :show/url            ?url]
          [?show    :show/movie-version  ?version]
          [?version :movie-version/movie ?movie]])
      (map second inputs))))
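The same "query as data" pattern can be sketched outside Clojure. The Python below only assembles a query description from optional filters. It does not talk to DataScript, and the `build_query` name, the dict layout, and the `"db"` placeholder are assumptions for illustration:

```python
def build_query(city=None, genre=None):
    """Assemble a query description and its arguments from optional filters."""
    inputs = [("$", "db")]  # "db" is a placeholder for the database value
    where = []
    if city is not None:
        inputs.append(("?city", city))
        where.append(["?cinema", ":cinema/city", "?city"])
    if genre is not None:
        inputs.append(("?genre", genre))
        where.append(["?movie", ":movie/genre", "?genre"])
    # Clauses that are always present go last, after the optional ones.
    where.append(["?show", ":show/cinema", "?cinema"])
    query = {
        "find": ["?show"],
        "in": [name for name, _ in inputs],
        "where": where,
    }
    args = [value for _, value in inputs]
    return query, args


query, args = build_query(city="Berlin")
```

Keeping each input name next to its value in one list means the `in` clause and the argument list cannot drift out of order, which is the role the `inputs` vector plays in the code above.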

The whole database is around 11 MB, basically nothing. I don’t even bother with proper storage, I just serialize the whole thing to a single JSON file every time it updates.

The hosting

I have been building websites for a while. I have two (Grumpy and this blog) running right now on my own server. I already spent my time, I have figured this all out. I have all the templates at my fingertips.

But for allekinos.de I decided to try something different: application.garden.

It’s a hosting service for small Clojure web apps (still in private beta) that’s supposed to take care of insignificant details for you and let you focus on your app first and foremost.

And it works! It’s refreshingly simple: you download a single binary that operates as a command-line tool, create a garden.edn file with your project’s name, and call garden deploy. That’s it! Your app is live!

No, seriously. You tend to forget how many annoying small details there are before other people can use your app. But when something like Garden takes them away, you remember and get blown away again! If that’s what Heroku used to feel like back in the day, I’m all in for it.

The beauty of Garden is that it helps you start fast, but it’s not a toy. It easily scales all the way up to production. Custom domain, HTTPS, auth, cron, logs, persistent storage: they take care of all of this for you.

And a cherry on top: they even provide nREPL to production! Again, no setup, just garden repl and you are in! Perfect for debugging weird performance issues or running one-off jobs.

An example: when I implemented premieres and committed the code, I still needed to run it for the first time. Instead of making a special flag or endpoint or adding and then immediately removing the startup code, I just connected to remote nREPL and invoked the function in the code. It doesn’t get easier than that!

Uncharacteristic of me, but I kind of enjoy building web apps again, when it’s that simple. Might build more in the future.

Conclusion

In the beginning, I wanted a simple website that solved my problem. I wanted a website that I’d enjoy using.

But I don’t want to make a product out of it. We have enough products already. It’s time someone took a user’s side. And I am one of the users.

Magic things happen when you trust your users and just show them everything you’ve got.

For example, I found some rare films playing that I had no idea about. Matrix in German (!), but once a week and only in one cinema. Or Mars Express, they play it in three cities only, excluding mine. How do you find out about stuff like this?

Here, I discovered it. I looked at the data and started seeing stuff that otherwise is completely invisible.

Anyway, enjoy. If this becomes a trend, I’m all in for it. Wouldn’t mind seeing more sites like this in the future.

Clojure AntiPatterns: the with-retry macro

Most Clojurians write only good things about Clojure. I decided to start sharing techniques and patterns that I consider bad practices. We still have plenty of them in Clojure projects, unfortunately.

My first candidate is a widely used, casual macro called with-retry:

(defmacro with-retry [[attempts timeout] & body]
  `(loop [n# ~attempts]
     (let [[e# result#]
           (try
             [nil (do ~@body)]
             (catch Throwable e#
               [e# nil]))]
       (cond

         (nil? e#)
         result#

         (> n# 0)
         (do
           (Thread/sleep ~timeout)
           (recur (dec n#)))

         :else
         (throw (new Exception "all attempts exhausted" e#))))))

This is a very basic implementation. It catches all possible exceptions, has a fixed number of attempts, and uses a constant delay. Typical usage:

(with-retry [3 2000]
  (get-file-from-network "/path/to/file.txt"))

Should the network blink, you’ll most likely get the file anyway.

Clojure people who don’t like macros write a function like this:

(defn with-retry [[attempts timeout] func]
  (loop [n attempts]
    (let [[e result]
          (try
            [nil (func)]
            (catch Throwable e
              [e nil]))]
      (cond

        (nil? e)
        result

        (> n 0)
        (do
          (Thread/sleep timeout)
          (recur (dec n)))

        :else
        (throw (new Exception "all attempts exhausted" e))))))

It acts the same but accepts a function rather than arbitrary code. A form can easily be turned into a function by putting a sharp sign in front of it. After all, it looks almost the same:

(with-retry [3 2000]
  #(get-file-from-network "/path/to/file.txt"))

Although it is considered a good practice, here is the outcome of using it in production.

Practice shows that even if you wrap something in that macro, you usually cannot recover from a failure anyway. Imagine you’re downloading a file from S3 and pass the wrong credentials. You cannot recover no matter how many times you retry. Wrong creds remain wrong forever. Now imagine the file is missing: again, no matter how hard you retry, it’s all in vain and you only waste resources. Should you put a file into S3 and submit the wrong headers, it’s the same. If your network is misconfigured, or some resources are blocked, or you have no permissions, it’s the same again: no matter how long you keep trying, it’s useless.

There might be dozens of reasons when your request fails, and there is no way to recover. Instead of invoking a resource again and again, you must investigate what went wrong.

There might be some rare cases which are worth retrying though. One of them is an IOException caused by a network blink. But in fact, modern HTTP clients already handle it for you. If you GET a resource and receive an IOException, most likely your client has already done three attempts silently with growing timeouts. By wrapping the call with-retry, you perform 9 attempts or so under the hood.

Another case might be the 429 error code, which signals rate limiting on the server side. Personally, I don’t think that a slight delay helps. Most likely you need to bump the limits, rotate API keys and so on, rather than call Thread.sleep in the middle of the code.

I’ve seen terrible usage of the with-retry macro across various projects. One developer specified 10 attempts with a 10-second timeout to be sure of reaching a remote API. But in fact he was calling the wrong API handler.

Another developer nested two with-retry forms. They belonged to different functions and thus could not be seen at once. I’m reproducing a simplified version:

(with-retry [4 1000]
  (do-this ...)
  (do-that ...)
  (with-retry [3 2000]
    (do-something-else...)))

According to math, 4 times 3 is 12. When the (do-something-else) function failed, the whole top-level block started again. It led to 12 executions in total with terrible side effects and logs which I could not investigate.
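The multiplication is easy to reproduce with a counter. The sketch below is a simplified Python stand-in, not the macro itself: it leaves out the sleep, and it treats `attempts` as the total number of tries, an assumption that matches the arithmetic above. The function names are hypothetical:

```python
def with_retry(attempts, func):
    """Call func up to `attempts` times in total; no delay between tries."""
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except Exception as error:  # catches everything, like the macro does
            last_error = error
    raise RuntimeError("all attempts exhausted") from last_error


calls = []  # one entry per execution of the innermost function


def do_something_else():
    """Hypothetical operation that always fails."""
    calls.append(1)
    raise OSError("always fails")


def inner():
    # The nested retry lives in another function, so it is easy to miss.
    return with_retry(3, do_something_else)


try:
    with_retry(4, inner)
except RuntimeError:
    pass  # every attempt failed, as expected

total_calls = len(calls)  # 4 outer tries * 3 inner tries = 12
```

Every extra layer of retrying multiplies the number of executions, and with a delay between tries the total waiting time grows the same way.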

One more case: a developer wrapped a chunk of logic that inserted something into the database. He messed up the foreign keys so the records could not be stored. Postgres replied with a “foreign key constraint violation” error, yet the macro tried to store them three times before failing completely. Three broken SQL invocations… for what? Why?

So. Whenever you use with-retry, most likely it’s a bad sign. Most often you cannot recover from a failure no matter if you add two numbers, upload a file, or write into a database. You should only retry in certain situations like IOException or rate limiting. But even those cases are questionable and might be mitigated with no retrying.
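If a retry is kept at all, it can at least be restricted to errors that might be transient. A minimal Python sketch of that idea follows. The `retry_on` helper and both download functions are hypothetical, and delays and backoff are left out:

```python
def retry_on(retryable, attempts, func):
    """Call func up to `attempts` times, retrying only on `retryable` errors."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retryable:
            if attempt == attempts:
                raise  # out of attempts: surface the last transient error
        # Any other exception type propagates immediately, without a retry.


flaky_calls = []


def flaky_download():
    """Hypothetical call that fails twice with a transient error, then works."""
    flaky_calls.append(1)
    if len(flaky_calls) < 3:
        raise ConnectionError("network blink")
    return "file contents"


denied_calls = []


def forbidden_download():
    """Hypothetical call that fails in a way no retry can fix."""
    denied_calls.append(1)
    raise PermissionError("wrong credentials")


result = retry_on((ConnectionError, TimeoutError), 3, flaky_download)

try:
    retry_on((ConnectionError, TimeoutError), 3, forbidden_download)
except PermissionError:
    pass  # failed fast after a single call
```

The permission error surfaces after one call, so the real problem shows up in the logs right away instead of after several wasted attempts.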

Next time you’re going to cover a block of logic with-retry, think hard if you really need to retry. Will it really help in case of wrong creds, a missing file, incorrect signature or similar things? Perhaps not. Thus, don’t retry in vain. Just fail and write detailed logs. Then find the real problem, fix it and let it never happen again.

Go Julia!

Last week two new language bindings were added to the YAMLScript family: Go and Julia.

Go

The Go binding has been a long time coming. Several people have been working on it this year but it was Andrew Pam who finally got it over the finish line.

Go is a big user of the YAML data language, so we're happy to be able to provide this library and hope to see it used in many Go projects.

Julia

The Julia binding was a bit more of a recent surprise addition. A few weeks ago a Julia hacker dropped by the YAML Chat Room to ask some questions about YAML. I ended up asking him more about Julia and if he could help write a YAMLScript binding.

He invited Kenta Murata to the chat room and Kenta said he could do it for us. Then Kenta disappeared for a few weeks. Last week he came back with a fully working Julia binding for YAMLScript!

Fun fact: Julia is Clark Evans's favorite programming language! Clark is one of the original authors of the YAML data language.

YAMLScript Loader Libraries

These YAMLScript language bindings are intended to be an alternative YAML loader library for the respective languages. They can load normal existing YAML files in a consistent way, with a common API across all languages. They can also load YAML files with embedded YAMLScript code, to achieve data importing, transformation, interpolation; anything a programming language can do.

The current list of YAMLScript loader libraries is:

Join the Fun!

If your language is missing a YAMLScript binding or you want to help improve one, please drop by the YAMLScript Chat Room and we'll get you started.

All of the bindings are part of the YAMLScript Mono-Repo on GitHub. If you look at the existing bindings, you'll see that they are all quite small. You'll need to learn about basic FFI (Foreign Function Interface) for your language, to make calls to the YAMLScript shared library libyamlscript, but that's about it.

It's a great way to get started with a new language project.

Some Future Plans

There's a lot of upcoming work planned for YAMLScript. I've mapped some of it out in the YAMLScript Roadmap.

Currently YAMLScript (written in Clojure, which compiles to JVM bytecode, which…) compiles to a native binary interpreter using the GraalVM native-image compiler. This is great for performance and distribution, but it's not great for portability, limiting it to Linux, MacOS and Windows.

The JVM is a great platform for portability, so we're planning to make a JVM version of the ys YAMLScript interpreter. Of course, having YAMLScript available as a JVM language is also a good thing for Linux, MacOS and Windows users.

We also want to make WebAssembly, JavaScript and C++ versions of the YAMLScript interpreter.

And of course we still want to get to our goal of 42 language bindings!!!

Lots of fun stuff to explore!

Clojure Deref (July 17, 2024)

Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem (feed: RSS). Thanks to Anton Fonarev for link aggregation.

Libraries and Tools

New releases and tools this week:

  • pg2 0.1.15 - A fast PostgreSQL driver for Clojure

  • yamlscript 0.1.66 - Programming in YAML

  • expose-api 0.3.0 - A Clojure library designed to simplify the process of creating public-facing API namespaces

  • datomic-gcp-tf - Terraform module to run Datomic on GCP

  • clong 1.4 - A wrapper for libclang and a generator that can turn c header files into clojure apis

  • tools.build 0.10.5 - Clojure builds as Clojure programs

  • deep-diamond 0.29.4 - A fast Clojure Tensor & Deep Learning library

  • adorn 0.1.131-alpha - Extensible conversion of Clojure code to Hiccup forms

  • calva 2.0.467 - Clojure & ClojureScript Interactive Programming for VS Code

  • http-server 0.1.13 - Serve static assets

  • squint 0.8.113 - Light-weight ClojureScript dialect

  • overarch 0.27.0 - A data driven description of software architecture based on UML and the C4 model

  • hanamicloth 1-alpha4-SNAPSHOT - Easy layered graphics with Hanami & Tablecloth

  • clay 2-beta12 - A tiny Clojure tool for dynamic workflow of data visualization and literate programming

  • polylith 0.2.20 - A tool used to develop Polylith based architectures in Clojure

Copyright © 2009, Planet Clojure. No rights reserved.
Planet Clojure is maintained by Baishamapayan Ghose.
Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
Theme by Brajeshwar.