CouchGO! — Enhancing CouchDB with Query Server Written in Go

Over the past month, I’ve been actively working on proof-of-concept projects related to CouchDB, exploring its features and preparing for future tasks. During this period, I’ve gone through the CouchDB documentation multiple times to make sure I understand how everything works. While reading, I came across a statement that, although CouchDB ships with a default Query Server written in JavaScript, creating a custom implementation is relatively simple, and custom solutions already exist in the wild.

I did some quick research and found implementations written in Python, Ruby, and Clojure. Since the whole implementation didn’t seem too long, I decided to experiment with CouchDB by writing my own custom Query Server. For this, I chose Go. I haven’t had much experience with the language before, apart from using Go templates in Helm charts, but I wanted to try something new and thought this project would be a great opportunity to do so.

Understanding the Query Server

Before starting work, I revisited the CouchDB documentation once more to understand how the Query Server actually works. According to the documentation, the high-level overview of the Query Server is quite simple:

The Query server is an external process that communicates with CouchDB via the JSON protocol over a stdio interface and handles all design function calls […].

The structure of the commands sent by CouchDB to the Query Server can be expressed as [<command>, <*arguments>] or ["ddoc", <design_doc_id>, [<subcommand>, <funcname>], [<argument1>, <argument2>, …]] in the case of design documents.

So basically, what I had to do was write an application capable of parsing this kind of JSON from stdin, performing the expected operations, and returning responses as specified in the documentation. There was a lot of type casting involved to handle the wide range of commands in Go code. Specific details about each command can be found in the Query Server Protocol section of the documentation.
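To make the protocol concrete, here is a minimal, hypothetical sketch of the stdio loop in Go. This is not CouchGO!'s actual source: the `reset` command and its bare `true` response come from the protocol documentation, while `parseCommand` is a name invented for this example.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"strings"
)

// parseCommand decodes one protocol line into the command name and its
// raw arguments; casting the arguments into concrete types happens
// later, per command.
func parseCommand(line string) (string, []json.RawMessage, error) {
	var msg []json.RawMessage
	if err := json.Unmarshal([]byte(line), &msg); err != nil {
		return "", nil, err
	}
	if len(msg) == 0 {
		return "", nil, fmt.Errorf("empty command")
	}
	var cmd string
	if err := json.Unmarshal(msg[0], &cmd); err != nil {
		return "", nil, err
	}
	return cmd, msg[1:], nil
}

func main() {
	// The real server reads os.Stdin; a fixed sample line is used here
	// so the sketch runs standalone.
	input := strings.NewReader(`["reset", {"reduce_limit": true}]` + "\n")
	scanner := bufio.NewScanner(input)
	out := json.NewEncoder(os.Stdout)
	for scanner.Scan() {
		cmd, _, err := parseCommand(scanner.Text())
		if err != nil {
			out.Encode([]interface{}{"error", "parse_error", err.Error()})
			continue
		}
		switch cmd {
		case "reset":
			out.Encode(true) // the protocol expects a bare true here
		default:
			fmt.Fprintln(os.Stderr, "unhandled command:", cmd)
		}
	}
}
```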

One problem I faced here was that the Query Server should be able to interpret and execute arbitrary code provided in design documents. Since Go is a compiled language, I expected to get stuck at this point. Thankfully, I quickly found the Yaegi package, which can interpret Go code with ease. It lets you create a sandbox and control which packages can be imported by the interpreted code. In my case, I decided to expose only my own package, called couchgo, but other standard packages can easily be added as well.

Introducing CouchGO!

As a result of my work, an application called CouchGO! emerged. Although it follows the Query Server Protocol, it is not a one-to-one reimplementation of the JavaScript version as it has its own approaches to handling design document functions.

For example, in CouchGO!, there is no helper function like emit. To emit values, you simply return them from the map function. Additionally, each function in the design document follows the same pattern: it has only one argument, which is an object containing function-specific properties, and is supposed to return only one value as a result. This value doesn't have to be a primitive; depending on the function, it may be an object, a map, or even an error.

To start working with CouchGO!, you just need to download the executable binary from my GitHub repository, place it somewhere in the CouchDB instance, and add an environment variable that allows CouchDB to start the CouchGO! process.

For instance, if you place the couchgo executable in the /opt/couchdb/bin directory, you would add the following environment variable:

export COUCHDB_QUERY_SERVER_GO="/opt/couchdb/bin/couchgo"

Writing Functions with CouchGO!

To gain a quick understanding of how to write functions with CouchGO!, let’s explore the following function interface:

func Func(args couchgo.FuncInput) couchgo.FuncOutput { ... }

Each function in CouchGO! will follow this pattern, where Func is replaced with the appropriate function name. Currently, CouchGO! supports the following function types:

  • Map
  • Reduce
  • Filter
  • Update
  • Validate (validate_doc_update)

Let’s examine an example design document that specifies a view with map and reduce functions, as well as a validate_doc_update function. Additionally, we need to specify that we are using Go as the language.

{
  "_id": "_design/ddoc-go",
  "views": {
    "view": {
      "map": "func Map(args couchgo.MapInput) couchgo.MapOutput {\n\tout := couchgo.MapOutput{}\n\tout = append(out, [2]interface{}{args.Doc[\"_id\"], 1})\n\tout = append(out, [2]interface{}{args.Doc[\"_id\"], 2})\n\tout = append(out, [2]interface{}{args.Doc[\"_id\"], 3})\n\t\n\treturn out\n}",
      "reduce": "func Reduce(args couchgo.ReduceInput) couchgo.ReduceOutput {\n\tout := 0.0\n\n\tfor _, value := range args.Values {\n\t\tout += value.(float64)\n\t}\n\n\treturn out\n}"
    }
  },
  "validate_doc_update": "func Validate(args couchgo.ValidateInput) couchgo.ValidateOutput {\n\tif args.NewDoc[\"type\"] == \"post\" {\n\t\tif args.NewDoc[\"title\"] == nil || args.NewDoc[\"content\"] == nil {\n\t\t\treturn couchgo.ForbiddenError{Message: \"Title and content are required\"}\n\t\t}\n\n\t\treturn nil\n\t}\n\n\tif args.NewDoc[\"type\"] == \"comment\" {\n\t\tif args.NewDoc[\"post\"] == nil || args.NewDoc[\"author\"] == nil || args.NewDoc[\"content\"] == nil {\n\t\t\treturn couchgo.ForbiddenError{Message: \"Post, author, and content are required\"}\n\t\t}\n\n\t\treturn nil\n\t}\n\n\tif args.NewDoc[\"type\"] == \"user\" {\n\t\tif args.NewDoc[\"username\"] == nil || args.NewDoc[\"email\"] == nil {\n\t\t\treturn couchgo.ForbiddenError{Message: \"Username and email are required\"}\n\t\t}\n\n\t\treturn nil\n\t}\n\n\treturn couchgo.ForbiddenError{Message: \"Invalid document type\"}\n}",
  "language": "go"
}

Now, let’s break down each function starting with the map function:

func Map(args couchgo.MapInput) couchgo.MapOutput {
  out := couchgo.MapOutput{}
  out = append(out, [2]interface{}{args.Doc["_id"], 1})
  out = append(out, [2]interface{}{args.Doc["_id"], 2})
  out = append(out, [2]interface{}{args.Doc["_id"], 3})

  return out
}

In CouchGO!, there is no emit function; instead, you return a slice of key-value tuples where both key and value can be of any type. The document object isn't directly passed to the function as in JavaScript; rather, it's wrapped in an object. The document itself is simply a hashmap of various values.

Next, let’s examine the reduce function:

func Reduce(args couchgo.ReduceInput) couchgo.ReduceOutput {
  out := 0.0
  for _, value := range args.Values {
    out += value.(float64)
  }
  return out
}

Similar to JavaScript, the reduce function in CouchGO! takes keys, values, and a rereduce parameter, all wrapped into a single object. This function should return a single value of any type that represents the result of the reduction operation.

Finally, let’s look at the Validate function, which corresponds to the validate_doc_update property:

func Validate(args couchgo.ValidateInput) couchgo.ValidateOutput {
  if args.NewDoc["type"] == "post" {
    if args.NewDoc["title"] == nil || args.NewDoc["content"] == nil {
      return couchgo.ForbiddenError{Message: "Title and content are required"}
    }

    return nil
  }

  if args.NewDoc["type"] == "comment" {
    if args.NewDoc["post"] == nil || args.NewDoc["author"] == nil || args.NewDoc["content"] == nil {
      return couchgo.ForbiddenError{Message: "Post, author, and content are required"}
    }

    return nil
  }

  if args.NewDoc["type"] == "user" {
    if args.NewDoc["username"] == nil || args.NewDoc["email"] == nil {
      return couchgo.ForbiddenError{Message: "Username and email are required"}
    }

    return nil
  }

  return couchgo.ForbiddenError{Message: "Invalid document type"}
}

In this function, we receive parameters such as the new document, old document, user context, and security object, all wrapped into one object passed as a function argument. Here, we’re expected to validate whether the document can be updated and return an error if not. Similar to the JavaScript version, we can return two types of errors: ForbiddenError or UnauthorizedError. If the document can be updated, we should return nil.

More detailed examples can be found in my GitHub repository. One important thing to note is that the function names are not arbitrary; they should always match the type of function they represent: Map, Reduce, Filter, and so on.

CouchGO! Performance

Even though writing my own Query Server was a really fun experience, it wouldn’t make much sense if I didn’t compare it with existing solutions. So, I prepared a few simple tests in a Docker container to check how much faster CouchGO! can:

  • Index 100k documents (indexing in CouchDB means executing map functions from views)
  • Execute reduce function for 100k documents
  • Filter change feed for 100k documents
  • Perform update function for 1k requests

I seeded the database with the expected number of documents and, using dedicated shell scripts, either measured response times or computed the elapsed time from timestamp logs emitted by the Docker container. The details of the implementation can be found in my GitHub repository. The results are presented in the table below.
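The scripts themselves live in the repository; as a rough illustration of the timestamp-diff technique (the timestamps below are made up for this example), elapsed time can be computed from two log timestamps like this:

```shell
#!/bin/sh
# Rough illustration of measuring elapsed time from two container
# log timestamps (uses GNU date's -d flag).
start="2024-06-13T10:00:00"
end="2024-06-13T10:02:21"

start_s=$(date -d "$start" +%s)
end_s=$(date -d "$end" +%s)

echo "elapsed: $((end_s - start_s))s"   # elapsed: 141s
```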

Test       CouchGO!   CouchJS    Boost
Indexing   141.713s   421.529s   2.97x
Reducing   7.672s     15.642s    2.04x
Filtering  28.928s    80.594s    2.79x
Updating   7.742s     9.661s     1.25x

As you can see, the boost over the JavaScript implementation is significant: almost three times faster for indexing, and more than twice as fast for the reduce and filter functions. The gain is relatively small for update functions, but they are still faster than in JavaScript.

Conclusion

As the documentation promised, writing a custom Query Server wasn’t that hard when following the Query Server Protocol. Even though CouchGO! still lacks a few features (mostly deprecated ones), even at this early stage of development it provides a significant boost over the JavaScript version. I believe there is still plenty of room for improvement.

If you need all the code from this article in one place, you can find it in my GitHub repository.

Thank you for reading this article. I would love to hear your thoughts about this solution. Would you use it with your CouchDB instance, or maybe you already use some custom-made Query Server? I would appreciate hearing about it in the comments.

Don’t forget to check out my other articles for more tips, insights, and other parts of this series as they are created. Happy hacking!

Permalink

Clojure 1.12.0-beta1

Clojure 1.12.0-beta1 is now available! Find download and usage information on the Downloads page.

Changes in 1.12 features:

  • CLJ-2853 Reflection error incorrectly reported target object type, not qualifying class

  • CLJ-2859 Expand scope of FI adapting to include Supplier (and other 0 arg FI)

  • CLJ-2858 Fix encoding of FnInvoker method for prim-returning FIs with arity > 2

  • CLJ-2864 Stop using truthy return logic in FI adapters

  • CLJ-2863 Reflective FI dynamic proxy should use runtime classloader

  • CLJ-2770 invoke-tool - remove external process name parameter (this is a runtime property)

Enhancements:

  • CLJ-2645 PrintWriter-on now supports auto-flush, and prepl uses it for the err stream

  • CLJ-2698 defprotocol - ignore unused primitive return type hints

  • CLJ-1385 transient - include usage model from reference docs

Permalink

Humble Chronicles: The Inescapable Objects

In HumbleUI, there is a full-fledged OOP system that powers lower-level component instances. Sacrilegious, I know, in Clojure we are not supposed to talk about it. But...

Look. Components (we call them Nodes in Humble UI because they serve the same purpose as DOM nodes) have state. Plain and simple. No way around it. So we need something stateful to store them.

They also have behaviors. Again, pretty unavoidable. State and behavior work together.

Still not a case for OOP yet: could’ve been maps and functions. One can just

(defn node []
  {:state   (volatile! state)
   :measure (fn [...] ...)
   :draw    (fn [...] ...)})

But there’s more to consider.

Code reuse

Many nodes share the same pattern: e.g. a wrapper is a node that “wraps” another node. padding is a wrapper:

[ui/padding {:padding 10}
 [ui/button "Click me"]]

So is center:

[ui/center
 [ui/button "Click me"]]

So is rect (it draws a rectangle behind its child):

[ui/rect {:paint ...}
 [ui/button "Click me"]]

The first two are different in how they position their child but identical in drawing and event handling. The third one has a different paint function, but the layout and event handling are the same.

I want to write AWrapperNode once and let the rest of the nodes reuse that.

Now — you might think — still not a case for OOP. Just extract a bunch of functions and then pick and choose!

;; shared library code
(defn wrapper-measure [...] ...)

(defn wrapper-draw [...] ...)

;; a node
(defn padding [...]
  {:measure (fn [...]
              <custom measure fn>)
   :draw    wrapper-draw}) ;; reused

This has an added benefit of free choice: you can mix and match implementations from different parents, e.g. measure from wrapper and draw from container.

Partial code replacement

Some functions call other functions! What a surprise.

One direction is easy. E.g. Rect node can first draw itself and then call a parent. We solve this by wrapping one function into another:

(defn rect [opts child]
  {:draw (fn [...]
           (canvas/draw-rect ...)
           ;; reuse by wrapping
           (wrapper-draw ...))})

But now I want to do it the other way: the parent defines wrapping behavior and the child only replaces one part of it.

E.g., for Wrapper nodes we always want to save and restore the canvas state around the drawing, but the drawing itself can be redefined by children:

(defn wrapper-draw [callback]
  (fn [...]
    (let [layer (canvas/save canvas)]
      (callback ...)
      (canvas/restore canvas layer))))

(defn rect [opts child]
  {:draw (wrapper-draw ;; reuse by inverse wrapping
           (fn [...]
             (canvas/draw-rect ...)
             ((:draw child) child ...)))})

I am not sure about you, but to me, it starts to feel a little too high-ordery.

Another option would be to pass “this” around and make shared functions lookup implementations in it:

(defn wrapper-draw [this ...]
  (let [layer (canvas/save canvas)]
    ((:draw-impl this) ...) ;; lookup in a child
    (canvas/restore canvas layer)))

(defn rect [opts child]
  {:draw      wrapper-draw   ;; reused
   :draw-impl (fn [this ...] ;; except for this part
                (canvas/draw-rect ...)
                ((:draw child) child ...))})

Starts to feel like OOP, doesn’t it?

Future-proofing

Final problem: I want Humble UI users to write their own nodes. This is not the default interface, mind you, but if somebody wants/needs to go low-level, why not? I want them to have all the tools that I have.

The problem is, what if in the future I add another method? E.g. when it all started, I only had:

  • -measure
  • -draw
  • -event

Eventually, I added -context, -iterate, and -*-impl versions of these. Nobody guarantees I won’t need another one in the future.

Now, with the map approach, the problem is that there is no such guarantee. A node written as:

{:draw    ...
 :measure ...
 :event   ...}

will not suddenly gain a context method when I add one.

That’s what OOP solves! If I control the root implementation and add more stuff to it, everybody will get it no matter when they write their nodes.

How does it look

We still have normal protocols:

(defprotocol IComponent
  (-context              [_ ctx])
  (-measure      ^IPoint [_ ctx ^IPoint cs])
  (-measure-impl ^IPoint [_ ctx ^IPoint cs])
  (-draw                 [_ ctx ^IRect rect canvas])
  (-draw-impl            [_ ctx ^IRect rect canvas])
  (-event                [_ ctx event])
  (-event-impl           [_ ctx event])
  (-iterate              [_ ctx cb])
  (-child-elements       [_ ctx new-el])
  (-reconcile            [_ ctx new-el])
  (-reconcile-impl       [_ ctx new-el])
  (-should-reconcile?    [_ ctx new-el])
  (-unmount              [_])
  (-unmount-impl         [_]))

Then we have base (abstract) classes:

(core/defparent ANode
  [^:mut element
   ^:mut mounted?
   ^:mut rect
   ^:mut key
   ^:mut dirty?]
  
  protocols/IComponent
  (-context [_ ctx]
    ctx)

  (-measure [this ctx cs]
    (binding [ui/*node* this
              ui/*ctx*  ctx]
      (ui/maybe-render this ctx)
      (protocols/-measure-impl this ctx cs)))

  ...)

Note that parents can also have fields! Admit it: We all came to Clojure to write better Java.

Then we have intermediate abstract classes that, on one hand, reuse parent behavior, but also redefine it where needed. E.g.

(core/defparent AWrapperNode [^:mut child] :extends ANode
  protocols/IComponent
  (-measure-impl [this ctx cs]
    (when-some [ctx' (protocols/-context this ctx)]
      (measure (:child this) ctx' cs)))

  (-draw-impl [this ctx rect canvas]
    (when-some [ctx' (protocols/-context this ctx)]
      (draw-child (:child this) ctx' rect canvas)))
  
  (-event-impl [this ctx event]
    (event-child (:child this) ctx event))
  
  ...)

Finally, leaves are almost normal deftypes but they pull basic implementations from their parents.

(core/deftype+ Padding [] :extends AWrapperNode
  protocols/IComponent
  (-measure-impl [_ ctx cs] ...)
  (-draw-impl [_ ctx rect canvas] ...))

Underneath, there’s almost no magic. Parent implementations are just copied into children, fields are concatenated to child’s fields, etc.

Again, this is not the interface that the end-user will use. End-user will write components like this:

(ui/defcomp button [opts child]
  [clickable opts
   [clip-rrect {:radii [4]}
   [rect {:paint button-bg}
     [padding {:padding 10}
      [center
       [label child]]]]]])

But underneath all these rect/padding/center/label will eventually be instantiated into nodes. Heck, even your button will become FnNode. But you are not required to know this.

Also, a reminder: all these solutions, just like Humble UI itself, are a work in progress at the moment. No promises it’ll stay that way.

Conclusion

I’ve heard a rumor that OOP was originally invented for UIs specifically. Mutable objects with mostly shared but sometimes different behaviors were a perfect match for the object paradigm.

Well, now I know: even today, no matter how you start, eventually you will arrive at the same conclusion.

I hope you find this interesting. If you have a better idea — let me know.

Permalink

Humble Chronicles: Shape of the Component

Last time I ran a huge experiment trying to figure out how components should work in Humble UI. Since then, I’ve been trying to bring it to the main branch.

This was trickier than I anticipated — even with a working prototype, there are still lots of decisions to make, and each one takes time.

I discussed some ideas in Humble Chronicles: Managing State with VDOM, but this is what we ultimately arrived at.

The simplest component:

(ui/defcomp my-comp []
  [ui/label "Hello, world!"])

Note the use of square brackets [], it’s important. We are not creating nodes directly, we return a “description” of UI that will later be analyzed and instantiated for us by Humble UI.

Later if you want to use your component, you do the same:

(ui/defcomp other-comp []
  [my-comp])

You can pass arguments to it:

(ui/defcomp my-comp [text text2 text3]
  [ui/label (str text ", " text2 ", " text3)])

To use local state, return a function. In that case, the body itself will become the “setup” phase, and the returned function will become the “render” phase. Setup is called once, render is called many times:

(ui/defcomp my-comp [text]
  ;; setup
  (let [*cnt (signal/signal 0)]
    (fn [text]
      ;; render
      [ui/label (str text ": " @*cnt)])))

As you can see, we have our own signals implementation. They seem to fit very well with the rest of the VDOM paradigm.

Finally, the fullest form is a map with the :render key:

(ui/defcomp my-comp [text]
  (let [timer (timer/schedule #(println 123) 1000)]
    {:after-unmount
     (fn []
       (timer/cancel timer)) 
     :render
     (fn [text]
       [ui/label text])}))

Again, the body of the component itself becomes “setup”, and :render becomes “render”. As you can see, the map form is useful for specifying lifecycle callbacks.

Code reuse

React has a notion of “hooks”: small reusable bits of code that have access to all the same state and lifecycle machinery that components have.

For example, a timer always needs to be cancelled on unmount, but I don’t want to write after-unmount every time I use a timer. I want to use a timer and have its lifecycle registered automatically.

Our alternative is with macro:

(defn use-timer []
  (let [*state (signal/signal 0)
        timer  (timer/schedule #(println @*state) 1000)
        cancel (fn []
                 (timer/cancel timer))]
    {:value         *state
     :after-unmount cancel}))

(ui/defcomp ui []
  (ui/with [*timer (use-timer)]
    (fn []
      [ui/label "Timer: " @*timer])))

Under the hood, with just takes a return map of its body and adds stuff it needs to it. Simple, no magic, no special “hooks rules”.

Same as with hooks, with can be used inside with recursively. It just works.

Thanks Kevin Lynagh for the idea.

Shared state

One of the goals of Humble UI was to make component reuse trivial. Web, for example, has hundreds of properties to customize a button, and still, it’s often not enough.

I lack the resources to make hundreds of properties, so I wanted to take another route: make components out of simple reusable parts, and let end users recombine them.

So a button becomes clickable (behavior) and button-look (visual). Want a custom button? Implement your own look, and use the same behavior. Want to reuse the look in another component (e.g. a toggle button)? Write your own behavior, and reuse the visuals.

The look itself consists of simple parts that can be reused and recombined:

(ui/defcomp button-look [child]
  [clip-rrect {:radii [4]}
   [rect {:paint button-bg}
    [padding {:padding 10}
     [center
      [label child]]]]])

And then the button becomes:

(ui/defcomp button [opts child]
  [ui/clickable opts
   [ui/button-look child]])

(this and a previous one are simplified for clarity)

Now, the problem. The button is, of course, interactive. It reacts to being hovered, pressed, etc. But the state that represents it lives in clickable (the behavior). How to share?

The first idea was to use signals. Like this:

(ui/defcomp button [opts child]
  (let [*state (signal/signal nil)]
    (fn [opts child]
      [ui/clickable {:*state *state}
       [ui/button-look @*state child]])))

Which does work, of course, but a little too verbose. It also forces you to define state outside, while logically clickable should be responsible for it.

So the current solution is this:

(ui/defcomp button [opts child]
  [ui/clickable opts
   (fn [state]
     [ui/button-look state child])])

Which is a bit tighter and doesn’t expose the state unnecessarily. The look component is also straightforward: it accepts the state as an argument, without any magic, so it can be reused anywhere.

Where to try

Current development happens in the “vdom” branch. Components migrate slowly, but steadily, to the new model.

Current screenshot for history:

Soon we will all live in a Virtual DOM world, I hope.

Permalink

Clojure Deref (June 13, 2024)

Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem (feed: RSS). Thanks to Anton Fonarev for link aggregation.

Blogs, articles, and projects

Libraries and Tools

New releases and tools this week:

  • tools.build 0.10.4 - Clojure builds as Clojure programs

  • iort 0.1.54 - Interoperable Outcomes Research Tools

  • llama.clj 0.8.4 - Run LLMs locally. A clojure wrapper for llama.cpp

  • cljfx 1.9.0 - Declarative, functional and extensible wrapper of JavaFX inspired by better parts of react and re-frame

  • honeysql 2.6.1147 - Turn Clojure data structures into SQL

  • nrepl 1.2 - A Clojure network REPL that provides a server and client, along with some common APIs of use to IDEs and other tools that may need to evaluate Clojure code in remote environments

  • cider 1.15 - The Clojure Interactive Development Environment that Rocks for Emacs

  • squint 0.7.111 - Light-weight ClojureScript dialect

  • yamlscript 0.1.61 - Programming in YAML

  • apie 0.2.1 - OpenAPI Service Validator

  • overarch 0.21.0 - A data driven description of software architecture based on UML and the C4 model

Permalink

Well Known Identifiers and Reloadable Systems

Well Known Identifiers and Reloadable Systems

A second Tea Leaves edition; please bear with us as we try to find our stride.

Some Goodbyes

For economic reasons Gaiwan has recently significantly shrunk its team. We say goodbye to Ariel, Joshua, and Gabriel. They were all great colleagues, and we wish them the best in their future career and in life. Keep in touch!

We're taking this as an opportunity for the remaining people to find renewed focus and clarity on where we want the company to go in the coming years. One thing that's clear is that we are a provider of expertise, and we want to put even more effort into sharing that expertise with the world. Hence this Tea Leaves newsletter and the accompanying Tea Garden pattern repository.

Garden News

We've added two patterns to the pattern repository: Well Known Identifier and Reloadable Component System.

Component Systems as pioneered by Alessandra Sierra are familiar to many Clojurists, and at this point we have a smorgasbord of choices. The original Component, Integrant, or Mount, just to name the most popular ones. In a future ecosystem analysis we want to put them into an overview, and explain how they differ and how to pick one that suits your need. Before that however we needed a description of the pattern itself, with a home on the web. Hence the new Reloadable Component System page.

A Well Known Identifier should look familiar to many developers. In this case we're observing commonalities between a number of existing naming practices, and using them to say "hey, there's something interesting going on here; we can recognize a pattern". Once captured, the pattern goes from holding descriptive and explanatory power to also gaining generative power.

Heart of Clojure

The organizing of Heart of Clojure is in full swing! We just closed the CFP, and boy, did we get some really great stuff. We'll need a few more weeks to sort it all out, but stay tuned for the first program announcements. It's going to be another amazing edition. Make sure to grab a ticket while they're still available!

Personal

This will be the first summer that I own a garden. I've had a green thumb since I was a teenager, and despite the historically wet spring in Belgium, I've already had plenty of chances to enjoy getting my hands in the dirt. The wet weather has caused an explosion of snails, so it's hard to get any vegetables going. That's why I'm still keeping them in pots, close to the house, for now. This weekend I repotted a handful of tomatoes, some lettuce, basil, and other stuff (as shown in the header picture).

Permalink

What is a list anyway?

As programmers, we work with lists all day every day… any time you have more than one of something, you will most likely make a list. Having nothing or having one of something can be considered special cases… and the common case is to have N somethings.

Because lists are so ubiquitous, all higher-level programming languages provide types or classes to work with lists, this usually includes creating lists, selecting items from lists, processing the items of a list and so on.

But what is a list in its essence?

If we go back to 1958, when John McCarthy discovered LISP (short for LISt Processor), we can see that a list boils down to simply a pair of values: an element and a pointer to the next pair. In LISP this is called a cons cell, and its slots are called CAR and CDR respectively.

So we can see that “a list” is not a thing in and of itself, but rather an amalgamation of cons cells. Many Object Oriented programming languages, like Java, provide a List class which differs from the original lispy view of things and models the whole instead of the part.

In this post, we will take a look at what programming in lists means, but we won’t be using LISP (or any of its many successors), but the declarative logic language Prolog. All examples work with the free SWI-Prolog implementation.

List basics

Prolog, like LISP and many other languages, has a built-in syntax for lists. Prolog uses square brackets and commas:

?- Languages = [lisp, c, assembly].
Languages = [lisp, c, assembly].

We can also destructure a list into its head (the first element) and tail (the rest of the list) using the | character.

?- [One|Rest] = [1,2,3,4].
One = 1,
Rest = [2, 3, 4].

The tail of a single-element list is the empty list:

?- [First|NothingMore] = [1].
First = 1,
NothingMore = [].

We can also check the members of a list:

?- member(X, [1,2,3]).
X = 1;
X = 2;
X = 3.

?- member(3, [1,2,3]).
true

?- member(42, [1,2,3]).
false

Basic utilities

What do we usually do with a list? Well, we need to go through its items, for example, to do something for each item. We must be able to create new lists and take apart lists.

To map a goal to each element we can use maplist:

?- maplist(succ, [1,2,3,4], PlusOnes).
PlusOnes = [2, 3, 4, 5].

Here we can see how Prolog differs from “regular” programming: maplist isn’t a procedure that “returns” a new list. What we are declaring is a relation between one list and another, so we can run it the other way as well, or use it to check whether the relation holds for the given parameters:

?- maplist(succ, MinusOnes, [1,2,3,4]).
MinusOnes = [0, 1, 2, 3].

?- maplist(succ, [1,2], [2,3]).
true.

Next, let’s define relational versions of the usual list functions: drop (remove N items from front) and take (take only the N first items of the list).

drop(0, Lst, Lst).
drop(N, [_|Rest], Out) :-
  N > 0,
  N1 is N - 1,
  drop(N1, Rest, Out).

Above, you can see that we define two clauses. The first states that dropping 0 elements from a list yields the list itself. The second states that for a positive number N, the first element of the input list is discarded and the relation recurses with N-1.

Similarly, we can define take as:

take(0, _, []).
take(N, [I|Items], [I|Taken]) :-
  N > 0,
  N1 is N - 1,
  take(N1, Items, Taken).

Again we have a recursive definition with a base case of taking zero items, but here the output of taking zero items from anything is the empty list.

We can now use both of these definitions:

?- drop(2, [1,2,3,4], Dropped).
Dropped = [3, 4].

?- take(3, [42, 1, 2, 3, 4], Taken).
Taken = [42, 1, 2].

As these are relations, can we run them the other way? Yes, somewhat.

?- between(1,3,N), drop(N, Input, [1,2,3]).
N = 1,
Input = [_, 1, 2, 3];

N = 2,
Input = [_, _, 1, 2, 3];

N = 3,
Input = [_, _, _, 1, 2, 3].

We can state that the result of dropping N items is the list [1,2,3], and Prolog will give us the possible input lists, but all the dropped items will be placeholders that can refer to anything.

But we can further constrain the list and state that all numbers are between 1 and 3 to get all possible permutations:

?- between(1,3,N), drop(N, Input, [1,2,3]), maplist(between(1,3), Input).
N = 1,
Input = [1, 1, 2, 3];

N = 1,
Input = [2, 1, 2, 3];

N = 1,
Input = [3, 1, 2, 3];

N = 2,
Input = [1, 1, 1, 2, 3];

N = 2,
Input = [1, 2, 1, 2, 3];

% ...many results omitted...

N = 3,
Input = [3, 3, 3, 1, 2, 3].

Enter append

Of course, a language must also have a way to concatenate lists. For this, we have append which comes in two forms: append two lists to get a third one and append a list of lists to get all of them concatenated.

?- append([h,e], [l,l,o], Out).
Out = [h, e, l, l, o].

?- append([[1,2], [3,4], [5,6]], All).
All = [1, 2, 3, 4, 5, 6].

Pretty simple… a utility found in most languages. What if I told you that the drop and take utilities above aren’t needed, because we have append? How can appending lists be used to split them apart, I hear you ask. Remember that we are describing a relation between the parameters, so we can express drop and take like this:

?- length(Dropped, 3), append(_, Dropped, [1,2,3,4,5,6]).
Dropped = [4, 5, 6]

?- length(Taken, 3), append(Taken, _, [1,2,3,4,5,6]).
Taken = [1, 2, 3]

We can even use the second, list-of-lists form to split a list at a given element and extract it:

?- length(Before, 3), append([Before, [Fourth], After], [1,2,3,4,5,6]).
Before = [1, 2, 3],
Fourth = 4,
After = [5, 6]

If we don’t specify the length of the Before list, Prolog will give us results for all different possible lengths of Before and After including empty, allowing us to effectively iterate the items with lookback and lookahead. Neat!

Where we’re going, you won’t need objects

Many programming languages, like JavaScript and Clojure, also have direct syntax for creating key/value mappings that can be accessed by key. Hashtables are obviously good for performance when you have big mappings, but many LISP and Prolog programs traditionally use association lists. You can model the same information as a list of key/value pairs.

For many cases with small mappings, like the arguments to some call or the fields of an object, the constant cost of hashing means the difference between a list and a hashtable is negligible.

For example, SWI-Prolog’s JSON library represents an object as a compound term json([Key=Value, ...]) by default. This makes the list of fields directly accessible with Prolog list processing: we can access the value of a specific key with the standard list-membership predicate, member.

?- Attrs = [firstname="John",lastname="Doe",email="john.doe@example.com"], member(firstname=F, Attrs).
Attrs = [firstname="John", lastname="Doe", email="john.doe@example.com"],
F = "John"

Another common example is an environment bindings mapping (like in an interpreter). You can easily overwrite a key by appending a new mapping to the front of the list. Because lists are just cons cells, we can create a new cell that contains the new mapping and points to the old list. Very efficient. We also don’t need to remove anything when we exit, just keep using the old unchanged list.
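The cons-cell environment idea ports easily outside Prolog. Here is a small Go sketch (purely my own illustration, with hypothetical names like `Bind` and `Lookup`, not any particular library) of an association-list environment where shadowing is just prepending:

```go
package main

import "fmt"

// Env is a cons-cell-style association list: each node holds one
// binding and points at the rest of the (never-mutated) list.
type Env struct {
	Key   string
	Value int
	Next  *Env
}

// Bind prepends a new binding, shadowing any older one with the
// same key. The old environment is untouched and still usable.
func Bind(env *Env, key string, value int) *Env {
	return &Env{Key: key, Value: value, Next: env}
}

// Lookup walks from the front, so the newest binding wins.
func Lookup(env *Env, key string) (int, bool) {
	for e := env; e != nil; e = e.Next {
		if e.Key == key {
			return e.Value, true
		}
	}
	return 0, false
}

func main() {
	outer := Bind(nil, "x", 1)
	inner := Bind(outer, "x", 2) // shadow x in a nested scope

	v, _ := Lookup(inner, "x")
	fmt.Println(v) // newest binding wins: 2

	v, _ = Lookup(outer, "x") // the outer env is unchanged
	fmt.Println(v) // still 1
}
```

Since Lookup scans from the front, the newest binding wins, and leaving a scope is just going back to using the old pointer.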

Closing thoughts

Lists are more than just a type of object. I hope you can see why Lisp and Prolog (and many other functional programming languages) use lists heavily. They become very handy if the language includes direct syntax for creating and destructuring lists. Many pattern-matching languages also allow easy destructuring into the head and tail of a list.

Permalink

How I’m learning Clojure in 2024

I’ve recently been learning a bit of Clojure and it’s been a lot of fun! I thought I would note down what has been useful for me, so that it might help others as well.

Jumping right in

https://tryclojure.org/ is a great intro to the language. It provides a REPL and a tutorial that takes you through the basic features of Clojure.

Importantly, it forces you to get used to seeing lots of ( and )!

Doing exercises

Exercism provides small coding challenges for a bunch of languages, including Clojure. Unlike other platforms (cough leetcode cough), Exercism is focused on learning and I found it a great way to practice writing Clojure.

It provides a code editor and evaluates each challenge when you submit. There’s also a way to submit answers locally from your computer, but I found it quicker just to use the website.

Editor setup

I ended up setting up Neovim for developing locally. This guide was a great inspiration: https://endot.org/2023/05/27/vim-clojure-dev-2023/, although I did end up going with something a little bit simpler.

My .vimrc can be seen here but probably the most important plugin is Conjure, which provides REPL integration in Neovim.

The REPL is one of the big differences compared to programming in other languages. Basically, you start a REPL in the project directory and then you can evaluate code in your editor in that REPL.

This gives you really short iteration cycles: you can ‘play’ with your code, run tests, and reload code in a running app, all without leaving your editor!

To understand REPL driven development, I really liked this video with teej_dv and lispyclouds. One key thing I learnt was using the comment function to be able to evaluate code without affecting the rest of my program.

; my super cool function
; given a number, adds 2 to it!
(defn add-2
  [n]
  (+ n 2)
)

; This tells Clojure to ignore what comes next
; but it still has to be syntactically correct!
(comment
  (add-2 3) ; <-- I can evaluate this to check my add-2 function :)
)

By opening a REPL and using the Conjure plugin mentioned before I can:

  • ,eb: Evaluate the buffer I am in. Kinda like loading up the file I have opened into the REPL.
  • ,ee: Evaluate the expression my cursor is under.
  • ,tc: Run the test my cursor is over.
  • ,tn: Run all tests in current file.

I use the following alias in my .bash_aliases to easily spin up a REPL:

# From https://github.com/Olical/conjure/wiki/Quick-start:-Clojure
# Don't ask me questions about how this works, but it does!

alias cljrepl='clj -Sdeps '\''{:deps {nrepl/nrepl {:mvn/version "1.0.0"} cider/cider-nrepl {:mvn/version "0.42.1"}}}'\'' \
    --main nrepl.cmdline \
    --middleware '\''["cider.nrepl/cider-middleware"]'\'' \
    --interactive'

Docs

For docs, I really like https://clojuredocs.org/, which has the docs for the core library. I like the fact that users can submit code examples, which provides better information for each function.

Projects

I currently have 2 projects in Clojure to further my learning.

  1. A bad terminal-based clone(ish) of Balatro. Balatro is a very addictive deck-building roguelike game. Doing this has been fun and it feels very natural to ‘build up’ over time. The source code can be seen here.
  2. An application that converts a subreddit into an RSS feed. The idea is that this can be a webapp that produces daily RSS feeds for a collection of subreddits. Source code

The End

Thanks for reading!

Permalink

SQL generation: Golang's builder pattern vs Clojure's persistent map

I worked on a TODO-app code assignment to show off my skills and, more importantly, to expose my weak points. I wrote it in Golang with Masterminds/squirrel. Later, I ported just the SQL generation part to Clojure to compare the two and discuss why I prefer Clojure, a preference I am often asked about and sometimes even meet opposition over. I will go through it function by function and type by type. The first function is makeStatement.

func (repo *TodoRepoPg) makeStatement(orders []entity.Order, filters []entity.Filter) (string, []any, error) {
    builder := repo.Builder.Select("id, title, description, created, image, status")
    if err := buildOrders(&builder, orders); err != nil {
        return "", nil, err
    }
    if err := buildFilters(&builder, filters); err != nil {
        return "", nil, err
    }
    return builder.From("task").ToSql()
}

The makeStatement function's name clearly indicates it utilizes the builder pattern. However, to improve readability and avoid cluttering the function with too many details, it delegates the building of order and filter information to separate functions: buildOrders and buildFilters. Next is the make-statement function in Clojure with HoneySQL.

(defn make-statement [orders filters]
  (sql/format (merge {:select [:id :description :status]
                      :from [:task]}
                     (filters->map filters)
                     (orders->map orders))))

In the Clojure version, the main difference is that filters->map and orders->map are pure functions: they don't mutate their inputs the way buildOrders and buildFilters mutate the builder in Golang. Next I will show the contracts: types in Go and specs in Clojure.
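To make the contrast concrete, here is a rough Go sketch of the merge-of-plain-data style (purely my own illustration, not code from the assignment): every clause helper returns a fresh map, and the caller merges them without mutating anything it received.

```go
package main

import "fmt"

// query is a plain description of a SQL statement, in the spirit of
// HoneySQL's map: clauses are just keys, not hidden builder state.
type query map[string]any

// merged combines clause maps into a brand-new map, leaving all of
// its inputs untouched.
func merged(parts ...query) query {
	out := query{}
	for _, p := range parts {
		for k, v := range p {
			out[k] = v
		}
	}
	return out
}

// ordersClause is a pure function: same input, same output, no mutation.
func ordersClause(orders [][2]string) query {
	return query{"order-by": orders}
}

func main() {
	base := query{"select": []string{"id", "description", "status"}, "from": "task"}
	stmt := merged(base, ordersClause([][2]string{{"title", "desc"}}))
	fmt.Println(stmt["from"]) // base still only describes select/from
}
```

Because every step returns a fresh value, each helper can be tested in isolation by comparing plain maps, which is exactly the property the Clojure version gets for free.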

const (
    ID = iota
    Title
    Description
    Date
    Status
)

const (
    ASC = iota
    DESC
)

type Order struct {
    Field        int
    SortingOrder int
}

type Filter struct {
    Field int
    Value string
}

In Golang, to complement function definitions, I define custom types for conveying order and filter information. While using strings for this purpose is also acceptable, I prefer using types to leverage Go's static analysis and prevent typos.

(s/def :db1/orders (s/coll-of (s/tuple #{:title :created :status} #{:+ :-})))
(s/def :db1/filters (s/coll-of (s/tuple #{:title :description} any?)))

On the other hand, in Clojure, I defined similar contracts using Clojure Spec. Here, the information about orders and filters being collections of tuples resides within the Spec definition itself, unlike the separate function definitions in Golang.

func buildOrders(builder *squirrel.SelectBuilder, orders []entity.Order) error {
    for _, order := range orders {
        var fieldName string
        switch order.Field {
        case entity.Title:
            fieldName = "title"
        case entity.Date:
            fieldName = "created"
        case entity.Status:
            fieldName = "status"
        default:
            return fmt.Errorf("invalid field: %d", order.Field)
        }
        var sortOrder string
        switch order.SortingOrder {
        case entity.ASC:
            sortOrder = "ASC"
        case entity.DESC:
            sortOrder = "DESC"
        default:
            return fmt.Errorf("invalid sorting order: %d", order.SortingOrder)
        }
        orderExpr := fieldName + " " + sortOrder
        *builder = builder.OrderBy(orderExpr)
    }
    return nil
}

buildOrders looks very familiar. It reminds me of Pascal, which I learned 30 years ago. This suggests that the code utilizes a well-established approach, making it understandable to most programmers even without prior Go experience. However, I've identified potential code duplication between the type definition and the switch-case within this function.

(defn orders->map [orders] 
  (when-not (s/valid? :db1/orders orders)
    (throw (ex-info "Invalid orders" (s/explain-data :db1/orders orders))))

  (->> orders
       (mapv #(let [[field order-dir] %] 
                [field (case order-dir
                         :+ :asc
                         :- :desc)]))
       (array-map :order-by)))

The Clojure function orders->map might have surprised my younger self from 30 years ago. However, it leverages Clojure Spec to its full potential: Spec validates the input to the function and provides clear explanations when validation fails. Furthermore, orders->map is a pure function, meaning it doesn't modify its input data. Both the input and output data leverage Clojure's persistent maps, a fundamental data structure known for immutability. Therefore, unit testing the orders->map function is relatively straightforward. I have no idea how to write a unit test for buildOrders in Go.

(deftest generate-orders-maps
  (is (= {:order-by []}
         (orders->map [])))
  (is (= {:order-by [[:title :desc]]}
         (orders->map [[:title :-]])))
  (is (= {:order-by [[:status :asc]]}
         (orders->map [[:status :+]]))))

In conclusion, Go's main advantage lies in its familiarity for programmers from various languages like Pascal, Java, JavaScript, Python, and C. This familiarity extends to the builder pattern, which offers the additional benefit of auto-completion in IDEs and smart editors. On the other hand, Clojure and HoneySQL emphasize using data structures, especially persistent maps, for building queries.

While auto-completion is less important for Clojure programmers who are comfortable manipulating basic data structures, Clojure Spec offers significant advantages in data validation.

Spec can explain what happens when data fails to meet the requirements, promoting better error handling and adherence to the open-closed principle (where code can be extended without modifying existing functionality). Additionally, Clojure Spec is not part of the function definition itself, allowing for greater flexibility and potential separation of concerns.

More importantly, writing unit tests in Clojure with HoneySQL is significantly more efficient. Because orders->map is based on persistent data structures, it avoids modifying the input data. This immutability, along with the ease of comparing maps, makes them ideal for testing.

Permalink

April & May 2024 Short-Term Project Updates

We’ve got several updates to share from our Q1 and Q2 project developers. Check out the latest in their April and May reports.

clj-merge tool: Kurt Harriger
Compojure-api: Ambrose Bonnaire-Sergeant
Instaparse: Mark Engelberg
Jank: Jeaye Wilkerson
Plexus: John Collins
Lost in Lambduhhs Podcast: L. Jordan Miller
Scicloj: Daniel Slutsky


Clj-merge: Kurt Harriger

Q2 2024 Report No. 1. Published May 15, 2024

Introduction

This tool aims to reduce unnecessary conflicts due to whitespace and syntax peculiarities by using a more semantic approach to diffing and merging. I’m grateful for the support from ClojuristsTogether and the invaluable feedback and support from the Clojure community.

Milestones Overview

The project was structured around several key milestones:

  1. Development of the MVP.
  2. Enhancement of diff handling and presentation.
  3. Community engagement and feedback integration.
  4. Performance optimization and cross-platform compatibility.

Milestone Progress

  1. Development of the MVP

    • Goal: To create a minimal viable product using editscript and rewrite-clj.
    • Progress: The MVP was successfully developed and demonstrated its capability in resolving basic merge conflicts. Initially, extensive diffs suggested a complete rewrite of editscript might be necessary. However, implementing isomorphic translations between rewrite-clj node representations proved to be a sufficient workaround for now.
    • Next Steps: The focus will now shift towards refining this MVP, enhancing its error handling, and improving the user feedback system to make the tool more robust.
  2. Enhancement of Diff Handling and Presentation

    • Goal: To improve the readability and utility of diffs for developers.
    • Progress: The generated diffs are more machine-readable than human-readable; early use of the tool has underscored the importance of easily interpretable diffs and the need for better visualization.
    • Next Steps: I plan to continue working on improving the presentation of diffs, making them easier to understand and act upon.
  3. Community Engagement and Feedback Integration

    • Goal: To actively engage with the community to gather detailed feedback and real-world merge conflict examples.
    • Progress: Some feedback and bug reports have been received, such as an “index out of range” error; however, collecting comprehensive examples has proved challenging.
    • Next Steps: I aim to increase efforts to engage the community and hope to present at one of the clojure meetups in the near future.
  4. Performance optimization and cross-platform compatibility

    • Goal: Simplify the installation process
    • Progress: The project has been successfully compiled with GraalVM for fast startup. However, the binaries are built during the install process, which requires GraalVM to be available at install time.
    • Next Steps: Set up a CI/CD process to publish prebuilt binaries for download.

Future Vision and Calls to Action

Moving forward, enhancing diff visualization will be my primary focus. A more intuitive representation of changes will not only improve the tool’s usability but also its adoption. I encourage everyone in the Clojure community to try clj-mergetool, especially in challenging merge scenarios, and share any issues or feedback. Your contributions are crucial for refining the tool and expanding its capabilities.

Thank you for your continued support and contributions to the clj-mergetool project.


Compojure-api: Ambrose Bonnaire-Sergeant

Q2 2024 Report No. 1. Published April 30, 2024

ring-swagger

I have released ring-swagger 1.0.0, compojure-api 1.1.14 and 2.0.0-alpha33 which all include a critical fix to prevent this memory leak.

Rajkumar Natarajan proposed OpenAPI3 support and I have been reviewing it.

1.0.0 (30.4.2024)

compojure-api

My two main focuses with compojure-api have been to make 2.x backwards compatible with 1.x and implement performance improvements.

I have drafted some further performance ideas as issues and the remaining tasks for 1.x compatibility are here.

2.0.0-alpha33 (2024-04-30)

  • Throw an error on malformed :{body,query,headers}, in particular if anything other than 2 elements was provided
  • Disable check with -Dcompojure.api.meta.allow-bad-{body,query,headers}=true
  • 50% reduction in the number of times :{return,body,query,responses,headers,coercion,{body,form,header,query,path}-params} schemas/arguments are evaluated/expanded
    • saves 1 evaluation per schema for static contexts
    • saves 1 evaluation per schema, per request, for dynamic contexts
  • Fix: Merge :{form,multipart}-params :info :public :parameters :formData field at runtime
  • Add :outer-lets field to restructure-param result which wraps entire resulting form
  • Remove static-context macro and replace with equivalent expansion without relying on compojure internals.
  • Upgrade to ring-swagger 1.0.0 to fix memory leaks

2.0.0-alpha32 (2024-04-20)

  • Fix empty spec response coercion. #413
  • Add back defapi (and deprecate it)
  • Remove potemkin #445
  • Add back compojure.api.routes/create
  • Add back middleware (and deprecate it)
  • Make context :dynamic by default
  • Add :static true option to context
  • Add static context optimization coach
    • -Dcompojure.api.meta.static-context-coach=print to print hints
    • -Dcompojure.api.meta.static-context-coach=assert to assert hints
  • port unit tests from midje to clojure.test

Instaparse: Mark Engelberg

Q1 2024 Report 2. Published May 31, 2024

Thanks to funding from Clojurists Together, I have continued to review instaparse pull requests that have been submitted over the past couple of years.

The most interesting and useful issue, which had languished among the pull requests for over two years, was a suggestion to incorporate namespaced non-terminals, so I am pleased to report that this feature has now been implemented, tested, documented, and deployed in instaparse version 1.5.0. The pull request wasn’t quite usable out of the box, as it relied on a feature unique to Clojure 1.11, and I always strive for instaparse to be backwards compatible to Clojure 1.5. But it provided a great starting point for implementation. I think the community will find this feature to be useful.

For the final third of my Clojurists Together time, I have my eye on a bug with negative lookahead that was discovered shortly after instaparse’s initial release. It’s something I have always wanted to fix, but have never had the time to solve it as the problem is quite subtle, and any fix will require in-depth testing since it could have wider ramifications.


Jank: Jeaye Wilkerson

Q2 2024 Reports 1&2. Published April 30 & May 31, 2024

Report 1: April 30, 2024

This quarter, I’m being funded by Clojurists Together to build out jank’s lazy sequences, special loop* form, destructuring, and support for the for and doseq macros. Going into this quarter, I had only a rough idea of how Clojure’s lazy sequences were implemented. Now, a month in, I’m ready to report some impressive progress!

Lazy sequences

There are three primary types of lazy sequences in Clojure. I was planning on explaining all of this, but, even better, I can shine the spotlight on Bruno Bonacci’s blog, since he’s covered all three of them very clearly. In short, we have:

  1. Per-element lazy sequences
  2. Chunked lazy sequences
  3. Buffered lazy sequences

This month, I have implemented per-element lazy sequences, along with partial support for chunked lazy sequences. Chunked lazy sequences will be finished next month. By implementing even per-element lazy sequences, so many new opportunities open up. I’ll show what I mean by that later in this post, so don’t go anywhere!
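A per-element lazy sequence can be pictured as a realized head paired with a thunk producing the rest. This Go sketch (my own names, not jank's internals) realizes exactly one element per step:

```go
package main

import "fmt"

// LazySeq is a per-element lazy sequence: First is already realized,
// while rest is a thunk that is only invoked when the tail is walked.
type LazySeq struct {
	First int
	rest  func() *LazySeq
}

// Next forces exactly one more element, or returns nil at the end.
func (s *LazySeq) Next() *LazySeq {
	if s.rest == nil {
		return nil
	}
	return s.rest()
}

// ints is an infinite sequence of integers starting at n; nothing
// past the head exists until someone asks for it.
func ints(n int) *LazySeq {
	return &LazySeq{First: n, rest: func() *LazySeq { return ints(n + 1) }}
}

func main() {
	s := ints(10)
	for i := 0; i < 3; i++ { // only three elements are ever realized
		fmt.Println(s.First)
		s = s.Next()
	}
}
```

Chunked sequences apply the same idea but realize a block of elements per thunk call, trading a little laziness for throughput.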

Loop

Prior to this month, jank supported function-level recur. As part of this month’s work, I also implemented loop* and its related recur. When we look at how Clojure JVM implements loop*, it has two different scenarios:

  1. Expression loops
  2. Statement loops

If a loop is in statement position, Clojure JVM will code-generate labels with goto jumps and local mutations. If the loop is an expression, Clojure JVM generates a function around the loop and then immediately calls it. There is potentially a performance win in not generating the function wrapper and calling it right away, but note that this particular idiom is commonly identified and elided by optimizing compilers. It even has its own acronym: IIFE, for immediately invoked function expression. (see this also)

jank, for now anyway, simplifies this by always using the IIFE. It does it in a more janky way, though, which is interesting enough that I’ll share it with you all. Let’s take an example loop* (note that the special form of loop is actually loop*, same as in Clojure; loop is a macro which provides destructuring on top of loop* – now you know):

(loop* [x 0]
  (when (< x 10)
    (println x)
    (recur (inc x))))

Given this, jank will replace the loop* with a fn* and just use function recursion. Initial loop values just get lifted into parameters. The jank compiler will transform the above code into the following:

((fn* [x]
   (when (< x 10)
     (println x)
     (recur (inc x)))) 0)

jank code-generates function recursion into a while(true) loop with mutation on some locals for each iteration, similar to Clojure.
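That lowering can be pictured with a hand-written Go analogy (a sketch of the idea only, not jank's actual generated code):

```go
package main

import "fmt"

// countUp mirrors the generated code for
// (loop* [x 0] (when (< x 10) (println x) (recur (inc x)))):
// the recur target becomes a bare loop that mutates a local.
func countUp() int {
	x := 0
	for {
		if !(x < 10) {
			return x // falling off the loop body ends the iteration
		}
		fmt.Println(x)
		x = x + 1 // recur rebinds the local and jumps back to the top
	}
}

func main() {
	countUp()
}
```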

However, loop* is tricky, since it can also do anything let* can do. For example (also note: no recursion):

(loop* [a 1
        b (* 2 a)]
  (println a b))

Since we’re using a in the binding for b, we can’t know b until we’ve calculated a, and doing so can involve any arbitrary expression. Agh! This can’t work if we just dump those into the positional parameters of the IIFE. So jank gets around this by actually just wrapping it in a let*. 🙃

(let* [a 1
       b (* 2 a)]
  ((fn* [a b]
    (println a b)) a b))

This could be done in a macro, but since it’s a language-level feature, the compiler does it for us. This means you can still use loop* even if you’re running without clojure.core. As mentioned, this is potentially slower, in the scenario of the loop being in statement position. We can return to this when the performance of loops is the most important thing to tackle. Right now, parity with Clojure and getting jank onto your machine are most important.

Destructuring

Clojure supports all kinds of fancy destructuring of sequences, maps, and keyword arguments. We use destructuring in let, defn, and loop, primarily. One interesting thing about this destructuring is that there’s no compiler support for it at all; it’s not a language-level feature. It’s a library feature, done entirely in macros. The amazing thing about it is that, as long as we support all of the core functions required, we can support destructuring. The actual destructure function is huge, but you can see it in Clojure’s source here.

This month, I implemented all of the missing functions required for the destructure function to be ported over to jank. Largely, once all those functions were implemented, the port just meant updating Java interop in a few places to be C++ interop. Now jank supports all of the fancy destructuring Clojure does, in all the same places. This helps demonstrate how much closer jank is to being a complete Clojure dialect, since complex functions like this can almost just work.

New clojure.core functions

So, to support lazy sequences and destructuring, I needed to add several new core functions. While adding those, I tended toward implementing any similar or surrounding functions as well. I got a little carried away, to be honest. Let’s take a look at the new functions jank now supports.

take (no transducer), cycle, take-while (no transducer), repeat, drop (no transducer), seq?, filter (no transducer), concat, identity, ->, constantly, ->>, into (no transducer), cond->, mapv, zipmap, filterv, last, reduce, butlast, nthrest, map?, nthnext, key, partition, val, partition-all, dissoc, partition-by, ident?, dorun, simple-ident?, doall, qualified-ident?, when-let, boolean, when-some, nth, when-first, loop, split-at, peek, split-with, pop, drop-last, for, take-last, chunk-buffer, chunk-append, destructure

That’s 52 new functions/macros! That alone amounts to around 10% of all the functions in clojure.core jank will be implementing. A few of these will need some updates once jank fully supports chunked lazy sequences and transducers, but they’re all very usable today. You may also note that for is in there, which was one of the goals this quarter.

Migration from Cling to Clang

jank is much closer to running on Clang’s JIT compiler than it was a month ago. Some recent patches have landed which partially address a blocking bug with pre-compiled header handling in Clang’s internal C++ JIT compiler. I have identified another small reproduction case for what I hope to be the rest of the issues. Part of my work this month involved getting jank running on LLVM 19 and filling out the related CMake system to be able to flexibly bring in LLVM on any system.

Once jank moves away from Cling in favor of Clang, building and distributing jank will be significantly easier. Developers won’t need to compile a custom Cling/Clang/LLVM stack. On top of that, Clang’s JIT compiler has recently landed support for loading C++20 modules, which can serve as a less-portable equivalent to the JVM’s class files, allowing jank to load pre-compiled modules very quickly. This will drastically optimize jank’s startup time, but will require some work to get going. I’ll keep you updated!

What’s next?

I’m well ahead of schedule, for the quarter, but I need to finish up chunked sequences and doseq. I’ll have time after that and I’d like to get atoms working, since most Clojure programs have some form of state. From there, I can look into strengthening native interop and making jank more easily distributable, but let’s not get ahead of ourselves.


Report 2: May 31, 2024

Hey folks! I’ve been building on last month’s addition of lazy sequences, loop*, destructuring, and more. This month, I’ve worked on rounding out lazy sequences, adding more mutability, better meta support, and some big project updates.

Chunked sequences

I’ve expanded the lazy sequence support added last month to include chunked sequences, which pre-load elements in chunks to aid in throughput. At this point, only clojure.core/range returns a chunked sequence, but all of the existing clojure.core functions which should have support for them do.

If you recall from last month, there is a third lazy sequence type: buffered sequences. I won’t be implementing those until they’re needed, as I’d never even heard of them before researching more into the lazy sequences in Clojure.

Initial quarter goals accomplished

Wrapping up the lazy sequence work, minus buffered sequences, actually checked off all the boxes for my original goals this quarter. There’s a bottomless well of new tasks, though, so I’ve moved onto some others. So, how do I decide what to work on next?

My goal is for you all to be writing jank programs. The most important tasks are the ones which bring me closer to that goal. Let’s take a look at what those have been so far.

Volatiles, atoms, and reduced

Most programs have some form of mutation and we generally handle that with volatiles and atoms in Clojure. jank already supported transients for most data structures, but we didn’t have a way to hold mutable boxes around immutable values. Volatiles are also essential for many transducers, which I’ll mention a bit later. This month, both volatiles and atoms have been implemented.

Implementing atoms involved a fair amount of research, since lockless programming with atomics is not nearly as straightforward as one might expect.
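The standard lockless recipe behind swap! is a compare-and-swap retry loop. Here is a minimal Go sketch of that idea using sync/atomic (a hypothetical Atom type of my own, not jank's implementation):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Atom holds an immutable value that is replaced, never mutated in place.
type Atom struct {
	val atomic.Value
}

func NewAtom(v any) *Atom {
	a := &Atom{}
	a.val.Store(v)
	return a
}

func (a *Atom) Deref() any { return a.val.Load() }

// Swap applies f to the current value and tries to install the result.
// If another goroutine won the race, we reload and retry: no locks,
// but f may run more than once, so it must be side-effect free.
func (a *Atom) Swap(f func(any) any) any {
	for {
		old := a.val.Load()
		next := f(old)
		if a.val.CompareAndSwap(old, next) {
			return next
		}
	}
}

func main() {
	a := NewAtom(0)
	a.Swap(func(v any) any { return v.(int) + 1 })
	fmt.Println(a.Deref()) // 1
}
```

The retry loop is the subtle part: correctness depends on the compare-and-swap being atomic and on the update function being pure.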

As part of implementing atoms, I also added support for the @ reader macro and the overall derefable behavior. This same behavior will be used for delays, futures, and others going forward.

Meta handling for defs

Last quarter, I added support for meta hints, but I didn’t actually use that metadata in many places. Now, with defs, I’ve added support for the optional meta map and doc string and I also read the meta from the defined symbol. This isn’t a huge win, but it does mean that jank can start using doc strings normally, and that we can do things like associate more function meta to the var in a defn, which can improve error reporting.

Monorepo

There will be many jank projects and I’ve known for a while that I want them all to be in one git monorepo. This makes code sharing, searching, refactoring, and browsing simpler. It gives contributors one place to go in order to get started and one place for all of the issues and discussions. It’s not my intention to convince you of anything, if you’re not a fan of monorepos, but jank is now using one.

This started by bringing in lein-jank, which was initially created by Saket Patel. From there, I’ve added a couple of more projects, which I’ll cover later in this update.

New clojure.core functions

Following last month’s theme, which saw 52 new Clojure functions, I have excellent news. We actually beat that this month, adding 56 new Clojure functions! However, I only added 23 of those and the other 33 were added by madstap (Aleksander Madland Stapnes). He did this while also adding the transducer arity into pretty much every existing sequence function. I added volatiles to support him in writing those transducers.

dotimes, chunk, chunk-first, chunk-next, chunk-rest, chunk-cons, chunked-seq?, volatile!, vswap!, vreset!, volatile?, deref, reduced, reduced?, ensure-reduced, unreduced, identical?, atom, swap!, reset!, swap-vals!, reset-vals!, compare-and-set!, keep, completing, transduce, run!, comp, repeatedly, tree-seq, flatten, cat, interpose, juxt, partial, doto, map-indexed, keep-indexed, frequencies, reductions, distinct, distinct?, dedupe, fnil, every-pred, some-fn, group-by, not-empty, get-in, assoc-in, update-in, update, cond->>, as->, some->, some->>

New projects

At this point, I was thinking that jank actually has pretty darn good Clojure parity, both in terms of syntax and essential core functions. So how can I take the best steps toward getting jank onto your computer?

Well, I think the most important thing is for me to start writing some actual projects in jank. Doing this will require improving the tooling and will help identify issues with the existing functionality. The project I’ve chosen is jank’s nREPL server. By the end of the project, we’ll not only have more confidence in jank, we’ll all be able to connect our editors to running jank programs!

nREPL server

nREPL has some docs on building new servers, so I’ve taken those as a starting point. However, let’s be clear, there are going to be a lot of steps along the way. jank is not currently ready for me to just build this server today and have it all work. I need a goal to work toward, though, and every quest I go on is bringing me one step closer to completing this nREPL server in jank. Let’s take a look at some of the things I know I’ll need for this.

Module system

jank’s module system was implemented two quarters ago, but since there are no real jank projects, it hasn’t seen much battle testing. To start with, I will need to work through some issues with this. Already I’ve found (and fixed) a couple of bugs related to module writing and reading while getting started on the nREPL server. Further improvements will be needed around how modules are cached and timestamped for iterative compilation.

Native interop

Next, jank’s native interop support will need to be expanded. I’ve started on that this month by making it possible to write C++ sources alongside your jank sources and actually require them from jank! As you may know, jank allows for inline C++ code within the special native/raw form, but by compiling entire C++ files alongside your jank code, it’s now much easier to offload certain aspects of your jank programs to C++ without worrying about writing too much C++ as inline jank strings.

jank’s native interop support can be further improved by declaratively noting include paths, implicit includes, link paths, and linked libraries as part of the project. This will likely end up necessary for the nREPL server.

AOT compilation

Also required for the nREPL server, I’ll need to design and implement jank’s AOT compilation system. This will involve compiling all jank sources and C++ sources together and can allow for direct linking, whole-program link time optimizations (LTO), and even static runtimes (no interactivity, but smaller binaries).

Distribution

Finally, both jank and the nREPL server will need distribution mechanisms for Linux and macOS. For jank, that may mean AppImages or perhaps more integrated binaries. Either way, I want this to be easy for you all to use and I’m following Rust/Cargo as my overall inspiration.

I hope I’ve succeeded in showing how much work still remains for this nREPL server to be built and shipped out. This will take me several months, I’d estimate. However, I think having this sort of goal in mind is very powerful and I’m excited that jank is far enough along to where I can actually be doing this.

nREPL server progress

Since I now have C++ sources working alongside jank sources, I can use boost::asio to spin up an async TCP server. The data sent over the wire for nREPL servers is encoded with bencode, so I started on a jank.data.bencode project and I have the decoding portion of that working. From there, I wanted to write my tests in jank using clojure.test, but I haven’t implemented clojure.test yet, so I looked into doing that. It looks like clojure.test will require me to implement multimethods in jank, which don’t yet exist. On top of that, I’ll need to implement clojure.template, which requires clojure.walk, none of which have been started.
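For reference, bencode itself is a tiny format: integers are encoded as i&lt;n&gt;e, strings as &lt;length&gt;:&lt;bytes&gt;, lists as l…e, and dictionaries as d…e. A minimal decoder sketch in plain Clojure (illustrative only — this is not the actual jank.data.bencode implementation) might look like:

```clojure
;; Minimal bencode decoder sketch (hypothetical; not jank.data.bencode).
;; decode-at returns [decoded-value next-index] so nesting works.
(defn- decode-at [^String s i]
  (let [c (nth s i)]
    (cond
      ;; integer: i<digits>e
      (= c \i) (let [e (.indexOf s "e" (int i))]
                 [(Long/parseLong (subs s (inc i) e)) (inc e)])
      ;; list: l<items>e
      (= c \l) (loop [i (inc i), acc []]
                 (if (= (nth s i) \e)
                   [acc (inc i)]
                   (let [[v i'] (decode-at s i)]
                     (recur i' (conj acc v)))))
      ;; dictionary: d<key value ...>e
      (= c \d) (loop [i (inc i), acc {}]
                 (if (= (nth s i) \e)
                   [acc (inc i)]
                   (let [[k i']  (decode-at s i)
                         [v i''] (decode-at s i')]
                     (recur i'' (assoc acc k v)))))
      ;; string: <length>:<bytes>
      :else    (let [colon (.indexOf s ":" (int i))
                     len   (Long/parseLong (subs s i colon))
                     start (inc colon)]
                 [(subs s start (+ start len)) (+ start len)]))))

(defn decode [s] (first (decode-at s 0)))

(decode "d4:name4:jank3:agei1ee")
;; => {"name" "jank", "age" 1}
```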

I’ll continue on with this depth-first search, implementing as needed, and then unwind all the way back up to making more progress on the nREPL server. Getting clojure.test working will be a huge step toward being able to dogfood more, so I don’t want to cut any corners there. Once I can test my decode implementation for bencode, I’ll write the encoding (which is easier) and then I’ll be back onto implementing the nREPL server functionality.

Hang tight, folks! We’ve come a long way, and there is still so much work to do, but the wheels are rolling and jank is actually becoming a usable Clojure dialect. Your interest, support, questions, and encouragement are all the inspiration which keeps me going.


Plexus: John Collins

Q2 2024 Report 1. Published May 15, 2024

1. Much better Loft Algorithm.

The previous loft algorithm simply mapped vertices one-to-one between cross-sections. This meant that lofted cross-sections had to have the same number of vertices. The new loft algorithm is much more general and now supports many-to-one and one-to-many vertex mappings. Loft is a very important operation that is not available in many programmatic CAD tools, so I’m excited to have support for this.

I go over this a bit more in a blog post: http://www.cartesiantheatrics.com/2024/04/09/perfect-loft.html

2. Text rendering Support

I wrote a first cut at text rendering support in clj-manifold3d, a companion library to plexus. There is now a simple function called text that takes a .ttf font file, a string, and a few other parameters and renders the string to a 2D cross-section that can be extruded like any other cross-section. This is done in C++ by interpolating freetype2 glyphs.

Aside from built-in support for textures, this is the last major feature that clj-manifold3d lacked compared to OpenSCAD.

3. N Slices

There is now a highly efficient algorithm to slice a manifold into N evenly spaced 2D cross sections. This is a primitive that can be used to implement custom slicers. In the future, I will likely be writing a slicer that gives a high level of control over the G-code generated.

4. Three Point Arcs

You can now create arcs given three points. This lets you draw circles or arcs in a way similar to how it’s often done in graphical CAD systems. It is useful for many complex polygon constructions.
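The math underneath is the classic circumcircle construction: the circle through three non-collinear points is centered at their circumcenter, which has a closed form. A standalone Clojure sketch (illustrative only, not the plexus API):

```clojure
;; Circle through three non-collinear 2D points via the closed-form
;; circumcenter formula (illustrative; not the actual plexus API).
(defn circumcircle [[ax ay] [bx by] [cx cy]]
  (let [d  (* 2.0 (+ (* ax (- by cy)) (* bx (- cy ay)) (* cx (- ay by))))
        a2 (+ (* ax ax) (* ay ay))
        b2 (+ (* bx bx) (* by by))
        c2 (+ (* cx cx) (* cy cy))
        ux (/ (+ (* a2 (- by cy)) (* b2 (- cy ay)) (* c2 (- ay by))) d)
        uy (/ (+ (* a2 (- cx bx)) (* b2 (- ax cx)) (* c2 (- bx ax))) d)]
    {:center [ux uy]
     :radius (Math/hypot (- ax ux) (- ay uy))}))

(circumcircle [0.0 0.0] [2.0 0.0] [0.0 2.0])
;; => center [1.0 1.0], radius ~1.414 (sqrt of 2)
```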

5. Progress on Navigation on Manifolds

This is totally unfinished and unproven, but I have been experimenting heavily with a somewhat unique way of doing custom texturing of manifolds. It works by creating a set of primitives that make it easy to “draw on” arbitrary 3D manifolds as if you’re drawing on a 2D plane. I’m hoping to achieve something more fine-tunable than common texturing methods (like UV mapping) that could integrate seamlessly with Plexus’s abstractions. It has proven to be challenging, and it’s unclear how well it’s going to work out.

While continuing to tinker with this approach, I’ll likely add support for more standard texturing.

6. Example projects

I improved and updated example projects that demonstrate using plexus for real-world problems. These are:

I will try more examples in the future.

7. Better support for 2D affine transforms

2D affine transforms (3x2 matrices) are now fully supported. They support the same API as 2D cross sections. Having affine transforms as independent objects is a big advantage over OpenSCAD.
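To illustrate why independent transform objects are handy, here is a sketch (my own representation, not the clj-manifold3d API): a 3x2 affine transform can be held as three column vectors — two for the linear part, one for translation — and composed before ever touching geometry:

```clojure
;; A 3x2 affine transform as [[a b] [c d] [tx ty]]: two columns for
;; the linear part plus a translation column (illustrative only).
(defn apply-affine [[[a b] [c d] [tx ty]] [x y]]
  [(+ (* a x) (* c y) tx)
   (+ (* b x) (* d y) ty)])

;; Compose m2 after m1: multiply the linear parts, then map m1's
;; translation through m2's linear part and add m2's translation.
(defn compose-affine [[[a2 b2] [c2 d2] [tx2 ty2]]
                      [[a1 b1] [c1 d1] [tx1 ty1]]]
  [[(+ (* a2 a1) (* c2 b1)) (+ (* b2 a1) (* d2 b1))]
   [(+ (* a2 c1) (* c2 d1)) (+ (* b2 c1) (* d2 d1))]
   [(+ (* a2 tx1) (* c2 ty1) tx2) (+ (* b2 tx1) (* d2 ty1) ty2)]])

(def rotate-90 [[0 1] [-1 0] [0 0]])  ;; 90-degree CCW rotation
(def translate [[1 0] [0 1] [1 2]])   ;; translate by (1, 2)

;; Rotate first, then translate: (1, 0) -> (0, 1) -> (1, 3).
(apply-affine (compose-affine translate rotate-90) [1 0])
;; => [1 3]
```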

8. Tests, Docs, Code Cleanup

Documentation is still sorely lacking, but READMEs have been improved. There are more tests now and somewhat better API documentation in the code. There are also now GitHub Actions pipelines in an effort to bring a bit more transparency and professionalism to the dev process.


Lost in Lambduhhs Podcast: L. Jordan Miller

Q2 2024 Report 1. Published May 15, 2024.

Progress Overview

I have made significant progress on my new podcast series, thanks to the support from Clojurists Together. Here are the key milestones I’ve achieved so far and my plans moving forward:

Platform Subscription

  • Subscribed to Riverside.fm as my podcasting platform to ensure high-quality audio recordings and ease of use for my guests.

Guest Coordination

  • First Four Guests Identified and Scheduled:
    • David Nolen - Recorded on Monday, May 13.
    • Arne Brasseur - Scheduled to record on Wednesday, May 15.
    • Recia Roopnarine - Scheduled to record on Tuesday, May 21.
    • Raf Dittwald - Scheduled to record on Tuesday, May 21.

Recordings and Editing

  • Completed Recording:
    • My first session with David Nolen was successfully recorded.
  • Current Task:
    • I am currently editing and post-processing the interview with David Nolen.
  • Upcoming Recordings:
    • Recording with Arne Brasseur on May 15.
    • Recording with Recia Roopnarine and Raf Dittwald on May 21.

Release Schedule

  • Planned Release:
    • I plan to release the episode featuring David Nolen within the next week.

Next Steps

  • Complete the editing phase for the first episode.
  • Prepare and test all technical setups for the upcoming recordings.
  • Begin outreach for additional guests to feature in future episodes.

Conclusion

I am on track with my project timeline and excited about the content I am creating. I will continue to provide updates as I progress further.


Scicloj: Daniel Slutsky

Q1 2024 Report 3. Published April 30, 2024

April 2024 was the last of three months on the Clojurists Together project titled “Scicloj Community Building and Infrastructure”.

Scicloj is an open-source group developing Clojure tools and libraries for data and science. As a community organizer at Scicloj, my current role is to help make the emerging Scicloj stack easier and more accessible for broad groups of Clojurians. I collaborate with a few Scicloj members on this.

While this is the last update under the Clojurists Together 2024 Q1 support, the project will, of course, continue.

Below are the sub-projects that were addressed during April 2024. They are listed by their proposed priorities for the coming month.

The new real-world-data group is ranked highest for its impact on community growth. This means that, assuming the group (hopefully) grows well and demands attention, the goals of other projects will receive less attention and be delayed. However, some of them (e.g., required extensions or bugfixes to libraries) will receive more attention if the real-world-data group requires them.

The real-world-data group

The real-world-data group is a space for Clojure data and science practitioners to bring their data projects, share experiences, and evolve common practices.

April summary

  • had a few one-on-one meetings with group members, discussing their goals, and helping out with the technical path
  • had the second and third group meetings, which included new presentations, follow-ups on personal projects, hands-on parts, and discussions
  • kept working on introductory materials to support the group

May goals

  • have more one-on-one meetings, three more group meetings, and ad-hoc small topical meetings
  • help the participants take on active paths that connect their interests with community goals

     

Noj

The Noj project bundles a few recommended libraries for data and science and adds convenience layers and documentation for using them together.

April summary

  • collaborated with Kira McLean on a draft for a new data-visualization API, combining Tablecloth, Hanami, and statistical functions
  • updated documentation: added a tutorial for visualizing correlation matrices (WIP); started working on an additional machine-learning tutorial
  • updated the implementation to reuse existing functions of other libraries

May goals

  • implement the new data-visualization API (still in experimental stage)
  • improve documentation

     

translating books

In this project, we are renewing previous efforts to systematically review data science books in other programming languages and convert them to Clojure.
The goal is twofold: figuring out what common data science needs are still missing in the Clojure stack and creating well-polished documentation of this stack. It is also an opportunity for Clojurians to get involved in the data science community and learn from books they are curious about.

April summary

  • created a list of books (to be announced soon in a tidy repo) in a discussion with community members, exploring content and licenses
  • explored a Clay+Quarto workflow for a couple of the books, and created draft repos for them
  • started exploring certain books with community members who may take them on as their long-term projects

May goals

visual-tools group

This group’s goal is to create collaborations in learning and building Clojure tools for data visualization, literate programming, and UI design.

April summary

  • had the third ggplot study session
  • had a meetup about badspreadsheet and HTMX with Adam James
  • coordinated collaborations with a few group members who are working on HTMX-based dashboards (TBA)
  • kept exploring options for grammar-of-graphics implementations (documented in the Scrapbook)

May goals

  • keep the collaborations around HTMX-based layers
  • continue the grammar-of-graphics study sessions
  • clarify a proposal and a proof-of-concept for the long-term grammar-of-graphics project

     

Clojure Data Scrapbook

The Clojure Data Scrapbook is intended to be a community-driven collection of tutorials around data and science in Clojure.

April summary

  • continued the “exploring ggplot” tutorials
  • started tutorials (WIP): processing JSON files, analyzing transportation networks
  • adapted old tutorials to ecosystem updates

May goals

  • encourage and help community contributions to the scrapbook
  • keep adding content to support other projects

     

Clay

Clay is a minimalistic namespace-as-a-notebook tool for literate programming and data visualization.

April summary

  • added an experimental version of test generation for the purpose of testable docs / literate testing
  • minor bugfixes and extensions
  • 5 minor releases of Clay, 2 minor releases of the clay.el Emacs package

May goals

  • start working on additional visualizations, mostly Emmy.viewers integration
  • explore the extraction of the HTML and Markdown generation layer as a separate library
  • keep evolving by user needs

     

Kindly

Kindly is a proposed standard for requesting data visualizations in Clojure.

April summary

  • added the meta-kind kind/fn for user-defined display
  • added the meta-kind kind/test-last (with kindly/check syntactic sugar) for test generation
  • updated documentation (the Kindly-noted project)
  • updated kind-clerk (Clerk adapter): plotly support

May goals

  • start working on Kindly support with the creators of new HTMX-based visual-tools
  • explore the option of a standalone Kindly implementation that is reusable in different tools (an alternative to the current approach of tool-specific implementations)

     

cmdstan-clj

Cmdstan-clj is a draft library for interop with Stan (probabilistic modeling through Bayesian statistics).

April summary

  • gave a presentation of the library and the topic of Bayesian Statistics at the real-world-data group
  • maintenance: adapted the library to related ecosystem changes

May goals

  • practice usage with community members and keep developing by need

     

ClojisR

ClojisR is a bridge between Clojure and the R language for statistical computing. During this month, @generateme released the first non-beta version of the library and announced it as stable after 4.5 years of usage.

April summary

  • My role in the release was mostly migrating the old documentation to use our current literate programming workflow with Clay, test-generation, and Quarto.

May goals

The Scicloj website

April summary

  • Maintenance and updates

May goals

  • minor updates reflecting current projects and events

     

Your feedback would help

Scicloj is in transition. On the one hand, quite a few of the core members have been very active recently, developing the emerging stack of libraries. At the same time, new friends are joining, and soon more people will enjoy using Clojure for common data and science needs.

If you have any thoughts about the current directions, or if you wish to discuss how the evolving platform may fit your needs, please reach out.

Permalink

Copyright © 2009, Planet Clojure. No rights reserved.
Planet Clojure is maintained by Baishampayan Ghose.
Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
Theme by Brajeshwar.