Heart of Clojure 2.0

This post contains practical information about the CFP; you can scroll down for that, but we do recommend you read the whole post before submitting a talk or session.

In 2019 we organized the first Heart of Clojure, a conference for the Clojure community, in Leuven, Belgium. On two lovely summer days, 250 technologists came together to meet each other, watch inspiring talks, and join in activities in and around the venue and the city. We had talks about building a compiler, how to tap into the non-verbal capacities of your mind, lessons learned from teaching kids to code, climate change, and techniques for improving the resiliency of your code base. We had a case study from a local law-tech Clojure startup, and an 11-year-old presented the games she made in Clojure and Racket.

People also got to make their own screen-printing designs, a historian guided us through the beautiful city of Leuven, a mystery dinner randomly grouped people for an evening meal, and people practiced yoga and meditation on the grass. Oh, and we had waffles, delicious Belgian waffles.

[Image: Coffee and breakfast on day 2]
[Image: The screen printing caravan]

It was a special experience; we got overwhelmingly positive feedback, and everyone wanted to know when the next edition would come. To which we didn't have a good answer. We knew it wasn't going to be a yearly thing. Planning for the first edition took us a full year, and we weren't ready to jump in and start planning the next event.

Double the venues, double the fun

We also knew we wanted to do some things differently. One of the big challenges was how to increase capacity. Hal 5 is an amazing venue; the location couldn't be better, it's full of charm and character, and the team at Hal 5 had been a pleasure to work with. But we were pushing the limits of what we could do there. We had about 250 seats, of which 150 went to regular ticket holders, and the rest to speakers, crew, and sponsors. Given that we sold out well before the conference, and given the positive reception, we knew there would be enough interest to go significantly bigger, but cramming more seats into Hal 5 wasn't an option.

We talked to half a dozen other venues in Leuven, but Het Depot was the only one we got really excited about.

Het Depot is a music venue right across from the Leuven train station: a converted old cinema with a number of fixed seats and a large "pit" in front of the stage, where more seating can be placed.

[Image: Theater seating at Het Depot]

Het Depot checked most of our boxes: there's plenty of seating, there's excellent A/V equipment available, the location is great, and they have a nice bar and lockers. But it's missing the open atmosphere of Hal 5. It's perfect for hosting presentations, but less ideal for all the fringe activities and community sessions we like to host.

So we decided to do both. We'll open and close the conference with everyone together at Het Depot, for some of the finest and most thought-provoking presentations. For the rest of the event we'll swarm out between the two venues. They're only a short walk from one another, with the two hotels where people are most likely to stay positioned right in the middle. And of course the station is right there as well, with excellent connections to Germany, the UK, the Netherlands, France, and from there to the rest of Europe.

[Image: Map of the Leuven train station area]

Heart of Clojure is an interactive conference. We're bringing people together from all over the continent, people who normally would only interact online, if at all. It's a unique opportunity to get to know like-minded people, and to create a bond through shared experience. Wouldn't it be a shame if for two days most of what you did was sitting side by side, looking at the stage? Watching a presentation you could have watched online? What a waste. Yet this is how a lot of conferences go.

Instead, we try to create a mix. Keynotes have their place: they inspire, provide food for thought and conversation, and set the stage and tone of the event. But once the stage is set, it's your turn to get up and participate. Heart of Clojure reifies the hallway track. We'll have extended breaks and opportunities to just walk around, chat with others, and exchange experiences.

Sessions and Activities

We&aposll also have sessions and activities. Sessions is our catch-all name for technical sessions that are more interactive than just watching a talk. Some examples of sessions could be:

  • Workshop, where an instructor walks you step by step through the use of a library, project, or technology
  • Birds of a Feather, where a moderator guides discussion around a topic of interest
  • Office Hours, where a maintainer or contributor of a project is available to answer questions and help others
  • Contributor Onboarding, where people can show up who want to contribute to an open source project or community initiative, and are shown the ropes

There will also be non-technical Activities to partake in: do yoga, go climbing, explore the city, do a book swap, practice a second language... For us the primary reason to go to a conference is to create connections, and the best way to make connections with people is to just do stuff. That's the idea behind activities. Plus, it's fun!

For both of these, sessions and activities, we'll have an Activities App. Here you can see what's happening, sign up, or set up your own sessions or activities. That's right, this isn't a one-way street. We're taking some of the ideas from unconferences, and opening the floor to all attendees to help shape this event. Want to get up to 7 people together to go check out the best gelato place in town? Make an activity! Want to find some folks to practice your Esperanto with? Make an activity! Do origami, practice knots, whatever you're into. This is your chance to find like-minded geeks.

Opening the CFP

Which brings us to the Call for Proposals. We&aposre looking for a wide range of topics and formats. Think outside of the box, propose something original and appealing.

The schedule for regular talks will be filled entirely through the CFP, with potentially a few keynote speakers being invited directly. Sign-up for the five-minute lightning talks at the end of the conference will be ad hoc, on the day.

Sessions will mostly be scheduled through the CFP, so we can ensure a diverse and appealing offering. But some spaces and time slots will be left open for ad-hoc community sessions, which can be booked and organized by any attendee.

Since we want to leave much of the conference open for interactive sessions, slots for regular talks are limited. We are especially interested in talks that cross the gap between programming and the wider world, that synthesize lessons from other fields, and that make connections between disparate intellectual schools; talks that consider the practice of software development holistically, and from the perspective of the humans involved.

We are also interested in deep technical talks, although even these should try to situate themselves in the wider world. What real world context gave rise to this technological solution? Who are the people involved, and how does this impact them? Tell us a story, take us on a journey.

Talks and sessions do not have to be about Clojure. They merely have to be interesting to the kind of people who are drawn to Clojure: inquisitive, open-minded people who prefer solving problems over solving puzzles. We warmly welcome people who are active in other communities, or other fields, to come share their knowledge and experience with us.

We are really, really looking forward to your proposals! Please spread this CFP far and wide!

Permalink

Clojure Deref (Apr 19, 2024)

Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem (feed: RSS). Thanks to Anton Fonarev for link aggregation.

From the core

We have three big chunks of work remaining for 1.12. The first one, reworking the symbolic array type representation, is complete and captured in CLJ-2807. It will roll back the previous implementation added in alpha6 and replace it with a new representation. Array symbols will have the syntax ComponentType/dimension, e.g. String/1 or long/2. More to say on that when we release.
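
As a hedged sketch of the planned syntax described above (pre-release, so details may change):

;; String/1 names the class of a one-dimensional String array
(= String/1 (class (make-array String 0)))
;; => true

;; the same syntax works as a type hint, here for a primitive long[] param
(defn sum-longs ^long [^long/1 xs]
  (areduce xs i acc 0 (+ acc (aget xs i))))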

The second piece is reworking the method value and param-tags implementation. The big picture is that param-tags must resolve to a single method, and that qualified methods will otherwise support reflective invocation. Additionally, we’re going to alter the syntax of qualified instance methods to make them distinct with Classname/.method (vs Classname/method for static). There are existing cases in the JDK (Double.isNaN()) that have both static and instance methods that overlap in effective arity (instance methods take the instance object as first "arg") - method values have no way to differentiate between these and that was causing a lot of pain in the implementation. This work is nearing completion.
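
To make that concrete, here is a hedged sketch of the qualified-method syntax as described (again pre-release, so details may change):

(map String/.length ["a" "bb"])  ;; qualified instance method value (note the dot)
;; => (1 2)
(map Math/sqrt [1 4 9])          ;; qualified static method value
;; => (1.0 2.0 3.0)
(^[long] Math/abs -3)            ;; param-tags resolve a single overload
;; => 3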

The final piece is functional interface conversion. We’ve been working through some of the implementation details and figuring out some of the gnarly bits of optimizing method values passed as FIs (avoiding intermediate thunks or converters). This piece is also getting close. At an interim point this week I ran some wide regression testing and found a few interesting cases, some were actual bugs I found in the wild, some were things I was able to smooth over in the implementation. Integrating all of the things above cleanly has been quite challenging, but we are getting close.
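
For illustration, a hedged sketch of what functional interface conversion enables, based on the description above:

;; a Clojure fn passed directly where Java expects a functional
;; interface (here java.util.function.Predicate), with no reify:
(let [xs (java.util.ArrayList. [1 2 3 4])]
  (.removeIf xs even?)
  xs)
;; => [1 3]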

Have been working on Clojure/Conj stuff too, hoping to get info about CFP, tickets, and sponsorship out in the next month.

Blogs, articles, and projects

Libraries and Tools

New releases and tools this week:

Permalink

Measuring startup and shutdown overhead of several code interpreters

I used the hyperfine tool to measure the startup and shutdown overhead of several code interpreters. Each interpreter was invoked with an empty string for evaluation. In the few cases where that was not possible, an empty file was passed as an argument.
This is more of an interesting experiment and a presentation of the hyperfine utility and its plotting capabilities than a serious benchmark. The results should not be taken as an indication that "X is faster than Y".

Here's the link to the repo with the code:

Hyperfine

Hyperfine is a command-line benchmarking tool. It reruns the given commands multiple times and does statistical analysis. It can export measurements to various file formats like CSV, JSON, Markdown, etc. Those results can then be used as input for a bunch of included scripts (e.g. to draw a plot).

Basically, it does all the hard part, so that you don't have to.

Check out the official hyperfine Github repo

Here's the command I used to do measurements:

hyperfine -N --warmup 1 --export-csv 1.csv --export-json 1.json <CODE_INTERPRETERS...>
  • -N tells hyperfine not to use a shell for launching commands. It's not needed in my case, because I'm launching binaries.
  • --warmup 1 performs one run of each command before doing the actual measurements.
  • --export-csv 1.csv exports a summary to a CSV file.
  • --export-json 1.json exports the measurement data to a JSON file. This is needed to plot the results.
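
For example, a concrete invocation could look like this (the interpreters listed here are illustrative):

hyperfine -N --warmup 1 --export-csv 1.csv --export-json 1.json \
  'python3 -c ""' 'ruby -e ""' 'lua -e ""' 'dash -c ""'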

Other useful options include:

  • --min-runs performs at least n runs for each command.
  • --runs performs exactly n runs for each command.
  • --parameter-list lets you execute parametric runs, as sketched below.
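
A hedged sketch of a parametric run, benchmarking several shells (all of which accept -c):

hyperfine -N --warmup 1 --parameter-list shell bash,zsh,dash '{shell} -c ""'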

Plotting the results

Hyperfine comes with a Python script that generates a box plot of the measurements.
It requires the measurement data to be exported as JSON, which is then used as the input file.
I made some minor improvements to the script to better suit my needs.
If you're not familiar with box plots, you can read e.g. How to interpret the result of a box plot?

Interpreters

I used the following code interpreters:

Lua 5.4.6
LuaJIT 2.1.1702233742
GNU bash, version 5.2.26(1)-release
zsh 5.9
dash 0.5.12-1
fish, version 3.7.0
GNU Awk 5.3.0
perl 5, version 38, subversion 2 (v5.38.2)
ruby 3.0.6p216
Python 3.11.8
PHP 8.3.3
JavaScript-C60.9.0
JavaScript-C78.15.0
guile (GNU Guile) 3.0.9
CHICKEN Version 5.3.0 (rev e31bbee5)
Erlang/OTP 26 [erts-14.2.2]
julia version 1.10.2
The OCaml toplevel, version 5.1.0
SWI-Prolog version 9.0.4
(GNU Prolog) 1.5.0
scryer-prolog "c8bf0be"
Poly/ML 5.9.1 Release
R version 4.3.3
tcl 8.6.14-1
cbqn r1646.49c0d9a-1

Results

The results of the box plot are sorted by median time.

[Image: box plot of the results]

Higher res image

Here are the results without the slowest ones:

[Image: box plot of the results, slowest interpreters excluded]

Higher res image

Some insights:

  • dash, luajit, and lua have the smallest overhead.
  • perl comes right after. It's actually fascinating that its overhead is smaller than that of shells, considering how heavily shells are used for scripting.
  • The next group is tclsh, awk, sh, bash, zsh, and gprolog. Two surprising things here. First, I did not expect awk to have bigger overhead than perl, considering it's much smaller and more frequently used for small scripts. Second, I did not expect a language like Prolog to even be here (especially looking at the results of the other Prolog implementations). I'd have thought that such higher-level, more complex languages would have bigger startup/shutdown overhead.
  • Past this point, the results are much more varied.
  • Next come CHICKEN Scheme, Poly/ML, and fish. Fish has somewhat bigger overhead than the other shells. That shouldn't matter much, though, because it's focused on interactive use. You should still write your shell scripts in /bin/sh (or /usr/bin/env bash).
  • Next are Guile Scheme and BQN, the latter with the biggest spread of measurements so far.
  • After that come php and python.
  • Then there's a small gap, followed by SpiderMonkey (js78 and js60).
  • After that ocaml, and then node. A bit surprising that SpiderMonkey gets better results here than node.
  • Then there's a big jump to ruby. It's quite surprising, because I'd expect it to be closer to python/php.
  • Then there's an even bigger gap, and we get julia, R, erlang, and scryer-prolog.
  • We finish with another big jump and the last result: swi-prolog.

I tried to add clojure, but its overhead was twice that of swi-prolog, which is probably expected: the JVM is known for not being the fastest when it comes to startup.

Note that I was running this on my personal machine. Some interpreters read configuration files, while others do not, which adds to the overhead. For more reliable results, this should be run in a separate, clean environment.

Conclusions

hyperfine is cool. It does all the hard work and comes with a script to visualise the results. You should use it for your projects.

No definitive conclusions should be drawn from the measurements themselves, because in most cases the overhead of startup and shutdown doesn't matter. It only matters if you have a really tiny script that is run a huge number of times. If you really do, then maybe dash, lua, or perl would be a good choice. Or maybe not; please measure.

Permalink

March 2024 Short-Term Project Updates

We’ve got a lot of great work to report on - all projects funded in Q1 2024. Thanks to all!

clojure-lsp: Eric Dallo
Instaparse: Mark Engelberg
Jank: Jeaye Wilkerson
Scicloj: Daniel Slutsky
SiteFox: Chris McCormick
UnifyBio: Benjamin Kamphaus
Wolframite: Thomas Clark

clojure-lsp: Eric Dallo

Q1 2024 Funding. Reports 2 & 3. Published March 1 and April 1, 2024

clojure-lsp

The main highlight from my work in February is the new different-aliases linter, which helps guarantee alias consistency across your codebase!
[Image: the different-aliases linter in action]
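
As a hedged sketch (the namespaces are illustrative; the actual diagnostics come from clojure-lsp), the linter flags the same namespace required under different aliases across a codebase:

;; a.clj
(ns app.a (:require [clojure.string :as str]))

;; b.clj
(ns app.b (:require [clojure.string :as string])) ;; flagged: different alias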

In April I spent some time fixing and improving clojure-lsp for Calva, but spent most of the time working on the IntelliJ support for LSP and REPL, improving both clojure-lsp-intellij and clojure-repl-intellij.

2024.03.01-11.37.51

General

  • Bump clj-kondo to 2024.02.13-20240228.191822-15.
  • Add :clojure-lsp/different-aliases linter. #1770
  • Fix unused-public-var false positives for definterface methods. #1762
  • Fix rename of records when usage is using an alias. #1756

Editor

  • Fix documentation resolve not working for clients without hover markdown support.
  • Added setting to allow requires and imports to be added within the current comment form during code action and completion: :add-missing :add-to-rcf #1316
  • Fix suppress-diagnostics not working on top-level forms when preceded by comment. #1678
  • Fix add missing import feature on some corner cases for java imports. #1754
  • Fix semantic tokens and other analysis not being applied for project files at root. #1759
  • Add support for adding missing requires and completion results referring to JS libraries which already have an alias in the project #1587

2024.03.31-19.10.13

Editor

  • Adding require command fails for requires without alias. #1791
  • Add require command without alias now adds requires with brackets.
  • Project tree feature now supports keyword definitions like re-frame sub/reg. #1789
  • Support textDocument/foldingRange LSP feature. #1602
  • Improve textDocument/documentSymbol, considering keyword definitions and returning flattened elements.
  • Fix Add require/import usages count in code actions. #1794.

2024.03.13-13.11.00

General

  • Bump clj-kondo to 2024.03.13, fixing a high memory usage issue.

Editor

  • Fix workspace/didChangeConfiguration exception causing noise on logs. #1784

clojure-lsp-intellij

1.14.8 - 1.14.10

  • Fix exception when starting server related to previous version.
  • Fix some exceptions that can rarely occur after startup.
  • Bump clojure-lsp to 2024.02.01-11.01.59.

There was a major change to how the plugin starts clojure-lsp: it now starts a clojure-lsp process under the hood (like all other editors) instead of using clojure-lsp as a JVM dependency, which fixed a lot of macOS bugs.
This also adds support for “find implementations” of defmultis and protocols, something that was never possible in any other IntelliJ plugin.

2.0.0 - 2.3.2

  • Use clojure-lsp externally instead of built-in, since the built-in approach sometimes causes PATH issues. Fixes #25 and #26
  • Fix multiple code lens for the same line. #29
  • Fix OS type detection for non-aarch64 macOS when downloading the clojure-lsp server.
  • Fix references for different URIs when finding references.
  • Fix noisy code lens exception. #33
  • Support “Find implementations” of defmultis/defprotocols. #31
  • Fix commands, code actions not being applied after 2.0.0.
  • Improve “find declaration or usages” to show popup for references.
  • Improve Find references/implementations to go directly to the usage if only one is found. #39
  • Wait to check for client initialization, for a minor CPU usage improvement.
  • Support multiple projects opened with the plugin. #37
  • Fix Stackoverflow exception when renaming. #32
  • Add common shortcuts to DragForward and DragBackward.
  • Fix race condition NPE when intellij starts slowly.

clojure-repl-intellij

Although this is not related to clojure-lsp, it’s a critical library for IntelliJ usage, since without it there is no REPL usage using only LSP. I spent considerable time adding the missing features to make this plugin good enough for a stable release.
Now the plugin has test support!

0.1.7 - master

  • Use cider-nrepl middleware to support more features.
  • Add test support. #46
  • Fix freeze on evaluation. #48
  • Improve success test report message UI.
  • Support multiple opened projects. #51
  • Fix eval not using same session as load-file. #52



Instaparse: Mark Engelberg

Q1 2024 Funding. Report 1. Published March 30, 2024.

Thanks to funding from Clojurists Together, I have been able to review Instaparse pull requests that have been submitted over the past couple of years. I began by incorporating some “low-hanging fruit” pull requests, which addressed some quality-of-life issues raised by users with minimal changes to the code. Although these were small changes and code was contributed by other users, I needed to test the code and make sure the changes were adequately documented.

I also engaged with users who submitted issues where I needed more explanation or input to carefully consider their proposals. In some cases, I spent time evaluating pull requests but eventually decided not to incorporate that pull request. A good example of this was tonsky’s proposal to change the way that parsers and error messages print at the REPL. His proposal was logical but would be a breaking change, which posed a dilemma. After collecting information from him, I consulted with other Clojurists who are involved with instaparse as well as pretty printing and REPLs. I came to the conclusion that it was better to leave as-is.

The most interesting issue I began looking at was a suggestion to incorporate namespaced non-terminals. This is an excellent suggestion because namespaced keywords have taken on much more importance in Clojure since the time instaparse was initially released, due to their critical role in Datomic and Spec. This will be my next area of focus.

I had hoped to complete more work on namespaced keywords, but I spent most of March ill with covid (my first time getting covid), which delayed my work on this more substantive issue. So, rather than wait for the completion of that work, I deployed a new release with the pull requests I had incorporated so far (actually two releases in quick succession: 1.4.13 and 1.4.14) and I look forward to making progress on namespaced keywords in the coming weeks.


Jank: Jeaye Wilkerson

Q1 2024 Funding. Report 2. Published March 30, 2024.

Oh, hey folks. I was just wrapping up this macro I was writing. One moment.

(defmacro if-some [bindings then else]
  (assert-macro-args (vector? bindings)
                     "a vector for its binding"
                     (= 2 (count bindings))
                     "exactly 2 forms in binding vector")
  (let [form (get bindings 0)
        pred (get bindings 1)]
    `(let [temp# ~pred]
       (if (nil? temp#)
         ~else
         (let [~form temp#]
           ~then)))))

“Does all of that work in jank?” I hear you asking yourself. Yes! Indeed it does. Since my last update, which added dynamic bindings, meta hints, and initial reader macros, I’ve finished up syntax quoting support, including gensym support, unquoting, and unquote splicing. We might as well see all of this working in jank’s REPL CLI.

$ jank repl
> (defmacro if-some [bindings then else]
    (assert-macro-args (vector? bindings)
                       "a vector for its binding"
                       (= 2 (count bindings))
                       "exactly 2 forms in binding vector")
    (let [form (get bindings 0)
          pred (get bindings 1)]
      `(let [temp# ~pred]
         (if (nil? temp#)
           ~else
           (let [~form temp#]
             ~then)))))
#'clojure.core/if-some
> (if-some [x 123]
    (str "some " x)
    "none")
"some 123"
> (if-some [x nil]
    (str "some " x)
    "none")
"none"
>

New interpolation syntax

Some of the early feedback I had for jank’s inline C++ support is that the interpolation syntax we use is different from what ClojureScript uses. Turns out there’s no reason to be different, aside from jank needing some more work, so jank has been improved to support the new ~{} syntax. If you’re not familiar, inline C++ in jank looks like this:

(defn sleep [ms]
  (let [ms (int ms)]
    ; A special ~{ } syntax can be used from inline C++ to interpolate
    ; back into jank code.
    (native/raw "auto const duration(std::chrono::milliseconds(~{ ms }->data));
                 std::this_thread::sleep_for(duration);")))

More reader macros

Aside from that, reader macro support has been extended to include shorthand #() anonymous functions as well as #'v var quoting. The only reader macro not yet implemented is #"" for regex. All of that concludes what I had aimed to accomplish for my quarter, and then some. It doesn’t stop there, though.

I’m wonderfully pleased to announce that jank now has a logo! The logo was designed by jaide, who was graciously patient with me and a joy to work with through the various iterations. With this logo, we’re capturing C++ on one side, Lisp on the other, and yet a functional core.
[Image: the new jank logo]

Transients

Back to code. In truth, there’s more work going on. A lovely man named Saket has been helping me fill out jank’s transient functionality, which now includes array maps, vectors, and sets, as well as the corresponding clojure.core functions. This is not the first time I’ve brought up Saket, since he also implemented the initial lein-jank plugin. Let’s take a look at that.

lein-jank

This plugin isn’t ready for prime time yet, but it’s a good proof of concept that jank can work with leiningen’s classpaths and it’s a good testing ground for multi-file projects. jank will be adding AOT compilation soon and this lein-jank plugin will be the first place new features will land. As a brief demonstration of where it is today, take a look at this session.

$ cat project.clj
(defproject findenv "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.11.1"]]
  :plugins [[lein-jank "0.0.1-SNAPSHOT"]]
  :jank {:main findenv.core})

$ cat src/findenv/core.jank
(ns findenv.core)

(defn -main [& args]
  (let [env-var (first args)
        ; Call through into native land to look up the var.
        env-val (or (native/raw "auto const str(runtime::detail::to_string(~{ env-var }));
                                 __value = make_box(std::getenv(str.c_str()));")
                    "var not found")]
    (println env-val)))

$ export FINDME="found me!"

$ lein jank run FINDME
found me!

$ lein jank run YOUWONTFINDME
var not found

$ lein jank run LC_ALL
en_US.UTF-8

Migration from Cling to clang-repl

Lastly, I’ve been working on migrating jank to use the upstream LLVM version of Cling, called clang-repl. The key benefit is that we’d no longer need to compile our own Cling/Clang/LLVM stack in order to build jank, and we could distribute jank against each distro’s normal LLVM package rather than shipping our own. On top of that, future work is happening more on clang-repl than on Cling, so it has recent support for loading pre-compiled C++20 modules, for example. That would greatly improve jank’s startup performance, since Cling doesn’t allow us to load pre-compiled modules at this point.

Work here is ongoing and there are some bugs that I have identified in clang-repl which need to be fixed before jank can fully make the switch. I’ll keep you all updated!


Scicloj: Daniel Slutsky

Q1 2024 Funding. Report 2. Published March 31, 2024.

March 2024 was the second of three months on the Clojurists Together project titled “Scicloj Community Building and Infrastructure”. Scicloj is an open-source group developing Clojure tools and libraries for data and science.

As a community organizer at Scicloj, my current role is to help make the emerging Scicloj stack easier and more accessible for broad groups of Clojurians. I collaborate with a few Scicloj members on this.

In March 2024, this has been mostly about the following projects. The projects are listed by their proposed priorities for the coming month.

The new real-world-data group is ranked highest for its impact on community growth. This means the following. Assuming this group will (hopefully) grow well and demand attention, the goals of other projects will receive less attention and will be delayed. However, some of them (e.g., required extensions or bugfixes to libraries) will receive more attention if the real-world-data group requires them.

The real-world-data group

The real-world-data group is a space for Clojure data and science practitioners to bring their data projects, share experiences, and evolve common practices.

March summary

  • had quite a few one-on-one meetings with group members, discussing their goals, interests, and needs
  • had the first group meeting, including personal introductions, talks by Kyle Passarelli and by Adham Omran, a hands-on part, and discussions
  • started creating introductory materials to support the group (see the Scrapbook section)

April goals

  • have more one-on-one meetings, two more group meetings, and ad-hoc small topical meetings
  • help the participants take on active paths that connect their interests with community goals

Noj

The Noj project bundles a few recommended libraries for data and science and adds convenience layers and documentation for using them together.

March summary

  • reorganized the docs and clarified the status of different parts
  • moved some parts of the experimental functionality to other libraries

April goals

  • start stabilizing important parts of the experimental API (noj.vis.*, noj.stats)
  • improve documentation

Clojure Data Scrapbook

The Clojure Data Scrapbook is intended to be a community-driven collection of tutorials around data and science in Clojure.

March summary

April goals

  • encourage and help community contributions to the scrapbook
  • keep adding content to support other projects

Clay

Clay is a minimalistic namespace-as-a-notebook tool for literate programming and data visualization.

March summary

  • user support
  • bugfixes, extensions, and performance improvements
  • 7 minor releases
  • shifted from Alpha to Beta stage

April goals

  • support user needs, especially in study groups
  • explore adding emmy-viewers support

Kindly

Kindly is a proposed standard for requesting data visualizations in Clojure.

April goals

  • discuss Kindly integration with visual tool authors

visual-tools group

This group’s goal is to create collaborations in learning and building Clojure tools for data visualization, literate programming, and UI design.

March summary

April goals

  • continue the grammar-of-graphics exploration
  • have at least one more study session

cmdstan-clj

Cmdstan-clj is a draft library for interop with Stan (probabilistic modeling through Bayesian statistics).

April goals

  • practice usage with community members and keep developing by need

Your feedback would help

Scicloj is in transition. On the one hand, quite a few of the core members have been very active recently, developing the emerging stack of libraries. At the same time, new friends are joining, and soon more people will enjoy using Clojure for common data and science needs.

If you have any thoughts about the current directions, or if you wish to discuss how the evolving platform may fit your needs, please reach out.


SiteFox: Chris McCormick

Q1 2024 Funding. Report 2. Published March 30, 2024.

Hello! The second half of my Clojurists Together funded work on Sitefox is complete. I made around 30 commits to the project for a total of 80 since the start of the year and this is a summary of the progress I’ve made.

My goal with the Clojurists Together funding has been to make it safer and easier for other people to get started building sites and apps on Sitefox. There were two components to this: a) improving documentation and setup tooling; b) improving security and stability.

E2e testing

I continued work on e2e tests for Sitefox. I did this to lay the groundwork for other improvements and get confidence that my changes aren’t breaking any major functionality. I made three main improvements to the tests:

  • I started by finishing off the AJAX fetch request test, which includes CSRF testing.
  • I created a new set of CI rules to run the tests using Postgres, in addition to SQLite.
  • I got basic testing in place against shadow-cljs in dev and compiled release modes.

These tests enabled me to replace the CSRF protection module, make changes to the database layer, and upgrade dependencies knowing the main functions of the framework were still working.

Dependency upgrades

Once I had finished the AJAX request CSRF tests I was able to finish replacing the csurf module. I chose csrf-csrf as the replacement and with a few tweaks I was able to verify it as a drop-in replacement. I also upgraded nbb and the Express webserver module which was flagged by GitHub security infrastructure as vulnerable.

Database layer changes

Sitefox wraps the Keyv key-value database layer and it features functions for filtering data based on the result of a callback. Previously this would load all of the rows of the table into memory before running the filter, but I updated it to do this sequentially in batches instead. I also tweaked the way values are deserialized to use the underlying library’s method, and removed some legacy cruft to do with the way objects were returned. I used the database tests to verify that none of these changes impacted typical database use-cases.

Other misc. updates

Various other updates I made to Sitefox included: adding to the auth layer the ability to dynamically redirect on sign-out, inlining the npm “create” scripts so they are part of the main monorepo, and general code tidying and linter fixes.

Presentations

Near the end of February I gave a talk about Sitefox to the London Clojurians meetup group. This was useful as it helped me distill the core ideas behind Sitefox and get feedback from the community.

Finally, one of my goals was to create a “getting started” tutorial. I wrote the tutorial and shot a simple YouTube video, but then discovered the sound had some issues. I will re-record this video with better sound and upload it, as well as publishing the text version of the tutorial to help people get started with Sitefox.

What’s next

Now that the funding period is complete I intend to continue Sitefox maintenance and updates. I have started by cutting a release (v0.0.19) with all of the changes made during the Q1 2024 Clojurists Together funding period.

One thing I am particularly interested in is building an RPC layer as an alternative to cumbersome REST or GraphQL communications. I hope this will make ClojureScript client-server code more natural to write and reason about without hiding away fundamental information about which computer the code is running on. Hopefully more on that later.

I’m very grateful to have received this support from Clojurists Together. Of course the funding itself is helpful but the most important thing for me was that it showed others are interested in this work and find it valuable and worth working on.

Thanks again for your interest and your support!


UnifyBio: Benjamin Kamphaus

Q1 2024 Funding. Report 2. Published March 31, 2024.

Since the last update, most of my development effort has been split into three areas:

  • Data-driven validations (essentially resulting in specs applied to imported data, as well as other constraints)
  • Data lifecycle management (retractions work, but diff/merge still in progress)
  • Example dataset for quickstart, and full tutorial (I have put together the example; support docs are still in progress)

I haven’t completely wrapped these up yet as of Mar 31, but I am very close on the first and last bullets. I will try to update Clojurists Together and the larger data sci/engineering and Datomic communities when these things are all available.

Some of my time was spent figuring out what a sustainable long-term solution would be for continuing to develop the core of UnifyBio and to ensure it sees actual use in the life sciences. I had some discussions with different orgs and ended up accepting a full-time position at the Rare Cancer Research Foundation (just started at the end of this quarter), where I’ll have support to further develop UnifyBio. This means the project will probably be moved to the RCRF repo, and there’s a possibility it might be re-branded, or that the UnifyBio name and site might be brought under RCRF’s umbrella. UnifyBio will remain open source, regardless of where it lives. I’ll update Clojurists Together when I have this information, so the project can be pointed to.

IMO this is an ideal place to be positioned coming out of this quarter where I’ve been working on the project as an independent, supported by a mix of client work and funding like the small grant provided to me by Clojurists Together. This change does mean I’ll be spending more time on bio specific applications and not as much work focused on making Unify a generically useful tool for Clojure data science, but this won’t be a 100% shift, as general use will continue to be helpful for the health of the open source data commons ecosystem I’ll be building with RCRF.

I want to thank Clojurists Together again for helping me bridge a time of uncertainty in this interim period while I was working on UnifyBio as a solo dev, and didn’t yet have longer term support from a sponsor or employer.


Wolframite: Thomas Clark

Q1 2024 Funding. Report 1. Published April 15, 2024.

  1. Overview
  2. Code organization
  3. User experience
  4. Documentation
  5. Preparation

Overview

Well, one and a half months, much confusion, and one baby later, we have officially sunk our teeth into the Wolfram <-> Clojure bridge. Wolframite, as newly christened by Pawel, is the continuation of Clojuratica: a great project, pioneered by great people, over more than 15 years. As such, the main task for this part of the funding was simply to get to grips with what’s going on underneath before we can prepare it for posterity. A summary of the work so far follows below.

Code organization

A big part of the work was debugging and merging all of the great things that Pawel and others contributed to the current system. As such, main is now up to date and stable. Examples of incorporated work are making sure that we don’t rely on macros or dynamic vars and that options are passed explicitly. The API has been cleaned up, separating core functionality from tools etc., and functions and namespaces have been renamed to enforce a standard across the codebase.

User experience

For the user, we have particularly worked on the initialization experience. This has involved fixing bugs that prevented the use of the Wolfram Engine on Linux, fixing type mismatches, reordering engine priorities (Wolfram has different flavours of installation), and providing clearer error messages (e.g. when executables and licences are missing). As well as basic streamlining (removing the need for flags etc.), we have ensured more robust JLink detection, un-lazy interning, and significantly faster symbol loading. We have also, with thanks to Daniel Slutsky, integrated support for Clay and the Kindly system, with a plan to allow users to specify their own Clojure <-> Wolfram aliases.

Documentation

As well as updating the key user-facing docs, we have also built a series of troubleshooting tips and development documentation that will no doubt grow before (and after) release. This was further complemented by ongoing internal commenting within the source code. Additionally, we have a new demo notebook that illustrates Clay and Kindly, and two example physics namespaces that demonstrate the bridge on a couple of real-world problems.

Preparation

Finally, an important part of this early stage has simply been the necessary preparation for the second half, where the lion’s share of our contribution is expected to be felt (particularly now that no more deliveries of small people are expected :) ). Such preparation has largely focused on learning and documenting the internal code structure, opening channels with the official Wolfram team and opening issues: both existing bugs and enhancements highlighted by test use cases.

In the short term, we plan to release wolframite-1.0.0-alpha very soon so that the next stage of development can benefit from community feedback. This should happen as soon as we finalize some “getting-started” materials. These will be aimed at two target groups: Clojure developers interested in data science, and Mathematica / Wolfram users who would benefit from using the algorithms from Clojure (a real programming language!).

Permalink

Meet Datomic: the immutable and functional database.

"When you combine two pieces of data you get data. When you combine two machines you get trouble." - Rich Hickey presenting The Functional Database.

Concepts from functional programming, mainly immutability, are increasingly present in our daily work, so it is only fair to get to know a database whose philosophy is data immutability, tracking facts in a format completely different from what we are used to.

In this article, we will get to know Datomic, which has this name precisely because it handles data in a format slightly different from the conventional one, seeking to bring data immutability closer to the database level, with a functional approach designed to work well with distributed systems.

Table of Contents

  • What is Datomic?
  • Architecture
  • Data structure
  • How a transaction works
  • Conclusion

What is Datomic?

At the beginning of 2012, the Relevance team (which later merged with Metadata to form Cognitect), together with Rich Hickey, launched Datomic, which they had begun working on in 2010. The main motivation was to transfer a substantial part of the power assigned to database servers to application servers, so that the programmer would have more power over data within the application logic.

Datomic Cloud was released in early 2018 using Amazon's components:

  • DynamoDB, EFS, EBS and S3 as storage services;
  • CloudFormation for deployment;
  • AWS Cloudwatch for logging, monitoring, and metrics.

Cognitect (the company previously responsible for developing Datomic) was acquired by Nubank in 2020, and Nubank announced in April 2023 that the Datomic binaries are publicly available and free to use (this means that its Pro version is now free to use).

Written in Clojure, Datomic works a little differently from the databases we are used to: it manages data, but does not store it. We'll go through more details about its architecture, but in short, it means that Datomic can use several other data storage services to store transactions, even other databases, which can result in a nice combination.

Concept

The main operating concept of Datomic is that the data is immutable as a whole. An interesting analogy for you to imagine and understand how it works a little better is:

  • Imagine a database that you are used to working with, such as PostgreSQL or MySQL;
  • Imagine now that you have two tables, a product table, but you also have a log table for these products, storing every modification that was made to the original product table;
  • When you update an item, this item has its data modified in the product table, but we add its previous value to the log table, highlighting what was done (in our case an update on its value);
  • For Datomic, there would only be the product "table" (in our case, a schema), with an additional column indicating whether that item is true or not. So, when we update the product's value, a new line is added with the new value, while the old line has its check column set to false; after all, it is no longer true at the current time.

This means that past data continues to exist, but there is this representation that indicates whether the value of the product is valid or not. These lines are called facts, so when the word "fact" is mentioned, remember this analogy.

One point to highlight: no matter how much the value of the product has changed, it remains a fact. It is no longer valid at the current time, but it is still a fact that occurred; after all, the product did have this value in the past.

Architecture

To better understand how everything works within Datomic, we first need to better understand how its architecture works.

How can I store data?

As previously mentioned, Datomic does not store data the way the databases we are used to do; it is used mainly to "transact and manage data". You can combine Datomic with different storage options, such as:

  • SQL databases (such as PostgreSQL and MySQL);
  • DynamoDB (if you choose to use Datomic Cloud, the Indexes will be stored in S3, Transaction Log in DynamoDB and Cache in EFS);
  • Cassandra and Cassandra2;
  • Dev mode, storing data in memory.

How it works is very simple: depending on your choice, a table will be created within your database to store all the data, which Datomic will manage.

Peers

Every type of interaction that an application makes with Datomic goes through a Peer, which is responsible for maintaining a small cache, assembling and executing queries, fetching indexed data, and sending commits. Peers can be used in two main ways:

  • Peer Library: a library added to your dependencies that runs alongside your application; it is through the Peer Library that you carry out any action against your database;
  • Peer Server: used mainly with the Datomic Cloud format; your application has only a Client library, which communicates with the Peer Server, and the Peer Server performs the direct actions against Datomic, much like the Peer Library.

Peers can work with a given database value for an extended period of time without concern. These values are immutable and provide a stable, consistent view of data for as long as a Peer needs one, functioning as a kind of "snapshot" of the database that allows data to be returned in a practical way, without overloading the system with multiple real-time queries. This is a marked difference from relational databases, which require work to be done quickly using a short-lived transaction. You can read more about how Peers work in the official documentation.
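
As a minimal sketch (Peer API, using the in-memory dev protocol; the names here are illustrative):

(require '[datomic.api :as d])

(def uri "datomic:mem://example")  ; dev mode, storing data in memory
(d/create-database uri)
(def conn (d/connect uri))

;; an immutable database value: a stable snapshot that keeps answering
;; queries consistently, no matter what is transacted afterwards
(def db (d/db conn))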

An important point to highlight: if you are using the Peer Library, each node of your application, each service, will have a Peer running alongside it. In a distributed system, this means multiple Peers will be sending commits to your database. But in a distributed system it is extremely important to control data consistency; after all, if I have multiple services writing in parallel, how can I guarantee that the data stays correct under race conditions? To understand this better, let's look at the Transactor.

Transactor

The Transactor is responsible for transacting the received commits and storing data within our database. The architecture of an application with Datomic is characterized by having multiple Peers working and sending commits (after all, we will have multiple services), but only a single Transactor, guaranteeing total data consistency. This means that regardless of whether the Transactor is receiving n commits per second, they will all be queued to guarantee total consistency.

The main point of this architecture is the understanding that dealing with concurrency in general, allowing multiple writes to be stored in parallel in our database, especially in a distributed system, could negatively affect the consistency of stored data, which would be a drastic problem.

Now, we can look in more detail at a diagram that demonstrates how this entire architecture behaves:
[Image: Datomic architecture diagram]

Note in the diagram above that the Transactor is responsible for all activities that require direct communication with the data storage service used, in addition to controlling data indexing, working with a memcached cluster, and responding to commits. Thus, we can state that Datomic provides ACID transactions, an acronym for the set of four key properties that define a transaction: Atomicity, Consistency, Isolation, and Durability.

Storage Services

"Peers read facts from the Storage Services. The facts the Storage Service returns never change, so Peers do extensive caching. Each Peer's cache represents a partial copy of all the facts in the database. The Peer cache implements a least-recently used policy for discarding data, making it possible to work with databases that won't fit entirely in memory. Once a Peer's working set is cached, there is little or no network traffic for reads." - from Datomic Pro Documentation.

Storage Services can be configured however you like; all you need to do is create a properties file (called transactor.properties) describing how your Transactor will be created and managed.

So, if you use PostgreSQL, for example, you will have to configure the driver that will be used, the connection URL, username, and password; you can also configure values for memory-index-threshold, memory-index-max, object-cache-max, read-concurrency, and even write-concurrency, ultimately creating a table named datomic_kvs within your PostgreSQL:

CREATE TABLE datomic_kvs (
  id text NOT NULL,
  rev integer,
  map text,
  val bytea,
  CONSTRAINT pk_id PRIMARY KEY (id)
) WITH (OIDS=FALSE);
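
For reference, a minimal transactor.properties for such a setup could look like the sketch below (the key names follow the Datomic Pro documentation; all values here are illustrative assumptions):

protocol=sql
sql-url=jdbc:postgresql://localhost:5432/datomic
sql-user=datomic
sql-password=datomic
sql-driver-class=org.postgresql.Driver
memory-index-threshold=32m
memory-index-max=256m
object-cache-max=128m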

Data structure

Well, we've already talked about the architecture of how Datomic as a whole works, so now let's better visualize what the basis of Datomic's data structure is like, starting with Datoms.

Datoms

"A datom is an immutable atomic fact that represents the addition or retraction of a relation between an entity, an attribute, a value, and a transaction."

So basically, a datom is a single fact in the log, representing a change to a relation. We can express a datom as a five-tuple:

  • an entity id (E)
  • an attribute (A)
  • a value for the attribute (V)
  • a transaction id (Tx)
  • a boolean (Op) indicating whether the datom is being added or retracted

All of these descriptions are from the Datomic Cloud documentation. Look at the example below:

E 42
A :user/favorite-color
V :blue
Tx 1234
Op true

Entities

"A Datomic entity provides a lazy, associative view of all the information that can be reached from a Datomic entity id."

Looking into an entity, we can visualize it as a table:

E   A                      V       Tx    Op
42  :user/favorite-color   :blue   1234  true
42  :user/first-name       "John"  1234  true
42  :user/last-name        "Doe"   1234  true
42  :user/favorite-color   :green  4567  true
42  :user/favorite-color   :blue   4567  false

The transaction id can be seen as a point in time stamping the data. In the example above we have 1234 and 4567. Look at 1234: in the first row, the :user/favorite-color attribute has the value :blue, with Op set to true. Later, at 4567, the Op is set to false for the datom with the value :blue (and :green is now asserted as true).

Note that we never changed the Op manually; Datomic did it automatically when we updated the value of :user/favorite-color. That means Datomic manages our data, asserting and retracting values for us, and we keep the exact point in time at which :user/favorite-color changed.
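
A hedged sketch of reading that history back (Peer API; the entity id 42 and the connection are assumptions carried over from the examples above):

;; a history db contains every datom, asserted and retracted
(d/q '[:find ?v ?tx ?op
       :where [42 :user/favorite-color ?v ?tx ?op]]
     (d/history (d/db conn)))
;; => #{[:blue 1234 true] [:green 4567 true] [:blue 4567 false]} (illustrative)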

Schemas

As the documentation says: "Attributes are defined using the same data model used for application data. That is, attributes are themselves defined by entities with associated attributes."

Well, to define a new attribute we need to provide:

  • :db/ident, a name that is unique within the database
  • :db/cardinality, specifying whether entities can have one or a set of values for the attribute
  • :db/valueType, the type allowed for an attribute's value
  • :db/doc (optional), the attribute's description/documentation

Note that all of these (:db/ident, :db/cardinality, etc.) are themselves just entities that point to each other. They are automatically created by Datomic at bootstrap, which means they have default entity ids.
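
Putting this together, a hedged sketch of installing the attribute used in the earlier examples (Peer API; conn comes from the earlier sketch):

(def schema
  [{:db/ident       :user/favorite-color
    :db/valueType   :db.type/keyword
    :db/cardinality :db.cardinality/one
    :db/doc         "The user's favorite color."}])

@(d/transact conn schema)  ; schema is transacted like any other data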

How a transaction works

"Every transaction in Datomic is its own entity, making it easy to add facts about why a transaction was added (or who added it, or from where, etc.)"

We have "two options" for transactions: add or retraction. Every transaction returns the transaction id and the database state before and after the transaction. The forms can be:

[:db/add entity-id attribute value]
[:db/retract entity-id attribute value]

As we saw before, every transaction goes through a queue.

If a transaction completes successfully, data is committed to the database and we get a transaction report back, a map with the following keys:

key         usage
:db-before  database value before the transaction
:db-after   database value after the transaction
:tx-data    datoms produced by the transaction
:tempids    map from temporary ids to assigned ids

The database value is like a "snapshot" from the database, as we saw before.
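
A hedged sketch of using a transaction report (Peer API; the entity id 42 is the illustrative one from before):

(let [report @(d/transact conn [[:db/add 42 :user/favorite-color :green]])]
  (d/q '[:find ?c .
         :where [42 :user/favorite-color ?c]]
       (:db-after report)))
;; => :green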

Let's see an example of how :db/add works. Look at the example below:

;; We have this schema
{:internal/id ...
 :internal/value 0
 :internal/key " "}

;; Making a simple transaction
[[:db/add id :internal/value 1]]
;; This will update the value...

;; But, we can perform multiple
;; transactions, look:
[[:db/add id :internal/value 1]
 [:db/add id :internal/key "another"]]
;; It will work fine.

;; But, when we perform something like:
[[:db/add id :internal/value 1]
 [:db/add id :internal/value 2]]
;; We will have a conflict

A conflict occurs when we have two changes to the same entity with the same attribute in one transaction. That makes sense, because we can't have a fact being updated multiple times at the same point in time.

A cool fact: if we perform a transaction with multiple operations, they are processed in parallel. This is safe because, as we saw before, the same attribute can't be updated twice within the same transaction.

Conclusion

This article, the first in a series, aims to introduce Datomic and present its various possibilities and advantages for general use. It is important to highlight that the official Datomic documentation is excellent, so for further in-depth research it is extremely important that you use it! And of course, if you want to take your first steps with Datomic, feel free to follow Getting Started in the official documentation; and if you want a repository with the code used here, I've made one available on my GitHub (don't forget to give it a star)!

Permalink

52: Coding in YAML with Ingy döt Net

Ingy döt Net talks about his new programming language YAMLScript, compiling YAML to Clojure, and the development of the YAML format.

Show notes:

  • SML mailing list archive
  • ActiveState
  • Data::Denter
  • Zope
  • Ingy.net personal website
  • Acmeism
  • SnakeYAML / clj-yaml
  • BPAN
  • PST - Package Super Tool
  • YAMLScript docs
  • release-yamlscript file
  • Yes expressions, e.g. a(b c) => (a b c) and (a + b) => (+ a b)
  • Deno - capabilities/permissions
  • Advent of YAMLScript
  • New YAML version

Permalink

New Library: clj-reload

The problem

Do you love interactive development? Although Clojure is set up perfectly for that, evaluating buffers one at a time can only get you so far.

Once you start dealing with state, you get data dependencies, and with them evaluation order starts to matter. Now you change one line but have to re-eval half of your application to see the change.

But how do you know which half?

The solution

Clj-reload to the rescue!

Clj-reload scans your source dir, figures out the dependencies, tracks file modification times, and when you are finally ready to reload, it carefully unloads and loads back only the namespaces that you touched and the ones that depend on those. In the correct dependency order, too.

Let’s do a simple example.

a.clj:

(ns a
  (:require b))

b.clj:

(ns b
  (:require c))

c.clj:

(ns c)

Imagine you change something in b.clj and want to see these changes in your current REPL. What do you do?

If you call

(clj-reload.core/reload)

it will notice that

  • b.clj was changed,
  • a.clj depends on b.clj,
  • there’s c.clj but it doesn’t depend on a.clj or b.clj and wasn’t changed.

Then the following will happen:

Unloading a
Unloading b
Loading b
Loading a

So:

  • c wasn’t touched — no reason to,
  • b was reloaded because it was changed,
  • a was loaded after the new version of b was in place. Any dependencies a had will now point to the new versions of b.

That’s the core proposition of clj-reload.

Usage

Here, I recorded a short video:

But if you prefer text, then start with:

(require '[clj-reload.core :as reload])

(reload/init
  {:dirs ["src" "dev" "test"]})

:dirs are relative to the working directory.

Use:

(reload/reload)
; => {:unloaded [a b c], :loaded [c b a]}

reload can be called multiple times. If reload fails, fix the error and call reload again.

Works best if assigned to a shortcut in your editor.

Usage: Return value

reload returns a map of namespaces that were reloaded:

{:unloaded [<symbol> ...]
 :loaded   [<symbol> ...]}

By default, reload throws if it can’t load a namespace. You can change it to return the exception instead:

(reload/reload {:throw false})

; => {:unloaded  [a b c]
;     :loaded    [c b]
;     :failed    b
;     :exception <Throwable>}

Usage: Choose what to reload

By default, clj-reload will only reload namespaces that were both:

  • Already loaded
  • Changed on disk

If you pass the :only :loaded option to reload, it will reload all currently loaded namespaces, whether they were changed or not.

If you pass the :only :all option to reload, it will reload all namespaces it can find in the specified :dirs, whether loaded or changed or not.
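For quick reference, the two variants side by side:

(reload/reload {:only :loaded}) ; all currently loaded namespaces, changed or not
(reload/reload {:only :all})    ; every namespace found under :dirs, loaded or not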

Usage: Skipping reload

Some namespaces contain state you always want to persist between reloads, e.g. a running web server, a UI window, etc. To prevent these namespaces from being reloaded, add them to :no-reload during init:

(reload/init
  {:dirs ...
   :no-reload '#{user myapp.state ...}})

Usage: Unload hooks

Sometimes a namespace contains a stateful resource that requires proper shutdown before unloading. For example, if a running web server is defined in a namespace and you unload that namespace, the server will just keep running in the background.

To work around that, define an unload hook:

(def my-server
  (server/start app {:port 8080}))

(defn before-ns-unload []
  (server/stop my-server))

before-ns-unload is the default name for the unload hook. If a function with that name exists in a namespace, it will be called before unloading.

You can change the name (or set it to nil) during init:

(reload/init
  {:dirs [...]
   :unload-hook 'my-unload})

This is a huge improvement over tools.namespace. tools.namespace doesn’t report which namespaces it’s going to reload, so your only option is to stop everything before reload and start everything after, no matter what actually changed.

Usage: Keeping vars between reloads

One of the main innovations of clj-reload is that it can keep selected variables between reloads.

To do so, just add ^:clj-reload/keep to the form:

(ns test)

(defonce x
  (rand-int 1000))

^:clj-reload/keep
(def y
  (rand-int 1000))

^:clj-reload/keep
(defrecord Z [])

and then reload:

(require '[clojure.test :refer [is]])

(let [x test/x
      y test/y
      z (test/->Z)]

  (reload/reload)

  (let [x' test/x
        y' test/y
        z' (test/->Z)]
    (is (= x x'))
    (is (= y y'))
    (is (identical? (class z) (class z')))))

Here’s how it works:

  • defonce works out of the box. No need to do anything.
  • def/defn/deftype/defrecord/defprotocol can be annotated with ^:clj-reload/keep and will be persisted too.
  • Project-specific forms can be added by extending clj-reload.core/keep-methods multimethod.

Why is this important? With tools.namespace you have to structure your code in a way that works with its reload implementation. For example, you’d probably move persistent state and protocols into separate namespaces, not because logic dictates it, but because the reload library won’t work otherwise.

clj-reload allows you to structure the code the way business logic dictates it, without the need to adapt to developer workflow.

Simply put: the fact that you use clj-reload during development does not spill into your production code.

Comparison: Evaluating buffer

The simplest way to reload Clojure code is just re-evaluating an entire buffer.

It works for simple cases but fails to account for dependencies. If something depends on your buffer, it won’t see these changes.

The second pitfall is removing/renaming vars or functions. If you had:

(def a 1)

(def b (+ a 1))

and then change it to just

(def b (+ a 1))

it will still compile! New code is evaluated “on top” of the old one, without unloading the old one first. The definition of a will persist in the namespace and let b compile.

It might be really hard to spot these errors during long development sessions.

Comparison: (require ... :reload-all)

Clojure has :reload and :reload-all options for require. They do track upstream dependencies, but that’s about it.

In our original example, if we do

(require 'a :reload-all)

it will load both b and c. This is excessive (b or c might not have changed), it doesn’t track downstream dependencies (if we reload b, it will reload c but not a), and it also “evals on top”, same as buffer eval.

Comparison: tools.namespace

tools.namespace is a tool originally written by Stuart Sierra to work around the same problems. It’s a fantastic tool and the main inspiration for clj-reload. I’ve been using it for years and loving it, until I realized I wanted more.

So the main proposition of both tools.namespace and clj-reload is the same: they will track file modification times and reload namespaces in the correct topological order.

This is how clj-reload is different:

  • tools.namespace reloads every namespace it can find. clj-reload only reloads the ones that are already loaded. This allows you to keep broken/experimental/auxiliary files lying around without breaking your workflow TNS-65
  • The first reload in tools.namespace always reloads everything. In clj-reload, even the very first reload only reloads the files that were actually changed TNS-62
  • clj-reload supports namespaces split across multiple files (like core_deftype.clj, core_defprint.clj in Clojure) TNS-64
  • clj-reload can see dependencies in top-level standalone require and use forms TNS-64
  • clj-reload supports load and unload hooks per namespace TNS-63
  • clj-reload can specify exclusions during configuration, without polluting the source code of those namespaces.
  • clj-reload can keep individual vars around and restore their previous values after reload. E.g. defonce doesn’t really work with tools.namespace, but it does with clj-reload.
  • clj-reload has a 2× smaller codebase and zero runtime dependencies.
  • clj-reload doesn’t support ClojureScript. Patches welcome.

That’s it!

Clj-reload grew from my personal needs on the Humble UI project. But I hope other people will find it useful, too.

Let me know what works for you and what doesn’t! I’ll try to at least be on par with tools.namespace.

And of course, here’s the link:

Permalink

PG2 release 0.1.11: HugSQL support

The latest 0.1.11 release of PG2 introduces HugSQL support.

The pg2-hugsql package brings integration with the HugSQL library. It creates functions out of SQL files, like HugSQL does, but these functions use the PG2 client instead of JDBC. Under the hood, there is a special database adapter as well as a slight override of protocols to make HugSQL’s internals compatible with PG2.

Since the package already depends on the core HugSQL functionality, there is no need to add the latter to your dependencies: having pg2-hugsql by itself is enough (see Installation).

Basic Usage

Let’s go through a short demo. Imagine we have a demo.sql file with the following queries:

-- :name create-demo-table :!
create table :i:table (id serial primary key, title text not null);

-- :name insert-into-table :! :n
insert into :i:table (title) values (:title);

-- :name insert-into-table-returning :<!
insert into :i:table (title) values (:title) returning *;

-- :name select-from-table :? :*
select * from :i:table order by id;

-- :name get-by-id :? :1
select * from :i:table where id = :id limit 1;

-- :name get-by-ids :? :*
select * from :i:table where id in (:v*:ids) order by id;

-- :name insert-rows :<!
insert into :i:table (id, title) values :t*:rows returning *;

-- :name update-title-by-id :<!
update :i:table set title = :title where id = :id returning *;

-- :name delete-from-table :! :n
delete from :i:table;

Prepare a namespace with all the imports:

(ns pg.demo
  (:require
   [clojure.java.io :as io]
   [pg.hugsql :as hug]
   [pg.core :as pg]))

To inject functions from the file, pass it into the pg.hugsql/def-db-fns function:

(hug/def-db-fns (io/file "test/demo.sql"))

It accepts either a string path to a file, a resource, or a File object. Provided there were no exceptions and the file is correct, the current namespace gets the new functions declared in the file. Let’s examine them and their metadata:

create-demo-table
#function[pg.demo...]

(-> create-demo-table var meta)

{:doc ""
 :command :!
 :result :raw
 :file "test/demo.sql"
 :line 2
 :arglists ([db] [db params] [db params opt])
 :name create-demo-table
 :ns #namespace[pg.demo]}

Each generated function has at most three arities:

  • [db]
  • [db params]
  • [db params opt],

where:

  • db is a source of a connection. It might be either a Connection object, a plain Clojure config map, or a Pool object;
  • params is a map of HugSQL parameters like {:id 42};
  • opt is a map of pg/execute parameters that affect processing of the current query (see the sketch below).
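For illustration, here is how the arities line up, using get-by-id from the file above, the conn established below, and the :fn-key option demonstrated later in this post:

(get-by-id conn {:table "demo123" :id 1})
;; [db params]: a connection source plus HugSQL parameters

(get-by-id conn {:table "demo123" :id 1} {:fn-key clojure.string/upper-case})
;; [db params opt]: the same, plus pg/execute options

;; the [db] arity suits queries that take no parameters at all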

Now that we have functions, let’s call them. Establish a connection first:

(def config
  {:host "127.0.0.1"
   :port 10140
   :user "test"
   :password "test"
   :dbname "test"})

(def conn
  (pg/connect config))

Let’s create a table using the create-demo-table function:

(def TABLE "demo123")

(create-demo-table conn {:table TABLE})
{:command "CREATE TABLE"}

Insert something into the table:

(insert-into-table conn {:table TABLE
                         :title "hello"})
1

The insert-into-table function has the :n flag in the source SQL file. Thus, it returns the number of rows affected by the command. Above, there was a single record inserted.

Let’s try an expression that inserts something and returns the data:

(insert-into-table-returning conn
                             {:table TABLE
                              :title "test"})
[{:title "test", :id 2}]

Now that the table is not empty any longer, let’s select from it:

(select-from-table conn {:table TABLE})

[{:title "hello", :id 1}
 {:title "test", :id 2}]

The get-by-id shortcut fetches a single row by its primary key. It returns nil for a missing key:

(get-by-id conn {:table TABLE
                 :id 1})
{:title "hello", :id 1}

(get-by-id conn {:table TABLE
                 :id 123})
nil

Its bulk version called get-by-ids relies on the in (:v*:ids) HugSQL syntax. It expands into the following SQL vector: ["... where id in ($1, $2, ... )" 1 2 ...]

-- :name get-by-ids :? :*
select * from :i:table where id in (:v*:ids) order by id;
(get-by-ids conn {:table TABLE
                  :ids [1 2 3]})

;; 3 is missing
[{:title "hello", :id 1}
 {:title "test", :id 2}]

To insert multiple rows at once, use the :t* syntax which is short for “tuple list”. Such a parameter expects a sequence of sequences:

-- :name insert-rows :<!
insert into :i:table (id, title) values :t*:rows returning *;
(insert-rows conn {:table TABLE
                   :rows [[10 "test10"]
                          [11 "test11"]
                          [12 "test12"]]})

[{:title "test10", :id 10}
 {:title "test11", :id 11}
 {:title "test12", :id 12}]

Let’s update a single row by its id:

(update-title-by-id conn {:table TABLE
                          :id 1
                          :title "NEW TITLE"})
[{:title "NEW TITLE", :id 1}]

Finally, clean up the table:

(delete-from-table conn {:table TABLE})

Passing the Source of a Connection

Above, we’ve been passing a Connection object called conn to all functions. But it can be something else as well: a config map or a pool object. Here is an example with a map:

(insert-rows {:host "..." :port ... :user "..."}
             {:table TABLE
              :rows [[10 "test10"]
                     [11 "test11"]
                     [12 "test12"]]})

Pay attention: when the first argument is a config map, a Connection is spawned from it and then closed before the function exits. This might break a pipeline if you rely on state stored in the connection. A temporary table is a good example: once you close a connection, all the temporary tables created within it are wiped. Thus, if you create a temp table in a first function and select from it in a second function, each time passing a config map, it won’t work: the second function won’t know anything about that table.
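A sketch of that pitfall; create-temp-table and select-from-temp-table are hypothetical functions standing in for queries you might define in the SQL file:

(create-temp-table config {})      ; connection #1: creates the temp table, then closes, wiping it
(select-from-temp-table config {}) ; connection #2: the temp table no longer exists, so this fails

;; Holding a single connection for both calls avoids the problem:
(pg/with-connection [conn config]
  (create-temp-table conn {})
  (select-from-temp-table conn {}))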

The first argument might be a Pool instance as well:

(pool/with-pool [pool config]
  (let [item1 (get-by-id pool {:table TABLE :id 10})
        item2 (get-by-id pool {:table TABLE :id 11})]
    {:item1 item1
     :item2 item2}))

{:item1 {:title "test10", :id 10},
 :item2 {:title "test11", :id 11}}

When the source is a pool, each function call borrows a connection from it and returns it afterwards. But you cannot be sure that both get-by-id calls share the same connection: a parallel thread may borrow the connection used by the first get-by-id before the second call acquires it. As a result, any pipeline that relies on state shared across two subsequent function calls might break.

To ensure the functions share the same connection, use either pg/with-connection or pool/with-connection macros:

(pool/with-pool [pool config]
  (pool/with-connection [conn pool]
    (pg/with-tx [conn]
      (insert-into-table conn {:table TABLE :title "AAA"})
      (insert-into-table conn {:table TABLE :title "BBB"}))))

Above, there is a 100% guarantee that both insert-into-table calls share the same conn object borrowed from the pool. They are also wrapped in a transaction, which produces the following session:

BEGIN
insert into demo123 (title) values ($1);
  parameters: $1 = 'AAA'
insert into demo123 (title) values ($1);
  parameters: $1 = 'BBB'
COMMIT

Passing Options

PG2 supports a lot of options for processing a query. To use them, pass a map as the third argument of any function. Below, we override the function that processes column names: instead of the default keyword, let it be clojure.string/upper-case:

(require '[clojure.string :as str])

(get-by-id conn
           {:table TABLE :id 1}
           {:fn-key str/upper-case})

{"TITLE" "AAA", "ID" 1}

If you need such keys everywhere, passing a map into each call might be inconvenient. The def-db-fns function accepts a map of predefined overrides:

(hug/def-db-fns
  (io/file "test/demo.sql")
  {:fn-key str/upper-case})

Now, all the generated functions return string column names in upper case by default:

(get-by-id config
           {:table TABLE :id 1})

{"TITLE" "AAA", "ID" 1}

For more details, refer to the official HugSQL documentation.

Permalink


Launching Columns for Tablecloth

This is a cross-post of a recent post by Ethan Miller at his blog, announcing the release of a substantial addition to the Tablecloth library.

The New Column API

Today we at scicloj deployed a new Column API (tablecloth.column.api) into the data processing library Tablecloth (available as of version 7.029.1). This new API adds a new primitive to the Tablecloth system: the column. Here’s how we use it:

(require '[tablecloth.column.api :as tcc])

(tcc/column [1 2 3 4 5])
;; => #tech.v3.dataset.column<int64>[5]
null
[1, 2, 3, 4, 5]

The new column is the same as the columns that comprise a dataset. It is a one-dimensional typed sequence of values. Under the hood, the column is just the column defined in tech.ml.dataset, the library that backs Tablecloth.

The difference is that now when you are using Tablecloth you have the option of interacting directly with a column using an API that provides a set of operations that always take and return a column.

Basic Usage

Let’s go through a simple example. Let’s say we have some test scores that we need to analyze:

(def test-scores (tcc/column [85 92 78 88 95 83 80 90]))

test-scores
;; => #tech.v3.dataset.column<int64>[8]
null
[85, 92, 78, 88, 95, 83, 80, 90]

Now that we have these values in a column, we can easily perform operations on them:

(tcc/mean test-scores)
;; => 86.375

(tcc/standard-deviation test-scores)
;; => 5.926634795564849

There are many operations one can perform. At the moment, the available operations are those that you would previously have accessed by importing the tech.v3.datatype.functional namespace from dtype-next.
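For instance, the arithmetic operators lifted from that namespace take a column (and optionally scalars) and return a new column. A quick sketch, assuming the lifted names mirror their tech.v3.datatype.functional counterparts:

(tcc/+ test-scores 5)
;; => a new int64 column: [90, 97, 83, 93, 100, 88, 85, 95]

(tcc/* test-scores 2)
;; => [170, 184, 156, 176, 190, 166, 160, 180]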

To get a fuller picture of the Column API and how it works, please consult the Column API section in the Tablecloth documentation.

Easier Column Operations on the Dataset

The changes we’ve deployed also improve the expressive power of Tablecloth’s standard Dataset API. Previously, if you needed to do something simple like a group by and aggregation on a column in a dataset, the code could become unnecessarily verbose:

(defonce stocks
  (tc/dataset "https://raw.githubusercontent.com/techascent/tech.ml.dataset/master/test/data/stocks.csv" {:key-fn keyword}))


(tc/column-names stocks)

(-> stocks
    (tc/group-by [:symbol])
    (tc/aggregate (fn [ds]
                    (-> ds
                        :price
                        tech.v3.datatype.functional/mean))))
;; => _unnamed [5 2]:

| :symbol |      summary |
|---------|-------------:|
|    MSFT |  24.73674797 |
|    AMZN |  47.98707317 |
|     IBM |  91.26121951 |
|    GOOG | 415.87044118 |
|    AAPL |  64.73048780 |

With the new column operations for datasets, you can now simply write:

(-> stocks
    (tc/group-by [:symbol])
    (tc/mean [:price]))

The same set of operations available on a standalone column can be called on columns in the dataset. However, when operating on a dataset, functions that would return a scalar value act as aggregators, as seen above.

Functions that would return a new column allow the user to specify a target column to be added to the dataset, as in this example where we first use the method above to add a column with the mean back to stocks:

(def stocks-with-mean
  (-> stocks
      (tc/group-by [:symbol])
      (tc/mean [:price])
      (tc/rename-columns {"summary" :mean-price})
      (tc/inner-join stocks :symbol)))


stocks-with-mean
;; => inner-join [560 4]:
;;    | :symbol | :mean-price |      :date | :price |
;;    |---------|------------:|------------|-------:|
;;    |    MSFT | 24.73674797 | 2000-01-01 |  39.81 |
;;    |    MSFT | 24.73674797 | 2000-02-01 |  36.35 |
;;    |    MSFT | 24.73674797 | 2000-03-01 |  43.22 |
;;    |    MSFT | 24.73674797 | 2000-04-01 |  28.37 |

Then we use a dataset column operation that returns a column – column division, in this case – to add a new column holding the relative daily price of the stock:

(require '[tablecloth.api :as tc])

(-> stocks-with-mean
    (tc// :relative-daily-price [:price :mean-price]))
;; => inner-join [560 5]:
;;    | :symbol | :mean-price |      :date | :price | :relative-daily-price |
;;    |---------|------------:|------------|-------:|----------------------:|
;;    |    MSFT | 24.73674797 | 2000-01-01 |  39.81 |            1.60934655 |
;;    |    MSFT | 24.73674797 | 2000-02-01 |  36.35 |            1.46947368 |
;;    |    MSFT | 24.73674797 | 2000-03-01 |  43.22 |            1.74719814 |
;;    |    MSFT | 24.73674797 | 2000-04-01 |  28.37 |            1.14687670 |

For more information on these operations, please consult the documentation here.

Thanks to Clojurist Together

This contribution to Tablecloth was supported by Clojurists Together through their Quarterly Fellowships for open source development.

Permalink

Clojure Deref (Apr 12, 2024)

Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem (feed: RSS). Thanks to Anton Fonarev for link aggregation.

Blogs, articles, and projects

Libraries and Tools

New releases and tools this week:

Permalink

Heart of Clojure Tickets For Sale

Tickets for Heart of Clojure are now available.

Grab them here!

We have a limited number of Early Bird tickets. They will be on sale until the end of April, but we expect them to sell out before then.

Meanwhile the orga team is in full swing! The next major milestone will be to get the CFP up, so people can submit their talk proposals. While we get that ready, you can already start thinking about the talk or session you want to propose.

We are also busy reaching out to potential sponsors. We really need sponsors to make this event happen, and in the current economic climate they're a little harder to find than five years ago.

But we're confident we can convince enough of them to support us. It's a unique opportunity to reach three to four hundred smart and talented technologists, and to be associated with probably the coolest event of the year. If you know of companies that would be interested, please send an intro to orga@heartofclojure.eu, or pass on our sponsorship prospectus.

- Bettina & Arne

Permalink

Copyright © 2009, Planet Clojure. No rights reserved.
Planet Clojure is maintained by Baishampayan Ghose.
Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
Theme by Brajeshwar.