Debugging in Clojure

Notes

  • VSCodium
  • Calva
  • To execute a whole code block, press ALT (or Option) + ENTER
  • To execute the block the cursor is in, press CTRL + SHIFT + ENTER
;; debugging.clj

(defn bu [string]
  (str "bu" " " string))

(defn fu [string]
  (str "fu" " " string))

(defn du [string]
  (str "du" " " string))

(bu
 (fu
  (du "tu")))
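;; => "bu fu du tu"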

Permalink

Analysis of DevOps Enterprise Summit Speaker Titles (2014-2022)

Later this month, we’ll be holding our first live DevOps Enterprise Summit in three years—our sixteenth event since 2014!

One of my favorite things about DevOps Enterprise Summit has been the seniority of the speakers. They are such accomplished people doing amazing things.

I’ve observed that over the past eight years the speakers have been getting even more senior, as measured by their job titles. I’ve speculated that this is because so many of our speakers have been promoted. Furthermore, the importance of the technology efforts that these people are leading warrants the attention of some of the senior business people in the organization.

(This is reflected in one of the main objectives of the conference programming, which is spanning the business/technology divide, as advocated for so articulately and passionately by Programming Committee member Courtney Kissler, CTO, Zulily.)

The speculation about the increasing seniority of job titles was based on observation, primarily during the process of creating the conference programming. However, I’ve wanted to do a more rigorous analysis for many years.

Well, over the weekend, I finally did it. And there were some genuine eye-openers.

It really amazes me just how senior our plenary speakers have been — I suppose this shouldn’t be surprising, as we invest so much time as a programming committee finding senior technology leaders who are doing amazing things.

I find it incredible that 40% of plenary speakers at 2019 Vegas and 2020 London Virtual were at the highest management level (“m04” represents EVP, CTO, Managing Director, CEO, or CFO), and 20% of 2020 Vegas Virtual were at the highest individual contributor levels (“e03” represents Principal Engineers, Solution Architects, Chief Architects, Distinguished Engineers, and Technical Fellows).

Statistics

  • 15 DevOps Enterprise Summits
  • 977 talks in the Video Library
  • 1,489 speakers
  • 1.5 average speakers per talk
  • 506 companies represented

Legend: Job Codes

In the graph below, you can see the distribution of speaker titles, coded in the following way:

  • Individual contributors
    • e01: developer, engineer
    • e02: senior engineer, sre, staff engineer, arch, coach
    • e03: principal engr, solution arch, chief arch, distinguished engr, fellow
  • Managers
    • m01: manager, lead
    • m02: senior manager, product owner, director, senior director
    • m03: vp, head, svp, exec dir, cio, cso
    • m04: evp, cto, md, ceo/cfo

Interpretation of Results

  • It really amazes me how senior the plenary speakers are — I suppose this shouldn’t be surprising, as we invest so much time as a programming committee finding senior technology leaders who are doing amazing things. But it still blows me away how 40% of plenary speakers at 2019 Vegas and 2020 London Virtual were at the “m04” level, and 20% of 2020 Vegas Virtual were at “e03” level.
  • Over the years, we’ve also increased the percentage of talks from senior individual contributors — no surprise, as this is a technical field!
  • Over the years, the number of talks from senior managers (m03/m04) has increased as a % of speakers
  • And of course, titles aren’t everything — we need leaders at all levels, and often, titles don’t mean much on their own

(I did this analysis in Clojure, using the amazing clerk notebook library. This analysis was made so much easier because of the superb organization of the Video Library JSON data, thanks to the Gaiwan folks who wrote it. I’ll post the entire repo eventually, but you can see the main part that does the job title categorization here: https://gist.github.com/realgenekim/a1df9d0f7dc6bfce43e198666fec0348)

With so many organizations represented, it’s fun to go to our Video Library and start typing in different company names to see if we have a talk from them. Often we do, and if not we’ll have someone else from that industry!

In a few short weeks, we’ll be adding even more organizations and senior speakers during DevOps Enterprise Summit Las Vegas, such as Mattel, Airbnb, and Gap—the full agenda is live here.

The post Analysis of DevOps Enterprise Summit Speaker Titles (2014-2022) appeared first on IT Revolution.

Permalink

Engineering Manager at OneStudyTeam

About Us

OneStudyTeam is changing the way medicines are developed by connecting and empowering the clinical trial ecosystem. We are a team of researchers, entrepreneurs, technologists, and healthcare-obsessed professionals building solutions that eliminate some of the biggest challenges in clinical research.

What You'll Be Working On:

  • As a software engineering manager, you will not be expected to code; however, you will still review code and ensure our engineers are making good design decisions.
  • Recruit and hire engineers consistent with organizational values.
  • Work with product, design, data and customer teams to build and deliver products our users want to use.
  • Foster a healthy and collaborative culture that embodies our team values.
  • Be actively involved in the architecture and design of solutions.
  • Manage an engineering team of up to 5 engineers.
  • Help grow and develop a team of talented and motivated engineers.

What You Bring to OneStudyTeam

  • Engineering management experience with a strong record of shipping product with Agile processes.
  • Experience working remotely. The engineering team is a distributed one with remote-first processes.
  • Strong knowledge and experience leveraging SaaS based engineering principles.
  • As a former engineer, you can hold your own in technical conversations.
  • Experience or strong interest in healthcare products.

We value diversity and believe the unique contributions each of us brings drives our success. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Permalink

Software Engineer at OneStudyTeam

About Us

Reify Health is changing the way medicines are developed by connecting and empowering the clinical trial ecosystem. We are a team of researchers, entrepreneurs, technologists, and healthcare-obsessed professionals building solutions that eliminate some of the biggest challenges in clinical research.

We care about the people who care for people...and we have fun while doing it.

We're looking for Software Engineers with a passion for continuous learning who apply newly acquired skills to their daily work. You delight in solving difficult problems, pay close attention to detail, and believe in the value of automation. You shine as a collaborator and excel as an individual contributor. You have the courage to lead and to tackle extremely difficult problems as a member of a powerful team. Your personal initiative and discipline allow you to thrive while working remotely. Your high degree of empathy for others makes you the kind of colleague everyone wants on their team. As an integral member of a fast-growing organization, you will put your fingerprint on what we do and how we do it.

What You'll Be Doing

  • Deliver extraordinary software that solves complex, real-world problems in healthcare.
  • Build high-quality, maintainable, and well-tested code across our entire application. We value the developer who focuses on “front-end” or “back-end”, as specialization brings deep technical understanding, leading to the ability to solve difficult problems elegantly.  We also value the developer who brings their own specialties, and who will enjoy working across our entire application stack. 
  • Strive for technological excellence and accomplishment through the adoption of modern tools, processes and standards.
  • Work closely with our Design and Product teams as features move through the value stream.
  • Support your teammates in an environment where collaboration, respect, humility, diversity and empathy are prized.

What You Bring to Reify Health

  • You have a minimum of 5 years of professional software product development in an Agile environment, ideally developing SaaS products.
  • The applicant we are looking for has experience in functional programming, has a passion for learning and personal growth, and works best with a team of diverse but like-minded individuals.
  • You have real-world experience building software with functional programming languages like Clojure, Haskell, Lisp, F#, etc.
  • You have great oral and written communication skills and are comfortable with collaboration in a virtual setting. 
  • You demonstrate an enthusiastic interest to learn new technologies.
  • You are comfortable with modern infrastructure essentials like AWS, Docker, CI/CD tools, etc.
  • You are proficient with common development tools (e.g. Git, Jira) and processes (e.g. pairing, testing, code reviews).
  • Prior experience in the healthcare domain, especially clinical trials and/or HIPAA Compliance is a plus. 

We value diversity and believe the unique contributions each of us brings drives our success. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Note: We are unable to sponsor work visas at this time.

Permalink

45: Data Rabbit with Ryan Robitaille

Ryan Robitaille talks about data visualisation, and building a visual coding environment in Clojure.

Links:

  • Data Rabbit
  • “Programming is blindly manipulating symbols” - Bret Victor
  • Storybook

Permalink

Biff September updates: Clojurists Together, documentation, in-memory queues

Two months ago I mentioned I had some plans for adding a bunch of documentation to Biff:

Right now Biff only has reference docs. I want to add a lot more, such as:

  • A series of tutorials that show you how to build some application with Biff, step-by-step. Perhaps a forum + real-time chat application, like Discourse and Slack in one.
  • A page that curates/recommends resources for learning Clojure and getting a dev environment set up. Aimed at those who are brand new to Clojure and want to use it for web dev. If needed I might write up some of my own articles to go along with it, though I'd prefer to curate existing resources as much as possible.
  • A series of tutorials/explanatory posts that teach the libraries Biff uses. [...] This is intended for those who prefer a bottom-up approach to learning, or for those who are familiar with Biff and want to deepen their understanding.

As part of that, I plan to restructure the website, while taking lessons from The Grand Unified Theory of Documentation into account.

This was secretly a copy-and-paste-and-slight-edit of my Clojurists Together application, which has been funded! (The grants were announced the day after my last monthly update went out, which is why I'm mentioning this a little late.) Huge thanks to them and everyone who donates! Also huge thanks to JUXT for their continuing sponsorship of Biff.

Documentation

I mentioned in my application that this is a long-term project (especially the third bullet), and so with the funding I'm mainly planning to complete at least the first bullet (the forum tutorial) along with the website restructuring. And then we'll see how far I get into the other bullet points. They'll happen eventually in any case.

Last month I completed the website restructuring. It's very spiffy. Previously the reference docs were on a big single-page thingy rendered with Slate, and the API docs were rendered with Codox. Now I've written custom code to render both of those alongside the rest of the Biff website. The site is more cohesive now, and it will be easier to add additional documentation sections. Currently there are three sections ("Get Started", "Reference", and "API"); ultimately I plan to have the following sections:

  • Get Started
  • Tutorial (i.e. the forum tutorial)
  • Reference
  • How-To
  • API
  • Background Info (this might have essays about design philosophy, for example)
  • Learn Clojure*

*About the last point: I'm currently waffling over whether this should stay as a single page under the "Get Started" section, or if I should combine it with my plans for "a series of tutorials/explanatory posts that teach the libraries Biff uses" as mentioned above. i.e. if I do actually get around to writing a mini book/course thing that teaches Biff from the ground up (e.g. "here's how to start a new project", "here's how to render a static site," and so on), maybe it will be natural to make it accessible for people who are brand new to Clojure. 🤷‍♂️. No need to make a decision now I guess.

v0.5.0: in-memory queues

I cut a new Biff release:

  • Biff's XTDB dependency has been bumped to 1.22.0.
  • add-libs is now used to add new dependencies from deps.edn whenever you save a file; no need to restart the REPL.
  • Biff's feature maps now support a :queues key, which makes it convenient to create BlockingQueues and thread pools for consuming them:
(defn echo-consumer [{:keys [biff/job] :as sys}]
  (prn :echo job)
  (when-some [callback (:biff/callback job)]
    (callback job)))

(def features
  {:queues [{:id :echo
             :n-threads 1
             :consumer #'echo-consumer}]})

(biff/submit-job sys :echo {:foo "bar"})
=>
(out) :echo {:foo "bar"}
true

@(biff/submit-job-for-result sys :echo {:foo "bar"})
=>
(out) :echo {:foo "bar", :biff/callback #function[...]}
{:foo "bar", :biff/callback #function[...]}

I added these since I have a bunch of background job stuff in Yakread and it was getting out of hand. Especially since Yakread uses some JavaScript and Python code (specifically, Readability, Juice, and Surprise—they're opened as subprocesses, and communication happens over pipes) and I want to make sure there isn't more than one Node/Python process running at a time.

So far I've set up a queue + consumer for doing recommendations with Surprise (with more queues to come next week). Each job it receives has a user ID and a set of item IDs. The consumer opens a Python subprocess which loads the recommendation model into memory, takes in the user ID + item IDs over stdin, and spits out a list of predicted ratings on stdout. The queue consumer keeps the subprocess open until all the jobs currently on the queue have been handled.

Having a priority queue will also be handy. Some of the recommendations happen in batch once per day, to make sure users always have something fresh (made with an up-to-date model) ready to go. But Yakread also needs to make additional recommendations while people use the app. For the latter, jobs can be given a higher priority, so they'll still get done quickly even if we're in the middle of a large batch thing.

(Eventually I'd really like to replace all the Python/Javascript stuff with Clojure code so it takes fewer resources, but it's just not worth it at this stage.)

I wondered whether I should try to make something like yoltq, but for XTDB instead of Datomic, so jobs could be persisted to the database to facilitate retries and distribution to separate worker machines. I decided to stick with the current minimal in-memory implementation since that really is all I need personally at the moment. Persistence can be added to these queues from application code, though. All you have to do is the following (a rough sketch is included after the two steps):

  1. Instead of calling biff/submit-job directly, save the job as a document in XTDB.
  2. Create a transaction listener which calls biff/submit-job whenever a job document is created.
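
Here is a rough sketch of what those two steps could look like. This is not code from Biff itself: it assumes XTDB 1.x's xtdb.api/listen and the com.biffweb namespace, and the :job/* keys and enqueue-job! helper are made up for illustration.

(ns example.job-persistence
  (:require [com.biffweb :as biff]
            [xtdb.api :as xt]))

;; 1. Save the job as a document instead of calling biff/submit-job directly.
(defn enqueue-job! [node queue-id job]
  (xt/submit-tx node [[::xt/put {:xt/id (random-uuid)
                                 :job/queue queue-id
                                 :job/args job
                                 :job/status :pending}]]))

;; 2. A transaction listener that calls biff/submit-job whenever a job
;;    document is created, routing it to the matching in-memory queue.
(defn start-job-listener! [node sys]
  (xt/listen node
             {::xt/event-type ::xt/indexed-tx
              :with-tx-ops? true}
             (fn [{::xt/keys [tx-ops]}]
               (doseq [[op doc] tx-ops
                       :when (and (= op ::xt/put)
                                  (map? doc)
                                  (:job/queue doc))]
                 (biff/submit-job sys (:job/queue doc) (:job/args doc))))))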

The sky's the limit from there, I guess. You could:

  • On startup, load any unfinished jobs into the appropriate queues.
  • Add a wrapper to your consumer functions which catches exceptions and marks the job as failed (or marks the job as complete if there isn't an exception); a sketch of this one is included below the list.
  • Create another transaction listener that watches for failed jobs and puts them into a DelayQueue for retrying.
  • Add a scheduled task that retries any jobs which have been in-progress for too long.
  • Scale out to a degree by creating a separate worker for each queue (there's a :biff.queues/enabled-ids option for this; you can specify which queues should be enabled on which machines).
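
To make the wrapper idea concrete, here's a generic sketch (record-status! is a stand-in for however your application persists job status, e.g. by updating the job document in XTDB):

(defn wrap-consumer
  "Wraps a queue consumer so a normal return marks the job as complete and
  an exception marks it as failed instead of escaping the consumer thread."
  [consumer record-status!]
  (fn [{:keys [biff/job] :as sys}]
    (try
      (consumer sys)
      (record-status! job :done nil)
      (catch Exception e
        (record-status! job :failed e)))))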

The big thing you can't do (at least, not well) is have the same queue be consumed by multiple machines. See yoltq's Limitations section, which would also apply to an equivalent XTDB setup. My plan is that when I get to the point where I need more than one machine to consume a single queue (either for throughput or for high availability), I'll just throw in a Redis instance or something and use an already-written job queue library.

As such, while there is more functionality which could be built on top of Biff's in-memory queues, I'm not sure how much of it is really needed. We'll see.

Here's the implementation for anyone who would like to peruse the code.

Roadmap

  1. This weekend I plan to make some more code updates. Mainly I'll replace the task shell script with bb tasks, so that the task implementations can be stored as library code instead of needing to be copied into new projects.
  2. After that I'll work on the forum tutorial discussed above until it's complete. (This might happen mostly in November since baby #2 will arrive in a few weeks.)
  3. Then I'll add various other documentation, like a page with a curated list of resources for learning Clojure, some how-to articles, a reference page for the new queues feature, maybe an essay or two.
  4. Finally get a public Platypub instance deployed, and make some usability improvements in general. Update the GitHub issues, make a roadmap, and write some contributor docs so it's easier for people to help out. 
  5. Take the forum thing mentioned in #2 and turn that into a real-world, useful application like Platypub. Unlike Platypub, I intend the forum to primarily be an educational resource (as the subject of a tutorial), but I do also think it would be fun to have a lightweight Slack/Discourse/Discord/etc alternative to experiment with.
  6. Start working on that "Learn Clojure/Biff from scratch" project I discussed above, unless I think of something better to work on by the time I get through #1-#5.

This should last me well into next year.

Meetups

We had two meetups in September: first we played around with Babashka tasks, and then we attempted to use the Fly Machines API as a sandbox for untrusted code. The second one turned out to be at the precise moment that Fly's Machines API was experiencing downtime, so the latter half of the recording might not be very interesting to watch 🙂.

There will be one meetup in October, on Thursday the 13th (RSVP here). After that, it's up in the air due to the "baby #2" thing mentioned above.

Reminders

  • Come chat with us in #biff on Clojurians Slack if you haven't joined already. Also feel free to create a discussion thread about anything.
  • Again, thank you to everyone who sponsors Biff. If you'd like to help support Biff, please consider becoming a sponsor too.
  • I'm available for short-term consulting engagements; email me if you have a project you'd like to discuss.

Permalink

Building a startup on Clojure

After first getting hooked on Clojure in 2018 I decided to build my business, Wobaka.com, on 100% Clojure the following year. That's how much I enjoyed it. I thought I'd share some of my experience and technical stack for anyone else thinking about building and running their business on Clojure.

I'll walk you through my background and parts of my experience with Clojure, but if you're just looking for my tech stack, skip to the end.

Yes, it's great (even if Java)

First of all, it's a great experience. Enjoyable. Easy to upgrade and no npm fatigue. Libraries don't always look as popular on GitHub as in other languages, but they're often of high quality.

I'm not big on Java but Clojure running on the JVM turned out to be a huge benefit. It makes it easy to run your app on any platform, and you can even deploy it as a single file if you want, using an uberjar.

Don't hold off on starting with Clojure because of Java.

Before Clojure

Let's start with some background on what I used before coming to Clojure, so you know a little more about what I compare it to.

I've been building things on the web since around 1999. Starting with HTML, JS, Geocities 🤘, ASP, PHP and Java. More recently I've liked to work with Python using Flask or Django. I've also worked quite a bit with NodeJS and Ruby on Rails. For the frontend I've mostly used plain JavaScript, jQuery, Backbone, Angular and most recently React and Vue. The whole shebang.

I also have some (mostly academic) experience from working with distributed systems in Go, low-level assembly and C, writing functional programs in Haskell, working on a compiler in Scala and coding proof testers in Prolog.

Alright, enough about my background, let's dig into the setup and experience of building a business on Clojure in 2022. Which is not that different from 2019, when I started building Wobaka. And that's a good thing!

Getting started

"Simple, not easy" is almost a mantra in the Clojure community by now. If you haven't watched the talk (Simplicity Matters), check the footnotes [1]. To keep things simple I opted not to use a full web framework like Rails, so getting set up took a little more time. Honestly, it was mostly just finding my way around the ecosystem and figuring out which libraries to use for things like routing and database migrations. Tools like Leiningen make it really easy to set things up and I wish I had this blog post when I started out.

You can use Leiningen (a handy tool like npm + extras), and basically do lein new app <name>.

Development experience

Development experience is great once you're up and running. I set up everything with very few libraries so I had to write some extra code to get started. Now I have hot reloading, simple state management, very few dependencies and never have to switch language, not even for CSS.

(Ok, it's 97.8% Clojure. I do have some config files not written in Clojure.)

Library support

Sometimes the documentation is great, sometimes it's not. However, the good thing with Clojure is that it's really easy to read the code. It's just functions and data. Just look at the imported namespace and see what functions there are.

Java and JavaScript interop

Interop makes it easy to tap into the host language's libraries. I almost never use interop with Java, but on the frontend I sometimes need to interact with the DOM. Working with JavaScript is pretty straightforward though. For example, adding a click event handler that calls event.preventDefault() is as simple as this:

{:on-click (fn [event] (.preventDefault event))}

You basically just flip the order when calling object methods. Or you could use Clojure's threading macro, which passes the result down through functions. Look, now it's basically the same:

{:on-click (fn [event] (-> event .preventDefault))}

This makes it easy to use any JavaScript library, and since the view library I use is built on top of React, I can basically use any React library as well, if I wanted to.

Deployment options

Deploying a ClojureScript frontend is a very smooth experience. I really recommend using something like Netlify, and from there it's basically a one-liner to build the application using Shadowcljs (a beautiful tool to set up and build ClojureScript projects): npx shadow-cljs release app.

The backend is also easily deployed using something like Heroku, but you can also host it on your own server. To host it yourself you can build the entire app into a self-contained uberjar using lein ring uberjar. Then you can just run the uberjar on any system with Java and you have the backend up and running, most likely using something like systemctl and a reverse proxy like Nginx in front.

One thing to consider is that when your project grows larger, the startup time will not be instant or really quick like NodeJS or native binaries. This means that you may have to do a little work to get zero downtime deploys. But it's nothing that can't be fixed.

The stack

This is what I use to build my SaaS on 100% Clojure. Let's start with the backend.

  • Leiningen for project / dependency management
  • Ring + Compojure for routing and REST API
  • Ring-logger for logging requests
  • Environ to manage environment variables
  • Java.jdbc + c3p0 for database connection to Postgresql
  • Ragtime for database migrations
  • Tick for date/time utils
  • Postal for sending emails
  • Buddy for cryptography
  • Clj-http for making requests
  • Cheshire for JSON parsing

That's it for the backend. I've built my own worker/job scheduling using Postgresql and clojure.core.async, which basically lets you do concurrency like in the Go programming language.

I have two frontend apps: one for the marketing page, which is server-rendered and hydrated on load, and another that's just an ordinary SPA. For both I use the same setup.

  • Shadowcljs to build the apps
  • Rum as a React wrapper and view component library
  • Accountant + Secretary for client side routing
  • Cljs-http for making API requests
  • Garden for writing CSS in Clojure
  • Flatpickr as date picker
  • Chartjs for charts
  • Papaparse for client side csv parsing
  • Trix as WYSIWYG editor
  • Plyr as Vimeo video player

All in all, Wobaka is a relatively large app by now but it's still simple to work with, because it's just functions and data. It's also still simple to manage and update dependencies. I default to building things myself but sometimes it doesn't make sense and I'm happy to be able to tap into the JavaScript ecosystem.

Parentheses

I almost forgot to write about parentheses. That's how much you care about them after getting started. They are just in different places (it's not a big deal).

Give Clojure a try

I really recommend giving Clojure a try. Especially if you like to write functional JavaScript code or enjoy working with things like React.

Clojure for the Brave and True [2] is a great place to start. For a more web-focused guide I'm not sure. But I hope this guide can give you some good starting points on which libraries to use. Perhaps I'll write a getting started with web development in Clojure blog post at some point.

Building your startup on Clojure or just want to talk Clojure? Hit me up on Twitter @drikerf.

Also check out my other blogpost about things I love about working with Clojure.

[1] https://www.youtube.com/watch?v=rI8tNMsozo0
[2] https://www.braveclojure.com/

Permalink

Clojure Deref (Oct 3, 2022)

Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem. (@ClojureDeref RSS)

Blogs and articles

Libraries and tools

New releases and tools this week:

  • clj-http-lite 1.0.13 - A JVM and babashka compatible lite version of clj-http

  • rewrite-edn 0.3.4 - Utility lib on top of rewrite-clj with common operations to update EDN while preserving whitespace and comments

  • feeds - Announce Clojure events with the CLI

  • bbssh - Babashka pod for SSH support

  • fully-satisfies 1.9.0 - Utility functions for Clojure

  • muotti 0.2.0 - Muotti is a graph based value transformer library

  • secrets.clj - A library designed to generate cryptographically strong random numbers

  • pretty 1.2 - Library for helping print things prettily, in Clojure

  • procedure.async 0.1.0 - Async procedures for Clojure

  • clj-test-containers 0.7.3 - Control Docker containers from your test lifecycle for Clojure integration tests

  • tacos 0.0.2 - Collection of timeseries technical analysis indicators

  • graphcom 0.1.1 - Composable incremental computations engine

  • deps.clj 1.11.1.1165 - A faithful port of the clojure CLI bash script to Clojure

  • bbin 0.1.4 - Install any Babashka script or project with one command

  • calva 2.0.305 - Clojure & ClojureScript Interactive Programming for VS Code

Permalink

Page 2

Last time out, I was desperately trying to understand why my beautifully crafted page-aware lazy loading S3 list objects function was fetching more pages than it actually needed to fulfil my requirements (doesn't sound very lazy to me!), but to no avail. If you cast your mind back, I had set my page size to 50, and was taking 105 objects:

(comment

  (->> (list-objects logs-client prefix)
       (take 105))
  ;; => ("logs/E1HJS54CQLFQU4.2022-09-15-00.0125de6e.gz"
  ;;     "logs/E1HJS54CQLFQU4.2022-09-15-00.3b36a099.gz"
  ;;     ...
  ;;     "logs/E1HJS54CQLFQU4.2022-09-15-10.ae86e512.gz"
  ;;     "logs/E1HJS54CQLFQU4.2022-09-15-10.b4a720f9.gz")

)

But sadly seeing the following in my REPL buffer:

Fetching page 1
Fetching page 2
Fetching page 3
Fetching page 4

I know that some lazy sequences in Clojure realise in chunks, but those chunks are usually realised 32 at a time in my experience. It is actually absurdly hard to find any documentation that explains exactly how chunking works, but one can gather hints here and there from the dusty corners of the web's ancient past:

They all mention the number 32 (holiest of all powers of 2, clearly), and the first one even suggests looking at the implementation of clojure.core/map and seeing that map calls the magic chunk-first, which "takes 32 elements for performance reasons". Spelunking deeper into the source for the definition of chunk-first leads one to the following lines:

(defn ^:static  ^clojure.lang.IChunk chunk-first ^clojure.lang.IChunk [^clojure.lang.IChunkedSeq s]
  (.chunkedFirst s))

Which leads to Clojure's Java implementation, which leads to me reading a couple of the classes that implement the IChunk interface, looking for some mention of the chunk size, and running away in tears.
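
You don't have to read the Java to see it happen, though. Here's a quick REPL illustration (mine, not from any of those links): mapping a side-effecting function over a chunked seq such as a range realises a whole chunk of 32 elements, even when only the first one is requested:

(comment

  (first (map #(do (println "realising" %) %) (range 100)))
  ;; prints "realising 0" through "realising 31", then returns 0

)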

The funny thing about all of this is that I know that one is not supposed to use functions with side effects when processing lazy sequences. In fact, it says exactly that in the docstring for clojure.core/iterate:

(iterate f x)

Returns a lazy sequence of x, (f x), (f (f x)) etc. f must be free of
side-effects.

But I figured that it would "probably be fine for this use case." 😂

Having received my well-deserved comeuppance—albeit not completely understanding the form said comeuppance is taking—it's time to figure out how to lazily page without chunking. As luck would have it, right after I published my previous post, I opened up Planet Clojure in my RSS reader and saw a post by Abhinav Omprakash on "Clojure's Iteration function". According to the post, Clojure has a function called iteration, and:

One of the most common use cases for iteration is making paginated api calls.

OK, this looks interesting. Why in the world didn't I know about this? Well, Abhinav's post links to a post on the JUXT blog called "The new Clojure iteration function" (written by the irrepressible Renzo Borgatti!) wherein it is revealed that iteration is new in Clojure 1.11. In the post's introduction, Renzo mentions:

the problem of dealing with batched API calls, those requiring the consumer a "token" from the previous invocation to be able to proceed to the next. This behaviour is very popular in API interfaces such as AWS S3, where the API needs to protect against the case of a client requesting the content of a bucket with millions of objects in it.

He goes on to make a bold claim:

In the past, Clojure developers dealing with paginated APIs have been solving the same problem over and over. The problem is to create some layer that hides away the need of knowing about the presence of pagination and provides the seqable or reducible abstraction we are all familiar with. It is then up to the user of such abstractions to decide if they want to eagerly load many objects or consume them lazily, without any need to know how many requests are necessary or how the pagination mechanism works.

OK, I buy this, having solved this problem in many sub-optimal ways over the years. So iteration really sounds like what I want here. Let's see if I can modify my code based on the iterate function to use iteration instead. Here's what I ended up with last time:

(defn get-s3-page [{:keys [s3-client s3-bucket s3-page-size]}
                   prefix prev]
  (let [{token :NextContinuationToken
         truncated? :IsTruncated
         page-num :page-num} prev
        page-num (if page-num (inc page-num) 1)
        done? (false? truncated?)
        res (when-not done?
              (println "Fetching page" page-num)
              (-> (aws/invoke s3-client
                              {:op :ListObjectsV2
                               :request (mk-s3-req s3-bucket prefix s3-page-size token)})
                  (assoc :page-num page-num)))]
    res))

(defn s3-page-iterator [logs-client prefix]
  (partial get-s3-page logs-client prefix))

(defn list-objects [logs-client prefix]
  (->> (iterate (s3-page-iterator logs-client prefix) nil)
       (drop 1)
       (take-while :Contents)
       (mapcat (comp (partial map :Key) :Contents))))

The JUXT post helpfully walks through an example of listing objects in an S3 bucket, which is exactly what I'm doing, but unhelpfully bases the example on Amazonica (an excellent Clojure wrapper around the AWS Java SDK that I used for years until some cool kids from Nubank told me that all the cool kids were now using Cognitect's aws-api, and I wanted to be cool like them, so I decided to use it for my next thing, which turned out to be a great decision since my next thing was Blambda, which runs on Babashka, which can't use the AWS Java SDK anyway).

Where was I? Oh yeah, the JUXT blog. So it breaks down the arguments to iteration:

(iteration step & {:keys [somef vf kf initk]
                   :or {vf identity
                        kf identity
                        somef some?
                        initk nil}})
  • step is a function of the next marker token. This function should contain the logic for making a request to the S3 API (or other relevant paginated API) passing the given token.
  • somef is a function that applied to the return of (step token) returns true or false based on the fact that the response contains results or not, respectively.
  • vf is a function that applied to the return of (step token) returns the items from the current response page.
  • kf is a function that applied to the return of (step token) returns the next marker token if one is available.
  • initk is an initial value for the marker.

Looking at this, my get-s3-page function sounds a lot like step, in that it contains the logic for making a request to S3. However, step is a function taking one argument, and get-s3-page takes three, so clearly it can't be used as is. But the same was actually true for my previous attempt at paging that used iterate, and in fact I wrote a function to take care of this:

(defn s3-page-iterator [logs-client prefix]
  (partial get-s3-page logs-client prefix))

s3-page-iterator closes over the client and the prefix and returns a function that takes only one argument: prev, which is the previous page of results from S3. So that's step sorted!

In order to figure out what functions I need for somef, vf, and kf (gotta love the terse names of variables in clojure.core!), I need to look at what get-s3-page returns, since all three of those functions operate on the return value of (step token):

(comment

  (->> (get-s3-page logs-client "logs/A1BCD23EFGHIJ4.2022-09-26-" nil)
       keys)
  ;; => (:Prefix
  ;;     :NextContinuationToken
  ;;     :Contents
  ;;     :MaxKeys
  ;;     :IsTruncated
  ;;     :Name
  ;;     :KeyCount
  ;;     :page-num)

)

I'll tackle vf and kf first, since they are pretty straightforward. vf needs to return the items from the current response page. Those items live in the map returned by get-s3-page under the :Contents key, and since keywords are functions that, when called with a map, look themselves up in the map, I can use the :Contents keyword as my vf! 🎉
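
For instance (with made-up data in the same shape as an S3 response):

(:Contents {:IsTruncated false
            :Contents [{:Key "logs/a.gz"} {:Key "logs/b.gz"}]})
;; => [{:Key "logs/a.gz"} {:Key "logs/b.gz"}]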

kf returns the next token, which I have in the response as :NextContinuationToken, so it sounds like I should use that for kf. The only problem is that the second invocation of my step function will look like this:

(step (:NextContinuationToken response))

and get-s3-page expects prev to be the response itself, from which it knows how to extract the token all by itself. So I want to just pass the response to my function as-is, and luckily, Clojure has a function for that: identity, which returns its argument unchanged.

Now it's time to look at somef, a function that returns true if the response contains results and false otherwise. In my case, get-s3-page makes a request to the S3 API and returns the response, unless the previous response wasn't truncated (meaning we're done), in which case it returns nil. So what I want for somef is a function that returns true for any non-nil value, which is exactly what clojure.core/some? does (not to be confused with clojure.core/some).
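
A quick reminder of the difference:

(some? nil)           ;; => false
(some? false)         ;; => true (false is a value, just not nil)
(some even? [1 2 3])  ;; => true (some tests a predicate over a collection)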

Now that somef, vf, and kf are sorted, I'll turn my roving eye to initk, which is the initial value for the token passed to my step function. Just like in my previous attempt, I can use nil as the initial argument.

So putting this all together, my new list-objects function would look like this:

(defn list-objects [logs-client prefix]
  (->> (iteration (s3-page-iterator logs-client prefix)
                  :vf :Contents
                  :kf identity
                  :somef some?
                  :initk nil)
       (mapcat (partial map :Key))))

Looks good, lemme test it out!

(comment

  (->> (list-objects logs-client prefix)
       (take 5))
  ;; => ("logs/A1BCD23EFGHIJ4.2022-09-25-00.0187bda9.gz"
  ;;     "logs/A1BCD23EFGHIJ4.2022-09-25-00.0e46ca54.gz"
  ;;     "logs/A1BCD23EFGHIJ4.2022-09-25-00.348fa655.gz"
  ;;     "logs/A1BCD23EFGHIJ4.2022-09-25-00.4345d6ea.gz"
  ;;     "logs/A1BCD23EFGHIJ4.2022-09-25-00.63005d64.gz")

)

Nice! Except for one thing. My REPL buffer reveals that I actually haven't fixed the problem I set out to fix:

Fetching page 1
Fetching page 2
Fetching page 3
Fetching page 4

Looks like I should have read a little further in the JUXT blog article, because Renzo explains exactly what's happening here:

The results of calling [get-s3-page] are batched items as a collection of collections. In general, we need to collapse the batches into a single sequence and process them one by one [...]

Surprisingly, accessing the [first 5 items from] the first page produces additional network calls for pages well ahead of what we currently need. This is an effect of using [mapcat, which always evaluates the first 4 arguments]!

The reader should understand that this is not a problem of iteration itself, but more about the need to concatenate the results back for processing maintaining laziness in place.

Renzo being Renzo, of course he has a solution to this:

(defn lazy-concat [colls]
  (lazy-seq
   (when-first [c colls]
     (lazy-cat c (lazy-concat (rest colls))))))

I can fold this into my list-objects function:

(defn list-objects [logs-client prefix]
  (->> (iteration (s3-page-iterator logs-client prefix)
                  :vf :Contents
                  :kf identity
                  :somef some?
                  :initk nil)
       lazy-concat
       (map :Key)))

Since lazy-concat is sewing the lists returned by iteration together, I don't need the chunktacular mapcat anymore; I can just use regular old map. Let's see if this works:

(comment

  (->> (list-objects logs-client prefix)
       (take 5))
  ;; => ("logs/A1BCD23EFGHIJ4.2022-09-25-00.0187bda9.gz"
  ;;     "logs/A1BCD23EFGHIJ4.2022-09-25-00.0e46ca54.gz"
  ;;     "logs/A1BCD23EFGHIJ4.2022-09-25-00.348fa655.gz"
  ;;     "logs/A1BCD23EFGHIJ4.2022-09-25-00.4345d6ea.gz"
  ;;     "logs/A1BCD23EFGHIJ4.2022-09-25-00.63005d64.gz")

)

And the REPL buffer?

Fetching page 1

Amazing!

There's one last thing that's bugging me, though. If I look back at the docs for iteration, I see that it has some default arguments:

(iteration step & {:keys [somef vf kf initk]
                   :or {vf identity
                        kf identity
                        somef some?
                        initk nil}})

So vf and kf default to identity, somef defaults to some?, and initk defaults to nil. Taking a look at how I call iteration, things look quite familiar:

(iteration (s3-page-iterator logs-client prefix)
           :vf :Contents
           :kf identity
           :somef some?
           :initk nil)

My kf, somef, and initk all match the defaults! Looks like the Clojure core team kinda knows what they're doing. 😉

With this knowledge under my belt, I can simplify list-objects even further:

(defn list-objects [logs-client prefix]
  (->> (iteration (s3-page-iterator logs-client prefix)
                  :vf :Contents)
       lazy-concat
       (map :Key)))

The cool thing about all of this is that I could use the exact same get-s3-page function I had before, as well as the same s3-page-iterator function, and only needed to change list-objects and sprinkle in the magic lazy-concat function from Renzo's box o' fun!

Before you try this at home, be sure to read the JUXT blog post carefully enough not to miss this sentence, which probably should have been bold and inside the dearly departed HTML <blink> tag:

You need to remember to avoid using sequence with transducers for processing items even after the initial concatenation, because as soon as you do, chunking will hunt you down.

Permalink

Functional Collections and Arity Exceptions

“We need to show that off to the Scheme programmers.”

Rich Hickey

David Nolen was live coding at Clojure/conj and Rich Hickey raises his hand. David—unsure of what to expect—summons Rich’s question. It was more of a suggestion about his open Emacs buffer:

“Why don’t you use the set as a function?”

David had eta-expanded a set.

Dan Friedman and Will Byrd were in the audience and had presented one of their famous paired miniKanren talks. Entirely in Scheme, of course. Tongue in cheek, Rich justified his comment in a way resembling the quote at the top of this page.

What made this quote memorable, perhaps, was that I was also trying to impress them. Dan is a non-stop recruiter for grad students at Indiana University, and having warmed the crowd for Dan and Will's talk by giving a talk on core.logic myself at the same conference, I had their attention. They easily hooked me on the idea of studying at Indiana University, and I started my PhD there a few short years later.

So, sets are functions. In Clojure, most of the interesting data structures are too. They usually take a key to index into the collection and look themselves up.
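
For example, a set called with a key returns that key if it's a member (and nil otherwise), and maps and vectors behave similarly:

(#{:a :b :c} :b)  ;; => :b
(#{:a :b :c} :z)  ;; => nil
({:x 1} :x)       ;; => 1
([10 20 30] 1)    ;; => 20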

Clojure makes this possible by making a function an abstraction via the interface clojure.lang.IFn. The JVM is very good at dispatching methods based on the number of arguments you provide, so its method invoke has 22 overloads to optimize the common cases of functions with 0-20 parameters.

While this is great for performance, it’s not so great for implementors of this interface. You need to implement 22 methods to be a function, even if, like sets, you only need to work on 1 or 2 arguments. This is where implementation inheritance comes in: the abstract class clojure.lang.AFn implements all 22 methods to throw an arity exception based on the number of arguments provided. Now, each collection can simply extend this class and override just a couple of methods, and you get good defaults for the rest.

At least that’s the theory.

Today I found a minor bug in AFn: for 22 or more arguments, the default implementation of invoke will throw an arity exception that claims only 21 arguments in its error message:

Clojure 1.11.1
user=> (apply {} (range 22))
Execution error (ArityException) at user/eval1 (REPL:1).
Wrong number of args (21) passed to: clojure.lang.PersistentArrayMap

user=> (apply {} (range 23))
Execution error (ArityException) at user/eval3 (REPL:1).
Wrong number of args (21) passed to: clojure.lang.PersistentArrayMap

This affected almost every functional collection in Clojure, including:

clojure.lang.PersistentArrayMap
clojure.lang.PersistentArrayMap$TransientArrayMap
clojure.lang.PersistentHashMap
clojure.lang.PersistentHashMap$TransientHashMap
clojure.lang.PersistentVector
clojure.lang.PersistentVector$TransientVector
clojure.lang.PersistentHashSet
clojure.lang.PersistentHashSet$TransientHashSet
clojure.lang.PersistentTreeMap
clojure.lang.PersistentTreeSet
clojure.lang.MapEntry

It’s also likely that libraries similarly extend AFn to implement their own collections. Try it out with your favourite 3rd-party collection.

Beyond that, the most important kind of function was impacted: fn.

user=> (apply (fn []) (range 25))
Execution error (ArityException) at user/eval5 (REPL:1).
Wrong number of args (21) passed to: user/eval5/fn--141

The Clojure compiler uses AFn in the same way as a shortcut for defining functions. The emitted code extends AFn and just overrides the methods it needs:

user=> (supers (class (fn [])))
#{... clojure.lang.AFn ...}

That means that probably every function defined by fn and defn that supports at most 20 arguments suffers from this problem. Try it on your own functions.

user=> (apply identity (range 25))
Execution error (ArityException) at user/eval148 (REPL:1).
Wrong number of args (21) passed to: clojure.core/identity

user=> (apply fnil (range 100))
Execution error (ArityException) at user/eval152 (REPL:1).
Wrong number of args (21) passed to: clojure.core/fnil

If the fn has rest arguments, then a different code path is taken. A RestFn is created instead which redirects arguments slightly differently: its method getRequiredArity returns the number of fixed arguments it has. The Clojure compiler allows up to 20 fixed arguments, but you can create your own instances of RestFn that exceed this with familiar results.

user=> (apply (proxy [clojure.lang.RestFn] []
                (getRequiredArity [] 30))
              (range 25))
Execution error (ArityException) at user.proxy$clojure.lang.RestFn$ff19274a/throwArity (REPL:-1).
Wrong number of args (21) passed to: user.proxy/clojure.lang.RestFn/ff19274a

This is just a coincidence though, the source of this second problem is RestFn itself, not AFn. It’s a different and more (completely?) benign issue.

I have reported these both to Clojure:

Permalink

Clojure Goodness: Writing Text File Content With spit

In a previous post we learned how to read text file contents with the slurp function. To write text file content we use the spit function. The content is defined in the second argument of the function. The first argument accepts several types that will be turned into a Writer object used for writing the content. For example, a string argument is used as a URI, and if that is not valid, as the name of the file to write. A File instance can be used directly as argument as well, but also Writer, BufferedWriter, OutputStream, URI, URL and Socket. As an option we can specify the encoding used to write the file content using the :encoding keyword. The default encoding is UTF-8 if we don't specify the encoding option. With the option :append we can define whether content should be appended to an existing file or should overwrite the existing content of the file.

In the following example we use the spit function with several types for the first argument:

(ns mrhaki.sample.spit
  (:import (java.io FileWriter FilterWriter StringWriter File)))

;; With spit we can write string content to a file.
;; spit treats the first argument as file name if it is a string.
(spit "files/data/output.txt" (println-str "Clojure rocks!"))

;; We can add the option :append if we want to add text
;; to a file, instead of overwriting the content of a file.
(spit "files/data/output.txt" (println-str "And makes the JVM functional.") :append true)

;; Another option is the :encoding option which is UTF-8 by default
(spit "files/data/output.txt" (str "Clojure rocks!") :encoding "UTF-8")

;; The first argument can also be a File.
(spit (File. "files/data/file.txt") "Sample")

;; We can pass a writer we create ourselves as well.
(spit (FileWriter. "files/data/output.txt" true) "Clojure rocks")

;; Or use a URL or URI instance.
(spit (new java.net.URL "file:files/data/url.txt") "So many options...")
(spit (java.net.URI/create "file:files/data/url.txt") "And they all work!" :append true)

Written with Clojure 1.11.1.

Permalink

Clojure Goodness: Reading Text File Content With slurp

The slurp function in Clojure can be used to read the contents of a file and return it as a string value. We can use several types as argument for the function. For example, a string argument is used as a URI, and if that is not valid, as the name of the file to read. A File instance can be used directly as argument as well, but also Reader, BufferedReader, InputStream, URI, URL, Socket, byte[] and char[]. As an option we can specify the encoding used to read the file content using the :encoding keyword. The default encoding is UTF-8 if we don't specify the encoding option.

In the following example we use the slurp function in different use cases. We use a file named README with the content Clojure rocks!:

(ns mrhaki.sample.slurp
    (:require [clojure.java.io :as io]
              [clojure.test :refer [is]])
    (:import (java.io File)))
  
  ;; slurp interprets a String value as a file name.
  (is (= "Clojure rocks!" (slurp "files/README")))
  ;; Using the encoding option.
  (is (= "Clojure rocks!" (slurp "files/README" :encoding "UTF-8")))

  ;; We can also use an explicit File object.
  (is (= "Clojure rocks!" (slurp (io/file "files/README"))))
  (is (= "Clojure rocks!" (slurp (File. "files/README"))))
  
  ;; We can also use an URL as argument.
  ;; For example to read from the classpath:
  (is (= "Clojure rocks!" (slurp (io/resource "data/README"))))
  
  ;; Or HTTP endpoint
  (is (= "Clojure rocks!" (slurp "https://www.mrhaki.com/clojure.txt")))

Written with Clojure 1.11.1.

Permalink

Document Management System: Importance for Better Workflow

As we know, our society is becoming more and more digital. The amount of data and information has grown at an enormous speed. Every day, your company makes and manages a large amount of paperwork, such as contracts, presentations, marketing content, blog posts, HR policies and procedures, training guides, and onboarding guides.

These files may be located in a variety of places like your computer, mobile device, and cloud storage services like Dropbox, Google Drive, OneDrive, SharePoint, etc. By switching to an electronic document management system, companies can automate their workflows and keep their content moving forward. This gives them more time to focus on important company operations.

Therefore, the Agiliway team was asked to build a solution that would keep both employees' work and the devices they use free of clutter. Since Agiliway has considerable experience developing document-related solutions, today we will discuss the importance of documents in every company and reveal the secrets behind creating our solutions for clients. So, let’s move on to the solution that we’ve been developing and putting into action.

About the product

The inspiration for this project’s creation comes from another system we created in-house. It is written entirely on Azure and based on AWS utilizing test buckets.

The system files are grouped according to the product owners’ requirements and stored in containers called “buckets.” At the bucket level, more permission controls are applied. Because of this, we have set up fine-grained controls over how people can access, store, and get documents.

The idea was that every document contains metadata. Metadata describes the information or, in the traditional sense, gives the rules for determining what the data means. It includes the type of document, the dates it is valid (from and to), the author, the editor, keywords or tags that help identify the document, and so on.
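
In Clojure terms, the metadata attached to a document might look something like this (a purely hypothetical example; the actual keys used in the system are not shown here):

{:document/type       :contract
 :document/valid-from #inst "2022-01-01"
 :document/valid-to   #inst "2023-01-01"
 :document/author     "..."
 :document/editor     "..."
 :document/tags       #{"client" "onboarding"}}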

The Agiliway development team concentrated on creating the following functionality:

  • Information and document sorting;
  • A mechanism for tracking progress;
  • Statuses for documents;
  • A notification system, like those in document tracking and management applications such as DocuSign, that lets you take action whenever a document’s status changes.

Provided solutions

We have been working on a technical solution, so let’s get to it:

  • A front-end has been developed to allow search, filter, edit, and view of different versions of documents, as well as add new keywords, metadata, etc.
  • Change tracking is a second feature. This is precisely what the Microsoft Word stack does; the only difference is that we save multiple versions of the document and can always roll back to a previous version to determine who modified it. Since most of the documents were contracts, it was essential to identify the version’s author and date.
  • Developed functionality allows each document to have a status; if the document is a contract, its status must be determined.
  • Also of note is that our team made it possible to replicate a key feature of document circulation systems like DocuSign, namely, the ability to activate predefined actions in response to a document’s change in status. If the contract receives the “sign” status, then a trigger for this client’s onboarding must be sent, as described in the contract, to one of the portals of another product.
  • The database is a PostgreSQL relational tabular database, which consists of auxiliary matching tables that allow you to combine IDs. There is nothing remarkable about the database, however, as everything is stored in buckets and the database simply generates connections between users and buckets.
  • In addition, in one of the upcoming versions, we intend to integrate OCR (optical character recognition), which enables the system to recognize scanned or PDF documents.

Our team used the following technologies in the project: PostgreSQL, AWS, Azure, lambda function, Clojure, and ClojureScript.

Again, the client could have acquired this functionality by purchasing a document management system such as Alfresco or DocuSign, but doing so would have been extremely expensive. Instead, we constructed a system that combines all of these capabilities into a single unit: the document management software. Moreover, this technology is used by various systems to store documents in a wide range of formats (agreements, contracts, forms, medical documents of various types, etc.).

Value delivered

The main secret of our success is our team working together toward a common goal. As we pointed out, Agiliway has considerable experience in dealing with documents, and we constructed the system from the ground up, resulting in cost savings for the customer. Similar systems already exist, but they are offered on the market at steep prices. Our system’s structure, which is built on document buckets, makes it highly adaptable to whatever modifications are needed.

Everything is extremely flexible: you can quickly add any kind of document, document format, and AM access role, since the system is based on who can read and execute what in the first place. Creating users is just as straightforward, which means you can control who has access to which documents.

This is of enormous assistance given that some employees need to access all the documents in one format while others only need to view a selection of them in another. The per-user access level enables granular access to individual documents and even to specific sections of documents, which is a unique feature.

When working with Agiliway, clients are assisted in making system-wide modifications that boost efficiency, delight end users, attract new consumers, and broaden the scope of partnerships.

READ ALSO: HOW TO USE MICRO-FRONTEND TO CREATE A SUCCESSFUL PRODUCT FOR YOUR BUSINESS

The post Document Management System: Importance for Better Workflow first appeared on Agiliway.

Permalink

Copyright © 2009, Planet Clojure. No rights reserved.
Planet Clojure is maintained by Baishampayan Ghose.
Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
Theme by Brajeshwar.