Just use Dictionaries

A Python dictionary has a simple & well-known API. You can merge data using a nice & minimalistic syntax (for example {**defaults, **overrides}, or defaults | overrides since Python 3.9), without mutating or worrying about state. You're probably not gonna need classes.

Hey, what's wrong with classes? 🤔

From what I've seen in Python, classes often add unnecessary complexity to code. Remember, the Python language is all about keeping it simple.

My impression is that in general, class and instance-based code feels like the proper way of coding: encapsulating data, inheriting features, exposing public methods & writing smart objects. The result is very often a lot of code, weird APIs (each one with its own) and not-smart-enough objects. That kind of code quickly tends to become an obstacle. I guess that's when workarounds & hacks usually get added to the app.

Two ways of solving a problem: class-based vs data-oriented.
Less code, less problems.

What about Dataclasses?

Python dataclasses might be a good tradeoff between a heavy object with methods and the simple dictionary. You get type hints and autocomplete. You can also create immutable dataclasses, and that's great! But you might miss the flexibility: the simplicity of merging, picking or omitting data from a dictionary. Letting data flow smoothly through your app.

Hey, what about Pydantic?

That's a really good choice for things like defining FastAPI endpoints. You'll get the typed data as OpenAPI docs for free.

I would convert the Pydantic model to a dictionary as early as possible (using the model.dict() function), or just pick the individual keys and pass those on to the rest of the app. By doing that, the rest of the app is not required to be aware of a domain-specific type or some base class, created as a workaround for the new set of problems introduced with custom types.

Just data. Keeping it simple.

What about the basic types? 🤔

That is certainly a tradeoff when using dictionaries: the data can be of any type, and you will potentially get runtime errors. On the other hand, is that a real problem when using basic data types like dict, bool, str or int? I can't remember it ever having been an issue for me.

But shouldn't data be private?

Classes are often used to separate public and private functionality. From my experience, explicitly making data and functionality private rarely adds value to a product. I think Python agrees with me about this. By default, everything in a Python module is public. I remember, when learning about this, the authors saying that's okay because we're all adults here. I very much liked that perspective!

Do you like Interfaces? 🤩

Yes! Especially when structuring code in modules and packages (more about that in the next section). Using __init__.py is a great way to make the intention of a small package clearer and easier to grasp. Maybe there's only one function that makes sense to use from the outside? That's where the package interface (aka the __init__.py file) feature fits in well.

Python files, modules, packages?

In Python, a file is a module. One or more modules in a folder is a package. One or more packages can be combined into an app. Using a package interface makes sense when structuring code in this way.

Keeping it simple. 😊

I'm finishing off this post with a quote from the past:

“Data is formless, shapeless, like water.
If you put data in a Clojure map, it becomes the map.
You put data in a Python list and it becomes the list.
Now data in a program can flow or it can crash.

Be data, my friend.”

Bruce Lee, 1940 - 1973

ps.

If neither Bruce nor I convinced you about the greatness of simple data structures, maybe Rich Hickey will? Don't miss his "just use maps" talk!

ds.



Top photo by Syd Wachs on Unsplash

Permalink

Clojure Deref (Aug 19, 2022)

Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem. (@ClojureDeref RSS)

Libraries and Tools

New releases and tools this week:

  • jank - The jank programming language

  • clavascript - ClojureScript syntax to JavaScript compiler

  • antq 2.0.889 - Point out your outdated dependencies

  • uri-template 0.7.1 - Clojure implementation of URI Template (RFC 6570)

  • cli 0.3.35 - Turn Clojure functions into CLIs!

  • yoltq 0.2.60 - An opinionated Datomic queue for building (more) reliable systems

  • hermes 1.0.712 - A library and microservice implementing the health and care terminology SNOMED CT

  • joyride 0.0.17 - Making VS Code Hackable since 2022

  • helix 0.1.0 - A simple, easy to use library for React development in ClojureScript

  • fulcro 3.5.24 - A library for development of single-page full-stack web applications in clj/cljs

Permalink

Fix your Clojure code: Clojure comes with design patterns (Part 1)

This post will be part of a series since there are a few dozen software design patterns I want to cover, and writing about all of them in one post would take ages.


A motivational story


Suzi, after busting her ass at her first Clojure job, lands her second Clojure job at StartupAI. The company moved her halfway across the country to the Big City™, filling her with naive excitement. After finishing company onboarding and staggering through the code for several days, Suzi submits her first pull request. While the code is the same Clojure she has always known and loved, the repository gives her stress headaches.

The persistence namespace, the one with all the database queries, has eight different functions to get the same data using the same query. Next, a minuscule change in the application layer causes dozens of unit tests to fail, and to top it off, it takes three weeks for a pull request to get approved for the change.

"I moved across the country for this?!".

Here we go again

Although I'm sure some people will say it is, Suzi's dilemma isn't unique. Large Clojure codebases sometimes suffer from a lack of proper care and abstraction. I suspect a lot of people rely too heavily on the pure function. Functions are great, but over time they create a codebase laden with high coupling and low cohesion. So, how do we help Suzi create a codebase satisfying our desired "ilities"? Lucky for Suzi, Clojure either directly supports, or easily implements, design patterns like singleton, command, observer, visitor, and iterator. Before we get into those patterns though, we need to get a little more abstract than the pure function.

We're speaking the same language

Let's start with the abstraction just above the function: protocols. Protocols are Clojure's analogue to the interface. At a higher level, we can call a bag of function inputs and their outputs (or function signatures) an interface. Without diving into full-blown type theory, when something satisfies the interface, we can also say that something is a type of that interface. I'll even argue that we don't need a formal programming construct to achieve this "typehood". On a basic level, by definition, every Clojure sequence supports the functions cons, first, rest, and count. Does that mean we should go out and create our own collection types? Not likely: the built-in types are quite good, but we do have a few handy constructs for creating or extending types if we want.

(defprotocol MoneySafe
  "A type to convert other money types to bigdec"
  (make-safe [this] "Coerce a type to be safe for money arithmetic"))

(extend-protocol MoneySafe
  java.lang.Number
  (make-safe [this] (bigdec this))

  java.lang.String
  (make-safe [this]
    (try
      (-> this
          (Double/parseDouble)
          (bigdec))
      (catch NumberFormatException nfe
        (println "String must be a string of number characters")
        (throw nfe)) ;; rethrow so callers don't silently get nil
      (catch Exception e
        (println "Unknown error converting from string to money")
        (throw e))))

  clojure.lang.PersistentVector
  (make-safe [this]
    (try
      (let [num-bytes (->> this (filter int?) (count))]
        (if (= (count this) num-bytes)
          (->> this
               (map char)
               (clojure.string/join)
               (Double/parseDouble)
               (bigdec))
          (throw (ex-info "Can only convert from vector of bytes"
                          {:input this}))))
      (catch NumberFormatException nfe
        (println "Vector must be bytes representing ASCII number chars")
        (throw nfe))
      (catch Exception e
        (println "Error converting from vector of bytes to money")
        (throw e)))))

(defn ->money
  [x]
  (make-safe x))
  
;; (+ (->money "0.1") (->money 0.1) (->money [0x30 0x2E 0x31])) => 0.3M
Dispatching on Protocols

While we could define types with plain functions, as with the sequence example above, we would lose the biggest benefit of the protocol construct: polymorphism.
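
To make the payoff concrete, here's what that polymorphism buys us with the MoneySafe protocol above (a hypothetical REPL interaction): one call site handles a number, a string, and a vector of bytes, dispatching on type at runtime.

user> (map make-safe [42 "0.1" [0x30 0x2E 0x31]])
;; (42M 0.1M 0.1M)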


Null Object

Create an empty implementation for an interface.

By my "interface is a bag of functions" reasoning, nil makes a good Null Object for sequences, collections, and maps (or maybenil is a sequence, collection, and map, but I&aposll leave that to the Philosophers of Computer Science ;).

When to use it

When simulating the behaviour of a type without affecting the process.

Clojure analogue

nil - Clojure's sequence, collection, and map operations all support nil as their input sequence, collection, or map, respectively, returning either an empty sequence, an empty collection or map, or nil.

Keep in mind

Not everything supports nil, and nil is different from Java's null pointer reference, yet some gotchas with nil can throw a NullPointerException, so some defensive programming with if or some->/some->> might be needed to avoid pitfalls.
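
For instance, some-> stops threading as soon as a step returns nil, instead of blowing up (a minimal sketch, not from the original post):

user> (some-> {:user {:name "suzi"}} :user :name clojure.string/upper-case)
;; "SUZI"
user> (some-> nil :user :name clojure.string/upper-case)
;; nil (a plain -> would throw a NullPointerException in upper-case)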

Sample Code

Repl output:

user> (:keyword nil)
;; nil
user> (get nil :hi)
;; nil
user> (assoc nil :hi "bye")
;; {:hi "bye"}
user> (first nil)
;; nil
user> (rest nil)
;; ()
user> (next nil)
;; nil
user> (count nil)
;; 0
user> (nth nil 10)
;; nil
user> (pop nil)
;; nil
user> (peek nil)
;; nil
user> (println "You get the idea.")
You get the idea.
;; nil

Singleton

Ensure a class has one instance, and provide a global point of access to it.

A less philosophical pattern, the singleton appears a lot in Clojure codebases. Thanks to Clojure's atomic reference types, we don't have to worry about non-atomic operations, but it's still an anti-pattern because one piece of client code may update the atom with something another piece of client code doesn't expect, causing a hard-to-debug error.

When to use it

Just don't.

Clojure analogue

defonce macro and an atom/agent

Keep in mind

Global state is best avoided unless absolutely necessary. Global configuration, database connections, and web servers are arguably important enough to warrant it, since they're stateful sole instances that must stick around for the process runtime, but most Clojurians go in for a state management library or framework like Component, Mount, or Integrant.

Sample Code
;; Assumes ring.adapter.jetty/run-jetty, plus app-specific routes,
;; config, parse-int, and log defined elsewhere.
(defonce web-server (atom nil))

(defn start-server
  []
  (if @web-server
    (log/warn "Server is already running!")
    (reset! web-server (run-jetty routes {:port (parse-int (:port config))
                                          :join? false}))))
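
For symmetry, a stop function might look something like this (a sketch; it assumes the org.eclipse.jetty.server.Server instance that run-jetty returns when :join? is false):

(defn stop-server
  []
  (when-let [server @web-server]
    (.stop server) ;; org.eclipse.jetty.server.Server exposes .stop
    (reset! web-server nil)))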


Command

Encapsulate a request as an object, thereby letting users parameterize clients with different requests, queue or log requests, and support undoable operations.

Our command pattern, the object-oriented version of the callback, can be implemented in Clojure via higher-order functions (obviously). Errors from plain functions are also a lot easier to debug because of stack traces.

When to use it

When you need to decouple the execution of a block of code from when it needs to be called.

Clojure analogue

Higher-order functions. All functions in Clojure can be passed around as arguments, and support closures.

Keep in mind

If you're passing around lambdas a lot, use the fn form and give it a name to help with debugging. The road to callback hell is paved with #(...) lambdas.

Stateful callbacks are typically considered an anti-pattern, but JavaScript promises are a thing, so pick your poison.

Sample Code
;; Trivial example
(defn fire
  "fire!!!"
  []
  (println "Fire the gun"))

(defn jump
  "jump!!!"
  []
  (println "Jump"))

(defn run
  "run!!!"
  []
  (println "Run enabled"))

(defn diagonal-aim
  "Shoot things on the walls"
  []
  (println "Aiming diagonally"))

(defn remap-button
"Remap a player's button to a function.
  Can be used on the remap menu screen, or
  for temporary game mechanics."
  [player button f]
  (swap! player assoc-in [:controls button] f))

;; (remap-button player :a-button fire)

(defn do-buttons!
  "Execute the button actions assigned to the
  players controller map."
  [button-pressed controllor-map]
  (let [{:keys [a-button
                b-button
                y-button
                x-button]} controller-map]
    (case button-pressed
      "A" (a-button) ;; "Execute" the command!
      "B" (b-button)
      "X" (x-button)
      "Y" (y-button))))

Doing, redoing, and undoing

Because I know people will say it, here is an example of the command pattern supporting undo by keeping a history of previously made edits. Supporting redo will be left as an exercise for the reader.

(defn do-undo
  "Create a command and undo over a pure function"
  [f]
  (let [state (atom nil)]
    {:exec (fn execute!
             [& args]
             (let [v (apply f args)]
               (swap! state conj v)
               v))
     :undo (fn undo!
             []
             (swap! state pop)
             (peek @state))}))

(defn change-cell
  "A supposedly pure function"
  [sheet coord v]
  (assoc sheet coord v))

(let [{:keys [exec undo]} (do-undo change-cell)]
  (def change-cell-cmd exec)
  (def change-cell-undo undo))
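
A hypothetical REPL session with these commands, treating a spreadsheet as just a map from coordinates to values:

user> (change-cell-cmd {} :a1 42)
;; {:a1 42}
user> (change-cell-cmd {:a1 42} :b2 7)
;; {:a1 42, :b2 7}
user> (change-cell-undo)
;; {:a1 42}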

Observer

Define a one-to-many dependency between objects so that when one object changes state, all the dependents are notified and updated automatically.

I see a lot of functions written to watch for something to happen. Typically, developers will use a loop over a function until some state changes or "event" transpires. This works, but couples the event to the event reaction. Object-oriented programming addresses this with the Observer pattern, but we can address it with a Clojure watch.

When to use it

When you want to decouple changes in state from what the state affects, also known as a publish/subscribe pattern.

Clojure analogue

There are two ways to do this in Clojure:

  • Clojure watches. You can add a watch to any atom, agent, var, or ref with add-watch and remove it with remove-watch.
  • core.async's pub and sub. pub partitions an existing channel's messages by a topic function, and then other channels can sub to the pub channel on a given topic (see the sketch below).
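
Here's a minimal pub/sub sketch (the event shape and names are made up for illustration):

(require '[clojure.core.async :refer [chan pub sub go-loop <! >!!]])

(def events (chan))
(def event-bus (pub events :event-type)) ;; topic fn extracts :event-type

(def player-fell-ch (chan))
(sub event-bus :player-fell player-fell-ch) ;; subscribe to one topic

(go-loop []
  (when-let [event (<! player-fell-ch)]
    (println "Player fell:" (:player event))
    (recur)))

(>!! events {:event-type :player-fell :player "Suzi"})
;; prints "Player fell: Suzi"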
Keep in mind

When the watch target (the var/atom/agent/ref) changes, all the watch functions will be called synchronously, even if multiple threads are manipulating the watch target. So, it's best to use the new-state argument in all your watch functions.

Sample Code
;; Assumes game-specific helpers unlocked?, do-unlock!, and is-colliding?,
;; plus the entities atom and achievements map defined elsewhere.
(defn unlock
  "Unlocks the given achievement"
  [player k]
  (when-not (unlocked? player k)
    (do-unlock! player k)))

(defn on-bridge?
  "Predicate to see if the player is on the bridge"
  [player-state]
  (is-colliding? (get @entities :bridge) player-state))

(defn achievement-toaster
  [k]
  (when-let [a (get achievements k)]
    (swap! entities assoc :current-toaster a)))

(defn unlock-achievement
  [key player old-state new-state]
  (case (:last-event new-state)
    :player-fell (when (on-bridge? new-state)
                   (unlock player :player-fell-achievement)
                   (achievement-toaster :player-fell-achievement))
    nil)) ;; default so unrelated events don't throw

(defn watch-actions
  [player]
  (add-watch player :achievements unlock-achievement))

State

Allow an object to alter its behavior when its internal state changes. The object will appear to change its class.

We might not care about reacting to the state change, though. Heck, we might not even care how the state change happens; we just care about the results. A state pattern can give us a nice function to conceal our messy state transitions, so we don't have to write giant cond statements.

When to use it

The real thing to remember here is that we want to model a finite state machine (FSM) with our functions.

Clojure analogue
  • (atom f) - An atom over a function is the simplest version of the state pattern.
  • (loop [state (init-state data)] (recur (state data))) - Alternatively, the current state could be re-bound as a loop binding, since the current state function always returns the next state.

This take uses a multimethod over an atom. This certainly models an FSM, but we kind of lose our ability to decouple the state from the context.

Keep in mind
  • The current state will always return the next state in the machine. Modeling state this way can be tricky, but it can be super helpful if you find yourself with many nested case or cond forms. You'll have to decide if the cyclomatic complexity is worth the trade-off.

  • When your state transition returns one state, this method is fine. If you need more functions in a single 'state', I'd consider using protocols to create states. I intentionally left this out of the sample code because it was huge.

  • If you need a history for your state machine, like a push-down automaton, you can conj each state function onto a vector and peek at the top.

  • The atom-free state pattern with loop bindings can be problematic if the current state needs to modify the binding itself. Use the atom version instead.

Sample Code
;; Assumes game-specific helpers update!, draw-game, collision, and
;; game-engine, plus pause-menu, play-menu, options-menu, and quit-game
;; state functions defined elsewhere.
(defn play-game
  [game]
  (let [{:keys [components player]} game]
    (doseq [component components]
      (update! component))
    (draw-game game)
    (if (= :paused (:action @player))
      pause-menu
      play-menu)))

(defn start-menu
  [game]
  (let [{:keys [player]} game]
    (draw-game game)
    (if (= :affirm (:action @player))
      (if-let [menu-item (collision player)]
        (cond ;; our state transition
          (= menu-item :start-game) play-game
          (= menu-item :options) options-menu
          (= menu-item :quit-game) quit-game
          :else start-menu)
        start-menu)
      start-menu)))

(let [game-state (atom start-menu)]
  (defn start-game
    [game]
    (reset! game-state (@game-state game))))

(defn main
  []
  (let [game-engine (game-engine)]
    (loop [quit? (:quit game-engine)]
      (when-not quit?
        (start-game game-engine)
        (recur (:quit game-engine))))))

;; or stateless 'state'
(defn stateless-start-game
  [game]
  (start-menu game))

(defn stateless-main
  []
  (let [game-engine (game-engine)]
    (loop [game-state (stateless-start-game game-engine)]
      (when-not (:quit game-engine)
        (recur (game-state game-engine))))))

Visitor

Represent an operation to be performed on elements of an object structure. Visitor lets you define a new operation without changing the classes of the elements on which it operates.

Giant case statements are sometimes addressed with multimethods. That's a valid strategy, but multimethods give us even more: multiple dispatch for free.

When to use it

When you want to create new functionality without changing the existing code structure.

Clojure analogue

Multimethods. Clojure's multimethods offer dynamic dispatch functionality right out of the box, so we can put a multimethod inside of a multimethod to get double dispatch.

Keep in mind

Multimethods have a performance penalty. You're unlikely to notice it, but it is there, so just be mindful of it in performance-sensitive applications. If you need fast dispatching, consider dispatching on protocols instead.

Sample Code
(defrecord Equipment [elements])
(defrecord Sword [name])
(defrecord QuestItem [name])
(defrecord Armor [name])

(defmulti visit (fn [f x y] [(class x) (class y)]))

(defmethod visit :default [f x y])

(defmethod visit [Equipment Object] [f equipment other]
  (let [{:keys [elements]} equipment]
    (doseq [element elements]
      (f element other))))

(defmulti do-stuff (fn [x y] [(class x) (class y)]))

(defmethod do-stuff :default [x y]
  (println (format "You sit there bewildered, unable to understand what to do with %s and %s" x y)))

(defmethod do-stuff [Sword Long] [x y]
  (println (format "Swing your sword %s does %d damage" (:name x) y)))

(defmethod do-stuff [Sword String] [x y]
  (println (format "Using your sword %s, you make a noble gesture %s" (:name x) y)))

(defmethod do-stuff [Armor Long] [x y]
  (println (format "Your armor %s has prevented %d damage" (:name x) y)))

(defmethod do-stuff [Armor String] [x y]
  (println (format "You remove armor %s as a noble gesture %s" (:name x) y)))

(defmulti store-stuff (fn [x _] (class x)))

(defmethod store-stuff :default [x y]
  (println "You stored: " x))

Repl output:

user> (def equipment (->Equipment [(->Sword "Excalibur") (->Sword "Gramr") (->Sword "Zulfiqar") (->Sword "Durendal") (->QuestItem "dungeon key") (->Armor "Achilles")]))
;; #'user/equipment
user> (visit store-stuff equipment 10)
You stored:  #user.Sword{:name Excalibur}
You stored:  #user.Sword{:name Gramr}
You stored:  #user.Sword{:name Zulfiqar}
You stored:  #user.Sword{:name Durendal}
You stored:  #user.QuestItem{:name dungeon key}
You stored:  #user.Armor{:name Achilles}
;; nil
user> (visit store-stuff equipment nil)
;; nil
user> (visit do-stuff equipment 50)
Swing your sword Excalibur does 50 damage
Swing your sword Gramr does 50 damage
Swing your sword Zulfiqar does 50 damage
Swing your sword Durendal does 50 damage
You sit there bewildered, unable to understand what to do with user.QuestItem@bb0c2cb0 and 50
Your armor Achilles has prevented 50 damage
;; nil

Iterator

Provide a way to access the elements of an aggregate object sequentially without exposing its underlying representation.

Like our Visitor pattern, the Iterator doesn't really reveal anything special. We can use it without any extra effort, which makes this section fluff. Enjoy!

When to use it

When you want an ordered, sequential container for other objects, which sounds an awful lot like a sequence or collection.

Clojure analogue

Lists and Vectors - Clojure's list and vector types are a sequence and a collection, respectively, and support arbitrarily typed elements. They have a rich set of functions for operating on them, like doseq, map, filter, etc.

Keep in mind

Make sure you use doseq for side effects. Some sequences (and transforms from collections) are chunked, which can create strange behavior if you use side effects with map or filter.
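
A quick sketch of the chunking gotcha: realizing one element of a chunked seq realizes the whole 32-element chunk, firing side effects earlier than you might expect.

user> (def logged (map #(do (println "logging" %) %) (range 100)))
;; #'user/logged
user> (first logged)
logging 0
logging 1
;; ...all the way up to "logging 31", then returns 0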

A Java Iterator (anything supporting the java.util.Iterator interface) is not a sequence in Clojure. If you are working with a Java Iterator, be sure to call the iterator-seq function on it to get a (chunked) sequence. Some Java types do return collections, though, and those should be fine in map or filter, since Clojure's sequence functions work on Java collections directly.
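
For example (assuming a java.util.ArrayList purely for illustration):

user> (let [it (.iterator (java.util.ArrayList. [1 2 3]))]
        (map inc (iterator-seq it)))
;; (2 3 4)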

Sample code
(->> (range 1000)   ;; a lazy (chunked) seq of numbers
     (filter even?) ;; keep only the even ones
     (map inc))     ;; increment each one

I've wanted to write this series of posts for years, but never got around to it, so thanks for reading. I think you'll enjoy the stuff I plan to write in the future; my next post will be Part 2 of this series. Subscribe for the next post or follow me on twitter @janetacarr , or don't ¯\_(ツ)_/¯ . You can also join the discussion about this post on twitter, hackernews, or reddit if you think I'm wrong.

1. Because it's my favorite book on this subject, the inspiration for this post by and large came from Game Programming Patterns (print). The sample code for Command and Observer, as well as each pattern's section structure, was adapted for this post's theme.
2. This post is also largely a rebuttal to mishadoff's Clojure design patterns.
3. Except for the Null Object, all quotes underneath pattern names come directly from Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides (1994). Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley. ISBN 0-201-63361-2.

Permalink

Senior Software Engineer at Reify Health


About Us

Reify Health is changing the way medicines are developed by connecting and empowering the clinical trial ecosystem. We are a team of researchers, entrepreneurs, technologists, and healthcare-obsessed professionals building solutions that eliminate some of the biggest challenges in clinical research.

We care about the people who care for people...and we have fun while doing it.

We're looking for a Senior Software Engineer with a passion for continuous learning who applies newly acquired skills to their daily work. You delight in solving difficult problems, pay close attention to detail, and believe in the value of automation. You shine as a collaborator and excel as an individual contributor. You have the courage to lead and to tackle extremely difficult problems as a member of a powerful team. Your personal initiative and discipline allow you to thrive while working remotely. Your high degree of empathy for others makes you the kind of colleague everyone wants on their team. As an integral member of a fast-growing organization, you will put your fingerprint on what we do and how we do it.

What You'll Be Doing

  • Deliver extraordinary software that solves complex, real-world problems in healthcare.
  • Build high-quality, maintainable, and well-tested code across our entire application. We value the developer who focuses on “front-end” or “back-end”, as specialization brings deep technical understanding, leading to the ability to solve difficult problems elegantly.  We also value the developer who brings their own specialties, and who will enjoy working across our entire application stack. 
  • Strive for technological excellence and accomplishment through the adoption of modern tools, processes and standards.
  • Work closely with our Design and Product teams as features move through the value stream.
  • Support your teammates in an environment where collaboration, respect, humility, diversity and empathy are prized.

What You Bring to Reify Health

  • You have a minimum of 3 years of professional software product development in an Agile environment, ideally (but not necessarily) developing SaaS products.
  • The applicant we are looking for has experience in functional programming, has a passion for learning and personal growth, and works best with a team of diverse but like-minded individuals.
  • You have experience building software with functional programming languages like Clojure, Haskell, Lisp, F#, etc.
  • You have great oral and written communication skills and are comfortable with collaboration in a virtual setting. 
  • You demonstrate an enthusiastic interest to learn new technologies.
  • You are comfortable with modern infrastructure essentials like AWS, Docker, CI/CD tools, etc.
  • You are proficient with common development tools (e.g. Git, Jira) and processes (e.g. pairing, testing, code reviews).
  • Prior experience in the healthcare domain, especially clinical trials and/or HIPAA compliance, is a plus.

We value diversity and believe the unique contributions each of us brings drives our success. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Note: We are unable to sponsor work visas at this time.

Permalink

Senior Data Engineer at Reify Health


About Us

Reify Health is changing the way medicines are developed by connecting and empowering the clinical trial ecosystem. We are a team of researchers, entrepreneurs, technologists, and healthcare-obsessed professionals building solutions that eliminate some of the biggest challenges in clinical research.

We care about the people who care for people...and we have fun while doing it.

Our unique, rapidly growing data streams are enabling unique opportunities to manage clinical trials more efficiently and predictably. The Data Engineering group is looking for talented Data Engineers to build, expand, and support the cutting-edge, globally distributed data architecture which is the analytical backbone of our company. If you are empathetic, business-driven, and want to use your data engineering and data architecture skills to make a tangible impact in the clinical research community, then this may be the role for you. As a fast-growing company, we're looking for people who can effectively balance rapid execution and delivery with sustainable and scalable architectural initiatives to serve the business most effectively. You have strong opinions, weakly held, and, while well-versed technically, know when to choose the right tool, for the right job, at the right level of complexity. You will work closely with our Data Analytics, Product, and Software Engineering groups to help collect, stream, transform, and effectively manage data for integration into critical reporting, data visualizations, and data products.

What You'll Be Working On

  • Support the development and international expansion of our next-generation, privacy-aware, Kappa-style data architecture.
  • Integrate additional data sources and real-time event streams from internal and third-party systems.
  • Help transform and deliver data in a practical manner, including through a data warehouse.
  • Develop APIs or interfaces to allow effective access to and management of data.
  • Learn to effectively understand and deftly navigate the global compliance ecosystem (HIPAA, GDPR, etc.) to ensure your work respects the rights, regulations, and consent preferences of all stakeholders.
  • Develop a deep understanding of the clinical ecosystem, our products, and our business and how they all uniquely interact to help people.

What You Bring To Reify Health

  • 4+ years of experience successfully developing and deploying data pipelines and distributed architectures, ideally in a space similar to ours (startup, healthcare, regulated data).
  • Deep practical experience with AWS, streaming data technologies like Kafka, containerization, and containerized applications.
  • Experience or interest in developing and managing enterprise-scale data warehouses (OLAP), data marts, and multidimensional data models.
  • Excellent programming skills in Clojure or Python and deep comfort with SQL.
  • Solid software testing, documentation, and debugging practices in the context of distributed systems.
  • Great communication skills and can work comfortably with technical and non-technical stakeholders to develop requirements.

You May:

  • Build new internal and external data integrations to bring rich historical data to stakeholders through our data warehouse.
  • Expand novel compliance systems to securely manage how data is transferred globally and used for analytical applications.
  • Create scalable architecture to support the management, deployment, and usage of machine learning models.
  • Help deploy, manage, and expand structured knowledge systems with our Data Products group.

We value diversity and believe the unique contributions each of us brings drives our success. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Note: We are unable to sponsor work visas at this time.

Permalink

Hacking the blog: social sharing

It appears that it has been one month and one day since I last hacked the blog. Hard to believe! It's easier to believe that it's been five, er, six days (I started this post yesterday but didn't finish it until today 😬) since I last blogged. I went camping over the weekend, and still haven't finished putting my gear away! 😅

A desk with a computer and a lot of camping gear spread all over it

As much fun as I've been having with the actual blogging, I must say I've been having less fun sharing blog posts on Twitter, since when I do, the only thing I see is a boring old URL.

A tweet with a link to one of my blog posts which is just a URL

By contrast, when I share an excellent post about an excellent Arsenal performance by an excellent blogger, I see an excellent preview thingy with a picture and a title and a summary and I'm now super engaged and want to click!

A tweet with a link to a 7amkickoff blog post with a nice image and a summary

I want nice things too!

But luckily, since I'm the owner / operator of my blog, I can just make nice things for myself and then have those nice things (but not eat them, because apparently you can't have a thing and eat it too, because then you won't have it anymore).

The first order of business is figuring out what to search for. I tried "website thumbnail image" and found a great article by Michelle Mannering: "How to add a social media share card to any website". OK, so let's call this thingy a "share card" from now on.

According to Michelle, these are the tags that control the share card:

    <!-- Primary Meta Tags --> <!-- this is the default metadata which all websites can draw on --> 
    <title>YOUR_WEBSITE</title>
    <meta name="title" content="YOUR_HEADING">
    <meta name="description" content="YOUR_SUMMARY">

    <!-- Open Graph / Facebook --> <!-- this is what Facebook and other social websites will draw on -->
    <meta property="og:type" content="website">
    <meta property="og:url" content="YOUR_URL">
    <meta property="og:title" content="YOUR_HEADING">
    <meta property="og:description" content="YOUR_SUMMARY">
    <meta property="og:image" content="YOUR_IMAGE_URL">

    <!-- Twitter --> <!-- You can have different summary for Twitter! -->
    <meta name="twitter:card" content="summary_large_image">
    <meta name="twitter:url" content="YOUR_URL">
    <meta name="twitter:title" content="YOUR_HEADING">
    <meta name="twitter:description" content="YOUR_SUMMARY">
    <meta name="twitter:image" content="YOUR_IMAGE_URL">

(The article actually says <meta property="twitter:...">, but according to Twitter's Cards documentation, it should be <meta name="twitter:...">, so I'll use that instead.)

If I slap these tags into the <head> of my document, I should win!

But what to put in the content of these tags? Let's take them one by one, using the 7amkickoff sharing card as a reference:

  • YOUR_HEADING: "Summer days". This looks like the page title.
  • YOUR_URL: ???. I guess this is the page URL.
  • YOUR_SUMMARY: "Summer days are meant to be spent doing something quiet in the early morning, followed by...". OK, this is a preview of the post's content.
  • YOUR_IMAGE_URL: logo with the "7". The thumbnail image (called a "featured image" by Wordpress and Medium, IIRC).

Now we can go page by page, filling these in as we go.

  1. Index page:
    • YOUR_HEADING: page title (quickblog's :blog-title key)
    • YOUR_URL: page URL (quickblog: :blog-root + "index.html")
    • YOUR_SUMMARY: let's use a description of the blog here (quickblog: :blog-description)
    • YOUR_IMAGE_URL: we can put a blog logo here (let's add a new :blog-image key to quickblog)
  2. Archive page:
    • YOUR_HEADING: page title (quickblog: :blog-title + " - Archive")
    • YOUR_URL: page URL (quickblog: :blog-root + "archive.html")
    • YOUR_SUMMARY: (quickblog: "Archive - " + :blog-description)
    • YOUR_IMAGE_URL: (quickblog: :blog-image)
  3. Tags page (i.e. the page listing all of the tags):
    • YOUR_HEADING: page title (quickblog: :blog-title + " - Tags")
    • YOUR_URL: page URL (quickblog: :blog-root + "tags/index.html")
    • YOUR_SUMMARY: (quickblog: "Tags - " + :blog-description)
    • YOUR_IMAGE_URL: (quickblog: :blog-image)
  4. Tag pages (i.e. pages for individual tags with links to the posts with that tag):
    • YOUR_HEADING: page title (quickblog: :blog-title + " - Tag - " + tag name)
    • YOUR_URL: page URL (quickblog: :blog-root + "tags/{{tag}}.html")
    • YOUR_SUMMARY: (quickblog: "Posts tagged '{{tag}}' - " + :blog-description)
    • YOUR_IMAGE_URL: (quickblog: :blog-image)
  5. Posts:
    • YOUR_HEADING: page title, which is the value of the post's title metadata (specified in Markdown as Title: Something or other, as detailed in the very first Hacking the blog post)
    • YOUR_URL: page URL (quickblog: :blog-root + "{{file}}.html"; assuming the post's Markdown file is called something.md, file will be "something")
    • YOUR_SUMMARY: let's add a new piece of metadata to the Markdown file called Description: (I know the article I referenced is calling it YOUR_SUMMARY, but I figure it's less surprising for this to match the name of the meta tags where we'll put it)
    • YOUR_IMAGE_URL: let's add an Image: metadata for this

Having figured out what to put in the meta tags, let's actually implement this! The nice thing about my blog being powered by quickblog is that all of the changes happen there (and are thus available to all quickblog users). Let's start by cloning quickblog. I'll open a terminal, change to the parent directory of my blog, and then run:

$ git clone git@github.com:borkdude/quickblog.git

Now I have a quickblog directory as a sibling of the jmglov.net directory that contains my blog. In order for my blog to pick up the local changes I'm about to make to quickblog, I need to change my dependency from using quickblog from Github to use the local copy instead.

My bb.edn currently looks like this:

{:deps {io.github.borkdude/quickblog
        #_"You use the newest SHA here:"
        {:git/sha "1c26f244003e590863ae6bba0b25b2ba6a258ac9"}}
 ;; ...
 }

I'll change it to this:

{:deps {io.github.borkdude/quickblog {:local/root "../quickblog"}
        #_"You use the newest SHA here:"
        #_{:git/sha "1c26f244003e590863ae6bba0b25b2ba6a258ac9"}}
 ;; ...
 }

I left the {:git/sha "1c26f244003e590863ae6bba0b25b2ba6a258ac9"} bit there for reference, but commented it out with the #_ reader macro, which causes Clojure's reader to ignore the next form. You can think of it as more or less the `/* ... */` style comment in languages like Java and C.
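
For example, both of these forms evaluate to 3, because the reader throws away the form immediately following #_:

(+ 1 2 #_(println "never evaluated"))
(+ #_10 1 2)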

Now, any changes I make to my local quickblog directory will be reflected in my blog when I run bb render.

Now that we're all set up, let's take a look at the quickblog source code and figure out how we're going to do this. The place to start is the page template, base.html. If we open it up and take a look at the <head> section, here's what we see:

  <head>
    <title>{{title}}</title>
    <meta charset="utf-8"/>
    <link type="application/atom+xml" rel="alternate" href="{{relative-path | safe}}atom.xml" title="{{title}}">
    <link rel="stylesheet" href="{{relative-path | safe}}style.css">
    <script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.28.0/prism.min.js"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.28.0/components/prism-clojure.min.js"></script>
    {{watch | safe }}
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.28.0/themes/prism.min.css">

{% if favicon-tags %}{{favicon-tags | safe}}{% endif %}
  </head>

The {% ... %} and {{...}} stuff are Selmer tags and variables. The {% if ... %} tag includes the stuff before the {% endif %} if the condition is true, and the {{foo}} is substituted with the value of the foo template variable, or the empty string if the foo template variable is undefined or nil.

Let's add our social sharing tags below the {% if favicon-tags %} line (which you may remember from the "Hacking the blog: favicon" post). Since all pages have a title, we can include those tags unconditionally:

    <!-- Social sharing (Facebook, Twitter, LinkedIn, etc.) -->
    <meta name="title" content="{{title}}">
    <meta name="twitter:title" content="{{title}}">
    <meta property="og:title" content="{{title}}">

We can also throw in og:type, since that should always be "website" for our purposes:

    <meta property="og:type" content="website">

Since the template is already using {{title}}, we feel confident that the quickblog rendering code is providing it. Let's move on now to the description (YOUR_SUMMARY, I know, it's confusing; sorry). Let's add the tags to the template:

{% if sharing.description %}
    <meta name="description" content="{{sharing.description}}">
    <meta name="twitter:description" content="{{sharing.description}}">
    <meta property="og:description" content="{{sharing.description}}">
{% endif %}

This is something new. When we include a . in a template variable, what we're saying is that the bit before the dot is a map which contains a field named by the bit after the dot. In this case, we expect a template variable called sharing to be provided like this:

:sharing {:description "something"}

We'll wrap this whole thing in an {% if %} ... {% endif %} so that nothing will be added to the template if the sharing.description variable is undefined.
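
We can convince ourselves of how this behaves with a quick REPL experiment (a sketch calling selmer.parser directly):

(require '[selmer.parser :as selmer])

(selmer/render "{% if sharing.description %}{{sharing.description}}{% endif %}"
               {:sharing {:description "something"}})
;; => "something"

(selmer/render "{% if sharing.description %}{{sharing.description}}{% endif %}" {})
;; => ""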

Let's have faith that future us will find a way to provide the sharing.description variable somehow and forge on with our template. Next up is the URL:

{% if sharing.url %}
    <meta name="twitter:url" content="{{sharing.url}}">
    <meta property="og:url" content="{{sharing.url}}">
{% endif %}

Again, we'll have faith in our future selves, competent programmers that we are! The final piece of the puzzle is the image. We'll follow the same pattern, but with one small tweak:

{% if sharing.image %}
    <meta name="twitter:image" content="{{sharing.image}}">
    <meta name="twitter:card" content="summary_large_image">
    <meta property="og:image" content="{{sharing.image}}">
    <meta property="og:image:alt" content="{{sharing.image-alt}}">
{% else %}
    <meta name="twitter:card" content="summary">
{% endif %}

The og:image:alt property is one that I tracked down in the Open Graph protocol documentation, and it provides alt text for the image, which is extremely important for making pages accessible to people using screen readers. I highly recommend reading resources like "Write good Alt Text to describe images" to learn more.

The twitter:card property has multiple options, according to Twitter's cards documentation:

  • summary
  • summary_large_image
  • app
  • player

It does not specify what these mean. I guess Twitter needs to keep the mystery alive! What we'll do for now is use summary_large_image when we have an image, and regular old "summary" when we don't.

According to this page, Twitter has another couple of meta tags we can set:

  • twitter:site - @username for the website used in the card footer
  • twitter:creator - @username for the content creator / author

We might as well do that, since quickblog has a :twitter-handle option.

{% if sharing.author %}
    <meta name="twitter:creator" content="{{sharing.author-twitter-handle}}">
{% endif %}
{% if sharing.twitter-handle %}
    <meta name="twitter:site" content="{{sharing.twitter-handle}}">
{% endif %}

The reason for defining them separately is that :twitter-handle is the owner of the blog, but the author of an individual post might be different, and we'll allow that to be specified with the Twitter-Handle: metadata tag in the post.

OK, now we have everything taken care of in the template itself. Let's turn our roving eye to the rendering code, starting with the index.

If we open up src/quickblog/api.clj, we'll find a spit-index function at line 157. It does some figuring out of which posts to include in the index, then makes a call to lib/write-page!. This is where the template variables are defined:

{:title blog-title
 :body body}

Looking back at our template, we want to add the following keys and values:

  • description
  • url
  • image
  • author-twitter-handle
  • twitter-handle

All of the information we need is contained in the opts that are passed to the function. Let's add the keys we need to the destructuring form:

(defn- spit-index
  [{:keys [blog-title blog-description blog-image blog-image-alt
           blog-root twitter-handle
           posts cached-posts deleted-posts modified-posts num-index-posts
           out-dir]
    :as opts}]

Now we can fill in the map of template variables:

(lib/write-page! opts out-file
                 (base-html opts)
                 {:title blog-title
                  :body body
                  :sharing {:description blog-description
                            :author twitter-handle
                            :twitter-handle twitter-handle
                            :image (format "%s/%s" blog-root blog-image)
                            :image-alt blog-image-alt
                            :url (format "%s/index.html" blog-root)}})

In this case, both the author and site Twitter handles are the same, since this is the index page of the entire blog.

There's only one thing here that is slightly worrisome: does the value of the :blog-root option end in a / or not? quickblog's documentation is silent on the matter, so we'd better handle both cases just to be safe. Let's add a function to internal.clj to take care of this:

(defn blog-link [{:keys [blog-root] :as opts} relative-url]
  (when relative-url
    (format "%s%s%s"
            blog-root
            (if (str/ends-with? blog-root "/") "" "/")
            relative-url)))
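
Now both flavors of :blog-root produce the same link:

(blog-link {:blog-root "https://jmglov.net"} "index.html")
;; => "https://jmglov.net/index.html"
(blog-link {:blog-root "https://jmglov.net/"} "index.html")
;; => "https://jmglov.net/index.html"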

And now we can use this in spit-index:

(defn- spit-index
  [{:keys [blog-title blog-description blog-image blog-image-alt twitter-handle
           posts cached-posts deleted-posts modified-posts num-index-posts
           out-dir]
    :as opts}]
  ;; ...
        (lib/write-page! opts out-file
                         (base-html opts)
                         {:title blog-title
                          :body body
                          :sharing {:description blog-description
                                    :author twitter-handle
                                    :twitter-handle twitter-handle
                                    :image (lib/blog-link opts blog-image)
                                    :image-alt blog-image-alt
                                    :url (lib/blog-link opts "index.html")}})))))

Note that we no longer need the blog-root key in our destructuring form, so we've removed it to be neat and tidy.

Now onto the archive page. We see that there's a spit-archive function on line 181, so we'll do some very similar modifications there:

(defn- spit-archive [{:keys [blog-title blog-description
                             blog-image blog-image-alt twitter-handle
                             modified-metadata posts out-dir] :as opts}]
  ;; ...
        (lib/write-page! opts out-file
                         (base-html opts)
                         {:skip-archive true
                          :title title
                          :body (hiccup/html (lib/post-links "Archive" posts))
                          :sharing {:description (format "Archive - %s"
                                                         blog-description)
                                    :author twitter-handle
                                    :twitter-handle twitter-handle
                                    :image (lib/blog-link opts blog-image)
                                    :image-alt blog-image-alt
                                    :url (lib/blog-link opts "archive.html")}})))))

The tags page is up next, but there's no conveniently named spit-tags function, so we'll have to figure out how this page is generated. If we just search api.clj for tags, we get a promising hit on line 120:

(defn- gen-tags [{:keys [blog-title modified-tags posts
                         out-dir tags-dir]
                  :as opts}]
  ;; ...
      (lib/write-page! opts tags-file template
                       {:skip-archive true
                        :title (str blog-title " - Tags")
                        :relative-path "../"
                        :body (hiccup/html (lib/tag-links "Tags" posts-by-tag))})
      ;; ...

Ah, our old friend lib/write-page!. Let's rinse and repeat here:

(defn- gen-tags [{:keys [blog-title blog-description
                         blog-image blog-image-alt twitter-handle
                         modified-tags posts out-dir tags-dir]
                  :as opts}]
  ;; ...
      (lib/write-page! opts tags-file template
                       {:skip-archive true
                        :title (str blog-title " - Tags")
                        :relative-path "../"
                        :body (hiccup/html (lib/tag-links "Tags" posts-by-tag))
                        :sharing {:description (format "Tags - %s"
                                                       blog-description)
                                  :author twitter-handle
                                  :twitter-handle twitter-handle
                                  :image (lib/blog-link opts blog-image)
                                  :image-alt blog-image-alt
                                  :url (lib/blog-link opts "tags/index.html")}})
      ;; ...

gen-tags looks like it also handles the individual tag pages:

(doseq [tag-and-posts posts-by-tag]
  (lib/write-tag! opts tags-out-dir template tag-and-posts))

Let's drill into the lib/write-tag! function, defined on line 383 of internal.clj:

(defn write-tag! [{:keys [blog-title modified-tags] :as opts}
                  tags-out-dir
                  template
                  [tag posts]]
  (let [tag-filename (fs/file tags-out-dir (tag-file tag))]
    (when (or (modified-tags tag) (not (fs/exists? tag-filename)))
      (write-page! opts tag-filename template
                   {:skip-archive true
                    :title (str blog-title " - Tag - " tag)
                    :relative-path "../"
                    :body (hiccup/html (post-links (str "Tag - " tag) posts
                                                   {:relative-path "../"}))}))))

Nice! There's a call to write-page!, so we know exactly what we need to do:

(defn write-tag! [{:keys [blog-title blog-description
                          blog-image blog-image-alt twitter-handle
                          modified-tags] :as opts}
                  tags-out-dir
                  template
                  [tag posts]]
  ;; ...
      (write-page! opts tag-filename template
                   {:skip-archive true
                    :title (str blog-title " - Tag - " tag)
                    :relative-path "../"
                    :body (hiccup/html (post-links (str "Tag - " tag) posts
                                                   {:relative-path "../"}))
                    :sharing {:description (format "Posts tagged \"%s\" - %s"
                                                   tag blog-description)
                              :author twitter-handle
                              :twitter-handle twitter-handle
                              :image (blog-link opts blog-image)
                              :image-alt blog-image-alt
                              :url (blog-link opts (str "tags/" (tag-file tag)))}}))

There's only one thing left to do: the post pages. Let's see if we can figure out how they're rendered.

Back in api.clj, there's a gen-posts function at line 89. It's a bit long and scary looking, but there is a call to a lib/write-post! function at line 102, so it looks like we can probably get away with leaving gen-posts as is and making our changes in lib/write-post!. Let's have a look:

(defn write-post! [{:keys [discuss-fallback
                           cache-dir
                           out-dir
                           force-render
                           page-template
                           post-template
                           posts-dir]
                    :as opts}
                   {:keys [file title date discuss tags html]
                    :or {discuss discuss-fallback}}]
  (let [out-file (fs/file out-dir (html-file file))
        markdown-file (fs/file posts-dir file)
        cached-file (fs/file cache-dir (cache-file file))
        body (selmer/render post-template {:body @html
                                           :title title
                                           :date date
                                           :discuss discuss
                                           :tags tags})
        rendered-html (render-page opts page-template
                                   {:title title
                                    :body body})]
    (println "Writing post:" (str out-file))
    (spit out-file rendered-html)))

There are a few things to note here:

  1. There are two :keys destructurings happening here. The first is our old friend opts, but the second has no name. The names of the keys look familiar, though. Title:, Date:, and Tags: are the pieces of metadata automatically added to new posts when we run the bb new command, so let's assume that this second set of keys is the metadata defined in the post itself, plus some extra metadata that quickblog attaches.
  2. There's a call to selmer/render here, which appears to be rendering the body of the post. Since the <meta> tags we're adding go in the <head> section of the page, we can safely ignore this part.
  3. There's no call to write-page!, but render-page looks pretty similar. Let's add our template variables there.

First, we'll add twitter-handle to the opts destructuring, give the second argument a name, post-metadata, and add the description, image, and image-alt keys to it:

(defn write-post! [{:keys [blog-root
                           twitter-handle
                           discuss-fallback
                           cache-dir
                           out-dir
                           force-render
                           page-template
                           post-template
                           posts-dir]
                    :as opts}
                   {:keys [file title date discuss tags html
                           description image image-alt]
                    :or {discuss discuss-fallback}
                    :as post-metadata}]
  ;; ...

Now, let's figure out what the values of the template variables should be. description and image-alt are straightforward; it's what the post author added as the Description: and Image-Alt: metadata in the post, so we can use it as is.

url is only a bit more complicated. We can use the blog-link function as usual, and the relative-url argument should be the name of the HTML file corresponding to this post. We can see on line 363 that the output file uses a function called html-file, which transforms the post's foo.md file into foo.html. Just what we needed!

twitter-handle, which is the Twitter handle of the blog owner, can be used straight up. For author, let's look first for a twitter-handle key in the post metadata, and then fall back to the blog's twitter-handle otherwise:

author (-> (:twitter-handle post-metadata) (or twitter-handle))

Finally, we want the post's author to be able to add Image: metadata to the post, which they should be able to specify either as an absolute URL or a relative URL. We can handle that here:

image (when image (if (re-matches #"^https?://.+" image)
                    image
                    (blog-link opts image)))

Now we can just feed these keys to the render-page function:

rendered-html (render-page opts page-template
                           {:title title
                            :body body
                            :sharing (->map description
                                            author
                                            twitter-handle
                                            image
                                            image-alt
                                            url)})

Let's take a brief detour to look at this ->map bit. It's a macro that lets us define a map with keys named the same as the variables holding the values. Or in other words, these two things are equivalent:

(->map description author twitter-handle image image-alt url)

{:description description
 :author author
 :twitter-handle twitter-handle
 :image image
 :image-alt image-alt
 :url url}

In case you're interested, the macro is defined at line 27:

(defmacro ->map [& ks]
  (assert (every? symbol? ks))
  (zipmap (map keyword ks)
          ks))
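
A quick macroexpand-1 at the REPL shows the map being built at expansion time:

user> (macroexpand-1 '(->map description url))
;; {:description description, :url url}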

If you're interested but don't understand what's going on here, I can highly recommend "Mastering Clojure Macros", by Colin Jones, or Chapter 8 of "Clojure for the Brave and True", by Daniel Higginbotham. You can read "Clojure for the Brave and True" for free online, but if you can afford to show Daniel some monetary appreciation, you can order the print version using his affiliate link: http://amzn.to/1H7MqmT.

OK, we actually have everything we need to make this work! Let's generate a new post and test it out:

$ bb new --file test.md --title "Test post"

If we open up posts/test.md, we can add some metadata tags:

Title: Test post
Date: 2022-08-17
Tags: clojure
Twitter-Handle: jmglov
Description: This is an amazing blog post which tests the equally amazing social sharing functionality that we just added to quickblog!
Image: https://jmglov.net/test/2022-08-16-sharing-preview.png
Image-Alt: A leather-bound notebook lies open on a writing desk

Write a blog post here!

Now let's try things out! If we run:

$ bb watch

we can browse to our blog at http://localhost:1888/. We should see the index page, and if we click on the Test post link, we can view the source of the page, look at the <head> section, and see:

<head>
  <!-- some boring stuff here -->

  <!-- Social sharing (Facebook, Twitter, LinkedIn, etc.) -->
  <meta name="title" content="Test post">
  <meta name="twitter:title" content="Test post">
  <meta property="og:title" content="Test post">
  <meta property="og:type" content="website">

  <meta name="description" content="This is an amazing blog post which tests the equally amazing social sharing functionality that we just added to quickblog!">
  <meta name="twitter:description" content="This is an amazing blog post which tests the equally amazing social sharing functionality that we just added to quickblog!">
  <meta property="og:description" content="This is an amazing blog post which tests the equally amazing social sharing functionality that we just added to quickblog!">

  <meta name="twitter:image" content="https://jmglov.net/test/2022-08-16-sharing-preview.png">
  <meta name="twitter:card" content="summary_large_image">
  <meta property="og:image" content="https://jmglov.net/test/2022-08-16-sharing-preview.png">
  <meta property="og:image:alt" content="A leather-bound notebook lies open on a writing desk">

  <meta name="twitter:creator" content="jmglov">
  <meta name="twitter:site" content="quickblog">
</head>

Awesome! But how can we know what this will look like when shared on social media sites? Well, I've done us all the great service of uploading this page to my website, so we can use metatags.io to test it. If we pop https://jmglov.net/test/social-post.html into the text box at the top of the site, we should see something like this:

The metatags.io site showing a preview of the social sharing card for our test page

The spectacularity of this accomplishment cannot be overstated, my friends! 🏆

In case you're a quickblog user and you want to benefit from this stuff without having to do a bunch of typing, fear not! The latest version of quickblog already includes this functionality. 🙂

Permalink

Contract Programmer Seeks Job in Cambridge (£500 reward)

Anyone in Cambridge need a programmer? I'll give you £500 if you can find me a job that I take.

CV at http://www.aspden.com

I make my usual promise, which I have paid out on several times:

If, within the next six months, I take a job which lasts longer than one month, and that is not obtained through an agency, then on the day the first cheque from that job cashes, I'll give £500 to the person who provided the crucial introduction.

If there are a number of people involved somehow, then I'll apportion it fairly between them. And if the timing conditions above are not quite met, or someone points me at a shorter contract which the £500 penalty makes not worth taking, then I'll do something fair and proportional anyway.

And this offer applies even to personal friends, and to old contacts whom I have not got round to calling yet, and to people who are themselves offering work, because why wouldn't it?

And obviously if I find one through my own efforts then I'll keep the money. But my word is generally thought to be good, and I have made a public promise on my own blog to this effect, so if I cheat you you can blacken my name and ruin my reputation for honesty, which is worth much more to me than £500.



And I also make the following boast:

I know all styles of programming and many languages, and can use any computer language you're likely to use as it was intended to be used.

I have a particular facility with mathematical concepts and algorithms of all kinds. I can become very interested in almost any problem which is hard enough that I can't solve it easily.

I have a deserved reputation for being able to produce heavily optimised, but nevertheless bug-free and readable code, but I also know how to hack together sloppy, bug-ridden prototypes, and I know which style is appropriate when, and how to slide along the continuum between them.

I've worked in telecoms, commercial research, banking, university research, chip design, server virtualization, university teaching, sports physics, a couple of startups, and occasionally completely alone.

I've worked on many sizes of machine. I've written programs for tiny 8-bit microcontrollers and gigantic servers, and once upon a time every IBM machine in the Maths Department in Imperial College was running my partial differential equation solvers in parallel in the background.

I'm smart and I get things done. I'm confident enough in my own abilities that if I can't do something I admit it and find someone who can.

I know what it means to understand a thing, and I know when I know something. If I understand a thing then I can usually find a way to communicate it to other people. If other people understand a thing even vaguely I can usually extract the ideas from them and work out which bits make sense.

Permalink

The Best IntelliJ Productivity Key-bindings 2022

NOTE: These shortcuts are intended for people using the Clojure language with a structural editing style, like paredit mode

[The prerequisite setup and the editing, REPL, and Git shortcut tables were embedded media in the original Medium post and did not survive syndication.]

The Best IntelliJ Productivity Key-bindings 2022 was originally published in helpshift-engineering on Medium, where people are continuing the conversation by highlighting and responding to this story.

Permalink

What is a high-level language?

We've all heard the term "high-level language". Initially, it referred to the step from assembly languages to compiled languages. But it has another definition, which has to do with how well the language lets you think.

Permalink

Understanding transducers

Some time ago I ported most of Clojure's core namespace to Fennel and made it into a library called fennel-cljlib. This was my first library for Fennel, so it wasn't really great in terms of how it was implemented. While it made Fennel more Clojure-like syntax-wise, which I like, it didn't follow Clojure's semantics that well. The main thing missing was a proper seq abstraction, which Clojure relies on to provide both lazy sequences and generic functions that work on any data type that implements ISeq. Such functions are map, filter, take, and so on.

Since then, I've made a few more libraries for Fennel that were somewhat more narrowly focused, and one such library was lazy-seq - an implementation of Clojure's lazy sequences. It doesn't feature chunked sequences (yet, maybe), but it implements almost all sequence-related functions from Clojure. And you can throw pretty much any Lua data structure that implements pairs or ipairs at it, and it will work out how to lazily transform it into a sequence.

This was one of the missing pieces for fennel-cljlib, as its implementation of seq simply made a shallow copy of a given object in linear time, making sure that the result was sequential. With the implementation of seq from the lazy-seq library, I could rewrite fennel-cljlib, also making all sequence-related functions lazy. And while this makes the library more Clojure-like, one piece was still missing from both the fennel-cljlib and lazy-seq libraries.

Transducers.

I’m quite familiar with transducers: I use them regularly at work, and I read about their implementation a few years ago. However, I’ve never implemented a transduceable context, i.e. a function that accepts a transducer and applies it to the elements of a given collection. So, as part of the rewrite of the fennel-cljlib library, I needed not only to port the transducers themselves, but also to implement functions such as into, transduce, and sequence, which are transduceable contexts.
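To get a feel for what a transduceable context looks like, here's roughly what into boils down to, leaving out the transient and metadata handling that the real version does:

(defn into*
  "A conceptual sketch of clojure.core/into."
  ([to from] (reduce conj to from))           ; plain reduce with conj
  ([to xform from] (transduce xform conj to from))) ; transducer arity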

Thankfully, into and transduce are written in Clojure, and are very straightforward to understand, but the sequence function is not. Here’s its source code:

(defn sequence
  ([coll]
   (if (seq? coll) coll
       (or (seq coll) ())))
  ([xform coll]
     (or (clojure.lang.RT/chunkIteratorSeq
         (clojure.lang.TransformerIterator/create xform (clojure.lang.RT/iter coll)))
       ()))
  ([xform coll & colls]
     (or (clojure.lang.RT/chunkIteratorSeq
         (clojure.lang.TransformerIterator/createMulti
           xform
           (map #(clojure.lang.RT/iter %) (cons coll colls))))
       ())))

It is written mostly via Java interop, and I can’t use this in my port of the clojure.core namespace to Fennel, because Fennel runs on Lua, and Lua can’t really interop with Java. So I had to reimplement this function in Fennel, and that is what motivated me to write this post. Also, I don’t really know Java, so understanding how sequence works was a challenge of its own.¹

The interesting thing is that after trying to implement this function several times, I realized that I didn’t actually understand transducers as well as I thought. So I had to go back a bit and learn how transducers actually work, and why they are written the way they are. It was really fascinating, and after a bit of trial and error, I managed to implement sequence in Clojure first, and then port it to Fennel, using my implementation of lazy sequences.

I will show the resulting code later in this post, but first, let’s understand what transducers are, and how they work.

Transducers

First, a bit of theory. A transducer is a function that describes a process of transformation without knowing how exactly the thing it transforms is organized. This is not the same as a generic function, because transducers are generic in a somewhat different way.

For example, we all know how map works in Clojure - you pass it a function and a collection, and it applies the function to each element of the collection:

(map inc [1 2 3]) ;; => (2 3 4)

map walks through the given collection, applies the given function to each element, and puts the results into a sequence. Seems nice, and Clojure actually makes map generic by transforming the given collection into a sequence before mapping over it. However, this approach isn’t generic enough, because some things can’t be transformed into a sequence in an efficient or even meaningful way. One such thing is an asynchronous channel, and when the Clojure developers were working on the core.async library, they realized that the sequence manipulation functions would be useful in the context of a channel, but reimplementing all of these functions for channels was a no-go.
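As an aside, that seq conversion is exactly why map accepts anything seqable, strings included:

user> (map int "abc") ; a string seqs into its characters
(97 98 99)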

So how did Clojure developers solve this problem? By decoupling the transformation process from the collection it transforms. Here’s a helpful analogy.

Imagine an airport worker, whose job is to weigh and sort the luggage before it goes into a plane. Their basic instructions are:

  • Take a bag;
  • Measure its weight and put a sticker on the bag;
  • If the bag weight is bigger than X, don’t put the bag on the plane;
  • Hand the bag over.

Note that this process, while applied to a single bag at a time, doesn’t at all specify how bags come to you or how they leave you. One day bags may come to you in containers brought by a vehicle; the next day they may arrive on a conveyor. It doesn’t matter to you - you just take a bag, weigh it, put a sticker on it, and hand it over to another car or another conveyor. These details should not matter, because your job remains the same, even if you take bags from different sources every day.

This is, of course, an analogy, but it applies to programming pretty well. Bags are items in the collection you’re going to map a function over and then filter. The function is what you do with the bag - in the case of our fellow airport worker, weighing bags and putting stickers on them. In addition, notice that we first weigh the bag and then filter it out immediately, as opposed to weighing all the bags and then filtering them one by one.

However, in a programming language, the way items come to us depends entirely on the collection. And how we collect results into another collection depends on the implementation of the map function. These are the two main things stopping us from describing an arbitrary transformation process without tying ourselves to a particular collection implementation or a conversion protocol.

Looking at other languages, which provide different classes for different data structures, map is most of the time a method, not a function. This way map can be implemented in terms of the collection you’re mapping over, usually producing the same kind of collection as a result, because the method knows how to map a function over that particular collection implementation.

Methods do not fly in functional languages, so another approach some languages take is to provide different implementations of map via namespaces. For example, Elixir has a map function implemented for lists and another implementation for streams:

# Enum versions of map and filter
iex(1)> (0..3
         |> Enum.map(fn(x) -> IO.puts("map #{x}"); x + 1 end)
         |> Enum.filter(fn(x) -> IO.puts("filter #{x}"); rem(x, 2) != 0 end)
         |> List.first)
map 0
map 1
map 2
map 3
filter 1
filter 2
filter 3
filter 4
1

# Stream versions of map and filter
iex(2)> (0..3
         |> Stream.map(fn(x) -> IO.puts("map #{x}"); x + 1 end)
         |> Stream.filter(fn(x) -> IO.puts("filter #{x}"); rem(x, 2) != 0 end)
         |> Enum.to_list
         |> List.first)
map 0
filter 1
map 1
filter 2
map 2
filter 3
map 3
filter 4
1
Code Snippet 1: Elixir approach to the problem

The difference is quite substantial, as streams apply the function composition to each element one by one (similarly to how our airport worker does), whereas enumerations are fully traversed by map first, and only then is the result passed to filter. This distinction is possible thanks to the different implementations of map, but, be it a method on a specific class or a namespaced function, that's exactly what the Clojure developers wanted to avoid. So transducers were created.

Understanding transducers

To understand transducers, we first need to understand what map essentially does and how we can abstract it away from both the input collection and the result it produces. It may seem obvious: map applies a function to each element of a collection and puts the result into a new collection. I’ve marked the important things in bold, because if we think about them a bit, we’ll see that there are some concrete actions the map function performs which should be abstracted away.

First, let’s implement map in terms of reduce. This function is actually very similar to how mapv is implemented in Clojure, except we’ve left out some optimizations:

(defn mapr
  "A version of `map` that uses `reduce` to traverse the collection and
  build the result."
  [f coll]
  (reduce (fn reducer [result val] (conj result (f val))) [] coll))
;; ^       ^                        ^            ^        ^  ^
;; |       |                        |            |        |  |
;; |       |                        |            |        |  `Collection to iterate
;; |       |                        |            |        `Collection to put results
;; |       |                        |            `Function that produces the result
;; |       `A reducing function that knows how to put elements into the result
;; `This is how we get a collection element one by one

Here, reduce takes care of how to traverse the collection, and the reducer function takes care of how to put elements into the result. It may seem that we’ve decoupled these steps from map, but we’ve just moved them to another place. More than that, if we were to implement filter this way, we would have to put the logic that decides which elements are left out into an anonymous function:

(defn filterr
  "A version of `filter` that uses `reduce` to traverse the collection
  and build the result."
  [f coll]
  (reduce (fn [res x] (if (f x) (conj res x) res)) [] coll))
;; ^                   ^
;; |                   |
;; |                   `Logic that decides whether the value will be left out
;; `generic way to iterate through a collection

Notice that both mapr and filterr share a lot of structure; the only difference is how the resulting value is put into the collection. This should give us a hint about how we can abstract this away. And given that Clojure is functional, we can write functions that accept other functions and return new functions, which provides a generic way to put the result into a collection:

(defn map-transducer [f]
  (fn [reducer]
    (fn [result value]
      (reducer result (f value)))))

(defn filter-transducer [pred]
  (fn [reducer]
    (fn [result value]
      (if (pred value)
        (reducer result value)
        result))))

This isn’t quite what a real transducer looks like, but it’s a first step towards one. The key point is that we can now describe a reducing process without knowing how to put the modified item into the resulting collection (or channel, or socket, or whatever). The only thing we need to implement for a collection now is reduce, or some other function that has the same interface as reduce. Here's how we can use these prototype transducers:

user> (def incrementer (map-transducer inc)) ; a function that knows how to increment
#'user/incrementer
user> (incrementer conj) ; teaching `incrementer` how to put elements to the result
#function[user/map-transducer/fn--5698/fn--5699]
user> (reduce (incrementer conj) [] [1 2 3]) ; using this transducer in `reduce`
[2 3 4]
user> (reduce ((filter-transducer odd?) conj) [] [1 2 3]) ; same for `filter-transducer`
[1 3]

So what happens here is that when we call map-transducer and pass it the function inc, it returns a function that accepts the reducer, also known as the reducing function, or just rf for short. We then call this function, passing it the reducing function conj, and get another function that accepts the result so far and the element to process. This function calls inc, which we supplied in the first step, on the element, and uses conj to put the resulting value into the result. In other words, by passing inc and conj to the transducer, we've basically constructed the function (fn [result value] (conj result (inc value))), which is then used by reduce. Here's a demonstration:

user> (reduce ((map-transducer inc) conj) [] [1 2 3])
[2 3 4]
user> (reduce (fn [result value] (conj result (inc value))) [] [1 2 3])
[2 3 4]

And that’s basically what transducers are all about! They’re just a composition of functions that produces the final transformation function, which acts as a single step over the given collection. And the amazing part of this design is that transducers can be composed with other transducers:

user> (reduce ((comp (map-transducer inc)
                     (filter-transducer odd?))
               conj)
              [] [1 2 3 4 5 6])
[3 5 7]
user> (reduce (fn [result value]       ; above is essentially the same as this
                ((fn [result value]    ; function composition
                   (if (odd? value)
                     (conj result value)
                     result))
                 result (inc value)))
              [] [1 2 3 4 5 6])
[3 5 7]

It may be a little hard to process, but don’t worry - I will go into the details after we complete the implementation of transducers, as we’re not finished yet.

Completing transducers

Most transducers don’t have any intermediate state, but some do. For example, the partition-all function takes a collection and returns a list of partitions of elements from that collection. The transducer returned by this function needs to store elements in an array, and only once the current partition is filled does it append the partition to the result. Seems logical; however, if the number of elements can’t be evenly partitioned, some will be left over:

user> (partition-all 3 (range 8)) ; a regular partition-all call
((0 1 2) (3 4 5) (6 7))
user> (defn partition-all-transducer
        "Our naive implementation of `partition-all` as a transducer."
        [n]
        (fn [reducing-function]
          (let [p (volatile! [])]
            (fn [result value]
              (vswap! p conj value)     ; building the partition
              (if (= (count @p) n)
                (let [p* @p]
                  (vreset! p [])        ; clearing the partition storage
                  (reducing-function result p*)) ; adding the partition to the result
                result ; returning result as is, if the partition is not yet complete
                )))))
#'user/partition-all-transducer
user> (reduce ((partition-all-transducer 3) conj) [] (range 8))
[[0 1 2] [3 4 5]]

We can see that in the case of our implementation, only complete partitions were added to the result, yet there should be an additional incomplete partition, as shown by the direct partition-all call. This is because our transducer is missing a so-called completion step:

(defn partition-all-transducer [n]
  (fn [reducing-function]
    (let [p (volatile! [])]
      (fn
        ([result]                       ; completion arity
         (if (pos? (count @p))
           (reducing-function result @p)
           (reducing-function result)))
        ([result value]                 ; reduction arity
         (vswap! p conj value)
         (if (= (count @p) n)
           (let [p* @p]
             (vreset! p [])
             (reducing-function result p*))
           result))))))

Here I’ve added another arity that must be called after the reduction process is complete. This arity checks whether the array we’ve used to store the incomplete partition is non-empty. If it is, it means there are some leftovers that we need to add to the result, so it calls reducing-function with this array; otherwise it calls reducing-function with result only, propagating the completion step down the line. Invoking this arity after reduce completes, we can see that all partitions are present:

user> (let [f ((partition-all-transducer 3) conj)
            res (reduce f [] (range 8))]
        (f res)) ;; complete step
[[0 1 2] [3 4 5] [6 7]]

Our reduce example has become way too verbose, and there's also potential for error if our transducer leaked out of this scope and someone used it after it was completed. Notice that I forgot to clear the volatile p in the completion step, so if someone calls this particular function again, the leftover elements will be added to the result again.
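Here's what that leak looks like: calling the completion arity a second time adds the leftover partition again, because p still holds it:

user> (let [f ((partition-all-transducer 3) conj)
            res (f (reduce f [] (range 8)))] ; reduce, then complete once
        (f res))                             ; completing again leaks [6 7]
[[0 1 2] [3 4 5] [6 7] [6 7]]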

Therefore, Clojure abstracts the process of finalizing a transducer into a function, conventionally called transduce:

(defn transduce
  ([xform f coll]
   (transduce xform f (f) coll))
  ([xform f init coll]
   (let [f (xform f)
         ret (reduce f init coll)]
     (f ret))))

There’s one additional arity that can be added to our transducer implementation, which is used for initialization and is invoked by calling the reducing function without arguments. This arity takes zero arguments, and it is optional, as not all reducing functions can come up with a meaningful initial value, but it’s better to supply it than not. So a complete implementation of the map-transducer function is:

(defn map-transducer [f]
  (fn [reducing-function]
    (fn
      ([]                               ; init
       (reducing-function))
      ([result]                         ; complete
       (reducing-function result))
      ([result input]                   ; step
       (reducing-function result (f input)))
      ([result input & inputs]          ; step with multiple inputs
       (reducing-function result (apply f input inputs))))))
Code Snippet 2: A complete implementation of a mapping transducer

And that’s it! This is a complete implementation of a transducer. Clojure provides this as an additional arity of map, where you supply only the function without the collection, so we don’t even need a separate map-transducer function; it was merely for demonstration purposes.
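We can also see the init arity in action by letting transduce compute the initial value itself; with conj as the reducing function, calling (conj) with no arguments returns an empty vector:

user> (conj)
[]
user> (transduce (map inc) conj (range 5)) ; init comes from (conj)
[1 2 3 4 5]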

If you look back at the Elixir example, you can see that when the Stream implementations of map and filter are used, each function is applied in quick succession, as opposed to the Enum version, where map is applied first and only then is the result filtered as a whole. With the transducers we just implemented, or with the ones available to us in Clojure, we can do exactly the same:

user> (->> (range 4)
           (map (fn [x] (println "map" x) (inc x)))
           (filter (fn [x] (println "filter" x) (odd? x)))
           first)
;; map 0
;; map 1
;; map 2
;; map 3
;; filter 1
;; filter 2
;; filter 3
;; filter 4
;; => 1
user> (->> (range 4)
           (transduce
            (comp (map (fn [x] (println "map" x) (inc x)))
                  (filter (fn [x] (println "filter" x) (odd? x))))
            conj)
           first)
;; map 0
;; filter 1
;; map 1
;; filter 2
;; map 2
;; filter 3
;; map 3
;; filter 4
;; => 1

Now let’s really understand how transducers work.

Understanding the inverse order in comp and how transducers are composed

The transduce call may have seemed a bit complicated because of comp, but here's a nice trick I've found. Simply remember that the order of transducers in comp is exactly the same as the order of calls in the ->> macro. This may seem counter-intuitive, because usually functions in comp are applied in the reverse order, e.g.:

((comp a  b  c  d) x)
                                        ; expressions are aligned for clarity
      (a (b (c (d  x))))

In other words, even though functions are provided in order a, b, c, d, the call order will be d, c, b, a. And in the case of transducers the composition works the same way, it’s just we have one extra step after we’ve passed the reducing function.
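A quick REPL check of this ordinary comp behaviour:

user> ((comp str inc) 41) ; inc runs first, then str
"42"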

As I’ve demonstrated in this example, the composition is basically engineered in such a way that it kind of inverts its order twice. The first inversion happens when we compose the functions, and the second happens when we pass in the reducing function. Let’s use the substitution model, sometimes referred to as normal order evaluation, to see why this happens.

Substitution here basically means that before we do any reduction, we expand all forms until they only contain primitives. This is done by substituting names with expressions they refer to. For example, given these three functions:

(defn add [a b] (+ a b))
(defn square [x] (* x x))
(defn sum-squares [a b] (+ (square a) (square b)))

We can walk through the (sum-squares (add 1 1) (add 1 2)) expression, and see how it will be evaluated in normal order, and how it differs from applicative order:

Normal order:

(sum-squares (add 1 1) (add 1 2))
(sum-squares (+ 1 1) (+ 1 2))
(+ (square (+ 1 1)) (square (+ 1 2)))
(+ (* (+ 1 1) (+ 1 1)) (* (+ 1 2) (+ 1 2)))
(+ (* 2 2) (* 3 3))
(+ 4 9)
13

Applicative order:

(sum-squares (add 1 1) (add 1 2))
(sum-squares (+ 1 1) (+ 1 2))
(sum-squares 2 3)
(+ (square 2) (square 3))
(+ (* 2 2) (* 3 3))
(+ 4 9)
13

Notice that in normal order, (+ 1 1) and (+ 1 2) are each executed twice, because all of the substitution happens before any reduction. In applicative order, reduction happens before substitution, so every expression is computed only once. Applicative order is more in line with the real evaluation rules, but it’s harder to see what’s happening when we compose things, so I’ll use normal order to show how transducers are composed.

With this in mind, let’s try to walk through the following expression:

((comp (map-transducer inc)
       (filter-transducer odd?))
 conj)

First, we need to substitute map-transducer and filter-transducer with their (simplified) implementations:

((comp ((fn [f]
          (fn [rf]
            (fn [result value]
              (rf result (f value))))) inc)
       ((fn [f]
          (fn [rf]
            (fn [result value]
              (if (f value)
                (rf result value)
                result)))) odd?))
 conj)

Next, let’s substitute f with inc and odd? and get rid of function calls, substituting them with anonymous functions that accept rf:

((comp (fn [rf]
         (fn [result value]
           (rf result (inc value))))
       (fn [rf]
         (fn [result value]
           (if (odd? value)
             (rf result value)
             result))))
 conj)

Now, knowing that composition of functions f and g is (fn [x] (f (g x))) we can substitute comp with this expression, and then substitute f and g in this expression with our functions from the previous step:

((fn [x]
   ((fn [rf]
      (fn [result value]
        (rf result (inc value))))
    ((fn [rf]
       (fn [result value]
         (if (odd? value)
           (rf result value)
           result)))
     x)))
 conj)

Note that according to the composition rules, odd? should be executed first, and inc would follow it, but we’re not done composing yet. Let’s substitute conj for x and remove the function call:

((fn [rf]
   (fn [result value]
     (rf result (inc value))))
 ((fn [rf]
    (fn [result value]
      (if (odd? value)
        (rf result value)
        result)))
  conj))

We can now substitute conj for rf in the innermost function call, and remove the innermost function that accepts rf:

((fn [rf]
   (fn [result value]
     (rf result (inc value))))
 (fn [result value]
   (if (odd? value)
     (conj result value)
     result)))

Finally, we can substitute the entire inner function for the outer rf, and remove another call:

(fn [result value]
  ((fn [result value]
     (if (odd? value)
       (conj result value)
       result))
   result (inc value)))

As the last step, we substitute (inc value) for the inner function's value, and result for its result, eventually getting:

(fn [result value]
  (if (odd? (inc value))
    (conj result (inc value))
    result))

This is the final function that will be executed by reduce, and it's just an ordinary two-argument function! As you can see, inc happens before odd?.

So yes, the order in comp may seem inverted, as inc and odd? appear in their logical order, but thanks to how the whole composition process evolves, this logical order is preserved in the resulting function. I hope that with this substitution model you can now understand the whole composition process of transducers, which is not a trivial process by any means. I'm actually amazed by how this simple idea achieves such a complete abstraction, one that can be used in all kinds of transformation contexts.

Speaking of transduceable contexts, let’s implement one!

Implementing sequence transduceable context

Now we’re ready to implement sequence in Clojure, without direct Java interop. First, let’s remember what reduce does:

  • Accepts a function, initial value, and a collection;
  • Gets an element of a collection and passes the initial value and the element to the reducing function;
  • The reducing function returns the current result;
  • The result is then passed to the reducing function alongside the next element from the collection;
  • Once the collection is exhausted, the result is returned.

Thus, reduce can be written as an ordinary loop. However, sequence is lazy, so we can't just loop through the collection; thankfully, lazy sequences can be recursive. There's another problem, too: we must append each element to the sequence we're producing, but we also need to check whether we've actually finished, or whether we need to skip the element because it was filtered out. And we need to call the completion step somewhere.
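For reference, here's roughly what that ordinary loop looks like (a sketch that ignores reduced and other details of the real reduce):

(defn reduce*
  "A minimal loop-based sketch of reduce."
  [f init coll]
  (loop [s (seq coll)
         res init]
    (if s
      (recur (next s) (f res (first s))) ; feed each element to f
      res)))                             ; collection exhausted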

Though if you think about it, we don't need our transducer to actually append elements to a collection; it can merely do the transformation, and since we know what kind of collection we're building, we can build it later. With this approach, we can check the result of the transducer at each step and act accordingly.

First, let’s finalize the transducer with a reducing function:

(defn sequence [xform coll]
  (let [f (xform (completing #(cons %2 %1)))]
    ;; lazy loop?
    ))

It may seem that we're using cons here because we're building a sequence, but that's not the reason. We could actually use anything, like conj, or even a function that returns something we can later distinguish from other values. We do have to reverse the arguments in the cons call, though, because a reducing function takes [result value], while cons prepends its first argument to the sequence given as its second. And completing simply adds a completion step that returns the value it's given; in other words, we could write the whole thing as (fn ([a] a) ([a b] (cons b a))).
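For reference, completing itself is roughly implemented like this in clojure.core:

(defn completing
  "Wraps a reducing function, adding a default completion arity."
  ([f] (completing f identity))
  ([f cf]
   (fn
     ([] (f))        ; init delegates to f
     ([x] (cf x))    ; completion, identity by default
     ([x y] (f x y))))) ; reduction step delegates to f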

Now let’s figure out how to loop through the collection. We can start with an ordinary loop and then convert it to recursion, adding laziness as the last step.

(defn sequence [xform coll]
  (let [f (xform (completing #(cons %2 %1)))
        res (loop [s (seq coll)
                   res ()]
              (if s
                (let [x (f nil (first s))]
                  (if (seq? x)
                    (recur (next s) (concat res x))
                    (recur (next s) res)))
                res))]
    res))

Let’s try it:

user> (sequence (map inc) [1 2 3])
(2 3 4)
user> (sequence (partition-all 2) [1 2 3 4 5])
([1 2] [3 4])
user> (clojure.core/sequence (partition-all 2) [1 2 3 4 5])
([1 2] [3 4] [5])

Seems to work, but we’re missing the completion step (hence no incomplete partition in the result), so let’s add it:

user> (defn sequence [xform coll]
        (let [f (xform (completing #(cons %2 %1)))
              res (loop [s (seq coll)
                         res ()]
                    (if s
                      (let [x (f nil (first s))]
                        (if (seq? x)
                          (recur (next s) (concat res x))
                          (recur (next s) res)))
                      (f res)))]        ; complete
          res))
user> (sequence (partition-all 2) [1 2 3 4 5])
([5] [1 2] [3 4])

Oops! Remember that we're using cons, so we can't really call f with res as the completion step: our reducing function doesn't know how to build the whole collection, only how to transform a single element. Instead, we have to call it with nil as before, and concat the result onto res:

user> (defn sequence [xform coll]
        (let [f (xform (completing #(cons %2 %1)))
              res (loop [s (seq coll)
                         res ()]
                    (if s
                      (let [x (f nil (first s))]
                        (if (seq? x)
                          (recur (next s) (concat res x))
                          (recur (next s) res)))
                      (concat res (f nil))))] ; proper completion
          res))
user> (sequence (partition-all 2) [1 2 3 4 5])
([1 2] [3 4] [5])

Now let’s make it recursive by replacing loop with an anonymous function, and recur with actual recursion:

user> (defn sequence [xform coll]
        (let [f (xform (completing #(cons %2 %1)))
              step (fn step [coll res]
                     (if-some [s (seq coll)]
                       (let [x (f nil (first s))]
                         (if (seq? x)
                           (step (next s) (concat res x))
                           (step (next s) res)))
                       (concat res (f nil))))]
          (step coll ())))
#'user/sequence
user> (sequence (partition-all 2) [1 2 3 4 5])
([1 2] [3 4] [5])
user> (dorun (sequence (map inc) (range 100000)))
Execution error (StackOverflowError) at user/sequence$step (REPL:25).
null

It still seems to work, but overflows with enough elements. Luckily, we can use lazy-seq to eliminate this problem, and actually make our implementation lazy:

user> (defn sequence [xform coll]
        (let [f (xform (completing #(cons %2 %1)))
              step (fn step [coll res]
                     (if-some [s (seq coll)]
                       (let [x (f nil (first s))]
                         (if (seq? x)
                           (lazy-seq (step (rest s) (concat res x)))
                           (lazy-seq (step (rest s) res))))
                       (concat res (f nil))))]
          (step coll ())))
user> (dorun (sequence (map inc) (range 100000)))
Execution error (StackOverflowError) at user/sequence$step (REPL:1).
null

And it still throws the StackOverflowError. Why?

Well, because we’re not really lazy yet. Instead of passing the result to the next iteration of step, as we did with loop, we should take the result of step and concatenate onto it. Let’s reorganize our function:

user> (defn sequence [xform coll]
        (let [f (xform (completing #(cons %2 %1)))]
          ((fn step [coll]
             (if-some [s (seq coll)]
               (let [res (f nil (first s))]
                 (if (seq? res)
                   (concat res (lazy-seq (step (rest s))))
                   (step (rest s))))
               (f nil)))
           coll)))
user> (dorun (sequence (map inc) (range 100000)))
nil

Now it doesn’t overflow. However, it is not yet ready to be used, because reduce, as you may know, can be terminated early with reduced, and some transducers, like take, leverage that to terminate the process. So we need to check for reduced? in our implementation. Not only that, but we have to call the completion step on the value obtained by dereferencing the reduced object; otherwise it will not be added to the resulting sequence properly.
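As a quick refresher on how early termination works:

user> (reduced? (reduced 42))
true
user> @(reduced 42) ; a reduced value is deref-able
42
user> (reduce (fn [acc x] (if (>= acc 6) (reduced acc) (+ acc x))) 0 (range))
6

With that in mind, here's the updated implementation: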

user> (defn sequence [xform coll]
        (let [f (xform (completing #(cons %2 %1)))]
          ((fn step [coll]
             (if-some [s (seq coll)]
               (let [res (f nil (first s))]
                 (cond (reduced? res) (f (deref res)) ; checking for early termination
                       (seq? res) (concat res (lazy-seq (step (rest s))))
                       :else (step (rest s))))
               (f nil)))
           coll)))
#'user/sequence
user> (sequence (comp (partition-all 2) (take 5)) (range))
([0 1] [2 3] [4 5] [6 7] [8 9])
user> (sequence (comp (take 5) (partition-all 2)) (range))
([0 1] [2 3] [4])

And we’re done! Well, almost: sequence should coerce its result to an empty sequence, and our current version will return nil if the transducer never returns anything. That's easy enough to fix:

(defn sequence [xform coll]
  (let [f (xform (completing #(cons %2 %1)))]
    (or ((fn step [coll]
           (if-some [s (seq coll)]
             (let [res (f nil (first s))]
               (cond (reduced? res) (f (deref res))
                     (seq? res) (concat res (lazy-seq (step (rest s))))
                     :else (step (rest s))))
             (f nil)))
         coll)
        ())))
Code Snippet 3: Final version of our sequence transducer.
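And the coercion works: if the transducer filters everything out, we get an empty sequence rather than nil:

user> (sequence (filter odd?) [2 4 6])
()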

We can see that it is lazy by using side-effecting transducers, like these ones with println in them:

user> (sequence
       (comp (map (fn [x] (println "map" x) (inc x)))
             (filter (fn [x] (println "filter" x) (odd? x)))
             (take 3))
       (range))
map 0
filter 1
(1map 1
filter 2
map 2
filter 3
 3map 3
filter 4
map 4
filter 5
 5)
Code Snippet 4: Notice that the output is all messed up, because the sequence started printing before it was realized, and the side effects appeared during the pretty-printing process, which is itself lazy.

And as you can see, our sequence behaves the same way as clojure.core/sequence with regard to laziness:

user> (do (sequence
           (comp (map (fn [x] (println "map" x) (inc x)))
                 (filter (fn [x] (println "filter" x) (odd? x))))
           (range)) ;; note, infinite range
          nil)
map 0
filter 1
nil
user> (do (clojure.core/sequence
           (comp (map (fn [x] (println "map" x) (inc x)))
                 (filter (fn [x] (println "filter" x) (odd? x))))
           (range))
          nil)
map 0
filter 1
nil

I think sequence is now complete, though I'll need to test it much more extensively in the future. I already tested it a lot while porting it to Fennel, and I think it should work correctly; looking at the code, at least, I don't see anything that could go wrong.

What have I learned

Implementing a transduceable context, in this case the sequence function, was a nice puzzle. I'm sure it's not as efficient as the clojure.core version, mainly due to the use of concat, but I wasn't able to come up with a better way of building the result lazily.

And after actually implementing sequence and porting it, along with a lot of transducers, to Fennel, I've finally figured out how they actually work, and I hope that now you understand them too! This is why I love Clojure - the developers put a lot of thought into features like this instead of taking shortcuts, like reimplementing all of the sequence functions for data sources that can't be transformed into sequences. For me, it's a really practical language, with a lot of tools that enhance the programming experience and make it fun.

Of course, if you have any questions, feel free to email me, and I’ll try to answer them, and maybe update the post for future readers. This topic is a bit complicated, so I hope it was not a boring post. Thanks for reading!


  1. I actually like this approach to learning. Usually I don't read how a thing is implemented, and instead try to figure everything out myself. This unfortunately doesn't produce the greatest results, as a lot of the stuff I try to learn this way has had a lot of research put into it, and I'm basically trying to reimplement a thing based only on assumptions and observations. Nevertheless, I'm satisfied with the process; in the end, if I get something working, I feel happy, and even more so when I've got some concepts exactly right. I'm not suggesting that this is a superior way to learn, but it is at least very enjoyable for me personally. ↩︎

Permalink

Clojure: Destructuring

In The Joy of Clojure (TJoC) destructuring is described as a mini-language within Clojure. It's not essential to learn this mini-language; however, as the authors of TJoC point out, destructuring facilitates concise, elegant code.

Making Code More Understandable

One of the scariest things for people just learning to code is figuring out what a seemingly impossible set of rules and structures means for the work they are trying to do. It is not easy at all, and many people struggle with it in big ways.

Permalink

Clojure Deref (Aug 12, 2022)

Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem. (@ClojureDeref RSS)

Libraries and Tools

New releases and tools this week:

  • Clojure CLI 1.11.1.1155

  • city-weather-clj - Small web application which consumes the Open Weather API and makes use of Clojure’s atom construct as cache

  • flow-storm-debugger 2.3.131 - A debugger for Clojure and ClojureScript with some unique features

  • polylith 0.2.15 - A tool used to develop Polylith based architectures in Clojure

  • lsp4clj 1.0.0 - LSP base support for any LSP that is implemented in Clojure

  • eql-cli - A CLI for executing EQL queries on EDN data

  • querqy-clj - Querqy for Clojure

  • Tutkain 0.16.0 - A Sublime Text package for interactive Clojure development

  • biff 0.4.3-beta - A simple and easy web framework for Clojure

  • fixa 0.1.0 - Better test fixtures for clojure

  • flint 0.2.0 - SPARQL DSL library for Clojure(Script)

Permalink

Dogfooding Blambda 4: CLI, CLIier, CLIiest

In yesterday's installment of Dogfooding Blambda, I fully intended to show y'all Blambda's command line interface, then walk you through how I implemented it, using the amazing babashka.cli library. However, me being me, I kinda meandered all over the place, and by the time I had finished the "show" portion of Show and Tell, my dog let me know, kindly but firmly as only a dog can, that the "tell" portion would need to wait, because he couldn't anymore.

But hey, today is a new day, and I have heaps of time, so let's dig in!

After the less than wonderful experience I had using Blambda in my s3-log-parser lambda, I decided that Blambda should be a library that exposed its functionality via an API, similar to how quickblog works. Thinking about what the API should look like, I decided on the following functions:

  • build-runtime-layer: builds the Blambda custom runtime as a Lambda layer
  • deploy-runtime-layer: deploys the custom runtime layer
  • build-deps-layer: builds a layer for lambda dependencies
  • deploy-deps-layer: deploys the dependencies layer
  • clean: sweeps up around the place to give you that peaceful happy feeling that you so crave

I had the code for building and deploying the custom runtime layer in Blambda's bb.edn, as well as the code for building the deps layer, but in my infinite wisdom, the code for actually deploying the deps layer was in s3-log-parser, in the wonderfully named task_helper.clj file.

It was straightforward enough to move the code for building and deploying into a blambda.api namespace, and I had at least had the foresight to write a primitive argument parser in Blambda's own task_helper.clj that turned something like `bb build-runtime-layer --bb-arch arm64 --bb-version 0.9.161` into:

{:bb-arch "arm64"
 :bb-version "0.9.161"}

Since this is also how babashka.cli works, the migration was fairly straightforward. I took my old task-helper/defaults:

{:aws-region {:doc "AWS region"
              :default (or (System/getenv "AWS_DEFAULT_REGION") "eu-west-1")}
 :bb-version {:doc "Babashka version"
              :default "0.9.161"}
 :bb-arch {:doc "Architecture to target"
           :default "amd64"
           :values ["amd64" "arm64"]}
 :deps-path {:doc "Path to bb.edn or deps.edn containing lambda deps"}
 :layer-name {:doc "Name of custom runtime layer in AWS"
              :default "blambda"}
 :target-dir {:doc "Build output directory"
              :default "target"}
 :work-dir {:doc "Working directory"
            :default ".work"}}

and turned them into a babashka.cli spec:

{:aws-region {:desc "AWS region"
              :ref "<region>"
              :default (or (System/getenv "AWS_DEFAULT_REGION") "eu-west-1")}
 :bb-version {:desc "Babashka version"
              :ref "<version>"
              :default "0.9.161"}
 :bb-arch {:desc "Architecture to target"
           :ref "<arch>"
           :default "amd64"
           :values ["amd64" "arm64"]}
 :deps-path {:desc "Path to bb.edn or deps.edn containing lambda dependencies"
             :ref "<path>"}
 :deps-layer-name {:desc "Name of dependencies layer in AWS"
                   :ref "<name>"}
 :runtime-layer-name {:desc "Name of custom runtime layer in AWS"
                      :ref "<name>"
                      :default "blambda"}
 :target-dir {:desc "Build output directory"
              :ref "<dir>"
              :default "target"}
 :work-dir {:desc "Working directory"
            :ref "<dir>"
            :default ".work"}}

With this, I could create a nice CLI for Blambda:

(ns blambda.cli
  (:require [babashka.cli :as cli]))

(def spec
  {
  ;; ...
  })

(defn parse-opts [& _]
  (let [opts (cli/parse-opts *command-line-args* {:spec spec})]
    (if (:help opts)
      (do
        (println (cli/format-opts {:spec spec}))
        (System/exit 0))
      opts)))

This is all it takes to get a lovely usage message:

$ bb -m blambda.cli/parse-opts --help
  --aws-region         <region>  eu-west-1 AWS region
  --bb-version         <version> 0.9.161   Babashka version
  --bb-arch            <arch>    amd64     Architecture to target
  --deps-path          <path>              Path to bb.edn or deps.edn containing lambda dependencies
  --deps-layer-name    <name>              Name of dependencies layer in AWS
  --runtime-layer-name <name>    blambda   Name of custom runtime layer in AWS
  --target-dir         <dir>     target    Build output directory
  --work-dir           <dir>     .work     Working directory

So that's Blambda as a library. Now I can use that in s3-log-parser to manage all of the moving parts in one place. In order to do this, all I need to do is add tasks to s3-log-parser's bb.edn:

{:paths ["."]
 :deps {net.jmglov/blambda
        #_"You use the newest SHA here:"
        {:git/url "https://github.com/jmglov/blambda.git"
         :git/sha "e379410bb6b20bb9cf34acd42cfc65e5f4fd6845"}}
 :tasks
 {:requires ([blambda.api :as blambda])

  build-runtime-layer {:doc "Builds Blambda custom runtime layer"
                       :task (blambda/build-runtime-layer)}

  deploy-runtime-layer {:doc "Deploys Blambda custom runtime layer"
                        :task (blambda/deploy-runtime-layer)}

  build-deps-layer {:doc "Builds dependencies layer"
                    :task (blambda/build-deps-layer)}

  deploy-deps-layer {:doc "Deploys dependencies layer"
                     :task (blambda/deploy-deps-layer)}

  clean {:doc "Deletes target and work directories"
         :task (blambda/clean)}}}

Now I can do things like build my dependencies layer:

$ bb build-deps-layer --deps-path src/bb.edn 
Classpath before transforming: src:~/s3-log-parser/.work/m2-repo/com/cognitect/aws/endpoints/1.1.12.206/endpoints-1.1.12.206.jar:...

Classpath after transforming: src:/opt/m2-repo/com/cognitect/aws/endpoints/1.1.12.206/endpoints-1.1.12.206.jar:...

Compressing custom runtime layer: ~/s3-log-parser/target/deps.zip

It's a little annoying that I have to add one task per library function, and even more annoying that I have to repeat information that's already there in blambda.api:

(defn build-runtime-layer
  "Builds Blambda custom runtime layer"
  ;; ...
  )

(defn deploy-runtime-layer
  "Deploys Blambda custom runtime layer"
  ;; ...
  )

(defn build-deps-layer
  "Builds dependencies layer"
  ;; ...
  )

(defn deploy-deps-layer
  "Deploys dependencies layer"
  ;; ...
  )

(defn clean
  "Deletes target and work directories"
  ;; ...
  )

What I'd like to do is add a single task that delegates all of this stuff to the Blambda CLI. But how to find this holiest of all holy grails?

Of course the mighty borkdude has already thought of this, and babashka.cli has support for subcommands. If I expose each Blambda API function as a subcommand, I can interact with Blambda from s3-log-parser as I showed you yesterday:

$ bb blambda build-runtime-layer --bb-arch arm64
Downloading https://github.com/babashka/babashka/releases/download/v0.9.161/babashka-0.9.161-linux-aarch64-static.tar.gz
Decompressing .work/babashka-0.9.161-linux-aarch64-static.tar.gz to .work
Adding file bootstrap
Adding file bootstrap.clj
Compressing custom runtime layer: ~/my-lambda/target/bb.zip

$ bb blambda deploy-runtime-layer --bb-arch arm64
Publishing layer version for layer blambda
Published layer arn:aws:lambda:eu-west-1:289341159200:layer:blambda:1

And all this by adding a single line to my bb.edn (actually two lines, since I need to require blambda.cli, or rather three lines, since I want to nicely format the map, but you know what I mean):

:tasks
{:requires ([blambda.cli :as blambda])

 blambda {:doc "Controls Blambda runtime and layers"
          :task (blambda/dispatch)}}

Wow, now that's some amazing UX! 🏆

But how does this subcommand stuff work? babashka.cli's documentation gives this example:

(ns example
  (:require [babashka.cli :as cli]))

(defn copy [m]
  (assoc m :fn :copy))

(defn delete [m]
  (assoc m :fn :delete))

(defn help [m]
  (assoc m :fn :help))

(def table
  [{:cmds ["copy"]   :fn copy   :args->opts [:file]}
   {:cmds ["delete"] :fn delete :args->opts [:file]}
   {:cmds []         :fn help}])

(defn -main [& args]
  (cli/dispatch table args {:coerce {:depth :long}}))

So it looks like I need to build a table that looks something like this:

(def table
  [{:cmds ["build-runtime-layer"]  :fn api/build-runtime-layer}
   {:cmds ["deploy-runtime-layer"] :fn api/deploy-runtime-layer}
   {:cmds ["build-deps-layer"]     :fn api/build-deps-layer}
   {:cmds ["deploy-deps-layer"]    :fn api/deploy-deps-layer}
   {:cmds ["clean"]                :fn api/build-runtime-layer}
   {:cmds []                       :fn print-help}])

And according to the docs, I can include other babashka.cli/parse-args options in a table entry, so I'll add :spec spec to each.
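Concretely, the table might end up looking something like this (a sketch; spec is the full option spec from earlier):

(def table
  [{:cmds ["build-runtime-layer"]  :fn api/build-runtime-layer  :spec spec}
   {:cmds ["deploy-runtime-layer"] :fn api/deploy-runtime-layer :spec spec}
   {:cmds ["build-deps-layer"]     :fn api/build-deps-layer     :spec spec}
   {:cmds ["deploy-deps-layer"]    :fn api/deploy-deps-layer    :spec spec}
   {:cmds ["clean"]                :fn api/clean                :spec spec}
   {:cmds []                       :fn print-help}])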

Now I can run bb blambda --help from s3-log-parser to get a nice usage message:

  --aws-region         <region>  eu-west-1 AWS region
  --bb-version         <version> 0.9.161   Babashka version
  --bb-arch            <arch>    amd64     Architecture to target
  --deps-path          <path>              Path to bb.edn or deps.edn containing lambda dependencies
  --deps-layer-name    <name>              Name of dependencies layer in AWS
  --runtime-layer-name <name>    blambda   Name of custom runtime layer in AWS
  --target-dir         <dir>     target    Build output directory
  --work-dir           <dir>     .work     Working directory

Well, nice-ish. If I run this, I have no idea what subcommands bb blambda has, what they do, and which of these options apply to each command. Let's see if we can fix this.

Let's start by enriching our print-help function a bit and modifying our table to use it:

(defn print-help [cmds]
  (println
   (format
    "Usage: bb blambda <subcommand> <options>

Subcommands:

%s"
    (->> cmds
         (map (comp first :cmds))
         (str/join "\n\n")))))

(def table
  (let [cmds
        [{:cmds ["build-runtime-layer"]  :fn api/build-runtime-layer}
         {:cmds ["deploy-runtime-layer"] :fn api/deploy-runtime-layer}
         {:cmds ["build-deps-layer"]     :fn api/build-deps-layer}
         {:cmds ["deploy-deps-layer"]    :fn api/deploy-deps-layer}
         {:cmds ["clean"]                :fn api/build-runtime-layer}]]
    (conj cmds
          {:cmds [] :fn (fn [_] (print-help cmds))})))

Now running bb blambda --help (or bb blambda help or bb blambda, for that matter) produces something a bit nicer:

Usage: bb blambda <subcommand> <options>

Subcommands:

build-runtime-layer

deploy-runtime-layer

build-deps-layer

deploy-deps-layer

clean

Let's add a description to each subcommand and then update print-help:

(def table
  (let [cmds
        [{:cmds ["build-runtime-layer"]
          :desc "Builds Blambda custom runtime layer"
          :fn api/build-runtime-layer}
         {:cmds ["deploy-runtime-layer"]
          :desc "Deploys Blambda custom runtime layer"
          :fn api/deploy-runtime-layer}
         ;; ...
        ]]
    (conj cmds
          {:cmds [] :fn (fn [_] (print-help cmds))})))

(defn print-help [cmds]
  (println
   (format
    "Usage: bb blambda <subcommand> <options>

Subcommands:

%s"
    (->> cmds
         (map (fn [{:keys [cmds desc]}]
                (format "%s: %s" (first cmds) desc)))
         (str/join "\n\n")))))

Now we're getting somewhere!

Usage: bb blambda <subcommand> <options>

Subcommands:

build-runtime-layer: Builds Blambda custom runtime layer

deploy-runtime-layer: Deploys Blambda custom runtime layer

build-deps-layer: Builds dependencies layer from bb.edn or deps.edn

deploy-deps-layer: Deploys dependencies layer

clean: Removes work and target folders

Now it's time to bring back our options. If we look at the options we have, --target-dir and --work-dir apply to every command, --aws-region applies to the two deployment commands, and the rest of the options (--bb-arch, --bb-version, --deps-path, --runtime-layer-name, and --deps-layer-name) apply only to specific commands. Let's see how we can express this in our command table:

(def table
  (let [cmds
        [{:cmds ["build-runtime-layer"]
          :desc "Builds Blambda custom runtime layer"
          :fn api/build-runtime-layer
          :opts #{:bb-version :bb-arch}}
         {:cmds ["build-deps-layer"]
          :desc "Builds dependencies layer from bb.edn or deps.edn"
          :fn api/build-deps-layer
          :opts #{:deps-path}}
         {:cmds ["deploy-runtime-layer"]
          :desc "Deploys Blambda custom runtime layer"
          :fn api/deploy-runtime-layer
          :opts #{:aws-region :bb-arch :runtime-layer-name}}
         {:cmds ["deploy-deps-layer"]
          :desc "Deploys dependencies layer"
          :fn api/deploy-deps-layer
          :opts #{:aws-region :bb-arch :deps-layer-name}}
         {:cmds ["clean"]
          :desc "Removes work and target folders"
          :fn api/clean}]]
    (conj cmds
          {:cmds [], :fn (fn [m] (print-help cmds))})))

This makes sense, but now we need a way to turn a set of opts into a spec. Whilst we're at it, let's add the global options (--target-dir and --work-dir) to every spec:

(def specs
  {:aws-region
   {:desc "AWS region"
    :ref "<region>"
    :default (or (System/getenv "AWS_DEFAULT_REGION") "eu-west-1")}
   ;; ...
  })

(def global-opts #{:target-dir :work-dir})

(defn mk-spec [opts]
  (select-keys specs (set/union global-opts opts)))
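
Assuming the specs map is filled in with the options we saw in the original help output, a quick REPL check would look something like this (a sketch; the exact entries depend on how you've defined specs):

(mk-spec #{:deps-path})
;; => {:deps-path  {:desc "Path to bb.edn or deps.edn containing lambda deps"
;;                  :ref "<path>"}
;;     :target-dir {:desc "Build output directory" :ref "<dir>" :default "target"}
;;     :work-dir   {:desc "Working directory" :ref "<dir>" :default ".work"}}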

Now we can plug mk-spec into our table:

(def table
  (let [cmds
        [{:cmds ["build-runtime-layer"]
          :desc "Builds Blambda custom runtime layer"
          :fn api/build-runtime-layer
          :spec (mk-spec #{:bb-version :bb-arch})}
         {:cmds ["build-deps-layer"]
          :desc "Builds dependencies layer from bb.edn or deps.edn"
          :fn api/build-deps-layer
          :spec (mk-spec #{:deps-path})}
         {:cmds ["deploy-runtime-layer"]
          :desc "Deploys Blambda custom runtime layer"
          :fn api/deploy-runtime-layer
          :spec (mk-spec #{:aws-region :bb-arch :runtime-layer-name})}
         {:cmds ["deploy-deps-layer"]
          :desc "Deploys dependencies layer"
          :fn api/deploy-deps-layer
          :spec (mk-spec #{:aws-region :deps-layer-name})}
         {:cmds ["clean"]
          :desc "Removes work and target folders"
          :fn api/clean}]]
    (conj cmds
          {:cmds [], :fn (fn [_] (print-help cmds))})))

And since we have the spec for each subcommand, we can use that in our help message:

(defn ->subcommand-help [{:keys [cmds desc spec]}]
  (format "%s: %s\n%s" (first cmds) desc
          (cli/format-opts {:spec spec})))

(defn print-help [cmds]
  (println
   (format
    "Usage: bb blambda <subcommand> <options>

All subcommands support the options:

%s

Subcommands:

%s"
    (cli/format-opts {:spec (select-keys specs global-opts)})
    (->> cmds
         (map ->subcommand-help)
         (str/join "\n\n"))))
  (System/exit 0))

Running bb blambda now is very satisfying:

Usage: bb blambda <subcommand> <options>

All subcommands support the options:

  --work-dir   <dir> .work  Working directory
  --target-dir <dir> target Build output directory

Subcommands:

build-runtime-layer: Builds Blambda custom runtime layer
  --bb-arch    <arch>    amd64   Architecture to target (use amd64 if you don't care)
  --work-dir   <dir>     .work   Working directory
  --target-dir <dir>     target  Build output directory
  --bb-version <version> 0.9.161 Babashka version

build-deps-layer: Builds dependencies layer from bb.edn or deps.edn
  --work-dir   <dir>  .work  Working directory
  --deps-path  <path>        Path to bb.edn or deps.edn containing lambda deps
  --target-dir <dir>  target Build output directory

deploy-runtime-layer: Deploys Blambda custom runtime layer
  --bb-arch            <arch>   amd64     Architecture to target (use amd64 if you don't care)
  --runtime-layer-name <name>   blambda   Name of custom runtime layer in AWS
  --work-dir           <dir>    .work     Working directory
  --aws-region         <region> eu-west-1 AWS region
  --target-dir         <dir>    target    Build output directory

deploy-deps-layer: Deploys dependencies layer
  --work-dir        <dir>    .work     Working directory
  --aws-region      <region> eu-west-1 AWS region
  --target-dir      <dir>    target    Build output directory
  --deps-layer-name <name>             Name of dependencies layer in AWS

clean: Removes work and target folders

There's one tiny annoyance, though. The usage message says that all subcommands support --work-dir and --target-dir, but then those options are repeated for every subcommand, which is a bit unnecessary and distracting. What we need to do is suppress the global options in ->subcommand-help:

(defn ->subcommand-help [{:keys [cmds desc spec]}]
  (let [spec (apply dissoc spec global-opts)]
    (format "%s: %s\n%s" (first cmds) desc
            (cli/format-opts {:spec spec}))))

dissoc is normally used to remove one key from a map:

(dissoc {:a 1, :b 2, :c 3} :a)  ;; => {:b 2, :c 3}

but you can also give it more keys:

(dissoc {:a 1, :b 2, :c 3} :a :b)  ;; => {:c 3}

We have a set, global-opts, which is seqable, so we can use apply to splat it onto the end of the list of arguments to dissoc:

(let [spec (apply dissoc spec global-opts)]
  ;; Now spec has all of the opts except the global ones
  )
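
Putting the two together:

(apply dissoc {:a 1, :b 2, :c 3} #{:a :b})  ;; => {:c 3}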

Let's see what we've accomplished:

$ bb blambda
Usage: bb blambda <subcommand> <options>

All subcommands support the options:

  --work-dir   <dir> .work  Working directory
  --target-dir <dir> target Build output directory

Subcommands:

build-runtime-layer: Builds Blambda custom runtime layer
  --bb-arch    <arch>    amd64   Architecture to target (use amd64 if you don't care)
  --bb-version <version> 0.9.161 Babashka version

build-deps-layer: Builds dependencies layer from bb.edn or deps.edn
  --deps-path <path> Path to bb.edn or deps.edn containing lambda deps

deploy-runtime-layer: Deploys Blambda custom runtime layer
  --bb-arch            <arch>   amd64     Architecture to target (use amd64 if you don't care)
  --runtime-layer-name <name>   blambda   Name of custom runtime layer in AWS
  --aws-region         <region> eu-west-1 AWS region

deploy-deps-layer: Deploys dependencies layer
  --aws-region      <region> eu-west-1 AWS region
  --deps-layer-name <name>             Name of dependencies layer in AWS

clean: Removes work and target folders

Excellent!

We're still missing one thing that I showed off yesterday, though: the ability for a client to override Blambda's defaults. In the case of s3-log-parser, I want to make sure I'm building Blambda for the ARM64 architecture, and set my deps path and deps layer name so that I don't have to remember to type the args every time.

Let's start out by wishing the feature into existence in s3-log-parser's bb.edn:

:tasks
{:requires ([blambda.cli :as blambda])

 blambda {:doc "Controls Blambda runtime and layers"
          :task (blambda/dispatch
                 {:bb-arch "arm64"
                  :deps-path "src/bb.edn"
                  :deps-layer-name "s3-log-parser-deps"})}}

So we want the client to be able to pass defaults to blambda.cli/dispatch. Let's make it happen:

(defn dispatch
  ([]
   (dispatch {}))
  ([default-opts & args]
   (cli/dispatch (mk-table default-opts)
                 (or args
                     (seq *command-line-args*)))))

Because we're good Clojurists, we maintain backwards compatibility by keeping the 0-arity version of dispatch and just having it send an empty default-opts map into the new variadic version.
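
To illustrate, all of these invocations now work (hypothetical arguments, just for show):

(dispatch)                                    ;; parse *command-line-args*, built-in defaults
(dispatch {:bb-arch "arm64"})                 ;; parse *command-line-args*, client defaults
(dispatch {} "clean" "--work-dir" "/tmp/w")   ;; explicit args, e.g. from a test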

We're not going to be able to keep our static version of table anymore either, so we'll wrap it in a function called mk-table that incorporates our defaults:

(defn mk-table [default-opts]
  (let [cmds
        [{:cmds ["build-runtime-layer"]
          :desc "Builds Blambda custom runtime layer"
          :fn api/build-runtime-layer
          :spec (mk-spec default-opts #{:bb-version :bb-arch})}
         ;; ...
         ]]
    (conj cmds
          {:cmds [], :fn (fn [_] (print-help default-opts cmds))})))

We need to pass the default-opts along to mk-spec and print-help as well:

(defn mk-spec [default-opts opts]
  (->> (select-keys specs (set/union global-opts opts))
       (apply-defaults default-opts)))

(defn ->subcommand-help [default-opts {:keys [cmds desc spec]}]
  (let [spec (apply dissoc spec global-opts)]
    (format "%s: %s\n%s" (first cmds) desc
            (cli/format-opts {:spec
                              (apply-defaults default-opts spec)}))))

(defn print-help [default-opts cmds]
  (println
   (format
    "Usage: bb blambda <subcommand> <options> ..."
    (cli/format-opts {:spec (select-keys specs global-opts)})
    (->> cmds
         (map (partial ->subcommand-help default-opts))
         (str/join "\n\n"))))
  (System/exit 0))

And finally, let's look at this mysterious new apply-defaults function:

(defn apply-defaults [default-opts spec]
  (->> spec
       (map (fn [[k v]]
              (if-let [default-val (default-opts k)]
                [k (assoc v :default default-val)]
                [k v])))
       (into {})))
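
In other words, a client-supplied default wins over the one in specs. A REPL sketch (spec entries abbreviated):

(apply-defaults {:bb-arch "arm64"}
                {:bb-arch    {:desc "Architecture to target" :default "amd64"}
                 :bb-version {:desc "Babashka version" :default "0.9.161"}})
;; => {:bb-arch    {:desc "Architecture to target" :default "arm64"}
;;     :bb-version {:desc "Babashka version" :default "0.9.161"}}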

If we run bb blambda help now, we can see the effects of our changes:

Usage: bb blambda <subcommand> <options>
[...]
Subcommands:

build-runtime-layer: Builds Blambda custom runtime layer
  --bb-arch    <arch>    arm64   Architecture to target (use amd64 if you don't care)
  --bb-version <version> 0.9.161 Babashka version

build-deps-layer: Builds dependencies layer from bb.edn or deps.edn
  --deps-path <path> src/bb.edn Path to bb.edn or deps.edn containing lambda deps

deploy-deps-layer: Deploys dependencies layer
  --aws-region      <region> eu-west-1          AWS region
  --deps-layer-name <name>   s3-log-parser-deps Name of dependencies layer in AWS

Note that --bb-arch now defaults to arm64, and --deps-path and --deps-layer-name now have default values, which they didn't before! 🎉

I realise I'm quite a few words into this post now, but I do want to add one more tiny feature. With a subcommand setup, you expect `bb blambda build-runtime-layer --help` to give you help on the `build-runtime-layer` subcommand, but at the moment, our code just ignores the --help flag and calls the api/build-runtime-layer function, which is definitely not what we want.

In order to support this, let's wrap api/build-runtime-layer in a function that checks for --help and does the right thing:

(defn mk-table [default-opts]
  (let [runtime-spec (mk-spec default-opts #{:bb-version :bb-arch})
        cmds
        [{:cmds ["build-runtime-layer"]
          :desc "Builds Blambda custom runtime layer"
          :fn (fn [{:keys [opts]}]
                (when (:help opts)
                  (print-command-help "build-runtime-layer" runtime-spec)
                  (System/exit 0))
                (api/build-runtime-layer opts))
          :spec runtime-spec}
         ;; ...
         ]]
    (conj cmds
          {:cmds [], :fn (fn [_] (print-help default-opts cmds))})))

We can define print-command-help as follows:

(defn print-command-help [cmd spec]
  (println
   (format "Usage: bb blambda %s <options>\n\nOptions:\n%s"
           cmd (cli/format-opts {:spec spec}))))

Now things work as expected:

$ bb blambda build-runtime-layer --help
Usage: bb blambda build-runtime-layer <options>

Options:
  --bb-arch    <arch>    arm64   Architecture to target (use amd64 if you don't care)
  --work-dir   <dir>     .work   Working directory
  --target-dir <dir>     target  Build output directory
  --bb-version <version> 0.9.161 Babashka version

Of course, now we're in the somewhat unpleasant situation of having to copy and paste our wrapper for all of the other subcommands, so let's create a function to do this for us:

(defn mk-cmd [default-opts {:keys [cmd spec] :as cmd-opts}]
  (merge
   cmd-opts
   {:cmds [cmd]
    :fn (fn [{:keys [opts]}]
          (when (:help opts)
            (print-command-help cmd spec)
            (System/exit 0))
          ((:fn cmd-opts) opts))}))
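
merge keeps everything from cmd-opts (:cmd, :desc, :spec) and adds the :cmds vector and wrapping :fn that cli/dispatch expects, so a resulting entry looks roughly like this:

{:cmd  "clean"
 :desc "Removes work and target folders"
 :spec {...}
 :cmds ["clean"]
 :fn   #function[...]}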

Now we can use this function in mk-table:

(defn mk-table [default-opts]
  (let [cmds
        (mapv (partial mk-cmd default-opts)
              [{:cmd "build-runtime-layer"
                :desc "Builds Blambda custom runtime layer"
                :fn api/build-runtime-layer
                :spec (mk-spec default-opts #{:bb-version :bb-arch})}
               ;; ...
               ])]
    (conj cmds
          {:cmds [], :fn (fn [_] (print-help default-opts cmds))})))

So --help now works for all subcommands. And since we now have mk-cmd wrapping our subcommand function for us, let's also ensure that all options are set (we have no optional options here):

(defn mk-cmd [default-opts {:keys [cmd spec] :as cmd-opts}]
  (merge
   cmd-opts
   {:cmds [cmd]
    :fn (fn [{:keys [opts]}]
          (let [missing-args (->> (set (keys opts))
                                  (set/difference (set (keys spec)))
                                  (map #(format "--%s" (name %)))
                                  (str/join ", "))]
            (when (:help opts)
              (print-command-help cmd spec)
              (System/exit 0))
            (when-not (empty? missing-args)
              (error {:cmd cmd, :spec spec}
                     (format "Missing required arguments: %s" missing-args)))
            ((:fn cmd-opts) opts)))}))
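
Note the thread-last here: (set (keys opts)) ends up as the second argument to set/difference, so we're subtracting the options we actually received from the options the spec expects:

(set/difference #{:deps-path :work-dir :target-dir}  ;; spec keys
                #{:work-dir :target-dir})            ;; parsed opts
;; => #{:deps-path}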

The error function just formats a nice error message and exits:

(defn error [{:keys [cmd spec]} msg]
  (println (format "%s\n" msg))
  (print-command-help cmd spec)
  (System/exit 1))

We can test this by commenting out the :deps-path key-value pair in s3-log-parser's bb.edn:

:tasks
{:requires ([blambda.cli :as blambda])

 blambda {:doc "Controls Blambda runtime and layers"
          :task (blambda/dispatch
                 {:bb-arch "arm64"
;;                  :deps-path "src/bb.edn"
                  :deps-layer-name "s3-log-parser-deps"})}}

and then not mentioning --deps-path on the command line:

$ bb blambda build-deps-layer
Missing required arguments: --deps-path

Usage: bb blambda build-deps-layer <options>

Options:
  --work-dir   <dir>  .work  Working directory
  --deps-path  <path>        Path to bb.edn or deps.edn containing lambda deps
  --target-dir <dir>  target Build output directory

With that, we have achieved a great victory and can now move on to another activity (in my case, sleeping).

Permalink

The REPL is Not Enough

By Alys Brooks

The usual pitch for Clojure typically has a couple of ingredients, and a key one is the REPL. Unfortunately, it’s not always clear what ‘REPL’ means. Sometimes the would-be Clojure advocate at least breaks down what the letters mean—R for read, E for eval, P for print, and L for loop—and how the different stages connect to Clojure’s other Lispy traits. However, even a thorough understanding of the mechanics of the REPL fails to capture what we’re usually getting at: the promise of interactive development.

To explore why this is great, let’s build it up from (mostly) older and more familiar languages. If you’re new to Clojure, this eases you in. If Clojure’s one of your first languages, hopefully it gives you some appreciation of where it came from.

Casino REPL: Interactive code execution

The first level is being able to interactively enter code and run it at all. You might be surprised to learn that (technically) Java has a REPL. Similar tools exist for other static languages, like C and C#.

These can be handy for figuring out an obscure or forgotten bit of syntax or playing around with an API.

These REPLs are typically afterthoughts and have limitations on what you can define. In jshell, for example, you can define static methods but no classes.

License to Eval: Full Access to the Language

The next step is basically no-compromises eval/execute—all the constructs of the language are available. Most dynamic languages offer this, as do static functional languages like Haskell and OCaml. Shells, including bash and PowerShell, also have this level of capability.

These languages are generally high-level and may even resemble the pseudocode you wrote or referenced, so using these REPLs to try out ideas and do some quick testing can be a fluent experience. After all, shells were the main interface to computers for several decades, and have remained in the toolbox of power users, system administrators, developers, and quartermasters ever since.

Still, you run into some disadvantages:

  • These REPLs start with a blank slate, but most development happens in the context of an existing program, often a very large one. You have to use imports to bring in the relevant code, and that may take quite a bit of typing.
  • What you write in the REPL is often ephemeral. Ephemeral in the quality sense is actually okay—writing one (or two, or three) versions in the REPL to throw away isn’t bad. But it’s also ephemeral in a more literal sense. Once the REPL ends, your code is either lost or not in a convenient format. Recent history is typically just a few up arrow presses away, but to find earlier code you have to either search or wade through typos, uses of doc and other inspection, and design dead ends.

From Devtools With Love: Adding Context

Going beyond a sequential experience of entering code and seeing the result takes us another step toward understanding what our code is doing. Actually, it’s really two steps:

  1. Going beyond text to include graphs, images, animations, and widgets.
  2. Showing the current state at all times.

Web developers and data scientists have taken the lead here. Every major desktop browser has a suite of tools for inspecting not only the JavaScript code but also the interface (the DOM) it generates or alters. Similarly, RStudio and Spyder are data science-oriented IDEs that keep a running process and allow you to see the values of any currently defined variables.

Some supercharged REPLs and REPL sidekicks exist for Clojure:

  • Dirac tweaks Chrome’s DevTools to accommodate ClojureScript
  • Reveal adds the ability to explore and visualize data structures returned by the forms you’re evaluating.
  • Portal, inspired by Reveal, similarly lets you explore a variety of data types.

Along similar lines, re-frame applications can use re-frame-10x, which allows for stepping back and forward to see the state of the application.

Notebooks are another way of moving past the textual paradigm. Notebooks let you have inline diagrams, images, graphs, and even widgets. They also allow you to embed explanatory text and diagrams—the literate programming promised all the way back in the 1980s. Some notebooks add a variable inspector and debugger, blurring the line between IDE and notebook. Clerk brings these to Clojure.

The Clojurian With the Golden Form: Out of the Textual REPL

Clojure’s base REPL already has the strengths of the dynamic, expressive languages mentioned in License to Eval (especially if you add tools and quality-of-life libraries from the previous section). However, many Clojure developers find they are most productive if they can evaluate code as they edit source code.

This is often done through advanced REPLs like nREPL, pREPL, and their alternatives. (Rich Hickey has argued that “REPL” is a misnomer, at least in nREPL’s case.)

Fully realized, Clojure forms and values become a lingua franca allowing you to control, inspect, and redefine a variety of systems, as you send code from your editor, terminal, or notebook to a REPL, a local instance of your program, a browser, a node backend, or even a production instance. Unfortunately, most of these require some setup. In particular, getting a ClojureScript REPL is a multistage process, much like modern rockets, and prone to failure, much like early rockets.

These advantages transcend the basic command-line evaluation that “REPL” often suggests, so listing the REPL among Clojure’s advantages actually undersells the feature if you don’t explain what it can actually do.

Sessions are Forever: Common Lisp

Clojure’s interactive development is not the apex, however. Common Lisp went even further by persisting state between sessions and letting you examine that state.

Perhaps the most noticeable feature is that these environments save where you left off. This makes it easier to build up your program over time, at the cost of some state ambiguity. If you wrote a function process-input, renamed it to canonicalize-user-commands, fixed a bug, and refactored it, process-input would still be hanging around, with subtle differences. Arguably, working from an editor is a better fit for making changes to long-running systems or collaborating with other programmers, but being able to persist sessions would be nice for smaller programs or experimentation. In addition to Common Lisp, Smalltalk and R also remember where you left off.

Common Lisp has another superpower: conditions and restarts. When your program fails, it’s paused at the moment everything went wrong, allowing you to try to recover, explore what went wrong, or even redefine things and resume execution like nothing happened.

In Clojure and most other languages, an error, exception, or panic basically shuts everything down. You can see the stack at the moment of failure, but you can’t interact with it. Rich Hickey was inspired by Common Lisp, and the lack of a condition system is not because he thought conditions weren’t valuable. As he explains in A History of Clojure,

I experimented a bit with emulating Common Lisp’s conditions system, but I was swimming upstream.

Some Clojurians have decided to try swimming upstream, but since these libraries aren’t widely used, you’ll have to think carefully about how they’ll interact with libraries that rely on Clojure’s native exceptions.

Conclusion

Common Lisp being the endpoint of our journey puts the lie to my blog post’s structure. Like most stories of only-increasing progress, this one isn’t completely true. Interactive development hasn’t simply gotten better and better over time. We’ve lost ground in some areas even as we’ve gained ground in others.

Still, we’re in a good place with Clojure. As the recent introduction of Clerk demonstrates, there’s still interest in improving the interactive development experience in Clojure.

Appendix: All the James Bond-Clojure Puns I Could, Regrettably, Not Fit in this Post

  • Dr. Nil
  • Dyn Another Day
  • Live and let Die
  • From nREPL with Love
  • The spy Who Loved Me
  • You Only defonce
  • MoonREPL
Permalink

    Becoming friends with Clojure protocols

    I’ve been programming Clojure for several years, and yet I’ve managed to avoid protocols during all that time (I’ve also avoided macros, but that is another story). I found myself always having a colleague do the “dirty work”, or making some sad excuse as to why it wasn’t necessary right now. No more… this week I got my hands dirty.

    For me, Clojure protocols solve the same problem that I previously used interfaces in Java and PHP for: Dependency Injection (DI) and Inversion of Control (IoC). This kind of abstraction probably has several purposes, but I use it to be able to reason about a “service” without knowledge of its implementation.

    Having your services “hidden” behind a protocol makes it very pleasant to test functions that would normally require external access causing side effects (like API endpoints, databases and queues). It also ties in well with application state management libraries like Mount and Component, when you need a “stand-in” for one of these external resources, e.g. for some manual testing in the REPL.

    As soon as I dived into the example about protocols found on the Clojure website, I found that it was too superficial for someone like me. I’ve never approached programming very academically. For some unknown reason, most things with fancy words (polymorphism included) just refuse to stick to the inside of my skull until I see and feel them in action. My pleading for help was heard on the Clojurians Slack, and after I understood (a bit more), I decided to create a more elaborate example that maybe others would find useful.

    The protocol (interface)

    For a more realistic example than the one on the Clojure website, imagine some entity in a database with CRUD operations (Create, Read, Update & Delete):

    (defprotocol EntityStore
      (create [this id] [this id initial-data])
      (fetch [this id])
      (save [this id data])
      (delete [this id]))
    

    For the Read operation I chose to use a function named fetch (over get and read), and for the Update operation I use save (over update and replace). I think both fetch and save clearly describe the intention of the operation without conflicting with existing function names in clojure.core. The overlapping names could otherwise confuse developers, and the choice also avoids linting warnings like … already refers to ….

    Adding doc-strings prior to implementation forces you to evaluate the exact needs of your protocol in order to articulate them. I found errors in my design on several occasions while doing this:

    (defprotocol EntityStore
      "All operations to the store are atomic (e.g. a DB implementation
       would use transactions or something similar)."
      (create [this id] [this id initial-data]
        "Creates a new entity in the store, and returns a map representing
         the new entity.")
      (fetch [this id]
        "Fetches (reads) an entity from the store or returns nil if it
         doesn't exist.")
      (save [this id data]
        "Saves (updates) an entity with the id `id` overwriting its data,
         returns a map representing the updated entity.")
      (delete [this id]
        "Deletes an entity with the id `id` from the store and returns
         nil."))
    

    I decided to put the protocol definition in the namespace my-app.service.entity-store, because it would allow me to use it in the code like so:

    (ns my-app.core
      (:require [my-app.service.entity-store :as entity-store-service]
                ...))
    
    ...
    (let [entity-a (entity-store-service/fetch entity-store "id-for-A")
      ...
    

    The service part of the NS emphasizes that implementation details are “hidden away” on purpose, and I think entity-store-service/fetch reads very well in the code.

    Not having the protocol definition in the same namespace as the one where it is used tricked me at first, and caused the error: Unable to resolve symbol: <symbol name> in this context. It took me a while to figure out that functions defined using defprotocol “live” in the namespace where the protocol itself is defined.
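
    To make the gotcha concrete, here is a minimal sketch (a hypothetical REPL session, not from my actual code) of the failure and the fix:

    (require '[my-app.service.entity-store :as entity-store-service])
    
    ;; Fails with "Unable to resolve symbol: fetch in this context",
    ;; because fetch lives in my-app.service.entity-store:
    (fetch entity-store "id-for-A")
    
    ;; Works, via the alias for the protocol's namespace:
    (entity-store-service/fetch entity-store "id-for-A")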

    The (mock) implementation

    I’m going to start a bit backwards with a mock of the entity store, because it is simpler in the sense that it does not require any third-party libraries to implement.

    (ns my-app.service.impl.in-memory-entity-store
      (:require [my-app.service.entity-store :as entity-store-service]))
    
    (defn create
      ([store-atom id]
       (create store-atom id {}))
      ([store-atom id data]
       (swap! store-atom assoc id data)))
    
    (defn fetch
      [store-atom id]
      (get @store-atom id))
    
    (defn save
      [store-atom id data]
      (swap! store-atom assoc id data))
    
    (defn delete
      [store-atom id]
      (swap! store-atom dissoc id))
    
    (deftype InMemoryEntityStore [store-atom]
      entity-store-service/EntityStore
      (create [_this id] (create store-atom id))
      (create [_this id data] (create store-atom id data))
      (fetch [_this id] (fetch store-atom id))
      (save [_this id data] (save store-atom id data))
      (delete [_this id] (delete store-atom id)))
    

    A classic mistake to make at this point is to remove either create or save from the protocol, since the implementations are identical. But they are only identical (for now) because this mock is a very naive implementation. Also remember: the protocol should never know about the implementation details of the exposed functionality.

    For convenience

    Consider adding an extra convenience function in the “implementation” namespace (in above example: my-app.service.impl.in-memory-entity-store). Such a function allows you to avoid importing the class that deftype creates, which would otherwise require your code to look something like:

    (ns my-app.core
      (:require [my-app.service.impl.in-memory-entity-store])
      ;; NB: class imports use the munged (underscore) package name
      (:import [my_app.service.impl.in_memory_entity_store InMemoryEntityStore]))
    
    ...
    
    (InMemoryEntityStore. (atom {}))
    

    Instead, add a function like new-store:

    (ns my-app.service.impl.in-memory-entity-store
      ...
    
    (defn new-store
      "Convenience function for creating an in memory entity store."
      [store-atom]
      (InMemoryEntityStore. store-atom))
    

    Which would allow something like:

    (ns my-app.core
      (:require [my-app.service.impl.in-memory-entity-store :as in-memory-entity-store]))
    
    ...
    
    (in-memory-entity-store/new-store (atom {}))
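
    ;; A quick REPL round-trip (a sketch, not from the original example),
    ;; assuming the protocol ns is aliased as entity-store-service:
    (def store (in-memory-entity-store/new-store (atom {})))
    
    (entity-store-service/create store "id-for-A" {:name "Ada"})
    (entity-store-service/fetch store "id-for-A")   ;; => {:name "Ada"}
    (entity-store-service/delete store "id-for-A")
    (entity-store-service/fetch store "id-for-A")   ;; => nil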
    

    Real implementation

    The following NoSQL implementation, using Monger (a Clojure client for MongoDB), is also very naive: 😅

    (ns my-app.service.impl.mongo-entity-store
      (:require [monger.collection :as mongo-document]
                [monger.core :as mongo]
                [my-app.service.entity-store :as entity-store-service]))
    
    (def coll
      "Collection in which entities are stored in MongoDB."
      "entities")
    
    (defn create
      ([db oid]
       (create db oid {}))
      ([db oid data]
       (mongo-document/insert-and-return db coll (assoc data :_id oid))))
    
    (defn fetch
      [db oid]
      (mongo-document/find-map-by-id db coll oid))
    
    (defn save
      [db oid data]
      (mongo-document/update-by-id db coll oid data))
    
    (defn delete
      [db oid]
      (mongo-document/remove-by-id db coll oid))
    
    (deftype MongoEntityStore [db]
      entity-store-service/EntityStore
      (create [_this id] (create db id))
      (create [_this id data] (create db id data))
      (fetch [_this id] (fetch db id))
      (save [_this id data] (save db id data))
      (delete [_this id] (delete db id)))
    
    (defn new-store
      "Convenience function for creating a NoSQL entity store."
      [uri]
      (let [{:keys [db]} (mongo/connect-via-uri uri)]
        (MongoEntityStore. db)))
    

    On the surface, the above solution looks fine and dandy, but it has (at least) one flaw. It requires that the id given through the protocol is a BSON ObjectId (a MongoDB-specific Java object). Though the in-memory implementation using an atom would not complain about using ObjectId as lookup keys, it is often preferable to avoid bleeding DB specifics outside the protocol. The following three functions (hexify, pad & s->oid) are a somewhat hacky attempt to work around it and use strings instead (here be dragons 🔥🐉):

    (ns my-app.service.impl.mongo-entity-store
      (:require [monger.collection :as mongo-document]
                [monger.core :as mongo]
                [my-app.service.entity-store :as entity-store-service])
      (:import [org.bson.types ObjectId]))
    
    ; Shamelessly copied from https://stackoverflow.com/questions/10062967/clojures-equivalent-to-pythons-encodehex-and-decodehex
    (defn hexify
      "Convert byte sequence to hex string"
      [coll]
      (let [hex [\0 \1 \2 \3 \4 \5 \6 \7 \8 \9 \a \b \c \d \e \f]]
        (letfn [(hexify-byte [b]
                  (let [v (bit-and b 0xFF)]
                    [(hex (bit-shift-right v 4)) (hex (bit-and v 0x0F))]))]
          (apply str (mapcat hexify-byte coll)))))
    
    ;; Strongly inspired by https://stackoverflow.com/questions/27262268/idiom-for-padding-sequences
    (defn pad
      [n val coll]
      (take n (concat coll (repeat val))))
    
    (defn s->oid
      [^String s]
      (->> (.getBytes s)
           (pad 12 0xFF)
           (hexify)
           (ObjectId.)))
    
    (def coll
      "Collection in MongoDB in which entities are stored."
      "entities")
    
    (defn create
      ([db id]
       (create db id {}))
      ([db id data]
       (mongo-document/insert-and-return db coll (assoc data :_id (s->oid id)))))
    
    (defn fetch
      [db id]
      (mongo-document/find-map-by-id db coll (s->oid id)))
    
    (defn save
      [db id data]
      (mongo-document/update-by-id db coll (s->oid id) data))
    
    (defn delete
      [db id]
      (mongo-document/remove-by-id db coll (s->oid id)))
    
    (deftype MongoEntityStore [db]
      entity-store-service/EntityStore
      (create [_this id] (create db id))
      (create [_this id data] (create db id data))
      (fetch [_this id] (fetch db id))
      (save [_this id data] (save db id data))
      (delete [_this id] (delete db id)))
    
    (defn new-store
      "Convenience function for creating a NoSQL entity store."
      [uri]
      (let [{:keys [db]} (mongo/connect-via-uri uri)]
        (MongoEntityStore. db)))
    

    The above solution has the following advantages:

    • Mongo-specific implementation details (the ObjectId class) are entirely hidden behind the protocol (almost - I’ll get back to this).
    • There is no need to add extra indexes on the collection in the Mongo database, which using an alternative id field would have strongly encouraged.
    • The CRUD functions are all simple, because they can leverage the ...-by-id functions in the Clojure Mongo driver (Monger).

    There is still a bit of Mongo hiding in the shadows, though, because the id must be a string, and only the first 12 bytes are considered when magically generating the ObjectId behind the scenes. Also, not being able to easily correlate the id "my-juicy-idA" with ObjectId("6d792d6a756963792d696441") is a bit of a bummer.
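
    For instance, since pad both pads and truncates to 12 bytes, any two ids sharing their first 12 bytes silently collide (a quick REPL sketch; the ids are made up):

    (= (s->oid "a-very-long-id-1")
       (s->oid "a-very-long-id-2"))
    ;; => true - both become the ObjectId for "a-very-long-"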

    It might be possible to use UUIDs encapsulated in Mongo BSON Binary, but that is outside the scope of this post.

    The business logic

    Leaving all the exciting challenges with Mongo behind and moving on…

    An “entity store service” is now available, which the business logic can leverage, oblivious to its implementation.

    Consider the following code describing some super important business logic:

    (ns my-app.core
      (:require [my-app.service.entity-store :as entity-store-service]
                [my-app.service.impl.mongo-entity-store :as mongo-entity-store]))
    
    (def entity-store
      (mongo-entity-store/new-store "mongodb://admin:secret@172.21.0.2/customer1"))
    
    (defn apply-business-logic
      [{:keys [entity-id] :as _event}]
      (when-let [entity (entity-store-service/fetch entity-store entity-id)]
        (if-not (= (:name entity) "Donald Duck")
          entity
          (do ; Someone has been testing (again) - clean up
            (entity-store-service/delete entity-store entity-id)
            nil))))
    

    The code in apply-business-logic doesn’t care whether entity-store is of the type MongoEntityStore or InMemoryEntityStore. This is very useful for testing, among other things.

    Tests (using the mock)

    Notice how the following test allows testing apply-business-logic without having a database available during testing, preparing test data in the database, or cleaning it up afterwards.

    (ns my-app.core-test
      (:require [clojure.test :refer [deftest is testing]]
                [my-app.core :as sut] ; System Under Testing
                [my-app.service.impl.in-memory-entity-store :as in-memory-entity-store]))
    
    (deftest apply-business-logic
      (testing "Normal entity"
        (with-redefs [my-app.core/entity-store
                      (in-memory-entity-store/new-store
                        (atom {"123" {:name "John Doe"}}))]
          (is (= {:name "John Doe"} (sut/apply-business-logic {:entity-id "123"})))))
      (testing "Bad entity"
        (let [store-atom (atom {"123" {:name "Donald Duck"}})]
          (with-redefs [my-app.core/entity-store
                        (in-memory-entity-store/new-store store-atom)]
            (is (contains? @store-atom "123"))
            (is (nil? (sut/apply-business-logic {:entity-id "123"})))
            (is (not (contains? @store-atom "123"))))))
      (testing "Unknown entity"
        (with-redefs [my-app.core/entity-store
                      (in-memory-entity-store/new-store (atom {}))]
          (is (nil? (sut/apply-business-logic {:entity-id "non-existing"}))))))
    

    The above code can be found on GitHub.

    This post is getting long… so before I make even the most stubborn and enduring readers tired, I will stop with:

    Protocols are your friend (that maybe you just need to get to know). 💜

    Permalink

    Copyright © 2009, Planet Clojure. No rights reserved.
    Planet Clojure is maintained by Baishampayan Ghose.
    Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
    Theme by Brajeshwar.