Paths, Paths, Paths …

I started writing this post a few weeks ago, but it never felt finished. There is still so much more to cover. I might revise it when I find some more time, but I hope it is of some use already.

Classpath, :source-paths, :paths, :asset-path, …

There are a lot of different “paths” you’ll encounter when working on a Clojure(Script) project and their meaning can be confusing. Beginners especially often seem to struggle with getting the project set up correctly and understanding what all these paths mean. I hope to clear up some of the confusion around the most common setups you’ll see when working with CLJS.


The Classpath is the big one and probably the most confusing if you are not coming from a JVM background. It is the basis for all CLJ(S) projects and understanding how it works is crucial.

The Classpath is how content addressing works in the JVM: it tells the system how to find resources. A resource is just a file, but to make things a little more distinct I’ll be referring to “files on the classpath” as resources. Clojure and ClojureScript both use this mechanism when translating namespaces to resources and ultimately finding the actual files.

The Classpath is a “virtual filesystem” that combines multiple entries to make them look like one. Each entry is either a directory or a .jar file. A .jar file is basically a zip file, so just imagine it as a few files packed into a zip archive with their pathnames intact.

Clojure(Script) uses a simple namespacing mechanism which is mostly controlled via the ns form. Let us dissect a very simple form of the kind you’ll often see in CLJ(S) code.

  (ns my.awesome.app
    (:require [clojure.string :as str]))

First we specified the namespace name as my.awesome.app and then we required the clojure.string namespace. When we talk about names we need to translate those to a resource name and the rules for this are very simple. Replace . with a / and then append .cljs, .cljc or .clj depending on what you are looking for. There is also a rule for replacing - with _ but that is about it.

So we first translate my.awesome.app to my/awesome/app.cljs which would be its resource name. clojure.string we translate to clojure/string.cljs.
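
The translation rules above can be sketched as a small function. This is only an illustration of the naming rules, not the code the compiler actually uses; ns->resource and the my.cool-lib.core namespace are hypothetical names made up for this example.

```clojure
(require '[clojure.string :as str])

(defn ns->resource
  "Translates a namespace symbol into a classpath resource name.
  Hypothetical helper, only meant to illustrate the rules."
  [ns-sym ext]
  (-> (name ns-sym)
      (str/replace "-" "_")   ; - becomes _
      (str/replace "." "/")   ; . becomes /
      (str "." ext)))         ; append the extension

(ns->resource 'my.awesome.app "cljs")   ;; => "my/awesome/app.cljs"
(ns->resource 'clojure.string "cljs")   ;; => "clojure/string.cljs"
(ns->resource 'my.cool-lib.core "cljc") ;; => "my/cool_lib/core.cljc"
```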

The next step is translating this resource name to an actual filename on disk, which is where the classpath comes into play. The classpath is constructed when the process you are working with is started. Regardless of whether you use shadow-cljs.edn, deps.edn or project.clj, they all first construct the classpath based on your configuration.

:source-paths and :dependencies are the two options that control this in shadow-cljs.edn. project.clj via lein has a few more such as :test-paths, :resource-paths, etc. deps.edn just has :paths and :extra-paths and uses :deps to configure dependencies.

So say we have configured this shadow-cljs.edn (the same applies to all the others, I’m just using this as a simple example):

 {:source-paths
  ["src/dev"
   "src/main"
   "src/test"]

  :dependencies
  [[reagent "1.0.0"]]}

The shadow-cljs command line utility will first download all dependencies such as reagent, shadow-cljs, clojurescript, etc. and put them into the proper place in the ~/.m2 directory. This is a convention from the JVM maven ecosystem, but you’ll likely never need to actually look at it. The dependency is packaged as a .jar file and it’ll end up at ~/.m2/repository/reagent/reagent/1.0.0/reagent-1.0.0.jar.

To construct the actual classpath each tool will then combine all the manual paths (eg. :source-paths) you configured with the dependency .jar files and construct a list of them:

  • src/dev
  • src/main
  • src/test
  • ~/.m2/repository/reagent/reagent/1.0.0/reagent-1.0.0.jar
  • ~/.m2/repository/thheller/shadow-cljs/2.12.1/shadow-cljs-2.12.1.jar
  • ~/.m2/repository/clojure/clojurescript/1.10.844/clojurescript-1.10.844.jar

This list can often get very long, but it is managed for you by the tool and your config so you don’t really need to worry about the fine details.

When translating a resource name to an actual filename it’ll just go over this list and stop when it finds a match. So when we want to find the clojure/string.cljs resource the JVM will first check

  • src/dev/clojure/string.cljs
  • src/main/clojure/string.cljs
  • src/test/clojure/string.cljs

None of them exist, so it just keeps going, one by one in order. When the classpath entry is a .jar file it’ll look into that file to see if it contains the resource it is looking for. Eventually it’ll arrive at the clojurescript-1.10.844.jar and find the file it was looking for. I simplified here a little since the CLJS compiler will actually look for two files: it’ll first try to find clojure/string.cljs and if it doesn’t find that it’ll look for clojure/string.cljc. Clojure will first look for .clj files and then for .cljc as well.
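
To make the lookup order concrete, here is a minimal sketch that models the classpath as an ordered list of entries, each with the set of resources it contains. It is a simplified model and not how the JVM is implemented; the contents of each entry and the function names find-resource and find-cljs-source are assumptions made up for this example.

```clojure
;; Simplified model: [entry resources-in-that-entry], in classpath order.
(def classpath
  [["src/dev"                    #{}]
   ["src/main"                   #{"my/awesome/app.cljs"}]
   ["src/test"                   #{}]
   ["reagent-1.0.0.jar"          #{"reagent/core.cljs"}]
   ["clojurescript-1.10.844.jar" #{"clojure/string.cljs"}]])

(defn find-resource
  "Returns the first classpath entry containing resource-name, or nil."
  [classpath resource-name]
  (some (fn [[entry contents]]
          (when (contains? contents resource-name)
            entry))
        classpath))

(defn find-cljs-source
  "Tries the .cljs resource first, then .cljc, like the CLJS lookup described above."
  [classpath ns-path]
  (some (fn [ext]
          (when-let [entry (find-resource classpath (str ns-path ext))]
            [entry (str ns-path ext)]))
        [".cljs" ".cljc"]))

(find-resource classpath "clojure/string.cljs")
;; => "clojurescript-1.10.844.jar"
(find-cljs-source classpath "my/awesome/app")
;; => ["src/main" "my/awesome/app.cljs"]
```

Because the search stops at the first match, an entry earlier in the list always wins over a later one.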

Since it traverses the classpath in order it is important to choose a unique namespace prefix as your code may otherwise collide with one of your dependencies. The common convention from the JVM world is using the reverse domain notation, so foo.company.com becomes the com/company/foo resource path and the com.company.foo namespace prefix.

Those rules then also tell you where to put your source code. The default convention would be to put the my.awesome.app namespace into src/main/my/awesome/app.cljs. We often specify multiple :source-paths to separate out development or test-only code from the actual sources of our application. This is not strictly necessary, and you could instead put it all into one source path, but it can make the project setup slightly cleaner.

The important bit is that the resource name must be found exactly on the classpath. A common mistake is setting the wrong level of the source path: say you put the actual file at src/main/my/awesome/app.cljs but configure {:source-paths ["src"]}. Following the rules this will only end up looking for src/my/awesome/app.cljs and thus never find your actual file.
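
A tiny sketch of why the level matters: each source path is simply prepended to the resource name. candidate-files is a hypothetical helper used only for illustration.

```clojure
(defn candidate-files
  "Lists the files that would be checked for resource-name in each source path."
  [source-paths resource-name]
  (mapv #(str % "/" resource-name) source-paths))

;; Wrong level: the file lives in src/main/..., so this is never found.
(candidate-files ["src"] "my/awesome/app.cljs")
;; => ["src/my/awesome/app.cljs"]

;; Correct level.
(candidate-files ["src/main"] "my/awesome/app.cljs")
;; => ["src/main/my/awesome/app.cljs"]
```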

Because of how the JVM works the classpath can currently only be configured once on startup and as such changing :dependencies or :source-paths will require a restart of the shadow-cljs, lein or clj process.

Output Paths and HTTP

The Classpath controls everything related to the “inputs” used for your programs: locating source files and other additional resources. In CLJ you can access it at runtime, but for CLJS it is only relevant during compilation and not at runtime.

A common issue many CLJS devs run into is how you access the files in an HTTP context. I’ll be using shadow-cljs with the built-in :dev-http as an example, but the same things really apply to all setups.

Browsers are really picky for file security reasons and generally refuse to run certain code if you just load it from disk directly. Therefore, you’ll need an HTTP server to actually make use of the files generated by a shadow-cljs build.

The most basic build config for a :browser build looks like this:

 :dev-http {3000 "public"}
 ...
  {:target :browser
   :modules {:main {:init-fn ...}}
   :output-dir "public/js"
   :asset-path "/js"}

The :output-dir and :asset-path values are actually the default so you could even omit those. For example purposes I added them.

Dissecting this we have a couple paths. The Classpath we covered so I omitted :source-paths and :dependencies. All of the paths left are related to the output.

The first relevant option is the :output-dir of "public/js". This tells shadow-cljs to put all files it generates into that directory. The :main key in :modules controls what the file is called, so this will generate public/js/main.js. Each additional configured module would just generate an additional file in the :output-dir. This is relevant if you want to do more advanced code-splitting setups.

The next option is the :dev-http {3000 "public"}, which instructs shadow-cljs to start an HTTP server on port 3000 serving the public “root” directory. This will make all files in this directory available over http://localhost:3000. When you open that address the browser will actually request http://localhost:3000/, since there must always be a path in the URL. The URL is constructed of several standardized pieces: the scheme http:, the host localhost, the port 3000 and a path of /.

Since / is not a valid filename you could create in a directory, the convention is for the server to look for an index.html file instead when it receives a request ending with a /. Custom servers are in full control over this so this does not apply to all servers, but it does for :dev-http.

Assume that HTML file contains a <script src="/js/main.js">. Since that only specified the path without a new scheme or host it’ll perform the request reusing parts from the initial request to http://localhost:3000/ making it http://localhost:3000/js/main.js. Since that is an actual filename the server will just look for it in its root directory and will end up giving you the content of public/js/main.js.

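Basically you cut out the host portion and prepend the :dev-http root to select the file you’ll actually get:

1. start with the requested URL http://localhost:3000/js/main.js
2. cut the scheme and host:port, leaving /js/main.js
3. prepend the :dev-http root, giving public/js/main.js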
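
The same steps as a minimal sketch, including the index.html convention for paths ending in /. This only illustrates the mapping and is not the actual :dev-http implementation; url->file is a hypothetical name.

```clojure
(require '[clojure.string :as str])

(defn url->file
  "Maps a request URL to the file a server with the given root directory would serve."
  [root url]
  (let [path (.getPath (java.net.URI. url))          ; cut the scheme and host:port
        path (if (str/ends-with? path "/")
               (str path "index.html")               ; directory request -> index.html
               path)]
    (str root path)))                                ; prepend the root

(url->file "public" "http://localhost:3000/js/main.js")
;; => "public/js/main.js"
(url->file "public" "http://localhost:3000/")
;; => "public/index.html"
```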

:dev-http actually allows specifying multiple roots as well as using the classpath. :dev-http {3000 ["foo" "bar" "classpath:public"]} would first look for foo/js/main.js, then try bar/js/main.js and then try to find the public/js/main.js resource on the actual classpath (including in the .jar files).

The :asset-path becomes important when the generated code needs to locate additional files to load at runtime. It should always be the bit of path that will need to be added to the generated module filename (eg. main.js). The final constructed path should be directly loadable in your browser, eg. http://localhost:<port>/<asset-path>/<module>.js.

At runtime on the client side the :asset-path is actually just treated as a prefix so you can specify a full URL if you actually host the JS code on another server (eg. :asset-path "http://some.cdn/with/a-nested/path").
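
Since the :asset-path is treated as a plain prefix, the URL that the browser ends up requesting can be sketched like this. module-url is a hypothetical helper; the generated code does the equivalent internally, and the details may differ.

```clojure
(defn module-url
  "Builds the URL for a module by prefixing its filename with the asset path."
  [asset-path module-name]
  (str asset-path "/" module-name ".js"))

(module-url "/js" "main")
;; => "/js/main.js"
(module-url "http://some.cdn/with/a-nested/path" "main")
;; => "http://some.cdn/with/a-nested/path/main.js"
```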

Setting an incorrect :asset-path may work since it is only relevant when loading files dynamically at runtime. release builds may not actually be doing this but watch builds often do and having an incorrect path may lead to “file not found” request errors (eg. for source maps).


Playing New Music On Old Car Stereo With Clojure And Babashka

I think we can all agree that every one of us likes to listen to some music while driving somewhere. My brother and I ride together several times per week in his 2011 Suzuki Swift. It’s a nice car that has a built-in MP3 player which can play files from a USB stick. We like to pick out the music we listen to, so we started making our playlist some time ago. The whole list can be best descri...


Hello World of Programming with Linear Algebra

The recent popularity of Machine Learning brought high performance computing onto the radar of most programmers. You've heard that math in general and linear algebra in particular is central in implementing and using this stuff. However, you might have a hard time connecting the linear algebra that you've learned in college (and probably forgot by now) to actual programming tasks where you'd use it now.

I hope that a simple Hello World example can at least give you a rough idea of how LA can be applied in programming. I won't dwell on the theory; let's see whether we can make this intuitive.

Just an ordinary domain model

Imagine this simplified code for inventory modeling. (When I say "simplified" I mean really simplified. We're using floats for prices (bad), we store the data in global state, and the architecture is far from even a simple web application. But the model is familiar enough to a typical software developer.)

(def products {:banana {:price 1.3 :id :banana}
               :mango {:price 2.0 :id :mango}
               :pineapple {:price 1.9 :id :pineapple}
               :pears {:price 1.8 :id :pears}})

This (imaginary) application exists to track sales. A customer puts the desired products into a cart, we calculate the total price, and, later, perform the delivery. Each cart only stores the products' identifiers and quantities (in unspecified units; yes, it's super-simplified).

(def cart1 {:banana 10
            :pineapple 7
            :pears 3})

(def cart2 {:pineapple 3
            :mango 9})

Having defined product and cart data, we write a function that, given the products "database" and a cart, calculates the total price of the products in the cart. The cart-price function reduces all [product quantity] pairs in the cart, by retrieving the appropriate product map in the product-db, and taking the value associated with its :price key. It multiplies that price with the quantity, and accumulates it in total.

(defn cart-price [product-db cart]
  (reduce (fn [total [product quantity]]
            (+ total (* (:price (product-db product)) quantity)))
          0
          cart))

Let's call this function with the available carts, and see it in action.

(cart-price products cart1)
=> 31.699999999999996
(cart-price products cart2)
=> 23.7

We hopefully have more than one order. Our code can easily process sequences of carts, and compute the total revenue.

(reduce + (map (partial cart-price products) [cart1 cart2]))
=> 55.39999999999999

It's all good; but what does it have to do with linear algebra?

A more general algorithm

In the previous implementation, we entangled the specifics of data storage and the algorithm that computes the total price. In this simple model, it's not much of a problem, but if the data model is more complex, and the algorithm not as simple as the straightforward map/reduce, this quickly leads to (at least) two problems:

  • code becomes too complicated
  • program performance degrades quickly

Let's first tackle the code complexity by extracting the computation logic from the domain into the abstract mathematical notion of vectors and operations on these vectors. In this particular example, vectors help us encapsulate a bunch of numbers as one atomic unit.

(def product-prices [1.3 2.0 1.9 1.8])
(def cart-vec-1 [10 0 7 3])
(def cart-vec-2 [0 9 3 0])
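
The vectors above only make sense if every vector uses the same product order. A minimal sketch of how a cart map could be turned into such a vector follows; product-order and cart->vec are names assumed for this example and are not part of the original code.

```clojure
;; The position of each product in every vector (an assumption of this sketch).
(def product-order [:banana :mango :pineapple :pears])

(defn cart->vec
  "Returns the quantities of a cart in product order, using 0 for missing products."
  [order cart]
  (mapv #(get cart % 0) order))

(cart->vec product-order {:banana 10 :pineapple 7 :pears 3})
;; => [10 0 7 3]
(cart->vec product-order {:pineapple 3 :mango 9})
;; => [0 9 3 0]
```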

We recognize that the logic we've already developed for computing the total price matches a simple and well known mathematical operation, known as the dot product, a scalar product of two vectors.

(defn dot-product-vec [xs ys]
  (reduce + (map * xs ys)))

Given two vectors, [1 2 3] and [4 5 6], the dot product computes one number, a scalar, that represents a scalar product of these two vectors: 1*4 + 2*5 + 3*6. Right now, we don't even care about theoretical details of the dot product; we recognize that it technically computes the same thing that we need in our domain, and it seems useful. There are other ways to multiply vectors, which return non-scalar structures.

(dot-product-vec [1 2 3] [4 5 6])
=> 32

We can see that, when applied to the vectors holding product prices and quantities, it returns the correct results that we've already seen.

(dot-product-vec product-prices cart-vec-1)
=> 31.699999999999996
(dot-product-vec product-prices cart-vec-2)
=> 23.7

Getting the total price requires another map/reduce, but we will quickly see that this, too, can be generalized.

(reduce + (map (partial dot-product-vec product-prices)
               [cart-vec-1 cart-vec-2]))
=> 55.39999999999999

A library of linear algebra operations

When we abstract away the specifics of the domain, we end up with a number of general operations that can be reused over and over, and combined into more complex, but still general, operations. Countless such operations have been studied and theoretically developed by various branches of mathematics and related applied disciplines for a long time. What's more, many have been implemented and optimized for popular hardware and software ecosystems, so our main task is to learn how to apply that vast resource to the specific domain problems.

Linear algebra is particularly well supported in implementations. Whenever we need to process arrays of numbers, it is likely that at least some part of this processing, if not all of it, can be described through vector, matrix, or tensor operations.


Instead of developing our own naive implementations, we should reuse the well-defined data structures and functions provided by Neanderthal.

Here we use vectors of double precision floating point numbers to represent products' prices and carts.

(def product-prices (dv [1.3 2.0 1.9 1.8]))
(def cart-vctr-1 (dv [10 0 7 3]))
(def cart-vctr-2 (dv [0 9 3 0]))

We use the general dot function in the same way as the matching function that we had implemented before.

(dot product-prices cart-vctr-1)
=> 31.7
(dot product-prices cart-vctr-2)
=> 23.7


Once we start applying general operations, we can see new ways to improve our code, not so obvious at first.

Instead of maintaining sequences of vectors that represent carts, and coding custom functions to process these vectors, we can put that data into the rows of a matrix. All carts are now represented by one matrix, and each row of the matrix represents one cart.

(def carts (dge 2 4))
#RealGEMatrix[double, mxn:2x4, layout:column, offset:0]
▥       ↓       ↓       ↓       ↓       ┓
→       0.00    0.00    0.00    0.00
→       0.00    0.00    0.00    0.00
┗                                       ┛

We could have populated the matrix manually, but, since we already have the data loaded in appropriate vectors, we can copy it, showing how these structures are related.

(copy! cart-vctr-1 (row carts 0))
#RealBlockVector[double, n:4, offset: 0, stride:2]
[  10.00    0.00    7.00    3.00 ]
(copy! cart-vctr-2 (row carts 1))
#RealBlockVector[double, n:4, offset: 1, stride:2]
[   0.00    9.00    3.00    0.00 ]

The following step is the usual opportunity for a novice to slip. Should we now iterate the rows of our newly created matrix, calling dot products on each row? No! We should recognize that the equivalent operation already exists: matrix-vector multiplication, implemented by the mv function!

Most functions in this domain have short names that might sound cryptic until you get used to them. There is a method to their naming, though, and they are usually very descriptive mnemonics. For example, mv stands for Matrix-Vector multiplication. You'll guess that mm is Matrix-Matrix multiplication, and so on. As in mathematical formulas, this naming makes for compact code that can be grasped in one view.

(mv carts product-prices)
#RealBlockVector[double, n:2, offset: 0, stride:1]
[  31.70   23.70 ]
(asum (mv carts product-prices))
=> 55.4

Not only is the mv operation equivalent to multiple calls to dot, it also takes advantage of the structure of the matrix and optimizes the computation for the available hardware. This achieves much better performance, which can compound to improvements of orders of magnitude.
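
To see the equivalence with multiple dot products, here is a naive sketch on plain Clojure vectors. It is only meant to show what is being computed; it is not how Neanderthal implements mv, and mat-vec is a name made up for this example.

```clojure
(defn dot-product-vec [xs ys]
  (reduce + (map * xs ys)))

(defn mat-vec
  "Naive matrix-vector product: one dot product per row."
  [rows v]
  (mapv #(dot-product-vec % v) rows))

(mat-vec [[1 2]
          [3 4]]
         [5 6])
;; => [17 39]
```

Calling (mat-vec [cart-vec-1 cart-vec-2] product-prices) with the plain vectors from earlier would compute the same two cart totals, one per row.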

These improvements materialize in more serious examples. Any implementation of a small toy problem works fast.

…and more

Let's introduce a bit more complication. Say that we want to support different discounts for each product, in the form of multipliers. Multiplying a price by its discount coefficient gets us the price reduction, which we should subtract from the price. An alternative way, shown in the following snippets, is to subtract the discount coefficients from 1.0 to get the direct multiplier that gets us to the reduced price.

(def discounts (dv [0.07 0 0.33 0.25]))
(def ones (entry! (dv 4) 1))

We can subtract two vectors by the axpy function. axpy stands for "scalar a times x plus y".

(axpy -1 discounts ones)
#RealBlockVector[double, n:4, offset: 0, stride:1]
[   0.93    1.00    0.67    0.75 ]
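
For intuition, axpy on plain Clojure vectors could be sketched like this. axpy-vec is a hypothetical name; Neanderthal's own axpy works on its native vectors and is far more efficient.

```clojure
(defn axpy-vec
  "Computes a*x + y entry by entry."
  [a xs ys]
  (mapv (fn [x y] (+ (* a x) y)) xs ys))

;; With a = -1 this is simply y - x.
(axpy-vec -1 [1 2 3] [10 10 10])
;; => [9 8 7]
```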

The mul function multiplies its vector, matrix, or tensor arguments element-wise, entry by entry.

(mul (axpy -1 discounts ones) product-prices)
#RealBlockVector[double, n:4, offset: 0, stride:1]
[   1.21    2.00    1.27    1.35 ]

The following code seamlessly incorporates this new part of the algorithm into the implementation that we already have.

(asum (mv carts (mul (axpy -1 discounts ones) product-prices)))
=> 46.87

Why stop here? Suppose that we want to simulate the effects of multiple discount combinations on the total price. As earlier, we put all these hypothetical discount vectors into a matrix, in this case, three samples that we'd like to investigate.

(def discount-mat (dge 4 3 [0.07 0 0.33 0.25
                            0.05 0.30 0 0.1
                            0 0 0.20 0.40]))
#RealGEMatrix[double, mxn:4x3, layout:column, offset:0]
▥       ↓       ↓       ↓       ┓
→       0.07    0.05    0.00
→       0.00    0.30    0.00
→       0.33    0.00    0.20
→       0.25    0.10    0.40
┗                               ┛

We have to subtract these numbers from 1.0. Instead of populating the matrix with 1.0, we will demonstrate the outer product operation, implemented by the function rk. Given two vectors, it produces a matrix that holds all combinations of the product of the entries of the vectors.

(rk ones (subvector ones 0 3))
#RealGEMatrix[double, mxn:4x3, layout:column, offset:0]
▥       ↓       ↓       ↓       ┓
→       1.00    1.00    1.00
→       1.00    1.00    1.00
→       1.00    1.00    1.00
→       1.00    1.00    1.00
┗                               ┛
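
The outer product itself is easy to sketch on plain Clojure vectors. This naive version, with the made-up name outer, only illustrates what rk computes for two vectors.

```clojure
(defn outer
  "Outer product: entry [i j] of the result is (* (xs i) (ys j))."
  [xs ys]
  (mapv (fn [x] (mapv #(* x %) ys)) xs))

(outer [1 2] [3 4 5])
;; => [[3 4 5] [6 8 10]]

;; Four ones times three ones gives a 4x3 matrix of ones.
(outer [1 1 1 1] [1 1 1])
;; => [[1 1 1] [1 1 1] [1 1 1] [1 1 1]]
```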

We can also utilize rk to "lift" product prices vector to a matrix whose shape matches the shape of the discount combinations matrix.

(def ones-3 (subvector ones 0 3))
(def discounted-prices (mul (axpy -1 discount-mat (rk ones ones-3))
                            (rk product-prices ones-3)))

The result is a matrix of hypothetical discount prices that we'd like to simulate.

#RealGEMatrix[double, mxn:4x3, layout:column, offset:0]
▥       ↓       ↓       ↓       ┓
→       1.21    1.23    1.30
→       2.00    1.40    2.00
→       1.27    1.90    1.52
→       1.35    1.62    1.08
┗                               ┛

Now, the most interesting part: how do we calculate the totals from this matrix and the matrix of carts we've produced earlier? (Not) surprisingly, just a single operation, matrix multiplication, completes this task!

(mm carts discounted-prices)
#RealGEMatrix[double, mxn:2x3, layout:column, offset:0]
▥       ↓       ↓       ↓       ┓
→      25.05   30.51   26.88
→      21.82   18.30   22.56
┗                               ┛
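
Matrix multiplication is, again, just many dot products: entry [i j] of the result is the dot product of row i of the first matrix and column j of the second. A naive sketch on nested Clojure vectors, with names made up for this example, could look like this.

```clojure
(defn dot-product-vec [xs ys]
  (reduce + (map * xs ys)))

(defn transpose
  "Turns the rows of a nested vector into columns."
  [m]
  (apply mapv vector m))

(defn mat-mul
  "Naive matrix product of two nested vectors."
  [a b]
  (let [cols (transpose b)]
    (mapv (fn [row] (mapv #(dot-product-vec row %) cols)) a)))

(mat-mul [[1 2]
          [3 4]]
         [[5 6]
          [7 8]])
;; => [[19 22] [43 50]]
```

In the example above, each row of carts meets each column of discounted-prices, which gives the total of every cart under every discount combination.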

Now we only need to sum the columns up to get the three final totals. We won't do this column-by-column. Instead, we'll use the "mv with ones" approach we've already encountered. Note that we need to transpose the matrix to match the desired structure.

(trans (mm carts discounted-prices))
#RealGEMatrix[double, mxn:3x2, layout:row, offset:0]
   ▤       ↓       ↓       ┓
   →      25.05   21.82
   →      30.51   18.30
   →      26.88   22.56
   ┗                       ┛

And, the final answer is…

(mv (trans (mm carts discounted-prices)) (subvector ones 0 2))
#RealBlockVector[double, n:3, offset: 0, stride:1]
[  46.87   48.81   49.44 ]

Given three (or three million) possible discount combinations, we get a vector of the total revenue amounts. Of course, being a toy example, this code doesn't take into account that lower prices would (likely) induce more sales; let's not carry a Hello World example too far.

So, the first major benefit of using a library such as Neanderthal, based on many decades of numerical computing research and development, is that we have access to a large treasure trove of useful, well thought out, general functions for developing useful, general or customized, number processing algorithms.

Another major improvement is performance. Although toy examples may be implemented in any way you'd like, and they'd still work reasonably well, real-world data processing almost always involves either many data points, or many computation steps, or, often – both.

I'm not talking about a few dozen percent, but improvements of many orders of magnitude. But that's a story that has to be looked at in more depth. I've written two books where I go into much, much more depth and width on this. You can also check some earlier articles on this blog; there are many examples that demonstrate how fast this approach is!


Infrastructure As Code Is Wrong

There are several problems in computer science that are very hard. One of them is naming things. So it should be no surprise when names make little sense.

One of the "bad" names is "Infrastructure as Code". I think it misleads more than it reflects the idea.

In the era of self-hosted systems, infrastructure was managed by directly handling hardware and manually setting configurations. This approach does not work any longer in the age of cloud computing. It does not scale, it is too slow, and too risky.

Instead, "Infrastructure as Code" (IaC) represents a different idea: to represent infrastructure and configurations as machine- and human-readable text and then use automation to manage it. These tools can create infrastructure components as many times as we like, do it very fast, and make sure they all are exactly the same.

If we take Terraform as an example, provisioning an EC2 instance will look something like this:

resource "aws_instance" "web" {
  ami           =
  instance_type = "t3.micro"

  tags = {
    Name = "HelloWorld"
  }
}

It is human-readable (and writable), but it can also be processed by automation. Terraform will do all actual calls to AWS to provision the instance for us based on configurations in text files that we give it. Also, these files can be version-controlled. We can put them in git and track changes to infrastructure.

Just like code, right? But is it really code? It surely looks like it, but does not feel like it to me. What is code? Code is logic. It is either a series of imperative commands,

if this do that, else do the other thing

(C++, Java, Python, etc), or declarative computation pipelines

take data, pipe it through this function and apply this function to result

(Haskell, Elixir, Clojure, etc).
But in the case of IaC, we describe the desired state of infrastructure

I want 2 EC2 instances with such and such properties

We don't specify how exactly to get them. We don't specify what APIs to call and in what order, we don't specify logic to handle dependencies between resources. Instead, we rely on Terraform (or CloudFormation) to figure that out and do it for us.

The actual logic, the code, is the tool, Terraform or CloudFormation. What we give it, that textual description of what we want, is rather data.
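
The split between data and logic can be illustrated with a toy sketch in Clojure. The desired state is plain data, and a small function (standing in for the tool) works out what to do. This is not how Terraform or CloudFormation work internally; plan and the instance names are made up for this example.

```clojure
(require '[clojure.set :as set])

(defn plan
  "Compares the current and the desired set of resources and
  returns what would have to be created and destroyed."
  [current desired]
  {:create  (set/difference desired current)
   :destroy (set/difference current desired)})

;; "I want 2 EC2 instances" is just data; the logic lives in plan.
(plan #{"web-1"} #{"web-1" "web-2"})
;; => {:create #{"web-2"}, :destroy #{}}
```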

I have seen people take the name "Infrastructure as Code" at face value and treat it the same way as actual code. They try to fit loops, conditional statements and other imperative constructs into it, as if it were Java or C#. And it usually ends up pretty badly.

We should be very clear about what IaC is all about. Terraform or CloudFormation are not programming languages, not even frameworks. IaC is about being able to declaratively say what infrastructure you need and then use tools that will figure out how to get it for you. Trying to fit extensive logic into it is like trying to dig the ground with an iPhone. It can do it to some extent, but it is not what it's all about.

Infrastructure as Code seems like a very misleading term to me. When using this approach, you don't write actual code (logic) that will create infrastructure. You create configuration files that contain data that says what the infrastructure should look like. You create data, not code. That's why in my mind, "Infrastructure as Code" is actually Infrastructure as Data.


Functional programming vs object oriented programming

Functional programming (FP) is a programming style that builds software by composing pure functions. Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code: data in the form of fields, and code in the form of procedures.

Functional programming:
Functional programming is a declarative programming paradigm where programs are created by applying and composing functions rather than executing statements. Each function takes in an input value and returns a consistent output value without altering or being affected by the program state.
Functional programming is gaining popularity because it scales well to modern problems. Languages that support functional programming include Lisp, Clojure, Wolfram, Erlang, Haskell, F#, R, and other prominent languages. Functional programming is a good fit for data science work.
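
A minimal Clojure sketch of these two ideas, a pure function and immutable data, could look like this. The names total, prices and more-prices are made up for the example.

```clojure
;; A pure function: the result depends only on the argument.
(defn total [prices]
  (reduce + prices))

(def prices [10 20 30])

;; conj returns a new vector; the original is left untouched.
(def more-prices (conj prices 40))

prices              ;; => [10 20 30]
more-prices         ;; => [10 20 30 40]
(total prices)      ;; => 60
(total more-prices) ;; => 100
```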

Object-oriented programming:
Object-oriented programming (OOP) is a programming paradigm that relies on the concept of classes and objects. It is used to structure a software program into simple, reusable code blueprints (usually called classes), which are used to create individual instances of objects. Object-oriented programming languages include JavaScript, C++, Java, and Python. Object-oriented programming is about creating objects that contain both data and functions. A class is an abstract blueprint used to create more specific, concrete objects. Classes define what attributes an instance of this type will have, like color, but not the value of those attributes for a specific object. Classes can also contain functions, called methods, available only to objects of that type. These functions are defined within the class and perform some action helpful to that specific type of object.

Functional programming vs Object oriented programming

  • Functional programming emphasizes the evaluation of functions, while object-oriented programming is based on the concept of objects.
  • Functional programming favors immutable data, while object-oriented programming typically works with mutable data.
  • Functional programming follows the declarative programming model, while object-oriented programming mostly follows the imperative programming model.
  • Functional programming lends itself to parallel programming because pure functions share no mutable state, while parallel object-oriented code usually needs explicit synchronization.
  • In functional programming, pure expressions can be evaluated in any order without changing the result. In OOP, statements are executed in a particular order.
  • In functional programming, recursion is used for iteration, while in OOP, loops are used.
  • Variables and functions are the basic elements of functional programming. Objects and methods are the basic elements of object-oriented programming.
  • Functional programming fits best when there are few things with many operations. Object-oriented programming fits best when there are many things with few operations.
  • In functional programming, shared mutable state is avoided. In object-oriented programming, state lives inside objects.
  • In functional programming, a function is the primary manipulation unit. In object-oriented programming, an object is the primary manipulation unit.
  • Functional programming is well suited to processing large amounts of data for applications. Object-oriented programming is generally considered a weaker fit for big data processing.
  • Functional programming uses conditional expressions that return a value rather than conditional statements. In object-oriented programming, conditional statements such as if-else and switch are used.

Which is better?

Well, it depends on what your program is trying to do.
Both OOP and FP have the shared goal of creating understandable, flexible programs that are free of bugs. But they have two different approaches for how to best create those programs.
In all programs, there are two primary components: the data (the stuff a program knows) and the behaviors (the stuff a program can do to/with that data). OOP says that bringing together data and its associated behavior in a single location (called an “object”) makes it easier to understand how a program works. Functional programming says that data and behavior are distinctively different things and should be kept separate for clarity.
In functional programming, data is kept in plain, usually immutable, values and is transformed by applying functions to it. In object-oriented programming, data is stored in objects. Object-oriented programming is widely used by programmers and has a long track record of success.

In Object-oriented programming, it is quite hard to maintain objects while increasing the levels of inheritance. In functional programming, it requires a new object to execute functions, and it takes a lot of memory for executing the applications.
Each has their own advantages and disadvantages, it is up to the programmers or developers to choose the programming language concept that makes their development productive and easy.


Do forces really exist?

Force is an important concept in Newtonian mechanics. But do forces really exist? In fact, it is an abstraction invented by Newton. The insight revolutionized physics and universalized his model. What can we learn from it?

The post Do forces really exist? appeared first on LispCast.


Destructuring in Clojure

Destructuring in Clojure can be confusing for someone coming from the language where this technique is not implemented. Hopefully when you are done reading this article, destructuring will make sense and you will be implementing this method in your own coding projects.


Six years of professional Clojure development

10.05.2021

Over the last couple of years my colleagues and I here at doctronic have been busy creating and maintaining more than a dozen individual software systems for our clients.

We have the privilege to almost exclusively use Clojure and ClojureScript, as this is still the valid strategic technology decision made by doctronic shortly before I joined them in 2015. In this post I'd like to list some of the strengths and weaknesses in everyday real-life use of the language and its ecosystem.

Stability of the language and libraries: Anyone who ever created software of practical value for paying customers knows that reliability of core language constructs and libraries is important. It supports predictability, and eventually saves you from insanity whenever you want to benefit from performance improvements or bug fixes. Upgrading to a new version of a library or even Clojure itself is risk-free. The authors and maintainers really do care about stability. If you listened to talks of Rich Hickey or Stuart Halloway you know that this is no coincidence. "Don't break things!" is part of the culture.

Code size: If you compare some of the idiomatic code snippets in Clojure with the equivalents in more popular languages you might conjecture that whole Clojure code bases are in general smaller than what you can achieve in other languages. Now, after having worked more than 20 years with imperative OO languages and more than 5 years with Clojure I can firmly state that you'll achieve the same functionality with a fraction of the amount of code. This impacts costs: My feeling is that we now build systems with only half of the staff that I used to have on projects with a comparable scope.

Finding staff: Clojure's popularity is still far, far behind imperative general purpose languages like Java or Python, so fewer people feel encouraged to invest the time to learn the concepts and gain practice. It seems that this makes hiring harder. doctronic addressed this challenge by positioning itself as a supporter of the language in Germany through organizing and sponsoring the annual :clojureD conference. This helps greatly with staffing. But even if a company has a lower visibility, the fact that it uses Clojure will attract aficionados of advanced programming languages. If these developers do enjoy their workplace they will usually stay around longer with their employer. So I'd argue that you gain quality in skills and a lower turnover rate, but it might take more patience to build up a team of Clojure developers.

Teaching the language to apprentices: doctronic constantly employs apprentices so I was able to accompany some of them and help them learn Clojure. Young minds with very little programming experience are remarkably fast in picking up the language and becoming productive, it seems. I assume there are two main reasons for this: There's actually a whole lot of things they don't need to master compared to OO land. And they don't need to unlearn anything, in other words, they don't need to overcome old habits of doing things.

Some discipline required: Clojure is a language that gives us a wide range of options to implement solutions. And it imposes very little ceremony. For example, there is no static type system that checks your code upon compilation, but you can get some of the benefits for your data at runtime by using Clojure spec. This also serves as documentation and possibly a generator for test data. It is up to the team to decide when and where more checks, restrictions and tests have to be added. If the team misses out on this the code base might be harder to understand and maintain than necessary. In essence: with great freedom comes great responsibility.
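As a rough illustration of such runtime checks, here is a minimal sketch using clojure.spec.alpha. The :customer/... keywords and the check-customer function are hypothetical names chosen for this example, not taken from any doctronic code base.

```clojure
(require '[clojure.spec.alpha :as s])

;; Specs are registered under qualified keywords and double as documentation.
(s/def :customer/id pos-int?)
(s/def :customer/email (s/and string? #(re-matches #".+@.+" %)))
(s/def :customer/record (s/keys :req [:customer/id :customer/email]))

(defn check-customer
  "Returns the customer unchanged, or throws with spec's explanation data."
  [customer]
  (when-not (s/valid? :customer/record customer)
    (throw (ex-info "Invalid customer"
                    {:problems (s/explain-data :customer/record customer)})))
  customer)

(check-customer {:customer/id 1 :customer/email "jane@example.com"})
```

Where to place such checks (at system boundaries only, or deeper inside) is exactly the kind of decision the team has to make for itself.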

Navigating the code: When maintaining or extending an existing code base you often need to find specific places where a function or a piece of data is referenced. To support this kind of search for specific identifiers it is important that naming is self-explanatory and consistent, even more so because there are no links established via a type system. To mitigate this weakness in Clojure, you should use qualified keywords wherever possible.

Documentation becomes more important: When you're trying to understand a function that invokes other, non-core functions you'll need a clear idea of what kind of data these functions expect and return. Without any data type declarations the code itself often doesn't reveal what the data that flows through your functions looks like. You'll need either Clojure spec or more thorough documentation to mitigate this problem. It is not rare that you need to test functions of interest in isolation in the REPL to get a clear idea of what the data looks like.

The promise of purity: Any practical software application must cause some side-effects. So even if Clojure strongly encourages developers to create pure functions your systems will contain substantial parts that either depend on some environment or cause side-effects. The trick is to separate these pieces sharply and keep the impure parts small. Without any discipline you could easily spread side-effects everywhere and end up with the typical imperative mess. Clojure makes it easy to create a mostly pure implementation but it does not enforce this. To ensure this and other qualities the teams at doctronic conduct code reviews for almost every commit before it is promoted into the master branch.
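A small sketch of that separation, with made-up names (order-total, print-receipt!): the calculation lives in a pure function that is trivial to test, and the side-effecting function is a thin shell around it.

```clojure
(defn order-total
  "Pure: the result depends only on the argument."
  [order]
  (reduce + (map (fn [{:keys [price quantity]}] (* price quantity))
                 (:items order))))

(defn print-receipt!
  "Impure: performs I/O, but delegates all logic to the pure function."
  [order]
  (println "Total:" (order-total order)))

(print-receipt! {:items [{:price 10 :quantity 2}
                         {:price 5 :quantity 1}]})
```

The trailing ! is only a naming convention for marking side effects; nothing in the language enforces it.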

Reusability: Here's where Clojure really shines. Because most functions are pure and operate on few common data structures (mostly maps, vectors and sets) it is very easy to write reusable code. We created many internal libraries just by separating the candidate functions into their own namespaces within the regular project source tree. To finally establish the library we move the code from the project repo into a new Git repo and include the resulting library Jar in the project.clj as a dependency. Thus, we have a very lean process that results in production-quality reusable assets.

Startup time: Starting a fresh JVM with a Clojure Uber-Jar to bring up an application takes noticeable time. I assume that JVM class loading causes this delay. So if you are considering creating a command line tool that has a short net execution time you wouldn't want this runtime overhead.

Runtime stability: Clojure applications are in general very stable. I remember that in bigger Java projects from my past we always had to do some load testing in order to detect programming flaws like memory leaks or race conditions. With Clojure, we do load testing only to find real performance issues.

Ok, time to come to an end before this post becomes too long. The list is perhaps not complete, but I guess I included the most important aspects that recently popped into my mind. When I decided to fully jump on this train in 2015 I expected more unpleasant surprises. Now, six years later I have proven to myself that Clojure and ClojureScript are a practical and sound choice for real world software development, and I still enjoy using them.


Cataloging norns scripts


monome norns

Pictured above is norns, a sound computer, a music framework, a many-faced instrument1, from the folks at monome.

It runs scripts, little music-making apps, that anyone can develop and share through its package manager (maiden) and/or the community forum (lines).


Before deciding to acquire my own norns, I did some research to see what it was all about.

The official documentation (at the time2) was rather elusive, leaning more towards poetry than a technical reference.

I finally took the plunge but even after playing a bit with it, I still had a hard time pinpointing what I got my hands on.

As a matter of fact, norns can be many things, each script being mostly limited by its creator’s imagination.

For my own use, as I usually do with any subject matter3, I wrote down some notes as I got things figured out.

As I arrived pretty late in norns’ history, I was astonished by the sheer number of scripts available.

To try them out and easily go back to them, I started cataloging those scripts.

It quickly became apparent that this effort could benefit others, so I converted my notes to markdown (for easier collaboration) and made them public:

p3r7/awesome-monome-norns - GitHub

The reaction was pretty positive, and it sparked discussion with long time monome collaborators such as Sam Boling (post) and Dan Derks.

Early developments of

Still, it appeared that the official docs + the lines forum weren’t enough.

My effort was well received but wasn’t flexible enough to gain collaborative traction4.

As first theorized by Dan in this post, there was a need for more.

Overseen by Brian (@tehn), Dan and Tyler Etters started playing seriously with the idea of a wiki so that knowledge between script creators and users could be shared.

Before that, the forum was used for that purpose, but it was tedious to retrieve technical information buried deep within more general conversation.

A wiki looked indeed like the best tool to create a shared memory, a common documentary heritage.

The idea was to use awesome-monome-norns as a baseline for the structure but to make it more script author-centered, notably allowing them to have space to document each of their creations.

Around that time, Dan consulted me about how to structure information and mentioned that I could, if I wanted, be involved in the project in the future.

I shared with him some views I had gathered from my experience in the industry5 but didn’t really give clear feedback about being more involved.

I realized that one thing that was clearly missing from previous efforts (official doc, awesome-monome-norns) was a gallery view of all available scripts.

Indeed, they relied on lists or tables to present them. But all those scripts have a unique visual identity that is often more recognizable than their names.

I also wanted to address a common complaint I had with awesome-monome-norns: the table of connectivity options wasn’t legible enough with its many columns. A set of dedicated icons would make things both more compact and legible.


I gave myself a few hours over the weekend to make a working prototype, trying to respect as much as possible monome’s design language.

p3r7/norns-gallery - GitHub


I presented it to Dan, who forwarded it to Tyler.

Both of them got hooked by the idea and came back to convince me to integrate it with the wiki.

By the time they got back in touch, Tyler and Dan had made a huge effort on the data architecture side of things.

They asked me for advice on this aspect, but I didn’t see anything worth changing, so kudos to them.

Regarding the integration of the gallery, on the other hand, things weren’t as smooth.

Firstly, one problematic aspect was how to share state (list of scripts and meta-data) between the 2 instances.

We quickly figured out that the wiki had an API that would solve the issue.

Secondly, we failed to clearly scope how the gallery would fit with the features provided with the wiki.

Indeed, the gallery was initially conceived as a search tool and there was an overlap of functionality with the wiki’s native tagging & searching system.

We initially thought that the latter was too limiting for the number of search criteria we originally wanted6 and saw the gallery as a way to implement proper faceted search.


This early prototype still lives as an interactive demo.

However, we quickly saw several drawbacks:

  • the wiki’s native tag system would be a pretty limiting meta-data structure7 for storing many different aspects (dimensions)
  • imposing too many constraints on the proper tagging of scripts might deter contributors
  • having 2 search interfaces would be counter-intuitive to end users

So we agreed to simplify and store only 2 aspects of scripts with tags: their category and their connectivity options.

Tyler then oversaw the integration of the gallery’s DOM as an embedded iframe.

In this process, he made it more responsive8, more compact and did some CSS touchups to make the whole thing appear seamless.

Furthermore, he pushed the idea of tighter coupling, making filtered versions for pages dedicated to authors and specific categories.

Going beta then live

Dan and Tyler invited several batches of script authors to register and add their creations to the platform.

It was rather satisfying to see their excitement as they proceeded and watch the gallery get automagically populated.

When we launched, the reception was very positive.


Going forward

This first release was a success.

At the time of writing, between half and two thirds of all existing scripts have been documented. There are still some hidden gems that would benefit from being cataloged.

There is the possibility of a tighter integration with norns’ package manager (maiden), which could strengthen the link between norns as a platform and the community aspect of its script development.

I would also be happy to see the wiki host more general information pages. It could notably steal from awesome-monome-norns a list of engines and reusable Lua libs.

Ultimately I would like to mark awesome-monome-norns as deprecated, superseded by both the official doc and


It was a pleasure interacting with Dan and Tyler on this project.

Tyler was especially impressive in how quickly he grasped and patched the gallery’s source code, even though it was written in a language foreign to him (ClojureScript).

Dan struck me on one occasion with his intuition, leading us to a great optimization with a few innocent questions, even though we were touching on a subject in which he had the least know-how of us all.

You can read Tyler’s post-mortem of the project on his blog.

You can check out their scripts on their pages:


If you have notes you keep to yourself, it might be worth taking the extra step of sharing them with the world.

In addition to helping others, as a positive side-effect you might end up meeting other passionate people with whom you might sharpen your skills and grow as a person.


  1. For more details read: What / Why is norns? 

  2. In the meantime, it got quite an overhaul, notably thanks to Dan. 

  3. I extensively use org-mode to offload as many things as possible out of my brain. 

  4. GitHub pull requests were not necessarily the most user-friendly way to approach collaborative editing. 

  5. Notably the importance of having the same information delivered in different forms to target contexts and audiences. See also: The Documentation System 

  6. Being able to search by whether hardware controllers are required for script operation and also by support of specific grid models

  7. As opposed to key/value(s) pairs or even a relational data model. 

  8. Notably by replacing tailwind with Bootstrap for which he had a working secret sauce. 


Homoiconicity & Feature Flags

At work we've been using feature flags to roll out various changes of the product. Most recently the rebrand from Icebreaker to Gatheround. This allowed us to continuously ship small pieces and review and improve these at their own pace without creating two vastly different branches of changes.

With the rebrand work in particular there were lots of places where we needed relatively small, local differentiations between the old and the new appearance. Oftentimes just applying a different set of classes to a DOM element. Less often, up to swapping entire components.

Overall this approach seemed to work really well and we shipped the rebrand without significant delays and at a level of quality that made everyone happy. What we're left with now is some 250+ conditionals involving our use-new-brand? feature flag.

This tells the story of how we got rid of those.

Introducing Homoiconicity

If you're well familiar with homoiconicity this may not be entirely new but for those who aren't: homoiconicity is the fancy word for when you can read your program as data. Among many other lisp/scheme languages Clojure is homoiconic:

(doseq [n (range 10)]
  (println n))

The program above can be run but it can also be read as multiple nested lists:

[doseq [n [range 10]] [println n]]

Now, if you know what I'm talking about you will see that I skipped over a small detail here, namely that the code above uses two types of brackets (parentheses and square brackets) and that information got lost in this simplified array representation.

When doing it right (by differentiating between the two types of lists) we would end up with exactly the same representation as in the first code sample. And that is homoiconicity.
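As a quick way to see this at a REPL, the following sketch reads the program from a string and inspects the result as ordinary data. The name form is just a local choice for this example.

```clojure
;; Read the program text into a data structure without running it.
(def form (read-string "(doseq [n (range 10)] (println n))"))

(list? form)            ; the outer form is a list
(first form)            ; => doseq, a plain symbol
(vector? (second form)) ; the bindings [n (range 10)] are a vector
(nth form 2)            ; => (println n), another list

;; The very same data can also be evaluated as a program.
(eval form)
```

Note that the reader keeps the distinction between lists and vectors, which is the detail the simplified array representation above drops.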

Homoiconicity & Feature Flags

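With this basic understanding of homoiconicity, let's take a look at what those feature flags looked like in practice:

;; Classes chosen by the flag. The :div tag is a placeholder;
;; the original snippet did not show the element.
[:div
 {:class (if (config/use-new-brand?)
           "bg-new-brand typo-body"
           "bg-old-brand typo-large")}]

;; A component rendered only when the flag is on.
(when (config/use-new-brand?)
  (icon/Icon {:name "conversation-color"
              :class "prxxs h3"}))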

And so on. Now we have 250+ of those in our codebase but don't really plan on reversing that change any time soon... so we got to get rid of them. Fortunately Clojure is homoiconic and doing this is possible in a fashion that really tickles my brain in a nice way.

Code Rewriting

... isn't new of course, CircleCI famously rewrote 14,000 lines of test code to use a new testing framework. I'm sure many others have done similar stuff and this general idea also isn't limited to Clojure. Code rewriting tools exist in many language ecosystems. But how easily you can do it in Clojure felt very empowering.

The next two sections will be about some 30 lines of code that got us there about 90% of the way.

Babashka + rewrite-clj

Babashka is a "fast, native Clojure scripting runtime". With Babashka you can work with the filesystem with shell-like abstractions, make http requests and much more. You can't use every Clojure library from Babashka but many useful ones are included right out of the box.

One of the libraries that is included is rewrite-clj. And, you guessed it, rewrite-clj helps you 🥁 ... rewrite Clojure/Script code.

I hadn't used rewrite-clj much before and am still a bit unfamiliar with its API, but after I asked some questions on Slack, @borkdude (who also created Babashka) helped me out with an example of transforming conditionals that I then adapted for my specific situation.

I will not go into the code in detail here but if you're interested, I recorded a short 4 minute video explaining it at a surface level and demonstrating my workflow.

The rewriting logic shown in the video ignores many edge cases and isn't an attempt at a holistic tool to remove dead code branches but in our case this basic tool removed about 95% of the feature flag usages, leaving a mere 12 cases behind that used things like cond-> or conjunctions.
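To give a feel for the core idea without the rewrite-clj specifics, here is a sketch that works on plain read data using clojure.walk. This is not the script from the video: remove-flag and flag-call are names invented here, and because reading code into data drops comments and whitespace, a formatting-preserving tool such as rewrite-clj is the better fit for real source files.

```clojure
(require '[clojure.walk :as walk])

;; The flag check as it appears in code, as quoted data.
(def flag-call '(config/use-new-brand?))

(defn remove-flag
  "Replaces conditionals on the flag with the branch taken when it is on."
  [form]
  (walk/postwalk
   (fn [x]
     (if (and (seq? x) (= flag-call (second x)))
       (case (first x)
         ;; (if flag new old) becomes new
         if   (nth x 2)
         ;; (when flag body...) becomes body, wrapped in do if needed
         when (let [body (drop 2 x)]
                (if (= 1 (count body))
                  (first body)
                  (cons 'do body)))
         x)
       x))
   form))

(remove-flag '{:class (if (config/use-new-brand?)
                        "bg-new-brand typo-body"
                        "bg-old-brand typo-large")})
;; => {:class "bg-new-brand typo-body"}
```

Like the original tool, this sketch ignores forms such as cond-> or flag checks combined with and.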

Of the more than 230 feature flags that have been removed only about ten needed additional adjustments for indentation. This happened mostly when a feature-flag-using conditional wrapped multiple lines of code. Due to the locality of our changes that (fortunately) was relatively uncommon. If we had set up an automatic formatter for our code this also wouldn't have required any extra work.


This has been an extremely satisfying project, if you can even call those 30 lines a "project". I hope you also learned something or found it helpful in other ways!

Thanks to Michiel "borkdude" Borkent for all his work on Babashka. The interactive development workflow shown in the video paired with blazing startup times and a rich ecosystem makes it feel like there is a lot of potential still to be uncovered.

I'd also like to thank Lee Read, who has done such an amazing job making rewrite-clj ready for more platforms like ClojureScript and Babashka as well as making sure it's future-proof by adding more tests and fixing many long standing bugs.

After writing this blog post and detailing the beginnings of this idea I also took a bit more time to clean up the code and put it on GitHub.

If you thought this was interesting, consider following me on Twitter!


(clj 6) Three chapters in one year

It's been a bit more than a year since I posted my first blog post about learning Clojure. And it's been five months since my last blog post about it. So far I've made it through the first three chapters1 of "Clojure for the Brave and True". Instead of commenting on my learning pace at the start of every post, I've decided that this pace is the pace that works for me at this time, so there's no need to keep revisiting the topic.

Something I do want to mention is that one thing that triggered me to do some more Clojure was this episode of Gene Kim's excellent Idealcast podcast with Michael Nygard, in which they spend some time talking about Clojure.

Vim macros

The exercises at the end of chapter 3 got me to try out a lot of things, so I got bored having to type in the commands to copy a line (yy), paste it (p), replace it by its evaluation (c!$), comment it out (gcc), and add a "=>" to mark it as output. So I learned about Vim macros and recorded that sequence to run when I hit @c. At the end of my (clj 4) post I mentioned I might have to do this. Guess that moment came sooner than I expected.

Read more… (6 min remaining to read)


Agiliway Tech Talk – Functional Programming: Rapid Prototyping and Fast Delivery with Clojure

Clojure Webinar

Join our webinar “Functional Programming: Rapid Prototyping and Fast Delivery with Clojure” on May 25th at 9 AM (Pacific Time) / 12 PM (Eastern Time) / 4 PM (Greenwich Mean Time).

Register here

Choosing the most suitable programming language for a project is a crucial business decision. That is why Agiliway is launching a hands-on tech talk webinar to discuss the practical side of functional programming and explore fascinating techniques using the Clojure programming language.

The webinar will be presented by Kostiantyn Cherkashyn, Senior Clojure Developer at Agiliway, and Viktoriia Yaremchuk, PhD, Project Manager at Agiliway.

  1. Why functional programming?
  2. Locality and simplicity: Solving Problems the Clojure Way
  3. Clojure patterns that enable rapid feedback – REPL programming & the reloaded pattern
  4. Clojure and Java interop
  5. Clojure best practices
  6. Q/A session

So, what benefits will you gain from using Clojure? Why is functional programming a good choice when you need to deliver your project fast and bug-free? Be sure to attend our webinar to hear about new opportunities using Clojure.

Don’t forget to register! The webinar is free after registration, but we have limited seats.

The post Agiliway Tech Talk – Functional Programming: Rapid Prototyping and Fast Delivery with Clojure first appeared on Agiliway.


Configuration in Clojure

Clojure in Production


In this chapter, we will discuss how to make a Clojure project easy to configure. We’ll take a look at the basics of config: file formats, environment variables, libraries, and their pros and cons.

Formulation of the Problem

Materials on Clojure often contain examples like these:

(def server
  (jetty/run-jetty app {:port 8080}))

(def db {:dbtype   "postgres"
         :dbname   "test"
         :user     "ivan"
         :password "test"})

These are the server on port 8080 and the parameters for connecting to the database. The examples are useful because you can execute them in the REPL and check their result: open a page in a browser or perform a SQL query.

In practice, we should write code so that it does not carry concrete numbers and strings. Explicitly setting a port number to a server is considered bad practice. That is fine for documentation and examples, but not for the production launch.

Port 8080 and other combinations of zeros and eights are popular with programmers. There is a good chance that the port is occupied by another server. This happens when instead of running one service, you start a bunch of them at once during development or testing.

The code written by a programmer goes through several stages. These stages may differ between companies, but in general, they are development, testing, staging/pre-production, and production.

At each stage, the application runs alongside other projects. The assumption that port 8080 is free anytime is fanciful. In developer slang, the situation is called “hardcode” or “nailed down.” If there are nailed-down values in the code, they introduce problems into its life cycle. You cannot run two projects in parallel which declare port 8080 in their code.

The application does not need to know the server port – information about this comes from the outside. In a simple case, this source is the config file. The program reads the port from it and starts the server exactly as it needs to do on a specific machine.

In more complex scenarios, the file is not compiled by a person but by a special program – a configuration manager. The manager stores information about network topology, machine addresses, and database access parameters. On request, it generates a config file for a specific machine or network segment.

The process of passing parameters to an application and accepting them is called configuration. This step in software development deserves close attention. When it is done well, the project easily goes through all the stages of production.


The purpose of a config is to control the program without changing the code. The need for it arises with the growth of the code base and infrastructure. If you have a small Python script, there is nothing wrong with opening it in notepad and changing a constant. At enterprises, such scripts have been working for years.

But the more complex a company’s infrastructure, the more constraints it has. Today’s software development practices negate spontaneous changes in a project. You can’t git push directly to the master branch; git merge is prohibited until at least two colleagues approve your work; an application will not reach the server until tests pass.

As a result, even a slight change in the code takes hours to get into production. Editing configuration is cheaper than releasing a new version of the product. The rule follows from this: if you can make something a configurable option, do it right now.

Large companies practice what is called a feature flag. It is a boolean field that enables a vast layer of the application logic. For example, a new interface, a ticket processing system, or an improved chat. Of course, updates are tested before releasing them, but there is always a risk of something going wrong in production. In this case, we set the flag to false and restart the service. Thus, the company will not only save time but also preserve its reputation.

Configuration Cycle

The better an application is designed, the more of its parts rely on parameters. That’s why, on startup, the program immediately looks for configuration. Processing of configuration is a collection of steps, not a monolithic task. Let’s list the most important of them.

At the first stage, the program reads the configuration. Most often, the source is environment variables or a file. Data in a file is stored in JSON, YAML, and other formats. An app contains code to parse a format and get the data. We’ll look at the pros and cons of the well-known formats below.

Environment variables are part of an operating system. Think of them as a global map in memory. Every application inherits it when starting. Languages and frameworks offer functions to read variables into strings and maps.

Files and environment variables complement each other. For example, an application reads data from a file but looks for its path in environment variables. There might be an opposite approach. Sensitive data such as passwords and API keys are omitted in the file. So, other programs, including spyware, won’t see them. The application reads normal parameters from a file, but the secret information comes from variables.
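Both approaches can be sketched in a few lines. The function names config-path and merge-secrets, and the DB_PASSWORD variable, are assumptions made for this illustration; env stands for a map of variable names to strings, such as the one (System/getenv) returns.

```clojure
(defn config-path
  "First approach: the file location itself comes from the environment."
  [env]
  (get env "CONFIG_PATH"))

(defn merge-secrets
  "Second approach: ordinary parameters come from the already-parsed file,
  while the password is taken from an environment variable."
  [file-config env]
  (assoc-in file-config [:db :password] (get env "DB_PASSWORD")))

;; In a running program, env would be (System/getenv).
(merge-secrets {:port 8080 :db {:dbname "test" :user "ivan"}}
               {"DB_PASSWORD" "secret"})
;; => {:port 8080, :db {:dbname "test", :user "ivan", :password "secret"}}
```

Passing the environment in as an argument, rather than calling System/getenv inside the functions, keeps them easy to try out in the REPL.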

Advanced configurations use tags. In the file, the tag is placed before the value: :password #env DB_PASSWORD. A tag is a short string meaning that the next value is processed specially. In our example, the password field contains not the DB_PASSWORD string but the value of the same name variable.
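For an EDN file, such a tag can be handled with a custom reader passed to clojure.edn/read-string. This is only a sketch of the mechanism: read-config is a made-up name, and the environment is passed in as a map so the example stays self-contained.

```clojure
(require '[clojure.edn :as edn])

(defn read-config
  "Parses EDN text, replacing `#env NAME` with the value of variable NAME."
  [text env]
  (edn/read-string
   {:readers {'env (fn [var-name] (get env (str var-name)))}}
   text))

;; In a running program, the second argument would be (System/getenv).
(read-config "{:user \"ivan\" :password #env DB_PASSWORD}"
             {"DB_PASSWORD" "secret"})
;; => {:user "ivan", :password "secret"}
```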

The first stage ends when we have received the data. It doesn’t matter if it was a file, environment variables, or something else. The application moves on to the second stage, type inference.

JSON and YAML have basic types: strings, numbers, booleans, and null. It is easy to see that there is no date among them. We use dates to define promotions or calendar events. In files, dates are specified either as an ISO string or as the number of seconds since January 1, 1970 (the UNIX epoch). Specially designed code runs through the data and converts dates to the type accepted in the language.

Type inference applies to collections as well. Sometimes maps and arrays are not enough to work comfortably. For example, possible types of something are stored as a set because it cuts off duplicates and quickly validates if a value belongs to it. It’s easier to describe some complex types with plain values (strings, numbers) and coerce them later. A string will become an instance of, and a sequence of 36 hexadecimal characters will be a UUID.

Environment variables are less flexible than modern formats. JSON provides scalars and collections, while variables contain nothing but text. Type inference is not only desirable, but necessary for them. You cannot pass a port as a string to where a number is expected.
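A few such coercions, sketched with plain Java interop. The function names (->port, ->date, ->uuid, ->set) are invented for this example, and the comma-separated format for sets is an assumption.

```clojure
(require '[clojure.string :as str])

(defn ->port
  "Text to number; throws NumberFormatException on bad input."
  [s]
  (Integer/parseInt s))

(defn ->date
  "ISO-8601 instant string to java.util.Date."
  [s]
  (java.util.Date/from (java.time.Instant/parse s)))

(defn ->uuid
  "Canonical 36-character UUID string to java.util.UUID."
  [s]
  (java.util.UUID/fromString s))

(defn ->set
  "Comma-separated text to a set, which drops duplicates."
  [s]
  (set (str/split s #",")))

(->port "8080")     ; => 8080
(->set "csv,json,csv") ; => #{"csv" "json"}
(->date "2021-05-10T00:00:00Z")
(->uuid "123e4567-e89b-12d3-a456-426614174000")
```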

Data validation starts after type inference. In the chapter on Spec, we found out that a proper type does not promise a correct value. Validation is needed to make it impossible to specify port 0, -1, or 80 in the configuration.

From the same chapter, we remember that sometimes the values are correct individually but cannot be paired. Suppose we specified the promotion period in the configuration. It is an array of two dates: the start and the end. These dates may easily be swapped, and then checking any date against the interval will return false.
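Both kinds of check can be sketched with clojure.spec.alpha. The spec names and the lower bound of 1024 for ports are assumptions made for this example; the text only says that values like 0, -1, and 80 must be rejected.

```clojure
(require '[clojure.spec.alpha :as s])

;; A single value: the type is right, but the range matters too.
(s/def ::port (s/and int? #(<= 1024 % 65535)))

;; A pair of values: each date is valid alone, but the start
;; must come strictly before the end.
(s/def ::date-range
  (s/and (s/tuple inst? inst?)
         (fn [[start end]] (neg? (compare start end)))))

(s/valid? ::port 8080) ; => true
(s/valid? ::port 80)   ; => false
(s/valid? ::date-range [#inst "2021-01-01" #inst "2021-02-01"]) ; => true
(s/valid? ::date-range [#inst "2021-02-01" #inst "2021-01-01"]) ; => false
```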

After validation, proceed to the last stage. The application decides where to store the configuration, for example, in a global variable or a system component. Other parts of the program will read parameters from there, not from the file.

Config Errors

At each stage, an error may occur, e.g., file not found, syntax violations, invalid field. In this case, the program displays a message and exits. The text should explicitly answer the question of what happened. Too often, programmers keep in mind only the positive path and forget about errors. When running their programs, you see a stack trace that is difficult to understand.

If an error occurred during the verification stage, explain which field was the culprit. In the chapter on Spec, we looked at how to improve a spec report. It takes effort but pays off over time.

In the IT industry, some people write code, and others manage it. Your DevOps colleagues don’t know Clojure and won’t understand the raw s/explain. Sooner or later, they will ask you to improve the configuration messages. Do this in advance out of respect for your colleagues.

If there is something wrong with the config, then the program should terminate immediately rather than work, hoping that everything will settle somehow. Sometimes one of the parameters is specified incorrectly, but the program does not use it for the time being. Avoid this: the error will appear at the most inopportune moment.

If one of the configuration steps fails, the program should exit with a nonzero code. The message is sent to the stderr channel to signal an abnormal condition. Advanced terminals print text from stderr in red to catch your attention.

Configuration Loader

To reinforce theory with practice, let’s write our configuration system. It will be a separate module of about one hundred lines. Before opening the editor, let’s think over the main points.

Let’s store the configuration in a JSON file. We’ll assume that the company has recently switched to Clojure, and DevOps has already written Python scripts to manage configuration settings. Of course, EDN would be the best choice for Clojure programs, but it will complicate work for our colleagues, so we’ll not use it for now.

The path to the config file is specified by the CONFIG_PATH environment variable. From the file, we expect to get a server port, database parameters, and a promotion date range. Dates should become java.util.Date objects. The start date must be strictly less than the end date.

We will put the final map into the global variable CONFIG. If an error occurs at one of the steps, we will show a message and exit the program.

Let’s start with the exit helper function. It takes a completion code, a text, and formatting options. If the code is equal to zero, write the message to stdout, otherwise – to stderr.

(defn exit
  [code template & args]
  (let [out (if (zero? code) *out* *err*)]
    (binding [*out* out]
      (println (apply format template args))))
  (System/exit code))

Now let’s move on to the loader. It is a set of steps, where each one takes the result of the previous one. The logic of the steps is easy to understand from their names. Namely, there are four actions: finding the path to the config, reading the file, inferring data types, and setting a global variable. Type coercion and validation are combined into coerce-config since, technically, both happen in a single s/conform call.

(defn load-config! []
  (-> (get-config-path)
      read-config-file
      coerce-config
      set-config!))

Now we will describe each step. The get-config-path function reads an environment variable and checks if such a file exists on disk. If everything is okay, the function will return the file path; otherwise, it will call exit:

(import 'java.io.File)

(defn get-config-path []
  (if-let [filepath (System/getenv "CONFIG_PATH")]
    (if (.exists (new File ^String filepath))
      filepath
      (exit 1 "File %s does not exist" filepath))
    (exit 1 "File path is not set")))

The read-config-file step reads the file by its path. The Cheshire library parses JSON. Its parse-string function returns data from a document string.

(require '[cheshire.core :as json])

(defn read-config-file
  [filepath]
  (try
    (-> filepath slurp (json/parse-string true))
    (catch Exception e
      (exit 1 "Malformed config, file: %s, error: %s"
            filepath (ex-message e)))))

Type inference and validation are the most important steps. The application must not receive invalid parameters. The coerce-config step passes data from the file through s/conform. There is a chance of getting an exception when calling it, so wrap it in pcall – a safe call that will return an error and the result.

If there was an exception, we print its message and terminate the program. The same applies to the case when we get the ::s/invalid keyword. The only difference is that we compose the message with the Expound library. We have to consider both cases because a failure and an incorrect result are different things.
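
The pcall helper is not defined in this section. A minimal sketch of it, assuming it behaves exactly as described above (it returns a pair of an error and a result, one of which is nil), might look like this:

```clojure
(defn pcall
  "Assumed shape of the safe call: applies f to args and returns
  [nil result] on success or [exception nil] on failure."
  [f & args]
  (try
    [nil (apply f args)]
    (catch Exception e
      [e nil])))

(pcall + 1 2)  ;; => [nil 3]
(pcall / 1 0)  ;; => [<ArithmeticException> nil]
```

Returning a pair keeps the caller free of try/catch: it destructures the vector and branches on whether the error is present.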

(require '[clojure.spec.alpha :as s])
(require '[expound.alpha :as expound])

(defn coerce-config [config]
  (let [[e result] (pcall s/conform ::config config)]
    (cond
      (some? e)
      (exit 1 "Wrong config values: %s" (ex-message e))

      (s/invalid? result)
      (let [report (expound/expound-str ::config config)]
        (exit 1 "Invalid config values: %s %s"
              \newline report))

      :else result)))

Now, only a spec is missing. Let’s open the configuration and examine its structure:

{
    "server_port": 8080,
    "db": {
        "dbtype":   "mysql",
        "dbname":   "book",
        "user":     "ivan",
        "password": "****"
    },
    "event": [
        "<start date>",
        "<end date>"
    ]
}

Describe the spec from top to bottom. It is a map with the keys:

(s/def ::config
  (s/keys :req-un [::server_port ::db ::event]))

The server port is a combination of two predicates: a number check and a range check. Checking for a number is needed so that nil and a string do not get into the second predicate. Otherwise, this will throw an exception where you least expect it.

(s/def ::server_port
  (s/and int? #(<= 1024 % 65535)))

We meet number and range checks frequently, so Spec offers the s/int-in macro for this case. Please note that the right border is exclusive, meaning that it does not belong to the interval; that is why the code below passes (inc 65535). In mathematical notation, the resulting interval is [1024, 65536).

(s/def ::server_port
  (s/int-in 1024 (inc 65535)))
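
A quick REPL check of the borders makes the exclusive upper bound visible. The :demo/port name is used only for this illustration so that it does not clash with the spec above.

```clojure
(require '[clojure.spec.alpha :as s])

;; same range as ::server_port, registered under a throwaway name
(s/def :demo/port (s/int-in 1024 (inc 65535)))

(s/valid? :demo/port 1024)    ;; => true, the left border is inclusive
(s/valid? :demo/port 65535)   ;; => true, thanks to (inc 65535)
(s/valid? :demo/port 65536)   ;; => false, the right border is exclusive
(s/valid? :demo/port 80)      ;; => false, below the range
(s/valid? :demo/port "8080")  ;; => false, not an integer
```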

Now let’s describe the database connection. There won’t be any problems with it, because all its fields are strings. For more rigor we use ::ne-string to prevent empty strings. The database engine is specified as an enumeration of strings with the only item «mysql». This will eliminate extraneous values.

(s/def :db/dbtype   #{"mysql"})
(s/def :db/dbname   ::ne-string)
(s/def :db/user     ::ne-string)
(s/def :db/password ::ne-string)

(s/def ::db
  (s/keys :req-un [:db/dbtype :db/dbname
                   :db/user :db/password]))

The event field is the most challenging one. It consists of a tuple of dates and an interval check:

(s/def ::event
  (s/and (s/tuple ::->date ::->date)
         ::date-range))

The s/tuple spec checks that a collection has the exact number of items. In our case, a vector of one or three dates won’t pass it. The ::->date spec converts a string to a date. In order not to parse it manually, let’s take the read-instant-date function from the clojure.instant namespace. This function is format-tolerant and reads incomplete dates, for example, only a year. Let’s wrap it in s/conformer. We put ::ne-string in front to cut off the non-date garbage.

(require '[clojure.instant :as inst])

(s/def ::->date
  (s/and ::ne-string (s/conformer inst/read-instant-date)))

Let’s describe range checking. It takes a pair of Date objects and compares them. Dates cannot be compared using the “greater than” or “less than” signs. Instead, use the compare function, which returns a negative number, zero, or a positive number for the less than, equal, and greater than cases, respectively. We are interested in the first case, when the result is negative.

(s/def ::date-range
  (fn [[date1 date2]]
    (neg? (compare date1 date2))))
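
To see the pieces working together, here is a self-contained sketch of the same idea under throwaway :demo/... names. The :demo/ne-string definition and the sample dates are assumptions made for this illustration; the real ::ne-string spec is defined elsewhere.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.instant :as inst]
         '[clojure.string :as str])

;; assumed definition of a non-empty string spec
(s/def :demo/ne-string
  (s/and string? (complement str/blank?)))

(s/def :demo/->date
  (s/and :demo/ne-string (s/conformer inst/read-instant-date)))

(s/def :demo/date-range
  (fn [[date1 date2]]
    (neg? (compare date1 date2))))

(s/def :demo/event
  (s/and (s/tuple :demo/->date :demo/->date)
         :demo/date-range))

;; a proper pair conforms to a vector of two java.util.Date objects
(s/conform :demo/event ["2019-07-05" "2019-07-12"])

;; swapped dates and a wrong number of items are both invalid
(s/conform :demo/event ["2019-07-12" "2019-07-05"])
(s/conform :demo/event ["2019-07-05"])
```

Note that read-instant-date throws on a string it cannot parse instead of returning an invalid marker, which is why the loader wraps s/conform in a safe call.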

The last step is set-config!, which writes the map to the global CONFIG variable. We chose an uppercase name so that a local variable named config does not shadow it. To change a global variable, use alter-var-root.

(def CONFIG nil)

(defn set-config!
  [config]
  (alter-var-root (var CONFIG) (constantly config)))

At the start of the program, execute (load-config!) so that the configuration appears in the variable. Other modules import CONFIG and read the keys they need. Below is how to start a server or execute a request based on configuration:

(require '[project.config :refer [CONFIG]])

(jetty/run-jetty app {:port (:server_port CONFIG)
                      :join? false})

(jdbc/query (:db CONFIG) "select * from users")

If there is something wrong with your configuration, the program will terminate with a clear message.


We have written a configuration loader. It is simple to maintain: every step is a function that is easy to modify. Our code does not pretend to be an industrial solution, but it is suitable for small projects.

Its advantage is that the configuration can be re-read at any time. This is handy for development: modify the file and run load-config! in the REPL. A new configuration appears in the CONFIG variable.

The downside of the loader is that the code is bound to the exit function, which terminates a JVM. In production, this is the right approach: you cannot continue if the parameters are misconfigured. In development, a termination is more of a problem than a benefit: any error kills the REPL, and you need to start it again.

The termination of a JVM is too drastic. We should separate an error and reaction to it. The naive way is to call load-config! while the exit is being redefined with a function that only throws an exception. Let’s name it fake-exit. The code below will not terminate the JVM; it will only throw an exception with the text that we passed to exit:

(defn fake-exit
  [_ template & args]
  (let [message (apply format template args)]
    (throw (new Exception ^String message))))

(defn load-config-repl! []
  (with-redefs [exit fake-exit]
    (load-config!)))

A better solution is to pass additional parameters to load-config!. Let’s call one of them die-fn (the “death function”) that takes an exception. In production, it terminates the JVM, and in development, it writes a message to the REPL. Modify the loader to support the :die-fn parameter. Consider default behavior if the parameter is not specified.
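
One possible shape of such a loader is sketched below. It is not the loader from this chapter: it assumes that the steps throw exceptions instead of calling exit, and the names run-steps, die-exit, and die-report are hypothetical.

```clojure
(defn die-exit
  "Production reaction: print the message to stderr and stop the JVM."
  [e]
  (binding [*out* *err*]
    (println (ex-message e)))
  (System/exit 1))

(defn die-report
  "Development reaction: report the message and keep the REPL alive."
  [e]
  (println "Config error:" (ex-message e))
  nil)

(defn run-steps
  "Passes the result of each step to the next one, starting from nil.
  Any exception is handed to die-fn, which defaults to die-exit."
  [steps {:keys [die-fn] :or {die-fn die-exit}}]
  (try
    (reduce (fn [acc step] (step acc)) nil steps)
    (catch Exception e
      (die-fn e))))

;; in the REPL, a failing step only prints a message
(run-steps [(fn [_] (throw (ex-info "File path is not set" {})))]
           {:die-fn die-report})
```

The error and the reaction to it are now separate: the steps only signal a problem, and the caller decides whether that is fatal.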

Another point that addresses the issue of inferring types. The loader relies on the s/conform function for type inference. In the chapter on Spec, we looked at the case when s/conform adds logical tags and changes the data structure. If we replace our custom ::db spec with the ::jdbc/db-spec one, we will get the same case. We have set our database spec without s/or macros in order not to distort the data.

In another way, you can coerce types using tags. We will discuss this technique in the following sections of this chapter.

More on Environment Variables

A loader reads data from a file, taking only a small part – the file path – from environment variables. Let’s modify the loader: now it reads all data from the environment without using files. To know better the advantages of the new approach, let’s discuss it first in isolation from any specific language.

Environment variables are sometimes called ENV for short, for example, when reading a file of the same name or working with them in the code. This is a fundamental property of the operating system. Think of variables as a global map that is populated at a computer startup. The map contains the main system parameters: locale, home directory, a list of paths where the system looks for programs, and much more.

To see the current variables, run env or printenv in a terminal. The pairs NAME=value will appear on the screen. Variable names are in uppercase to make them stand out and emphasize their priority. Most systems are case sensitive, so home and HOME are different variables. Spaces and hyphens are not allowed; lexemes are separated by underscores. Here’s a snippet of printenv:


Each process receives a copy of this map. A process can add or remove a variable, but the changes are visible only to it and its descendants. A child process inherits the variables from its parent.

Local and Global Variables

Distinguish between environment and shell variables; they are also called global and local variables. Newbies often confuse them. Run the command in the terminal:

$ FOO=42

You have set a shell variable. To refer to a value by name, precede it with a dollar sign. The example below will print 42:

$ echo $FOO

If we execute printenv, we won’t see FOO in the output. The FOO=42 instruction sets a shell variable, not an environment variable. These variables are only visible to the shell, and its descendants do not inherit them. Let’s check it: start a new one from the current shell and repeat printing.

$ sh
$ echo $FOO

We get an empty string because the child does not have such a variable. Run exit to return to the parent shell.

The export command puts a variable into the environment. Printenv sees the variable set this way:

$ export FOO=42
$ printenv | grep FOO

The child processes also see it:

$ sh
$ echo $FOO

Sometimes you need to start a process with a variable but so as not to affect the current state. In such a situation, you should place the expression NAME=value before the basic command:

$ BAR=99 printenv | grep BAR

Printenv generates a new process that has access to the BAR variable. If we print $BAR once again, we’ll get an empty string.

Programs often read parameters from environment variables. A PostgreSQL client distinguishes between two dozen variables: PGHOST, PGDATABASE, PGUSER, and others. The variables supply default values: an explicit --host, --username, or similar command-line parameter takes precedence over them. If you execute the following in the current shell:

$ export PGDATABASE=project

then each PostgreSQL utility will run on the specified server and database. This is convenient for a series of commands: you don’t have to specify --host and other arguments every time.

Pay attention to the PG prefix. It prevents overwriting someone else’s HOST variable. There are no namespaces in the environment, so the prefix is the only way to separate your variables from others.
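
In code, the prefix convention boils down to filtering a map by key. Here is a small sketch in Clojure; the with-prefix name and the sample values are made up for the illustration.

```clojure
(require '[clojure.string :as str])

(defn with-prefix
  "Keeps only the variables whose names start with the given prefix."
  [prefix env]
  (into {}
        (filter (fn [[k _]] (str/starts-with? k prefix)))
        env))

(with-prefix "PG" {"PGHOST" "db.local"
                   "PGUSER" "ivan"
                   "HOST"   "other"})
;; => {"PGHOST" "db.local", "PGUSER" "ivan"}
```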

Config in the Environment

Each language provides functions to read a single variable to a string or get all of them as a map. It means we can set config with environment variables. Let’s look at the pros and cons of this approach.

The application does not access the disk while reading the environment since it is located in memory. We’re not aiming at the performance benefits, though. Yes, memory is much faster than disk, but you will never notice the difference between 0.01sec and 0.001sec. Our main point is that an application that does not depend on files is more autonomous and easier to maintain.

Sometimes a configuration file is unexpectedly located in a different folder, and an application cannot find it, or worse, the app starts up with an old file version. This makes things slower and more confusing.

Storing passwords and keys in variables is safer than in files. These data can be read in files by other programs, including malware. By mistake, a file can get into the repository and remain in history. Some scripts search open repositories for keys to cloud platforms and wallets (and sometimes find them, unfortunately).

Even if a file belongs to a user, others can get read access. Environment variables are ephemeral: they live only in operating system memory. One user cannot read another’s variables – this is a strict limitation at the operating system level.

The industry is moving from files to virtualisation. If earlier we copied files via FTP, today applications are running from images. They are archives that contain the code and its environment. Unlike a regular archive, we cannot change an image. To update a file in the image, you need to rebuild it, which complicates the process.

On the contrary, virtualisation is loyal to the environment variables. They are specified in the parameters when you start the image. The same image is used with different variables, so a new build is not required. The more options you can set with variables, the more convenient it is to work with the image. In the example below, the PostgreSQL server starts with a ready-to-use database and a user:

$ docker run \
  -e POSTGRES_DB=book \
  -e POSTGRES_USER=ivan \
  -d postgres

The Twelve-Factor App is a famous set of rules for developing robust applications. It also prescribes storing configuration in the environment. The author mentions the same advantages of variables that we have looked at – file independence, security, and support on all platforms.

Disadvantages of the Environment

Variables do not support types: any value is text. Type inference is up to you. Do it declaratively, not manually. Here’s a bad example in Python:

db_port = int(os.environ["DB_PORT"])

When there are more than two variables, the code becomes ugly. Specify a map where a key is a variable name and value is a function to transform a text value. The special code traverses the map and fills up the result. For the sake of shortness, let’s skip error handling:

import os
env_mapping = {"DB_PORT": int}

result = {}
for env, fn in env_mapping.items():
    result[env] = fn(os.environ[env])

The approach is also valid for other languages: less code, more of the declarative part. In Clojure, we usually transform the data with spec.
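
For comparison, the same declarative mapping can be written in Clojure without spec. This is only a sketch: the env-mapping and coerce-env names are hypothetical, and error handling is skipped as in the Python version.

```clojure
;; variable name -> function that parses its text value
(def env-mapping
  {"DB_PORT" #(Long/parseLong %)})

(defn coerce-env
  "Builds a map with the keys of mapping, where each value is taken
  from env and passed through the corresponding function."
  [mapping env]
  (reduce-kv
   (fn [acc k f]
     (assoc acc k (f (get env k))))
   {}
   mapping))

(coerce-env env-mapping {"DB_PORT" "5432" "HOME" "/root"})
;; => {"DB_PORT" 5432}
```

Adding a variable means adding one entry to the map; the traversal code stays the same.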

Environment variables do not work with hierarchy. They are a flat set of keys and values that is not always suitable for config. The more parameters the configuration has, the more often they are grouped by meaning. Let’s say ten parameters define the connection to the database. We’ll take them out to the child map in order not to put a prefix in front of each.

;; so-so
{:db-name "book"
 :db-user "ivan"
 :db-pass "****"}


;; better
{:db {:name "book"
      :user "ivan"
      :pass "****"}}

Nested variables are read differently on different systems. For example, a single underscore separates lexemes but does not change the structure. Double underscore stands for nesting:

DB_NAME=book
DB_PASS=pass

{:db-name "book"
 :db-pass "pass"}

DB__NAME=book
DB__PASS=pass

{:db {:name "book"
      :pass "pass"}}

An array is specified in square brackets or separated by commas. When parsing one, there is a risk of false splitting. This happens when the comma or a bracket refers to a word, not syntax.
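
A false split is easy to reproduce. The sample values below are made up for the illustration:

```clojure
(require '[clojure.string :as str])

;; the comma is pure syntax here, so the split is correct
(str/split "red,green,blue" #",")
;; => ["red" "green" "blue"]

;; here the comma is also part of each value: two names become four pieces
(str/split "Smith, John,Doe, Jane" #",")
;; => ["Smith" " John" "Doe" " Jane"]
```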

The JSON and YAML formats set a clear standard for how to describe collections. But there is no single convention for environment variables. The situation gets more complicated when a highly nested parameter is expected, such as a list of dictionaries. Environment variables do not fit well with such a structure.

Development reveals one more trade-off of these variables: they are read-only in some runtimes, the JVM among them. That is ideologically correct, but it forces you to restart the REPL for every configuration change, whereas a file only needs to be changed and read again.

Env Files

When there are many variables, entering them manually via export is tiresome. In such situations, we move the variables to a file called the env-configuration. Technically, it is a shell script, but the fewer scripting capabilities it has, the better. Ideally, such a file holds only NAME=value pairs, one per line. Let’s just call it ENV without extension.

DB_NAME=book
DB_USER=ivan
DB_PASS=****

To read the variables into the shell, call source <file>. It is a bash command that will execute the script in the current session. The shorthand for this often-used command is a dot: . <file>. The script will add variables to the shell, and you will see them after source. This is an important difference from bash <file> command, which will execute the script in a new shell, and you won’t see any changes in the current one.

$ source ENV
$ echo $DB_NAME

If you run the application from the current shell, the app still won’t get the variables from the file. Recall that the expression VAR=value defines a local variable. DB_NAME and other variables will not get into the environment, and the program will not inherit them. Let’s check this with printenv:

$ source ENV
$ printenv | grep DB
# exit 1

You can solve the problem in two ways. The first is to open the file and place the export expression before each pair. Then the source command of this file will add variables to the environment:

$ cat ENV
export DB_NAME=book
export DB_USER=ivan
export DB_PASS=****

$ source ENV
$ printenv | grep DB

The disadvantage of this method is that now the file has become a script. If you do not put export before a variable, the application will not read it.

The second way is based on the -a (allexport) parameter of the current shell. When it is set, the local variable is sent to the environment as well. Before reading variables from a file, set the flag to “true” and then to “false” again.

$ set -a
$ source ENV
$ printenv | grep DB
# prints all the vars
$ set +a

The set statement is counterintuitive: the parameter is enabled with a minus and disabled with a plus. This is an exception to remember.

If you read a variable that is already in the environment, it will replace the previous value. This way, files with overrides appear. If you need particular settings for your tests, you don’t have to copy the entire file. Create a file with the fields to be replaced and execute it after the main one.

Let the test settings of our program differ by the base name. The ENV file contains the main parameters, and in ENV_TEST we put a single pair DB_NAME=test. Let’s read both files and see how it turned out:

$ set -a
$ source ENV
$ source ENV_TEST
$ set +a

$ echo $DB_NAME

You can notice that using ENV files is contrary to the statement above. We said that variables remove the dependency on files, but in the end, we put them in a file. Why?

The difference between JSON and ENV files is what reads them. In the first case, the application does it, and in the second case, the shell or whatever tool launches the application. A file is located in a strictly defined directory, whereas environment variables are available from everywhere. We free the application from the code that looks for and reads the file. At the same time, we make it easier for our DevOps colleagues: they set variables differently depending on the tool (shell, Docker, Kubernetes). This makes the environment the main exchange point of all settings.

Environment Variables in Clojure

Clojure is a hosted language, so it relies on the host platform for access to system resources. There is no function for reading environment variables in clojure.core. Let’s get them from the java.lang.System class. You don’t need to import the class: it is available in any namespace.

The static getenv method will return either one variable by name or the entire map if no name is specified.

;; a single variable
(System/getenv "HOME")

;; all variables
(System/getenv)
{"JAVA_ARCH" "x86_64", "LANG" "en_US.UTF-8"} ;; truncated

In the second case, we got not a Clojure collection but a Java one. It is an instance of UnmodifiableMap class, so the variables cannot be changed after the JVM has started.

Let’s cast the map to the Clojure type to make it easier to work with it. At the same time, we will fix the keys: at the moment, these are uppercase strings with underscores. Clojure uses keywords and kebab-case: lowercase with hyphens.

Let’s write a function to convert a single key:

(require '[clojure.string :as str])

(defn remap-key [^String key]
  (-> key
      str/lower-case
      (str/replace #"_" "-")
      keyword))

and make sure that it works correctly:

(remap-key "DB_PORT")
;; :db-port

The remap-env function traverses the Java map and returns its Clojure version with keywords for keys:

(defn remap-env [env]
  (reduce
   (fn [acc [k v]]
     (let [key (remap-key k)]
       (assoc acc key v)))
   {}
   env))

Here is a small part of the map:

(remap-env (System/getenv))

{:home "/Users/ivan"
 :lang "en_US.UTF-8"
 :term "xterm-256color"
 :java-arch "x86_64"
 :term-program ""
 :shell "/bin/zsh"}

Now that we have a map of variables, it follows the same pipeline: type inference, validation with a spec. Since all values are strings, the spec needs to be modified so that it converts strings to proper types. Previously, there was no need for this because the numbers came from JSON. Let’s make a better spec that considers both number and string types for numeric values. A smart number parser looks like this:

(s/def ::->int
  (s/conformer
   (fn [value]
     (cond
       (int? value) value
       (string? value)
       (try (Integer/parseInt value)
            (catch Exception e
              ::s/invalid))
       :else ::s/invalid))))

With this spec, you can change the data source without editing the code.
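
A short REPL session shows that the same spec accepts both representations. The spec is repeated here under the throwaway name :demo/->int so that the snippet runs on its own.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def :demo/->int
  (s/conformer
   (fn [value]
     (cond
       (int? value) value
       (string? value)
       (try (Integer/parseInt ^String value)
            (catch Exception _
              ::s/invalid))
       :else ::s/invalid))))

(s/conform :demo/->int 8080)    ;; => 8080, a number from JSON
(s/conform :demo/->int "8080")  ;; => 8080, a string from the environment
(s/conform :demo/->int "80a")   ;; => :clojure.spec.alpha/invalid
(s/conform :demo/->int nil)     ;; => :clojure.spec.alpha/invalid
```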

Extra Keys Problem

The variable map has the disadvantage of many extraneous fields. The application doesn’t need to know the terminal version or the path to Python. These fields introduce noise during printing and logging. If the spec fails, we’ll see excessive data in explain.

In the last step of s/conform, you need to select only the useful data part from the map. The select-keys function will return a subset of another map with only the keys passed to the second argument. But where to get the keys? It takes a long time to list them manually, and besides, we duplicate the code. We have already specified the keys in the ::config spec, and we don’t want to do this a second time. We’ll use a trick to get the keys out of the spec.

The s/form function takes a spec key and returns the frozen form of whatever was passed to s/def. We will get a list where each item is a primitive or a collection of primitives (number, string, symbol, and others). For the ::config spec, we’ll get the following form:

(s/form ::config)

(clojure.spec.alpha/keys
 :req-un [:book.config/server_port
          :book.config/db
          :book.config/event])

Please note: this is a list, that is, data rather than code. The keys we need are in the vector that follows the :req-un keyword, the third item of the list. We should consider other types of keys as well, for example, :opt-un. Let’s write a universal function that will return all keys from the s/keys spec.

We’ll drop the first symbol of the form. That leaves a list, where the odd items are the type of keys, and the even ones are their vector. Let’s rebuild the list into a map and combine the values. For -un keys, discard the namespace. As a result of these actions, we get the function:

(defn spec->keys
  [spec-keys]
  (let [form (s/form spec-keys)
        params (apply hash-map (rest form))
        {:keys [req opt req-un opt-un]} params
        ->unqualify (comp keyword name)]
    (concat req
            opt
            (map ->unqualify opt-un)
            (map ->unqualify req-un))))

Let’s check the spec of our loader. Indeed, we get three keys:

(spec->keys ::config)
(:server_port :db :event)

Let’s rewrite reading variables into the map. In the last step, we select only those keys that we declared in our spec.

(defn read-env-vars []
  (let [cfg-keys (spec->keys ::config)]
    (-> (System/getenv)
        remap-env
        (select-keys cfg-keys))))

The advantage is that we managed to avoid repetitions. If a new field appears in ::config, the spec->keys function will automatically pick it up.

Environment Loader

Let’s modify the loader to work with environment variables. Replace the first two steps with read-env-vars. Now the program does not depend on the config file.

(defn load-config! []
  (-> (read-env-vars)
      coerce-config
      set-config!))

Make it so the data source can be specified using a parameter. For example, :source "/path/to/config.json" means read the file, and :source :env means environment variables.

An even more difficult problem is how to read both sources and combine them? Is the order important, and how to ensure it? How to combine maps asymmetrically, that is, when the second map only replaces the fields of the first one but does not add new fields?
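
As a hint for the last question, here is one way to express an asymmetric merge. The merge-existing name and the sample maps are hypothetical.

```clojure
(defn merge-existing
  "Merges override into base, ignoring the keys that base lacks."
  [base override]
  (merge base (select-keys override (keys base))))

(merge-existing {:port 8080 :host "localhost"}
                {:port 9090 :debug true})
;; => {:port 9090, :host "localhost"}
```

The order of arguments matters: the first map defines which fields exist, and the second one can only change their values.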

Inference of Structure

It rarely happens that a configuration is a flat dictionary. Parameters related by their meaning are placed in nested dictionaries; for example, server and database fields are separate. When the settings are in a group, they are easier to maintain. A good example is splitting config into pieces using {:keys [db server]} syntax. Each component of the system accepts the part of the same name as a mini config.

Let’s improve our loader: we will teach it to read nested variables. Let’s agree that double underscore means a level change. We’ll put the following variables in the ENV_NEST file:


Now read it and start the REPL with the new environment:

$ set -a
$ source ENV_NEST
$ lein repl

Let’s change the parsing of the environment. The remap-key-nest function takes a string key and returns a sequence of its constituent parts (lexemes):

(defn remap-key-nest
  [^String key]
  (-> key
      str/lower-case
      (str/replace #"_" "-")
      (str/split #"--")
      (->> (map keyword))))

(remap-key-nest "DB__PORT")
;; (:db :port)

Now we change the function that builds a map. For each name, we will get a vector of lexemes. Let’s add a value with assoc-in that produces a nested structure.

(defn remap-env-nest [env]
  (reduce
   (fn [acc [k v]]
     (let [key-path (remap-key-nest k)]
       (assoc-in acc key-path v)))
   {}
   env))

The code below will return the parameters grouped as expected. Here is a subset of them:

(-> (System/getenv)
    remap-env-nest
    (select-keys [:db :http]))

{:db {:user "ivan", :pass "****", :name "book"},
 :http {:port "8080", :host ""}}

Then we act as usual: write a spec, infer types from strings, and so on.

Think about setting an array in a variable. How to separate array elements? When is false splitting possible, and how to prevent it?

Simple Configuration Manager

At this point, you might decide that config in a file is a bad idea. However, don’t rush to rewrite your code with environment variables. In practice, hybrid models are used combining both approaches. The application reads basic parameters from a file, but passwords and API keys from the environment.

Let’s look at how to use both files and environments. A naive solution doesn’t require you to write any code: it runs on command-line utilities. The envsubst program from the “GNU gettext” package provides a simple templating system. To install gettext, run the command in a terminal:

$ <manager> install gettext

where <manager> is your system’s package utility (brew, apt, yum, and others).

The template text comes from stdin, and the environment variables are the context. The utility replaces the $VAR_NAME expressions with the values of the variables with the same names. Let’s put the template into the config.tpl.json file. The “tpl” part means a template.

{
    "server_port": $HTTP_PORT,
    "db": {
        "dbtype":   "mysql",
        "dbname":   "$DB_NAME",
        "user":     "$DB_USER",
        "password": "$DB_PASS"
    },
    "event": [
        "<start date>",
        "<end date>"
    ]
}

Note that the server port is not quoted because it is a number (line 2). Now we create an ENV_VARS file with the following content:

$ cat ENV_VARS

Let’s read them and render the template:

$ source ENV_VARS
$ cat config.tpl.json | envsubst

The substitution was successful:

{
    "server_port": 8080,
    "db": {
        "dbtype":   "mysql",
        "dbname":   "book",
        "user":     "ivan",
        "password": "*(&fd}A53z#$!"
    },
    "event": [
        "<start date>",
        "<end date>"
    ]
}

To write the result to a file, add an output statement to the end:

$ cat config.tpl.json | envsubst > config.ready.json

The envsubst method seems primitive, but it is useful in practice. The template frees you from worries about the structure: variables are in the right places, so no trouble with nesting.

Sometimes an application requires multiple config files, including one for infrastructure. You need to specify the same parameter in different files to make the programs work in concert. For example, Nginx requires a web server port for proxying. In Sendmail, you need to specify the same email address as in the application. It goes without saying that there should be a single data source, and a template render can be such a source.

The envsubst utility becomes the configuration manager. To automate the process, add a script that runs templates and renders them based on variables. It is not an enterprise-level solution, but it is suitable for simple projects.

Reading the Environment from Config

The following techniques make an application read parameters from file and environment simultaneously. The difference is at what step it happens.

Suppose we put the main parameters in a file, and the password for the database comes from the environment. Since such solutions are team-wide, agree among yourselves that the password field contains not a password, but a variable name, for example, "DB_PASS". Let’s write a spec that infers the variable value by its name:

(s/def ::->env
  (s/conformer
   (fn [varname]
     (or (System/getenv varname) ::s/invalid))))

If the variable is not set, conforming fails with an invalid result. For more control, trim the white space around the edges and make sure the string is not empty.

(s/def ::db-password
  (s/and ::->env
         (s/conformer str/trim)
         not-empty))

A quick test: run the REPL with the DB_PASS variable and read it using the spec:

DB_PASS='secret123' lein repl

(s/conform ::db-password "DB_PASS")

To move a field out of the file to the environment, replace its value with the variable name. Update the spec for this field: add ::->env to the beginning of the s/and chain.

Another way to read variables from a file is to expand it with tags. A tag is a short word that indicates that the meaning behind it is read in a certain way. YAML and EDN formats support tags. Libraries offer several basic ones for them. You can easily add your own tag.

In EDN, a tag starts with a hash sign and captures the next value. For example, #inst "2019-07-10" converts a string to a date. The tag is associated with a one-argument function that computes the final value from the initial one. To set your tag, pass a special map to the clojure.edn/read-string function. Its keys are symbols, and values are functions.

Add the #env tag that will return the value of the variable by name. The name can be a string or a symbol. Let’s define a function:

(defn tag-env
  [varname]
  (cond
    (symbol? varname)
    (System/getenv (name varname))
    (string? varname)
    (System/getenv varname)
    :else
    (throw (new Exception "Wrong variable type"))))

Now we’ll read the EDN line with the new tag:

(require '[clojure.edn :as edn])

(edn/read-string {:readers {'env tag-env}}
                 "{:db-password #env DB_PASS}")
;; {:db-password "secret123"}

To avoid passing the tags every time, let’s prepare the read-config function “charged” with the tags. We build it using partial. The new function accepts only a string:

(def read-config
  (partial edn/read-string
           {:readers {'env tag-env}}))

To parse a file with tags, read it into a string and pass it to read-config:

(-> "/path/to/config.edn"
    slurp
    read-config)

YAML tags start with one or two exclamation marks, depending on the semantics. Standard tags have two marks, while third-party tags have one. This way, when we run into a tag, we immediately understand its semantics.

The Yummy library offers a YAML parser that has useful tags. Among others, we are interested in the !envvar tag, which returns the value of a variable by name. Let’s describe the configuration in the config.yaml file:

server_port: 8080
db:
  dbtype:   mysql
  dbname:   book
  user:     !envvar DB_USER
  password: !envvar DB_PASS

Let’s add the library and read the file. In place of the tags, we get the environment values:

(require '[yummy.config :as yummy])

(yummy/load-config {:path "config.yaml"})

{:server_port 8080
 :db {:dbtype "mysql"
      :dbname "book"
      :user "ivan"
      :password "*(&fd}A53z#$!"}}

We’ll take a closer look at Yummy in the next section of the chapter.

Tags have both advantages and disadvantages. On the one hand, they make the config more concise: a line with a tag makes more sense. An expression like #env DB_PASS is shorter and more pleasing to the eye. Some libraries provide tags for complex types and classes.

On the other hand, tags make a config platform-specific. For example, the Python library fails to read the !envvar tag in the YAML file because this library does not have such a tag (more precisely, it does, but with a different name). Technically, this can be fixed: skip unfamiliar tags or install a stub. However, the approach does not guarantee the same results across platforms.
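The same "stub" idea can be shown on the EDN side. As a minimal sketch (read-skipping-tags is a name we made up), the :default option of clojure.edn/read-string accepts a two-argument function that is called for every tag without a registered reader. Here it drops the tag and keeps the raw value:

```clojure
(require '[clojure.edn :as edn])

(defn read-skipping-tags
  "Parses an EDN string. A tag without a registered reader is
  dropped, and the value behind it is returned unchanged.
  Hypothetical helper for illustration."
  [s]
  (edn/read-string {:default (fn [_tag value] value)} s))

(read-skipping-tags "{:db-password #env DB_PASS :port 8080}")
;; => {:db-password DB_PASS, :port 8080}
```

The file is readable now, but the password is a bare symbol instead of a secret. That is exactly the "different results across platforms" problem.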

With tags, a config is overgrown with side effects. In functional programming terms, it loses its purity. It is tempting to move too much logic into a tag: include a child file, format strings. Tags blur the line between reading a config and processing it. When there are too many of them, the configuration is difficult to maintain.

These two techniques, parsing with spec and parsing with tags, are alternatives. Choose the one that suits your team and process.

Overview of Formats

We have mentioned three data formats: JSON, EDN, and YAML. Let’s run through the features of each of them. Our goal is not to identify the ideal format but to prepare you for the unobvious moments that arise while working with these formats.


JSON

Even non-web developers are familiar with JSON. It is a data format based on JavaScript syntax. The standard’s basic types are numbers, strings, booleans, null, and two collections: the array and the object, which is treated as a map. The collections can be nested within each other.

The advantage of JSON is its popularity. Today it is the standard for exchanging data between client and server. It is easier to read and maintain than XML. Today’s editors, languages, and platforms work with JSON. It is the natural way to store data in JavaScript.

But JSON does not support comments. At first glance, this is a trifle, but in practice, comments are important to us. If you have added a new parameter, you should write a comment about what it does and what values it takes. Look at Redis, PostgreSQL, or Nginx configurations: more than half of each file is comments.

Developers have come up with tricks to get around this limitation in JSON. For example, put the same name field in front of the one to which the comment relates:

{
    "server_port": "A port for the HTTP server.",
    "server_port": 8080
}

We expect the library to walk through the fields in order, so that the second field replaces the first. However, the JSON standard neither guarantees the order of fields nor defines how duplicate keys are handled, so proceed at your own risk. A library may behave differently, for example, throw an exception or skip a key it has already processed.

Some programs carry their own JSON parser that supports comments. For example, Sublime Text editor stores settings in .json files with JavaScript comments (double slash). But there is no general solution to the problem.

The format does not support the tags we talked about above. There are the Cheshire and data.json libraries for working with JSON in Clojure. Both of them provide two main functions: to read and to write a document. You will find detailed examples on the GitHub pages of the projects.

JSON compares favorably with the verbose XML it replaces. JSON data looks cleaner and more convenient than a tag tree. But more modern formats express data even more clearly. In YAML, you can express any structure without a single bracket, thanks to indentation.

JSON syntax is noisy: it requires quotes, colons, and commas where other formats do without them. A comma at the end of an array or object is considered an error. Object keys must be strings, not numbers. A string cannot span multiple lines.

Compare the same data in JSON and YAML (the second listing). The YAML entry is shorter and easier to take in at a glance:

{
    "server_port": 8080,
    "db": {
        "dbtype":   "mysql",
        "dbname":   "book",
        "user":     "ivan",
        "password": "****"
    },
    "event": [
        "2019-07-05T12:00:00",
        "2019-07-12T23:59:59"
    ]
}

server_port: 8080
db:
  dbtype:   mysql
  dbname:   book
  user:     user
  password: '****'
event:
  - 2019-07-05T12:00:00
  - 2019-07-12T23:59:59


YAML

The YAML language, like JSON, has basic types: scalars, null, and collections. YAML focuses on conciseness: it expresses nesting with indentation rather than brackets. Commas are optional where the parser can infer them. An array of numbers written on a single line looks the same as in JSON:

numbers: [1, 2, 3]

But when the items are written in a column, the commas and square brackets disappear:

numbers:
  - 1
  - 2
  - 3

DevOps engineers like YAML because it supports Python-style comments (with hashes). Programs like Docker Compose and Kubernetes use YAML for configuration.

YAML allows you to write text across multiple lines. It is easier to read and copy than a single line with a newline character \n.

description: |
  To solve the problem, please do the following:

  - Press Control + Alt + Delete;
  - Turn off your computer;
  - Walk for a while.

  Then try again.

The language officially supports tags.

The cons of YAML stem from its pros. Indentation seems to be a good solution until the file gets too large. The gaze hops across the file to check if the structure levels are correct. Sometimes part of the data steps to the wrong level due to an unnecessary indent. In terms of YAML, there is no error, so it’s hard to find it.

Sometimes, missing quotes will result in incorrect types or structure. Suppose the phrases field lists phrases that a user will see:

phrases:
  - Welcome!
  - See you soon!
  - Warning: wrong email address.

Because of the colon in the last line, the parser treats that item as a nested map. As a result, we get the wrong structure:

{:phrases ["Welcome!"
           "See you soon!"
           {:Warning "wrong email address."}]}

Other examples: product version 3.3 is a number, but 3.3.1 is a string. Phone +79625241745 is a number because the plus sign is considered a unary operator by analogy with the minus. Leading zeros mean octal notation, so if you don’t add quotes to 000042, you’ll get 34.

This does not mean that YAML is a failed format. The cases above are described in the documentation and have a logical explanation. But sometimes YAML doesn’t behave the way you expect – it’s a price to pay for a simplified syntax.


EDN

The EDN format occupies a special place in our review. It is as close as possible to Clojure and therefore plays the same role in the language as JSON in JavaScript. It is the Clojure-native way to store data in a file.

EDN syntax is almost identical to the language grammar. The format covers more types than JSON and YAML. It contains scalars such as symbols and keywords (the Symbol and Keyword classes from the clojure.lang package). In addition to vectors and maps, EDN offers lists and sets. Maps can be typed to allow creating defrecord instances upon reading. We will talk more about records in the chapter on systems.

A tag starts with a hash character. The standard offers two tags by default: #inst and #uuid. The former reads a string into a date and the latter into a java.util.UUID instance. Above, we showed how to add your own tag: you need to bind it to a one-argument function when reading a line.

Here’s an example with different types, collections, and tags:

{:user/banned? false
 :task-state #{:pending :in-progress :done}
 :account-ids [1001 1002 1003]
 :server {:host "" :port 8080}
 :date-range [#inst "2019-07-01" #inst "2019-07-31"]
 :cassandra-id #uuid "26577362-902e-49e3-83fb-9106be7f60e1"}
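To see which Java types come out of such a document, here is a small check. It only relies on clojure.edn and its two default tags; the sample string is a shortened version of the example above:

```clojure
(require '[clojure.edn :as edn])

(def sample
  (edn/read-string
   "{:task-state #{:pending :done}
     :date #inst \"2019-07-01\"
     :id #uuid \"26577362-902e-49e3-83fb-9106be7f60e1\"}"))

(set? (:task-state sample))               ;; => true
(instance? java.util.Date (:date sample)) ;; => true
(instance? java.util.UUID (:id sample))   ;; => true
```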

In EDN, data does not differ from code. If you copy them to the REPL or a module, the compiler will execute them. Conversely, the REPL output can be written to a file for further work.

Saving data to EDN means serializing it to a string and writing that string to a file. The function pr-str returns the string that would appear in the console if you printed the object. The code below creates a file dataset.edn with the data:

(-> {:some ["data"]}
    pr-str
    (->> (spit "dataset.edn")))

The opposite action is to read the file and parse the code in Clojure using edn/read-string:

(require '[clojure.edn :as edn])

(-> "dataset.edn" slurp edn/read-string)
;; {:some ["data"]}
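Both directions can be wrapped into a pair of helpers. This is only a sketch: save-edn and load-edn are names we made up, and the printer variables are rebound because a REPL that limits *print-length* or *print-level* would otherwise silently truncate the saved data:

```clojure
(require '[clojure.edn :as edn])

(defn save-edn
  "Writes `data` to `path` as EDN. Hypothetical helper."
  [path data]
  (binding [*print-length* nil
            *print-level* nil]
    (spit path (pr-str data))))

(defn load-edn
  "Reads EDN data back from `path`. Hypothetical helper."
  [path]
  (edn/read-string (slurp path)))

(let [file (java.io.File/createTempFile "dataset" ".edn")
      data {:some ["data"] :ids #{1 2 3}}]
  (save-edn file data)
  (= data (load-edn file)))
;; => true
```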


EDN supports more than just regular comments. The #_ tag ignores any item following it, including the collection. If you need to “ignore” a map that spans several lines, put #_ in front of it, and the parser will skip it.


This way, you can disable entire sections of the configuration. In the following example, we ignore the third element of the vector. If you put a regular comment (semicolon) on that line, it would also comment out the closing brackets, and the expression would become invalid.

{:users [{:id 1 :name "Ivan"}
         {:id 2 :name "Juan"}
         #_{:id 3 :name "Huan"}]}
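A quick check with clojure.edn confirms that the reader drops the discarded map and still sees the closing brackets:

```clojure
(require '[clojure.edn :as edn])

(def users
  (edn/read-string
   "{:users [{:id 1 :name \"Ivan\"}
             {:id 2 :name \"Juan\"}
             #_{:id 3 :name \"Huan\"}]}"))

(count (:users users))
;; => 2
```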

EDN is closely related to Clojure and, therefore, is not popular in other languages. Editors don’t highlight its syntax without plugins. EDN will be a challenge for DevOps engineers who mostly work with JSON and YAML. If your configuration is processed with Python or Ruby scripts, you will have to install a library to work with the EDN format.

Choose EDN where Clojure prevails over other technologies. It is the right choice when both the backend and the frontend run on the same Clojure(Script) stack.

Industrial Solutions

It is important to understand how configuration works, but we don’t encourage you to write a loader from scratch every time you start a new project. In the final section, we’ll take a look at what the community provides for configuration handling. We’ll focus on Cprop, Aero, and Yummy. These libraries differ in ideology and architecture. We have deliberately selected them to see the problem from different angles.


Cprop

The Cprop library works on the principle of “data from everywhere”. Unlike our loader, Cprop understands more sources. The library can read not only a file and environment variables but also resources, property files, and ordinary maps.

The library has a preset order of walking through sources and their priority. Fields from one source replace others. For example, environment variables are considered more important than a file. In Cprop, you can easily set your own loading order for special cases.

We are interested in the load-config function. If you call it without any parameters, it will start the standard loader. By default, it looks for two data sources: a resource and a property file. This resource must be named config.edn. If the system property conf is not empty, the library assumes that this is the property file path and loads it.

Properties are Java runtime variables, similar to the system environment. When loaded, JVM receives the default properties: operating system type, line separator, and others. Additional properties are set with the -D parameter when starting. The example below runs a jar file with a conf property:

$ java -Dconf="/path/to/" -jar project.jar

The .properties files are field=value pairs, one per line. Fields are like domains: they are lexemes separated by dots. Lexemes follow in descending order of priority:


The library treats dots as nested maps. The file above will return the following structure:

{:db {:type "mysql"
      :host ""
      :pool {:connections 8}}}
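The mapping from dotted names to nested maps is easy to reproduce. The sketch below is not Cprop’s code; props->map is a hypothetical function that only shows the idea of splitting a name on dots and feeding the parts to assoc-in. Unlike Cprop, it leaves all values as strings:

```clojure
(require '[clojure.string :as str])

(defn props->map
  "Turns a flat map of dotted property names into a nested map.
  Illustration only; values are left untouched."
  [props]
  (reduce-kv
   (fn [acc k v]
     (assoc-in acc (mapv keyword (str/split k #"\.")) v))
   {}
   props))

(props->map {"db.type" "mysql"
             "db.pool.connections" "8"})
;; => {:db {:type "mysql", :pool {:connections "8"}}}
```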

After receiving the configuration, Cprop looks for overriding in the environment variables. For example, the variable DB__POOL__CONNECTIONS=16 will replace the value 8 in the nested map. Cprop ignores variables that are not part of the config and thus keeps it tidy.
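The override step can be sketched in a few lines as well. Again, this is our approximation of the behavior described above, not Cprop’s implementation. The function names are hypothetical, the convention assumed is “double underscore means nesting, single underscore becomes a dash”, and values stay strings:

```clojure
(require '[clojure.string :as str])

(defn env-name->path
  "DB__POOL__CONNECTIONS -> [:db :pool :connections]"
  [env-name]
  (mapv (fn [part]
          (-> part str/lower-case (str/replace "_" "-") keyword))
        (str/split env-name #"__")))

(defn apply-env
  "Overrides values in `config` with variables from the `env` map.
  Variables that don't match an existing path are ignored."
  [config env]
  (reduce-kv
   (fn [acc env-name value]
     (let [path (env-name->path env-name)]
       (if (and (seq path) (some? (get-in acc path)))
         (assoc-in acc path value)
         acc)))
   config
   env))

(apply-env {:db {:type "mysql" :pool {:connections 8}}}
           {"DB__POOL__CONNECTIONS" "16"
            "HOME" "/home/ivan"})
;; => {:db {:type "mysql", :pool {:connections "16"}}}
```

For the real environment, pass (into {} (System/getenv)) as the second argument.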

Non-standard paths to the resource and file are specified with the keys:

(load-config
 :resource "private/config.edn"
 :file "/path/custom/config.edn")

For fine-grained control, Cprop offers the cprop.source module. Its from-env function reads all environment variables, from-props-file loads a property file, and so on. With this module, it is easy to build the combination that the project needs.

The :merge key unites the config with any other sources. Its value is a sequence of expressions, each returning a map. Here is a detailed example from the documentation:

(load-config
 :resource "path/within/classpath/to.edn"
 :file "/path/to/some.edn"
 :merge [{:datomic {:url "datomic:mem://test"}}
         (from-file "/path/to/another.edn")
         (from-resource "path/within/classpath/to-another.edn")
         (from-props-file "/path/to/")])

To track loading, set the DEBUG=y environment variable. With it, Cprop displays service information: a list of sources, loading order, overrides, and so on.

Cprop only reads data from sources but doesn’t validate it. There is no validation with a spec in the library, as it is done in our loader. The step is up to you.

The library casts types its own way. If the string contains only digits, it is converted to a number. Comma-separated values become lists. Sometimes these rules are not enough for complete type control. Thus, Spec and s/conform are still useful for error reporting and type inference.
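As a minimal sketch of that last step, here is a spec for the nested map from the examples above. The spec names and the allowed values are our assumptions; the point is only that a value left as a string is caught by validation:

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical specs for the :db section of the config
(s/def ::type #{"mysql" "postgresql"})
(s/def ::connections (s/and int? #(<= 1 % 64)))
(s/def ::pool (s/keys :req-un [::connections]))
(s/def ::db (s/keys :req-un [::type ::pool]))
(s/def ::config (s/keys :req-un [::db]))

(s/valid? ::config {:db {:type "mysql" :pool {:connections 8}}})
;; => true

(s/valid? ::config {:db {:type "mysql" :pool {:connections "8"}}})
;; => false
```

When validation fails, s/explain-str returns a report that names the failing path and predicate.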


Aero

Aero works with EDN files. The library offers tags that make the format look like a mini programming language with branching, import, and formatting operators. This approach can be figuratively called “EDN on steroids”.

The read-config function reads an EDN file or resource:

(require '[aero.core :refer (read-config)])

(read-config "config.edn")
(read-config (clojure.java.io/resource "config.edn"))

Tags are the main point in Aero, so let’s take a look at the main ones. The familiar #env one discovers the value of a variable by its name:

{:db {:password #env DB_PASS}}

The #envf tag formats a string using environment variables. Let’s say the connection to the database consists of separate fields, but you prefer the JDBC URI, a long string that looks like a web address. In order not to duplicate data, the address is composed from the original fields:

{:db-uri #envf ["jdbc:postgresql://%s/%s?user=%s"
                DB_HOST DB_NAME DB_USER]}

The #or tag is similar to its Clojure counterpart and is needed for default values. Suppose no database port is specified in the file. In this case, let’s specify the standard PostgreSQL port:

{:db {:port #or [#env DB_PORT 5432]}}

Note that the value for the #or tag is always a vector or a list. Also, the example above introduces nested tags (#env inside #or).

The #profile tag allows you to choose a value by profile. The value behind the tag must be a map. Each key is a profile name, and each value is what the tag resolves to for that profile. The profile is set in the options of read-config.

The example below shows how to find the database name by profile. Without a profile, we get the “book” name, but for :test, it becomes “book_test”.

{:db {:name #profile {:default "book"
                      :dev     "book_dev"
                      :test    "book_test"}}}

(read-config "aero.test.edn" {:profile :test})
{:db {:name "book_test"}}
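To make the mechanics less magical, here is how a similar tag could be built with plain clojure.edn. This is a sketch of the idea, not Aero’s implementation; read-with-profile is a name we made up. The reader function looks up the profile in the map and falls back to :default:

```clojure
(require '[clojure.edn :as edn])

(defn read-with-profile
  "Parses an EDN string, resolving #profile maps for `profile`.
  Illustration only."
  [profile s]
  (edn/read-string
   {:readers {'profile (fn [m] (get m profile (:default m)))}}
   s))

(def config-text
  "{:db {:name #profile {:default \"book\"
                         :dev     \"book_dev\"
                         :test    \"book_test\"}}}")

(read-with-profile :test config-text)
;; => {:db {:name "book_test"}}

(read-with-profile nil config-text)
;; => {:db {:name "book"}}
```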

The #include tag puts another EDN file in the config. The file can also contain tags, and the library will execute them recursively. We use imports when the configuration becomes too large or there is a need to share its parts across multiple projects.

{:queue #include "message-queue.edn"}

The #ref tag refers to any part of the configuration file. Its value is a vector of keys, like the one you would pass to get-in. A reference lets you avoid duplication. For example, a background task component needs the user we specified for the database. In order not to copy it, let’s add a reference:

;; config.edn
{:db {:user #env DB_USER}
 :worker {:user #ref [:db :user]}}

When reading a file, the reference resolves to the value:

{:db {:user "ivan"}, :worker {:user "ivan"}}
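References need two passes: first read the data, then replace each reference with the value it points to. The sketch below shows this idea with clojure.edn and clojure.walk. It is our simplification rather than Aero’s code: the Ref record and read-with-refs are hypothetical, a literal user name stands in for #env, and references that point to other references are not resolved:

```clojure
(require '[clojure.edn :as edn]
         '[clojure.walk :as walk])

;; Marker that remembers the path until the whole map is read
(defrecord Ref [path])

(defn read-with-refs
  "Parses an EDN string and resolves #ref vectors with get-in."
  [s]
  (let [raw (edn/read-string {:readers {'ref ->Ref}} s)]
    (walk/postwalk
     (fn [node]
       (if (instance? Ref node)
         (get-in raw (:path node))
         node))
     raw)))

(read-with-refs
 "{:db {:user \"ivan\"}
   :worker {:user #ref [:db :user]}}")
;; => {:db {:user "ivan"}, :worker {:user "ivan"}}
```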

Aero offers a simple configuration language. The library entices developers with the beauty of its idea and implementation. But the moment you feel like moving from inflexible JSON to Aero, think about the other side of the coin.

We do not accidentally separate config from code. If it weren’t for the industry’s need, we would store the parameters in the source files. But best practices, on the contrary, advise separating parameters from the code. This is also because, unlike code, the configuration is declarative.

Inflexible JSON files have an important feature: they are declarative. If you open a file or run cat on it, you will see the data. The syntax may be awkward, but data is self-explanatory, and there is only one way to read it.

On the contrary, a file with an abundance of tags is hard to read. It is not a config but code. To see the final data, you have to execute the file. When reading a file, your head runs a mini interpreter, which does not guarantee the correct result.

It turns out to be a kind of vicious circle: we moved the parameters into the config, added tags, and returned to the code. The approach has the right to exist, but you should choose it after weighing the pros and cons.


Yummy

The Yummy library closes the overview. It differs from the libraries discussed above in two ways. First, it works with YAML files to read a config (hence the name). Second, the loading process is similar to the one we covered at the beginning of the chapter.

A fully featured loader does more than just read parameters. The cycle includes data validation and error output. The message clearly explains the cause of the error. Using options, you can set a reaction to an exception that occurred while working. Yummy offers all of the above.

The file path is either set explicitly in the parameters, or the library searches for it according to special rules. Here is the variant with an explicit path:

(require '[yummy.config :refer [load-config]])

(load-config {:path "/path/to/config.yaml"})

In the second case, we specify the name of the project instead of the path. Yummy looks for the file path in the <project>_CONFIGURATION environment variable or the <project>.configuration property:

$ export BOOK_CONFIGURATION=config.yaml
(load-config {:program-name :book})

The library extends YAML with several tags. One is the familiar !envvar for environment variables:

db:
  password: !envvar DB_PASS

The !keyword tag is useful for converting a string to a keyword:

states:
  - !keyword task/pending
  - !keyword task/in-progress
  - !keyword task/done

Here is the result:

{:states [:task/pending :task/in-progress :task/done]}

The !uuid tag is similar to the #uuid one for EDN; it returns the java.util.UUID object from a string:

system-user: !uuid cb7aa305-997c-4d53-a61a-38e0d8628dbb

The !slurp tag reads the file, which is useful for encryption certificates. Their content is a long string that is inconvenient to store in a general configuration. The :auth, :cert, and :pkey keys will hold the contents of the files from the certs directory.

  auth: !slurp "certs/ca.pem"
  cert: !slurp "certs/cert.pem"
  pkey: !slurp "certs/key.pk8"

To check the configuration, pass the :spec key in the load-config parameters. When the key is specified, Yummy executes s/assert with the data from the file. If the validation fails, an exception is thrown. To make spec validation reports easier to read, Yummy uses Expound.

(load-config {:program-name :book
              :spec ::config})

An options map takes the :die-fn parameter. It is a function that will run if any stage fails. The function takes an exception and a label with a stage name.

If :die-fn is not specified, Yummy will call the default handler. It prints the text to stderr and exits the JVM with code 1. During the development phase, we do not want to terminate the REPL due to a config error. In an interactive session, our die-fn only prints the text and the error:

(load-config
 {:program-name :book
  :spec ::config
  :die-fn (fn [e msg]
            (binding [*out* *err*]
              (println msg (ex-message e))))})

In production mode, write the exception to the log and exit the program.

(load-config
 {:program-name :book
  :spec ::config
  :die-fn (fn [e msg]
            (log/error e "Config error" msg)
            (System/exit 1))})

One note about the s/assert macro which Yummy uses for validation. This macro does not coerce values, as s/conform does, but only throws an exception. This is done on purpose: types are coerced by tags, and the spec only validates them.


Summary

Let us briefly outline the main points of this chapter. The configuration is necessary for the project to go through the production stages: development, testing, release. At each step, the project is launched with different settings. This is not possible without configuration.

Loading configuration means reading data, inferring types, and validating values. In case of an error, the program displays a message and exits with a non-zero code. It cannot continue working with invalid parameters.

Configuration sources can be a file, a resource, or environment variables. There are hybrid schemes when most of the data come from the file and secret fields from the environment.

Environment variables live in operating system memory. When there are many of these variables, we can place them in the ENV file. An application does not read it; this is done by a script that controls the app on the server. The application does not know where the variables come from.

The environment is a flat map. Variables store only text; there is no nesting or namespace in keys. Different systems have different conventions on how to extract a structure from a variable name. Dots, double underscores, or something else can be used.

Data formats differ in syntax and types. General-purpose formats define strings, numbers, maps, and lists. They are not very flexible, but they work everywhere. On the contrary, the platform-specific data format is closely tied to the platform but is unpopular in other languages.

Some formats support tags. Use them to describe complex types with primitives: strings and numbers. Tags are also helpful for pre-processing a document, for example, to import its nested parts. The danger of tags is: when there are too many, config turns into code.

Clojure offers several libraries for configuring applications. They differ in design and architecture, and each developer will find what they like. There is no definite answer to the question of which format or library is better. Choose what will solve your problem most cheaply.


Clojure developer


Ardoq | flexible remote in/around Oslo. Visa sponsorship is possible

Ardoq is a fast-growing technology company in Norway with offices in London, New York, and Copenhagen. Our Graph platform enables our customers to map dependencies across strategic objectives, projects, applications, processes, and people, allowing them to assess the impact of change and make better and faster decisions.

Our company is backed by a solid commitment from investors and a majority of our employees are also shareholders. We're growing rapidly, and are looking for candidates to help scale our engineering team.

Ardoq's engineering team is a highly skilled group of people who like solving challenging problems, value feedback, continuous delivery, and automation of repetitive tasks. We maintain a high iteration speed through a focus on code quality and incremental change, not artificial deadlines. We believe working closely together and supporting each other is important if we are to achieve a common goal.

Who we're looking for

We're looking for caring, driven, and quality-focused engineers who can collaborate across the organization. You should have a learning and sharing mindset. That is, wanting to learn new things and being open and sharing your knowledge. As the company develops we implement our lessons learned and adapt to change. You should be proactive and take ownership.

We believe in finding people with the right qualities and skills rather than finding a person with the right degree. A BS/MS degree in Computer Science, Engineering, or related subject is always good but it's not a prerequisite.

You should have a good knowledge of web technologies and an interest in working with a functional language, as Clojure is our primary back-end language.

We think it's a plus if you consider yourself a full stack developer and don't mind getting your hands dirty. Since JavaScript/TypeScript and Clojure are quite different, we don't expect you to be an expert in both, but it is good to have an understanding of the other side.


You'll be an integral part of the engineering team. This means both working with greenfield feature development, finding bugs, and focusing on continuous quality improvement. There's also the possibility of helping on cloud infrastructure, automation, and developer tooling depending on your personal interests.

Our best work is done in an environment of mutual trust and support. We share knowledge and value diversity. We are proactive and volunteer our best effort every day. If we see a problem, we fix a problem.

What we can offer you

Ardoq's values are Bold, Caring and Driven. Living by these values is part of what makes Ardoq a great place to work. We make bold decisions that push the product, ourselves and our customers forward. We voice our opinions, have difficult conversations, disagree, and learn. We take care of both our colleagues and our customers and empathize with the challenges they face every day.

We also offer many benefits including investment opportunities for employees and generous parental leave schemes.

Although we have offices in Oslo, London, New York, and Copenhagen, we embrace remote work and flexible schedules.

If you identify with this, we can offer you a really great place to work.

Work language

English & Clojure. Although 44% of us are based at the Oslo headquarters, we are an international team representing many countries and languages.

