Pragmatic Programmer [Book Notes]

Pragmatic philosophy

  • It's your life
    • Craftsmanship
    • Early adopter
  • Responsibility
    • Offer options
  • Software Entropy
    • Simplicity
    • Maintenance
  • Good enough software
    • Quality is a requirement
  • Your knowledge portfolio:
    • Investment in knowledge always pays the best interest
    • Read nontechnical books
    • Read conceptual books
    • Learn one new language every year
  • Communicate
  • Testability is key

Pragmatic approach

  • The essence of good design: ETC (Easier to Change)
  • DRY: Code, Data, Documentation (Knowledge)
    • Don't abstract too early; wait until you have copied and pasted a couple of times. Examples are needed to create good abstractions
  • Orthogonality:
    • Eliminate effects between unrelated things
    • Understandable, and easier to debug, test and maintain
    • Design patterns
    • SOLID
    • Prefer composition and FP languages
  • Reversibility:
    • Flexible architecture
    • Have options
  • Tracer bullets:
    • Code lean and complete
    • Find the target
  • Prototypes and post-it notes:
    • Information gathering
    • Is coupling minimized?
    • Are collaborations between components well-defined?
    • Are responsibilities well-defined?
    • Are interfaces and data clear and available?
  • Domain languages:
    • Program close to the problem domain
  • Estimation:
    • "I'll get back to you"
    • Optimistic, most likely, and pessimistic scenarios
    • Model building: ask someone who has already done it

Basic Tools

Be more productive with your tools

  • The power of plain text:
    • Self describing data
  • Shell games
  • Power Editing
  • Debugging skills
    • localhost test
    • Explain to someone else
  • Text manipulation
    • Unix: sed, awk
    • Scripting languages: Python
  • Engineering daybooks

Pragmatic Paranoia

Validate all the information we're given, assert against bad data, and distrust data from potential attackers

  • Design by contract
    • Preconditions, postconditions: Clojure Specs
  • Dead programs tell no lies
    • Crash early
    • Defensive programming is a waste of time; let it crash! (Supervisor)
  • Assertive programming
    • Use assertions to prevent the impossible
  • How to balance resources
    • Release resources once you are done with them
  • Don't outrun your headlights
    • take small steps always

Bend or break

Make our code as flexible as possible. A good way to stay flexible is to write less code

  • Decoupling
    • Allow flexibility
    • Shy code that promotes cohesion
    • Law of Demeter: Depend on abstractions
    • Avoid global data
    • Avoid inheritance
  • Juggling the real world
    • Events
    • Finite state machine
    • Observer
    • Publish/Subscribe (Channels)
    • Reactive Streams
  • Transforming programming
    • Think of programs as input, output, and transformations of data
    • Process of data
    • find . -name '*.java' | xargs wc -l | sort -n | tail -11 | head -10
    • Programming is about code but programs are about data
  • Inheritance tax
    • Coupling
    • Interfaces to express polymorphism
  • Configuration
    • Parameterize your app using external configuration
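
The shell pipeline in the list above can be mirrored in code. This is a minimal JavaScript sketch, assuming the line counts are already in memory; `files` and `largestFiles` are hypothetical names used only for illustration.

```javascript
// Hypothetical input: what `wc -l` might report for a few source files.
const files = [
  { name: 'Main.java', lines: 120 },
  { name: 'Util.java', lines: 45 },
  { name: 'Parser.java', lines: 310 },
  { name: 'Lexer.java', lines: 205 },
]

// Each step takes data in and hands transformed data to the next step,
// much like the stages of the shell pipeline.
function largestFiles(entries, count) {
  return [...entries]                     // copy so the input is not mutated
    .sort((a, b) => a.lines - b.lines)    // like `sort -n`
    .slice(-count)                        // like `tail`: keep the largest ones
    .map((entry) => `${entry.lines} ${entry.name}`)
}

console.log(largestFiles(files, 2))
// => [ '205 Lexer.java', '310 Parser.java' ]
```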

Concurrency

  • Concurrency: Two or more pieces of code act as if they run at the same time, using fibers, threads, or processes
  • Parallelism: Hardware that can do two things at once
  • Breaking temporal coupling
  • Avoid shared state
  • Actors and processes
  • Blackboards
    • Communication using Kafka or other streaming services

While you are coding

  • Listen to your lizard brain
    • Give your brain time and space to organize your ideas
  • Algorithm speed
  • Refactoring
    • Rethink
    • Gardening
    • Unit test
    • (Duplication, Not DRY, Bad performance, Outdated knowledge, Tests passing, Nonorthogonal)
    • Redesign
    • Refactor early and often; it is like surgery
  • Test the code
    • Feedback
    • Improve design
    • Embrace TDD
  • Property-based testing
  • Security
    • Authentication
    • I/O data
    • Principle of least privilege
    • Up to date
    • Encrypt sensitive information
  • Naming
    • In programming, everything has a name, and names reveal the intent and beliefs of the system
    • Communication

Before the project

  • Requirements Pit
    • Users don't know exactly what they want
    • Our job is to help the business understand what it wants
    • Improve the feedback loop
    • BDUF is not a good thing
    • Work with the user to think like one
  • Solving the puzzle
    • Think out of the box
    • Make time to think in the unfocused mode
  • Working together
    • Pair programming
    • Mob programming
  • Agile
    • It's about values, context, and feedback loop

Pragmatic Teams

  • Pragmatic Teams
    • No broken windows
    • Be aware of the environment and health of the project
    • DRY
    • Small teams
    • Cross-functional teams (tracer bullets)
    • Automation
    • Create an identity (team name)
    • Schedule time to improve knowledge portfolio
  • Context
    • Use the right tools and practices
    • Software delivery (when release flow is slow, status meetings are frequent)
    • Kanban
  • The programmer starter kit
    • Version control
    • Ruthless testing
      • Unit, Integration, Component, Performance
      • If modules don't work well as a unit, they won't work well as a system
      • Saboteurs: Chaos engineering
  • Automate everything
    • Software delivery is fully automated
  • Delight your users
    • What are your expectations
    • Deliver quality not code
  • Pride and prejudice
    • Code that you feel proud of
    • Collective ownership

What the Reagent Component?!

Did you know that when you write a form-1, form-2 or form-3 Reagent component they all default to becoming React class components?

For example, if you were to write this form-1 Reagent component:

(defn welcome []
  [:h1 "Hello, friend"])

By the time Reagent passes it to React it would be the equivalent of you writing this:

class Welcome extends React.Component {
  render() {
    return <h1>Hello, friend</h1>
  }
}

Okay, so, Reagent components become React Class Components. Why do we care? This depth of understanding is valuable because it means we can better understand:

The result of all of this "fundamental" learning is we can more effectively harness JavaScript from within ClojureScript.

A Pseudoclassical Pattern

All of your Reagent components become class components because all of the code you pass to Reagent is run through an internal Reagent function called create-class. The interesting part of this is how create-class uses JavaScript mechanics to transform the Reagent component you wrote into something that is recognized as a React class component. Before we look into what create-class is doing, it's helpful to review how "classes" work in JavaScript.

Prior to ES6, JavaScript did not have classes, and this made some JS developers sad because classes are a common pattern used to structure one's code and provide mechanisms for:

  • instantiation
  • inheritance
  • polymorphism

But as I said, prior to ES6, JavaScript did not have a formal syntax for "classes". This led the JavaScript community to develop a series of instantiation patterns to help simulate classes.

Of all of these patterns, the pseudoclassical instantiation pattern became one of the most popular ways to simulate a class in JavaScript. This is evidenced by the fact that many of the "first generation" JavaScript libraries and frameworks, like Google Closure Library and Backbone, are written in this style.

We are going over this history because a programming language has both "patterns" and "syntax". The challenge with "patterns" is:

  • They are disseminated culturally
  • They are often not easy to search
  • They often require a deeper understanding of the language and problem being solved to understand why the pattern became accepted.

The last point is relevant to our conversation because patterns ultimately make assumptions: assumptions about our understanding of the problem being solved, and about where and when the pattern itself should be used. The end result is that a pattern can just become a "thing" we do, all while forgetting why we started to do it in the first place or what the world could look like without it.

For example, the most common way of writing a React class component is to use ES6 class syntax. But did you know that ES6 class syntax is little more than syntactic sugar around the pseudoclassical instantiation pattern?

For example, you can write a valid React class component using the pseudoclassical instantiation pattern like this:

// 1. define a function (component) called `Welcome`
function Welcome(props, context, updater) {
  React.Component.call(this, props, context, updater)

  return this
}

// 2. connect `Welcome` to the `React.Component` prototype
Welcome.prototype = Object.create(React.Component.prototype)

// 3. re-define the `constructor`
Object.defineProperty(Welcome.prototype, 'constructor', {
  enumerable: false,
  writable: true,
  configurable: true,
  value: Welcome,
})

// 4. define your React component's `render` method
Welcome.prototype.render = function render() {
  return <h2>Hello, Reagent</h2>
}

While the above is a valid React Class Component, it's also verbose and error prone. For these reasons JavaScript introduced ES6 classes to the language:

class Welcome extends React.Component {
  render() {
    return <h1>Hello, Reagent</h1>
  }
}

For those looking for further evidence, we can support the claim that ES6 classes produce the same thing as the pseudoclassical instantiation pattern by using JavaScript's built-in introspection tools to compare the two.

Pseudoclassical instantiation pattern:

function Welcome(props, context, updater) {
  React.Component.call(this, props, context, updater)

  return this
}

// ...repeat steps 2 - 4 from above before completing the rest

var welcome = new Welcome()

Welcome.prototype instanceof React.Component
// => true

Object.getPrototypeOf(Welcome.prototype) === React.Component.prototype
// => true

welcome instanceof React.Component
// => true

welcome instanceof Welcome
// => true

Object.getPrototypeOf(welcome) === Welcome.prototype
// => true

React.Component.prototype.isPrototypeOf(welcome)
// => true

Welcome.prototype.isPrototypeOf(welcome)
// => true

ES6 class:

class Welcome extends React.Component {
  render() {
    console.log('ES6 Inheritance')
  }
}

var welcome = new Welcome()

Welcome.prototype instanceof React.Component
// => true

Object.getPrototypeOf(Welcome.prototype) === React.Component.prototype
// => true

welcome instanceof React.Component
// => true

welcome instanceof Welcome
// => true

Object.getPrototypeOf(welcome) === Welcome.prototype
// => true

React.Component.prototype.isPrototypeOf(welcome)
// => true

Welcome.prototype.isPrototypeOf(welcome)
// => true

What does all of this mean? As far as JavaScript and React are concerned, both definitions of the Welcome component are valid React Class Components.
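
If you want to run the same checks in plain Node without installing React, the comparison can be sketched with a stand-in. In the code below, `Base` is a hypothetical substitute for React.Component, and `PseudoWelcome`, `ClassWelcome` and `inspect` are names invented for this sketch.

```javascript
// `Base` is a hypothetical stand-in for React.Component.
function Base(props) {
  this.props = props
}

// Pseudoclassical version: call the parent, then link the prototypes.
function PseudoWelcome(props) {
  Base.call(this, props)
}
PseudoWelcome.prototype = Object.create(Base.prototype)
Object.defineProperty(PseudoWelcome.prototype, 'constructor', {
  enumerable: false,
  writable: true,
  configurable: true,
  value: PseudoWelcome,
})
PseudoWelcome.prototype.render = function render() {
  return 'Hello, Reagent'
}

// ES6 version of the same component.
class ClassWelcome extends Base {
  render() {
    return 'Hello, Reagent'
  }
}

// Run the same introspection checks against either definition.
function inspect(Ctor) {
  const instance = new Ctor({})
  return [
    typeof Ctor === 'function',
    Ctor.prototype instanceof Base,
    Object.getPrototypeOf(Ctor.prototype) === Base.prototype,
    instance instanceof Base,
    Object.getPrototypeOf(instance) === Ctor.prototype,
  ]
}

console.log(inspect(PseudoWelcome)) // every entry is true
console.log(inspect(ClassWelcome))  // every entry is true
```

One thing the sketch does not check: `extends` also links the constructor functions themselves, so static members are inherited too. The hand-written version above skips that step.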

With this in mind, let's look at Reagent's create-class function and see what it does.

What Reagent Does

The history lesson from the above section is important because create-class uses a modified version of the pseudoclassical instantiation pattern. Let's take a look at what we mean.

The following code sample is a simplified version of Reagent's create-class function:

function cmp(props, context, updater) {
  React.Component.call(this, props, context, updater)

  return this
}

goog.extend(cmp.prototype, React.Component.prototype, classMethods)

goog.extend(cmp, React.Component, staticMethods)

cmp.prototype.constructor = cmp

What we have above is Reagent's take on the pseudoclassical instantiation pattern with a few minor tweaks:

// 1. we copy the properties + methods of React.Component
goog.extend(cmp.prototype, React.Component.prototype, classMethods)

goog.extend(cmp, React.Component, staticMethods)

// 2. the constructor is not as "thorough"
cmp.prototype.constructor = cmp

Exploring point 1, we see that Reagent has opted to copy the properties and methods of React.Component directly to the Reagent components we write. That is what's happening here:

goog.extend(cmp.prototype, React.Component.prototype, classMethods)

If we were using the traditional pseudoclassical approach we would instead do this:

cmp.prototype = Object.create(React.Component.prototype)

Thus, the difference is that Reagent's approach copies all the methods and properties from React.Component to the cmp prototype, whereas the second approach links the cmp prototype to the React.Component prototype. The benefit of linking is that React.Component's methods and properties live in one place and are found through the prototype chain, so they do not need to be copied onto the prototype of every component class you create.
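
To see copying versus linking without pulling in React or the Closure Library, here is a minimal sketch. `Base` is a hypothetical stand-in for React.Component, and `Object.assign` stands in for the copying step; neither is Reagent's actual code.

```javascript
// `Base` is a hypothetical stand-in for React.Component.
function Base() {}
Base.prototype.setState = function setState() {
  return 'setState called'
}

// Copying: the methods are duplicated onto Copied.prototype.
function Copied() {}
Object.assign(Copied.prototype, Base.prototype)

// Linking: Linked.prototype delegates to Base.prototype.
function Linked() {}
Linked.prototype = Object.create(Base.prototype)
Linked.prototype.constructor = Linked

const copied = new Copied()
const linked = new Linked()

// Both instances can call the method...
console.log(copied.setState(), linked.setState())

// ...but only the copy owns it directly on its prototype.
const owns = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key)
console.log(owns(Copied.prototype, 'setState')) // true
console.log(owns(Linked.prototype, 'setState')) // false

// Only the linked version has Base in its prototype chain.
console.log(copied instanceof Base) // false
console.log(linked instanceof Base) // true

// A method added to Base later reaches linked instances only.
Base.prototype.forceUpdate = function forceUpdate() {}
console.log(typeof copied.forceUpdate) // 'undefined'
console.log(typeof linked.forceUpdate) // 'function'
```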

Exploring the second point, Reagent is doing this:

cmp.prototype.constructor = cmp

whereas with the traditional pseudoclassical approach we would instead do this:

Object.defineProperty(Welcome.prototype, 'constructor', {
  enumerable: false,
  writable: true,
  configurable: true,
  value: Welcome,
})

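The difference in the above approaches is that a plain = creates an enumerable constructor whenever the prototype does not already own a constructor property, which is the case after replacing the prototype with Object.create. (When the property already exists, as it does on a function's default prototype, = only changes its value and leaves it non-enumerable.) This can have an implication depending on who consumes our classes, but in our case we know that only React is going to be consuming our class components, so we can do this with relative confidence.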
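
Enumerability is easy to check for yourself. The following is a small sketch with hypothetical constructors (`Base`, `Assigned`, `Defined`, `Untouched`) that compares plain assignment, Object.defineProperty, and assignment onto a default prototype that already has a constructor property.

```javascript
// `Base` is a hypothetical stand-in for React.Component.
function Base() {}

// Prototype replaced, constructor restored with plain assignment.
function Assigned() {}
Assigned.prototype = Object.create(Base.prototype)
Assigned.prototype.constructor = Assigned

// Prototype replaced, constructor restored with defineProperty.
function Defined() {}
Defined.prototype = Object.create(Base.prototype)
Object.defineProperty(Defined.prototype, 'constructor', {
  enumerable: false,
  writable: true,
  configurable: true,
  value: Defined,
})

// Default prototype kept: `constructor` already exists and is non-enumerable,
// so assignment only changes its value.
function Untouched() {}
Untouched.prototype.constructor = Untouched

// Collect the keys a `for...in` loop would visit on an instance.
function enumerableKeys(obj) {
  const keys = []
  for (const key in obj) {
    keys.push(key)
  }
  return keys
}

console.log(enumerableKeys(new Assigned()))  // [ 'constructor' ]
console.log(enumerableKeys(new Defined()))   // []
console.log(enumerableKeys(new Untouched())) // []
```

The practical consequence is that code iterating over an instance with `for...in` would see a stray `constructor` key in the first case only.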

What is one of the more interesting results of the above two Reagent modifications? First, if React depended on JavaScript introspection to tell whether or not a component is a child of React.Component we would not be happy campers:

Welcome.prototype instanceof React.Component
// => false...Welcome is not a child of React.Component

Object.getPrototypeOf(Welcome.prototype) === React.Component.prototype
// => false...React.Component.prototype is not part of Welcome's prototype chain

welcome instanceof React.Component
// => false...welcome is not an instance of React.Component

welcome instanceof Welcome
// => true...welcome is an instance of Welcome

Object.getPrototypeOf(welcome) === Welcome.prototype
// => true...welcome is linked to Welcome.prototype

console.log(React.Component.prototype.isPrototypeOf(welcome))
// => false...React.Component.prototype is not in welcome's prototype chain

console.log(Welcome.prototype.isPrototypeOf(welcome))
// => true...Welcome.prototype is in welcome's prototype chain

What the above shows is that Welcome is not a child of React.Component even though it has all the properties and methods that React.Component has. This is why we're lucky that React is smart about detecting class vs. function components.
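
One way this kind of detection can work without instanceof is a marker property on the prototype. As far as I can tell, React.Component.prototype carries an isReactComponent marker for this purpose, and a copied prototype would carry it along. The sketch below uses stand-ins only: `Base`, `Copied`, `Plain` and `looksLikeClassComponent` are hypothetical names, and this is not React's actual code.

```javascript
// `Base` is a hypothetical stand-in for React.Component, with a marker
// property on its prototype.
function Base() {}
Base.prototype.isReactComponent = {}

// Reagent-style component: properties are copied, not linked.
function Copied() {}
Object.assign(Copied.prototype, Base.prototype)

// A plain function component has no marker at all.
function Plain() {
  return 'Hello'
}

// Hypothetical check in the spirit of what a renderer could do.
function looksLikeClassComponent(Component) {
  return Boolean(Component.prototype && Component.prototype.isReactComponent)
}

console.log(Copied.prototype instanceof Base) // false
console.log(looksLikeClassComponent(Copied))  // true
console.log(looksLikeClassComponent(Plain))   // false
```

The point of the design is that a marker survives copying, while a prototype-chain check does not.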

Second, by copying rather than linking prototypes we could incur a performance cost, but again, in our case this cost is negligible.

Conclusion

I think it's important to dive into the weeds. In my experience, it's these detours and thorough questioning of topics which have led to considerable improvements in my programming skill and general comfort with increasingly challenging topics.

However, I think the biggest thing for me is something I referenced a few times in this post: "cultural knowledge". I have come to see it as one of the most powerful tools we have as a species. It's unfortunate that this kind of information is not always available, and my hope is that I could fill some of the gaps with this writing and maybe even open the door to works which can be built on top of this.

Less philosophically though, I find it encouraging to know that everything is, generally speaking, JavaScript under the hood. This is important because it allows us to take advantage of what has come before and really dig into interesting ways we can use and manipulate JS from within CLJS.

What are the Clojure Tools?

This post is about "getting" the Clojure Tools. The reason? They stumped me in the beginning and I felt like if I can make someone's journey just a bit easier that might be a good thing.

My Clojure learning journey started by asking questions like:

  • How do I install Clojure?
  • How do I run a Clojure program?
  • How do I manage Clojure packages (dependencies)?
  • How do I configure a Clojure project?
  • How do I build Clojure for production?

Now, when I started working with Clojure the answer to these questions was: choose either lein or boot. Then Rich Hickey and his ride or die homeboys rolled by and provided their own answer: The Clojure Tools.

Their vision, like the vision of Clojure itself, is a bit offbeat. So, this post is about reviewing the Clojure Tools and figuring out a mental model for them.

At a high level, the Clojure Tools currently consist of:

  • Clojure CLI
  • tools.build

The first is a CLI tool and the second is a Clojure library which provides some helper functions to make it easier to build Clojure artifacts. The rest of this post will dig into each of these tools.

Clojure CLI

The Clojure CLI is made up of the following subprograms:

  • clj/clojure
  • tools.deps.alpha
  • deps.edn

And here is what it looks like to use the Clojure CLI and some of the things it can do:

Run a Clojure REPL

clj

Run a Clojure program

clj -M -m your-clojure-program

Manage Clojure dependencies

clj -Sdeps '{:deps {bidi/bidi {:mvn/version "2.1.6"}}}'

The above is just the tip of the Clojure CLI iceberg. I have omitted more interesting examples so we can focus on the Clojure CLI at a higher level. In honor of said "high level" overview, the following sections will cover each of the Clojure CLI's subprograms.

clj/clojure

As we see above, the Clojure CLI is invoked by calling one of the two shell commands:

  • clj
  • clojure

When you read through the Official Deps and CLI Guide you will see that you can use either clj or clojure. clj is the recommended version, but when you start to look at open source code, you will see that both are used.

What's the difference between these two commands? clj is mainly used during development, while clojure is mainly used in a production or CI environment. The reason is that clj is a light wrapper around the clojure command.

The clj command wraps the clojure command in another tool called rlwrap. rlwrap improves the developer experience by making it easier to type in the terminal while you're running your Clojure REPL.

The tradeoff for the convenience provided by clj is that clj introduces dependencies. This is a tradeoff because you may not have access to rlwrap in production. In addition, a tool like rlwrap can make it harder to compose the clj command with other tools. As a result, it's a common practice to use clojure in production/CI environments.

Now that we see they are both more or less the same command, what do they do? clj/clojure has one job: run Clojure programs against a classpath. If you dig into the clj/clojure bash script you see that it ultimately calls a command like this:

java [java-opt*] -cp classpath clojure.main [init-opt*] [main-opt] [arg*]

The above might look like a simple command, but the value of having clj/clojure is that you, as a new Clojure developer, don't have to manually build the classpath, figure out the exact right Java command to run, or work to make this execute on different environments (Windows, Linux, Mac etc).

In summary, clj/clojure is about running Clojure programs against a classpath and orchestrating other tools. For example, in order to run against a classpath, there has to be a classpath. clj/clojure is not responsible for figuring out the classpath though. That's a job for tools.deps.alpha.

tools.deps.alpha

tools.deps.alpha is a Clojure library responsible for managing your dependencies. What it does is:

  • reads in dependencies from a deps.edn file
  • resolves the dependencies and their transitive dependencies
  • builds a classpath

Note that I said it's a Clojure library. You don't have to be using the Clojure CLI in order to use this tool. You can just use it by itself if you wanted to.

What makes tools.deps.alpha so great is that it's a small and focused library. There isn't much more to say about this other than if you want to learn more about the history, development and goals of the tool from the Clojure team I recommend listening to this episode of Clojure Weekly Podcast which features Alex Miller, the author of tools.deps.alpha.

As noted above, the first thing tools.deps.alpha is going to do is read in your project configuration and deps. This information is stored in deps.edn.

deps.edn

The deps.edn file is a Clojure map with a specific structure. Thus, when you run clj/clojure one of the first things it does is find a deps.edn file and read it in.

deps.edn is where you configure your project and specify project dependencies. At its heart, deps.edn is just an edn file. You can think of it like Clojure's version of package.json.

Here is an example of what a deps.edn file looks like:

{:deps    {...}
 :paths   [...]
 :aliases {...}}

As you can see, we use keywords like :deps, :paths and :aliases to describe your project and the dependencies it requires.

tools.build

This is the newest Clojure Tool. It's been in the works for a while and might be the simplest to understand conceptually: It's a Clojure library with functions that do things like build a jar, uberjar etc.

One distinction that's important to note is that tools.build is not the same as the Clojure CLI tool's -T switch. I am calling this out now because when tools.build was released the Clojure CLI was also enhanced to provide the -T switch. As one can imagine, this could be seen as confusing because of the similarity of their names.

The best way that I can currently explain the -T switch is by saying that it's meant to be another level of convenience provided by the Clojure CLI.

Regarding usage, it helps to first break down the main types of Clojure programs one might build into three subcategories:

  • A tool
  • A library
  • An app

You would use -T for Clojure programs that you want to run as a "tool". For example, deps-new is a Clojure library which creates new Clojure projects based on a template you provide. This is a great example of a Clojure project which is built to be a "tool".

I don't want to go into more detail about -T now because that means we would have to dive into other Clojure CLI switches like -X and -M. That's for another post. On to the Installer!

Installer

The "Clojure CLI Installer" is a fancy way of referring to the brew tap used to install Clojure on Mac and Linux machines. As of February 2020, Clojure started maintaining their own brew tap. Thus, if you installed the Clojure CLI via

brew install clojure

you will likely want to uninstall clojure and install the following:

brew install clojure/tools/clojure

In all likelihood, you would probably be fine with brew install clojure as it will receive updates. However, while brew install clojure will still see some love, it won't be as active as the clojure/tools/clojure tap.

clj v lein v boot

This section will provide a quick comparison of clj, lein and boot.

Firstly, all of the above tools are more or less addressing the same problems in their own way. Your job is to choose the one you like best.

If you're curious which to choose, my answer is the Clojure CLI. The reason I like the Clojure CLI is because the tool is simple. You can read through clj and tools.deps.alpha in an afternoon and understand what they are doing if you had to. The same (subjectively of course) cannot be said for lein or boot. This is not just implementation, but also usage. Yes, lein seems easier to start, but the moment you break away from the beginner examples you are left deep in the woods without a compass.

Secondly, the Clojure Tools promote libraries over frameworks. This is important when working with a language like Clojure because it really does reward you for breaking down your thinking.

Finally, the Clojure community is really leaning into building tools for Clojure CLI. For example, where lein used to have significantly more functionality, the community has built a ton of incredible tools that will cover many of your essential requirements.

So yes, Clojure Tools for the win.

Space War

For the last month I've been spending a lot of time working on Space War. I know, I know, I should have been working on Clean Code Episode 67: Legacy Code, and Euler 5, and Countest and Curmugeon 3. I should have been working on a blog, or a new book, or... But I couldn't let go of Space War. It kept calling me.

The first time I wrote Space War was in 1978. I wrote it in Alcom, which was a simple derivative of Focal, which was an analog of Basic for the PDP-8. The computer was an M365 which was an augmented version of a PDP-8 and was proprietary to Teradyne, my employer at the time.

The UI was screen based, using character graphics, similar to curses. Screen updates took on the order of a second. All input was through the keyboard.

We used to play it on one machine while waiting for a compile on another.

Forty years later, in September of 2018, I started working on this version of Space War. It's an animated GUI driven system with a frame rate of 30fps. It is written entirely in Clojure and uses the Quil shim for the Processing GUI framework.

My justification for writing it was so that I could use it as the case study for my cleancoders.com videos on Functional Programming. Once that series of videos was complete, I set Space War aside and started working on other things.

Then, a month ago, the program called to me. I don't know why. Perhaps it was because I'd left it in a partially completed state. Perhaps it was because I had just finished Clean Craftsmanship and I needed a way to decompress. Or, perhaps it was just because I felt like it. Whatever the reason, I loaded up the project and started goofing around with it.

Now I'm sure you've had that feeling of trepidation when you pick up a code base that you haven't seen in three years. I certainly felt it. I mean, what was I going to find in there? Would I be able to get my bearings and understand the code? Or would I flail around aimlessly for weeks?

I needn't have worried. The code base was nicely organized. There was a very nice suite of tests that covered the vast majority of the game logic. The GUI code, though not tested, was simple enough to understand at a glance.

But, perhaps most importantly, this code was written to be 100% functional. No variables were mutated, anywhere in the code. This meant that every function did exactly what it said it did; and left no detritus around to confound other functions. No function could be impacted by the state of the system because the system did not have "a state".

Now maybe you are rolling your eyes at that last paragraph. Several years ago I might have rolled my eyes too. But the relief I experienced coming back into this code base after three years of not touching it, and knowing it was functional, was palpable.

Another thing that gave me a significant amount of help was that all the critical data structures in the system were described and tested using clojure/spec. This was profoundly helpful because it gave me the kind of declarative help that is usually reserved for statically typed languages.

For example, this is a Klingon:

(s/def ::x number?)
(s/def ::y number?)
(s/def ::shields number?)
(s/def ::antimatter number?)

(s/def ::kinetics number?)
(s/def ::torpedos number?)
(s/def ::weapon-charge number?)
(s/def ::velocity (s/tuple number? number?))
(s/def ::thrust (s/tuple number? number?))
(s/def ::battle-state-age number?)
(s/def ::battle-state #{:no-battle :flank-right :flank-left :retreating :advancing})
(s/def ::cruise-state #{:patrol :guard :refuel :mission})
(s/def ::mission #{:blockade :seek-and-destroy :escape-corbomite})

(s/def ::klingon (s/keys :req-un [::x ::y ::shields ::antimatter
                                  ::kinetics ::torpedos ::weapon-charge
                                  ::velocity ::thrust
                                  ::battle-state-age ::battle-state
                                  ::cruise-state
                                  ::mission]
                         :opt-un [::hit/hit]))

These kinds of clojure/spec descriptions gave me the documentation I needed to reacquaint myself with the critical data structures of the system. They also gave me the ability to check that any functions I wrote kept those data structures conformant to the spec.

All of this means that I was able to make progress in this code base quickly, and with a high degree of confidence. I never had that feeling of wading through bogs of legacy code.

Anyway, I'm done now, for the time being. I've given the player a mission to complete, and made it challenging, but possible, to complete that mission. A game requires 2-3 hours of intense play, is tactically and strategically challenging, and is often punctuated by moments of sheer panic.

I hope you enjoy downloading it, firing up Clojure, and playing it. Consider it my Christmas present to you.

One last thing. Three years ago Mike Fikes saw my Space War program and converted it from Clojure to ClojureScript. The change was so minuscule that the two are now a single code base with a tiny smattering of conditional compilation for the very few differences. So if you want to play the game on-line you can just click on http://spacewar.fikesfarm.com/spacewar.html. Mike has kindly kept this version up to date so -- have at it!

Primes in Clojure part 2: Interop

A couple weeks ago I published a blog post on computing prime numbers with Clojure. I got some good feedback on it, including pointers to other implementation techniques I'd overlooked. In this post, I want to explore some of those, notably two based on the Java standard library.

The Competition Won't Eat Java’s Lunch Anytime Soon

In a previous post, I drew some comparisons between the evolution of computer languages and natural languages. One such is Turing completeness: native speakers express everything they want with a limited toolkit of vocabulary, sounds, and syntactic rules that must not be too hard to master. Another parallel is the slow, incremental nature of language evolution. Languages avoid breaking changes, given the billions of lines of legacy that would otherwise be rendered unreadable.

What does this mean for Java? It makes sense to stay true to its OO roots and optimize its expressive potential, rather than introduce paradigm shifts that break backward compatibility. To remain in the same language, big syntactic overhauls are unlikely. Mutability and void statements will stay part of the language, despite the growing appetite for more functional language features. This is fine because the JVM platform can offer the best of two worlds. Interoperability of byte code allows radically different languages (Frege, Clojure) easy access to a rich and stable ecosystem. Java doesn’t have to be the golden hammer of programming.

Working at Wisedocs - Build Medical Document Processing Software

Wisedocs is a Canadian based, fast-growing, seed stage startup. Their team is made up of startup and scaleup veterans on a mission to make it easy and accessible for any company in the insurance, legal and medical space to understand medical documents quickly using AI. As part of our company spotlight series we chatted with the team at Wisedocs about their company, team, culture and interview process. Before we dive in, don’t forget to sign up and subscribe to our newsletter to get personalised content and job opportunities straight into your inbox!

Tell us more about Wisedocs, what do you do?

Wisedocs is an AI powered automation platform that processes and understands medical information i...

Tags: Javascript, Kubernetes, Redis, Django, Clojure, Terraform, Docker, Vue.Js, Aws, Python, Pytest

Permalink

Making nREPL and CIDER More Dynamic (part 2)

By Arne Brasseur

In part 1 I set the stage with a summary of what nREPL is and how it works, how editor-specific tooling like CIDER for Emacs extends nREPL through middleware, and how that can cause issues and pose challenges for users. Today we’ll finally get to the “dynamic” part, and how it can help solve some of these issues.

To sum up again what we are dealing with: depending on the particulars of the nREPL client (i.e. the specific editor you are using, or the presence of specific tooling like clj-refactor), or of the project (shadow-cljs vs vanilla cljs), certain nREPL middleware needs to be present for things to function as they should. When starting the nREPL server you typically supply it with a list of middlewares to use. This is what plug-and-play “jack-in” commands do behind the scenes. For nREPL to be able to load and use those middlewares they need to be present on the classpath; in other words, they need to be declared as dependencies. This is the second part that jack-in takes care of.

This means that nREPL servers are actually specialized to work with specific clients, which is a little silly if you think about it. You can’t connect with vim-iced to a server that expects CIDER clients, at least not without losing functionality.

Instead what we want is for the nREPL server to be client-agnostic. Once a client connects it can then tell the server of its needs, and “upgrade” the connection appropriately. It can even upgrade the connection incrementally, loading support for extra operations when it first needs them.

Let’s unpack what is needed to make this a reality. We need to be able to:

  • add extra middleware to a running server or connection
  • add entries to the classpath at runtime
  • resolve and download (transitive) dependencies

Add Middleware to a Running Server

Turns out this problem has already been solved! Yay! Over a year ago Shen Tian implemented a dynamic-loader middleware (see nrepl/nrepl#185), which provides an add-middleware operation.

(-->
  id         "23"
  op         "add-middleware"
  session    "33d052f6-04dd-4f0e-916f-ed94aa0188ec"
  time-stamp "2021-11-19 09:40:42.092185482"
  middleware ("cider.nrepl/wrap-apropos" "cider.nrepl/wrap-classpath" "cider.nrepl/wrap-clojuredocs" "cider.nrepl/wrap-complete" "cider.nrepl/wrap-content-type" "cider.nrepl/wrap-debug" "cider.nrepl/wrap-enlighten" "cider.nrepl/wrap-format" "cider.nrepl/wrap-info" "cider.nrepl/wrap-inspect" "cider.nrepl/wrap-macroexpand" "cider.nrepl/wrap-ns" "cider.nrepl/wrap-out" "cider.nrepl/wrap-slurp" "cider.nrepl/wrap-profile" "cider.nrepl/wrap-refresh" "cider.nrepl/wrap-resource" "cider.nrepl/wrap-spec" "cider.nrepl/wrap-stacktrace" "cider.nrepl/wrap-test" "cider.nrepl/wrap-trace" "cider.nrepl/wrap-tracker" "cider.nrepl/wrap-undef" "cider.nrepl/wrap-version" "cider.nrepl/wrap-xref")
)
(<--
  id         "23"
  session    "33d052f6-04dd-4f0e-916f-ed94aa0188ec"
  time-stamp "2021-11-19 09:40:42.958165047"
  status     ("done")
)
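
For readers who want to send the same request from Clojure rather than from Emacs, here is a minimal sketch. The helper name `add-middleware-request` is hypothetical, and the port in the commented-out part is an assumption; only the `add-middleware` op and its `middleware` parameter come from the exchange shown above.

```clojure
;; Hypothetical helper (not part of nREPL): builds the request map that
;; corresponds to the "add-middleware" message in the log above.
(defn add-middleware-request
  [middleware-vars]
  {:op         "add-middleware"
   :middleware (mapv str middleware-vars)})

(comment
  ;; Sending it requires nrepl/nrepl on the classpath and a running server.
  ;; Port 7888 is an assumption; use whatever port your server printed.
  (require '[nrepl.core :as nrepl])
  (with-open [conn (nrepl/connect :port 7888)]
    (-> (nrepl/client conn 5000)
        (nrepl/message (add-middleware-request ['cider.nrepl/wrap-version]))
        doall)))
```

The client fills in `id` and `session` for you, which is why the sketch only supplies the op and the list of middleware vars.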

If you are a CIDER user and are fairly up-to-date (this may require using master) you can try this out today.

(add-hook 'cider-connected-hook #'cider-add-cider-nrepl-middlewares)

Install this hook, then connect to a vanilla nREPL server. You can run one in your project with:

clojure -Sdeps '{:deps {nrepl/nrepl {:mvn/version "RELEASE"}}}' -M -m nrepl.cmdline

However, chances are you’ll see something like this in the cider-repl buffer:

WARNING: middleware cider.nrepl/wrap-trace was not found or failed to load.
WARNING: middleware cider.nrepl/wrap-macroexpand was not found or failed to load.
WARNING: middleware cider.nrepl/wrap-inspect was not found or failed to load.
...

That’s because the dynamic-loader middleware tries to (require 'cider.nrepl), and fails. We need to first get cider-nrepl onto the classpath.

Adding Entries to the Classpath at Runtime

I have written at great length recently about the classpath and classloaders (see The Classpath is a Lie). Simply adding entries to the classpath is fairly easy: Clojure’s DynamicClassLoader has a public addURL method, and dynapath provides an abstraction around this. You need a few lines of code to check if you have the right kind of classloader, and if not instantiate a new one, and you’re good.

The harder part is controlling which classloader is in use at the point where require gets called. Typically this is the “context classloader”, which is a mutable thread-local variable. As if mutability alone wasn’t tricky enough. Inside an nREPL request you’re generally covered, since nREPL creates a DynamicClassLoader for you for each session. It used to be a little too eager about creating new classloaders, which I addressed in nrepl/nrepl#248. However, there is still the issue mentioned in The Classpath is a Lie: clojure.main creates a new DynamicClassLoader for each call to repl, which in nREPL means on every eval. To work around this, we recurse up the parent chain to find the DynamicClassLoader that sits directly above the system classloader, and use that. This tends to give fairly predictable results. There has been talk of forking clojure.main/repl for nREPL’s use, which would let us get rid of this annoying behavior and simplify things.
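
The "check for the right kind of classloader, else create one" step can be sketched in plain Clojure as follows. The function names `root-dcl`, `ensure-dcl!` and `add-to-classpath!` are hypothetical, and this variant picks the topmost DynamicClassLoader anywhere in the parent chain, which is an assumption about what "root" should mean here rather than the exact logic nREPL or CIDER uses.

```clojure
(import '(clojure.lang DynamicClassLoader))

(defn root-dcl
  "Return the topmost DynamicClassLoader found when walking up from the
  current thread's context classloader, or nil if there is none."
  []
  (loop [loader (.getContextClassLoader (Thread/currentThread))
         found  nil]
    (if loader
      (recur (.getParent ^ClassLoader loader)
             (if (instance? DynamicClassLoader loader) loader found))
      found)))

(defn ensure-dcl!
  "Return the root DynamicClassLoader, installing a fresh one as the
  context classloader when the chain does not contain any."
  []
  (or (root-dcl)
      (let [thread (Thread/currentThread)
            dcl    (DynamicClassLoader. (.getContextClassLoader thread))]
        (.setContextClassLoader thread dcl)
        dcl)))

(defn add-to-classpath!
  "Add a URL string such as \"file:/path/to/some.jar\" to the root loader."
  [^String url]
  (.addURL ^DynamicClassLoader (ensure-dcl!) (java.net.URL. url)))
```

Adding to the topmost loader means that later evals, each of which may sit under a freshly created child loader, still see the new entry through parent delegation.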

To try this at home, first find the location of the cider-nrepl JAR. This is a fat jar: it includes all its dependencies, inlined and shaded with MrAnderson, so we don’t need to worry about resolving transitive dependencies.

find ~/.m2 -name 'cider-nrepl-*.jar'

Now we end up with something like this.

;; remove the previous one if necessary
(pop cider-connected-hook)

;; install the new hook
(add-hook 'cider-connected-hook
          (lambda ()
            (cider-sync-tooling-eval
             (parseedn-print-str
              '(.addURL (loop ; shenanigans to find the "root" DCL
                         [loader (.getContextClassLoader (Thread/currentThread))]
                         (let [parent (.getParent loader)]
                           (if (instance? clojure.lang.DynamicClassLoader parent)
                               (recur parent)
                             loader)))
                        (java.net.URL. "file:/home/arne/.m2/repository/cider/cider-nrepl/0.27.2/cider-nrepl-0.27.2.jar"))))
            (cider-add-cider-nrepl-middlewares)))

And there you go, you’ve successfully turned a vanilla nREPL connection into a cider-nrepl connection. You can now make full use of CIDER’s capabilities!

Resolving Dependencies

The previous solution assumes that you already have cider-nrepl*.jar on your system, that you know where to find it, that it matches the CIDER version Emacs is using (or at least is compatible, they no longer need to match exactly), and that it doesn’t need any additional dependencies.

A more generic solution would allow you to simply provide dependency coordinates, the type you supply in deps.edn or project.clj, and let Clojure figure out and download what it needs. Something like this:

(add-hook 'cider-connected-hook
          (lambda ()
            (cider-sync-tooling-eval
             (parseedn-print-str
              `(update-classpath! ((cider/cider-nrepl . ,cider-injected-nrepl-version)))))
            (cider-add-cider-nrepl-middlewares)))

This is called “dependency resolution”. It means taking a set of artifact-name+version coordinates, trying to find the given versions in one or more repositories (like Clojars or Maven Central), downloading their .pom files to figure out any transitive dependencies (recursively), and finally downloading the actual jars.
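
To make the recursive part concrete, here is a toy sketch that resolves transitive dependencies against an in-memory map instead of a real repository. Every artifact name and version in `repo` is made up, `resolve-deps` is a hypothetical function, and the `lookup` argument stands in for "download and parse the .pom". Real resolvers also handle version conflicts and exclusions, which this sketch ignores.

```clojure
;; Toy "repository": coordinate -> direct dependencies.
;; All names and versions are invented for illustration.
(def repo
  {['app/core "1.0"] [['lib/http "2.1"] ['lib/json "0.9"]]
   ['lib/http "2.1"] [['lib/json "0.9"] ['lib/io "3.0"]]
   ['lib/json "0.9"] []
   ['lib/io "3.0"]   []})

(defn resolve-deps
  "Return the set of all coordinates reachable from `roots`.
  `lookup` maps a coordinate to its direct dependencies."
  [lookup roots]
  (loop [todo (vec roots)
         seen #{}]
    (if-let [coord (peek todo)]
      (if (seen coord)
        (recur (pop todo) seen)                       ; already visited
        (recur (into (pop todo) (lookup coord))       ; queue its dependencies
               (conj seen coord)))
      seen)))

(comment
  (resolve-deps repo [['app/core "1.0"]]))
```

The `seen` set is what keeps shared dependencies (here `lib/json`) from being processed twice.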

This requires a good deal of machinery, machinery that is not present in every Clojure process out of the box. You could start your nREPL process with tools.deps.alpha as a dependency, or lean on other libraries that are lower-level (org.apache.maven, Aether), or higher level (lambdaisland/classpath, Pomegranate). In any case we need these to be declared and loaded at boot time. If they are not present, then we have a chicken-and-egg problem: our connection upgrade is once again blocked.

It’s also worth pointing out that this is by far the most complicated part of this whole endeavor. Adding tools.deps.alpha adds about 11MB of dependencies. Maybe that’s fine, many apps will pull in hundreds of megabytes of dependencies, what’s a dozen more? Still, people often have good reasons to keep their dependencies to a minimum, so this is not a decision that nREPL can make for them.

But we can sidestep the issue: in the case of CIDER we only need to download a single jar. We can just do that and be done with it. In fact, CIDER already contains code to do just that:

(add-hook 'cider-connected-hook
          (lambda ()
            (cider-sync-tooling-eval
             (parseedn-print-str
              `(.addURL (loop
                         [loader (.getContextClassLoader (Thread/currentThread))]
                         (let [parent (.getParent loader)]
                           (if (instance? clojure.lang.DynamicClassLoader parent)
                               (recur parent)
                             loader)))
                        (java.net.URL. ,(concat "file:" (cider-jar-find-or-fetch "cider" "cider-nrepl" cider-injected-nrepl-version))))))
            (cider-add-cider-nrepl-middlewares)))

Alternatively we could shell out to a tool that can do this work for us. We actually have a lot of choice there at this point. There’s of course clojure (i.e. the Clojure CLI), but Babashka can do the same thing (bb clojure -Sdeps {} -Spath), and there’s deps.clj, available as standalone binaries, or as an Uberjar, which could be invoked in a separate process/JVM, or loaded onto the classpath and invoked from Clojure directly.

It would be neat to wrap this in a little library which looks for one of these executables in some default places, and falls back to downloading deps.clj. This way you could get the functionality of an 11MB dependency for perhaps a hundred lines of Clojure, although this may seem like an unsavory approach to some.

It’s probably clear by now that there’s more than one way to shear a sheep, and you may be wondering why we don’t just go and hide all these details behind a facade that Just Works™. To an extent that will probably happen. These are early days and we are still figuring out how to best fit these pieces together. But we’re also likely to find that there’s no one size that fits all, whether it’s all tools that build on nREPL, or all users of those tools.

In an upcoming blog post I’ll be talking a lot more about Mechanisms vs Policies. I think at this point it’s ok to focus on the mechanisms, make sure we have the individual ingredients, and let people experiment with combining them in the way that makes most sense to them and their project.

Speaking of ingredients, there’s one more piece we haven’t covered yet, the Sideloader!

The Role of the Sideloader

Besides implementing the dynamic-loader middleware, Shen Tian also implemented another nREPL piece, meant to complement it, the Sideloader. Inspired by a similar mechanism in unrepl, the Sideloader is a special kind of ClassLoader which requests the resources it needs from a connected nREPL client.

The way this works is that the client first installs the sideloader by invoking the sideloader-start op. Now every time you require a namespace, access an io/resource, or load a Java class, the nREPL client (i.e. your editor) gets a sideloader-lookup message. If it is able to provide the requested resource, then it responds by sending a sideloader-provide message back.
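
The client's half of that exchange might look roughly like the sketch below. `provide-response` is a hypothetical helper, and the exact field names (`type`, `name`, `content`), the base64 encoding of `content`, and the use of empty content to signal "not available" are assumptions about the wire format that should be checked against the nREPL documentation.

```clojure
(import '(java.util Base64))

(defn provide-response
  "Build a sideloader-provide message answering a sideloader-lookup.
  `resources` maps resource names to byte arrays the client can supply.
  An empty :content is used here to mean the client has nothing to offer."
  [resources {:keys [session type name]}]
  (let [^bytes content (get resources name)]
    {:op      "sideloader-provide"
     :session session
     :type    type
     :name    name
     :content (if content
                (.encodeToString (Base64/getEncoder) content)
                "")}))

(comment
  (provide-response {"foo/bar.clj" (.getBytes "(ns foo.bar)" "UTF-8")}
                    {:session "s1" :type "resource" :name "foo/bar.clj"}))
```

In CIDER this logic lives in Emacs Lisp; the Clojure version is only meant to show the shape of the data going back over the connection.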

The idea is that instead of adding cider-nrepl to the classpath directly, we let CIDER (Emacs) supply each namespace or resource to nREPL on demand, over the network.

I’ve put considerable effort into the client side implementation of this, see clojure-emacs/cider#3037, and the half a dozen PRs linked from there. This is why the aforementioned cider-jar-find-or-fetch exists: we download the cider-nrepl JAR from Emacs, so that we can supply its contents piecemeal to the Clojure process.

You can try this out as well:

(add-hook 'cider-connected-hook #'cider-upgrade-nrepl-connection)

This will enable the sideloader, and inject the necessary middleware as before.

So far I find the results rather underwhelming. Doubly so when connecting to an nREPL server on a different machine, which is the use case where this approach would actually make the most sense. Round-tripping over the network, and extracting file contents from the JAR from Emacs Lisp, then base64 encoding and decoding them to go over the wire… it all adds a lot of overhead. Note that this impacts all classloader lookups, since we only fall back to the system classloader when the Sideloader has determined that CIDER is unable to supply the given resource. It’s also worth noting that for every .clj file that needs to be loaded this way, we round-trip three times: once to look for an __init.class file, once for a .cljc, and only then does Clojure look for the .clj file.

There are two obvious ways to improve this. One is to only have the sideloader active for a limited amount of time: you activate it, inject the necessary middleware, and turn it back off. Currently this does not work because of cider-nrepl’s deferred middleware. Most middleware only gets required the first time it is actually used, at which point the sideloader has long been disabled.

The other, which I’ve also experimented with, is providing a list of prefixes, so that only cider-nrepl’s namespaces are fetched via the sideloader. It helps of course, but the results are still underwhelming.
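
The prefix idea amounts to a cheap check before any network round trip. A minimal sketch, where `sideloadable?` and the example prefixes are hypothetical and not taken from CIDER's implementation:

```clojure
(require '[clojure.string :as str])

(defn sideloadable?
  "True when `resource-name` starts with one of `prefixes`, i.e. when it is
  worth asking the client for it instead of going straight to the parent
  classloader."
  [prefixes resource-name]
  (boolean (some #(str/starts-with? resource-name %) prefixes)))

(comment
  (sideloadable? ["cider/" "orchard/"] "cider/nrepl/middleware/info.clj")
  (sideloadable? ["cider/" "orchard/"] "clojure/core.clj"))
```

Lookups that fail the check skip the client entirely, so only resources under the listed prefixes pay the round-trip cost.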

So as it stands I’m not convinced the sideloader is going to be an important piece in making this dynamic upgrading of nREPL connections a reality. I think the approaches I’ve set out above, where we make sure any resources required are present on the classpath directly, are much to be preferred. Faster and more reliable.

I do however think the Sideloader could become a cool piece of kit for loading files from the user’s code base, especially when connected to a remote process.

I run a Minecraft server on a cloud instance, and use witchcraft-plugin to endow Minecraft with Clojure superpowers, including an nREPL server. Locally I have my cauldron repo where I do my Minecraft creative coding. It’s a collection of repl sessions, and of namespaces with utility functions. When I eval (require 'net.arnebrasseur.cauldron.structures) then that currently fails, because this Cauldron repo isn’t present on the server. I need to go into each namespace I need and manually eval them in topological order before I can use them. Not great. In this case I think it would be fantastic if CIDER could spoon feed the server the necessary namespaces on request.

Conclusion

With this post I hope to draw some attention to all the work that’s been happening. We’ve been laying the groundwork for really improving the user experience, now we need to figure out how to bring it all together in a way that makes sense.

There’s a risk though that pushing for these changes will initially negatively impact the user experience, because change is hard, and we can’t anticipate everyone’s use case and needs.

So I expect “jack-in” to stay around, and to remain the default recommendation. It’s not perfect, but it works well for the vast majority of users.

At the same time we want to invite power users and tooling authors, especially those that have experienced the limitations and frustrations that come with the current approach, to consider these alternatives. To try them out and report back, so that we can shave off the rough edges, abstract away some of the plumbing, and gradually make this ready for broad consumption.

Permalink

An Update on CIDER 1.2

When clj-refactor 3.0 was released, I promised you that the release of CIDER 1.2 wasn’t far behind. However, an entire month has passed since then and CIDER 1.2 is still brewing. Turned out there was more work to be done than I expected, that’s why I decided to provide a short progress update.

Several things contributed to the delay. Here’s a brief rundown in no particular order:

  • nREPL 0.9 was delayed a bit by some problems we’ve discovered with the new Unix socket transport. The problems have been addressed and I expect that we’ll finalize nREPL 0.9 in the next week or so.
  • Due to the nREPL issues mentioned above, we still haven’t added support for the new Unix socket transport to CIDER.
  • There’s some cleanup work that has to be done in Orchard with respect to switching from dynapath to enrich-classpath for automatically adding Java sources and Javadoc to the classpath. That cleanup is optional for this particular release, but I’d rather do it sooner rather than later.
  • I want us to properly support nbb in CIDER 1.2. This requires some changes to the ClojureScript bootstrap logic in CIDER, as nbb is a native Node.js implementation that works differently from all the ClojureScript REPLs we’re currently supporting (notably you don’t need to evaluate any code in a Clojure REPL to “upgrade” it to a ClojureScript REPL).
  • I still haven’t submitted CIDER and its dependencies to NonGNU ELPA, so that people would be able to install CIDER out-of-the-box on Emacs 28.
  • I’ve been crazy busy at work, which limited the amount of time I could allocate to CIDER and friends.

In light of everything listed above, right now I’m aiming to release CIDER 1.2 around Christmas. Note that the current CIDER snapshot is in pretty good shape overall and you can upgrade to it without any concerns about stability.

If you want to know more about the current state of affairs, please check out this meta ticket, which tracks all the remaining work. As usual any help with the outstanding tasks will be greatly appreciated!

To wrap up on a positive note: in the meantime we released clj-refactor 3.1 and 3.2 with a bunch of small improvements! We might be moving a bit too slow at times, but we’re always moving forward!

Permalink

Clojure Deref (Nov 24, 2021)

Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem. (@ClojureDeref RSS)

Highlights

Welcome to a special mid-week Deref as we will be out in the US this week! But on that note, big thanks to the Clojure community for always being interesting, inventive, and caring. I’m thankful to be a part of it.

Our big news this week is the release of Clojure 1.11.0-alpha3 which wraps up much of the work we’ve done in the last couple months. Probably the most interesting parts are the new things:

If you have questions about these, I would request that you read the ticket first - we’re trying to get thinking and background into the ticket descriptions and it’s important context. We’ve already had a lot of feedback about clojure.java.math re cljs portability and higher-order use so probably more to come on that. If you want to discuss on Clojurians Slack, the #clojure-dev room is the best place.

Docstring updates:

  • CLJ-2666 Make Clojure Java API javadoc text match the example

  • CLJ-1360 Update clojure.string/split docstring regarding trailing empty parts

  • CLJ-2249 Clarify clojure.core/get docstring regarding sets, strings, arrays, ILookup

  • CLJ-2488 Add definition to reify docstring

Perf:

  • CLJ-1808 map-invert should use reduce-kv and transient

Bug fix:

  • CLJ-2065 Support IKVReduce on SubVector

And last but not least, we added support for optional trailing maps to kwarg functions in Clojure 1.11.0-alpha1 but had not yet worked through what this meant for spec. We’ve now released an update to spec.alpha (0.3.214) that is included as a dependency in this release. For the background on this, see CLJ-2606.

Not to be outshone, we also released an updated version of core.async, 1.5.640, which has several important bug fixes, particularly if you are using any of the alt variants, or something that uses alt indirectly like mix or merge.

Libraries and Tools

New releases and tools this week:

  • stub 0.1.1 - Library to generate stubs for other Clojure libraries

  • sicmutils 0.20.0 - A port of the Scmutils computer algebra/mechanics system to Clojure

  • tweet-def - Tweet as a dependency

  • sweet-array 0.1.0 - Array manipulation library for Clojure with "sweet" array type notation and more safety by static types

  • spec.alpha 0.3.214 - Describe the structure of data and functions

  • cljs-macroexpand - clojurescript macroexpand-all macro with meta support

  • secret-keeper 1.0.75 - A Clojure(Script) library for keeping your secrets under control

  • datahike 0.4.0 - A durable Datalog implementation adaptable for distribution

  • Tutkain 0.11.0 (alpha) - A Sublime Text package for interactive Clojure development

  • unminify - unminifies JS stacktrace errors

Permalink

Search Relevance Using BM25 Algorithm

Here, I am implementing the BM25 algorithm to retrieve the most relevant items out of a set of strings, given a search query. This is the first part of implementing a bare-bones version of ElasticSearch, which also uses the BM25 algorithm internally.

Before getting into the implementation, here is a quick rundown of the algorithm itself.

  1. Documents are strings, containing words (including repeating words).
  2. New documents can be inserted in the store, given a corresponding ID.
  3. Documents can be searched using a query string containing one or more words.
  4. The search function returns the most relevant documents first.
  5. Relevance is defined by BM25 to be composed of two factors: term frequency and inverse document frequency
  6. Term frequency is the number of times a word appears in a document. For example, if the word ‘Clojure’ appears 10 times in one document and 2 times in another document, and the query includes the word ‘Clojure’, the first document will be more relevant.
  7. Inverse document frequency reduces the impact of commonly occurring words. For example, most documents will contain words like ‘this’ and ‘that’, but few documents will contain the word ‘Clojure’. Document frequency tracks the number of documents in which a word appears. So, if the query is ‘this Clojure’, the word ‘this’ will contribute less to the final score than the word ‘Clojure’, because inverse document frequency is defined as 1 / document frequency.
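
The steps above can be sketched in a few lines of Clojure. This is not the code from the repository mentioned in this post: `tokenize` and `bm25-score` are hypothetical names, the defaults `k1 = 1.2` and `b = 0.75` are commonly used values chosen here as an assumption, and the sketch uses a smoothed logarithmic IDF rather than the plain 1 / document frequency described in the list.

```clojure
(require '[clojure.string :as str])

(defn tokenize
  "Lower-case a string and split it into words."
  [s]
  (remove str/blank? (str/split (str/lower-case s) #"\W+")))

(defn bm25-score
  "Score `doc` (a seq of tokens) against `query` (a seq of tokens).
  `docs` is the whole collection, as a seq of token seqs."
  [docs doc query & {:keys [k1 b] :or {k1 1.2 b 0.75}}]
  (let [n     (count docs)
        avgdl (/ (reduce + (map count docs)) (double n)) ; average doc length
        dl    (count doc)
        tf    (frequencies doc)]                         ; term -> count in doc
    (reduce + 0.0
            (for [term (distinct query)
                  :let [f   (get tf term 0)
                        df  (count (filter #(some #{term} %) docs))
                        idf (Math/log (+ 1.0 (/ (+ (- n df) 0.5) (+ df 0.5))))]]
              (* idf
                 (/ (* f (+ k1 1.0))
                    (+ f (* k1 (+ (- 1.0 b) (* b (/ dl avgdl)))))))))))

(comment
  (let [docs (map tokenize ["Clojure Clojure Lisp" "Java is not Lisp"])]
    (map #(bm25-score docs % (tokenize "clojure")) docs)))
```

A document that never mentions a query term contributes zero for that term, and rare terms get a larger IDF weight than common ones, which matches points 6 and 7.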

More mathematically, the scoring function which decides how relevant a document is, given a query, is defined as follows:

[Figure: BM25 equation]
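
For readers without the image, the commonly cited Okapi BM25 scoring function is given here. This is the textbook form and an assumption about what the original figure showed. $f(q_i, D)$ is the term frequency of query term $q_i$ in document $D$, $|D|$ is the document length, $\text{avgdl}$ is the average document length, $N$ is the number of documents, $n(q_i)$ is the document frequency of $q_i$, and $k_1$ and $b$ are tuning parameters (values around 1.2 and 0.75 are typical). The logarithmic IDF shown is one widely used smoothed variant of the simpler 1 / document frequency described above.

```latex
\text{score}(D, Q) = \sum_{i=1}^{n} \text{IDF}(q_i) \cdot
  \frac{f(q_i, D)\,(k_1 + 1)}
       {f(q_i, D) + k_1 \left(1 - b + b\,\dfrac{|D|}{\text{avgdl}}\right)}
\qquad
\text{IDF}(q_i) = \ln\!\left(1 + \frac{N - n(q_i) + 0.5}{n(q_i) + 0.5}\right)
```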

This logic is implemented in this repository. Following is a demo of the final result:

To insert a document, use the following text box:

Now to search for the inserted documents, use the following.

As an added utility, an IMDB dataset is available to index 1500 movie plots from my GitHub repository, which can be inserted here:

Now you can, for example, search with the string “Atlantic Iceberg” and see the movie Titanic as a result🤞.

The code used for this implementation is available here: https://github.com/oitee/bm25. Pull requests are most welcome!

In a future post, I will compare this implementation with ElasticSearch’s results and also provide a REST interface to interact from the external world with this implementation. Stay tuned!




Permalink

Head of Data Engineering at kleene.ai

GBP 95,000

We are looking for someone to oversee and be responsible for delivery from the data ingest part of our business. This department is responsible for building and extending the connectors that the application uses to extract data for our clients, as well as updating our internal tooling. Your role will be to ensure that the development process runs smoothly and that the devs have everything they need to deliver reliable and working code in an efficient manner. Your data engineering knowledge will also help ensure that the approach suggested is the right one and help increase the team members’ knowledge.

This role will primarily be focussed on process, management and identifying the best approach/solution to deliver new functionality or improve resilience. Although you may spend some of your time hacking on internal POCs as part of an investigation, the majority of your time will be taken up with meetings, documentation and reporting. 

You will be working closely with the Product Managers and other heads in the Tech department, including the Solutions Architect, Devops Manager and CTO, so good communication skills are essential.

Requirements

  • Previous team leadership/management experience
  • Extensive data engineering experience - working knowledge of multiple APIs/databases, data wrangling and best practice for ensuring resilience
  • Polyglot with strong Clojure experience, but who will always pick the best solution regardless of the ecosystem
  • Experience building/working with distributed systems
  • Experience of asynchronous and realtime data extracts
  • Experience of managing remote teams and applying the best processes to ensure reliable delivery
  • Excellent spoken and written English skills

About us

kleene.ai is a fast-growing startup that has been operating for nearly 5 years and was established with the goal of revolutionising data. We believe that all companies, regardless of size, should be able to easily access and understand their data. Our goal is to provide clients with the ability to access any of their data sources from within a simple interface without any technical knowledge.

Although we have a central London office, the current development team is entirely remote and we have developers working from Asia, Africa, Europe and South America. 

This is an exciting time to join us: the number of clients we have is continuing to grow and earlier this year we closed our Seed funding round.

Permalink

Copyright © 2009, Planet Clojure. No rights reserved.
Planet Clojure is maintained by Baishampayan Ghose.
Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
Theme by Brajeshwar.