Ep 042: What Does It Mean to Be 'Data-Oriented'?

Each week, we answer a different question about Clojure and functional programming.

If you have a question you’d like us to discuss, tweet @clojuredesign, send an email to feedback@clojuredesign.club, or join the #clojuredesign-podcast channel on the Clojurians Slack.

This week, the question is: “What does it mean to be ‘data-oriented’?” We merge together different aspects of Clojure’s data orientation, and specify which of those help make development more pleasant.

Selected quotes:

  • “Clojure has the corner on data.”
  • “Other languages have data too, it’s just locked in little cages.”
  • “Data is inert, it can’t harm you.”
  • “Because Clojure is expressed in its own data structures, and those structures are simple, that makes Clojure syntax simple.”
  • “Find a good way to represent the information that you want to work with, in a way that feels appropriate for the subject matter.”
  • “If you find a good way of representing your information, that representation tends to be pretty stable. All of the change is in the functions you use to work with it.”

Permalink

Getting Started With Clojure CLI Tools

Clojure Command Line Interface (CLI) tools provide a fast way for developers to get started with Clojure and simplify an already pretty simple experience. With tools.deps, they also provide a more flexible approach to dependencies, such as using code from a specific commit in a Git repository.

Practicalli Clojure 35 - Clojure CLI tools - an introduction is a video of a live broadcast of this content (including typos)

Clojure CLI tools provide:

  • Running an interactive REPL (Read-Eval-Print Loop)
  • Running Clojure programs
  • Evaluating Clojure expressions
  • Managing dependencies via tools.deps

Clojure CLI tools allow you to use other libraries too, referred to as dependencies or ‘deps’. These may be libraries you are writing locally, projects in git (e.g. on GitHub) or libraries published to Maven Central or Clojars.

The Clojure CLI tools cover the essential features of the Clojure build tools Leiningen and Boot, but are not designed as a complete replacement. Both of these build tools are mature and may have features you would otherwise need to script with Clojure CLI tools.

This article is a follow-on from New Clojure REPL Experience With Clojure CLI Tools and Rebel Readline.

Getting started

Clojure is packaged as a complete library, a JVM JAR file, that is simply included in the project like any other library you would use. You could just use the Java command line, but then you would need to pass in quite a few arguments as your project added other libraries.

Clojure is a hosted language, so you need a Java runtime environment (Java JRE or SDK); I recommend installing this from AdoptOpenJDK. Installation guides for Java are covered on the ClojureBridge London website.

The Clojure.org getting started guide covers instructions for Linux and macOS operating systems. There is also an early access release of clj for Windows.

Basic usage

The installation provides a command called clojure and a wrapper called clj, which uses the readline program rlwrap to add completion and history once the Clojure REPL is running.

Use clj when you want to run a REPL (unless you are using rebel readline instead) and clojure for everything else.

Start a Clojure REPL using the clj command in a terminal window. For a simple REPL, this does not need to be run in a directory containing a Clojure project.

clj

A Clojure REPL will now run. Type in a Clojure expression and press Return to see the result

Clojure CLI Tools REPL

Exit the REPL by typing Ctrl+D (pressing the Ctrl and D keys at the same time).

Run a Clojure program in a given file. This would be useful if you wanted to run a script or batch jobs.

clojure script.clj

Aliases can be added that define configurations for a specific build task:

clojure -A:my-task

You can use any legal Clojure keyword as an alias name and include multiple aliases with the clojure command. For example, in this command we are combining three aliases:
clojure -A:my-task:my-build:my-prefs

What version of Clojure CLI tools are installed?

The deps.edn file allows you to specify a particular version of the Clojure language the REPL and project use. You can also evaluate *clojure-version* in a REPL to see which version of the Clojure language is being used.
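
For example, in a REPL session (the exact output depends on the version installed):

clj
Clojure 1.10.1
user=> *clojure-version*
{:major 1, :minor 10, :incremental 1, :qualifier nil}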

clj -Sdescribe will show you the version of the Clojure CLI tools that is currently installed.

clojure cli tools - describe install version

clj -Sverbose will also show the version of Clojure CLI tools used before it runs a REPL.

deps.edn

deps.edn is a configuration file using extensible data notation (edn), the language that is used to define the structure of Clojure itself.

Configuration is defined using a map with top-level keys for :deps, :paths, and :aliases and any provider-specific keys for configuring dependency sources (e.g. GitHub, GitLab, Bitbucket).

~/.clojure/deps.edn for global configurations that you wish to apply to all the projects you work with

project-directory/deps.edn for project specific settings

The installation directory may also contain a deps.edn file. On my Ubuntu Linux system this location is /usr/local/lib/clojure/deps.edn and contains the following configuration.

{
  :paths ["src"]

  :deps {
    org.clojure/clojure {:mvn/version "1.10.1"}
  }

  :aliases {
    :deps {:extra-deps {org.clojure/tools.deps.alpha {:mvn/version "0.6.496"}}}
    :test {:extra-paths ["test"]}
  }

  :mvn/repos {
    "central" {:url "https://repo1.maven.org/maven2/"}
    "clojars" {:url "https://repo.clojars.org/"}
  }
}

Note: the install deps.edn is now deprecated and will not be included in a future version of the Clojure CLI tools.

The deps.edn files in each of these locations (if they exist) are merged to form one combined dependency configuration. The merge is done in the order install, config (user), project, with the last one winning. The operation is essentially merge-with merge, except for the :paths key, where only the last one found is used (the values are not combined).
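
For example, given a hypothetical user configuration and project configuration (both files below are illustrative), the combined result would be:

;; hypothetical ~/.clojure/deps.edn
{:aliases {:new {:extra-deps {seancorfield/clj-new {:mvn/version "0.7.6"}}}}}

;; hypothetical project deps.edn
{:paths ["src"]
 :deps {org.clojure/clojure {:mvn/version "1.10.1"}}}

;; combined configuration: maps are merged key by key, and since :paths
;; only appears in the project file its value is used as-is
{:paths ["src"]
 :deps {org.clojure/clojure {:mvn/version "1.10.1"}}
 :aliases {:new {:extra-deps {seancorfield/clj-new {:mvn/version "0.7.6"}}}}}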

You can use the -Sverbose option to see all of the actual directory locations.

Much more detail is covered in the Clojure.org article - deps and cli

Using Libraries - deps.edn

A deps.edn file in the top level of your project can be used to include libraries in your project. These may be libraries you are writing locally, projects in git (e.g. on GitHub) or libraries published to Maven Central or Clojars.

Include a library by providing its name and other aspects like version. This information can be found on Clojars if the library is published there.

Libraries as JAR files will be cached in the $HOME/.m2/repository directory.

Example: clojure.java-time

Declare clojure.java-time as a dependency in the deps.edn file, so Clojure CLI tools can download the library and add it to the classpath.

{:deps
 {org.clojure/clojure {:mvn/version "1.10.1"}
  clojure.java-time {:mvn/version "0.3.2"}}}

Writing code

For larger projects you should definitely find an editor that you find productive and that has great Clojure support. You can write code in the REPL, and you can just run a specific file of code if you don't want to set up a full project.

Create a directory what-time-is-it.

Create a deps.edn file in this directory with the following code:

{:paths ["src"]
 :deps {org.clojure/clojure {:mvn/version "1.10.1"}
        clojure.java-time {:mvn/version "0.3.2"}}}

Create a src directory and the source code file src/practicalli/what_time_is_it.clj which contains the following code:

(ns practicalli.what-time-is-it
  (:require [java-time :as time]))

(defn -main []
  (println "The time according to Clojure java-time is:"
           (time/local-date-time)))

The code has a static entry point named -main that can be called from Clojure CLI tools. The -m option defines the main namespace, and by default the -main function is called from that namespace. So the Clojure CLI tools provide a program launcher for a specific namespace:

clojure -m practicalli.what-time-is-it
The time according to Clojure java-time is: #object[java.time.LocalDateTime 0x635e9727 2019-08-05T16:04:52.582]

Using libraries from other places

With deps.edn you are not limited to using just dependencies from JAR files; it's much easier to pull code from anywhere.

TODO: Expand on this section in another article with some useful examples
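
In the meantime, here is a minimal sketch (the library name, repository URL and sha below are placeholders): a dependency can point directly at a Git repository by commit.

{:deps {practicalli/example-library
        {:git/url "https://github.com/practicalli/example-library"
         :sha "0123456789abcdef0123456789abcdef01234567"}}}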

Rebel readline

Rebel readline enhances the REPL experience by providing multi-line editing with auto-indenting, language completions, syntax highlighting and function argument hints as you code.

clj-new

clj-new is a tool to generate new projects from its own small set of templates. You can also create your own clj-new templates. It is also possible to generate projects from Leiningen or Boot templates, however, this does not create a deps.edn file for Clojure CLI tools, it just creates the project as it would from either Leiningen or Boot.

Add clj-new as an alias in your ~/.clojure/deps.edn like this:

{
 :aliases
 {:new {:extra-deps {seancorfield/clj-new
                     {:mvn/version "0.7.6"}}
        :main-opts ["-m" "clj-new.create"]}}
}

Create a Clojure CLI tools project using the clj-new app template

clj -A:new app myname/myapp
cd myapp
clj -m myname.myapp

The app template creates a couple of tests to go along with the sample code. We can use the Cognitect test runner to run these tests via the :test and :runner aliases:

clj -A:test:runner

clj-new currently has the following built-in templates:

  • app – a deps.edn project with sample code, tests and the Cognitect test runner, clj -A:test:runner. This project includes a (:gen-class) directive, so it can be run as an application on the command line via clj -m
  • lib – the same as the app template, but without the (:gen-class) directive, as this is meant to be a library.
  • template – the basis for creating your own templates.

figwheel-main

Use the figwheel-main template to create a simple ClojureScript project, optionally with one of the reagent, rum or om libraries.

Defining aliases

An alias is a way to add optional configuration to your project, which is applied when you include the alias name when running clojure or clj.
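
As an illustrative sketch (the alias name and contents here are hypothetical), an alias bundles extra paths and dependencies under a keyword in deps.edn:

{:aliases
 {:dev {:extra-paths ["dev"]
        :extra-deps {org.clojure/test.check {:mvn/version "0.10.0"}}}}}

Running clojure -A:dev then adds those paths and dependencies to the classpath.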

We will cover examples of using aliases as we discover more about Clojure CLI tools. For now, take a look at Clojure CLI and deps.edn - video by Sean Corfield

Multiple versions of Clojure CLI tools

Installing CLI tools downloads a tar file that contains the installation files, the executables, man pages, a default deps.edn file, an example-deps.edn and a Jar file.

Clojure cli tools - install - tar contents

The JAR file is installed in a directory called libexec and is not removed when installing newer versions of the Clojure CLI tools, so you may find multiple versions inside the libexec directory.

Clojure CLI tools - install - multiple versions

Summary

Despite the seemingly stripped-down set of options available in deps.edn (just :paths, :deps, and :aliases), it turns out that the :aliases feature really provides all you need to bootstrap a wide variety of build tasks directly into the clojure command. The Clojure community is building lots of tools on top of Clojure CLI tools to provide richer features that can simply be added as an alias.

What I really like best about this approach is that I can now introduce new programmers to Clojure using command line conventions that they are likely already familiar with coming from many other popular languages like perl, python, ruby, or node.


Thank you.
@jr0cket

Permalink

We need to talk about JSON

Like many in the Clojure community, I’ve considered JSON a rather horrible syntax (compared to the obvious superiority of EDN).

As a configuration format, JSON is used almost everywhere: npm’s package.json, AWS CloudFormation, Hashicorp Terraform, Open API, .NET, the list is endless.

For me, the worst issue is the exclusion of comments. Of all the problems we’re leaving to future generations, the complete lack of explanation in the configuration files of future legacy software systems must be among the most irksome. Understanding is good, clear explanations are good; discouraging the use of helpful and clarifying comments is inexcusable.

The situation would improve if developers stopped adopting JSON for configuration files - and there are JSON-like formats available that are much better suited to this task, such as Hjson, Jsonnet, YAML and others. The downside of these is that they require special format reading software which is rarely built-in to languages. At JUXT, we use Aero, which we consider to be the best of all worlds.

Change the perspective

Let’s start again and think about what JSON is:

{
  "list": ["string", {"map": {"key": "value"}}]
}

It’s maps, lists, maps-of-lists, lists-of-maps, lists-of-lists, maps-of-maps.

In fact, from a Clojure perspective, things couldn’t be better: Clojure is, hands down, the most adept language at processing this near-ubiquitous format. Of course, we’re a little bit biased at JUXT!
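
For instance, a minimal sketch using the clojure.data.json library (the snippet is mine, not from the original article):

(require '[clojure.data.json :as json])

;; JSON parses directly into Clojure's own data structures...
(json/read-str "{\"list\": [\"string\", {\"map\": {\"key\": \"value\"}}]}"
               :key-fn keyword)
;; => {:list ["string" {:map {:key "value"}}]}

;; ...so the whole standard library applies to it immediately
(get-in {:list ["string" {:map {:key "value"}}]} [:list 1 :map :key])
;; => "value"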

Back when everything was XML, and everyone wrote Java (or C#), I remember working on a large project. We spent weeks (literally, weeks!) working on serialisation and de-serialisation of data from XML into Java and back. I/O is a critical part of any program. Getting data in and out of a program shouldn’t be that hard, and with Clojure and JSON it isn’t.

Data format    Ideal processing language
CSV            SQL
SGML           DSSSL
XML            XSLT
JSON           CLJ/S (Clojure)

It’s almost as if, behind the scenes, there’s been this Grand Lisp Conspiracy to get everyone programming in Lisp, whether or not they realise it.


By disguising itself with some ornamental punctuation, JSON has infected the whole world with Lisp.

JSON Schema

JSON Schema is an emerging standard which is used to define the structure of JSON documents. It’s mostly used to validate that JSON documents conform to expectations. However, since JSON Schema allows anyone to define new keywords and add them to 'vocabularies', I believe JSON Schema should be thought of as a more general way of specifying meta-data.

When combined with meta-data, data takes on some magical super-powers.


One example is Mozilla’s React JSON Schema Form which demonstrates a React component that can dynamically produce forms via introspecting JSON data and its schema.

I believe there will be plenty of other uses of JSON Schema in the future.


One example I’m particularly interested in is the use of JSON Schema as a pull-syntax for our Crux database. JSON Schema allows you to declare the structure of the document you want to 'pull' out of Crux; Crux will perform the necessary graph queries against its documents. The result documents will successfully validate against the schema, as the query is the schema.

jinx

I wanted to explore the potential of JSON Schema to provide a way of accessing metadata about documents in order to make the data more useful and powerful.

Thanks to some help from some colleagues (in particular, Shivek Khurana and Sunita Kawane), these explorations have resulted in a new Clojure(Script) library called jinx, which provides schema-powered capabilities, including of course validation. We’re hoping this can be used for a variety of applications: not just to validate data documents but to access the annotations made via the JSON Schema document.

A quick example:

(juxt.jinx.validate/validate
  {"firstName" "John"
   "lastName" "Doe"
   "age" 21}
  (juxt.jinx.schema/schema
   {"$id" "https://example.com/person.schema.json"
    "$schema" "http://json-schema.org/draft-07/schema#"
    "title" "Person"
    "type" "object"
    "properties"
    {"firstName"
     {"type" "string"
      "description" "The person's first name."}
     "lastName"
     {"type" "string"
      "description" "The person's last name."}
     "age" {"description" "Age in years which must be equal to or greater than zero."
            "type" "integer"
            "minimum" 0}}}))

=>

{:instance {"firstName" "John", "lastName" "Doe", "age" 21},
 :annotations
 {"title" "Person",
  :properties
  {"firstName" {"description" "The person's first name."},
   "lastName" {"description" "The person's last name."},
   "age"
   {"description"
    "Age in years which must be equal to or greater than zero."}}},
 :type "object",
 :valid? true}

Our jinx library is still in ALPHA status but it would be great to hear all feedback and comments. It’s available on GitHub at https://github.com/juxt/jinx.

Permalink

The structure of an end-to-end Purescript OTP project

All the posts so far..

Useful links

  • demo-ps The demo codebase we're talking about here
  • erl-pinto (the opinionated bindings to OTP we're using)
  • erl-stetson (the opinionated bindings to Cowboy we're using)


Our demo-ps can be viewed as two separate chunks of code, the base layer is just a plain old Erlang application built using rebar3 and such, and then on top of that we have a pile of Purescript that compiles into Erlang that can then be compiled and used by the usual development workflow.

The Erlangy bits

  • release-files: Assets to be shipped during the release process
  • src: This is usually where the Erlang application lives, but there is no Erlang code
    • demo_ps.app.src: The entry point; it just points at a Purescript module - we'll talk about that below
  • rebar.config: Erlang dependencies and such
  • priv: Assets/files we want access to from code (static html/js/etc is covered here)

The purescript bits

  • server: The Purescript application that we want to compile into Erlang lives here
  • client: The Purescript application we want to compile into JS lives here
  • Makefile: Turns the Purescript into JS/Erlang
  • shared: Contains Purescript we'll share between JS/Erlang

In an ideal world, we'd just have a single Purescript entry point and forego our interaction with the Erlang world, but this would involve building out a lot more tooling. The result is that you will sometimes bring in Purescript dependencies that require Erlang dependencies; adding these to rebar.config and the entry point will be your responsibility.

The Purescript dependencies can be found in psc-package.json inside the server and client directories, and the Erlang dependencies can be found in rebar.config at the top level.

As a team already familiar with the Erlang ecosystem, this doesn't represent a hardship for us; but this definitely represents an area which could be improved by an enterprising developer or two, probably a plugin to the Purescript stack that stashes the rebar assets/etc in another build folder and allows us to just write PS/Erlang in the right place. (But this would then also involve modifying our editor plugins to know about this new structure, and as you can already see, it's a lot of work when we have something that is already functional..)

That entry point then

{application, demo_ps,
 [{description, "An OTP application"},
  {vsn, "0.1.0"},
  {registered, []},
  {mod, { bookApp@ps, []}},
  {applications,
   [kernel,
    stdlib,
    lager,
    gproc,
    recon,
    cowboy,
    jsx,
    eredis
   ]},
  {env,[]},
  {modules, []},
  {maintainers, []},
  {licenses, []},
  {links, []}
 ]}.

One of the key things to note here is that we have cowboy as a dependency; this is to support (as mentioned) the Purescript libraries that bind to it (stetson and erl-cowboy).

The other big note is that the entry point module is 'bookApp@ps' - that module can be found in server/src/BookApp.purs, which defines a module 'BookApp'. The Purescript compiler adds the '@ps' suffix to compiled module names, as this is unlikely to clash with anything else in the global application namespace. Beyond this entry point there is no Erlang code in the application itself - it's Purescript all the way down...

The Makefile in server/Makefile does the work of compiling this Purescript into Erlang that can then be compiled by the usual rebar3 toolchain. The gist of it is that we take all the .purs files lying around in the 'server' folder, and compile them into .erl files that end up in ../../src/compiled_ps.

We'll go into detail on the Purescript stuff in the next post, as that's the key; we put a pile of Erlang supporting files in the right location, and then write PS in the other location and everything "just kinda works".

Permalink

How do you develop algebraic thinking?

A few people have asked me how to develop Level 3 thinking. I’m not sure. But I’ve got some directions to try. In this episode, I go over 3 and a half ways to develop algebraic thinking.

Transcript

Eric Normand: How do you develop algebraic thinking? By the end of this episode, you should have three or maybe four different approaches to it. My name is Eric Normand. I help people thrive with functional programming.

I’ve been getting a bunch of questions along the lines of, “How do I develop level three thinking?” I’m writing a book, and answering this is part of writing the book. It’s a book about functional programming.

I’m trying to organize the material. Functional programming is a big, broad field. I have to choose what to teach and what not to teach. I’ve organized functional programming into three levels. I probably should call it functional thinking.

The first level is where you are able to distinguish actions from calculations and data. You can spot side effects. You’re pretty good with immutability. You know how to solve problems without resorting to side effects like global mutable variables, things like that. You can move stuff around and isolate the calculation from the action, that kind of thing.

Level two gets a little bit more sophisticated. You’re able to do higher order thinking. You’re using higher order functions, higher order actions. You’re using first class actions and even first class state if you need to. This lets you do stuff like data transformation pipelines with map filter and reduce, things like that.

Then there’s level three, which is what we’re talking about today. This I’m calling algebraic thinking. I’ve also thought about calling it something else, because algebraic thinking just brings up all sorts of questions, like why algebra? This isn’t high school math, those kinds of associations with the term algebra. It’s about building composable models.

This may be a better way of talking about it, but it’s a third level where you’re building very robust abstractions, basically algebras, which I explained in the last episode, so you should look that up.

I’ve been getting a bunch of questions about how to develop this. Now this whole show has been me exploring these ideas so that I can figure out how to put them into the book and so this is another exploration. I’m not sure. I don’t know.

This part is the last part of the book. It is the part that I have worked on the least and so these are really rough. Which is cool, you get to see them really early.

Here are three approaches to self-study to get to level three. Here we go. The first one I’m going to bring up is property-based testing. You’re probably familiar with unit testing or example-based testing where you give some arguments to a function and tell the system what to expect and that’s your test. You give specific values.

In property-based testing you don’t give specific values, those are generated randomly and you have to give a function basically that will tell the system whether the system under test got the right answer. This is called property-based testing because you’re developing properties — like algebraic properties — that your function must obey, must uphold.

Let’s go over it really quickly, because I don’t want to teach property-based testing right now; I just want to give it as an example. If you were going to test the Sort function, like you write a new Sort, you could test that with two properties. One is that the returned list — you call Sort and you get a list back — that that list is in order.

There’s another property you need which is that the return list is a permutation of the argument that you pass to Sort, the input list.

With those two properties you kind of covered the whole behavior of that function. You could test it with examples. You could give it the empty list. The empty list, sorting the empty list gives you empty list. Sorting a one gives you a one.

Sorting the list 2-1 gives you the list 1-2. You could give all the examples, but you never have to think higher order. You have somewhere in the back of your mind that it’s got to be the same elements. You have somewhere in the back of your mind that it’s got to be sorted when it comes out.

What you haven’t thought about how to write that down. How do you actually write a test that says for any input list, the output list is in order? What does it mean to be in order? How do I test that it is actually in order?

Likewise, how do I test that this list is a permutation of another list? How do I write that down? Those kinds of thinking where you have to deal with not just a given list, but every list. Any list. Each and every list. That is the kind of thinking that gets you to that next level.

It’s like first order logic. It’s not just A and B, it’s for all A. It’s a different level of thinking. By doing property-based testing I think you really force your mind to go there, because it’s all mental stuff. It’s all mental skills. I’m not talking about some very concrete thing, it’s all in your head.
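
To make that concrete, here is a minimal sketch of those two sort properties using the clojure/test.check library (the code is an illustration, not from the episode):

(require '[clojure.test.check :as tc]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop])

;; two properties that together cover sort's behavior
(def sort-properties
  (prop/for-all [xs (gen/vector gen/int)]
    (let [sorted (sort xs)]
      (and
       ;; 1. the returned list is in order
       (or (empty? sorted) (apply <= sorted))
       ;; 2. the returned list is a permutation of the input
       (= (frequencies sorted) (frequencies xs))))))

(tc/quick-check 100 sort-properties)
;; => {:result true, :num-tests 100, ...}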

Very related, that’s number two, is algebraic properties. There’s a lot of talk in the functional programming world about categories, category theory. This is similar reason to learn them, to learn categories as well as algebraic properties. Same idea.

They are very succinct truths about certain operations. They’re expressed in very small formulas. They’re relationships between a function and itself, or a function and another function. As an example, I could say a function F is commutative if F of A and B is equal to F of B and A.

The relationship is an equality, and it’s of that one function F with itself. It relates how this function should operate in a totally abstract sense. We didn’t say what A was. We didn’t say what B was. We’re basically saying for all A and all B that are valid arguments to that function, then this should be true.

We’re thinking at a more abstract level, at an algebraic level, where you can start talking about F of A and B. You don’t have to say F of one and seven, those are very specific numbers. You’re talking about all numbers, A and B, or any numbers A and B.

It’s the same thinking, but it’s a different approach. You study algebra, you start to see that things can have meaning without being specific.

Addition, you can say A+B because you’ve been to algebra class. You sat through it, you did all the exercises, and now you get that there’s stuff that you know about A+B, even though you don’t know A or B.

That’s the kind of thinking that you need to get, to get to level three. Another one, the third one. This one is much more difficult material, but it is material. It is something I can link you to that you can watch. Whereas, the other two are a little bit more general concepts.

The third thing is called denotational design. This is basically some talks and some papers by Conal Elliott. He’s big in the Haskell community. He gave a really good talk. It’s like a workshop. It’s 2 hours and 20 minutes of him stepping through his thought process, which he calls denotational design for building a graphics system.

I’ve watched it many times. [laughs] I think that that is the best resource available for understanding the mindset from first principles designing something like that. Some notable things about it. The first one is that he’s working with the type signatures of the functions.

The function signatures, what arguments does it take, what does it return, totally never shows a single implementation of what an image is. The fact that he can walk through slide after slide of code with no implementation. He’s just showing the function signatures. That is the level you need to get to.

I have a whole thing on types which I’m about to say after I finish with denotational design. I’m not saying you need types. I’m not. What I’m saying is that he was thinking at the level of this function does this. I can talk about it and how it behaves without showing the implementation.

He starts from first principles, he builds a type for the image, what is an image, and then he starts showing how you could implement a combinator that acts on the image. How do we overlay two images? How do we have a mask where a shape, a region is shown through and then the rest isn’t shown?

How do we add a background color, a certain transparent color? All these operations he does show how they are simply implemented, as long as you have this very simple notion. Simple, but elegant simple. It’s a very elegant notion of what an image is.

You never know how he’d implemented image, which is where you need to get to. You need to be able to think about the operations on image, without thinking about the concrete implementation. In his system, an image isn’t an array of pixels because you’ve already limited yourself to a certain representation.

He has said, “Let’s not even implement it yet, so that we can talk about the kinds of operations we want to be able to do in the abstract.” You notice I’m trying to get at something almost ineffable without using the term algebra. [laughs]

With all these three examples, the property-based testing, the algebraic properties, the denotational design, you’re able to think about the system without caring about how it’s implemented. That’s the algebra part. You’re thinking in this static reasoning. You can look at the expressions that you can build out of this algebra and reason about them.

A couple of people asked if this was something that types help with. I mentioned that Conal Elliott uses the types to talk about functions without having to implement them. I don’t want to get into the typed versus untyped debate. Not in this episode. I don’t think that types are necessary in the language you write them in. I think that this is all stuff that’s happening here.

What the types do help you with, though, if you use a strongly typed, statically typed language like Haskell, is that you get the discipline for free. You are forced to internalize the type system. You internalize a few things that help with this kind of thinking. One is case-based thinking.

In Haskell, if you have multiple cases in your algebraic data type, there’s a warning, or you can turn it into an error, if you forget one of your cases. I suggest turning it into an error so you always have to deal with all the cases.

What that makes you do is make sure that you’re covering the whole thing, that you create a total function. At least it helps. It makes you think about that. I’m going to make this function work for all values of this type. That is something that I see people not doing in the untyped world that they could learn better from. They’ll have to do it themselves [laughs] with internal discipline. Sit and really make sure that all the bases are covered.

The function works for all values because you don’t want corner cases. That messes up your algebra. Another thing is pushing information into the type. What do I mean by that? Some functions don’t work for negative numbers. There’s no built-in type in Haskell for non-negative numbers, but you can make one and it wouldn’t be hard. It’s a common exercise to do. It’s a few lines. Then you have this type that represents non-negative numbers.

What is a little harder and speaks to the level three thinking — algebraic thinking — is to realize that when you’re using that type you don’t really need to care where it came from. It came from somewhere.

When I’m programming this function, when I’m implementing this function and it has that type, it’s a function that takes a non-negative integer, I don’t have to care where it came from. I’m not thinking about that. I just know that this type guarantees x, y, z.

I’m pushing that information into the type. That kind of encapsulation is also part of it. That I don’t really have to worry about where it comes from. I can let that be for now. I’m encapsulating that knowledge. This function’s functionality.

Conal Elliott once said that one of the cool things about numbers, and one of the reasons why they’re so useful, is that if you have the number seven, you can’t know how you got that seven. You don’t know if it was 3+4, or 5+2, or 1+6, or 14*0.5, or whatever. You don’t know where it came from. It’s equal to every other seven, regardless of where it came from.

That’s it. That’s level three. [laughs] I don’t know how else to say it. If you’re thinking in those terms, I don’t care where this comes from, I don’t care how it got generated, that is level three.

To bring back the video example, I’ve been developing this example over several episodes. When I’m concatenating two videos, I don’t care where the videos came from. I want to be able to concatenate them, regardless of where they came from. I shouldn’t care. That is what I’m talking about, where you don’t care where it came from.

If you start in the wrong place, you could say, “Well, a video is a file on disc that has a sequence of frames.” Now when I’m developing concatenate, I brought all of this concrete stuff to concatenate. I might think, to implement concatenate, I open up a new file, copy the existing frames, and then add the other frames to the end of it.

You’ve already lost the abstract game. We do that. Before level three, we do that way too much, where we’re actually thinking about how to implement it way too fast. Way before we’ve thought about, what should it mean, what should it do, how does it relate to the other operations that are allowed.

This is a thing that I’ve experienced in Haskell where, if the type starts getting complicated, then you know you’ve got a problem. If you’ve got seven fields on your type, now there’s something wrong. You’re not really doing this kind of algebraic thinking.

The type has become too complex, it’s going to be impossible to do this kind of smooth, elegant algebra anymore. When you push information into the type, you got a signal for how complex the thing is. That’s really all I’ve got about to how to get to level three.

Part of it is pointing at it. Trying to point out to people who are not there and part of it is to try to use things that are there that you could go learn right now that will help you, that will help you get the ideas to get there.

My challenge as the author of the book that’s trying to get people there is to come up with good examples, good scenarios where this is useful. [laughs] It’s not just some abstract notion, and the exercises will get you there. You can see the benefit, because it is a feeling. There’s a gray line that you cross.

At some point, you think, “Wow, this is a really solid algebra. No corner cases, very small and elegant. I can think abstractly about it.” It’s just some invisible line that you cross at some point.

If you like this episode, you can find all the old past episodes at lispcast.com/podcast. There you’ll find audio, video, and text versions of all the episodes. You’ll also find links to subscribe whether it’s on your podcast player or YouTube or however you want to consume it. Also links to social media where you can get in discussion with me.

That’s all I’ve got. This has been my thought on functional programming. I’m Eric Normand. Thank you for being there and rock on.

The post How do you develop algebraic thinking? appeared first on LispCast.

Permalink

7 Easy functional programming techniques in Go

There is a lot of hype around functional programming (FP) and a lot of cool kids are doing it, but it is not a silver bullet. Like other programming paradigms/styles, functional programming also has its pros and cons, and one may prefer one paradigm over the other. If you are a Go developer and want to venture into functional programming, do not worry: you don't have to learn functional-programming-oriented languages like Haskell or Clojure (or even Scala or JavaScript, though they are not pure functional programming languages) since Go has you covered, and this post is for you.

If you are looking for functional programming in Java then check this out

I'm not gonna dive into all functional programming concepts in detail, instead, I'm gonna focus on things that you can do in Go which are in line with functional programming concepts. I'm also not gonna discuss the pros and cons of functional programming in general.

What is functional programming?

As per Wikipedia,

Functional programming is a programming paradigm—a style of building the structure and elements of computer programs—that treats computation as the evaluation of mathematical functions and avoids changing-state and mutable data.

Hence in functional programming, there are two very important rules

  • No Data mutations: It means a data object should not be changed after it is created.
  • No implicit state: Hidden/Implicit state should be avoided. In functional programming state is not eliminated; instead, it's made visible and explicit

This means:

  • No side effects: A function or operation should not change any state outside of its functional scope. That is, a function should only return a value to the invoker and should not affect any external state. This means programs are easier to understand.
  • Pure functions only: Functional code is idempotent. A function should return values only based on the arguments passed and should not affect (side-effect) or depend on global state. Such functions always produce the same result for the same arguments.

Apart from these there are functional programming concepts below that can be applied in Go, we will touch upon these further down.

Using functional programming doesn't mean it's all or nothing; you can always use functional programming concepts to complement object-oriented or imperative concepts in Go. The benefits of functional programming can be utilized whenever possible regardless of the paradigm or language you use. And that is exactly what we are going to see.

Functional programming in Go

Golang is a multi-paradigm language so let us see how we can apply some of the functional programming concepts above in Go.

First-class and higher-order functions

First-class functions(function as a first-class citizen) means you can assign functions to variables, pass a function as an argument to another function or return a function from another. Go supports this and hence makes concepts like closures, currying, and higher-order-functions easy to write.

A function can be considered as a higher-order-function only if it takes one or more functions as parameters or if it returns another function as a result.

In Go, this is quite easy to do

func main() {
    var list = []string{"Orange", "Apple", "Banana", "Grape"}
    // we are passing the array and a function as arguments to mapForEach method.
    var out = mapForEach(list, func(it string) int {
        return len(it)
    })
    fmt.Println(out) // [6 5 6 5]

}

// The higher-order-function takes an array and a function as arguments
func mapForEach(arr []string, fn func(it string) int) []int {
    var newArray = []int{}
    for _, it := range arr {
        // We are executing the method passed
        newArray = append(newArray, fn(it))
    }
    return newArray
}

Closures and currying are also possible in Go

// this is a higher-order-function that returns a function
func add(x int) func(y int) int {
    // A function is returned here as closure
    // variable x is obtained from the outer scope of this method and captured in the closure
    return func(y int) int {
        return x + y
    }
}

func main() {

    // we are currying the add method to create more variations
    var add10 = add(10)
    var add20 = add(20)
    var add30 = add(30)

    fmt.Println(add10(5)) // 15
    fmt.Println(add20(5)) // 25
    fmt.Println(add30(5)) // 35
}

There are also many built-in higher-order-functions in Go standard libraries. There are also some functional style libraries like this and this offering map-reduce like functional methods in Go.

Pure functions

As we saw already a pure function should return values only based on the arguments passed and should not affect or depend on global state. It is possible to do this in Go easily.

This is quite simple. Take the function below; it is a pure function. It will always return the same output for the given input, and its behavior is highly predictable. We can safely cache the method if needed.

func sum(a, b int) int {
    return a + b
}

If we add an extra line in this function, the behavior becomes unpredictable as it now has a side effect that affects an external state.

var holder = map[string]int{}

func sum(a, b int) int {
    c := a + b
    holder[fmt.Sprintf("%d+%d", a, b)] = c
    return c
}

So try to keep your functions pure and simple.

Recursion

Functional programming favors recursion over looping. Let us see an example for calculating the factorial of a number.

In traditional iterative approach:

func factorial(num int) int {
    result := 1
    for ; num > 0; num-- {
        result *= num
    }
    return result
}

func main() {
    fmt.Println(factorial(20)) // 2432902008176640000
}

The same can be done using recursion as below which is favored in functional programming.

func factorial(num int) int {
    if num == 0 {
        return 1
    }
    return num * factorial(num-1)
}
func main() {
    fmt.Println(factorial(20)) // 2432902008176640000
}

The downside of the recursive approach is that it will be slower than an iterative approach most of the time (the advantage we are aiming for is code simplicity and readability) and might result in stack overflow errors, since every function call needs to be saved as a frame on the stack. To avoid this, tail recursion is preferred, especially when the recursion is done too many times. In tail recursion, the recursive call is the last thing executed by the function, and hence the function's stack frame need not be saved by the compiler. Most compilers can optimize tail-recursive code the same way iterative code is optimized, hence avoiding the performance penalty. The Go compiler, unfortunately, does not do this optimization.

Using tail recursion, the same function can be written as below. Go doesn't optimize this (though there are workarounds), but it still performed better than plain recursion in benchmarks.

func factorialTailRec(num int) int {
    return factorial(1, num)
}

func factorial(accumulator, val int) int {
    if val == 1 {
        return accumulator
    }
    return factorial(accumulator*val, val-1)
}

func main() {
    fmt.Println(factorialTailRec(20)) // 2432902008176640000
}

I ran some benchmarks with all 3 approaches and here is the result. As you can see, looping is still the fastest, followed by tail recursion.

goos: linux
goarch: amd64
BenchmarkFactorialLoop-12           100000000           11.7 ns/op         0 B/op          0 allocs/op
BenchmarkFactorialRec-12            30000000            52.9 ns/op         0 B/op          0 allocs/op
BenchmarkFactorialTailRec-12        50000000            44.2 ns/op         0 B/op          0 allocs/op
PASS
ok      _/home/deepu/workspace/deepu105.github.io/temp  5.072s
Success: Benchmarks passed.

Consider using recursion when writing Go code for readability and immutability, but if performance is critical or if the number of iterations will be huge use standard loops.

Lazy evaluation

Lazy evaluation or non-strict evaluation is the process of delaying evaluation of an expression until it is needed. In general, Go does strict/eager evaluation, but for operators like && and || it does a lazy evaluation. We can utilize higher-order-functions, closures, goroutines, and channels to emulate lazy evaluation.

Take this example where Go eagerly evaluates everything.

func main() {
    fmt.Println(addOrMultiply(true, add(4), multiply(4)))  // 8
    fmt.Println(addOrMultiply(false, add(4), multiply(4))) // 16
}

func add(x int) int {
    fmt.Println("executing add") // this is printed since the functions are evaluated first
    return x + x
}

func multiply(x int) int {
    fmt.Println("executing multiply") // this is printed since the functions are evaluated first
    return x * x
}

func addOrMultiply(add bool, onAdd, onMultiply int) int {
    if add {
        return onAdd
    }
    return onMultiply
}

This will produce the below output, and we can see that both functions are always executed:

executing add
executing multiply
8
executing add
executing multiply
16

We can use higher-order-functions to rewrite this into a lazily evaluated version

func add(x int) int {
    fmt.Println("executing add")
    return x + x
}

func multiply(x int) int {
    fmt.Println("executing multiply")
    return x * x
}

func main() {
    fmt.Println(addOrMultiply(true, add, multiply, 4))
    fmt.Println(addOrMultiply(false, add, multiply, 4))
}

// This is now a higher-order-function hence evaluation of the functions are delayed in if-else
func addOrMultiply(add bool, onAdd, onMultiply func(t int) int, t int) int {
    if add {
        return onAdd(t)
    }
    return onMultiply(t)
}

This outputs the below, and we can see that only the required functions were executed:

executing add
8
executing multiply
16

There are also other ways of doing it using Sync & Futures like this and using channels and goroutines like this. Doing lazy evaluation in Go might not be worth the code complexity most of the time, but if the functions in question are heavy in terms of processing, then it is absolutely worth it to lazily evaluate them.

Type system

Go has a strong type system and also has pretty decent type inference. The only things missing compared to other functional programming languages are case classes and pattern matching.

Referential transparency

From Wikipedia:

Functional programs do not have assignment statements, that is, the value of a variable in a functional program never changes once defined. This eliminates any chances of side effects because any variable can be replaced with its actual value at any point of execution. So, functional programs are referentially transparent.

Unfortunately, there are not many ways to strictly limit data mutation in Go; however, by using pure functions and by explicitly avoiding data mutation and reassignment using the other concepts we saw earlier, this can be achieved. Go by default passes variables by value, except for slices and maps. So, avoid passing them by reference (using pointers) as much as possible.

For example, the below will mutate external state as we are passing a parameter by reference and hence doesn't ensure referential transparency

func main() {
    type Person struct {
        firstName string
        lastName  string
        fullName  string
        age       int
    }
    var getFullName = func(in *Person) string {
        in.fullName = in.firstName + in.lastName // data mutation
        return in.fullName
    }

    john := Person{
        "john", "doe", "", 30,
    }

    fmt.Println(getFullName(&john)) // johndoe
    fmt.Println(john) // {john doe johndoe 30}
}

If we pass parameters by value, we can ensure referential transparency even if there is an accidental mutation of passed data within the function

func main() {
    type Person struct {
        firstName string
        lastName  string
        fullName  string
        age       int
    }
    var getFullName = func(in Person) string {
        in.fullName = in.firstName + in.lastName
        return in.fullName
    }

    john := Person{
        "john", "doe", "", 30,
    }

    fmt.Println(getFullName(john)) // johndoe
    fmt.Println(john)              // {john doe  30}
}

We cannot rely on this when passed parameters are maps or slices.

Data structures

When using functional programming techniques, it is encouraged to use functional data structures such as stacks, maps, and queues. Maps, for instance, are better suited than arrays or hash sets as data stores in functional programming.

Conclusion

This is just an introduction for those who are trying to apply some functional programming techniques in Go. There is a lot more that can be done in Go, and with the addition of generics in the next major version, this should be even easier. As I said earlier, functional programming is not a silver bullet, but it offers a lot of useful techniques for more understandable, maintainable and testable code. It can co-exist perfectly well with imperative and object-oriented programming styles. In fact, we all should be using the best of everything.

I hope you find this useful. If you have any question or if you think I missed something please add a comment.

If you like this article, please leave a like or a comment.

You can follow me on Twitter and LinkedIn.

Permalink

Frameworks and Why (Clojure) Programmers Need Them

It seems like there's a strong aversion to using frameworks in the Clojure community. Other languages might need frameworks, but not ours! Libraries all the way, baby!

This attitude did not develop without reason. Many of us came to Clojure after getting burned on magical frameworks like Rails, where we ended up spending an inordinate amount of time coming up with hacks for the framework's shortcomings. Another "problem" is that Clojure tools like Luminus and the top-rate web dev libraries it bundles provide such a productive experience that frameworks seem superfluous.

Be that as it may, I'm going to make the case for why the community's dominant view of frameworks needs revision. Frameworks are useful. To convince you, I'll start by explaining what a framework is. I have yet to read a definition of framework that satisfies me, and I think some of the hate directed at them stems from a lack of clarity about what exactly they are. Are they just glorified libraries? Do they have to be magical? Is there some law decreeing that they have to be more trouble than they're worth? All this and more shall be revealed.

I think the utility of frameworks will become evident by describing the purpose they serve and how they achieve that purpose. The description will also clarify what makes a good framework and explain why some frameworks end up hurting us. My hope is that you'll find this discussion interesting and satisfying, and that it will give you a new, useful perspective not just on frameworks but on programming in general. Even if you still don't want to use a framework after you finish reading, I hope you'll have a better understanding of the problems frameworks are meant to solve and that this will help you design applications better.

Frameworks have second-order benefits, and I'll cover those too. They make it possible for an ecosystem of reusable components to exist. They make programming fun. They make it easier for beginners to make stuff.

Last, I'll cover some ways that I think Clojure is uniquely suited to creating kick-ass frameworks.

(By the way: I've written this post because I'm building a Clojure framework! So yeah this is totally my Jedi mind trick to prime you to use my framework. The framework's not released yet, but I've used it to build Grateful Place, a community for people who are into cultivating gratitude, compassion, generosity, and other positive practices. Just as learning Clojure makes you a better programmer, learning to approach each day with compassion, curiosity, kindness, and gratitude will make you a more joyful person. If you want to brighten your day and mine, please join!)

What is a Framework?

A framework is a set of libraries that:

  • Manages the complexity of coordinating the resources needed to write an application...
  • by providing abstractions for those resources...
  • and systems for communicating between those resources...
  • within an environment...
  • so that programmers can focus on writing the business logic that's specific to their product

I'll elaborate on each of these points using examples from Rails and from the ultimate framework: the operating system.

You might wonder, how is an OS a framework? When you look at the list of framework responsibilities, you'll notice that the OS handles all of them, and it handles them exceedingly well. Briefly: an OS provides virtual abstractions for hardware resources so that programmers don't have to focus on the details of, say, pushing bytes onto some particular disk or managing CPU scheduling. It also provides the conventions of a hierarchical filesystem with an addressing system consisting of names separated by forward slashes, and these conventions provide one way for resources to communicate with each other (Process A can write to /foo/bar while Process B reads from it) - if every programmer came up with her own bespoke addressing system, it would be a disaster. The OS handles this for us so we can focus on application-specific tasks.

Because operating systems are such successful frameworks we'll look at a few of their features in some detail so that we can get a better understanding of what good framework design looks like.

Coordinating Resources

Resources are the "materials" used by programs to do their work, and can be divided into four categories: storage, computation, communication, and interfaces. Examples of storage include files, databases, and caches. Computation examples include processes, threads, actors, background jobs, and core.async processes. For communication there are HTTP requests, message queues, and event buses. Interfaces typically include keyboard and mouse, plus screens and the systems used to display stuff on them: gui toolkits, browsers and the DOM, etc.

Specialized resources are built on top of more general-purpose resources. (Some refer to these specialized resources as services or components.) We start with hardware and build virtual resources on top. With storage, the OS starts with disks and memory and creates the filesystem as a virtual storage resource on top. Databases like Postgres use the filesystem to create another virtual storage resource to handle use cases not met by the filesystem. Datomic uses other databases like Cassandra or DynamoDB as its storage layer. Browsers create their own virtual environments and introduce new resources like local storage and cookies.

For computation, the OS introduces processes and threads as virtual resources representing and organizing program execution. Erlang creates an environment with a process model that's dramatically different from the underlying OS's. Same deal with Clojure's core.async, which introduces the communicating sequential processes computation model. It's a virtual model defined by Clojure macros, "compiled" to core clojure, then compiled to JVM bytecode (or JavaScript!), which then has to be executed by operating system processes.

Interfaces follow the same pattern: on the visual display side, the OS paints to monitors, applications paint to their own virtual canvas, browsers are applications which introduce their own resources (the DOM and <canvas>), and React introduces a virtual DOM. Emacs is an operating system on top of the operating system, and it provides windows and frames.

Resources manage their own entities: in a database, entities could include tables, rows, triggers, and sequences. Filesystem entities include directories and files. A GUI manages windows, menu bars, and other components.

(I realize that this description of resource is not the kind of airtight, axiomatic, comprehensive description that programmers like. One shortcoming is that the boundary between resource and application is pretty thin: Postgres is an application in its own right, but from the perspective of a Rails app it's a resource. Still, hopefully my use of resource is clear enough that you nevertheless understand what the f I'm talking about when I talk about resources.)

Coordinating these resources is inherently complex. Hell, coordinating anything is complex. I still remember the first time I got smacked in the face with a baseball in little league thanks to a lack of coordination. There was also a time period where I, as a child, took tae kwon do classes and frequently ended up sitting with my back against the wall with my eyes closed in pain because a) my mom for some reason refused to buy me an athletic cup and b) I did not possess the coordination to otherwise protect myself during sparring.

When building a product, you have to decide how to create, validate, secure, and dispose of resource entities; how to convey entities from one resource to another; and how to deal with issues like timing (race conditions) and failure handling that arise whenever resources interact, all without getting hit in the face. Rails, for instance, was designed to coordinate browsers, HTTP servers, and databases. It had to convey user input to a database, and also retrieve and render database records for display by the user interface, via HTTP requests and responses.

There is no obvious or objectively correct way to coordinate these resources. In Rails, HTTP requests would get dispatched to a Controller, which was responsible for interacting with a database and making data available to a View, which would render HTML that could be sent back to the browser.

You don't have to coordinate web app resources using the Model/View/Controller (MVC) approach Rails uses, but you do have to coordinate these resources somehow. These decisions involve making tradeoffs and imposing constraints to achieve a balance of extensibility (creating a system generic enough for new resources to participate) and power (allowing the system to fully exploit the unique features of a specific resource).

This is a very difficult task even for experienced developers, and the choices you make could have negative repercussions that aren't apparent until you're heavily invested in them. With Rails, for instance, ActiveRecord (AR) provided a good generic abstraction for databases, but early on it was very easy to produce extremely inefficient SQL, and sometimes very difficult to produce efficient SQL. You'd often have to hand-write SQL, eliminating some of the benefits of using AR in the first place.

For complete beginners, the task of making these tradeoffs is impossible because doing so requires experience. Beginners won't even know that it's necessary to make these decisions. At the same time, more experienced developers would prefer to spend their time and energy solving more important problems.

Frameworks make these decisions for us, allowing us to focus on business logic, and they do so by introducing communication systems and abstractions.

Resource Abstractions

Our software interacts with resources via their abstractions. I think of abstractions as:

  • the data structures used to represent a resource
  • the set of messages that a resource responds to
  • the mechanisms the resource uses to call your application's code

(Abstraction might be a terrible word to use here. Every developer over three years old has their own definition, and if mine doesn't correspond to yours just cut me a little slack and run with it :)

Rails exposes a database resource that your application code interacts with via the ActiveRecord abstraction. Tables correspond to classes, and rows to objects of that class. This is a choice with tradeoffs - rows could have been represented as Ruby hashes (a primitive akin to a JSON object), which might have made them more portable while making it more difficult to concisely express database operations like save and destroy. The abstraction also responds to find, create, update, and destroy. It calls your application's code via lifecycle callback methods like before_validation. Frameworks add value by identifying these lifecycles and providing interfaces for them when they're absent from the underlying resource.

You already know this, but it bears saying: abstractions let us code at a higher level. Framework abstractions handle the concerns that are specific to resource management, letting us focus on building products. Designed well, they enable loose coupling.

Nothing exemplifies this better than the massively successful file abstraction that the UNIX framework introduced. We're going to look at it in detail because it embodies design wisdom that can help us understand what makes a good framework.

The core file functions are open, read, write, and close. Files are represented as sequential streams of bytes, which is just as much a choice as ActiveRecord's choice to use Ruby objects. Within processes, open files are represented as file descriptors, which are usually a small integer. The open function takes a path and returns a file descriptor, and read, write, and close take a file descriptor as an argument to do their work.

Now here's the amazing magical kicker: file doesn't have to mean file on disk. Just as Rails implements the ActiveRecord abstraction for MySQL and Postgres, the OS implements the file abstraction for pipes, terminals, and other resources, meaning that your programs can write to them using the same system calls as you'd use to write files to disk - indeed, from your program's standpoint, all it knows is that it's writing to a file; it doesn't know that the "file" that a file descriptor refers to might actually be a pipe.
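The same design move works at the language level. Here's a minimal Clojure sketch (the write-greeting helper and the /tmp path are made up for illustration): clojure.java.io coerces files, raw streams, sockets, and more to a single writer abstraction, so the code doing the writing never knows what's on the other end.

(require '[clojure.java.io :as io])

;; The writing code only knows it has "a writer"...
(defn write-greeting [writer]
  (.write writer "hello, world\n")
  (.flush writer))

;; ...which might be backed by a file on disk:
(with-open [w (io/writer "/tmp/greeting.txt")]
  (write-greeting w))

;; ...or by the stream behind STDOUT; write-greeting can't tell the difference.
(write-greeting (io/writer System/out))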

Exercise for the reader: write a couple paragraphs explaining precisely the design choices that enable this degree of loose coupling. How can these choices help us in evaluating and designing frameworks?

This design is a huge part of UNIX's famed simplicity. It's what lets us run this in a shell:

# list files in the current directory and perform a word count on the output
ls | wc

The shell interprets this by launching an ls process. Normally, when a process is launched it creates three file descriptors (which, remember, represent open files): 0 for STDIN, 1 for STDOUT, and 2 for STDERR, and the shell sets each file descriptor to refer to your terminal (terminals can be files!! what!?!?). When your shell sees the pipe, |, it connects ls's STDOUT to the pipe's write end and wc's STDIN to the pipe's read end. The pipe links the processes' file descriptors, while the processes get to read and write "files" without having to know what's actually on the other end. No joke, every time I think of this I get a little excited tingle at the base of my spine.

This is why file I/O is referred to as the universal I/O model. I'll have more to say about this in the next section, but I share it here to illustrate how much more powerful your programming environment can be if you find the right abstractions. The file I/O model still dominates decades after its introduction, making our lives easier without our even having to understand how it actually works.

The canonical first exercise any beginner programmer performs is to write a program that prints out "Wassup, homies?". This program makes use of the file model, but the beginner doesn't even have to know that such a thing exists. This is what a good framework does. A well-designed framework lets you easily get started building simple applications, without preventing you from building more complicated and useful ones as you learn more.

One final point about abstractions: they provide mechanisms for calling your application's code. We saw this a bit earlier with ActiveRecord's lifecycle methods. Frameworks will usually provide the overall structure for how an application should interact with its environment, defining sets of events that you write custom handlers for. With ActiveRecord lifecycles, the structure of before_create, create, after_create is predetermined, but you can define what happens at each step. This pattern is called inversion of control, and many developers consider it a key feature of frameworks.

With *nix operating systems, you could say that in C programs the main function is a kind of onStart callback. The OS calls main, and main tells the OS what instructions should be run. However, the OS controls when instructions are actually executed because the OS is in charge of scheduling. It's a kind of inversion of control, right? 🤔

Communication

Frameworks coordinate resources, and (it's almost a tautology to say this) coordination requires communication. Communication is hard. Frameworks make it easier by translating the disparate "languages" spoken by resources into one or more common languages that are easy to understand and efficient, while also ensuring extensibility and composability. Frameworks also do some of the work of ensuring resilience. This usually entails:

  • Establishing naming and addressing conventions
  • Establishing conventions for how to structure content
  • Introducing communication brokers
  • Handling communication failures (the database is down! that file doesn't exist!)

One example many people are familiar with is the HTTP stack, a "language" used to communicate between browser and server resources:

  • HTTP structures content (request headers and request body as text)
  • TCP handles communication failures
  • IP handles addressing

Conventions

The file model is a "common language", and the OS uses device drivers to translate between the file model and whatever local language is spoken by hardware devices. It has naming and addressing conventions, letting you specify files on the filesystem using character strings separated by slashes that it translates to an internal inode (a data structure that stores file and directory details, like ownership and permissions). We're so used to this that it's easy to forget it's a convention; *nix systems could have been designed so that you had to refer to files using a number or a UUID. The file descriptors I described in the last section are also a convention.

Another convention the file model introduces is to structure content as byte streams, as opposed to bit streams, character streams, or XML documents. However, bytes are usually too low-level, so the OS includes a suite of command line tools that introduce the further convention of structuring bytes by interpreting them as characters (sed, awk, grep, and friends). More recently, more tools have been introduced that interpret text as YAML or JSON. The Clojure world has further tools to interpret JSON as Transit. My YAML tools can't do jack with your JSON files, but because these formats are all expressed in terms of lower-level formats, the lower-level tools can still work with them. Structure affects composability.
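Here's a quick Clojure illustration of that layering (assuming org.clojure/data.json is on the classpath; the payload is made up):

(require '[clojure.data.json :as json]
         '[clojure.string :as str])

(def payload "{\"animal\":\"bear\",\"scary\":true}")

;; Lower-level text tools still work, because JSON is "just" characters...
(str/includes? payload "bear")
;; => true

;; ...and JSON-aware tools can interpret the same characters as structure.
(get (json/read-str payload) "animal")
;; => "bear"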

The file model's simplicity is what allows it to be the "universal I/O model." I mean, just imagine if all Linux processes had to communicate with XML instead of byte streams! Hoo boy, what a crazy world that would be. Having a simple, universal communication system makes it extremely easy for new resources to participate without having to be directly aware of each other. It allows us to easily compose command line tools. It allows one program to write to a log while another reads from it. In other words, it enables loose coupling and all the attendant benefits.

Communication Brokers

Globally addressable communication brokers (like the filesystem, or Kafka queues, or databases) are essential to enabling composable systems. Global means that every resource can access it. Addressable means that the broker maintains identifiers for entities independently of its clients, and it's possible for clients to specify entities using those identifiers. Communication broker means that the system's purpose is to convey data from one resource to another, and it has well-defined semantics: a queue has FIFO semantics, the file system has update-in-place semantics, etc.

If Linux had no filesystem and processes were only allowed to communicate via pipes, it would be a nightmare. Indirect communication is more flexible than direct communication. It supports decoupling over time, in that reads and writes don't have to happen synchronously. It also allows participants to drop in and out of the communication system independently of each other. (By the way, I can't think of the name for this concept or some better way to express it, and would love feedback here.)

I think this is the trickiest part of framework design. At the beginning of the article I mentioned that developers might end up hacking around a framework's constraints, and I think the main constraint is often the absence of a communication broker. The framework's designers introduce new resources and abstractions, but the only way to compose them is through direct communication, and sometimes that direct communication is handled magically. (I seem to recall that Rails worked this way, with tight coupling between Controllers and Views and a lack of options for conveying Controller data to other parts of the system.) If someone wants to introduce new abstractions, they have to untangle all the magic and hook deep into the framework's internals, using -- or even patching! -- code that's meant to be private.

I remember running into this with Rails back when MongoDB was released; the document database resource was sufficiently different from the relational database resource that it was pretty much impossible for MongoDB to take part in the ActiveRecord abstraction, and it was also very difficult to introduce a new data store abstraction that would play well with the rest of the Rails ecosystem.

For a more current example, a frontend framework might identify the form as a resource, and create a nice abstraction for it that handles things like validation and the submission lifecycle. If the form abstraction is written in a framework that has no communication broker (like a global state container), then it will be very difficult to meet the common use case of using a form to filter rows in a table because there's no way for the code that renders table data to access the form inputs' values. You might come up with some hack like defining handlers for exporting the form's state, but doing this on an ad-hoc basis results in confusing and brittle code.

By contrast, the presence of a communication broker can make life much easier. In the Clojure world, the React frameworks re-frame and om.next have embraced global state atoms, a kind of communication broker similar to the filesystem (atoms are an in-memory storage mechanism). They also both have well defined communication protocols. I'm not very familiar with Redux but I've heard tell that it also has embraced a global, central state container.

If you create a form abstraction using re-frame, it's possible to track its state in a global state atom. It's further possible to establish a naming convention for forms, making it easier for other participants to look up the form's data and react to it. (Spoiler alert: the framework I've been working on does this!)
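Here's roughly what that looks like using re-frame's actual API; the :form/set-field and :form/field names are a hypothetical convention of my own, not something re-frame ships with.

(require '[re-frame.core :as rf])

;; Any participant can write form state into the global app-db...
(rf/reg-event-db
 :form/set-field
 (fn [db [_ form-id field value]]
   (assoc-in db [:forms form-id field] value)))

;; ...and any other participant can read it, without either one
;; knowing the other exists.
(rf/reg-sub
 :form/field
 (fn [db [_ form-id field]]
   (get-in db [:forms form-id field])))

(rf/dispatch-sync [:form/set-field :table-filter :query "bears"])
@(rf/subscribe [:form/field :table-filter :query])
;; => "bears"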

Communication systems are fundamental. Without them, it's difficult to build anything but the simplest applications. By providing communication systems, frameworks relieve much of the cognitive burden of building a program. By establishing communication standards, frameworks make it possible for developers to create composable tools, tools that benefit everybody who uses that framework. Standards make infrastructure possible, and infrastructure enables productivity.

In this section I focused primarily on the file model because it's been so successful and I think we can learn a lot from it. Other models include event buses and message queues. I'm not going to write about these because I'm not made of words, ok?!?

Environments

Frameworks are built to coordinate resources within a particular environment. When we talk about desktop apps, web apps, single page apps, and mobile apps, we're talking about different environments. From the developer's perspective, environments are distinguished by the resources that are available, while from the user's perspective different environments entail different usage patterns and expectations about distribution, availability, licensing, and payment.

As technology advances, new resources become available (the Internet! databases! smart phones! powerful browsers! AWS!), new environments evolve to combine those resources, and frameworks are created to target those environments. This is why we talk about mobile frameworks and desktop frameworks and the like.

One of the reasons I stopped using Rails was because it was a web application framework, but I wanted to build single page applications. At the time (around 2012?), I was learning to use Angular and wanted to deploy applications that used it, but it didn't really fit with Rails's design.

And that's OK. Some people write programs for Linux, some people write for macOS, some people still write for Windows for some reason (just kidding! don't kill me!). A framework is a tool, and tools are built for a specific purpose. If you're trying to achieve a purpose the tool isn't built for, use a different tool.

More Benefits of Using Frameworks

So far I've mostly discussed how frameworks bring benefits to the individual developer. In this section I'll explain how frameworks benefit communities, how they make programming fun, and (perhaps most importantly) how they are a great boon for beginners.

First, to recap, a framework is a set of libraries that:

  • Manages the complexity of coordinating the resources needed to write an application
  • By providing abstractions for those resources
  • And systems for communicating between those resources
  • Within an environment
  • So that programmers can focus on writing the business logic that's specific to their product

This alone lifts a huge burden off of developers. In case I haven't said it enough, this kind of work is hard, and if you had to do it every time you wanted to make an application it would be frustrating and exhausting. Actually, let me rephrase that: I have had to do this work, and it is frustrating and exhausting. It's why Rails was such a godsend when I first encountered it in 2005.

Frameworks Bring Community Benefits

Clear abstractions and communication systems allow people to share modules, plugins, or whatever you want to call framework extensions, creating a vibrant ecosystem of reusable components.

If you accept my assertion that an operating system is a framework, then you can consider any program which communicates via one of the OS's communication systems (sockets, the file model, etc) to be an extension of the framework. Postgres is a framework extension that adds an RDBMS resource. statsd is an extension that adds a monitoring resource.

Similarly, Rails makes it possible for developers to identify specialized resources and extend the framework to easily support them. One of the most popular and powerful is Devise, which coordinates Rails resources to introduce a new user authentication resource. Just as using Postgres is usually preferable to rolling your own database, using Devise is usually preferable to rolling your own authentication system.

Would it be possible to create a Devise for Clojure? I don't think so. Devise is designed to be database agnostic, but because Clojure doesn't really have a go-to framework that anoints or introduces a standard database abstraction, no one can write the equivalent of Devise in such a way that it could easily target any RDBMS. Without a framework, it's unlikely that someone will be able to write a full-featured authentication solution that you can reuse, and if you write one it's unlikely others would see much benefit if you shared it. I think it's too bad that Clojure is missing out on these kinds of ecosystem benefits.

Another subtler benefit frameworks bring is that they present a coherent story for how developers can build applications in your language, and that makes your language more attractive. Building an application means coordinating resources for the environment you're targeting (desktop, mobile, SPA, whatever). If your language has no frameworks for a target environment, then learning or using the language is much riskier. There's a much higher barrier to building products: not only does a dev have to learn the language's syntax and paradigms, she has to figure out how to perform the complex task of abstracting and coordinating resources using the language's paradigms. If your goal is to create a mass-market product, choosing a language that doesn't have frameworks for your target environments is a risky choice.

Finally, frameworks become a base layer that you can create tooling for. The introduction of the filesystem made it possible for people to write tools that easily create and manipulate files. Rails's abstractions made it easy to generate code for creating a new database table, along with an entire stack - model, view, controller - for interacting with it.

Frameworks Make Development Fun

If you still think frameworks are overkill or more trouble than they're worth, believe me I get it. When I switched from Rails to Clojure and its "libraries not frameworks" approach, I loved it. A framework felt unnecessary because all the pieces were so simple that it was trivial for me to glue them together myself. Also, it was just plain fun to solve a problem I was familiar with because it helped me learn the language.

Well, call me a jaded millennial fart, but I don't think that this work is fun anymore. I want to build products, not build the infrastructure for building products. I want a plugin that will handle the reset password process for me. I want an admin panel that I can get working in five minutes. Frameworks handle the kind of work that ideally only has to be done once. I don't want to have to do this work over and over every time I want to make something.

For me, programming is a creative endeavor. I love making dumb things and putting them in front of people to see what will happen. Rails let me build (now defunct) sites like phobiatopia.com, where users could share what they're afraid of. The site would use their IP address to come up with some geo coordinates and use Google Maps to display a global fear map. A lot of people were afraid of bears.

Frameworks let you focus on the fun parts of building an app. They let you release an idea, however dumb, more quickly.

Frameworks Help Beginners

Frameworks help beginners by empowering them to build real, honest-to-god running applications that they can show to their friends and even make money with, without having to fully understand or even be aware of all the technology they're using. Being able to conjure up a complete creation, no matter how small or ill-made, is the very breath of wonder and delight. (I don't know exactly what this means, but I like how it sounds!)

There's a kind of thinking that says frameworks are bad because they allow beginners to make stuff without having to know how it all works. ActiveRecord is corrupting the youth, allowing them to build apps without even knowing how to pronounce SQL.

There's another line of thinking that says it's bad to try to make things easier for beginners. It's somehow virtuous for people to struggle or suffer for the sake of learning.

Hogwash. Fiddlefaddle. Poppycock. Joy beats suffering every time, and making learning more joyful allows more people to reap the benefits of whatever tool or product you've created.

I am a photographer. I have a professional camera, and I know how to use it. Some of my photos require a fair amount of technical knowledge and specialized equipment:

[photo: tea]

This isn't something you can create with a camera phone, yet somehow I'm able to enjoy myself and my art without complaining that point-and-shoot cameras exist and that people like them.

Novices benefit greatly from expert guidance. I don't think you can become a master photographer using your phone's camera, but with the phone's "guidance" you can take some damn good photos and be proud of them. And if you do want to become a master, that kind of positive feedback and sense of accomplishment will give you the motivation to stick with it and learn the hard stuff. Frameworks provide this guidance by creating a safe path around all the quicksand and pit traps that you can stumble into when creating an app. Frameworks help beginners. This is a feature, not a bug.

A Clojure Framework

Frameworks are all about managing the complexity of coordinating resources. Well, guess what: Managing Complexity is Clojure's middle name. Clojure "Managing Complexity" McCarthy-Lisp. Personally, I want a single-page app (SPA) framework, and there are many aspects of Clojure's design and philosophy that I think will make it possible to create one that seriously kicks ass. I'll give just a few examples.

First, consider how Linux tools like sed and awk are text-oriented. Developers can add additional structure to text by formatting it as JSON or YAML, and those text-processing tools can still work with the structured text.

In the same way, Clojure's emphasis on simple data structures means that we can create specialized structures to represent forms and ajax requests, and tools to process those structures. If we define those structures in terms of maps and vectors, though, we'll still be able to use a vast ecosystem of functions for working with those simpler structures. In other words, creating specialized structures does not preclude us from using the tools built for simpler structures, and this isn't the case for many other languages.
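For instance, here's a hypothetical form representation built from nothing but maps and keywords; every generic function in clojure.core still applies to it:

;; A specialized structure for forms, made of plain maps.
(def signup-form
  {:form/id     :signup
   :form/state  :fresh
   :form/fields {:email    {:value "" :required? true}
                 :password {:value "" :required? true}}})

;; Form-specific tools can give it special meaning, but the generic
;; toolkit keeps working:
(keys (:form/fields signup-form))
;; => (:email :password)
(assoc-in signup-form [:form/fields :email :value] "h@example.com")
(assoc signup-form :form/state :dirty)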

Second, Clojure's abstraction mechanisms (protocols and multimethods) are extremely flexible, making it easy for us to implement abstractions for new resources as they become available.
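A minimal sketch of what that flexibility looks like (KeyValueStore is a made-up protocol, not from any library):

(defprotocol KeyValueStore
  (put! [store k v])
  (fetch [store k]))

;; One implementation backed by an atom...
(defrecord AtomStore [state]
  KeyValueStore
  (put! [_ k v] (swap! state assoc k v))
  (fetch [_ k] (get @state k)))

;; ...and a later library can extend the same protocol to Redis,
;; Postgres, or localStorage without touching this code.
(def store (->AtomStore (atom {})))
(put! store :greeting "hello")
(fetch store :greeting)
;; => "hello"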

Third, you can use the same language for the frontend and backend!!! Not only that, Transit allows the two to effortlessly communicate. This eliminates an entire class of coordination problems that frameworks in other languages have to contend with.
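For example, with com.cognitect/transit-clj on the backend (and its ClojureScript twin on the frontend), rich Clojure data round-trips intact; the map below is made up:

(require '[cognitect.transit :as transit])
(import '[java.io ByteArrayInputStream ByteArrayOutputStream])

;; Write Clojure data as Transit on the backend...
(def out (ByteArrayOutputStream.))
(transit/write (transit/writer out :json)
               {:user/name "ursula" :user/fears #{:bears}})

;; ...and read the same keywords and sets back out. The frontend
;; does the equivalent with transit-cljs.
(transit/read (transit/reader (ByteArrayInputStream. (.toByteArray out)) :json))
;; => {:user/name "ursula", :user/fears #{:bears}}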

In my opinion, the Clojurian stance that frameworks are more trouble than they're worth is completely backwards: Clojure gives us the foundation to build a completely kick-ass framework! One that's simple and easy. One can dream, right?

My ambition in building a SPA framework is to empower current and future Clojure devs to get our ideas into production fast. I want us to be able to spend more time on the hard stuff, the fun stuff, the interesting stuff. And I want us to be able to easily ship with confidence.

The framework I'm building is built on top of some truly amazing libraries, primarily Integrant, re-frame, and Liberator. Integrant introduces a component abstraction and handles the start/stop lifecycle of an application. re-frame provides a filesystem and communication broker for the frontend. Liberator introduces a standard model for handling HTTP requests.
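To make the Integrant part concrete, here's a tiny made-up system (the ::store and ::handler keys are hypothetical): Integrant reads a data description of components, starts them in dependency order, and hands each component its started dependencies.

(require '[integrant.core :as ig])

(def config
  {::store   {}
   ::handler {:store (ig/ref ::store)}})

(defmethod ig/init-key ::store [_ _]
  (atom {:greeting "hello"}))

(defmethod ig/init-key ::handler [_ {:keys [store]}]
  (fn [req] (get @store (:key req))))

;; init starts ::store first, then passes it to ::handler.
(def system (ig/init config))
((::handler system) {:key :greeting})
;; => "hello"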

If my framework is useful at all it's because the creators of those tools have done all the heavy lifting. My framework introduces more resources and abstractions specific to creating single-page apps. For example, it creates an abstraction for wrapping AJAX requests so that you can easily display activity indicators when a request is active. It creates a form abstraction that handles all the plumbing of handling input changes and dispatching form submission, as well as the entire form lifecycle of fresh, dirty, submitted, invalid, succeeded, etc. It imposes some conventions for organizing data.

As I mentioned, the framework is not quite ready for public consumption yet because there's still a lot of churn while I work out ideas, and because there's basically no documentation, but I hope to release it in the near future.

If you'd like to see a production app that uses the framework, however, I invite you to check out Grateful Place, a community site for people who want to support each other in growing resilience, peace, and joy by practicing compassion, gratitude, generosity, and other positive values. By joining, you're not just helping yourself, you're helping others by letting them know that you support them and share their values.

Please click around and look at the snazzy loading animations. And if you feel so moved, please do join! I love getting to interact with people in that context of mutual support for shared values. One of the only things I care about more than Clojure is helping people develop the tools to navigate this crazy-ass world :D

In the meantime, I'll keep working on getting this framework ready for public consumption. Expect another blawg article sharing some details on how Grateful Place is implemented. Then, eventually, hopefully, an actual announcement for the framework itself :)

If you don't want to wait for my slow butt, then check out some of the amazing Clojure tools that already exist:

  • Luminus
  • Fulcro, which probably does everything I want my framework to, only better
  • re-frame remains my favorite frontend framework
  • duct is great but its docs aren't that great yet
  • Coast on Clojure, a full stack web framework

(Sorry if I neglected your amazing Clojure tool!)

Thanks to the following people who read drafts of this article and helped me develop it:

  • Mark Bastian
  • Dmitri Sotnikov aka @yogthos
  • Sergey Shvets
  • Kenneth Kalmer
  • Sean whose last name I don't know
  • Tom Brooke
  • Patrick whose last name I don't know (update: It's Patrick French!)
  • Fed Reggiardo
  • Vincent Raerek
  • Ernesto de Feria
  • Bobby Towers
  • Chris Oakman
  • The TriClojure meetup

Permalink

Introducing Pinto and Stetson - Opinionated Purescript bindings to OTP and Cowboy

If you're reading this, you've either been given the link to the posts ahead of time cos you've asked to see what is going on, or I've hit the publish button, in which case hooray. Either way, this is a little series of posts going through some of the Purescript/Purerl code that we've quietly open sourced on Github under the Apache 2.0 license. Hopefully, between these posts, the published markdown docs and the sample application, there will be enough to get started with.

Over the last year or so, we've been gradually building out our capacity to create applications end-to-end in Purescript, compiled to JS on the front-end and compiled to Erlang on the back, building on top of both OTP and our existing libraries from nearly a decade of company history doing business on top of the Erlang stack.

The repositories we're looking at are described in the sections that follow.

The best place to start if you want to dive right in, is probably the demo-ps project as it demonstrates the usage of most of the above, and that is indeed where we'll be starting in this series.

Purerl

The Purerl organisation contains the core sets of bindings to much of Erlang's base libraries, as well as the fork of the Purescript compiler that can generate Erlang as a backend.

Purerl-package-sets

Essentially a pile of Dhall that generates a package.json containing a list of versions of the various Purerl libraries that work together, you'll not need to touch this directly unless you end up using Purerl internally in an organisation and you want to fork it and add your own internal/private Purerl dependencies.

Stetson

Cowboy is the de-facto webserver in the Erlang world, and direct bindings exist for the project already. However, when it came time to start building applications on top of this, it was clear that there was little gain to be had by directly using them over simply writing Erlang in the first place. Stetson was my attempt to mirror the experience I've had in other functional languages using libraries such as Compojure and Scotty. It isn't by any means complete, and merely serves as a statement of intent around the kind of interaction I'd personally like to have around routing/etc in a Purerl world. I fully hope/expect that somebody will write a native http server in time rather than simply wrapping Cowboy as I have done here.

Pinto

There have been a few examples written demonstrating how to interact with OTP from Purerl, but again at the point of building a real application, direct bindings don't offer a good user experience once you start building out functionality and repeating yourself a whole ton. I cheated a lot when putting together Pinto and skipped the direct bindings step, going straight to the "desired usage" step and doing a pile of cheats around types and such. It seeks to largely mirror the existing OTP interactions, but in a more functional manner. Much like with Stetson, I fully expect/hope that in time somebody (maybe even us) will want a more idiomatic Purescript experience and choose to build something even more opinionated outside the familiar comfort of the OTP vocabulary. For now, we have Pinto.. :)

Demo-ps

This is a completely pointless web app that uses purescript-erl-stetson, purescript-erl-pinto, purescript-simple-json and purescript-halogen to store data in Redis using some FFI and display it in a single page application, sharing the view models between the server and client components. It seeks to demonstrate rough usages of all of these without cluttering up the interactions with "real code" (read: business logic).

Next post, we'll look at the structure of the demo-ps project, as understanding this is essential if you wish to build your own.

Permalink

PurelyFunctional.tv Newsletter 339: Design is about pulling things apart

Issue 339 – August 12, 2019 · Archives · Subscribe

Clojure Tip 💡

Design is about pulling things apart

I think it was Rich Hickey who said this. It’s a big part of the philosophy of Clojure and its community. Pulling things apart is why we have reducers: they pull folding apart from ordering so that folds can be parallelized. Pulling things apart is why we have transducers: they pull sequence operations (like map: do f to every element) apart from where you find the items (sequences or core.async channels). It’s also what lets Datomic scale reads separately from writes. And of course, there’s the term decomplect.
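Transducers make this concrete. In this small sketch, the transformation is defined once, apart from any source, and then reused against a sequence and a core.async channel:

(require '[clojure.core.async :as async])

;; The transformation, pulled apart from where the items live:
(def xform (comp (map inc) (filter even?)))

;; Applied to a sequence...
(into [] xform (range 10))
;; => [2 4 6 8 10]

;; ...or to a core.async channel, unchanged:
(def ch (async/chan 10 xform))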

I don’t know if design is only about pulling things apart, but it’s a big part of it. Pulling things apart lets you get really granular when you’re controlling things. So much complexity comes from not having the precise control you need. Instead, you’re doing workarounds, which add complexity. If you could just get down a layer, where things are controlled separately, you could do just what was needed.

But it’s not easy. The first step is to recognize the problem. The second step is to see that maybe things could be separated. Third is figuring out how to separate them. This third step is usually a messy, iterative process with many experiments. You won’t know if you can succeed until you do.

Even if you don’t succeed, it’s worth the effort. You’ll learn a lot.

Just finished reading 📖

At home in the universe by Stuart Kauffman.

Well, what a great book. It came out in 1995. Complexity theory was so hot then, and this book had a lot to do with it. By working with computational models, Kauffman explains a lot about our universe and why there is life. After reading a few chapters, I was very convinced that life is inevitable. The details are random, but it’s quite easy to reach a critical mass of proteins that self-catalyze. Once you’re over the threshold, you get a mess of new proteins, and those new proteins make it even more likely to catalyze.

It made me think a lot about computation. What can we learn from life to design distributed systems that are vastly larger than we currently do? How can you guarantee that it will converge on an answer?

This book has not changed my life, but I think it could have if I had read it when I was younger. In high school, I was getting into complexity theory and artificial life, but I never came across this book. If I had read it then, it may have altered my path substantially.

Book status 📖

Well, I am confident now that the book will be published very soon. It was in process internally at the publisher for a while, but it looks like it could emerge this week. Yes, it has been stressful. I want my book out there, and if the messages I get are any indication, many people also want it. But the publisher is making it as good as they can before it comes out.

I’ll keep posting about it weekly in the newsletter. Or you can sign up for the book-specific launch list here.

Currently recording 🎥

I am now recording a course called Property-Based Testing with test.check. Property-based testing (PBT) is a powerful tool for generating tests instead of writing them. You’ll love the way PBT makes you think about your system. And you can buy it now in Early Access. Of course, PF.tv members have had access since the beginning, along with all of the other courses.

New lessons this week include:

  1. Strategies for properties: generate the output — in which we test things that are not easy to generate.
  2. Strategies for properties: metamorphic properties — in which we test things even if we don’t know exactly what they do.
  3. Behind the scenes: size — in which we peer behind the curtain at how random values are sized.

Last week I said that there were three more lessons for properties, and here I am showing you only two. I reviewed the notes for the third one and I had already covered everything. Soon we’re going to get into real testing examples. But before that, we have to talk about shrinkage.

I’ve made the first and fifth lessons free to watch. Go check them out.

Members already have access to the lessons. The Early Access Program is open. If you buy now, you will get the already published material and everything else that comes out at a serious discount. There are already 6 hours of video, and looking at my plan, this one might be 9-10 hours. But that’s just an estimate. It could be more or less. The uncertainty about it is why there’s such a discount for the Early Access Program.

Clojure Challenge 🤔

Last week’s challenge

The puzzle in Issue 338 was to write a function that created the derivative of a function using the limit definition of the derivative.

You can check out the submissions here.
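For reference, here’s a minimal sketch of one approach, using a small fixed dx rather than a true limit:

(defn deriv [f]
  (let [dx 1e-8]
    (fn [x]
      (/ (- (f (+ x dx)) (f x))
         dx))))

((deriv (fn [x] (* x x))) 3)
;; => approximately 6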

This week’s challenge

symbolic differentiation

I got excited last week by the numeric differentiator challenge. When I originally read that section of SICP so many years ago, I was elated. Do you know how hard that would be to do in Java?

Now for something that’s even harder to do in Java: symbolic differentiation.

Here’s the relevant link into SICP, for reference.

The challenge this week is to use four differentiation formulas to make a symbolic differentiator. It has been a while since I’ve done any calculus, so I had to look these up:

  1. dc/dx = 0 (for a constant c, or variable c different from x)
  2. dx/dx = 1
  3. d(u+v)/dx = du/dx + dv/dx (addition rule)
  4. d(uv)/dx = u(dv/dx) + v(du/dx) (product rule)

This won’t differentiate every formula, but it will differentiate sums and products of variables and numbers. You should make a function that takes an expression and a variable to differentiate by, like so:

(deriv '(+ x 5) 'x) ;=> (+ 1 0)

Don’t worry about simplifying the final expression. Just make sure it’s right.
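If you want a nudge before writing your own, here’s one possible sketch that handles just the four rules above (binary + and * only):

(defn deriv [expr x]
  (cond
    (number? expr) 0                        ; rule 1: constants
    (symbol? expr) (if (= expr x) 1 0)      ; rules 1 & 2: variables
    :else
    (let [[op u v] expr]
      (case op
        + (list '+ (deriv u x) (deriv v x)) ; rule 3: addition
        * (list '+                          ; rule 4: product
                (list '* u (deriv v x))
                (list '* v (deriv u x)))))))

(deriv '(+ x 5) 'x)
;; => (+ 1 0)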

As usual, please send me your implementations. I’ll share them all in next week’s issue. If you send me one, but you don’t want me to share it publicly, please let me know.

Rock on! Eric Normand

The post PurelyFunctional.tv Newsletter 339: Design is about pulling things apart appeared first on PurelyFunctional.tv.

Permalink

Copyright © 2009, Planet Clojure. No rights reserved.
Planet Clojure is maintained by Baishampayan Ghose.
Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
Theme by Brajeshwar.