Simulation Testing using Datomic simulant library

Why?

Dynamic, REPL driven languages like Clojure are brilliant for rapid development.

However once the system is deployed in production and real people depend upon its continued functioning, the interactive dynamic nature can become a hindrance.

How can you be sure that your refactoring doesn’t break the existing system? What are the unforeseen consequences of deploying your latest feature? How will the system scale when you become successful?

These are questions that simulant will help to answer.

What?

Simulant is a smallish library belonging to the datomic.com domain.

It is fizzing with good ideas yet feels somewhat unfinished. This is partly a consequence of its problem domain; 80% of any simulation testing code is likely to be bespoke for the domain in question. However, the simulant code base could be doing more to be generally helpful.

Concepts

Simulant uses datomic to read and write almost all its data.

This provides

  • Temporal Separation: You can generate a batch of tests and store them in datomic, then create sims out of them a long time afterwards.

  • Scalability: You can use datomic as memory that is shared across multiple Java Virtual Machines. You can thus co-ordinate multiple JVMs running across multiple machines, as long as they all have access to the datomic database.

  • Long Term Testing: You can keep tests and sim reports for long periods of time. This means that you can rerun the same test script on new versions of your code and discover what is newly broken, and how you have affected performance. You can analyse these reports over time to show how the project is progressing.

It does mean that you will need a reasonable understanding of datomic in order to get anything done.

Simulant splits its work as follows, all modelled as datomic entities

Model

Describes high level usage scenarios;

  • What are we testing?

  • How many users?

  • What ratio of corrupted or nonsensical data?

Test

Takes a model and generates a runbook for testing that is independent of any specific target system. A Test entity will contain at least

  • a Test Type

  • a set of TestAgent entities

TestAgent

A Test Agent has a list of actions, which are presented for execution in order of their atTime value. Actions are run independently, within a bounded thread pool.

Action

An action models a distinct unit of work (and the verification of whether it completed normally). It needs to contain at least

  • action type, used to drive the perform-action multimethod (declared with defmulti)

  • atTime, which provides a rough real-time ordering. This controls the time at which the action is submitted to the thread pool; a busy thread pool may execute actions in arbitrary order
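
The dispatch on action type can be sketched with an ordinary multimethod. This is a minimal, self-contained illustration rather than simulant's own code: plain maps stand in for datomic entities, and the attribute names and the :action.type/login type are assumptions made for the example.

```clojure
;; Hypothetical sketch: plain maps stand in for datomic action entities.
(defmulti perform-action
  "Runs one action; dispatches on the action's :action/type."
  (fn [action _process] (:action/type action)))

(defmethod perform-action :action.type/login
  [action _process]
  {:action action :status :ok})

;; Any action type without its own method ends up here.
(defmethod perform-action :default
  [action _process]
  {:action action :status :unknown-type})

(def actions
  [{:action/type :action.type/login  :action/atTime 2000}
   {:action/type :action.type/search :action/atTime 500}])

;; atTime decides the order of submission, not the order of completion.
(def results
  (mapv #(perform-action % {}) (sort-by :action/atTime actions)))
```

Here the search action is submitted first because its atTime is smaller, and it falls through to the :default method because no method was defined for its type.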

Sim

Associates a test with a specific system. It must contain

  • datomic URI: where the sim may retrieve its configuration data

  • process count: how many distinct Java Virtual Machines will share the execution of the sim's agents

It will probably also include

  • the URL of the system under test

  • a set of credentials to gain test access to the system under test

Service

A Service is spawned on every process that has joined the simulation. It models a resource that has its own start/stop lifecycle. For example

  • logsvc used to report the results of a simulant action

  • loginsvc used to acquire session cookies, or other authorisation data

  • uuidsvc: I am testing a system that allocates UUIDs to all its data items. I therefore need a way to retrieve the specific UUID for this sim for an action item.

  • processStateService: This can be used to store data that belongs within one specific simulation run

A Service stores all the data required to construct it in a datomic entity (including its factory method). A Service instance is then constructed in every sim process.
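
As a rough illustration of "a stored description plus a factory", here is a self-contained sketch. The Lifecycle protocol, the LogService record and the :service/* keys are all hypothetical names invented for this example; they are not simulant's actual protocol or schema, and a plain map stands in for the datomic entity.

```clojure
;; Hypothetical lifecycle protocol, not simulant's own.
(defprotocol Lifecycle
  (start! [svc] "Prepare the resource; returns the service.")
  (stop!  [svc] "Release the resource; returns whatever was collected."))

(defrecord LogService [entries]
  Lifecycle
  (start! [svc] (reset! entries []) svc)
  (stop!  [svc] @entries))

(defn make-log-service
  "Factory: builds a service from its stored description."
  [_description]
  (->LogService (atom nil)))

;; Stand-in for the stored entity: it names its factory as a symbol.
(def service-description
  {:service/key         :logsvc
   :service/constructor `make-log-service})

(defn construct
  "Looks up the factory named in the description and calls it."
  [{:service/keys [constructor] :as description}]
  ((resolve constructor) description))
```

Each process would call construct and start! once, use the service while actions run, and call stop! when the sim finishes.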

SimClock

Describes how the atTime attribute is used to dispatch actions. The default behaviour is to use atTime as a millisecond offset from the start time, sped up by a constant factor.
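
The arithmetic behind that default can be sketched as follows. The function and parameter names are assumptions for illustration only; the sketch just shows an offset from the start time being divided by a speed-up factor.

```clojure
(defn due-at-ms
  "Wall-clock time (ms since the epoch) at which an action becomes due.
  `multiplier` greater than 1 speeds the sim up. All names are hypothetical."
  [start-ms multiplier at-time]
  (+ start-ms (long (/ at-time multiplier))))

(defn sleep-until!
  "Blocks the calling thread until `target-ms`, if that is still in the future."
  [target-ms]
  (let [remaining (- target-ms (System/currentTimeMillis))]
    (when (pos? remaining)
      (Thread/sleep (long remaining)))))
```

With a start time of 1000 and a multiplier of 2, an action whose atTime is 10000 becomes due at 6000, that is five seconds after the start instead of ten.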

How?

Simulant processes are started by running the simulant.sim/run-sim-process function

  1. Each process registers its presence in datomic. The entire simulation waits until as many processes have joined as were asked for when the sim was created; each process polls datomic once a second to check whether enough processes have joined.

  2. By default each process is allocated an equal share of SimAgents

  3. The Services are then started and stored in a thread local services map

  4. The process then lazily streams action entities from datomic, keeping the ones belonging to its Sim Agents

  5. It waits until the Sim Clock says it is time to start the next action

  6. The action is then dispatched to a clojure agent which is associated with a SimAgent

  7. It repeats until all the actions have been dispatched

  8. It awaits execution on all clojure agents

  9. It instructs the services to complete
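
Steps 2 and 4 to 8 above can be sketched without datomic. This is an assumption-laden illustration, not simulant's implementation: agent ids and actions are plain data, the round-robin split is only one possible reading of "an equal share", and the wait on the sim clock is left as a comment.

```clojure
(defn agents-for-process
  "Round-robin share of `agent-ids` for process number `idx` out of `n`."
  [agent-ids idx n]
  (keep-indexed (fn [i id] (when (= idx (mod i n)) id)) agent-ids))

(defn run-actions!
  "Sends this process's actions, in :at-time order, to one Clojure agent
  per sim agent, waits for all of them, and returns the results per agent.
  `perform!` is a hypothetical one-argument function that runs an action."
  [my-agent-ids actions perform!]
  (let [mine    (set my-agent-ids)
        workers (into {} (map (fn [id] [id (agent [])])) my-agent-ids)]
    (doseq [{:keys [agent-id] :as action} (sort-by :at-time actions)
            :when (mine agent-id)]
      ;; A real runner would wait here until the sim clock says it is time.
      (send-off (workers agent-id)
                (fn [results] (conj results (perform! action)))))
    (apply await (vals workers))
    (into {} (map (fn [[id a]] [id @a])) workers)))
```

Because every sim agent has its own Clojure agent, the actions of one sim agent run in submission order, while different sim agents proceed independently.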

Permalink

Senior Clojure Developer

All Street Research | London, UK
An AI research assistant at every financial analyst's desktop.
£60000 - £80000

Background

AllStreet is a new fintech company based in London, using Clojure and Clojurescript to create tools for financial analysts. We are developing AI-based automation tools to provide the equivalent of a research assistant at every analyst's desktop.

We are growing the number of developers in London, and are looking for people with Clojure and Clojurescript development skills. The ideal candidate will have been using Clojure commercially, although we will consider others with less Clojure experience, if you have other strong functional programming skills, and other significant skills, relevant to our wider technology stack and domain of interest.

Experience of any of the following would be an asset:

C#/.Net

Javascript

HTML5

Python

ElasticSearch

Natural Language Processing

Semantic Analysis of Text

Concept Extraction

Natural Language Generation

Machine Learning

Behavioural Analytics

Content Production Systems

Document Formatting

The Job

We are looking to hire a Senior Developer to work in our London office, and work within a small team of engineers. We have a vibrant, productive working environment. We apply agile principles to our product development, and have adopted a Scrum-based methodology, with a view to delivering user-centric software which is iterated quickly to provide continuously increasing value to our customers.

The work is highly collaborative in nature, so the environment requires a high level of interpersonal trust and confident communication between team members.

While the core product is a web-based system written in Clojure, you will be exposed to advanced natural language processing and machine learning techniques, including a state-of-the-art semantic analysis engine, being developed in C#.

The ideal candidate will be deployment focused, with a high degree of curiosity, and with the ability to solve challenging problems and ask thoughtful questions, while designing and delivering simple and elegant solutions using functional programming.

You must be able to envision a variety of solutions and to clearly articulate them to colleagues, including non-technical stakeholders, and explain the trade-offs involved in choosing between options.

You must love working with others and sharing knowledge, to get the best result for the end-user and ultimately for the business.

Requirements

Ideally, the successful candidate will have:

  • At least five years' experience of delivering software in a professional environment
  • Demonstrable experience of Clojure
  • Familiarity with Scrum
  • Familiarity with Continuous Deployment
  • Interest in Natural Language Processing
  • Interest in Machine Learning

Location

We are located in the City of London, where many of our target customers are based. This is also adjacent to Shoreditch and Silicon Roundabout, the home of Google's Campus London and Microsoft Reactor, which make it a vibrant area, full of like-minded developers and entrepreneurs.

Permalink

Episode 017: Data, at Your Service

Nate finds it easier to get a broad view without a microscope.

  • After last week’s diversion into time math, we are back to the core problem this week.
  • Now we want a total by date.
  • Need to refactor the function to return the date in addition to minutes.
  • “We’re letting the code grow up into the problem.”
  • “Let’s let the problem pull the code out of us.”
  • First attempt
    • Use map to track running totals by day
    • As each new entry is encountered, update the total for that day in the map
  • New complication: Now we want a total for all work on Sundays.
  • The loop + recur approach is getting complicated!
    • More and more concerns all mixed together in one place
    • Closely ties the traversal of the data to the processing of the data
  • Better idea: use reduce. Just write “reducer” functions.
  • Simplify by ensuring data passed to reduce is already filtered.
  • “In imperative land, let’s take three different dimensions of consideration and shove them all together in this one zone.”
  • Motivating question for a solution: “How is this composable?”
  • “In Clojure you end up with really small functions because you end up composing them at the end.”
  • Ugly: the reducer for “work on Sundays” still has an if for throwing away data.
  • Better: add another filter to just pass through Sundays.
  • Best: minimal work in the reducer. Use map and filter to get the data in shape first.
  • Imperative thinking: what value do I need to operate on?
  • Functional thinking: how can I accurately represent the data present in the input?
  • After you have all the data at hand, you can summarize it however you want!
  • Why reducers? When you need to operate one step at a time: streaming data, game state, etc.
  • Clojure’s sequence abstraction is powerful and unifying.
  • “All the functions in the core work on all the data.”
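
The "ugly" versus "best" contrast from the notes above can be sketched with simplified data. The entries here are hypothetical maps with a :day keyword instead of real dates, so the sketch runs without any date library.

```clojure
;; Simplified, made-up time entries.
(def entries
  [{:day :sunday :minutes 30}
   {:day :monday :minutes 45}
   {:day :sunday :minutes 15}])

;; "Ugly": the reducer both selects entries and adds them up.
(defn sunday-minutes-ugly [entries]
  (reduce (fn [total {:keys [day minutes]}]
            (if (= day :sunday)
              (+ total minutes)
              total))
          0
          entries))

;; "Best": get the data in shape first, then reduce with plain +.
(defn sunday-minutes-best [entries]
  (->> entries
       (filter #(= :sunday (:day %)))
       (map :minutes)
       (reduce + 0)))
```

Both functions return 45 for these entries; the second keeps selection, projection and summing as separate steps that can be swapped out independently.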

Clojure in this episode:

  • loop, recur
  • map, filter, reduce
  • group-by
  • if
  • ->, ->>

Code sample from this episode:

(ns time.week-03
  (:require
    [clojure.java.io :as io]
    [clojure.string :as string]
    [java-time :as jt]))


; Functions for parsing out the time format: Fri Feb 08 2019 11:30-13:45

(def timestamp-re #"(\w+\s\w+\s\d+\s\d+)\s+(\d{2}:\d{2})-(\d{2}:\d{2})")

(defn localize [dt tm]
  (jt/zoned-date-time dt tm (jt/zone-id)))

(defn parse-time [time-str]
  (jt/local-time "HH:mm" time-str))

(defn parse-date [date-str]
  (jt/local-date "EEE MMM dd yyyy" date-str))

(defn adjust-for-midnight
  [start end]
  (if (jt/before? end start)
    (jt/plus end (jt/days 1))
    end))

(defn parse
  [line]
  (when-let [[whole dt start end] (re-matches timestamp-re line)]
    (let [date (parse-date dt)
          start (localize date (parse-time start))
          end (adjust-for-midnight start (localize date (parse-time end)))]
      {:date date
       :start start
       :end end
       :minutes (jt/time-between start end :minutes)})))


; How many minutes did I work on each day?

(defn daily-total-minutes
  [times]
  (->> times
       (group-by :date)
       (map (fn [[date entries]] (vector date (reduce + (map :minutes entries)))))
       (into {})))


; How many minutes total did I work on Sundays?

(defn on-sunday?
  [{:keys [date]}]
  (= (jt/day-of-week date) (jt/day-of-week :sunday)))

(defn sunday-minutes
  [times]
  (->> times
       (filter on-sunday?)
       (map :minutes)
       (reduce +)))


; Functions for turning the time log into a sequence of time entries

(defn lines
  [filename]
  (->> (slurp filename)
       (string/split-lines)))

(defn times
  [lines]
  (->> lines
       (map parse)
       (filter some?)))


; Process a time log with the desired summary calculation

(defn summarize
  [filename calc]
  (->> (lines filename)
       (times)
       (calc)))


(comment
  (summarize "time-log.txt" daily-total-minutes)
  (summarize "time-log.txt" sunday-minutes)
  )

Permalink

Journal 2019.8 - jdk regression, clojure.main

JDK regression

I wrote about this issue last week and spent a good chunk of this week better understanding the scope of the problem and brainstorming some possible mitigations. Probably the best explanation at this point can be found in this blog from Claes Redestad from Oracle. Ghadi Shayban has been working with Claes to better understand the problem and sync that up with our Clojure world. Big thanks to Claes, David Holmes, and Vladimir Ivanov for everything they’ve done on the ticket. I’d be happy to buy those guys a beer sometime.

In Clojure terms, we’ve verified that this does not affect AOT or genclass classes, and is really primarily isolated to doing a lot of work in user.clj. This is because user.clj is loaded during the static initializer of clojure.lang.RT (the Clojure runtime). Loading makes calls into the Compiler and other things that ultimately call back into the RT static methods (and that should look like the Bad or AlsoBad scenarios in Claes’s blog).

In terms of mitigations, we’ve looked at two different approaches and have working solutions for both. The first is moving the load of user.clj out of the RT static initializer and into an explicit initialization method. Then at any entry point, the explicit initializer needs to be called. This is a pretty simple change but has the tricky problem of knowing where the “entry points” are. There are some obvious ones if you’re starting from clojure.main, the Java Clojure entry point, a genclass or AOT class file and those are easy to hook. But because this happens sort of on-demand right now there are a ton of possible entry points people could be using in weird scenarios. We have not done any heroic effort at this point. Maybe for loading user.clj it’s not important to cover all those. Ghadi put this patch together.

Another approach is suggested by Claes in his blog - if you can move the actual static methods into a separate class and fully initialize that first, your initializer doesn’t encounter the problematic scenario. I put together a version of this. Basically all of the current methods in RT move into a new RT.Impls class which has no static initializer. That class is fast to load. The RT static initializer then just makes calls into the RT.Impls class instead. To remove the reentrant calls, everything in Clojure that calls RT calls into RT.Impls instead as well. All of the methods in RT do still need to be there though (or we’d break calls from already compiled Clojure classes from previous versions or from advanced Clojure code that called directly into RT). This works, but it’s a giant patch. We think maybe we wouldn’t need to patch all calls - could probably just be a strategic subset.

We’re still trying to decide the best course of action. I expect there will be some mitigations in Java itself, but hard to say their impact or when they might arrive.

clojure.main

I wrote a few weeks ago about clojure.main not making use of our new error printing and how that affects other tools like Leiningen. I spent some more time this week trying to assess exactly what we should do for the ticket with respect to printing, stack traces, etc. Still working on the decisions and a better patch for that.

spec

I was out with the flu early in the week and with the other stuff above, did not touch spec this week. Sorry! That’s how it goes sometimes.

Other stuff I enjoyed this week…

Two songs floated through my world this week that I’ll share with you. They are very different.

First, I enjoyed this progressive metal (metal core? who can keep the micro genres straight): Absolomb by Periphery. Does it djent? Yes it does.

Second, it’s that time of the year in St. Louis where my wife asks me every day if it’s time to open the pool yet. I was delighted to find this new jam today. Pool Party by Rudy Willingham is the funky tune you need to keep summer in your heart.

Permalink

The REPL

Surveying the landscape
 

-main

  • If you haven’t read Bozhidar’s post on the future of CIDER I’d urge you to do so.
      I really think the work I’ve done (not to mention the great work others have done) is worth a bit more than this, and I hope that people are going to realize this before I decide to throw in the towel and call it quits.

    If you use CIDER in your day job, think about supporting the Open Collective, either individually, or through your company. Since he wrote the post, annual donations have gone from $4k/year to $7k/year. Based on the most recent survey results, 46% of respondents use CIDER, so there is a lot of opportunity for us to grow that number. A few dollars each month is a small price to pay to ensure that CIDER continues to grow and be maintained.
  • The Clojure survey results are out. I analysed the free-form comments, and Alex Miller had a good follow-up on the comments.
  • Alternatives to Clojurians Slack.
  • Clojurists Together is funding Aleph and Neanderthal for the next three months.
  • Deep Learning in Clojure from Scratch to GPU - Part 0 - Why Bother?

Libraries & Books.

  • aerospike-clj is a new Clojure client for the Aerospike database
  • Sean Corfield has been doing some work on a new JDBC library.
  • Datascript has a new release out with significant performance improvements. Some of this work was funded by Clojurists Together.
  • js-interop looks like a really nice CLJS library for interoperating with JS objects. Be sure to read the motivation section to see how it compares to your other options.

I’m Daniel Compton. I maintain public Maven repositories at Clojars, private ones at Deps, and help fund OSS Clojure projects (along with tons of generous members like Pitch, JUXT, Metosin, Adgoji, and Funding Circle) at Clojurists Together. If you’ve enjoyed reading this, tell your friends to sign up at therepl.net, or post a link in your company chatroom. If you’ve seen (or published) a blog post, library, or anything else Clojure/JVM related please reply to this to let me know about it.

If you’d like to support the work that I’m doing, consider signing up for a trial of Deps, a private, hosted, Maven Repository service that I run.

Thanks!

 
Copyright © 2019 Daniel Compton, All rights reserved.


Permalink

What are side-effects?

In functional programming, people often use the term side effect. But what does it mean? A side effect is any external effect a function has besides its return value.
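
As a small illustration of that definition (a sketch added alongside the transcript, with made-up function names), here is the doubling function from the episode next to a variant that also records every result:

```clojure
;; Pure: the result depends only on the argument.
(defn double-it [x]
  (* 2 x))

;; Hypothetical stand-in for a file, database or log.
(def audit-log (atom []))

;; Same return value, but each call also changes something outside itself.
(defn double-and-log! [x]
  (let [result (* 2 x)]
    (swap! audit-log conj result)   ; the side effect
    result))
```

Calling double-it with 3 gives 6 and nothing else happens. Calling double-and-log! with 3 also gives 6, but audit-log grows by one entry per call, so when and how often it runs now matters.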

Transcript

Eric Normand: What is a side effect? Hi, my name is Eric Normand. These are my thoughts on functional programming. If you like what you hear, please hit subscribe.

What is a side effect? This is a very important concept in functional programming. Side effects are anything that is outside of the scope of a normal pure function, otherwise known as a mathematical function.

If you remember from high school algebra, a mathematical function maps the domain to the range. It maps the arguments to the return value as we would say as programmers. It is a mapping that is timeless, meaning it doesn’t matter when it’s run.

It’s not having an effect on the world. It’s a mapping just by definition. It just maps this to that. We do that a lot with our functions. We write a function that takes X and doubles it, and it returns that doubling.

Every time we call it with three, we’re going to get six, every time. It’s not dependent on anything outside of the argument that we pass it. Imagine a function that has some other influence over that answer.

Or imagine every time we call this function, something happens in the world. You could imagine that number six getting written to a file or getting sent over the network stored in a database or logged to disk, printed onto the screen.

Now, it’s not timeless. Now, it’s not really this relationship between inputs and outputs. It’s a relationship between inputs and outputs with something else happening, which is called the side effect.

You call this to get the number six out, but something else happens that maybe you didn’t know about, maybe you didn’t want it to happen, or maybe that’s exactly what you wanted to happen. You don’t even care about the return value.

In functional programming, we call that a side effect. It’s a side effect as opposed to just an effect. The term is used loosely. As these terms go, it’s complicated. The term is used loosely, but in general it’s used to say that this isn’t the main reason to call this thing.

The main reason to call double is to get six out when you pass in three. This other thing happens, too, so it’s a side effect. You could call them effects. If you have the print function that prints to the console, it doesn’t have a return value really.

It’s really weird to call it a side effect, but people do. They use the term loosely, like I said. In functional programming, we try to isolate those effects. Separate them out from the mathematical functions.

Double is a perfectly good mathematical function. You could write it once and use it for the rest of your life. Maybe you want to optimize it or something, but it’s going to work forever, regardless of context.

It’s something that you could put into a utility library and basically never touch again. Your context of where things are running, maybe you don’t want to print it, maybe your log system changes, maybe the database you’re going to store it in changes, all that stuff is way less timeless than this relationship between the argument and the output.

There’s a good reason to separate that out. Timeless stuff, it changes less often. It’s easier to test. It’s easier to debug. Imagine testing this thing. If you’re going to test a function, you’re going to run it multiple times with different inputs.

You’re going to test easy cases, some hard cases. Maybe you just put it through its paces and you test a hundred, a thousand cases. Imagine if it was writing to disk or hitting the database.

If it’s hitting the database, you have to make sure the database is running. If you are writing to a file, you got to make sure the file exists and is open. Your tests become a lot more complicated.

Whereas this easy return value function, it’s all it does. It’s all in outputs. That’s so easy to test, so easy. That’s one of the reasons why we like to isolate them. The hard stuff in programming where most of our bugs come from is all having an effect outside of itself.

All the writing to disk, the sending a message over the network, the reading from the database, the drawing stuff on the screen, all of that stuff is hard. Responding to user input. This is where the complexity lies.

As much as possible, we want our code to not depend on that. You need to do all those things. That’s why we run our software but if you can isolate those things and manage them…You isolate them so that you can manage them. Whereas all this other stuff that’s just calculations, that’s just functions in the mathematical sense, all of that stuff doesn’t need managing in the same way. It’s way easier.

This has been me talking and rambling on about side effects. The term, is it good or not? I don’t know. People use it. I don’t really like the term, but I’ll use it just to be understood. I would prefer something like an action or an effect. Effect is way better.

Just know what it means. It’s important to know what it means to be involved in the discussion. You can get in touch with me. Speaking of discussion, I love to talk on Twitter, have conversations. I’m @ericnormand.

If you want to email me, I love that kind of discussion, too. We can get into longer discussions. You can email me at eric@lispcast.com. I’m trying to get into LinkedIn. If you’re on LinkedIn and you like that stuff, find me there. Awesome! See you later.

The post What are side-effects? appeared first on LispCast.

Permalink

18: Testing Clojure and ClojureScript with Arne Brasseur

Arne Brasseur talks about Kaocha, Heart of Clojure, Lambda Island, and Clojureverse

Permalink

PurelyFunctional.tv Newsletter 314: Collection functions vs. sequence functions

Issue 314 – February 18, 2019

Clojure Tip 💡

Use -> for collection functions and ->> for sequence functions.

Many people complain that Clojure’s argument order is inconsistent. Take the example of map. The function comes before the sequence.

(map inc [1 2 3]) ;; function then sequence

But in update, the collection comes first, and the function is last:

(update [1 2 3] 2 inc) ;; collection, key, then function

This seeming inconsistency is actually deliberate. It indicates two distinct classes of operation. Let’s group some functions by whether they take the collection at the beginning or end.

1. At the beginning

  • update (and update-in)
  • assoc (and assoc-in)
  • dissoc
  • conj

2. At the end

  • map
  • filter
  • take
  • drop
  • cons

When we group them like this, some other differences become apparent. The functions in group 2 all return a sequence. On the other hand, functions in group 1 return a collection of the same type as the argument.

It turns out that the functions in group 2 are sequence functions. They take seqables and return sequences. When you pass them a collection, seq is called implicitly. The functions in group 1 are collection operations. These functions take a collection and return a collection of the same type. However, they may not work for all collections. For instance, dissoc does not work on vectors.
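
A few REPL checks make the difference concrete. This is a small sketch using only clojure.core; the var names are chosen for the example.

```clojure
(def v [1 2 3])

;; Collection functions keep the type of their argument.
(def conj-keeps-vector?  (vector? (conj v 4)))        ; true
(def assoc-keeps-map?    (map? (assoc {:a 1} :b 2)))  ; true

;; Sequence functions call seq on their argument and return a sequence.
(def map-returns-vector? (vector? (map inc v)))       ; false
(def map-returns-seq?    (seq? (map inc v)))          ; true

;; Not every collection function works on every collection:
;; dissoc expects a map, so a vector makes it throw.
(def dissoc-on-vector
  (try
    (dissoc v 0)
    (catch ClassCastException _ :not-a-map)))
```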

As Clojurists, we recognize the differences and how best to use them.

For example, collection functions do well with the -> (thread first) macro.

(-> {}
  (assoc :a 1)
  (assoc :b 2)
  (update :a inc))

Whereas the sequence functions do well with the ->> (thread last) macro.

(->> (range)
  (filter even?)
  (map #(* % %))
  (take 10))

More importantly, collection functions work well with the reference modifier operations: swap!, alter, and send for atoms, refs, and agents, respectively. The argument order matches the order expected by those functions. That lets us neatly do:

(swap! counts update :strawberry + 3) ;; count 3 more strawberries

If you’ve ever tried using sequence functions with swap!, it’s a lot more awkward. Give it a shot at the REPL.
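
For comparison, here is a sketch of both cases side by side. The atoms and their contents are made up for the example.

```clojure
;; Collection function: the atom's value is passed as the first argument,
;; which is exactly where update expects the collection.
(def counts (atom {:strawberry 1}))
(swap! counts update :strawberry + 3)

;; Sequence function: map wants the collection last, so it needs a wrapper.
(def nums (atom [1 2 3]))
(swap! nums #(map inc %))

;; The wrapper works, but the vector has quietly become a lazy sequence.
(def nums-type-after-map (if (vector? @nums) :vector :seq))

;; mapv keeps it a vector, still at the cost of a wrapper function.
(swap! nums (fn [xs] (mapv inc xs)))
```

After these calls counts holds {:strawberry 4} and nums holds [3 4 5].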

If you learn better in video form, I’ve got a video on the topic. Or you can read more about collections in general. Do you know any other tips for using collection functions vs. sequence functions? Let me know and I’ll share your answer.

PurelyFunctional.tv Update 💥

There are 2 updates to report this week.

1. reCaptcha is no longer

My Chinese Clojurist friends should know that I’ve gotten rid of reCaptcha for registering a new user. There is now a custom robot filter. Sorry it has taken so long! Let’s hope it holds up against the torrents of bots. Please sign up for a membership if you couldn’t before.

2. Better registration form

I’ve revamped the layout of the registration page where you can sign up for a membership. It should be easier to fill in. If it was holding you back, go forth and register!

Clojure Media 🍿

I attended IN/Clojure last month and it was amazing. I really appreciated meeting so many enthusiastic Clojurists from the other side of the world from me. Thanks for the invitation. You can catch my talk on design in Clojure and watch all of the other talks on YouTube. We’re lucky to have such a great conference as part of the lineup.

Brain skills 😎

When you’re learning some new Clojure stuff, jump into a REPL and get active. What better way to learn a skill than by doing it. I like to use a Leiningen plugin called lein try that lets you try out a new library from a REPL without creating a new project.

After installing it, you can test out clj-http like this:

$ lein try clj-http "3.9.1"

You can install lein try here.

The Clojure command-line interface lets you do something similar. There’s no setup, but the command is longer:

$ clj -Sdeps '{:deps {clj-http {:mvn/version "3.9.1"}}}'

clj also has the advantage of being able to load libraries from git. Learn more about the CLI here.

Clojure Teaser 🤔

Answer to last week’s puzzle

Here’s the faulty code:

(def counter (atom 0))

(defn print-counter []
  (let [counter @counter]
    (println "===========")
    (println "| COUNTER |")
    (println (format "| %07d |" @counter))
    (println "==========")))

The reason it was throwing an exception was that I was derefing the value twice: once in the let binding and once in the body. The reason the exception was about casting to a Future is that deref has two branches. (See the deref source code if you want.) The first branch is for when the argument is an IDeref, the second when it’s not. But in that second branch, it assumes it’s a Future and casts. This is an example of what I would call an accidental error message. I don’t know how much time I’ve spent looking for Futures in my code when it was just a stray @.
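
One possible corrected version, as a sketch: deref once in the let and use the bound value in the body. The local is renamed to n here so that it no longer shadows the atom, and the bottom border is evened out. The deref-outcome helper is a hypothetical addition that reproduces the two branches described above.

```clojure
(def counter (atom 0))

(defn print-counter []
  (let [n @counter]                    ; deref exactly once
    (println "===========")
    (println "| COUNTER |")
    (println (format "| %07d |" n))    ; use the value, no second @
    (println "===========")))

(defn deref-outcome
  "Returns the dereferenced value, or a keyword when deref is handed
  something that is neither an IDeref nor a Future."
  [x]
  (try
    (deref x)
    (catch ClassCastException _ :class-cast-exception)))
```

Calling deref-outcome on the atom returns its value, while calling it on a plain number such as 0 ends up in the catch clause, which is the same failure the stray @ caused.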

About a dozen people wrote in with the correct answer. You get gold stars! ⭐️⭐️⭐️ Some people suggested that spec could easily detect this error and give a better error message. I agree. Let’s make it happen.

This week’s challenge

The Sieve of Eratosthenes is a neat way to find all of the prime numbers below some number. If you do it on paper (and I suggest you do), you can easily find all primes below 100 within a few minutes. The challenge this week is to write a version in Clojure that lets you calculate all the primes below a given number n.

The biggest challenge with this is that you might run into problems with stack overflows or speed. Make sure the algorithm can still run with n=100,000.

I’d love to see your answers. Bonus points for creative and innovative solutions. I’ll share the best next week. Fire up a REPL, try it out, and send me your answers. (Please note that, unless you tell me otherwise, I will share your name in the newsletter if you send me an answer.)

See you next week!
Eric

The post PurelyFunctional.tv Newsletter 314: Collection functions vs. sequence functions appeared first on PurelyFunctional.tv.

Permalink

What are concurrency and parallelism?

What are concurrency and parallelism? What’s the difference? Concurrency is functional programming’s killer app. As we write more and more distributed systems on the web and on mobile, the sharing of resources becomes a major source of complexity in our software. We must learn to share these resources safely and efficiently. That’s what concurrency is all about.

Transcript

Eric Normand: What do we mean when we use the terms parallel and concurrent? What’s the difference, and how are they related? Hi. My name is Eric Normand. These are my thoughts on functional programming. In the recent episode, I talked about race conditions and the conditions under which you would get a race condition.

You have two threads or two timelines that are sharing a resource. It’s that sharing of the resource that makes something…It could be problematic. You could have a race condition, or it could be a thing about concurrency. Concurrency is all about sharing resources.

Resources could be anything. They could be a global variable. It could be a database. It could be a network connection. It could be the screen that you’re printing out in your terminal. All these things are shared resources among different threads.

Concurrency is all about efficiently and safely sharing those resources. That’s what it’s about. You could also say your CPU is a resource. You can have a task-scheduler that allows you to share that one CPU among different processes. That’s one way to look at it, but it’s all about sharing these resources.

Parallelism is about increasing the number of things that are sharing the resources. That’s increasing the number of threads. That’s increasing the number of nodes in a distributed system. This is making something more parallel. Now notice it’s very hard to make things parallel if they’re not sharing safely and efficiently.

They are intertwined. This idea of concurrency, how we share things and parallelism, which is how many things are sharing the thing. How many things? You could look at it in another way, which is if it’s a CPU. This is why the CPU one is a little hard because it’s the thing being shared, and it’s the thing that things are running on.

You could look at it like you are increasing the number of CPUs. That increases the number of threads that actually can run at the same time and not have to switch out to run because you can start actually having multitasking. You’re increasing the number of things that are shared but also the number of things that are running at the same time.

It’s a big corner case when you’re sharing the CPU because that’s also what you’re running on. The main point is that concurrency is about sharing resources safely and efficiently. Safely, meaning without bad race conditions, race conditions that could lead to the wrong result. Parallelism is about, once you have that, being able to increase the number of things that are sharing.

Different concurrency mechanisms exist. Most of them have parallels in the real world. A simple example is if you want to share a bathroom with people. You have six people living in the house. There’s one bathroom. How do you share that safely and efficiently? Well, you could put a lock on the door. That keeps other people out while one person is using it. That’s the safety.

It’s still private. It’s pretty efficient because you can tell the door is unlocked. You could go in and just use the bathroom. Then you unlock it on the way out. It’s pretty clear what the rules are. Imagine if you had something like 12, or 18, or 25 people in that house. Maybe a lock is not going to work anymore.

Maybe you’re going to start seeing, for instance, someone who’s had to go to the bathroom for two hours, and they just keep missing their turn. You’re probably going to want something more robust to ensure that everyone can share that. It’s not just the fastest people who can get to the bathroom.

Another mechanism we use all the time is a queue, lining up. You want to order food at the restaurant you get in line. They’re processed in order, the order you get in the line. That means that everyone has a fair chance. That resource, the person who’s taking orders, is going to be calling people up one at a time. It’s fair. We intuit that it’s a fair system.

This is a concurrency mechanism that’s used all the time. Another one is something like a schedule. Instead of getting in line, you could put your name on a list. I know restaurants are doing this now in my city where you go up, and you put your name on a list. When the table is available, they’ll text you. It’s asynchronous. That means I can go take a walk while my table is being used.

Then, once it’s free, I’m notified. Then I can go back. I don’t have to just wait in line. It’s a little bit more efficient. It probably lets you handle more people, I imagine. Especially since if it’s going to be two hours before I can sit down, I’ll probably go home and come back later when it’s closer to the time that they’re estimating.

These are all concurrency mechanisms that you can find in a computer. In the first case, there’s a thing called a lock in programming. It’s also called mutual exclusion or a semaphore. It’s a way of making sure that only one thread is accessing that block of code at the same time. There’s queues.

There’s multiple implementations of queues where you put a value in the queue that signals the work that you want done. Then when the CPU is available to do that work, it will pull the next thing off of the queue and just process them one at a time. There’s schedules or call-backs, things like that, like you have with the texting when my table is ready.

These concurrency primitives are really important. I have to say I’m not the only one who believes this, but concurrent programming, distributed systems programming, parallel programming is really the killer app for functional programming. People were talking about having hundreds, thousands of cores on our machines. We’d have to start programming those. That didn’t really happen.

What did happen was we’re now programming distributed systems all the time, so your cell phone has an app. That app is talking to the server. The server is talking to the database, talking to three third-party APIs. It’s all distributed. Not every piece is distributed, but most of them are.

Functional programming does really well with this because you can have distributed concurrency constructs. I won’t go into those right now. That’s why I bring up concurrency and parallelism because that’s really what functional programming does best. It uses immutable values, which are able to be shared between different threads with no problems, no race conditions.

If they’re immutable, it means they never change, which means any copy I have, you can share that same copy, because it’s never going to change. What’s the problem? We can both read it. My thread can read it. Your thread can read it, and it’s totally safe. The problem comes when you can modify it.

Then you start to get into, “Well, if you’re modifying it while I’m reading it, what’s going to happen?” that kind of thing. That’s parallelism and concurrency. That’s pretty much all I have to say on it.

If you want to get in touch with me, if you have a question, or you want to disagree with me, or give me some praise, or tell me I’m wrong, you can email me at eric@lispcast.com. You can also find me on Twitter, I’m @ericnormand with a D. Also, you could find me on LinkedIn and connect there. Don’t forget to hit subscribe. I’ll see you later. Bye.

The post What are concurrency and parallelism? appeared first on LispCast.

re-frame interactive demo

What is re-frame?

re-frame is a functional framework for writing SPAs in ClojureScript, using Reagent.

Being a functional framework, it is about two things:

  1. data
  2. the functions which transform that data.

And, because it is a reactive framework, the “data coordinates the functions” (and not the other way around).

re-frame is often described as a 6-domino cascade:

One domino triggers the next, which triggers the next, et cetera, boom, boom, boom, until we are back at the beginning of the loop, and the dominoes spring to attention again, ready for the next iteration of the same cascade.

The six dominoes are:

  1. Event dispatch
  2. Event handling
  3. Effect handling
  4. Query
  5. View
  6. DOM

The purpose of this tutorial is to explain how to write the code of a re-frame app that corresponds to those 6 dominoes.

Credits

This article is an interactive rewrite of the code walkthrough from the re-frame repo. It is published with the blessing of Mike Thompson. Some of the details have been omitted in order to keep the article as easy to read as possible. Be sure to also read the original article to fill in the details.

The interactive snippets are powered by Klipse.

There are two kinds of Klipse snippets in this article:

  1. regular Clojure snippets, for which Klipse displays the value of the last expression just below the snippet.
  2. reagent snippets, for which Klipse renders the reagent component just below the snippet.

Usage

In order to use re-frame, you have to require both re-frame and reagent:

(ns simple.core
  (:require [reagent.core :as reagent]
            [re-frame.db :as db]
            [re-frame.core :as rf]))

Begin with the end in mind

The app we are going to build contains around 70 lines of code.

This app:

  1. displays the current time in a nice big, colourful font
  2. provides a single text input field, into which you can type a CSS colour, such as the hex code #CCC or the name red, to be used for the time display

In the interactive version of this article, the running app is displayed at this point, near the top of the page.

App database

In re-frame, there is this notion of a single app database (sometimes called store or app state) that holds all the data of our application. We call it the app-db. In our case, the app-db will contain a two-key map like this:

{:time       (js/Date.)  ;; current time for display
 :time-color "#f88"}     ;; the colour in which the time should be shown

Events (domino 1)

Events are data. re-frame uses a vector format for events. For example:

[:time-color-change "red"]

The first element in the vector, :time-color-change, is a keyword which identifies the kind of event. The further elements are optional, and can provide additional data associated with the event. The additional value above, "red", is presumably the color of the time display.

Rule: events are pure data. No sneaky tricks like putting callback functions on the wire. You know who you are.

dispatch

To send an event, call rf/dispatch with the event vector as argument:

(rf/dispatch [:time-color-change "magenta"])

In the interactive version of this article this call is commented out with #_; uncommenting it there updates the color of the time display at the top of the page.

After dispatch

dispatch puts an event into a queue for processing.

So, an event is not processed synchronously, like a function call. The processing happens later - asynchronously. Very soon, but not now.

The consumer of the queue is a router which looks after the event’s processing.

The router:

  1. inspects the 1st element of an event vector
  2. looks for the event handler (function) which is registered for this kind of event
  3. calls that event handler with the necessary arguments
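
The three router steps above can be sketched in plain Clojure. This is only an illustration of the idea, not re-frame's actual router: handlers, register-handler! and route are hypothetical names, and the queue is left out entirely.

```clojure
;; Hypothetical names throughout; this is not re-frame's implementation.
(def handlers (atom {}))   ;; event id -> handler function

(defn register-handler!
  "Remembers handler-fn as the handler for events whose first element is event-id."
  [event-id handler-fn]
  (swap! handlers assoc event-id handler-fn))

(defn route
  "Looks up the handler for the event's first element and calls it
  with the current state and the whole event vector."
  [db event]
  (let [event-id (first event)               ;; step 1: inspect the 1st element
        handler  (get @handlers event-id)]   ;; step 2: find the registered handler
    (if handler
      (handler db event)                     ;; step 3: call it
      db)))                                  ;; this sketch ignores unknown events

(register-handler! :time-color-change
                   (fn [db [_ new-color]]
                     (assoc db :time-color new-color)))

(route {:time-color "orange"} [:time-color-change "red"])
;; => {:time-color "red"}
```

The sketch threads the state through route explicitly, which keeps it easy to try at a REPL.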

As a re-frame app developer, your job, then, is to write and register an event handler (function) for each kind of event.

Event Handlers (domino 2)

Collectively, event handlers provide the control logic in a re-frame application.

In this application, 3 kinds of event are dispatched: :initialize, :time-color-change and :timer.

3 events means we’ll be registering 3 event handlers.

Event handler functions take two arguments, coeffects and event, and they return effects.

Conceptually, you can think of coeffects as being “the current state of the world”. And you can think of event handlers as computing and returning changes (effects) based on “the current state of the world” and the arriving event.

Event handlers can be registered via either reg-event-fx or reg-event-db (-fx vs -db). Because of its simplicity, we’ll be using the latter here: reg-event-db.

reg-event-db allows you to write simpler handlers for the common case where you want them to take only one coeffect - the current app state - and return one effect - the updated app state.
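
One way to picture the relationship between the two styles: a -db style handler can be lifted into an -fx style handler by reading :db out of the coeffects map and returning a map that holds a single :db effect. The helper db-handler->fx-handler below is hypothetical and not part of re-frame's API; it is a plain-Clojure sketch of the idea only.

```clojure
;; A db-style handler: app state and event in, new app state out.
(defn set-color
  [db [_ new-color]]
  (assoc db :time-color new-color))

;; Hypothetical helper (not a re-frame function): wraps a db-style handler
;; so that it takes a coeffects map and returns an effects map.
(defn db-handler->fx-handler
  [db-handler]
  (fn [coeffects event]
    {:db (db-handler (:db coeffects) event)}))

(def set-color-fx (db-handler->fx-handler set-color))

(set-color-fx {:db {:time-color "orange"}} [:time-color-change "red"])
;; => {:db {:time-color "red"}}
```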

Here is the syntax of reg-event-db:

(rf/reg-event-db
  :the-event-id
  the-event-handler-fn)

The handler function you provide should expect two arguments:

  1. db, the current application state (the value contained in app-db)
  2. v, the event vector (what was given to dispatch)

So, your function will have a signature like this: (fn [db v] ...).

Each event handler must compute and return the new state of the application, which means it returns a modified version of db (or an unmodified one, if there are to be no changes to the state).
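
Because such a handler is a pure function, it can be tried out as an ordinary function call, without re-frame. A small sketch, assuming a hypothetical handler change-color that returns db unmodified when the new value is blank:

```clojure
(require '[clojure.string :as str])

;; Hypothetical handler: ignores blank input, otherwise stores the new color.
(defn change-color
  [db [_ new-color]]
  (if (str/blank? new-color)
    db                                   ;; no change: return db as it was
    (assoc db :time-color new-color)))   ;; change: return an updated copy

(def db-before {:time-color "orange"})

(change-color db-before [:time-color-change "red"])
;; => {:time-color "red"}
(change-color db-before [:time-color-change ""])
;; => {:time-color "orange"}
db-before
;; => {:time-color "orange"}  (the original map is never mutated)
```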

:initialize

On startup, application state must be initialized. We want to put a sensible value into app-db, which starts out containing {}.

So a (dispatch [:initialize]) will happen early in the app’s life (more on this below), and we need to write an event handler for it.

Now this event handler is slightly unusual because not only does it not care about any event information passed in via the event vector, but it doesn’t even care about the existing value in db - it just wants to plonk a completely new value:

(rf/reg-event-db     ;; sets up initial application state
 :initialize
 (fn [_ _]           ;; the two parameters are not important here, so use _
   {:time (js/Date.) ;; What it returns becomes the new application state
    :time-color "orange"})) 

This particular handler fn ignores the two parameters (usually called db and v) and simply returns a map literal, which becomes the application state.

Let’s initialize our app now, by dispatching an [:initialize] event that will be handled by the event handler we just wrote:

(rf/dispatch-sync [:initialize]) 

:timer

Now, we set up a timer function to (dispatch [:timer now]) every second:

(We use defonce in order to ensure that no more than a single timer is created.)

(defn dispatch-timer-event
  []
  (let [now (js/Date.)]
    (rf/dispatch [:timer now])))  ;; <-- dispatch used

;; call the dispatching function every second
(defonce do-timer (js/setInterval dispatch-timer-event 1000))

And here’s how we handle it:

(rf/reg-event-db                 ;; usage:  (rf/dispatch [:timer a-js-Date])
  :timer
  (fn [db [_ new-time]]          ;; <-- de-structure the event vector
    (assoc db :time new-time)))  ;; compute and return the new application state

:time-color-change

When the user enters a new colour value, a :time-color-change event is going to be dispatched via the view.

Here is how we handle a :time-color-change event:

(rf/reg-event-db
  :time-color-change            ;; usage:  (rf/dispatch [:time-color-change "red"])
  (fn [db [_ new-color-value]]  ;; <-- de-structure the event vector
    (assoc db :time-color new-color-value)))   ;; compute and return the new application state

Effect Handlers (domino 3)

Domino 3 realises/puts into action the effects returned by event handlers.

In this “simple” application, our event handlers are implicitly returning only one effect: “update application state”.

This particular effect is accomplished by a re-frame-supplied effect handler. So, there’s nothing for us to do for this domino. We are using a standard re-frame effect handler.

And this is not unusual. You’ll seldom have to write effect handlers…

Subscription Handlers (domino 4)

Subscription handlers, or query functions, take application state as an argument and run a query over it, returning something called a “materialised view” of that application state.

When the application state changes, subscription functions are re-run by re-frame, to compute new values (new materialised views).

Ultimately, the data returned by query functions is used in the view functions (Domino 5).

reg-sub

reg-sub associates a query id with a function that computes that query, like this:

(rf/reg-sub
  :some-query-id  ;; query id (used later in subscribe)
  a-query-fn)     ;; the function which will compute the query

Then later, a view function (domino 5) subscribes to a query like this: (subscribe [:some-query-id]), and a-query-fn will be used to perform the query over the application state.

Each time application state changes, a-query-fn will be called again to compute a new materialised view (a new computation over app state) and that new value will be given to all view functions which are subscribed to :some-query-id. These view functions will then be called to compute the new DOM state (because the views depend on query results which have changed).

Along this reactive chain of dependencies, re-frame will ensure the necessary calls are made, at the right time.
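
The re-run behaviour described above can be imitated with a plain atom and a watch. This is a rough sketch only: re-frame builds on Reagent's reactive machinery rather than on add-watch, and app-state, color-view and time-color-query are made-up names.

```clojure
(def app-state  (atom {:time-color "orange"}))  ;; stand-in for the app state
(def color-view (atom nil))                     ;; holds the "materialised view"

(defn time-color-query
  "The query function: a pure computation over the application state."
  [db]
  (:time-color db))

;; Re-run the query every time the state changes.
(add-watch app-state :time-color-sub
           (fn [_key _ref _old-db new-db]
             (reset! color-view (time-color-query new-db))))

(swap! app-state assoc :time-color "red")
@color-view
;; => "red"
```

Unlike this sketch, a real subscription also supplies a value before the first change; that is omitted here to keep the example short.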

Remember that our application state is a simple Clojure map. It is held in the atom re-frame.db/app-db (aliased as db in our ns form), so we can inspect it by dereferencing that atom:

@db/app-db

Returning the :time-color of our app state is as simple as this:

(:time-color @db/app-db)

Here’s the code for defining our 2 subscription handlers:

(rf/reg-sub
  :time
  (fn [db _]     ;; db is current app state. 2nd unused param is query vector
    (:time db))) ;; return a query computation over the application state

(rf/reg-sub
  :time-color
  (fn [db _]
    (:time-color db)))

Notice that we don’t have to deref the app-db atom: re-frame passes the content of the atom to the subscription handlers.

View Functions (domino 5)

View functions turn data into DOM. They are “State in, Hiccup out” and they are Reagent components.

An SPA will have lots of view functions, and collectively, they render the app’s entire UI.

Hiccup

Hiccup is a data format for representing HTML.

Here’s a trivial view function which returns hiccup-formatted data:

(defn greet
  []
  [:div "Hello viewers"])  ;; means <div>Hello viewers</div>

And if we call it:

(greet)          ;; => [:div "Hello viewers"]
(first (greet))  ;; => :div

Yep, that’s a vector with two elements: a keyword and a string.

But when we render it with reagent, it becomes a DOM element:

[greet]

Now, greet is pretty simple because it only has the “Hiccup Out” part. There’s no “Data In”.
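
To make the “Hiccup is just data” idea concrete, here is a minimal sketch that turns a hiccup vector into an HTML string. It is not how Reagent works (Reagent produces DOM elements, not strings), and hiccup->html is a hypothetical helper that ignores attribute maps, class shorthand such as :div.clock, and escaping.

```clojure
(defn hiccup->html
  "Renders a hiccup form [tag & children] to an HTML string.
  Attribute maps, tag shorthand and escaping are not handled in this sketch."
  [form]
  (if (vector? form)
    (let [[tag & children] form
          tag-name (name tag)]
      (str "<" tag-name ">"
           (apply str (map hiccup->html children))
           "</" tag-name ">"))
    (str form)))   ;; strings and numbers are emitted as text

(hiccup->html [:div "Hello viewers"])
;; => "<div>Hello viewers</div>"

(hiccup->html [:div [:h1 "Hi"] [:p "there"]])
;; => "<div><h1>Hi</h1><p>there</p></div>"
```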

Subscribing

To render the DOM representation of some part of the app state, view functions must query for that part of app-db, and that means using subscribe.

subscribe is always called like this:

(rf/subscribe [query-id some optional query parameters])

There’s only one (global) subscribe function and it takes one argument, assumed to be a vector.

The first element in the vector (shown above as query-id) identifies the query, and the other elements are optional query parameters. With a traditional database a query might be:

SELECT * FROM customers WHERE name = 'blah'

In re-frame, that would be done as follows: (subscribe [:customer-query "blah"]), which would return a ratom holding the customer state (a value which might change over time!).

Because subscriptions return a ratom, they must always be dereferenced to obtain the value. This is a recurring trap for newbies.

The View Functions

This view function renders the clock:

(defn clock
  []
  [:div.example-clock
   {:style {:color @(rf/subscribe [:time-color])}}
   (-> @(rf/subscribe [:time])
       .toTimeString
       (clojure.string/split #" ")
       first)])

As you can see, it uses subscribe twice to obtain two pieces of data from app-db. If either changes, re-frame will re-run this view function.

We can render the clock like any other reagent component:

[clock]

The cool thing is that when we change a value in our app state, the clock changes immediately. For example, evaluating the following swap! expression changes the clock’s color (in the interactive version of this article it is commented out with #_):

(swap! db/app-db assoc :time-color "blue")

And this view function renders the input field:

(defn color-input
  []
  [:div.color-input
   "Time color: "
   [:input {:type "text"
            :value @(rf/subscribe [:time-color])        ;; subscribe
            :on-change #(rf/dispatch [:time-color-change (-> % .-target .-value)])}]])  ;; <---

Notice how it does BOTH a subscribe to obtain the current value AND a dispatch to say when it has changed.

It is very common for view functions to run event-dispatching functions. The user’s interaction with the UI is usually the largest source of events.

We can render the color-input like any other reagent component:

[color-input]

And then a view function to bring the others together, which contains no subscriptions or dispatching of its own:

(defn ui
  []
  [:div.clock
   [:h1.clock "Hello world, it is now"]
   [clock]
   [color-input]])

Kick Starting The App

Below, run is called to kick off the application once the HTML page has loaded.

It has two tasks:

  1. Load the initial application state
  2. Load the GUI by “mounting” the root-level function in the hierarchy of view functions – in our case, ui – onto an existing DOM element.

(defn run
  []
  (rf/dispatch-sync [:initialize])  ;; puts a value into application state
  (reagent/render [ui]   ;; mount the application's ui into '<div id="app" />'
                  (js/document.getElementById "app")))

After run is called, the app passively waits for events. Nothing happens without an event.

(run)

The run function renders the app in the DOM element whose id is app: this DOM element is located at the top of the page. This is the element we used to show what the app looks like at the beginning of the article.

Because I know you are too lazy to scroll up to the beginning of the article, I decided to render the whole app as a reagent element, right here:

[ui]

When it comes to establishing initial application state, you’ll notice the use of dispatch-sync, rather than dispatch. This is a simplifying cheat which ensures that a correct structure exists in app-db before any subscriptions or event handlers run.
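
The difference between the two can be sketched with a toy queue. Everything here is hypothetical (app-state, event-queue, dispatch-later!, dispatch-now!, process-queue!) and only mimics the timing; re-frame's real queue and router are more involved.

```clojure
(def app-state   (atom {}))   ;; stand-in for app-db, which starts out as {}
(def event-queue (atom []))   ;; pending events, oldest first

(defn handle
  "Single hard-coded handler, enough for this sketch."
  [db [event-id]]
  (case event-id
    :initialize {:time-color "orange"}
    db))

(defn dispatch-later! [event] (swap! event-queue conj event))  ;; like dispatch
(defn dispatch-now!   [event] (swap! app-state handle event))  ;; like dispatch-sync

(defn process-queue!
  "Handles every queued event in order, then empties the queue."
  []
  (doseq [event @event-queue]
    (swap! app-state handle event))
  (reset! event-queue []))

;; Queued: the state is still empty until the queue is processed.
(dispatch-later! [:initialize])
(def state-before-processing @app-state)   ;; {}
(process-queue!)
(def state-after-processing @app-state)    ;; {:time-color "orange"}

;; Synchronous: the state is ready as soon as the call returns.
(reset! app-state {})
(dispatch-now! [:initialize])
(def state-after-sync @app-state)          ;; {:time-color "orange"}
```

In the queued case, anything that reads the state before the queue is processed sees the empty map, which is the situation the synchronous call avoids.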

I hope you enjoyed this interactive tutorial and gained a better understanding of how to write a re-frame application. If you liked this article, you might also like my book…

Copyright © 2009, Planet Clojure. No rights reserved.
Planet Clojure is maintained by Baishamapayan Ghose.
Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
Theme by Brajeshwar.