Decompiling Clojure

Guillermo Winkler has started a series of posts on decompiling Clojure.

Thus far, Decompiling Clojure I and Decompiling Clojure II, The Compiler have been posted.

From the first post:

This is the first in a series of articles about decompiling Clojure, that is, going from JVM bytecode created by the Clojure compiler, to some kind of higher level language, not necessarily Clojure.

This article was written in the scope of a larger project, building a better Clojure debugger, which I’ll probably blog about in the future.

These articles are going to build from the ground up, so you may skip forward if you find some of the stuff obvious.

Just in case you want to read something more challenging than the current FCC and/or security news.


Using SendGrid with Clojure in The Next Web HackBattle

This week is The Next Web conference in Amsterdam, where I participated in the HackBattle. Like I did two years ago, I took a project I am currently working on and tried to use one of the HackBattle API partners. Last time I used BrowserChannel and the Rijksmuseum API. The current project I am working on is to complete the Clojure web application story by building a back-end for a TodoMVC ClojureScript front-end. I've integrated SendGrid's Inbound Parse Webhook API to be able to add todos via email.

Using SendGrid you can send an email with your todo to add it to your todo list, when you use your profile name in the subject line. On the server side this was easily implemented by adding a POST handler and setting some DNS settings properly.
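
A minimal sketch of what such a POST handler might look like, written as a plain Ring-style function. It assumes that middleware has already parsed the webhook's form fields into :params with string keys, that the profile name arrives in the "subject" field and the todo text in the "text" field, and that todos live in an in-memory atom. The names todos and inbound-mail-handler are hypothetical and are not taken from the original project.

```clojure
(require '[clojure.string :as str])

;; Hypothetical in-memory store: profile name -> vector of todo titles.
(def todos (atom {}))

(defn inbound-mail-handler
  "Ring-style handler for the webhook POST.
  Assumes :params holds the parsed form fields under string keys."
  [request]
  (let [params  (:params request)
        profile (some-> (get params "subject") str/trim)
        title   (some-> (get params "text") str/trim)]
    (if (and (seq profile) (seq title))
      ;; Append the todo to the list kept for this profile.
      (do (swap! todos update-in [profile] (fnil conj []) title)
          {:status 200 :body "ok"})
      {:status 400 :body "missing subject or text"})))
```

Returning a 200 status matters here because a webhook sender will typically retry a delivery that does not get a success response.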


functional software developer at OpinionLab (Full-time)

OpinionLab is seeking a Software Developer with strong agile skills to join our Chicago, IL based Product Development team in the West Loop.

As a member of our Product Development team, you will play a critical role in the architecture, design, development, and deployment of OpinionLab's web-based applications and services. You will be part of a high-visibility agile development team empowered to deliver high-quality, innovative, and market leading voice-of-customer (VoC) data acquisition and feedback intelligence solutions. If you thrive in a collaborative, fast-paced, get-it-done environment and want to be a part of one of Chicago's most innovative companies, we want to speak with you!

Key Responsibilities include:

  • Development of scalable data collection, storage, processing & distribution platforms & services.
  • Architecture and design of a mission critical SaaS platform and associated APIs.
  • Usage of and contribution to open-source technologies and frameworks.
  • Collaboration with all members of the technical staff in the delivery of best-in-class technology solutions.
  • Proficiency in Unix/Linux environments.
  • Work with UX experts in bringing concepts to reality.
  • Bridge the gap between design and engineering.
  • Participate in planning, review, and retrospective meetings (à la Scrum).

Desired Skills & Experience:

  • BDD/TDD, Pair Programming, Continuous Integration, and other agile craftsmanship practices
  • Desire to learn Clojure (if you haven't already)
  • Experience with both functional and object-oriented design and development within an agile environment
  • Polyglot programmer with mastery of one or more of the following languages: Lisp (Clojure, Common Lisp, Scheme), Haskell, Scala, Python, Ruby, JavaScript
  • Knowledge of one or more of: AWS, Lucene/Solr/Elasticsearch, Storm, Chef

Get information on how to apply for this position.


Decomposing web app development

Web applications’ story has been incomplete for a long time. There’s a lot of people working in web development, a lot of effort put into it, a lot of thought (I hope), and still we’re far, far away from complex, evolving, reactive web apps. It’s still the Dark Ages.

Web frameworks approach this problem by solving all problems at once. They mix rendering, state management, server communication and reactivity into one big ball of, khm, software. It’s a complex, hard to control, hard to combine and rarely fit-all-your-needs-perfectly way to live your life. Unless you’re writing a TodoMVC app. Then you have a lot of good options, with perfect documentation and loads of examples.

But there’s no reason it has to be that way. We can get closer to building large, maintainable browser apps by separating concerns and providing solutions for them independently.

Rendering the DOM was a big problem with a lot of somewhat-okay-ish solutions to choose from, but then React.js popped up and now, just one year later, React is really, really hard to ignore. Even I, working on the server side most of the day, have already published public praise of it.

Communication between components is still very much unexplored. core.async is a more than fine foundation for it, but usage patterns and best practices have yet to emerge. I know, it is trivial on a small scale, just like connecting a plug and a socket, but when you have 100 cables to connect, you had better wait and see how smart people do it first.

And then there’s application state. It has been a grey area for a long time, with most frameworks covering it either too aggressively (like Meteor.js) or as an afterthought. And that’s where DataScript enters.

You always start small, and at that point any state management solution seems like overkill. You know, “I’ll do fine by just putting this into an array...”, “I’ll create a global variable to store the result of this AJAX request”, this kind of attitude. As you grow, this non-uniform, ad-hoc approach to state starts to get in your way. At some point you will need to query app state in interesting ways. You will need to subscribe to updates not in one model, but in two or three at once, or look for a specific pattern in the data. You will need rollback. You will need two-way server sync and a failure handling strategy; you will need caching and transactions. Unless you won’t. I mean, that’s a lot of needs, and I have no illusions that I can give you one single pill that will address all of them. And it’s not just me: so far nobody has succeeded in this area. But it doesn’t mean I can’t help.

If you’re familiar with Om, you may have noticed that the main thing it sells is not React integration. It’s a state management solution. Which is, in Om, just an atom (a mutable ref holding a pointer to an immutable tree) where you put the state of your whole app. This thing alone gives you a lot of nice properties: you can rewind to any point in time, you can subscribe to state changes, and synchronization logic can be done outside of the components and is not their concern. Even rendering is, in fact, decoupled nicely from the state, being just one of many potential listeners to state storage. Which is totally fine and a huge win by all means, the only problem being that your state is rarely a nested Hashmap. You can present your app state as a nested Hashmap, but you’ll soon realize that it is rare for a component to depend on a strict subtree of that structure. I mean, I wrote a 200-line Om app and I already faced this issue.
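
The properties listed above can be sketched in plain Clojure, without Om or React. This is only an illustration of the single-atom idea; the names app-state, history and one-step-back are made up for the example.

```clojure
;; The whole app state lives in one atom holding an immutable map.
(def app-state (atom {:todos []}))

;; A listener that lives outside any component: it records every state value.
(def history (atom [@app-state]))

(add-watch app-state :history
  (fn [_key _ref _old-state new-state]
    (swap! history conj new-state)))

(swap! app-state update-in [:todos] conj {:title "Write post"})
(swap! app-state update-in [:todos] conj {:title "Ship it"})

;; Rewind: earlier values are still intact because they are immutable.
(def one-step-back (nth @history 1))
```

A renderer would be just another watch on the same atom, which is what decouples it from the code that changes the state.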

So, if we want to do better (which we do), how do we keep all these nice properties of Om? They come from two simple facts: state is an immutable data structure, and state management is uniform: everything your app cares about is stored in one single place.

DataScript is exactly that: it’s an immutable, uniform state management solution. You can think of it as a DB (and that’s totally correct because it imitates a server-side DB, Datomic), but very lightweight and purely in-memory. Or, to put it better, it’s an immutable data structure, like a Hashmap, with the ability to run non-trivial queries over it. The whole database is an immutable value, and at any point you can take its value, run a query over it (no matter if it’s the current database or a snapshot from 2 weeks ago), pass it to render, put it into an array, store it, send it over the wire and so on. It then adds a thin layer on top of that which provides atomic mutations and the ability to subscribe to data coming in and out of the database.
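
To make "the database is an immutable value" concrete, here is a toy sketch in plain Clojure. It is not DataScript's API: the set-of-facts representation and the entities-with helper are assumptions made purely for illustration.

```clojure
;; A database value: an immutable set of [entity attribute value] facts.
(def db-v1 #{[1 :name "Ivan"] [1 :age 17]})

;; "Transacting" produces a new value; db-v1 itself is untouched.
(def db-v2 (conj db-v1 [2 :name "Oleg"] [2 :age 32]))

(defn entities-with
  "Hypothetical query helper: ids of entities whose attr value satisfies pred."
  [db attr pred]
  (set (for [[e a v] db
             :when (and (= a attr) (pred v))]
         e)))

;; The same query runs against the current value or against an old snapshot.
(def adults-now    (entities-with db-v2 :age #(> % 18)))
(def adults-before (entities-with db-v1 :age #(> % 18)))
```

Because both db-v1 and db-v2 remain valid values, either one can be handed to a renderer, stored, or compared later.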

And that’s it. There’s nothing more to it. It does not do automatic server sync, it does not do lazy loading, it does not persist itself to local storage, it does not do reactive programming. Instead, it’s a foundation. A sound, capable primitive to build a storage solution that fits your application’s needs.

The idea of having a database running inside your browser sounds less crazy when you start to think about how much state modern client-side applications have to deal with. Take GMail, one of the pioneers of rich web applications: it loads a pile of emails organized into threads which are attached to labels. At each moment, you have up to three simultaneous views into the same dataset to be kept in sync. Stuff like this is most naturally expressed as queries to structured storage.

But browsers are so scarce in resources, you say. That’s why I do not recommend thinking of DataScript as a database. In the traditional mindset, doing an SQL query is a pain. It’s a thing to avoid. For an in-memory database, there’s no particular overhead to it. You don’t do networking, you don’t do serialization/deserialization. It all comes down to a lookup in a data structure. Or a series of lookups. Or array iteration. You put little data in it, it’s fast. You put a lot of data in, well, at least it has indexes. That should do better than you filtering an array by hand anyway. Yes, you may even get some performance benefits out of it, although that’s not a primary objective. But the thing is really lightweight.

So, here’s DataScript. Check out the repo. I hope this example will motivate other people to build other solutions with different performance characteristics and different usage experience. I’ll definitely be glad to see that. The idea is not to use my library, but to have tons of libraries with intentionally narrow scope and excellent combinability. As an app developer, I want to decompose needs of my app and, for every one of them, choose the best possible solution out there. Maybe I’ll have to write some glue code, but in this perfect world, I’m totally ok with it.


Othello from Paradigms of Artificial Intelligence Programming, re-written in Clojure Part 2: Strategies

Othello Strategies

This is the second part of my re-write of Norvig’s Othello, originally written in Common Lisp, in Clojure. In this part I show how I’ve implemented the various Othello playing strategies in Clojure.

One important thing to note is that, with the exception of the random and human player strategies, the strategies are static. By that I mean given the same board position they will always generate the same move. This means playing two of these strategies against each other will always end with the same result (switching which moves first may produce a different result)! Norvig notes this in Section 18.8 and shows how we can use random starting positions to evaluate strategies against each other.

As with Part One of this series I’ve left my test and experimental code in place but commented out (usually just using ; but occasionally using the #_ reader macro which, as you recall, causes the next form to be ignored). The same caveats apply: I may well have modified the code or changed names since writing the commented out elements so they may no longer work.
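
As a quick reminder of the difference between the two ways of disabling code, here is a small sketch (the name squares is just for the example):

```clojure
;; A line comment hides the rest of the line from the reader.
;(println "never runs")

;; #_ discards exactly one form, even in the middle of an expression.
(def squares [1 #_4 9 #_(this is never evaluated) 16])
```

The discarded form still has to be readable (balanced parentheses and so on), which is why #_ is handy for switching off a whole defn at once.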

This takes us up to section 18.8 of PAIP (I’ve decided not to implement the tournament clock until after I have a working GUI).

;;;;
;;;; OTHELLO GAME STRATEGIES
;;;;

(ns othello.strategies
  (:use [othello.core]))

(defn legal-moves
  "Return a sequence of legal moves, as [column row] pairs, for player"
  [board player]
  (for [row [1 2 3 4 5 6 7 8] column [1 2 3 4 5 6 7 8]
      :when (legal-move? board player [column row])]
      [column row]))

;(legal-moves starting-position :black)
;(legal-moves full-board :black)
;(random-strategy full-board :black)
;(legal-moves full-board :black)

(defn random-strategy
  "Returns a random legal move:
  simple, but not a very effective Othello playing strategy."
  [board player]
  (rand-nth (legal-moves board player)))

;(rand-nth [[0 1][1 1][1 2][2 3]])
;(rand-nth (legal-moves initial-board :black))
;(random-strategy initial-board :black)

(defn maximiser
  "Return a strategy that will consider every legal move,
  apply eval-fn to each resulting board, and choose
  the move for which eval-fn returns the best score.
  eval-fn takes two arguments: the board and the player-to-move."
  [eval-fn]
  (fn
    [board player]
    (let [moves (legal-moves board player)
          scores (map (fn [move]
                          (eval-fn (make-move board move player)
                                   player))
                          moves)
          best-index (first (apply max-key second (map-indexed vector scores)))]
      (nth moves best-index))))

;
; Calculates the weighted position of the given board for the given player
;

(def ^:const weights
                    [[0   0   0   0   0   0   0   0   0   0]
                     [0 120 -20  20   5   5  20 -20 120   0]
                     [0 -20 -40  -5  -5  -5  -5 -40 -20   0]
                     [0  20  -5  15   3   3  15  -5  20   0]
                     [0   5  -5   3   3   3   3  -5   5   0]
                     [0   5  -5   3   3   3   3  -5   5   0]
                     [0  20  -5  15   3   3  15  -5  20   0]
                     [0 -20 -40  -5  -5  -5  -5 -40 -20   0]
                     [0 120 -20  20   5   5  20 -20 120   0]
                     [0   0   0   0   0   0   0   0   0   0]])

(defn weight-this-square
  [player square-piece square-weight]
  (cond
         (= square-piece player) square-weight
         (= square-piece (opponent player))  (- square-weight)
         :else 0))

(defn weight-row
  [player row row-weights]
  (reduce + (map (partial weight-this-square player) row row-weights)))

(defn weighted-squares
  "An eval-fn to use with the maximiser function that will generate a
  strategy that maximises the weighted score (using weights)."
  [board player]
  (reduce + (map (partial weight-row player) board weights)))

;;
;; Minimax
;;

(def winning-value 100000)
(def losing-value -100000)
(def draw-value 0)

(defn- final-value
  "Is this a win, loss or draw for player?"
  [board player]
  (let [score (count-difference board player)]
  (cond
     (neg? score) losing-value
     (pos? score) winning-value
     :else draw-value)))

; (bigger [1 [1 2]] [10 [3 4]])

(defn- convert
  "Converts the value for an opposing player's
  evaluated move by negating the value component"
  [[value move]]
  [(- value) move])

(defn- bigger
  "Compares two [value move] and returns the one with the bigger value.
  Returns the second one (different move) if they have the same value."
  [[val-1 mv-1 :as val-mv-1] [val-2 mv-2 :as val-mv-2]]
  (if (> val-1 val-2)
    val-mv-1
    val-mv-2))

#_(declare minimax)

#_(defn- best-move
  "Returns the best move out of MOVES for the player
  as a 2 element array [value move]"
  [board moves player ply eval-fn]
  (reduce bigger
    (for [move moves] ; create a vector of 2 element vectors of value and move.
      [(- (first (minimax (make-move board move player)
                        (opponent player)
                        (dec ply)
                        eval-fn))) ; note that we deliberately do *not*
                                   ; use the move returned by minimax
                                   ; as that is the *opponent's* move.
      move])))

(defn- minimax
  "Find the best move for PLAYER, according to EVAL-FN,
  searching PLY levels deep and backing up values."
  [board player ply eval-fn]
  (if (zero? ply)
    [(eval-fn board player) nil]
    (let [moves (legal-moves board player)]
      (if (empty? moves)
        (if (any-legal-move? board (opponent player))
          (convert (minimax board (opponent player) (dec ply) eval-fn))
          [(final-value board player) nil])
;        (best-move board moves player ply eval-fn)
        (reduce bigger
          (for [move moves] ; create a vector of 2 element vectors of value and move.
               [(- (first (minimax (make-move board move player)
                                   (opponent player)
                                   (dec ply)
                                   eval-fn))) ; note that we deliberately do *not*
                                              ; use the move returned by minimax
                                              ; as that is the *opponent's* move.
                  move]))

        ))))

;(minimax starting-position :black 8 weighted-squares)
;(final-value full-board :white)

(defn minimax-searcher
  "Returns a strategy function based on minimax."
  [ply eval-fn]
  (fn
    [board player]
    (nth (minimax board player ply eval-fn) 1)))

;(minimax-searcher 3 weighted-squares)

;;
;; Minimax with alpha-beta pruning.
;;

#_(declare alpha-beta)

#_(defn- best-move-alpha-beta
  "Returns the best move out of MOVES for the player
  as a 2 element array [value move]"
  [board moves player achievable cutoff ply eval-fn]
  (loop  [mvs moves
          current-achievable achievable
          best-move (first moves)]
    (if (empty? mvs)
      [current-achievable best-move]
      (let [move (first mvs)
            value (- (first (alpha-beta (make-move board move player)
                                        (opponent player) (- cutoff)
                                        (- achievable) (dec ply) eval-fn)))]
        (if (> value current-achievable)
          (if (>= value cutoff)
              [value move]
              (recur (rest mvs) value move))
          (recur (rest mvs) current-achievable best-move))))))

(defn alpha-beta
  "Find the best move, for PLAYER, according to EVAL-FN,
  searching PLY levels deep and backing up values,
  using cutoffs whenever possible."
  [board player achievable cutoff ply eval-fn]
  (if (zero? ply)
    [(eval-fn board player) nil]
    (let [moves (legal-moves board player)]
      (if (empty? moves)
        (if (any-legal-move? board (opponent player))
          ; player's turn skipped, opponent plays again
          (convert (alpha-beta board (opponent player)
                               (- cutoff) (- achievable)
                               (dec ply) eval-fn))
          ; Neither player nor opponent has a move: game over
          [(final-value board player) nil])
        ; player has at least one legal move, which is the best?
;        (best-move-alpha-beta board moves player achievable cutoff ply eval-fn)
        (loop  [mvs moves current-achievable achievable best-move (first moves)]
          (if (empty? mvs)
            [current-achievable best-move]
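            (let [move (first mvs)
                  ; pass the best value found so far (not the original bound)
                  ; so that deeper searches can cut off earlier
                  value (- (first (alpha-beta (make-move board move player)
                                              (opponent player) (- cutoff)
                                              (- current-achievable) (dec ply) eval-fn)))]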
              (if (> value current-achievable)
                (if (>= value cutoff)
                  [value move]
                  (recur (rest mvs) value move))
                (recur (rest mvs) current-achievable best-move)))))

        ))))

; (minimax starting-position :black 8 weighted-squares)
;(time (trampoline (alpha-beta starting-position :black losing-value winning-value 8 weighted-squares)))

(defn alpha-beta-searcher
  "Returns a strategy that searches to PLY and uses EVAL-FN."
  [ply eval-fn]
  (fn
    [board player]
    (second (alpha-beta board player losing-value winning-value ply eval-fn))))

;;
;; Modified weighted-squares
;;

;; Bit boring, so not ported yet: the form below is still essentially
;; Common Lisp (dolist, incf, aref, bref, eql) and is skipped with #_.

#_(defn modified-weighted-squares
  "Like WEIGHTED-SQUARES, but don't take off
  for moving near an occupied corner."
  [board player]
  (let [w (weighted-squares board player)]
    (dolist [corner [11 18 81 88]]
      (when-not (eql (bref board corner) :empty)
        (dolist [c (neighbours corner)]
          (when-not (eql (bref board c) :empty)
            (incf w (* (- 5 (aref *weights* c))
                       (if (eql (bref board c) player)
                         +1
                         -1)))))))
    w))
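
The sign-flipping convention used by minimax and alpha-beta above can be hard to see inside the Othello code, so here is a self-contained sketch of the same search shape on a toy game. The game is an assumption chosen for brevity (one pile of stones, take 1 to 3 per turn, whoever takes the last stone wins), and nim-moves and nim-search are hypothetical names, not part of the Othello code.

```clojure
(defn nim-moves
  "Legal moves with n stones left: take 1, 2 or 3, but never more than n."
  [n]
  (range 1 (inc (min 3 n))))

(defn nim-search
  "Returns [value move] for the player to move with n stones left.
  value is 1 for a forced win and -1 for a forced loss."
  [n achievable cutoff]
  (if (zero? n)
    [-1 nil] ; the previous player took the last stone, so we have lost
    (loop [mvs (nim-moves n)
           best achievable
           best-move (first mvs)]
      (if (empty? mvs)
        [best best-move]
        (let [move  (first mvs)
              ;; the opponent's value is negated, and the bounds are
              ;; swapped and negated, just as in alpha-beta above
              value (- (first (nim-search (- n move) (- cutoff) (- best))))]
          (cond
            (<= value best)   (recur (rest mvs) best best-move)
            (>= value cutoff) [value move] ; good enough: stop searching
            :else             (recur (rest mvs) value move)))))))

;; With 5 stones, taking 1 leaves the opponent a lost position (4 stones).
(def five-stones (nim-search 5 -1 1))
```

As in alpha-beta-searcher, the initial bounds are the losing and winning values, here -1 and 1.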

Next

I’m currently working on using a Clojure wrapper for Swing, SeeSaw, to build a GUI for Othello. This uses an atom to hold the current board state with a watcher to update the display of the board when the board changes. My current problem is handling the mismatch between repaint!, paint and reference watchers! Hopefully that won’t prove insoluble without having to resort to a non-functional solution (such as dynamic scope vars to pass values around). Readers familiar with PAIP will note that Norvig doesn’t implement a GUI (his focus is AI of course) but one of the big reasons I chose Clojure over Common Lisp was precisely the ability to build a standard, native-looking GUI application; something that’s not possible on Common Lisp because the ANSI standard doesn’t cover GUIs.



Left to right, top to bottom

TL;DR - Clojure's threading macro keeps code in a legible order, and it's more extensible than methods.

When we create methods in classes, we like that we're grouping operations with related data. It's a useful organizational scheme. There's another reason to like methods: they put the code in an order that's easy to read. In the old days it might read top-to-bottom, with subject and then verb and then the objects of the verb:


With a fluent interface that supports immutability, methods still give us a pleasing left-to-right ordering:

Methods look great, but it's hard to add new ones. Maybe I sometimes want to add functionality for returns, or print a gift receipt. With functions, there is no limit to this. The secret is: methods are the same thing as functions, except with an extra secret parameter called this.

For example, consider JavaScript. A method there can be any old function, and it can use properties of this.


var completeSale = function(num) {
  console.log("Sale " + num + ": selling "
    + this.items + " to " + this.customer);
};


Give that value to an object property, and poof, the property is a method:


var sale = {
  customer: "Fred",
  items: ["carrot", "eggs"],
  complete: completeSale
};
sale.complete(99);
// Sale 99: selling carrot,eggs to Fred


Or, call the function directly, and the first argument plays the role of "this":

completeSale.call(sale, 100)
// Sale 100: selling carrot,eggs to Fred

In Scala we can create methods or functions for any operation, and still organize them right along with the data. I can choose between a method in the class:

class Sale(...) {
   def complete(num: Int) {...}
}

or a function in the companion object:

object Sale {
   def complete(sale: Sale, num: Int) {...}
}

Here, the function in the companion object can even access private members of the class[1]. The latter style is more functional. I like writing functions instead of methods because (1) all input is explicit and (2) I can add more functions as needed, and only as needed, and without jumbling up the two styles. When I write functions about data, instead of attaching functions to data, I can import the functions I need and no more. Methods are always on a class, whether I like it or not.

There's a serious disadvantage to the function-with-explicit-parameter choice, though. Instead of a nice left-to-right reading style, we get:


It's all inside-out-looking! What happens first is in the middle, and the objects are separated from the verbs they serve. Blech! It sucks that function application reads inside-out, right-to-left. The code is hard to follow.

We want the output of addCustomer to go to addItems, and the output of addItems to go to complete. Can I do this in a readable order? I don't want to stuff all my functions into the class as methods.
In Scala, I wind up with this:

Here it reads top-down, and the arguments aren't spread out all over the place. But I still have to draw lines, mentally, between what goes where. And sometimes I screw it up.

Clojure has the ideal solution. It's called the threading macro. It has a terrible name, because there's no relation to threads, nothing asynchronous. Instead, it's about cramming the output of one function into the first argument of the next. If addCustomer, addItems, and complete are all functions which take a sale as the first parameter, the threading macro says, "Start with this. Cram it into the first argument of the function call, and take that result and cram it into the first argument of the next function call, and so on." The result of the last operation comes out.

\\ Sale 99 : selling [carrot eggs] to Fred
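
A minimal sketch of the idea, assuming each sale function takes the sale as its first argument. The function bodies and the map representation of a sale are made up for illustration; only the names mirror addCustomer, addItems and complete from the text.

```clojure
(require '[clojure.string :as str])

;; Hypothetical sale functions; each takes the sale as its first argument.
(defn add-customer [sale customer] (assoc sale :customer customer))
(defn add-items    [sale items]    (update-in sale [:items] into items))
(defn complete     [sale num]
  (str "Sale " num ": selling " (str/join ", " (:items sale))
       " to " (:customer sale)))

;; Thread-first: each result becomes the first argument of the next call.
(def threaded
  (-> {:items []}
      (add-customer "Fred")
      (add-items ["carrot" "eggs"])
      (complete 99)))

;; The same computation written inside-out, for comparison.
(def nested
  (complete (add-items (add-customer {:items []} "Fred")
                       ["carrot" "eggs"])
            99))
```

Both forms produce the same string; the macro only rearranges the code before it is compiled.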


This has a clear top-down ordering to it. It's subject, verb, object. It's a great substitute for methods. It's kinda like stitching the data in where it belongs, weaving the operations together. Maybe that's why it's called the threading macro. (I would have called it cramming instead.)

Clojure's prefix notation has a reputation for being harder to read, but this changes that. The threading macro pulls the subject out of the first function argument and puts it at the top, at the beginning of the sentence. I wish Scala had this!

-----------------
Encore:
In case you're still interested, here's a second example: list processing.

Methods in Scala look nice:



but they're not extensible. If these were functions I'd have:


which is hideous. So I wind up with:

That is easy to mess up; I have to get the intermediate variables right.

In Haskell it's function composition:

That reads backwards, right-to-left, but it does keep the objects with the verbs.

Notice that in Haskell the map, filter, reduce functions take the data as their last parameter.[2] This is also the case in Clojure, so we can use the last-parameter threading macro. It has the cramming effect of shoving the previous result into the last parameter:
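
A small sketch of the last-parameter version, using a made-up list of numbers rather than the data from the original example:

```clojure
;; Thread-last: each result becomes the last argument of the next call.
(def total
  (->> [1 2 3 4 5 6]
       (filter even?)      ; keeps 2, 4 and 6
       (map #(* % %))      ; squares them
       (reduce +)))        ; adds them up

;; The same computation written inside-out, for comparison.
(def nested-total
  (reduce + (map #(* % %) (filter even? [1 2 3 4 5 6]))))
```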

Once again, Clojure gives us a top-down, subject-verb-object form. See? The Lisp is perfectly readable, once you know which paths to twist your brain down.


Update: As @ppog_penguin reminded me, F# has the best syntax of all. Its pipe operator acts a lot like the Unix pipe, and sends data into the last parameter.
F# is my favorite!

------------
[1] technical detail: the companion object can't see members that are private[this]
[2] technical detail: all functions in Haskell take one parameter; applying map to a predicate returns a function of one parameter that expects the list.


Tawny and Protege

Tawny-OWL [1] enables a rich programmatic interface to OWL and ontology building. To an extent, I wrote Tawny because I wanted to get away from the use of Protege [2] as an ontology editor. I compare the experience of Protege to Tawny as similar to a comparison between Excel and R; if the former does what you need, then it’s fine, but it’s hard to extend. So, it is with Tawny — it is simple to add patterns, new syntaxes, new capabilities. And I have access to all the standard tools that I expect with any programmatic environment; I can use versioning, build tools and test harnesses.

Having said all of this, Tawny-OWL comes with some cost. Although most IDEs have good capabilities for jumping to definitions and the like, they are limited compared to the display capabilities of Protege [2]: the ability to navigate quickly through an ontology, and to use tools like OWLViz to get a broad overview of the ontology structure.

Even if I feel that Protege is limited as an editor, I would still like to use its visualisation capabilities; it is unfortunate if, in choosing Tawny-OWL, I have to abandon Protege. This is not, however, necessary. It is possible to use Protege to visualise an ontology created by Tawny with synchronisation; changes are displayed by Protege immediately, as it is displaying the live data models that Tawny is manipulating. This is achieved by Protege-Nrepl; in this post, I describe the implementation behind it.


Background

Tawny is implemented in Clojure, which is a Lisp that compiles down to Java bytecode; the OWL functionality comes from the OWL API, which is the same API that Protege uses. In an abstract sense, then, it should be possible to plug the two together; to have Tawny operate over the same data structures that Protege is displaying.

There are a number of ways to connect a Clojure process to an IDE, but the most common way is with a relatively recent tool called nrepl. This is a protocol, and a tool implementing this protocol, which allows communication with a Clojure process. There are now quite a few tools which have implemented clients to this protocol.


Protege-Nrepl

I was fortunate that Clojure provided most of the tools that I needed. Protege-Nrepl is a Protege plugin which places a single menu item into the Protege frame. This then launches an internal Clojure process, which in turn launches an Nrepl socket. As it stands, Protege-Nrepl is not specific to Tawny; it simply provides a Clojure process. On top of this, there is a small bridge package called Tawny-Protege which links together the data structures of Tawny and Protege.

From a practical point-of-view, this means that I can launch protege, then connect to it from Emacs (or any other Clojure IDE). The IDE then operates in the same way as if Clojure were launched internally.

In theory, the process is very simple: I chose to implement the plugin itself in Java because this seemed easiest, not least because Protege provides a standard maven file to build plugins (initially, I used the older ant build, but the dependencies were a pain). Protege is an OSGI application; I have little knowledge of OSGI, so not having to work this part out was a relief. On the Java side, the relevant code looks like this:

RT.loadResourceScript("protege/dialog.clj");
RT.loadResourceScript("protege/nrepl.clj");

Var init = RT.var("protege.nrepl","init");
init.invoke();

// and later
Var newDialog = RT.var("protege.dialog", "new-dialog-panel");

Additionally there is some glue to implement the plugin interface, and some threading (loading Clojure in the paint thread is not a good idea). The protege.nrepl/init function loads a user config file, while protege.dialog/new-dialog-panel creates a GUI which starts the nrepl server.

That should be the process complete, but in my hands this failed; the problem is that OSGI requires me to pre-declare all the packages that I want to import within a bundle, so they get into the classpath. In this case, I included all the dependencies transitively anyway; the whole point of the plugin was to package Clojure up for Protege, so there was little point adding it independently. Protege classes (for the plugin) need to come from the protege environment, as do the OWL API classes, or I will not be able to manipulate objects created by protege with Tawny, as they would be different classes (of the same name, but different classloader).

For reasons that I could not determine, the OSGI manifest plugin also inserted a large number of dependency packages, including javax.servlet, junit, and some sun.misc classes; these are not available, meaning that, even though they are not actually used, they make the plugin crash unless they are specifically excluded. All of this was achieved with the following modifications to the maven-bundle plugin.

<instructions>
  <Bundle-ClassPath>.</Bundle-ClassPath>
  <Bundle-SymbolicName>${project.artifactId};singleton:=true</Bundle-SymbolicName>
  <Bundle-Vendor>Phil Lord</Bundle-Vendor>

  <!-- We exclude a bunch of things here which otherwise get
       into the import list and are not provided from anywhere. How
       do they get there? No idea! -->
  <Import-Package>
    !javax.servlet*,!junit.*,!org.junit*,!org.apache.*,
    !org.testng.*,!sun.misc.*,*
  </Import-Package>
  <Include-Resource>plugin.xml,{maven-resources}</Include-Resource>
  <Embed-Transitive>true</Embed-Transitive>
  <Embed-Dependency>*;scope=compile</Embed-Dependency>
  <Require-Bundle>
    org.protege.editor.core.application,
    org.protege.editor.owl,
    org.semanticweb.owl.owlapi
  </Require-Bundle>
</instructions>
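
To illustrate how an ordered Import-Package list like the one above behaves, here is a minimal Clojure sketch. It assumes a simplified model: entries are tried in order, the first matching entry decides, and a leading ! means "do not import". The real bnd tool behind the maven-bundle plugin has more rules than this, so treat it as an approximation. The names glob->regex, parse-entry and imported? are hypothetical helpers, not part of any library.

```clojure
(require '[clojure.string :as str])

(defn glob->regex
  "Turn a pattern such as junit.* into a regex; * matches any run of characters.
  Simplification: every other character is matched literally."
  [glob]
  (re-pattern
   (str/join
    (map (fn [c]
           (if (= c \*)
             ".*"
             (java.util.regex.Pattern/quote (str c))))
         glob))))

(defn parse-entry
  "Split one Import-Package entry into its negation flag and its pattern."
  [entry]
  (let [entry    (str/trim entry)
        negated? (= \! (first entry))]
    {:negated? negated?
     :regex    (glob->regex (if negated? (subs entry 1) entry))}))

(defn imported?
  "True if pkg would be imported under the assumed first-match-wins model."
  [instructions pkg]
  (let [entries (map parse-entry (str/split instructions #","))
        hit     (first (filter #(re-matches (:regex %) pkg) entries))]
    (boolean (and hit (not (:negated? hit))))))

;; The instruction string from the post, copied verbatim.
(def import-package
  "!javax.servlet*,!junit.*,!org.junit*,!org.apache.*,
   !org.testng.*,!sun.misc.*,*")

(imported? import-package "javax.servlet.http")            ;=> false
(imported? import-package "org.semanticweb.owlapi.model")  ;=> true
```

Under this model the trailing * is what keeps every package that is not explicitly excluded in the import list, which is why the exclusions have to come first.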

On the Clojure side, the final addition was Pomegranate; enabling Clojure in Protege is fairly useless without being able to add new dependencies (such as Tawny!), but I did not want to add these to the maven build. Pomegranate allows me to add new dependencies on the fly.

As I always use Tawny, I add the following to ~/.protege-nrepl/init.clj so that it is loaded alongside Protege. I may change this so that it happens automatically; anyone who wanted to use protege-nrepl without Tawny could still do so.

(ns init
  (:require
   [cemerick.pomegranate]
   [protege model nrepl]))

;; force loading of tawny
(cemerick.pomegranate/add-dependencies
 :coordinates '[[uk.org.russet/tawny-protege "1.1.0-SNAPSHOT"]]
 :repositories (merge cemerick.pomegranate.aether/maven-central
                      {"clojars" "http://clojars.org/repo"}))
;; and monkey patch the thing
(require 'tawny.protege-nrepl)

;; initing the dialog takes ages -- so auto connect
(dosync (ref-set protege.model/auto-connect-on-default true))

Lein-Sync

When launched from within Protege, the Clojure process will be running independently of a Maven or leiningen project. If, for example, I try to load tawny.pizza/pizza, Clojure will fail, as it can find neither the local resources nor any dependencies.

To handle this situation, I have created lein-sync, a leiningen plugin which is run in the project directory and creates a .sync.clj file containing all the Pomegranate code needed to extend the local classpath. For instance, the file generated for tawny.pizza looks like this:

;; This file is auto-generated by lein sync
(require 'cemerick.pomegranate)
(cemerick.pomegranate/add-dependencies
 :coordinates
 '[[uk.org.russet/tawny-owl "1.0-SNAPSHOT"]
   [org.clojure/tools.nrepl
    "0.2.3"
    :exclusions
    ([org.clojure/clojure])]
   [clojure-complete/clojure-complete
    "0.2.3"
    :exclusions
    ([org.clojure/clojure])]
   [ritz/ritz-nrepl-middleware "0.7.0"]
   [org.clojure/tools.trace "0.7.5"]
   [compliment/compliment "0.0.1"]]
 :repositories
 '[["central"
    {:snapshots false, :url "http://repo1.maven.org/maven2/"}]
   ["clojars" {:url "https://clojars.org/repo/"}]])
(cemerick.pomegranate/add-classpath
 "/home/phillord/src/knowledge/ontology-clj/tawny-pizza/src")
(cemerick.pomegranate/add-classpath
 "/home/phillord/src/knowledge/ontology-clj/tawny-pizza/dev-resources")
(cemerick.pomegranate/add-classpath
 "/home/phillord/src/knowledge/ontology-clj/tawny-pizza/resources")
(.println System/out "Loaded .sync in pizza")

Some of these dependencies (compliment, tools.trace) come from my local leiningen configuration. Loading this file ensures that an nrepl launched from within Protege behaves in the same way as a locally launched nrepl. Currently, classpath extension uses fully qualified paths, which obviously requires the same (or a shared) file system between the leiningen instance generating .sync.clj and Protege; I may address this later, as it would enable me to run Protege on a different machine from the IDE.
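
The shape of a generated .sync.clj suggests one way such a file can be produced: build the forms as plain Clojure data and pretty-print them. The sketch below is an assumption about the general technique, not the actual lein-sync implementation; sync-forms, sync-file-text and the example project map (including its paths) are hypothetical.

```clojure
(require '[clojure.pprint :as pp])

(defn sync-forms
  "Build the forms of a .sync.clj-style file from a project map.
  Only data is constructed here; Pomegranate itself is not needed."
  [{:keys [dependencies repositories source-paths] :as project}]
  (concat
   ['(require 'cemerick.pomegranate)
    (list 'cemerick.pomegranate/add-dependencies
          :coordinates (list 'quote (vec dependencies))
          :repositories (list 'quote (vec repositories)))]
   ;; one add-classpath call per local directory
   (for [path source-paths]
     (list 'cemerick.pomegranate/add-classpath path))
   [(list '.println 'System/out
          (str "Loaded .sync in " (:name project)))]))

(defn sync-file-text
  "Render the forms as the text of a loadable Clojure file."
  [project]
  (with-out-str
    (println ";; illustrative sketch, not real lein-sync output")
    (doseq [form (sync-forms project)]
      (pp/pprint form))))

;; Hypothetical project map; the paths are placeholders.
(def example-project
  {:name         "pizza"
   :dependencies '[[uk.org.russet/tawny-owl "1.0-SNAPSHOT"]]
   :repositories '[["clojars" {:url "https://clojars.org/repo/"}]]
   :source-paths ["/path/to/tawny-pizza/src"
                  "/path/to/tawny-pizza/resources"]})

(print (sync-file-text example-project))
```

Because the paths are written out as absolute strings, a file produced this way has the same limitation described above: it only works where those directories exist.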

Finally, I have written some Emacs code to connect to the nrepl server and automatically run .sync.clj on connection; adding something similar for other IDEs would be straightforward, although manual use of the repl is also possible.


Conclusions

Given the availability of all the tools, building protege-nrepl was conceptually straightforward. In practice, it was made somewhat more complex by a combination of ClassLoaders, OSGI and the need to dynamically extend the classpath in a running JVM. In particular, my experience of running OSGI has not been positive; I spent a substantial amount of time chasing down a very strange bug caused by an inconsistency between the OWL API and Protege. Combined with the strange behaviour of the maven plugin, which I only solved through repeated trial-and-error restarts, it all added a lot of complexity. Currently, I am using a pre-release version of Protege as this has been ported to maven; this requires a local build, which I realize is not an end-user experience.

The end product, however, was worth the effort. Despite my criticisms of Protege, it remains an excellent tool; having a running Protege that updates live is a considerable advance over the old “save and reload” workflow that I used previously. I look forward to the next release of Protege, as this use of Tawny-OWL, protege-nrepl and Protege will increase the attractiveness of Tawny considerably.

References

  1. P. Lord, "The Semantic Web takes Wing: Programming Ontologies with Tawny-OWL", OWLED 2013, 2013. http://www.russet.org.uk/blog/2366
  2. "The Protégé Ontology Editor and Knowledge Acquisition System", http://protege.stanford.edu/

Permalink

Learning Clojure: … [What NOT to Read]

Learning Clojure: Tutorial, Books, and Resources for Beginners by Nikola Peric.

From the post:

New to Clojure and don’t know where to start? Here are some books, tutorials, blog posts, and other resources for beginners that I found useful while getting used to the language. I’ll also highlight some resources I’d recommend staying away from due to better alternatives. Brief disclaimer: I have either read at least ~75% of each of these resources – some just weren’t worth reading through to the end.

Let’s start with some books after the break!

A refreshing post on Clojure resources!

Nikola not only has positive suggestions but also says what resources he would avoid.

Reporting every book on Clojure could be useful for some purpose, but for beginners, straight talk about what NOT to read is as important as what to read.

Point anyone interested in Clojure to Nikola’s post.

Permalink

Simple error handling using slingshot and clj-http

Lately I've been thinking about simplicity in software. When I say simple I mean: not compound. This is different from easy, which is a measure of familiarity. If this distinction is unfamiliar to you, I recommend you stop reading this and go watch Rich Hickey's talk Simple Made Easy first. With that said, let's turn to the subject matter at hand: What does simple error handling look like?

Permalink

Clojure Weekly, April 23rd, 2014

Welcome to another issue of Clojure Weekly, my small routine blog contribution to the Clojure sphere! These are just a few links, normally 4/5 urls, pointing at articles, documentation, screencasts, podcasts or anything else that attracts my attention. I add a small comment so you can decide if you want to look at the whole thing or not. That’s it, enjoy!

TDD is dead. Long live testing. (DHH) Not strictly Clojure related, but the discussion about TDD vs REPL driven development is still hot in Clojure-land. DHH here is publicly declaring that TDD is dead. I saw the tweets coming from the keynote at RailsConf and I’m looking forward to the Rails community’s reaction. DHH was never a fan of TDD but he was sort of doing it in the past, without realising exactly what was wrong with that approach. This blog post explains why he now thinks that TDD is just like training wheels (oh, I can’t wait for Uncle Bob’s write-up!). What’s my take on this? I still start test first, I can’t resist, but I’m much less strict than in the past. I abandon it as soon as third party services are invoked, or when a piece of infrastructure needs a lot of stubbing (a REST endpoint, for example); at that point I use the REPL to play with the app. Mind that I’m also using the REPL throughout development, to create the implementation that makes my tests green. I’m still happy with that: it helps me think and focus, exactly like this weekly writing.

keyword parameters? - Google Groups An old but interesting discussion about named parameters for functions. I already talked about defnk in the past; it was abandoned after the migration out of clojure contrib. Still, it is an interesting little exercise to see how simply the feature can be added to the language (something that only happened properly for Ruby in 2.0). Rich shows the code in the last email. With this macro you can define functions that declare parameters with a name and a default. If the last two parameters in the example are not given, they take their defaults; the only way to override a default is to pass the keyword followed by the new value. The only caveat, at least in this simple implementation, is that the named params need to be at the end.
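
For comparison, current Clojure covers much of this with plain destructuring of the rest arguments, rather than with the defnk macro discussed in the thread. A minimal sketch, using a hypothetical connect function: the named parameters go at the end, and :or supplies the defaults.

```clojure
(defn connect
  "host is positional; port and timeout are optional named parameters."
  [host & {:keys [port timeout]
           :or   {port 80 timeout 1000}}]
  {:host host :port port :timeout timeout})

(connect "example.org")
;; => {:host "example.org", :port 80, :timeout 1000}

(connect "example.org" :timeout 50)
;; => {:host "example.org", :port 80, :timeout 50}
```

As with the macro, the same caveat applies: the named parameters have to follow all the positional ones.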

Decompiling Clojure I - Interrupted This is the start of a very interesting and well written series of articles about Clojure internals, the relationship with Java and the produced byte code. It doesn’t assume previous knowledge of any of the above, which is very good if you have never decompiled a class file. The last part explains the additional information that the compiler can add to the byte code to help the JVM during debugging (for example). The next instalment will be about the compiler and reader and I’m looking forward to it.

ClojureDocs - clojure.core/or “or” can be used the standard way, as in many other languages. But since nil is treated as false in conditionals, “or” can also be used to handle those frequent cases where some parameter can be nil and a default can be provided. (or mime "application/html"), for example, could be used to accept a given mime variable or fall back to an application/html default when mime is nil. That is equivalent to the more verbose (if mime mime "application/html"), which is not as nice to read.
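
A quick sketch of the idiom, using a hypothetical content-type function. Note that or falls through on false as well as on nil, so it is not a good fit when false is a meaningful value.

```clojure
(defn content-type
  "Return mime, or the default when mime is nil (or false)."
  [mime]
  (or mime "application/html"))

(content-type "text/plain") ;=> "text/plain"
(content-type nil)          ;=> "application/html"
(content-type false)        ;=> "application/html"
```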

The rise of functional programming in Banks | Oxford Knight Blog An interesting non-technical point of view from a recruitment firm that is investing a lot in the field of functional programming. The post states that the rise of FP, especially in banks since 2006, can be related to the fact that it models concurrency better, it models math better, and that skills learned on the JVM (of course Scala and Clojure only) can be easily sold to other OO-based businesses. The last part reports back on teams adopting the main five FP languages: Scala, Clojure, Haskell, OCaml and F#. No mention of Erlang (always talking about banks, hedge funds and investment banks in general).

Permalink

Copyright © 2009, Planet Clojure. No rights reserved.
Planet Clojure is maintained by Baishamapayan Ghose.
Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
Theme by Brajeshwar.