Personal Data Preservation, Inspired by Ancient Writing – Will Byrd

Yes, folks! The inimitable Will Byrd of miniKanren fame is going to be speaking at Clojure SYNC.

Will Byrd presented a talk on his continuing efforts to preserve his work for at least 5,000 years. What are the challenges? Why would you want to do that?

This talk falls squarely in the Context theme I talk about here.

Video

Please forgive the poor audio quality. We had technical trouble with the audio recording and had to use the backup audio.

 

Transcript

Eric:  [00:00] Our next speaker probably knows more Egyptian hieroglyphics than he does Clojure.

William Byrd:  [00:10] This is true.

[00:15] [laughter]

Eric:  [00:15] This is part of the context track. There are three tracks in this conference. Tracks, they are themes. One is context.

[00:29] The other one is code, so all about coding, programming. The other is, I call it career, [inaudible] . It’s business. The first talk was in the business theme.

[00:42] [pause]

Eric:  [00:42] The context one, I asked Will if he could give a talk because software is changing the world. It’s the new medium that hasn’t been around that long. We don’t know exactly how it’s going to change our lives.

[00:59] What we can do is look to other media and how it changed the world and history. I might go back all the way to the beginning of history, to the invention of writing. That is why I asked Will to speak.

[01:16] Will Byrd is a friend of the coding community. He created miniKanren, which is what core.logic is based on. He’s also working on Barliman, which will replace us all in the future, a programmer that’s better than us.

[01:33] Will Byrd, thank you.

[01:35] [applause]

Will:  [01:36] I should point out that all the work I’ve done is with other people, except for what I’m going to show you today. It’s all me.

[01:50] [laughter]

Will:  [01:50] The miniKanren’s for you, and the other people [inaudible] . Now for something different. Eric has known for a while that I’m interested in the history of writing.

[02:06] My language skills are pretty weak. Some of my friends make fun of me. They say I’m a programming language researcher, but I only know one programming language [inaudible] . I know other languages, but I refuse to program [inaudible].

[02:21] [laughter]

Will:  [02:21] The only natural language I know is English. I’m very interested, not so much in the language, but in writing. Alan [inaudible] and many other people claim that writing is the most important invention humans have ever come up with. I believe that’s probably true.

[02:44] I’m really fascinated by the history of writing and in particular the origins of writing. I also think the writing for some of these systems is extremely beautiful. I really love ancient Egyptian. I really love cuneiform, Sumerian, [inaudible] , that kind of thing.

[03:05] When I first went to school, when I went to college I was at University of Chicago and I didn’t want to take Spanish because I had done so poorly, so I decided to take an advanced graduate course in Babylonian which would have gone as well as you might expect.

[03:21] [laughter]

Will:  [03:22] The other students in the class, the grad student majoring in comparative linguistics would be like, “Oh, is that the case like in Hebrew or Aramaic,” and I’d be like, “Is that like Spanish?”

[03:36] [laughter]

Will:  [03:36] Her husband was like, “No. It’s not like Spanish.” But anyway it hasn’t stopped me from being interested in systems, especially the writing.

[03:47] I’ve taught myself a little bit of Egyptian and a little bit of Babylonian. Babylonian as a language isn’t that hard. The writing system is very difficult to read but the language itself is, it’s a Semitic language and it’s not that different from learning Hebrew, basically.

[04:06] So Eric asked me to talk about the history of writing and programming and what in the world do you talk about? I had no idea so I just started reading as much as I could. I used it as an excuse to geek out, and also I’ve been in England a fair amount recently so every time I go to England now I get the Goodfellow room next to the British Museum. I’ve also been to Oxford and Cambridge recently.

[04:36] I stopped by the Ashmolean Museum in Oxford a few times, so this is some cuneiform writing from the British Museum. Let’s see, I’m not sure how old this particular one is but it’s pretty old, so it’s carved in stone. By the way my slides are just photos I’ve taken and stuff like that so they’re not the most organized. I’ll just talk around it.

[05:13] Here is another piece. I’m very interested also in ceramics, so obviously if you have writing on stone that lasts a long time. If you have ceramics, you can have complicated art work that lasts a long time. That’s like 2,500 years old.

[05:32] Here you see some ceramic pieces. You can see whoever made the ceramics, or, actually I’m not sure if this is the person who made the ceramics or it’s someone afterwards carving into it, but they could write something, inscribe it in the pottery and this will last thousands of years. Here’s some cuneiform tablets. These are close to 4,000 years old. This is a Sumerian king list, so that’s almost 4,000 years old.

[06:09] Again, you just go in the Ashmolean Museum and it’s sitting there on display. The Sumerians and Babylonians didn’t just write on the flat tablets, they also had these prism structures and some [inaudible] looking type things.

[06:27] These are some inscriptions talking about mathematics. They have inscriptions talking about astronomy. These were preserved for a very long period of time.

[06:40] Then this is very, very old writing from the Sumerians. This is sort of accounting information. This is about 5,200 years old, I think, so 5,200 years old. I was thinking a lot about these ancient writings and also another fun thing…Oh yeah, this is really, really old writing. This is apparently proto‑cuneiform I think for beer.

[07:09] [laughter]

Will:  [07:11] Yeah, once again, this is like 5,000 years old. You can also see these sorts of master works. This is a Greek vase. It looks like it was made yesterday, but this is thousands of years old. Now we come to the modern world.

[07:31] [laughter]

Will:  [07:33] As I’m doing these things, as I’m reading up and learning and thinking, and visiting these museums, I also do things like get emails from Google saying that they’re going to shut down my account since we don’t have data, “Don’t want your account or your data? If you don’t want this Google Apps account and don’t want to save any of your data, you don’t need to do anything.” It’ll take care of itself.

[08:10] They give their justification for doing it, but what happens to your account? Basically all your information is gone forever. It’s gone forever. You can find some interesting websites. This is from “Le Monde,” but this is a Google memorial to all of the services that Google has shut down.

[08:29] [laughter]

Will:  [08:30] They couldn’t all fit on the screen easily. I don’t particularly mean to pick on Google. This is not just a Google phenomenon. The Internet Archive Team run by Jason Scott keeps a deathwatch of all the services and websites that are likely to go down.

[08:54] Here’s their “Likely to Die” list. They’re keeping an eye on these websites and they’re archiving them. This is a partial list of all the sites and services that shut down in 2014. I couldn’t fit it on one page, of course.

[09:11] This is the thing that got Jason Scott really involved in this, is that GeoCities shut down. If you remember GeoCities, that was a big chunk of the Web, the early Web, that people had in their GeoCities pages, and Yahoo! just shut it down.

[09:29] Apparently they gave, their notice was like a little aside in an FAQ saying, “This service is going to be shut down in two months,” a few months, or whatever. Jason Scott and this Archive Team managed to pull down much of GeoCities and several other teams also tried archiving it.

[09:55] If you go to Internet Archive you can find their “Special Collection for 2009” for GeoCities. Actually if you go back and look at “Sunset Happy,” yeah, “Archive Team officially proclaims Yahoo! the least trustable host and its arch enemy.

[10:16] [laughter]

Will:  [10:17] Prove us different, or not.” This is what I’m thinking about, also we’re seeing while I’m reading about these documents that are thousands of years old.

[10:34] I know a little Egyptian. Let me see if I can find that little bit. These aren’t too well organized.

[10:43] Anyway, the British Museum has writing in Egyptian where I can read king names just fine, things like that. I can still read writing that’s thousands of years old. This was actually forgotten, the language was forgotten. People didn’t know how to read Egyptian, people didn’t know how to read Sumerian, people didn’t know how to read Babylonian, so this information was lost for a long time.

[11:15] Only relatively recently was it rediscovered. We’re still trying to figure out how to read Sumerian better. There are languages like Linear B that we were able to decipher. Then there are languages like Linear A where we haven’t been able to decipher them. Why haven’t we been able to decipher Linear A, does anybody know that?

[11:40] [pause]

Audience Member:  [11:40] Sample sizes.

Will:  [11:40] Yeah, the sample size is really small. There’s not enough of it. There’s this saying that 90 percent of life is showing up. I guess that’s true of history too. 90 percent of history is showing up. If your language is forgotten, and there’s enough pieces left, enough fragments, people in the future, maybe thousands of years in the future at least have a chance to try to recover that.

[12:11] But if you don’t leave those pieces, it’s gone, there’s nothing you can do about it. I started thinking a lot about how when I was in elementary school I would write something down and my parents still have my writings from elementary school and some of my report cards from elementary school. But everything I did in high school and college is all gone because I did it on computer, I did it on a word processor.

[12:41] Some people say, “Well, now we have the Web, we have the cloud.” But I think people of the Archive Team would say that that doesn’t give us any safety guarantee. Maybe for a couple of years, but certainly not over time scales of, say, even 10 years, and definitely not over time scales of like 2,000 years.

[13:03] We can only make our data as safe as GitHub, and GitHub’s probably not going away tomorrow, but can you really say that in the next 30 years it may not be acquired by some company that buys GitHub?

Audience Member:  [13:19] Yahoo!

Will:  [13:19] Yeah, Yahoo! is going to buy GitHub.

[13:22] [laughter]

Will:  [13:23] They’ll take all that Alibaba money.

[13:31] [laughter]

Will:  [13:31] Obviously people are aware of this problem. People are aware of this problem and they’re also aware that they have to be very careful of their media, that their media can fail, hard drives fail, all these sorts of things. I think there’s an awareness of it, but I think it doesn’t go too far beyond awareness for most of us.

[13:53] We have some awareness that this could be a problem, but I don’t think we’re acting on what is a problem, or a potential problem. When I’m looking at this writing that’s 5,000 years old, I can’t help but wonder how much of the stuff we’re doing today will be around in 5,000 years, how much of it is going to be lost forever.

[14:16] That inspired me to start changing some of my new practices and thinking about my new practices. I’m going to share some of the changes I’ve made and some of the things I’m thinking about and trying to work on.

[14:35] Interestingly enough, I thought this would just be some weird rabbit hole I go down that no one cares about, but many of the programmers I talk to think this is really interesting. I said, “Why? Why do you think this is interesting?” They said, “Well, programmers get excited about keyboard switches…

[14:50] [laughter]

Will:  [14:50] or whatever.” There are certain types of input devices or certain…If your living is made by entering text, then it’s natural to care about your input devices or your screens, or whatever.

[15:10] To my surprise a number of programmers seem to be interested in this, so I am going to speak out a little bit and tell you about what I do and invite you to join me, if you are interested. This is definitely a project I’m working on actively and I will be working on for the rest of my life.

[15:36] What have I decided to do? What I’ve decided to do is to take it as a personal challenge that I want my research notes and things like that to last at least 5,000 years and hopefully longer. That’s my task for myself that everything I do that I care about I want to last at least 5,000 years, and it has the potential of lasting 5,000 years.

[16:00] I want the equivalent of clay tablets or carved in stone, or that kind of thing, so the oldest papyrus that’s still existent is about 4,600 years old. Egyptian papyrus that this is written on that we can read is about 4,600 years old. That I think is a reasonable target to shoot at.

[16:25] How do you do that? How do you make sure that your research notes last 5,000 years or whatever? It’s going to take some doing, maybe.

[16:36] The first thing I thought about was, “I could try to store everything on like GitHub. Maybe that will last 5,000 years.”

[16:45] [laughter]

Will:  [16:45] I think Jason Scott has disabused me of that notion. What do I do instead? I’ve become like the anti‑Ted Nelson. Ted Nelson worked on the Xanadu Project for a long time and he is not a big fan of paper, of emulating paper on the computer.

[17:04] I understand the reasons for this, so I’m not really anti‑Ted Nelson, Ted Nelson’s work is great. But it will look like I’m anti‑Ted Nelson, because now I’m a computer scientist and I’ve gone back to paper. For anything I care about, it has to be on paper. It can also be on a computer, in fact it will also be on a computer.

[17:24] But it also has to be on paper, so I’ve also gotten interested not just in super high‑quality paper, but I’ve gotten very interested in inks and pens and papyrus, and things like that. Here’s some papyrus. I’ve got lots of stuff for people to check out afterwards if you’re interested in this.

[17:42] This is a papyrus I bought online from Egypt I don’t think is very high quality, so I bought a kit that makes my own. But that also I think is not really sufficient, because I also want to understand how these things are made. Did you know, I think it’s at Lowe’s, you can buy papyrus plants? I’m going to start growing my own papyrus.

[18:08] [laughter]

Will:  [18:09] Then, my understanding is that the ancient Egyptian papyrus was actually very high quality. Very high quality and they basically, I don’t know if they intentionally did this, but they effectively did selective breeding of papyrus and we’ve lost that. I’m going to start breeding programs to start breeding papyrus to try to get back to something that’s decent. That’s one thing.

[18:38] But papyrus is actually annoying to write on and has some other issues. I want some writings on papyrus but I also want writings on other media. One of the things I’m using now is paper. The paper that I use is not typical paper. It was something you could buy, but you have to look out for.

[19:04] This is what one of my notebooks look like today. I sort of make my own. Basically what this is it’s a spring‑loaded thesis binder. Let me get these [inaudible] . Then I have paper which is a hundred‑percent cotton, acid free, 24‑pound weight paper made by a company called Strathmore, that makes very good paper for artists.

[19:31] Artists are people who care about things lasting a long time. If you’re getting into materials, you often find that the artists are people who care about it. The other people who care about it are conservators, and librarians, and people like that dealing with old manuscripts or old art collections, and so forth. This is the paper I use.

[19:53] What do you write on? I’m sorry, that’s what you write on, but what do you write with?

[20:02] What I’ve decided to do is, “Let’s use fountain pens.” Basically I carry a roll of fountain pens around. My go‑to fountain pen is the Pilot 823 vacuum filler. I’ve got two of these.

[20:17] They’re not cheap, but the reason I use these pens is that they’re safe to carry on an airplane. Not to use on an airplane but to carry on an airplane, because they’re resistant to atmospheric pressure changes.

[20:29] If you don’t have a pen like that then you will see that the ink can explode and go all over the place. I’ve got a couple of special pens and then I use some special ink. The ink I use is what’s called Platinum brand carbon black ink.

[20:48] This is the same sort of ink that the Egyptians used, or basically the same sort of ink that the Egyptians used for the papyrus. That black, that’s 4,600 years ago. This sort of ink should last not thousands of years but tens of thousands of years. It will last way longer than the paper will.

[21:08] The thing that I’m not sure about is that you have to suspend the carbon in a binder and I’m not sure what’s in the binder and most of these ink manufacturers will not tell you. I’ve got some friends who run mass spec experiments, so I’ve thought about asking them…

[21:26] [laughter]

Will:  [21:27] if they could run a mass spec on the ink to try to…Basically my theory is paranoia.

[21:33] [laughter]

Will:  [21:34] This is like security. If you want to play this game, you have the ultimate enemy which is time. Time will find some way to destroy these things. It could be, if you look at the history of ink during the Middle Ages there was a type of ink called iron gall ink that was used in the Western world.

[21:58] There are many formulas for iron gall ink, but many of those formulas turn out to be quite acidic and iron gall ink changes with exposure to the atmosphere. There are manuscripts where the paper is in perfect condition, but the area where the writing is gone, or someone drew a rectangle and now there’s a hole in the paper, and that kind of thing. I’ve become extremely conservative.

[22:24] You can get archival pens with archival ink, but they are only certified to a hundred years. You can get Sakura art pens, for example. They age test them, accelerated age test them to a hundred years, but that’s child’s play.

[22:41] [laughter]

Will:  [22:41] I don’t trust it. The reason I don’t trust it is because they don’t tell you what the formula is. This is just like carbon, tiny particles of carbon and I think we understand, to some extent how that behaves.

[22:53] If you ever read Neal Stephenson’s novel, “Zodiac,” the main character will not do any drug where the chemical formula is too complicated, if he feels like he can’t understand it. That’s how I feel about these things.

[23:07] It’s like a hundred‑percent cotton paper, all right, I can understand that. A hundred‑percent carbon ink with a little bit of binder, OK, we can look at the binder and try to figure that out. Now it comes down to things like storage. How do you store that paper?

[23:23] I have an archival storage facility which used to be my bedroom closet. But I became a little worried about air circulation, so now I’ve moved my setup outside of the closet, but I have a monitoring system. I monitor temperature and humidity.

[23:44] I have actually a wireless network in my apartment where I have my paper storage facility, my papyrus storage facility and the notes I’ve written on. For my phone I’ve got an app where I can check at any time and get graphs of the temperature and humidity for my paper [inaudible] .
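As an illustration only (not something shown in the talk), the kind of check a monitoring setup like this might run over its logged readings can be sketched in a few lines of Python. The `Reading` type, the `out_of_range` function, and the default bounds are all hypothetical placeholders; the bounds in particular are not conservation advice and would need to come from a conservator or the paper manufacturer.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One logged sample from a (hypothetical) storage-area sensor."""
    timestamp: str        # e.g. an ISO 8601 string
    temp_c: float         # temperature in degrees Celsius
    humidity_pct: float   # relative humidity in percent


def out_of_range(readings, temp_range=(15.0, 25.0), humidity_range=(30.0, 50.0)):
    """Return the readings whose temperature or humidity is outside the bounds.

    The default bounds are placeholder assumptions for this sketch.
    """
    lo_t, hi_t = temp_range
    lo_h, hi_h = humidity_range
    return [
        r for r in readings
        if not (lo_t <= r.temp_c <= hi_t)
        or not (lo_h <= r.humidity_pct <= hi_h)
    ]
```

Passing in a day of samples would return just the ones worth a closer look, which is roughly what glancing at the graphs on a phone accomplishes by eye.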

[24:08] I live in Alabama now and I’m very worried about the humidity. I’m sure that Alabama has a lot of humidity, but New Orleans would also be a problem. You can also do things like send the paper to a salt mine. It’s how they store film, film stock, that kind of thing.

[24:25] You can do salt mines. The places I’ve looked at it’s like, “Call us for an estimate.”

[24:30] [laughter]

Will:  [24:31] It’s never a good sign. Another thing is I want my paper to be here and the other thing is I want to try to figure out how to do this on a budget. Right now I haven’t been really too worried about the budget. I was like, “All right, I’ll spend a little money, it’s a lot cheaper than…Playing around with fountain pens is a lot cheaper than buying a boat or something.”

[24:53] As I’m trying to figure it out, I’m willing to spend a little of my money, but I want to try and figure out, “What’s a low‑cost way of doing this? How can we do it for $50, or something like that?” These are the sorts of things I’m thinking about.

[25:13] Then there’s another wrinkle which is, “Well, this is fine for my research notes, but what about for the code I write?” What do you do with the code that you write?

[25:28] Fortunately, or unfortunately, depending on who you talk to, I do all my work in Scheme or Racket and in those languages you tend to write domain-specific languages, using macros, similar to what we did in Clojure, and that means that I can often get very good compression ratios for my code.

[25:49] I was thinking about it. All the projects I’ve worked on for the last 13 years as a researcher, you could probably take all of that code and squeeze it into 5,000 lines [inaudible] . At that point you can actually print it out.

[26:04] This, for example, is a printout of my paper of micro [inaudible] . It fits on these two pages. Now, what sort of ink do you use?

[26:19] You could try to print it out with a laser printer. I have a laser‑jet printer and that actually uses carbon toner, so you’d think it’s very stable. However, there seem to be problems with aging as the humidity changes for the paper.

[26:39] Basically the paper’s being stressed when it becomes more and less humid, hotter and then cooler, the paper fibers are expanding and contracting and the carbon that lays on top, the toner that lays on top of the paper can start flaking off, so that’s a possible failure.

[26:58] The people who really care about longevity with printing, are people who do things like fine art prints and they want them to be archival, so those people are pretty serious about it. What they do is they use inkjet printers.

[27:13] This is printed with a relatively inexpensive Epson printer and the reason I got that particular printer is not because I care about inkjet printers or Epson printers, but because you can get special ink.

[27:27] There is one person in the world who makes this ink which is the carbon ink that I have basically for my fountain pens. It’s basically a similar formula, although once again, I’m paranoid, because they changed the formula to version 1.1 because a supplier couldn’t handle something.

[27:43] That makes me very nervous. That sounds like something that should be “mass spec‑ed.”

[27:47] [laughter]

Will:  [27:49] This ink should be very similar in composition to the ink from my fountain pens, this should last tens of thousands of years unless they’re doing something funny or something weird about the binder.

[28:01] This is part of the idea that I can go from digital to analog and preserve at least the things I care about most. It can go the other way. I’ve also got a scanner, a flatbed scanner, so now I scan all my notebooks and I’ll have digital copies of those. I want to do bi‑directional everything. I want digital copies of everything, I want analog copies of everything on very high‑quality artifacts.

[28:29] Another thing I’ve started getting into is making my own paper, so with my parents I started making my own 100‑percent cotton paper and I can talk to you about how you do that. It’s actually not very difficult. This is 100‑percent cotton paper and you can make your own.

[28:49] Once again, this is a way to both learn about these technologies that we often take for granted, but also to try to control the ingredients. It’s just interesting to learn about paper and the history of paper and so forth, and be able to try to control the medium that you’re using.

[29:09] I’ve also started doing some pottery. My mom does pottery, so I started doing some pottery. I’ve seriously considered trying to make my own clay tablets and things like that. I’ve also looked into, I’ve used quartz glass and that kind of thing.

[29:30] [laughter]

Will:  [29:30] I’m interested in nanotechnology, so I’m building a scanning atomic microscope and there’s a…You can run that in atomic force mode so you can move atoms around, so maybe at some point you can do a nanotech version of things. This is something I want to work on for fun, but also it’s extremely interesting, because you can start getting into failure modes of medieval manuscripts for example.

[29:55] This is a book, “Introduction to Manuscript Studies,” which I highly recommend. If you want to check it out, I can show you. It’s full of all of the ways that manuscripts go bad and how you fix them, how you recognize them, how you store them.

[30:10] There’s this whole area of knowledge that humans have accumulated having to do with the preservation of analog artifacts that we know can last for a very, very long period of time, but I don’t think we know how to do that for digital artifacts, not very well. I’m still trying to figure that out and we’re in this danger period until we do figure it out of losing a lot of our history.

[30:43] The other part of this is, let’s say that we come up with some relatively inexpensive ways to preserve things like research notes and code or things that we care about that we want to preserve for posterity, particularly as a programming language designer. As someone who designs programming systems, I want to record that information for myself in the future and for other people, to try to give people some context. What were we trying to do? Why did we make those decisions?

[31:20] That’s great, however there’s another aspect of it, which is, how do you organize this knowledge? I’ve been spending my nights scanning old notebooks. I have probably hundreds of notebooks, research notebooks going back for quite a long time all written on paper of dubious quality with ink of dubious quality. I’m trying to scan all of those things.

[31:46] Now I have all of these JPEG images. 100 dpi JPEG images. What do we do with them? More generally, if you think about how our knowledge is spread out, where do I have knowledge captured?

[31:59] I’ve got notes on my phone. I’ve got notes on my computer and I’ve got random Emacs files. I tried messing around with Org mode.

[32:07] I could never figure out how that’s supposed to work. I’ve got scripts of documents written with [inaudible] .

[32:13] I’ve got drafts of books. I’ve got stuff in GitHub, I’ve got stuff in Bitbucket. I have bookmarks in my browsers, in different browsers. I have YouTube playlists. If you think about that, that’s a way to capture knowledge.

[32:27] All the YouTube videos that currently exist, here are the videos that I’m most interested in. Here I’m going to organize them.

[32:34] Here’s a fun thing about the YouTube playlist. I’ve got some YouTube playlists, for example, music I like to listen to while I’m programming. Great.

[32:44] Sometimes an account goes away or a video gets taken down and then the YouTube playlist just has a, “Video removed.” It doesn’t have the title of the video. It’s gone.

[32:57] This is actually a hole in the knowledge. I’ve got all of these bits of information spread out all over the place and I’ve no way to search over those really, organize them, [inaudible] them or whatever.
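As an illustration only (not something from the talk), one way to avoid the hole a removed video leaves is to keep a local record of playlist titles and fold each new snapshot into it. The Python sketch below works on plain dictionaries; `merge_snapshot` and its data layout are hypothetical, and how the snapshot is obtained from any particular service is deliberately left out.

```python
def merge_snapshot(archive, snapshot):
    """Fold a fresh playlist snapshot into a local archive.

    archive:  dict mapping video id -> {"title": str or None, "available": bool}
    snapshot: dict mapping video id -> title, or None when the entry now
              shows only a removal placeholder
    Returns a new dict; a title already on record is never replaced by None.
    """
    # Copy the existing entries so the caller's archive is left untouched.
    merged = {vid: dict(entry) for vid, entry in archive.items()}
    for vid, title in snapshot.items():
        known = merged.get(vid)
        if title is not None:
            # The video is still up: record (or refresh) its title.
            merged[vid] = {"title": title, "available": True}
        elif known is not None:
            # The video was removed: keep the title we saved earlier.
            merged[vid] = {"title": known["title"], "available": False}
        else:
            # Removed before we ever saw it: nothing to recover.
            merged[vid] = {"title": None, "available": False}
    return merged
```

Run periodically, this keeps the title of every video that was ever seen while it was still available, so a later removal costs the video itself but not the record of what it was.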

[33:11] The next stage I want to do, I’m trying to, in addition to develop better technologies and techniques or refine the practices that I’ve come up with for my own personal analog preservation, I’m also trying to figure out, how do we organize this knowledge? How do I want to organize my knowledge personally?

[33:36] One of the things I currently use is a program called TiddlyWiki, which is actually a decent program. I have a whole bunch of notes in TiddlyWiki as well. It has some linking and tabs and things like that but I also find it just doesn’t meet my needs.

[34:01] After talking a lot to my friends, many of my friends pointed out that actually there’s a lot of interface work that went into something like TiddlyWiki and this is absolutely true. Something like TiddlyWiki is both great and I shouldn’t underestimate the amount of time that it took to develop a nice interface.

[34:19] At the same time, as someone who’s a programmer and someone who has particular needs, it’s the same reason I use Lisp. The reason I use Lisp is that Lisp is a recognition that whoever designed the programming language doesn’t know the problem as well as you do, therefore you may have to do things like change the language or create a new language to solve your particular problem.

[34:44] That’s a very powerful [inaudible] . TiddlyWiki is great however TiddlyWiki is not designed to solve my particular problems. I can mess around with it in JavaScript and try to make it work the way I want but instead, I’ve decided I’m going to build something from the ground up.

[35:02] In fact, I’m probably going to build many of these things to try to [inaudible] but I’ve decided if I actually want to be serious about this, I’m going to have to take ownership of this problem and try to be better than just having all of my knowledge spread out on various devices, various computers, bookmarks, places where I don’t even necessarily think of knowledge organization like YouTube playlists.

[35:25] That represents my organization of my knowledge. That’s a project that I’m working on seriously. I guess in that part of it, I guess what I’d say is, if this interests you, I’ll very much happily share everything I know. In fact, I’m going to create a [inaudible] thing, right? I guess I’ll create a GitHub page to share my practices.

[36:00] [laughter]

Will:  [36:00] The practices are ephemeral. The technology we use is ephemeral but I want the data to last a long time. That’s the idea. Anything we do is a snapshot and we’re going to have to keep working on it and improving it but if you’re interested in this, I’m happy to share anything I know. I’ve got samples of all sorts of stuff you can play with.

[36:21] There’s actually a really nice fountain pen store around the corner, which I actually ran into yesterday. Turned out to be an expensive mistake.

[36:27] [laughter]

Will:  [36:27] If you’re interested in this, we can even take a field trip there, show you the stuff but the other part is digital organization. How do we design that? I’ve read a whole lot about things like the [inaudible] and Engelbart’s work and Ted Nelson’s work.

[36:49] All these people who were interested in trying to organize knowledge in sophisticated ways.

[36:54] There’s been a lot more recent work also but now I’m at the point where it’s like, I’m just going to have to start building things. What I build is probably going to look weird because it’s going to be for me but maybe over time, I can figure out ways to develop things that are useful more generally or at least find ways of building specific tools that are useful to people.

[37:16] I think we need to do that. If you are not somehow recording the things that you’re doing, the design decisions you’re making, whatever you’re thinking about, I would encourage you to do that and to try to save that. It doesn’t have to be expensive. It doesn’t have to be super fancy.

[37:36] It would be simple but even if you’re doing it on sucky paper with sucky ink, at least there’s a chance that at some point you can scan it. Even sucky paper today tends to last quite a while.

[37:49] It’s much better to record it than not and if you don’t do that, then the future is going to be, I think, horrible. There’s this very interesting book called “Playing at the World,” by Jon Peterson. He’s a researcher. He was interested in the history of Dungeons and Dragons.

[38:06] He’s written this 722‑page book on the history of Dungeons and Dragons and he has a blog. He finds all of these documents. He’s a collector. He’s found all of these original design documents with the original campaigns for Dungeons and Dragons and the original character sheets and things like that.

[38:24] You can track over time how the game changed and how different people had different ideas about how the game should work. In fact, I didn’t really understand Advanced Dungeons and Dragons rules from 1977 to ’80 or whatever that I grew up with until I saw those sheets. It was like, “Oh, that’s the sort of campaign that we run.”

[38:43] This is an example with Dungeons and Dragons but you can also see this with programming language design. There’s this series of conferences, History of Programming Languages, HOPL. Three HOPLs. HOPL four is coming up.

[38:54] There’s hopefully going to be many more HOPLs. If you haven’t read HOPL, any of the proceedings, I recommend HOPL one and two. They’re just completely fascinating to me. I love them. As a programming language designer, I want to start capturing the intent and the ideas and design decisions and context.

[39:15] For [inaudible] for example, we didn’t do that. Now it’s like, how did we come up with that? I don’t really remember, so we kind of have to make up stories.

[39:24] If you’re designing things, I’m sure you’re making some decisions, something you’re building, I would encourage you to write that down and save it. Don’t fill up notebooks and throw them away. Save it. Maybe scan it.

[39:38] Then there’s a whole set of other practices, which are ‑‑ what do you do with this information? What do you do if you get hit by a bus? What do you make public?

[39:49] There are things in my notebooks where it’s like, maybe I’m talking about the ideas with someone else and I don’t want to scoop them unintentionally by putting it online.

[39:59] Maybe I’ve written something about a paper that sounds like it’s a nasty tone, just to myself. There’s also a whole set of practices of, what do you make available? When do you make it available? There’s a long tradition of this in the humanities and libraries. That kind of thing.

[40:20] I think it’s important as people who design things that we think very hard about what’s going to be the future of the decisions we make. Will people be able to recover that information? What can we do now to try to help? Also, there are all these giants in the field of organizing knowledge.

[40:42] There are people like Engelbart and [inaudible] and Ted Nelson. All these folks that did fantastic work but I think there’s too much of a thing where I’m like, now you go read a book about this, the good old days or something like that instead of, we need to learn the history, but we also need to be building modern versions of these things for ourselves.

[41:06] I do feel like, for the programming languages I enjoy, those were all languages designed by the designers for themselves, for their own purposes.

[41:18] I would like to see more systems being designed for the users themselves. Whatever system I design, it’s not going to be like TiddlyWiki. It’s going to be for my own needs but I think there’s one way to try to explore the space much more and try to come up with new approaches. I think we desperately need it.

[41:38] Anyway, I guess my basic message is because we are living in this hybrid analog/digital world, we’re in the worst of both worlds. We’re not paying serious attention to the analog things we make or record and we haven’t figured out how to do digital preservation, just putting it on the web or whatever, put it in the cloud, is just totally insufficient.

[42:08] Learn about the Internet Archive. Learn about archival [inaudible] . Look at libraries, look at these sorts of things and think deliberately about, how long do you think the decisions you’re making and the artifacts you’re making, how long will they last?

[42:24] What can we do to try to make sure that we can capture history so that 5,000 years from now, when people have forgotten English but they put the parts together, they could actually recover something about what it’s like to be here in 2018. That’s it.

[42:46] [applause]

Eric:  [42:49] I’d like to ask for the questions to come up here. Will, if you have some time…

[43:02] [crosstalk]

Will:  [43:03] Yeah, sure. Also, I’ve got a couple more pictures. I made that.

[43:07] [applause]

Eric:  [43:08] Totally last 5,000 years.

Will:  [43:14] Yeah. Here’s something else I made, by the way. This is a turtle I made in eighth grade. That turtle will last longer than English perhaps, right?

[43:28] That’s the thing we have to keep in mind, that if you want to be serious about these sorts of time scales, you have to think about the very serious possibility that people will forget English and English will have to be rediscovered, or the fact that this little clay thing, which I put my name on the bottom, that inscription of my name will maybe last longer than civilization.

[43:55] We’ll have some sort of terrible scenario.

Eric:  [43:59] Cool. A lot of great questions here. I’m going to start with one of mine. We’re producing a lot of data now. Basically, the more we produce, the more time it’s going to take to read the data. If we all start producing all this data, when are we ever going to have time to read it?

[44:28] Why are we going to want it in 5,000 years? It’s just going to be exponentially growing. Why would we want this data?

Will:  [44:36] Why do we want it now?

Eric:  [44:39] For short term, we want to have a conversation with someone, like, what was I thinking 2 years ago, 10 years ago? But in 5,000 years with millions and billions of people, are they going to worry about what you were thinking 5,000 years ago?

Will:  [45:01] It’s an obvious problem, right? The amount of data we’re generating is huge. Obviously, we can’t just do everything on clay tablets right now. Big tablets. [inaudible] .

[45:17] [laughter]

Will:  [45:17] That’s a problem. I think one thing is we can be somewhat selective about the things that we consider extremely high value. Like I said, I think you could boil down the last 15 years of my work or 15 years of my works to like 5,000 lines of code, or maybe 10,000 lines of code.

[45:36] There’s certain core ideas that I’m willing to hand curate. These ideas that I think are particularly important. Things like conversations between people, I don’t actually think that information…

[45:45] [crosstalk]

Will:  [45:45] that well. I can only send a finite number of emails in a day. People used to write huge numbers of letters and stuff like that. I think at the individual level, me communicating something, sure I’m on IM, whatever but that’s actually pretty small and if you look at the heroic efforts that people have done to try to uncover biographical information.

[46:09] Read Robert Caro’s magnificent book “The Power Broker,” about Robert Moses in New York City and look at the amount of effort he did to try to uncover what was going on at that time. It’s true that most of the data we’re collecting, people aren’t going to be [inaudible] .

[46:27] It’s also true that we’re going to have to figure out, how do we store these things at all? Fortunately, this space, this capacity’s still increasing but we probably at some point have to be a little judicious but at the same time, I feel like I personally owe it to people in the future not to make that decision for them.

[46:49] It’s like, “Oh, sorry, I [inaudible] these things for you. Sorry, 2018 was the year of, you don’t get to learn anything about what it’s like.”

Audience Member:  [47:01] Zero.

Will:  [47:02] 02018, there you go. That’s right. 00 2018.

[47:04] [laughter]

Audience Member:  [47:05] So you talk about boiling down your academic output. Do you think there’s a conflict between the way that you do scientific discourse where you have to get it past peer review? You have to explain all this stuff but then in the end, there’s only 3,000 lines that would need to be preserved. Is there some conflict in it?

Will:  [47:33] Yeah, I think there’s some. Of course, I’m being a little facetious when I said that everything’s just 5,000 lines of code. Also, we have papers and have written books and things like that but on the other hand…

[47:47] If you really want to know about the work that we’ve been doing, what you really need to do is look at all the rejected papers that haven’t appeared anywhere. They’re on my laptop ‑‑ rejected, rejected, rejected, rejected ‑‑ and see how we change the idea, see how we try to improve them and maybe the papers got accepted not because the ideas are good.

[48:08] Maybe it’s just because we present it in a different way. In a way people can more easily understand it or seems sexy or whatever. Actually, what I’m interested in also is recording all of the stuff that never saw the light of day because I got rejected or, for example, the book “The Reasoned Schemer.”

[48:28] The first edition of that book, working with Dan Friedman and Oleg Kiselyov, Dan’s motto is, “If you’re not sure how to write a chapter, if there’s two ways to write the chapter, you write it both ways and then you throw away at least one of the two, maybe both.”

[48:44] That book had 10 chapters in it. We had at least 10 chapters that we threw away. They were finished chapters but they were never shown the light of day, or never seen the light of day.

[48:56] I think that’s also part of it. You’re trying to collect information so that people have more context, so they can see like, D&D, what were the alternatives? What the rules people turned on and rejected?

[49:07] I want to very seriously think about how to capture that and then the other part of it is, whenever you’re trying to do any sort of curation, there’s also the thing of, I want to make myself look good. I want to not show all the scummy things. I want to show the great stuff that makes me look brilliant.

[49:28] I’m also thinking, how can I capture a bigger, more accurate piece where it’s sort of like, here’s all the stuff ‑‑ at some point in the future ‑‑ here’s all the stuff, go through it, come up with your own conclusions. That’s [inaudible] .

Eric:  [49:44] This is an interesting question. This is from Dr. Sussman. It’s more of a comment but I’ll make it into a question. One of the things that is, I think, really interesting about Egypt as a culture is they seem to be very interested in preservation. Pyramids, you just make something that’ll never be destroyed.

[50:13] You have mummification to try to make the bodies last forever and if they didn’t care so much…I guess papyrus was pretty good but we don’t have so much paper. It doesn’t last. What about new technologies for encoding stuff in genome. This is from Dr. Sussman.

[50:37] Making something that will make its own copies, reproduce and be around hopefully a little longer.

Will:  [50:47] A couple thoughts there. One is, yes, Egypt was interested in preserving things but we should also think about the Sumerians and the Babylonians…

Eric:  [50:54] We still have Egyptian DNA, right? In the mummies. We could clone an Egyptian king.

Will:  [51:03] Your words, not mine.

[51:05] [laughter]

Eric:  [51:07] They made it, is what I’m saying.

Will:  [51:12] The Egyptians were interested in preservation. The Babylonians and Sumerians were also extremely interested in preservation and actually the first archivists and librarians were from Mesopotamia, as far as I can tell.

[51:26] Akkadian has two dialects, Babylonian and Assyrian. The scribes in Akkadian were writing in a system that used both Akkadian words and Sumerian words.

[51:45] Sumerian’s a totally different language than Akkadian. They had to study Sumerian, so you had these people creating dictionaries of Sumerian and things like that thousands of years ago. They were interested in trying to capture linguistic knowledge. Here’s how Sumerian works, right?

[52:04] They had libraries and archives and things like that many thousands of years ago. This isn’t just a recent thing.

[52:12] As far as new technologies with things like DNA, there’s actually a big project Microsoft has sponsored where you have a whole bunch of DNA that they’re sequencing to encode data. The idea is the DNA, if it’s been dried and kept in a nice environment, is actually extremely stable. It’ll last a very long time.

[52:34] You can fit a whole lot of it into a small area and then you can do sequencing, for example, to read that information in a very different fashion. You can do error‑correcting codes and things like that because there’s some sort of chemical reaction with the [inaudible] or something.

[52:52] Microsoft has been very serious about this. I think they’ve [inaudible] $50 million [inaudible] Bill Gates trying to clone himself or something like that.

[53:04] [laughter]

Will:  [53:04] This was like a long‑term data storage thing. Then there’s the other idea of like trying to encode the information in a living creature’s genome, right? [inaudible] try to pass [inaudible] time.

[53:15] One of the things I find interesting is there’s this idea of how do you tell people, you know, when we were building the [inaudible] repository for highly radioactive waste. How do you tell people in the future, after English isn’t spoken anymore not to go near this area. How do you tell people 10,000 years from now, “Don’t go into this area”? A whole bunch of architects came up with interesting ideas.

[53:40] One idea was to genetically engineer cats so cats would change color when near radioactive waste. Then you’d have this rural folklore about if the cat changes color don’t go there.

[53:52] [laughter]

Will:  [53:54] That’s one way maybe to encode knowledge. Some of you, these are interesting possibilities.

[54:04] Even if the media is going to last thousands of years, you still are going to have the issue of, what do you record, how do you find it, how do you organize that knowledge, privacy issues, when you do release it? How do you make it easier for people in the future to tell what may be of interest to them?

[54:29] Alan Kay, and in particular his students, have this paper, “The Cuneiform Tablets of 2015,” where they tell more of these data loss stories.

[54:40] They propose a way by using virtual machines and various storage media so that people 5,000 years from now could bootstrap [inaudible] or something like that. Go through all the processes. If you give me enough information, here’s a virtual machine. Once you have that you can run the software on it. Hopefully your computers are faster than today.

[55:04] They [inaudible] this story about this. In the great story of this whole [inaudible] paper which is a very interesting read is the story about the “Domesday Book.” You know about the Domesday Book in England? After the Battle of Hastings, the French conquered England, created this book of all the holdings in England.

[55:23] This book can still be read today. This is from the 1080s or something like that, the book was created.

[55:31] The BBC decided apparently in the late ’80s, they were going to create a modern Domesday Book using [inaudible] technology. The BBC micro with a special light optical disk thing.

[55:45] They solicited all sorts of entries from people around England to represent the state of England. What’s it like to be an English person in the 1980s? Of course, within a few years no one could read this disk. The machine broke.

[56:00] There was a big preservation effort. An academic team got together to try to read this information. They were going to put it on the Web. Then they started running out of funding. The project leader apparently died. As of now, the website’s down.

[56:16] That was their example of we can read these clay tablets, we can read the Domesday Book, but we can’t read the archival digital project that BBC launched just a few years ago.

[56:30] [crosstalk]

Eric:  [56:30] You mentioned moving out of your closet into an office.

Will:  [56:35] Yeah, [inaudible] .

[56:35] [laughter]

Eric:  [56:36] Do you have backups? I mean, it’s being one fire will…

[56:44] [crosstalk]

Will:  [56:44] A fire alarm went off in my new apartment. I ran outside, I grabbed my laptop bag, I forgot my archival documents. I go back in. Is this really a fire or what?

[56:59] [laughter]

Will:  [57:00] That’s what got me to get serious. I better scan this stuff. I am going to do all this work and this fire starts or water damage. My last apartment, the water heater, like the bottom fell out and flooded my entire apartment and then it got mold [inaudible] .

[57:16] That’s the closet we’re behind. I moved it out of the closet because it’s better airflow. You can see the WiFi station, you can see one of the temperature/humidity sensors, it’s wireless.

[57:28] You can see one of the regular temperature/humidity sensors. There is my archival box. You can see that.

[57:36] That’s one of the temperature sensors and humidity sensors. That’s 48 percent relative humidity, 71 degrees, which is too hot.

[57:44] Then inside the box, I’ve got special paper. This is my storage area for the reams of paper in the future. I got a temperature/humidity sensor on that.

[57:59] A lot of this is actually trying to figure out practices. I can tell you about the materials, but the practices, how do you order this paper? It turns out all the paper that is 100‑percent cotton isn’t the same quality. I like the Strathmore paper, but maybe they will stop making it someday. That’s one problem.

[58:16] Another problem is if you want to get this shipped, first of all the paper is not cheap. If you get it shipped, guess what, it’s usually packed…Amazon or paper mill or wherever you order it, all the places I’ve seen they just put it in a cardboard box.

[58:32] What ends up happening is the cardboard box, the corners get smashed in during shipping. That stack of paper on top, those are all pieces of paper that are basically unusable because they’ve been folded so much in shipping. It’s stuff like that.

[58:45] What I did is I had the bright idea in that I ordered three reams. The ream in the middle would be usable. That’s not the best long‑term option.

[58:55] A lot of it is figuring out stupid stuff like that. How do I get paper where the edges aren’t all crushed in already? Maybe I figure out how to un‑crush it or something.

[59:07] I watched an awesome YouTube video last night. Seven ways to hide a lavaliere microphone for people doing film industry stuff. Seven different ways to thread a mic through a shirt to hide it.

[59:22] This is an example of you can have the lavaliere microphone. You can have tape, but there’s a whole set of practices that people build up over time.

[59:31] It’s the same thing here. Most of my effort is trying to figure out what are the set of practices that are useful. How do I refill my pens in the best way? How do I make sure my paper doesn’t [inaudible] ?

[59:45] I had a household emergency where it rained non‑stop in Alabama. I left my archival notebook in my laptop bag where it was soaking wet for a week.

[59:57] By the time I pulled it out, it had this really weird smell. Uh‑oh, I better scan that and now I got it in an isolation area. I put it in my freezer. I froze it to try to…

[60:07] Anyways, there is all sorts of stuff like that I am trying to figure out.

[60:12] I’ll give you one more…

[60:13] [crosstalk]

Eric:  [60:14] Do you have backups that aren’t digital? I’m thinking of…

[60:20] [crosstalk]

Will:  [60:20] That’s why I got this printer. The reason I got the printer, here’s my fade test, my window exposure test so you can tell how the inks would do.

[60:32] Here’s another test, by the way. This is with the special Epson printer and a special ink. I printed out this page and this from my cotton pad. Then I soaked this paper in water to try to smear it to see how well. It actually stood up pretty well.

[60:49] Now that we get the paper, the printer, I actually can print that out and then I can scan it again and print it out. This is a scan of me printing out the digital document. I want to have this workflow where now that I scanned all my documents of all my notebooks in, I am going to print them all out so I’ll have another physical backup. Maybe I’ll send that to the salt mine.

Eric:  [61:12] Maybe you just have another printer in the salt mine.

[61:18] [laughter]

Will:  [61:18] You’re right. I don’t have any trust in this sort of my digital [inaudible] .

Eric:  [61:22] I’m thinking, literacy during the Dark Ages, it was scribes, just copying, and copying all the time, who preserved it.

Will:  [61:33] That’s right. If you haven’t read “A Canticle for Leibowitz,” that’s inspirational reading material. That, and [inaudible] . You’ll see them somewhere else…

Eric:  [61:44] Before we break for lunch, what are the big hopes that you have for [inaudible] preservation?

Will:  [61:50] This is me [inaudible] . I’ve got like a thousand photos in it, videos.

[61:57] What was that?

[61:59] [off‑mic comment]

Will:  [62:01] Oh, yeah, for the photos? Yeah. I can show you a picture of you that’s [inaudible] .

[62:03] Anyway, sorry, what was…?

Eric:  [62:06] You started talking about digital preservation at the end of your talk, but you said you still have to get into that. What are your hopes for that?

Will:  [62:18] For many, many, many, many years, as long as I can remember, I’ve been frustrated by trying to organize my information ‑‑ index cards, a zillion other ways, and I’ve never been very successful. I think, ultimately, I’ll never be super happy with what I come up with, but I can come up with something better than the current ways of organizing things.

[62:37] I’ve been extremely interested in new media studies, and extremely interested in digital preservation and reading about knowledge organization, and [inaudible] work, and Engelbart and all this stuff, Nelson, and all these people.

[62:53] But now I’m at the point where I’ve just decided I’m going to have to build my own system. One of the reasons I’ve gotten the confidence is that now…

[63:03] I’m working on this project for the National Institutes of Health, with Greg Rosenblatt and Matt Might and the Hugh Kaul Precision Medicine Institute at the University of Alabama, Birmingham. We’ve been building our own biomedical reasoning system, from scratch.

[63:22] We originally were just going to take some off‑the‑shelf software, cobble some stuff together, and we’re actually building it entirely from scratch. We’re using miniKanren logic programming, and things like that.

[63:32] The system is actually only a few thousand lines of code. Once again, we’ve been able to try to keep it very, very small, but try to do sophisticated things through [inaudible] language [inaudible] and stuff like that, and building interfaces for that.

[63:47] I’m very much in this mindset of both trying to build reasoning tools and knowledge organization tools, but also thinking about interface, thinking about how people use this, watching people use these tools, and coming to the conclusion that no one can build the tool I need better than I can.

[64:09] That’s why I learned how to become a part of it. That’s what I’m going to do. It may not be useful to anyone else, but it’s going to be useful to me eventually. Maybe not tomorrow, but it will be.

Eric:  [64:22] Thank you very much, Will.

Will:  [64:23] Thank you.

[64:24] [applause]

 

The post Personal Data Preservation, Inspired by Ancient Writing – Will Byrd appeared first on Clojure SYNC.

test2junit Version 1.4.0 Released

This is a slightly delayed announcement of the release of test2junit 1.4.0. test2junit lets you “Emit Clojure test output in Junit XML format and optionally automatically invoke HTML generation.”

In one of the 1.3.x versions of test2junit, I added somewhat more colorful command-line output. What I did not consider back then was that this output may “break” test2junit when its output is redirected, e.g., when running it in a continuous integration environment.

With version 1.4.0, I added an option to silence the test2junit command-line output. To silence the output, set “:test2junit-silent true” in your project.clj.
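A minimal sketch of where that setting might go, assuming a standard Leiningen project in which test2junit is loaded as a plugin and its options sit at the top level of `project.clj`. The project name and version are placeholders, and the plugin coordinate is an assumption that should be checked against the test2junit README:

```clojure
;; project.clj (hypothetical project; only the :test2junit-silent key
;; is taken from the announcement above)
(defproject my-app "0.1.0-SNAPSHOT"
  ;; Assumed plugin coordinate for the release announced here.
  :plugins [[test2junit "1.4.0"]]
  ;; Suppress test2junit's own command-line output, e.g. on a CI server
  ;; where the output is redirected.
  :test2junit-silent true)
```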

I hope this update is useful for you and fixes the issues for people for whom the 1.3.x versions caused trouble. If you have constructive feedback, please let me know.

JOB: Software Developer

Location: Anywhere within the U.S.
Target Start Date: May 15, 2018
Salary Range: $82,000 to $130,000 per year (salary offers vary by experience and location-specific cost-of-living adjustment)
Benefits:
Vision, dental, & medical insurance; 403(b) retirement savings plan; generous minimum vacation policy; parental leave; long-term disability; employee assistance program
Level: Multiple levels, mid through senior

At Democracy Works, we believe voting should fit the way we live. To that end, we build technology for both voters and election administrators that simplifies the process and ensures that no voter should ever have to miss an election.

TurboVote, our first service, helps voters register, stay registered, and cast a ballot in every election, from municipal to national. TurboVote signed up its millionth voter in 2016 by building the largest college, nonprofit, and corporate voter engagement coalition in the country, including 176 campuses, companies like Starbucks, Univision, Facebook, Google, Snapchat, and dozens more. Our other work includes the Voting Information Project, whose polling-place data received 123 million impressions in 2016, an Election Technology Cooperative to provide affordable, voter-centered technology to election administrators, and Ballot Scout, which tracks absentee ballots through the mail, providing transparency in the vote-by-mail process and making it easier to follow up when things go awry.

These products are the work of our ten-person developer team. Most of our development involves writing microservices in Clojure running in Docker containers on Kubernetes and hosted on AWS. These services communicate over RabbitMQ and store their data in Datomic. Our users primarily interact with web apps written in ClojureScript and re-frame. We also have projects that use JavaScript, Node, React, Python, and PostgreSQL. We hope you have experience with some of these technologies and are excited to get experience with the rest.

We pair program, collaborate with product managers, and make sure our efforts deliver value to voters. We rotate roles and projects on our team so that everyone gets a variety of experience and working relationships and can bring their unique strengths to as wide a swath of our work as possible.

To apply:
Send a short email with resume, addressed to Chris and Wes, at work@democracy.works with the subject line “Will code for democracy” to begin the application process. Please include how you found this job listing. We also encourage all applicants to state their preferred pronouns when applying for any job opening at Democracy Works. Qualified candidates who meet the above requirements will have the opportunity to complete an anonymized skills evaluation before we schedule an interview. Based on the application, evaluation results, interviews, and reference checks, offers will be made to one or more finalist applicants.

Applications will be accepted and interviews will be conducted on a rolling basis.

Democracy Works is committed to diversity and inclusion in everything we do and aspires to have a team which is representative of the voters we serve. When hiring, we practice proactive outreach to top talent that’s underrepresented in our sector (including Latinx, Black, AAPI, and Indigenous candidates), and we offer every candidate an anonymized skills evaluation, to reduce implicit bias and resume-dependency in our process. We're a woman- and gay-founded startup, and promote an inclusive culture that stands against racism, sexism, homophobia, and ableism (to name a few). To be explicit, we strongly encourage applicants of all races, ethnicities, political party associations, religions (or lack thereof), national origins, sexual orientations, genders, sexes, ages, abilities, and branches of military service. Feel free to contact work@democracy.works if you have any questions about our commitment to inclusion or about general hiring practices.

Dutch Clojure Days 2018 round-up

As you may recall, I had enthusiastic expectations and resolutions for Dutch Clojure Days. My first Clojure-only conference, my first proper face-to-face with the community. How could I not be excited?

On Saturday 21st at 8:30 am sharp we were at the TQ building’s reception, greeted by Carlo Sciolla. A couple of words on the venue: a simple but elegant building, close to Dam Square and right in front of a fascinating flower market. The conference happened on the fourth floor, with a balcony to enjoy the outstanding view of the city, and food and drinks for everybody. My first, huge “thank you, DCD!” goes to the vegetarian option which was palatable for a vegan, but let’s keep the cheering and the hand-clapping for the end.

Eleven speakers were waiting for us. Vijay Kiran set the stage and the playful mood of the day, soon making room for Alex Yakushev. “Embrace the JVM” was a talk to treasure. Observability, performance profiling, memory inspection. I am by no means a JVM expert, however the tools Alex showed us will definitely help me get a better understanding of the machinery behind Clojure.

Simon Belak was up next talking about transducers and statistical analysis. This was probably the hardest one for me. I haven’t found a way to appreciate the value of transducers yet, and statistical analysis is not my strongest skill. But I still appreciated the concept of sketch algorithms and I will hunt histograms pretty soon.

Srihari Sriraman with “Practical Generative Testing Patterns” blew my mind and, if you fancy ratings and such, was the highlight of the day. We all know test.check is good, but the approach of Srihari to automation, seeding relevant data and testing plausible behaviours left me eager to grab my keyboard and implement something similar.

After lunch we were treated to one more talk before the lightning sessions. Wilker Lúcio explained the beauty and ease of use of GraphQL, an interesting alternative to REST for better APIs.

The lightning talks kicked off with some magical REPL-debugging from Valentin Waeselynck. scope-capture looked promising, and I can only hope for an integration with CIDER. Dr Roland Kay reminded us of the usefulness of clojure.spec, although if I had to base my opinion of clojure.spec on his talk, it roughly looked like the type-system Clojure is missing. No trolling intended. Thomas van der Veen hit the MQTT broker pedal, mixing Java and Clojure, but I am still not sure I got the purpose of the experiment aside from the sake of learning. Ray McDermott closed the lightning sessions with an amazing browser-driven, multi-user REPL he is devising which can make live pair-programming scattered around the world a breeze.

The last three talks reflected experiences of using Clojure for business. Josh Glover, Philip Mates and Pierre-Yves Ritschard shared with us the journeys of their companies and projects and how designing, developing and testing have only improved since their move to our beloved language.

Drinks followed before a bit of REPL-driven comedy courtesy of Ray McDermott. Suffice it to say we sang the Clojure version of Bowie’s “Rebel Rebel” aptly entitled “REPL REPL”. If you weren’t there, well, you don’t know what you missed.

Dutch Clojure Days left me with the impression that the Clojure community is alive and hard-working, and its heart is in the right place. Ideas flourish, projects boom, boundaries get stretched. We can only be thankful to the DCD staff for being able to set up such a pleasant event, asking us only to join them to share our passion.

Permalink

Website routing made simple and easy

I was kidding myself for a long time. I thought of myself as an engineer, I was making complex billion user web apps, I was ENGINEERING the front end and the back end, of course I was. Except I wasn't.

I make websites.

When I finally came around and saw myself how other people see me, I reached website maker enlightenment. I stopped using complex js frameworks, complex backend api tech, I quit graphql, I quit REST APIs, I quit doing stupid stuff, and I started finishing projects.

I want you to finish your projects too, and the first step is to dump those complex frontend frameworks, and maybe embrace the simplicity of clojure.

How exactly is routing made simple or easy? WTF does that even mean?

I'll show you. Here's a route all by itself

[:get "/" home]

Here's two routes

[[:get "/" home]
 [:get "/@:name" profile]]

This is a route in coast on clojure

(ns your-project
  (:require [coast.gamma :as coast]))

(defn home [request]
  {:status 200
   :headers {"Content-Type" "text/html"}
   :body "Hello world!"})

(def routes [[:get "/" home]])

(def app (coast/app routes))

(app {:request-method :get :uri "/"}) ; => Hello world!

Routes in coast are a vector of vectors, and the routes on top get matched first.

(defn hello [request]
  {:status 200
   :headers {"Content-Type" "text/html"}
   :body "hello world!"})

(defn goodbye [request]
  {:status 200
   :headers {"Content-Type" "text/html"}
   :body "goodbye, cruel world!"})

(def routes [[:get "/" hello]
             [:get "/" goodbye]])

(def app (coast/app routes))

(app {:request-method :get :uri "/"}) ; => hello world!
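To make the "top route wins" rule concrete, here is a minimal sketch in plain Clojure. It is not how coast is implemented: first-match is a hypothetical helper written only for this illustration, and it matches exact paths, with no support for parameters such as "/@:name".

```clojure
;; Hypothetical helper, not part of coast: returns the handler of the
;; first route whose method and path both equal those of the request.
(defn first-match [routes {:keys [request-method uri]}]
  (some (fn [[method path handler]]
          (when (and (= method request-method) (= path uri))
            handler))
        routes))

(defn hello [_request] "hello world!")
(defn goodbye [_request] "goodbye, cruel world!")

;; Two routes for the same method and path: only the first can ever match.
(def routes [[:get "/" hello]
             [:get "/" goodbye]])

(def request {:request-method :get :uri "/"})

((first-match routes request) request) ;=> "hello world!"
```

Because some stops at the first truthy result, goodbye is never reached, which is the same ordering behaviour described above.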

If you don't want to write vectors all day, you can use coast's route functions

(ns routes
  (:require [coast.gamma :as coast]
            [coast.router :refer [get post put delete wrap-routes]]
            [coast.responses :as res])
  (:refer-clojure :exclude [get]))

(defn home [request]
  "welcome!")

(defn profile [request]
  (str "hello, " (get-in request [:params :name])))

(def routes (-> (get "/" home)
                (get "/@:name" profile)))

(def app (coast/app routes))

Here's a more complete example for something like auth with buddy

(ns routes
  (:require [coast.router :refer [get post put delete wrap-routes]]
            [coast.responses :as res]
            [controllers.home :as c.home]
            [controllers.users :as c.users]
            [buddy.auth])
  (:refer-clojure :exclude [get]))

(defn wrap-auth [handler]
  (fn [request]
    (if (buddy.auth/authenticated? request)
      (handler request)
      (res/forbidden
        "I'm sorry dave, I can't let you do that."))))

(def auth (-> (get "/users/:id" c.users/show)
              (wrap-routes wrap-auth)))

(def public (get "/" c.home/index))

(def routes (concat public auth))

Routing in coast on clojure is meant to be easy and easy to understand. Hopefully I've given you a glimpse into just one aspect of how making websites, not complex, over-engineered, front-end-heavy web apps, can be made simple and easy.

If you're picking up what I'm putting down, give coast on clojure a try!

Permalink

Bringing order with Clojure's sort-by

It is unavoidable, really.

Any data eventually needs to be sorted for presentation. Most of the time we’re lucky and can lean on the implicit order of data returned from the database, or we can decorate that HTML table with DataTables and get sorting for free.

Implicit sorting is a crutch that I’ve relied on many times, and I’m sure you have too.

Implicit sorting in RDBMS systems

What many of us observe when using, say, MySQL is that the rows get returned to us in an order we're familiar with: the insertion order. With InnoDB tables this generally means ascending by numeric auto-incrementing primary key. With MyISAM it is typically insertion order. Other storage engines might have different properties.

Already it is clear that within one RDBMS system we can have multiple behaviours depending on the underlying storage engine used by the table.

Long story short, we have no guarantees and this can easily nip us in the behind if we're not careful.

Leaving it to the caller

Knowing now that we cannot depend on the source to sort the data for us, it is almost certainly up to the caller to sort the data. This often comes in the form of a query parameter, be it an ORDER BY clause in a SQL statement or a query parameter to an API.

More bespoke sorting tends to happen in the consuming code, not at the source, and this is where we'll look at what Clojure offers us.

Sorting data with Clojure

Clojure is great at taming data of all kinds. Here I just want to explore a few ways to get some order to your data using simple functions.

Example data

For the examples below I’m going to be working with some invoice data. Each invoice looks something like this:

{:invoice/number "ACMEINV00001"
 :invoice/date "2016-11-25"
 :invoice/total-before-tax 100
 :invoice/tax 10
 :invoice/items [...]}

I’ll leave it up to you to imagine how rich these data structures can become in a real invoicing system.

Getting started with sort-by

The first requirement could be to sort a vector of invoices by total. This is relatively simple with sort-by:

(sort-by :invoice/total-before-tax invoices)

The first argument to sort-by should be a function, and since keywords are functions of maps you can just specify the key in the map to be used. One caveat: the value of the entry in the map must be comparable.

That gets you a new sorted sequence, with the smallest invoices first and the most valuable ones at the end. Hardly useful for business, so let's flip it around by supplying a comparator function too:

(sort-by :invoice/total-before-tax > invoices)

Now we’re cooking with gas! The most valuable invoices are now at the head of the list! Need the top 10? Just take what you need:

(take 10 (sort-by :invoice/total-before-tax > invoices))

How did this happen? Clojure compared the values returned by :invoice/total-before-tax using the > function.
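Here is a small worked example of those three calls. The invoice numbers and totals are made up for illustration, and the results are mapped to invoice numbers only to keep the output short.

```clojure
;; Made-up sample data, trimmed to the keys used here.
(def invoices
  [{:invoice/number "ACMEINV00001" :invoice/total-before-tax 100}
   {:invoice/number "ACMEINV00002" :invoice/total-before-tax 250}
   {:invoice/number "ACMEINV00003" :invoice/total-before-tax 75}])

;; Ascending: with no comparator given, sort-by uses compare.
(map :invoice/number (sort-by :invoice/total-before-tax invoices))
;=> ("ACMEINV00003" "ACMEINV00001" "ACMEINV00002")

;; Descending: > is used to compare the extracted totals.
(map :invoice/number (sort-by :invoice/total-before-tax > invoices))
;=> ("ACMEINV00002" "ACMEINV00001" "ACMEINV00003")

;; Top 2 by total.
(map :invoice/number (take 2 (sort-by :invoice/total-before-tax > invoices)))
;=> ("ACMEINV00002" "ACMEINV00001")
```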

Sorting by composite keys

The next requirement might be to sort by :invoice/total-before-tax and :invoice/number. The idea is that when two invoices have the same total, they are then sorted by their invoice number to show some kind of implied order.

This is where our friend juxt comes in. juxt accepts one or more functions and returns a new function that, when called, returns the results of all the original functions in a vector.

(def head-and-tail (juxt first last))
(head-and-tail [1 2 3 4 5]) ;=> [1 5]

Here you can see that juxt applied first and last to the supplied list of numbers and gave us the head and the tail of the list. Keen readers might have just figured out where I’m going with this.

sort-by can compare these vectors too, so we can sort our invoices like this:

(def total-before-tax-and-number (juxt :invoice/total-before-tax :invoice/number))
(sort-by total-before-tax-and-number invoices)

Now the results will be sorted from lowest value invoice to the highest, with the invoice numbers in order too. So we're halfway there.
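A short worked example may help here. The data is made up, with two invoices deliberately sharing a total so the tie-break on invoice number is visible. Vectors of equal length are compared element by element, which is what makes the juxt key work.

```clojure
;; Made-up sample data: two invoices share the same total.
(def invoices
  [{:invoice/number "ACMEINV00003" :invoice/total-before-tax 100}
   {:invoice/number "ACMEINV00001" :invoice/total-before-tax 100}
   {:invoice/number "ACMEINV00002" :invoice/total-before-tax 50}])

(def total-before-tax-and-number
  (juxt :invoice/total-before-tax :invoice/number))

;; The sort key for one invoice is a vector of both values.
(total-before-tax-and-number (first invoices))
;=> [100 "ACMEINV00003"]

;; Sorted by total first, then by number where the totals are equal.
(map :invoice/number (sort-by total-before-tax-and-number invoices))
;=> ("ACMEINV00002" "ACMEINV00001" "ACMEINV00003")
```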

Mixing the order (or composing functions)

The results of the previous example don't make much sense. How can we combine sorting the invoice amounts in descending order and have the invoice numbers run sequentially when there is an overlap?

One possible solution is to use comp, and make a new function that will return the value of :invoice/total-before-tax as a negative number. comp works by accepting a list of functions and returning a new function which, when called, applies them from right to left, passing the result of each call to the next, starting with the arguments it was called with.

An example will be worth a thousand words:

(require '[clojure.string :as str])
(def up-and-reverse (comp str/reverse str/upper-case))
(up-and-reverse "olleh") ;=> "HELLO"

;; or

(str/reverse (str/upper-case "olleh")) ;=> "HELLO"

In order to get the negative total we can just comp together - and :invoice/total-before-tax like this:

(def negative-total (comp - :invoice/total-before-tax))
(negative-total invoice) ;=> -100

If this right-to-leftness of comp bothers you, you could also simply declare it as an anonymous function which wraps a thread-first functional pipeline: #(-> % :invoice/total-before-tax -).

And using our new friend juxt we can simply roll it up like this:

(def negative-total-and-number (juxt negative-total :invoice/number))
(sort-by negative-total-and-number invoices)

And now we have a list of invoices sorted by total from most to least valuable, and where the totals match the invoice numbers follow a progression.
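To see the mixed ordering on actual values, here is a small worked example with made-up invoices. Note that negating the key only works because the totals are numbers; a non-numeric key would need a custom comparator instead.

```clojure
;; Made-up sample data: two invoices share the same total.
(def invoices
  [{:invoice/number "ACMEINV00003" :invoice/total-before-tax 100}
   {:invoice/number "ACMEINV00001" :invoice/total-before-tax 100}
   {:invoice/number "ACMEINV00002" :invoice/total-before-tax 250}])

(def negative-total (comp - :invoice/total-before-tax))
(def negative-total-and-number (juxt negative-total :invoice/number))

;; The sort keys: the largest total becomes the smallest number.
(map negative-total-and-number invoices)
;=> ([-100 "ACMEINV00003"] [-100 "ACMEINV00001"] [-250 "ACMEINV00002"])

;; Highest total first, ascending invoice number on ties.
(map :invoice/number (sort-by negative-total-and-number invoices))
;=> ("ACMEINV00002" "ACMEINV00001" "ACMEINV00003")
```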

Another variation would be to sort by number of items on an invoice. This can be achieved by composing count and :invoice/items together:

(sort-by (comp count :invoice/items) > invoices)

And you'll have the invoice with the most items at the head of the list.

Wrapping up

Although my examples are a bit contrived, the power that comes from composing functions in these intuitive ways is nearly endless. This works equally well for predicate functions used by filter, remove and many others.
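As a quick sketch of the predicate case, the same comp trick can build predicates for filter and remove. The names taxed? and empty-invoice? and the sample data are made up for this illustration.

```clojure
;; Made-up sample data, trimmed to the keys used here.
(def invoices
  [{:invoice/number "ACMEINV00001" :invoice/tax 10 :invoice/items [:a :b]}
   {:invoice/number "ACMEINV00002" :invoice/tax 0  :invoice/items []}
   {:invoice/number "ACMEINV00003" :invoice/tax 5  :invoice/items [:c]}])

;; Predicates composed from a keyword lookup and a core predicate.
(def taxed? (comp pos? :invoice/tax))
(def empty-invoice? (comp empty? :invoice/items))

;; Keep only invoices that carry tax.
(map :invoice/number (filter taxed? invoices))
;=> ("ACMEINV00001" "ACMEINV00003")

;; Drop invoices without any items.
(map :invoice/number (remove empty-invoice? invoices))
;=> ("ACMEINV00001" "ACMEINV00003")
```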

It seems many small and composable functions will end up serving you better in the long run!

Thanks

A big thanks to Robert Stuttaford for helping to review this post.

References & further reading

Cover image by Willi Heidelbach — Creative Commons Attribution-Share Alike 3.0 Unported — Wikipedia

Permalink

Passing around components with reagent and Semantic UI

I've been happily using Semantic UI React since I first wrote about it more than a year ago. Everything in the previous post still holds true, the only thing that has changed is the version number of the semantic-ui-react package.

The CLJSJS community has been great at keeping things up to date, and I can't recall any breaking changes in the components that I've been using.

One thing I unknowingly skipped in the previous article was passing around React components as arguments for other components. I don't think I realized at the time it was possible. I was learning the minimum viable React through re-frame, which got me very far (and continues to do so).

This oversight was noticed by others though. A few people reached out for advice in private, and on StackOverflow. Where I fell short, other community members helped out in the Clojurians Slack.

The Problem

Looking at the tabs example in the Semantic UI React docs, you encounter this:

import React from 'react'
import { Tab } from 'semantic-ui-react'

const panes = [
  { menuItem: 'Tab 1', render: () => <Tab.Pane>Tab 1 Content</Tab.Pane> },
  { menuItem: 'Tab 2', render: () => <Tab.Pane>Tab 2 Content</Tab.Pane> },
  { menuItem: 'Tab 3', render: () => <Tab.Pane>Tab 3 Content</Tab.Pane> },
]

const TabExampleBasic = () => (
  <Tab panes={panes} />
)

export default TabExampleBasic

The same thing occurs in several places, including popups.

So the question really is: how do we pass our reagent component along as a React component?

The Widget

Sticking with my tradition of building over-engineered GitHub widgets, here are some tabs in action:

The source code is available on GitHub at kennethkalmer/re-frame-semantic-ui-react-github-tabs.

Here is a truncated version of the tabs in the above video:

(ns github-repo-widget.views
  (:require [reagent.core :as reagent]
            [re-frame.core :as re-frame]
            [github-repo-widget.events :as events]
            [github-repo-widget.subs :as subs]
            [github-repo-widget.ui :as ui]))

(defn- readme-tab []
  (let [loading? @(re-frame/subscribe [::subs/repo-readme-loading?])
        readme   @(re-frame/subscribe [::subs/repo-readme])
        pane     (ui/component "Tab" "Pane")]

    [:> pane {:loading loading?}
     [:div {:dangerouslySetInnerHTML {:__html readme}}]]))


(defn- stats-tab []
  (let [loading? @(re-frame/subscribe [::subs/repo-info-loading?])
        pane     (ui/component "Tab" "Pane")]

    [:> pane {:loading loading?}
     ;; ...
     ]))


(defn- repo-tabs []
  (let [panes [{:menuItem "Readme"
                :render #(reagent/as-component [readme-tab])}
               {:menuItem "Stats"
                :render #(reagent/as-component [stats-tab])}]
        tab (ui/component "Tab")]

    [:> tab {:panes panes}]))

The solution

Reagent gives us reagent.core/as-component, which is exactly the interop we need to turn our Reagent component into a React component for these cases.

In the case of the tab panes, Semantic UI React expects a function that returns a component as a value for the render property. In other places it expects the component directly, as seen with popups:

(defn info-icon
  ([message]
   (info-icon {} message))

  ([options message]
   (let [popup (component "Popup")
         icon  (component "Icon")]

     [:> popup
      {:trigger (reagent/as-component [:> icon
                                          (merge {:name "info"}
                                                 options)])}
      " "
      message])))

Here the trigger property expects another component, not a function that returns the component.

In closing

Being able to use Semantic UI React directly in ClojureScript without going through some insane incantations or rituals is a testament to the amazing work done by the Reagent contributors.

It is also a testament to Clojure's pragmatic approach of embracing the language/environment that hosts it.

Permalink

Nothing public.

Nothing public. Just use (defn ^:export my-fnc []) to expose to JS. Check out Mori from David Nolen. I can always hop on Zoom if you have specific questions.

Permalink

test-doubles: A small spying and stubbing library for Clojure and ClojureScript

As you may know from a previous post, I’m working for GreenPowerMonitor as part of a team that is developing a challenging SPA to monitor and manage renewable energy portfolios using ClojureScript.

We were dealing with some legacy code that was effectful and needed to be tested using test doubles, so we explored some existing ClojureScript libraries but we didn't feel comfortable with them. On one hand, we found that some of them had different macros for different types of test doubles, and this made tests that needed both spies and stubs become very nested. We wanted to produce tests with as little nesting as possible. On the other hand, being used to Gerard Meszaros’ vocabulary for test doubles, we found the naming used for different types of test doubles in some of the existing libraries a bit confusing. We wanted to stick to Gerard Meszaros’ vocabulary.

So we decided we'd write our own stubs and spies library.

We started by manually creating our own spies and stubs for some time so that we could identify the different ways in which we were going to use them. After a while, my colleague André Stylianos Ramos and I wrote our own small DSL to create stubs and spies using macros to remove all that duplication and boilerplate. The result was a small library that we've been using in our ClojureScript project for nearly a year and that we've recently adapted to make it work in Clojure as well.

I’m really glad to announce that GreenPowerMonitor has open-sourced our small spying and stubbing library for Clojure and ClojureScript: test-doubles.

In the following example written in ClojureScript, we show how we are using test-doubles to create two stubs (one with the :maps option and another with the :returns option) and a spy:

We could show you more examples here of how test-doubles can be used and the different options it provides, but we’ve already included a lot of explained examples in its documentation.

Please do have a look and try our library. You can get its latest version from Clojars. We hope it might be as useful to you as it has been for us.

Permalink

Capital One shut down Level Money last year.

Capital One shut down Level Money last year. It’s doubtful any of the code was brought to other projects.

Using the package approach with JavaScript interop is still a good option to start working ClojureScript into an organization without having to get the entire company to align on a tech stack.

Since Triforce, I’ve built another app that’s all ClojureScript in React Native and AWS Lambdas.

Permalink

Copyright © 2009, Planet Clojure. No rights reserved.
Planet Clojure is maintained by Baishamapayan Ghose.
Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.