Whenever I encounter some cool new technology, it immediately and unconsciously goes into one of two categories: “things I know I’ll never use” or “maybe someday”. It is pretty easy to tell whether something falls within even the widest interpretation of your personal problem space.

For a long time, Elasticsearch was on the second of those lists. It even seemed as though anyone in the server-side “web space” should have a search solution in their quiver. But search is hard, and there are always more urgent things to do. Until now!

This week I finally started using Elasticsearch in a project. This article is going to be a n00b’s eye view of what getting started with Elasticsearch feels like, with a look at some of the kinds of problems it can help with. This is certainly not a tutorial! The code I’ve included is just for explanatory purposes and is not example code of any kind!

What it is

Briefly, Elasticsearch is a Lucene-based search tool (engine/platform/database/server/whatever). Lucene is the venerable Java-based search engine. Solr is Elasticsearch’s direct competitor and does basically the same thing.

Traditionally, when choosing between Solr and Elasticsearch, one was told that Elasticsearch was built to be distributed (hence the “Elastic” part). That difference may no longer hold: Solr has progressed recently and uses Zookeeper, so that now they can say this on their website:

Built on tried and true Apache Zookeeper, Solr favors consistency and avoids hairy split brain issues common to other approaches

— http://lucene.apache.org/solr/features.html#solrcloud

They’re talking about Elasticsearch there. There is lots more to read about this debate. You can start with Kyle Kingsbury’s (@aphyr) Jepsen testing of Elasticsearch, and move on to this piece about Solr. The problems that show up with Elasticsearch here would be important if Elasticsearch were your primary data store, which I have indeed heard people recommend. If your data lives elsewhere, like in a database, and Elasticsearch is just a search engine, you should be fine.

For me, though, none of that made any difference, because all my data was going to fit in a single shard. The biggest reason I chose Elasticsearch was simply that there is a solid Clojurewerkz library for it called Elastisch. As I learned more about Elasticsearch, I realized that it has other features that make it an interesting choice. Which brings us to my project…​

The problem

The goal was fairly simple. We have around 100,000 documents: web pages from the past 15 years or so. We also have a list of about 12,000 keywords that we are interested in, and we would like to get an idea of how their importance has evolved over the years. We also know that the results won’t be perfect, both because matching keywords to texts is no simple task and because we aren’t ready to get into advanced keyword detection algorithms at this point.

The basic approach is pretty simple:

  • Index the 100,000 documents.

  • Search them using the keywords.

  • Generate a report with the totals for each month over the past 15 years.

Looping through each month since 2000 and running a search on a keyword seems ungainly and possibly too slow (although who knows, with a data set this size). This is where some of Elasticsearch’s features become really useful.

Getting started

This isn’t a tutorial, at all. I will mention some of the difficulties I had along the way though.

I first installed Elasticsearch from Elasticsearch’s apt repository. The installation was simple enough, but Elasticsearch would silently die whenever I started it. /etc/init.d/elasticsearch stop would report that a PID was found but that there was no running process. Is this because Elasticsearch runs as java? I don’t know. I moved on to the manual “install” which, at least for development purposes, basically amounts to untarring the tarball and running bin/elasticsearch.

Interaction usually happens via HTTP. The examples all use curl. Since I am doing this in Clojure, using Elastisch, the Clojure REPL is also a great way to interact with the search engine. For Emacs people, there is also es-mode, which is really nice because you can keep a file full of different Elasticsearch requests and then type C-c C-c to send one of them to the server. Highly recommended!

Keyword mechanics

Finding keywords

Identifying keywords in texts is obviously a huge problem. The keywords I had to deal with were more than just simple words. Here is a sample:

  • Sciences humaines

  • Sémiotique des médias

  • Proust, Marcel

  • Vigny (de), Alfred

  • Empire

  • revue Culture et musées

Exact matches would not necessarily be useful. What if someone writes “`empires`” or “`une science humaine`” or “`les sciences, mêmes humaines`” (which would mean something like “sciences, even if they are human sciences”)? I actually wrote an initial version of this project that just used straight up string matching. It worked, in the sense that it produced somewhat meaningful results, for some keywords. Others would disappear completely. In the list above, something like “revue Culture et musées” might not work well, because the first word, which means “journal” might often be omitted.

These are classic human language parsing problems. The nice thing about using a tool like Lucene is that a lot of this has been dealt with already. As you may have noticed, these keywords are in French. Lucene knows how to index French. Just tell Elasticsearch that a field is in French:

{ :body        {:type "string", :store "yes", :analyzer "french"}
  :title       {:type "string", :store "yes", :analyzer "french"}}

(Note that I’m writing this in Clojure. Classic Elasticsearch is in JSON, so the Clojure keywords get converted into strings.)
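For completeness, a mapping like this gets installed when the index is created. A sketch with Elastisch’s native client might look like the following; the connection settings here are assumptions for a local development node, not this project’s actual configuration:

```clojure
(require '[clojurewerkz.elastisch.native :as es]
         '[clojurewerkz.elastisch.native.index :as esni])

;; Connect to a local node and create the index with French-analyzed fields.
(let [conn (es/connect [["127.0.0.1" 9300]]
                       {"cluster.name" "elasticsearch"})]
  (esni/create conn "ourindex"
               :mappings {:page
                          {:properties
                           {:body  {:type "string", :store "yes", :analyzer "french"}
                            :title {:type "string", :store "yes", :analyzer "french"}}}}))
```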

Fast forward: we’ve pulled the 100,000 documents out of their MySQL table and indexed them. I can run searches against them, I can find stuff, all very cool. But search, as such, is not the primary goal here. At this point I was reassured to be moving forward with Elasticsearch but unsure about how I was going to get my keyword data.

In comes the Percolator.

The Elasticsearch percolator reverses the typical relationship between a query and the document that is found. With the percolator, queries are stored ahead of time, as documents. Then we can send a document, I mean like a real document, and see which of the pre-selected queries match. It’s like a backwards version of Google, where you would type in a URL and get back a zillion pages of queries that would have matched that web page. You use a document to query the queries.

The idea is to see which of our keywords might pertain in some way to the documents. We are using a document to search for keywords.

The basic setup is simple enough. Here is my function for installing an individual keyword as a percolator document:

(defn register-tag-percolator [es-conn tag tagnorm id]
  (nperc/register-query es-conn "ourindex"
    (str "tag-" tagnorm "-" id)
    :query (t/query-from-tag tag)))

nperc refers to the clojurewerkz.elastisch.native.percolation namespace. (The native part means that we aren’t using the RESTful interface, but the native Java interface. This is a (small) advantage of Clojure’s Java interop.) “ourindex” is the index where all of our data is already stored. The argument after that, (str "tag-" tagnorm "-" id) is the name of the percolator query document that we are storing. Our keywords have a normalized form and a numeric id, so we are just assembling a human-readable string. This will be important later though. The t/query-from-tag function is our own function. It will build an Elasticsearch query object from the actual tag string. More about that soon.

So we loop over our 12K keywords with this function. Now we can throw a document at them and see what happens. Here is our function for doing that:

(defn percolate-doc
  [es-conn doc]
  (nperc/percolate es-conn "ourindex" "page" {:doc doc}))

I agree, there isn’t much to see here. I’m showing this function just to point out that we can percolate any arbitrary document, even if it isn’t in the index yet. The doc argument to this function can be any JSON object with the same fields as the other documents. This is very important, because it is going to allow me to grab the keywords before I index the document, then include the results of that work in the document.

Now I can send a document to the percolator and get an answer like this:

{:count 117,
 :matches [... etc.]}

And at this point (assuming I’m satisfied with these results, but I’ll get back to that), I can take these 117 keyword matches and make them part of the document before I index it. Here is the function that does it:

(fn [r]
  (let [page (page-to-es r)
        {tags :matches} (percolate-page es-conn page)
        clean-tags (mapv (fn [t]
                           (last (str/split t #"-" ))) tags)]
    (esnd/create es-conn "ourindex" "page"
      (assoc page :autotags clean-tags))))

This function is the :row-fn argument to a clojure.java.jdbc query call. That means that r is a row in an SQL result set.

So what’s going on here? page-to-es is formatting the database row as a page document for the Elasticsearch index that we’ve already defined. The next line, with {tags :matches}, sends the document to the percolator to get a list of keywords, like we just saw. We then map over that to grab just the numeric part of the tag names, and finally we index the document, associng in a vector of keyword ids as the :autotags field.

Aggregating it all back together

Our lists of keywords that are now part of each document that we put into the index are not going to improve search results, since they already are search results. The reason we are storing them this way is that we want to look at the entire collection at once so that we can graph the frequency of different keywords over time.

This is where Elasticsearch’s aggregations come in. Here is how we get our report back:

GET /ourindex/page/_search?search_type=count
{
  "query": {
    "match": {"autotags": "8442"}
  },
  "aggs": {
    "by_month": {
      "date_histogram": {
        "field": "date",
        "interval": "month",
        "format": "yyyy-MM-dd"
      }
    }
  }
}
(You’ll notice that I’ve switched to JSON notation here: no particular reason except that I haven’t coded this part into my application yet.)

Running this query produces something like this:

{
  "_shards": { ... },
  "hits": { ... },
  "aggregations": {
      // etc.
  }
}

That little bit of query code was enough to give us a doc_count for each month. The rest, as they say, is histograms!
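When this part does make it into the application, the Elastisch version will presumably look something like this sketch (untested; the aggregation name by_month is arbitrary):

```clojure
(require '[clojurewerkz.elastisch.native.document :as esnd])

;; Count-only search: we don't need the hits themselves,
;; just the monthly aggregation buckets.
(esnd/search es-conn "ourindex" "page"
             :query {:match {:autotags "8442"}}
             :search_type "count"
             :aggregations {:by_month
                            {:date_histogram {:field    "date"
                                              :interval "month"
                                              :format   "yyyy-MM-dd"}}})
```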

It works but does it mean anything?

So far I’ve basically been telling the story of a proof of concept. It was very reassuring to see that Elasticsearch’s features could get us from our raw data to the output that we needed in very few steps. And, without caching anything, that is, without building up an intermediate index somewhere from which we would have generated our statistics. Once the keywords and the documents are indexed, the report generation is live. From a nuts and bolts perspective, this part was very satisfying.

But…​ does it mean anything? How good are the results?

This is what I’ve learned so far, when working with “medium data”: it is easy to draw a nice looking chart, but how do you know that you are measuring something meaningful? In this case, the question boils down to the quality of the matches in the percolation phase. Are those 117 keywords relevant to the document that they matched?

I’ve glossed over some important details in telling this part of the story, and now I want to come back to some of the other problems that I had to solve along the way. In the percolating function that I showed above, there was a query-from-tag function. This function’s job is to take a string like “Vigny, Alfred (de)” and turn it into a useful Elasticsearch query.
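A first, naive version might do nothing more than build a match query against the body field. This is a hypothetical sketch; as the rest of this section explains, the real function needs much more care:

```clojure
(require '[clojurewerkz.elastisch.query :as q])

;; Naive version: a match query on the analyzed :body field.
;; The "french" analyzer configured on :body will be applied
;; to the query text as well.
(defn query-from-tag [tag]
  (q/match :body tag))
```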

There are several problems here. Most of the time, this nineteenth-century poet’s name will just be written as “Vigny”. The first problem is that, because of the way that Elasticsearch (and Lucene?) analyzes the documents it indexes, “Vigny”, a very distinctive proper noun, can also match documents containing vigne (grapevine). As far as I can tell, the question of whether or not to analyze a text field is an index-level decision. Since I want full text search on the text body of each page document, it has to be analyzed; that means that the search query will be analyzed too, and so vigne will be treated as equivalent to “Vigny”. There might be a way to get around this, but I did not find one.

So what about using “Alfred de Vigny” instead? A query could easily be written where “alfred” and “vigny” would both have to be present to get a hit. Except that would mean missing all the occurrences of “Vigny” by itself. So we can’t do that.

The next logical approach would be to give a higher score to matches on the complete name and then maybe eliminate the lower-scoring hits. This is where we run into what is, in my opinion, one of the major limitations of the Elasticsearch percolator for this kind of problem: the results of a percolator query, i.e. throwing a document at 12,000 queries to see which ones stick, are not scored the way the results of traditional search queries are. It is binary: hit or miss.

On my first try, most of the documents were getting between 700 and 1000 keyword hits. By delving into the way that Elasticsearch queries work, I was able to fine-tune the matching to a certain degree, ending up in the range of 100 to 150 hits per page. If a person were to manually assign keywords, it would probably be more like 10 to 30 per document. I don’t know much about machine learning, but something tells me that a discrepancy of about an order of magnitude between human and machine keyword guessing might not be so bad, considering how little work actually went into setting this up.

I’m sure that there are technical reasons why this is true. A percolator query is a strange thing, and this version of the percolator has only been around for a short time. The results can also still be useful. As you can see in the sample results, the document got hits from about 1% of the available keywords. This is enough to give a rough idea of the evolution of certain concepts over time, but we have to remember that the results are rough.

If we look back at the list of sample keywords I gave at the beginning, there was this one: “`revue Culture et musées`”. It is the name of a journal. It will receive an incredible number of false positives because all three of its terms are going to be present in various configurations in a lot of documents, most of the time not referring to the journal in question.

For our purposes this is not a major issue. The idea is to get approximate results with as little human intervention as possible. If we really needed higher quality keyword matching, one solution would be to simply invest more time in the curation of the keyword queries, possibly writing some or all of them more or less by hand, or adding metadata so that we can tell the system that the word “`revue`” needs special treatment in the query. Elasticsearch provides a fairly elaborate query DSL that would permit this.

And as a surprise bonus…​

One positive result of this work is that, thanks to the machine generated keywords, we are going to be able to provide a limited set of keyword suggestions for our users, who will be able to manually tag pages. In my opinion, this alone might make the whole project worthwhile.

Not to mention that we could even use this thing as a search engine if we had to.

I’m working on a project where I need to extract textual data from a lot of HTML. Today I was working on a way to normalize my text input by parsing the HTML, especially the entities, and outputting raw text. I started to look around for a library to do this and everything seemed to point to Christophe Grand’s Enlive library.

This library was not unfamiliar to me, because I’ve used it for templating in a Ring app. I also knew that it could be used for webscraping, but I had never really paid attention to that aspect.

So I’ve got my library, I’ve got my data, and I start trying to learn how to break HTML down into simple text. I do this (en is for enlive here):

(en/html-resource  (java.io.StringReader. "<em>stuff</em>"))

and I get this:

({:tag :html, :attrs nil, :content
       ({:tag :body, :attrs nil, :content
                     ({:tag :em, :attrs nil, :content ("stuff")})})})

Now that data structure looks kind of familiar. It should, since that is the standard way of representing XML trees in Clojure.

Suddenly something clicks deep in my mind. “We can use a zipper!”

And indeed, after require-ing clojure.zip :as zip, we can do this:

(zip/xml-zip (first (en/html-resource  (java.io.StringReader. "<em>stuff</em>"))))
[{:tag :html, :attrs nil, :content
       ({:tag :body, :attrs nil, :content
                     ({:tag :em, :attrs nil, :content ("stuff")})})} nil]

And now we are off to the races! We have our handy zipper toolbox at our disposal. (I wrote three articles about zippers a few months ago: one, two and three.)

The last of the three articles provides a pretty good pattern for this situation. Here is what I ended up doing:

(defn normalize-text
  "Extract text: lower case everything, parse html, remove tags."
  [text]
  (->> (java.io.StringReader. text)
       en/html-resource
       first
       zip/xml-zip
       (iterate zip/next)
       (take-while (complement zip/end?))
       (filter (complement zip/branch?))
       (map zip/node)
       (apply str)
       str/lower-case))

The use of the ->> macro here might make this slightly unintuitive if you aren’t used to it. Note how, serendipitously, all the forms here are either single-argument functions like first, or multiple-argument functions where the key argument comes at the end. I guess today I did feel lucky.

Anyway, let’s break this down. The first few lines are what we’ve already seen, parsing HTML with Enlive’s html-resource and making an xml zipper. Let’s take it from there.

;; we make the zipper
(def hzip (->> "<p>sample <em>text</em> with words.</p>"
               java.io.StringReader.
               en/html-resource
               first
               zip/xml-zip))

;; and we walk through it and grab the text
(apply str (map zip/node
                   (filter (complement zip/branch?)
                     (take-while (complement zip/end?)
                       (iterate zip/next hzip)))))

;; with the result
"sample text with words."

How did that work? The take-while and iterate parts walk through the tree, lazily. We filter out all the branch nodes, which in this case means everything that isn’t…​ text. Which is what we are looking for. At this point all we have to do is map over the locations (remember, we are in zipper-space where everything is a location containing the entire tree — see my other articles about that) with zip/node to get the individual textual payloads. Make it all into a proper string by applying str and we’re done.

Once again, nothing particularly amazing about all of this, though I was pleased with the elegance and brevity of a solution that I was able to cobble together rather quickly. There are other ways of getting the same results too. But it does show how a little bit of familiarity with Clojure’s zipper libraries can be surprisingly useful.

So I’m working on something where I have percentile values for each month over the past ten years. I need to write an endpoint so that my javascript, which will eventually draw some D3 histograms, can easily grab that data.

The way my tables are set up, there aren’t any zero values. If I have zero occurrences for a given month, there is simply no corresponding row. This means that for my histograms, I need to fill in the missing months. And I start thinking about how I should do that: server side or in the browser, looping over all the possible months, checking to see if we have data, etc.

There isn’t anything particularly difficult about this, aside from the fact that we don’t particularly enjoy touching dates. It is just kind of tedious and possibly easy to get wrong.

And then suddenly I think: SQL!

After all, we are basically just taking a database query, munging it and sending it down the road.

Here is the query the way I wrote it the first time:

select tm.month as month, p.percentile as percentile
from tag_month_percentiles p
join tag_month tm on tm.id = p.tag_month_id
join tag t on tm.tag_id = t.id
where tm.tag_id = ?
order by tm.month asc

What I needed was an outer join to produce lines for all those “missing” months.

After fussing around a little, here is what I ended up with:

select tm.month as month, IfNull(peep.percentile, 0) as percentile
from tag_month tm
left join
     (select tm2.month, p.percentile
     tag_month_percentiles p
     join tag_month tm2 on tm2.id = p.tag_month_id
     where tm2.tag_id = ?) as peep
on tm.month = peep.month
group by tm.month
order by tm.month asc

I don’t do this often enough for it to be second nature but in the end it is pretty simple. The original query becomes a subquery and we do a left join against the entire table. All the empty months, that disappeared with the original inner join, reappear as nulls. And we use IfNull (this is MySQL - not my fault) to replace the nulls with zero in the final results.
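Since the rest of the pipeline is in Clojure, hooking this query up is just a clojure.java.jdbc call. Something like this sketch should do it (db-spec and the function name are assumptions):

```clojure
(require '[clojure.java.jdbc :as jdbc])

;; Returns one row per month, with 0 filled in for the empty months.
(defn monthly-percentiles [db-spec tag-id]
  (jdbc/query db-spec
              ["select tm.month as month, IfNull(peep.percentile, 0) as percentile
                from tag_month tm
                left join (select tm2.month, p.percentile
                           from tag_month_percentiles p
                           join tag_month tm2 on tm2.id = p.tag_month_id
                           where tm2.tag_id = ?) as peep
                on tm.month = peep.month
                group by tm.month
                order by tm.month asc"
               tag-id]))
```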

Because this is coming directly from the data source, I don’t need to figure out when my list of months should begin or end. All the months are automatically where they should be. I don’t have to think about it anymore.

There is nothing earth shattering here…​ but it does show how with about five lines of code, and the declarative nature of SQL, I avoided writing many more lines elsewhere, and avoided all the possible bugs that could have crept in.

Over the past couple of months I’ve been working on a web app, writing it in Clojurescript and using David Nolen’s Om framework. I may end up writing about some detailed aspects of Om, but for now I would like to talk a little bit about what it is like to build an Om app. This isn’t meant to be a tutorial at all, or an example; I just want to give a sense of what it feels like, and what some of the harder things to grasp are.

(If you are unfamiliar with Om, I recommend reading David Nolen’s blogpost, and listening to this Javascript Jabber episode. Or watching David Nolen’s talk at the last Clojure/West.)

My previous experience with Javascript frameworks was mostly with Backbone.js. And before Backbone existed, at least publicly, I already could sense that I needed some kind of tool to abstract the relationships between data (what would eventually become models in most frameworks) and the DOM. On one project, I wrote a kind of abstraction layer that tried to separate the data from the view. Everything was based on anonymous functions that got passed around. It worked, but when I came back four years later to debug it, I realized that it was completely incomprehensible. The only consolation I got was that because data and presentation were more or less separate, it only took about a day and a half to shoehorn it into Backbone.

Anyway, I digress…​

React and Om (from here on out I will mostly just say Om, but the two are obviously intertwined), change all of this. In my opinion, in fact, Om coupled with Clojure’s core.async, basically renders the MVC terminology obsolete. Not that we don’t have models, views and controllers anymore; it’s just that the layering of the three is no longer relevant. I would argue that in Backbone at least, the V and the C are already blended together, or that what Backbone calls a View is really a Controller, and that the DOM is in fact the View. Other frameworks handle this differently, and people have talked about MVVM as well. From what I know about other, more recent frameworks like AngularJS and Ember, they seem to be (huge) improvements over what Backbone started, but that they don’t really change the way we think about all of this.

Om (and React) are definitely view-centric. However, I would argue, for the time being anyway, that Om (with core.async) determines so many other aspects of the process that traditional Model or Controller layers are no longer relevant. In other words, this is not just a View layer in search of an M and a C. It is a different way of thinking about the whole process. And because it is so different, I thought it would be worth going over some of the fundamental ideas, to perhaps alleviate some of the disorienting strangeness that we run into when we learn new patterns.


If Om is so different, it is because the abstractions the developer has to deal with are rather different from what we are used to. And the React component lifecycle is the abstraction that is probably the hardest to understand, as well as being the abstraction that you have to deal with constantly.

By lifecycle, they mean the stages of how a given thing (a component) appears in the DOM, changes and perhaps disappears from the DOM. Every framework tries to take care of certain aspects of the task at hand so that we don’t have to think about them anymore. A big part of what Om does as a framework is moving your data through these different states.

Understanding what is happening here is therefore crucial, but difficult, especially if you, like me, don’t have any direct experience with React. And of course part of the point of using a library like Om is to not have to know React that well to get going.

So here is the complete lifecycle:

  • IInitState

  • IWillMount

  • IDidMount

  • IShouldUpdate

  • IWillReceiveProps

  • IWillUpdate

  • IDidUpdate

  • IRender

  • IRenderState

  • IDisplayName

  • IWillUnmount

These protocols are implemented by functions in Om components. A component must implement one of the rendering protocols; the rest are optional.

This list makes things look complicated, but we can actually pare this list down considerably. For example IRender and IRenderState are conceptually identical; IShouldUpdate is something that Om takes care of by itself, but could be an issue in pure React; IDisplayName exists really for debugging only.

With a little simplification, we really have four or five concepts to worry about. The basic lifecycle starts with an initial state, ie. IInitState. When our component comes to life, it first needs to be mounted into the DOM: hence IWillMount and IDidMount, and finally IWillUnmount, when the component is destroyed.

Rendering is the heart of this process, at least from the programmer’s point of view, because that is where we build the HTML that the user will eventually see. An important difference with other frameworks is that rendering happens a lot. For an animation, like moving a <div> from left to right, the IRenderState function might be called several times a second. Sometimes it is better to think of your app like a movie (a cartoon, I guess) rather than a static set of objects sitting around waiting for events to happen. And if you are thinking: “that must be horribly slow”…​ well, it isn’t, thanks to React’s virtual DOM that efficiently projects changes out into the real DOM.

So, back to the lifecycle, IRenderState basically gets called whenever something changes. Besides mounting and unmounting a component, there are also the “change” phases: IShouldUpdate (which we ignore, because in Om the components “know” if they should update or not), IWillReceiveProps, IWillUpdate, and IDidUpdate. The will and did parts are fairly simple: do we need to pre-process that new data coming in before updating (and rendering)? After updating, is there something we need to do, like grabbing the DOM node that might have changed, to get its dimensions, for example? They work like hooks in other systems.

I like to think of this as a river of data flowing down towards the virtual DOM. These protocols and their corresponding functions are how we guide that data to the right place and mold it into the right form.
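As a concrete illustration, here is a minimal, hypothetical component implementing a couple of these protocols (the names are mine, not from the app described here):

```clojure
(ns example.core
  (:require [om.core :as om :include-macros true]
            [om.dom :as dom :include-macros true]))

(defn item-view [item owner]
  (reify
    om/IInitState
    (init-state [_]
      {:hover false})            ; component-local state, set up once
    om/IRenderState
    (render-state [_ state]      ; called every time something changes
      (dom/li nil
              (:title item)
              (when (:hover state) " *")))))
```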

And so what about the data?

React (and Om) are based on the idea of one-way data binding. Change happens in your data somehow, and the components in the app translate that into a new representation in the DOM. This is why the conceptual model is so appropriate for a functional programming approach, and so appropriate for Clojurescript with its immutable data structures: in a very broad sense, there is data on one side, then a series of functions that transform it, and finally DOM output on the other side. This gets us away from the traditional UI model where lots of little pieces talk to each other all the time, and state is intertwined (or “complected”) with the interface itself.

Of course, the details are a little bit more complicated than a pure data in, DOM out model, partly because we do need information from the DOM to work its way back up the river.

Om (and React) have two kinds of data: application state (which React calls “props”) and component state. This is an area that trips people up because it is not always clear what kind of data should go in what kind of state. More on that in a second.

Application state

In Om, all application state is generally contained in a single atom holding a tree structure composed of maps and vectors. Om uses something called a cursor to access individual parts of this tree. David Nolen has compared cursors to Clojure zippers. The interface to cursors is nowhere near as elaborate as zippers, but the idea is indeed similar: keep a path to wherever the data you are interested in is located.

For example, if you have a menu with a submenu in your app, you might reference it like this: [:main-menu :sub-menu].
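In practice, that path shows up when you hand a component its slice of the state, for example via the :path option to om/root (the state shape and names here are invented for illustration):

```clojure
(def app-state
  (atom {:main-menu {:sub-menu {:items ["Home" "About"]}}}))

;; sub-menu-view only ever sees the {:items [...]} map, via a cursor.
(om/root sub-menu-view app-state
         {:target (. js/document (getElementById "menu"))
          :path   [:main-menu :sub-menu]})
```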

The reason this is important is that most components will only have access to one or possibly a few parts of the overall tree. In a more traditional javascript framework, we would say that some part of the View was “listening” to that part of the model. That would not be accurate with Om (or React), where it makes more sense to say that changes are being pushed out from the application state (the atom) to the view and ultimately to the DOM.

It’s worth noting that, in Om, both application state and components tend to be organized in a tree-like manner: components branch out into sub-components just like application state does. This pattern works well with the top-down, data-pouring-through-the-app approach.

Component state

The other part of the state picture is component local state. What should be local is sometimes a tricky question, but as a starting point, it might be helpful to think that component state tends to be more related to the mechanics of making the component work. For instance, if two components need to communicate with each other via a core.async channel, the endpoints of the channel would belong to each component’s local state. The other classic example is an animation where an offset is being incremented on each render; that offset doesn’t need to belong to the app state. It is just a “mechanical” part of making the component do what it should.

Back to application state, with an example

Application state, on the other hand, deals with what you are trying to represent. This can still be tricky, depending on what your definition of “what” is…​

In the project I’m currently working on, most of the actual content can be either in French or in Latin, and the user can choose to toggle between the two languages inside a given component. So at first, this sounded to me like component state, since it was a question of choosing between two different representations of the same thing, and because all that data was already available to the component.

This quickly started to break down though, because pretty soon I wanted components to behave differently depending on what language was displayed by nearby components. I started setting up channels to tell the neighbors about changes to language state and everything started to get complicated and spaghetti-ish. I finally realized that my mental distinction between component state and application state needed some adjusting.

It turns out that the French/Latin language choice is really part of what the app is supposed to be showing. It isn’t an implementation detail, so it goes into the app state.

Earlier, I mentioned a menu and a sub-menu as being part of application state. In some circumstances, we could imagine that the contents of those menus might be derived from other information in the app. A menu isn’t so much a “thing” to represent as a tool within your application. However, since it is an entity that is part of what you are trying to show, it probably deserves its own piece of app state real estate. Whether a collapsing menu is visible or not might, on the other hand, be a suitable candidate for component state…​

At any rate, this isn’t meant to be a complete discussion of the topic, but just enough to give you an idea of how our thinking has to change when using Om.

Back upstream

React is supposed to be fast, and Om possibly even faster. But that implies interaction, and so far I’ve been describing a great river of data that lands in the DOM. How does change occur?

The simplest case would be a list item with a button that adds a star to the item when we click it. The list item would be a component, with the button part of the same component. The button would have a handler that would look a little like this, but not quite:

(defn star-handler [ev]
  (.preventDefault ev)
  (om/transact! list-item (fn [li] (assoc li :star true))))

list-item refers to the part of the application state that is at the origin of this particular list item. om/transact! takes a function that operates on list-item, thus returning a new version with :star set to true.

What is nice here is that application state gets changed immediately. If our render function is set up correctly, it will render a star now that :star is true. If there is a star counter somewhere else in the app that keeps track of how many items are starred, it will be incremented immediately. Without any event listeners, except for the one listening for the click itself, these changes can be reflected everywhere in the app.

Now, this is probably the simplest possible case, because the state change concerns the item itself. If we wanted to remove the list item, instead of modifying it, we would have to do that from the parent component, or, to be more precise, the “owner” of the component. Unlike the DOM, “ownees” can’t access or manipulate their owners. Data just flows one way. So if we need to communicate with the parent or owner, to tell it to remove a child element, we can establish a core.async channel to tell it to call om/transact!, and everything will just work out. core.async allows us to jump back up the tree when we need to.

The same thing holds for new data coming in from the server. It goes into a channel and gets directly integrated into application state, and whatever changes need to be made can just flow back out through the rendering functions.

This is also why I was saying that Om (and React) are really much more than a View layer: Om has its own state management in the form of the state atom and cursors. It isn’t quite a model layer, but anything missing, like server sync, can ultimately just be added on as needed. The same is true of what would be the Controller: to the extent possible, you just want to write functions that react to input and modify data. In other words, while Om doesn’t provide everything you need to build a complete app, it is more than just a brick in a bigger edifice, because it imposes an architectural model that determines how you set up your “models” and your “controllers”, to the extent that it is still relevant to talk about models and controllers in this context.

Next time, or maybe the time after that, I think I’ll talk about some of the things that are indeed somewhat difficult with Om.

Thus far, in the first two parts of what is turning into a series, when walking Clojure zippers, we’ve been mostly using a pattern like this:

(require '(clojure [zip :as zip]))
(defn recursion-for-recursion
  "Walking around doing nothing."
  [loc]
  (if (zip/end? loc)
    loc
    (recur (zip/next loc))))

Now, obviously, this function wouldn’t do anything except run the zipper out to its :end. But that has been our pattern so far.

Recursion smell

But this is Clojure, and every use of recur should make you wonder for a second whether or not you can accomplish the same thing using a sequence. Take a look at this function, which I ran into here:

(defn leafnodes
  "Returns all leaf nodes in loc."
  [loc]
  (filter (complement zip/branch?) ;filter only non-branch nodes
          (take-while (complement zip/end?) ;take until the :end
                      (iterate zip/next loc))))

This could be rewritten with our familiar pattern:

(defn leafnodes-r
  "Recursive version of leafnodes."
  [loc accum]
  (if (zip/end? loc)
    accum
    (recur (zip/next loc)
           (if (zip/branch? loc)
             accum
             (conj accum loc)))))

It is interesting to see these functions side by side, or at least one above the other. (iterate zip/next loc) does the actual (lazy) recursion; (take-while (complement zip/end?)…​) limits the recursion so that we stop at the end. By this time, it is pretty clear that we are getting a seq of locations that we could do basically anything with. filter is used here to take only the leaf nodes, which is what we get in the end.

Let’s try some stuff:

(def zzz (zip/vector-zip [:a [:b [:c :d] :e]]))
(map zip/node (leafnodes zzz))
; (:a :b :c :d :e)

Mission accomplished: all leaf nodes. What about the recursive version?

(map zip/node (leafnodes-r zzz []))
; (:a :b :c :d :e)
(= (leafnodes zzz) (leafnodes-r zzz []))
; true

;; just for fun
(flatten (zip/root zzz))
; (:a :b :c :d :e)

That’s fine then, it works. The powerful thing here is that we now have the full combined firepower of the seq and zipper abstractions for extracting information from a tree.
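To make that concrete, here is a little sketch of my own (not from the original post): because the elements of the seq are locs rather than nodes, we can filter for a value and then keep navigating the tree from wherever we find it.

```clojure
(require '[clojure.zip :as zip])

;; The same lazy walk as leafnodes, minus the filtering.
(defn loc-seq [loc]
  (take-while (complement zip/end?) (iterate zip/next loc)))

;; Find the loc holding :c, then look around from there.
(let [z     (zip/vector-zip [:a [:b [:c :d] :e]])
      c-loc (first (filter #(= :c (zip/node %)) (loc-seq z)))]
  [(zip/node (zip/up c-loc))           ; the parent node
   (map zip/node (zip/rights c-loc))]) ; the right siblings
; [[:c :d] (:d)]
```

Had we mapped zip/node over the seq before filtering, the parent and siblings would have been out of reach.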

Parsing HTML

Let’s have some fun with this pattern playing with everyone’s favorite form of XML. We are going to play around with extracting the links from an HTML page. Here is our namespace declaration:

(ns info.josf.examples.zipper-seq
  (:require [clojure.zip :as zip]
            [clojure.java.io :as io]
            [clojure.data.xml :as dx]))

I’ve actually put all these examples into a GitHub repo, which could be helpful if you want to follow along in your REPL.

First, we grab a page and stuff it into a zipper:

(def zappy
  "A zipper over a webpage we can trust."
  (-> "http://richhickey.github.io/clojure/clojure.zip-api.html"
      io/input-stream
      dx/parse
      zip/xml-zip))
There are three simple steps here: dx/parse wants a stream, hence the io/input-stream call. It parses the page, and then we pass the result on to zip/xml-zip, which is the specialized zipper-builder for XML. Even here there is the node-space/zipper-space thing going on: the first two steps build the actual tree, and zip/xml-zip wraps it up as a zipper. Now let’s do something.

(defn get-links
  "Returns a list of href attributes from :a nodes in an xml zipper."
  [loc]
  (map #(get-in (zip/node %) [:attrs :href])
       (filter #(= :a (:tag (zip/node %)))
               (take-while (complement zip/end?)
                           (iterate zip/next loc)))))
(get-links zappy)
; ("index.html" "index.html" "#" "index.html"
; "branch-1.1.x/index.html" "index.html" "api-index.html"
; "#" "clojure.core-api.html".... etc.)

There is almost nothing to explain here. It is the same basic function as before, except we’ve adapted the filter predicate to work with an XML tree, and we’re using map to grab the :href value from each node.

If you’ve never dealt with XML trees in Clojure before, this might look slightly confusing. The basic idea with this kind of tree representation is that each node is a map with three keys. (clojure.data.xml actually uses records; non-text nodes are represented as Elements, but it amounts to the same thing.) It looks like this:

{:tag :a,
 :attrs {:href "http://example.com"}
 :content ("The link text")}

In our function, when we map #(get-in (zip/node %) [:attrs :href]) over the locations, we are just grabbing the href out of the node’s attrs map.

This is a pretty basic task: walk a tree, grab some data, report back with a list. Any language can do this of course. Clojure zippers let you do it fairly simply, and in fact this little function seems at least as simple as something comparable in XPath/XSLT.

A contrived twist

Now let’s try something a little crazier. We’ll parse the same document, but this time we want more than just the href values: we also want to know the depth, relative to the document root, of each link, as well as how many sub-nodes there are in the <a/>. For a given link, we want something like this:

{:href "index.html" :depth 5 :subnodes 1}

Step one: just a seq

The get-links function returns a seq of URL strings, but for this we need locs, so we have to simplify the function:

(defn get-a-zips
  "Return a lazy seq of :a locations"
  [loc]
  (filter #(= :a (:tag (zip/node %)))
          (take-while (complement zip/end?)
                      (iterate zip/next loc))))
;;; Let's put the result in a var. Printing the result
;;; would eat up a lot of space in your REPL.
(def a-list (get-a-zips zappy))

This is the same function as before, less the mapping that extracted the URLs. It is just going to give us the bare list of locations. The rest of the work is coming up with a function to map over this seq.

Step two: ancestor count

Suddenly we realize that each element in the list of link node zippers has access to the entire tree. Even though it feels like we are pulling the tree apart, filtering out everything but the parts that we are interested in, each part is actually one of those grains of sand that contains the entire universe (so to speak).

Now, we all love Clojure’s seq interface, but there is something here that can trip you up. Or at least that is what happened to me. I’ve talked a lot about the distinction between “zipper space” and “node space”, because, in my experience anyway, this is often the difference between getting it or not.

So when we then start thinking about all of this through the seq interface, it can get a little bit more confusing: we’ve got a seq, but a seq of what?

I expected to get a bunch of nodes. This is such a common pattern: walk a tree, look at the nodes, accumulate data, do something. What is really cool, and possibly confusing, about Clojure zippers, is that often you don’t get nodes, you get locations. That means that you aren’t extracting data, you are locating it in the tree. Each element of the seq has access to the entire tree. Each element that gets returned is still a zipper. (Unless it’s not, of course: functions like zip/children return nodes, not locs.)

So, given a loc, how do you count its ancestors? Well, you write a function.

(defn ancestor-count
  "How many ancestors do I have?"
  [loc]
  (count (take-while zip/up
                     (iterate zip/up loc))))

This is the same pattern as before, except that we iterate using zip/up, and we keep going up until zip/up returns nil, which can only happen at the root node.

Let’s take it for a spin:

(map ancestor-count a-list)
; (4 5 7 9 9 6 6 6 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 11 12 12 12 12 12 12
; 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 10 10
; 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
; 10 10 4)

Looks good to me…​ except wait a second! We can do the same thing in fewer characters using the zipper’s path:

(defn ancestor-count
  "How many ancestors do I have?"
  [loc]
  (count (zip/path loc)))
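A quick REPL check of my own (on a small vector tree, not the HTML page) confirms the two versions agree:

```clojure
(require '[clojure.zip :as zip])

;; The iterate-based version from above.
(defn ancestor-count-iter [loc]
  (count (take-while zip/up (iterate zip/up loc))))

;; The zip/path version.
(defn ancestor-count-path [loc]
  (count (zip/path loc)))

;; The 3 sits three levels below the root of [1 [2 [3]]].
(def deep-loc
  (-> (zip/vector-zip [1 [2 [3]]])
      zip/down zip/right zip/down zip/right zip/down))

[(ancestor-count-iter deep-loc) (ancestor-count-path deep-loc)]
; [3 3]
```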

Step three: how many kids?

For simplicity’s sake, we’ll just count the total number of descendant nodes. This is one of those times when using a sub-zipper seems like a good idea. By that I mean, as I discussed in my first post on zippers, that zippers are cheap to make, and that it can be useful sometimes to turn a given node into a “sub-zipper”.

(defn descendant-count
  "How many descendants do I have?"
  [loc]
  (let [sub-zip (zip/xml-zip (zip/node loc))]
    (count (take-while (complement zip/end?)
                       (iterate zip/next (zip/down sub-zip))))))

(map descendant-count a-list)
; (1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
; 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
; 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1)

Not the most exciting results, since all the links on the page apparently only have text content. You get the idea though. We could have counted siblings, for example, or done any other kind of calculation.
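A sibling count, for instance, falls right out of zip/lefts and zip/rights. A quick sketch of my own:

```clojure
(require '[clojure.zip :as zip])

(defn sibling-count
  "How many siblings does this loc have?"
  [loc]
  (+ (count (zip/lefts loc)) (count (zip/rights loc))))

(-> (zip/vector-zip [1 [:a :b] 2 3])
    zip/down ;; at the 1
    sibling-count)
; 3
```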

Step four : all together

Just for consistency we need one more little function to grab the href as well:

(defn extract-url
  "Get the href value from an <a> zipper. Takes a loc."
  [loc]
  (get-in (zip/node loc) [:attrs :href]))

And now we can do what we set out to do:

(defn depth-href-kidcount
  "loc-seq is a seq of <a> node zipper locations."
  [loc-seq]
  (map (fn [loc]
         {:depth (ancestor-count loc)
          :href (extract-url loc)
          :subnodes (descendant-count loc)})
       loc-seq))

Deep breath…​

(depth-href-kidcount a-list)
; ({:depth 4, :href "index.html", :subnodes 1}
; {:depth 5, :href "index.html", :subnodes 1} {:depth 7,
; :href "#", :subnodes 1} {:depth 9, :href "index.html",
; :subnodes 1} {:depth 9, :href "branch-1.1.x/index.html", :subnodes 1}
; ...etc.)

Simple as that!

Obviously, this was a contrived example, but I hope it shows that there are a lot of creative ways to mix the seq and zipper interfaces, and that doing so can be a powerful way to work with your data. In my opinion, zippers aren’t really that complicated, but they do require us to adjust our thinking. Once we do, there are lots of fun things to do.

There is one inherent limit to this pattern: there is no way to systematically edit the tree once it has been iterated into a sequence. All the locs in the seq are snapshots of, or references to, the exact same tree. Any cumulative editing will just produce a sequence of new trees, each one modified once, with no way to combine all of the edits into the same tree.

This is a follow-up to my Getting acquainted with Clojure zippers post. That was more of an introduction; this article is going to be more about some of the difficulties we run into when using zippers to modify tree structures. We are going to try to write a function to modify a tree, and along the way we will run into lots of problems that, once we deal with them, will eventually lead to zipper enlightenment. The code here isn’t meant as a recipe or a perfect example, just a way to work through some issues that come up when working with Clojure zippers.


The neat thing about zippers in Clojure is that they allow tree manipulation to be purely functional. Zipper manipulation functions can be pure: given the same zipper they will always return the same thing. This is true of movement functions like zip/down and zip/next, since position inside the tree is part of what we are feeding into the function: given the same tree, and the same position inside the tree, the result will always be the same. The tree itself, obviously, doesn’t change just because we’ve moved. The zipper here is an abstraction that separates the tree itself from the notion of our location in the tree, and thus allows us to deal with both explicitly.

…​but editable?

What about modifying the tree, though? What happens then?

It turns out that simple edits are simple. There are several functions in clojure.zip:

  • edit

  • append-child

  • insert-child

  • insert-left

  • insert-right

  • remove

  • replace

zip/edit is interesting because its arguments are a location and a function to modify the current node. This kind of editing can go rather far, since the current node can be the top of a subtree. In a simple case, though, it could look something like this:

(require '(clojure [zip :as zip]))
(def zzz (zip/vector-zip [1 [:a :b] 2 3 [40 50 60]]))
; #'user/zzz
(-> zzz zip/down (zip/edit inc) zip/root)
; [2 [:a :b] 2 3 [40 50 60]]

The first node in the tree, the 2, has been inc’ed up from 1. Editing is possible!

In an XML based tree we might have more complex editing needs. XML trees are map-based. Each node is a map with a :tag key, an :attrs key and a :content key. If we wanted to change a <ul> to an <ol> in an HTML document, the call to zip/edit would look something like this:

(zip/edit loc #(assoc % :tag :ol))

We’ll stick to vector trees here, in the interest of simplicity and short code snippets.

Let’s try inserting a node now. We’ll try to put a :c after the :b in the first subvector.

(-> zzz
    zip/down              ;; we are at 1
    zip/right             ;; this is the subvector
    (zip/append-child :c) ;; add as the last child
    zip/root)             ;; back to the top to appreciate our work
; [1 [:a :b :c] 2 3 [40 50 60]]

zip/append-child is a good choice here because we are appending to the end of the vector, and that is what it always does. What about prepending? This time we’ll put a :aa at the beginning.

(-> zzz zip/down zip/right zip/down ;; down to the :a node
    (zip/insert-left :aa)
    zip/root)
; [1 [:aa :a :b] 2 3 [40 50 60]]

insert-left: does just what it says. Let’s delete any numbers greater than 10:

(defn death-to-big-numbers [loc]
  (if (zip/end? loc)
    loc
    (if (or (not (integer? (zip/node loc)))
            (< (zip/node loc) 11))
      (recur (zip/next loc)) ;; base case: keep moving
      (recur (zip/next (zip/remove loc)))))) ;; evil big numbers
(zip/root (death-to-big-numbers zzz))
; [1 [:a :b] 2 3 []] ;; they're gone!

This is our standard zipper-walking format. Most of the work is just figuring out which locations need to be removed and passing them off to zip/remove.

Immutable gotchas : let’s wrap

Now let’s try something more complicated. We’re going to start over with a new vector tree.

(def z2 (zip/vector-zip [1 2 :a :b [3 4 :c :d 5] :e]))

Our goal this time is to make sure that all keyword nodes are wrapped together in their own subtrees, with adjacent keywords wrapped together. We want to be able to take a tree like z2 and produce this:

[1 2 [:a :b] [3 4 [:c :d] 5] [:e]]

(I encountered this problem in an HTML document, trying to wrap adjacent, indented <p> elements inside <blockquote>s.)

Here we go. Strategy number one. We’ll look for keywords and, if we find one, we’ll wrap it and then start appending any right siblings.

First, let’s just try wrapping each keyword individually:

Strategy 1 - a: wrap and go…​ down
(defn strategy-one-a [loc]
  (if (zip/end? loc)
    loc
    (if (keyword? (zip/node loc))
      (recur (zip/next (zip/edit loc #(vector %))))
      (recur (zip/next loc)))))
;; wait, don't do this: infinite loop

What? Why is this infinite? Well, when we edit the node, we create a new node, [:a], and then we say zip/next. Except that the next node in this case is down…​ to that :a again! Which we wrap again, and then go find again and again and again. next is relative to the new zipper, not the old one from before the edit. This is one of those immutable/pure-function mind warp things: you want to be able to grab the new state and hold on to the old one too. But here, the old one, where zip/next was going to logically lead to the right sibling, is no longer available to us. Or, to be more precise, it is still available, but it is now a parallel universe where the edit we want to do never really happened. We can’t have the new edit and the old position at the same time, because the old position only exists in a world where the edit never happened.
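A tiny REPL illustration of the trap (my own example, safe to run because it takes only a single step):

```clojure
(require '[clojure.zip :as zip])

;; Start at :a, wrap it, and see where zip/next lands.
(let [loc (-> (zip/vector-zip [:a 1]) zip/down)]
  (-> loc
      (zip/edit vector) ;; the node is now [:a]
      zip/next          ;; ...and next descends into the new vector
      zip/node))
; :a
```

From there the recursive version would wrap that :a again, descend again, and so on forever.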

Back to reality: let’s introduce some logic to make sure we don’t try to wrap keywords that are already inside vectors containing only keywords. In other words, if we run into [:x :y :z] we leave it alone.

First, we write a little function that looks up at a keyword’s parent to see if all the siblings are already keywords:

(defn parent-ok? [loc]
        (when (zip/up loc)
          (every? keyword? (zip/children (zip/up loc)))))

(Note here that zip/children returns nodes, not locs, so we can feed the results directly to keyword? without having to call zip/node.)

And then we plug it into our previous tree walking function:

Strategy 1 b
(defn strategy-one-b [loc]
  (if (zip/end? loc)
    loc
    (if (and (keyword? (zip/node loc))
             (not (parent-ok? loc)))
      (recur (zip/next (zip/edit loc #(vector %))))
      (recur (zip/next loc)))))

(strategy-one-b z2)
; [[1 2 [:a] [:b] [3 4 [:c] [:d] 5] [:e]] :end] ;; here we see the ended-out zipper

Good: no infinite recursion, and all of our keywords are wrapped, but individually, not grouped together like we want.

Let’s try grabbing them from the left-most keyword. As soon as we hit a keyword, we’ll try to gobble up any keywords to the right before moving on.

Strategy 1 c (starting to get ugly)
(defn strategy-one-c [loc]
  (if (zip/end? loc)
    loc
    (if (and (keyword? (zip/node loc))
             (not (parent-ok? loc))) ;; up to here, same as before
      (let [right-keys (take-while #(keyword? %) (zip/rights loc))]
        (recur (zip/next
                (reduce (fn [z n]
                          (zip/append-child z n))
                        (zip/edit loc #(vector %))
                        right-keys))))
      (recur (zip/next loc)))))

The basic structure is the same as before, except that now we try to incorporate keyword nodes to the right of the locations that need to be dealt with (ie. keyword nodes that are not in a parent node containing only keyword children, as established by the parent-ok? function we wrote earlier). Conveniently, the zip library provides zip/rights, which gives us a sequence of nodes (not locs, mind you) that we grab with take-while.

So far nothing extraordinary.

If we do have interesting siblings, we need to bring them into the node we are going to create. That is what we do with reduce here, and it is probably worth breaking down the logic there.

(reduce (fn [z n]
           (zip/append-child z n)) (zip/edit loc #(vector %)) right-keys)

This is the three argument form of reduce, where the second argument provides an initial value. Here our starting value is the original keyword node edited to be wrapped inside a new vector. Then, for each of the keyword siblings to the right, we keep calling zip/append-child.

It would be the equivalent of this, if we could know ahead of time that right-keys contained :a, :b and :c:

(zip/append-child (zip/append-child (zip/append-child (zip/edit loc #(vector %)) :a) :b) :c)

So, we gather up some more courage and we run the thing:

(strategy-one-c z2)
; [[1 2 [:a :b] [:b] [3 4 [:c :d] [:d] 5] [:e]] :end]

Well, this kind of works, in that we are getting our keywords grouped together. Obviously, we haven’t done anything to remove the nodes that we’ve copied into our aggregating nodes. And look what happens if we have more successive keyword nodes in our input:

(strategy-one-c (zip/vector-zip [1 2 :a :b :c :d :e 3]))
; [[1 2 [:a :b :c :d :e] [:b :c :d :e] [:c :d :e] [:d :e] [:e] 3] :end]

Oh dear. We can’t just leave those siblings out there, because they each get wrapped and gobbled in turn. Before we send our big reduced, appended, edited zipper back to zip/next, we need to prune the original keyword siblings from the zipper. Okay, one last try.

Works but…​ (Strategy 1 d)
(defn strategy-one-d [loc]
  (if (zip/end? loc)
    loc
    (if (and (keyword? (zip/node loc))
             (not (parent-ok? loc)))
      (let [right-keys (take-while #(keyword? %) (zip/rights loc))
            loc-no-rights (reduce (fn [z _] (zip/remove (zip/right z))) loc right-keys)
            loc-w-incorporated-keys (reduce (fn [z n]
                                              (zip/append-child z n))
                                            (zip/edit loc-no-rights #(vector %))
                                            right-keys)]
        (recur (zip/next loc-w-incorporated-keys)))
      (recur (zip/next loc)))))

I’ve shifted things around a little here for clarity, but it really isn’t too different from the previous rendition. Inside the let, loc-w-incorporated-keys is the result of the same reduce call. The difference here is that before building our final zipper, we remove the siblings.

zip/remove, after blowing away the node, automatically moves to whatever the previous location would be, as determined by calling zip/prev. This could be a problem if we didn’t know that we were working on right siblings, because if we removed the leftmost node, we would end up at the parent node suddenly, which could confuse things considerably. That won’t be a problem here though. This behaviour also allows us to use the combination of zip/right and zip/remove for as many times as we like, because we just keep removing the next node over each time.
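Both behaviours are easy to check in the REPL; two small examples of my own:

```clojure
(require '[clojure.zip :as zip])

;; Removing a middle node leaves us on its left sibling...
(-> (zip/vector-zip [1 2 3])
    zip/down zip/right ;; at the 2
    zip/remove
    zip/node)
; 1

;; ...but removing a leftmost node leaves us on the parent.
(-> (zip/vector-zip [1 2 3])
    zip/down ;; at the 1
    zip/remove
    zip/node)
; [2 3]
```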

loc-no-rights is the intermediary version of the zipper. And because right-keys is holding nodes and not zipper locations, we can come back to them and append them before moving on with zip/next.

Deep breath…​

(strategy-one-d z2)
; [[1 2 [:a :b] [3 4 [:c :d] 5] [:e]] :end]
(strategy-one-d (zip/vector-zip [1 2 :a :b :c :d :e 3]))
; [[1 2 [:a :b :c :d :e] 3] :end]

I confess that at this point I’m just glad to have something that works. I think, however, that you’ll agree that this is just too complicated, too messy and too painful for what should be a relatively simple operation. Going through all these contortions was meant to show some of the problems that crop up when learning about zippers, and how we have to bend our thinking a little (or a lot) when using them. Now let’s try to do this right…​

Strategy 2

This time we are going to do everything by editing parent nodes. Most of what we do will be in what I’ve been calling “node space” and should be simpler. Each time we hit a branch node, we will look at its children, wrap any keyword nodes and then keep moving.

The first step is to write a generic wrapping function. We want it to take a list of things and group together adjacent keywords. Here is what I came up with:

(defn wrap-keywords [lst]
  (into []
        (mapcat #(if (keyword? (first %))
                   (list (apply vector %))
                   %)
                (partition-by keyword? lst))))
(wrap-keywords [:a :b :c 1 2 [:d :e] 3 4 :f])
; [[:a :b :c] 1 2 [:d :e] 3 4 [:f]]

The trick here is that partition-by just breaks up the initial sequence into segments depending on how the predicate reacts to each item. With the sequence in the example, it produces this:

(partition-by keyword?  [:a :b :c 1 2 [:d :e] 3 4 :f])
; ((:a :b :c) (1 2 [:d :e] 3 4) (:f))

The rest of our function, the mapcat part, just turns the keyword segments into vectors (apply vector), wraps each vector in a list to put it back on the same level as its siblings, and then concats the whole thing back together, essentially erasing the lists but leaving the vectors. And finally, we use into to box everything up in a vector.
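We can watch that intermediate step by running the mapcat on its own, before into pours everything back into a vector:

```clojure
;; Keyword segments become vectors wrapped in lists; other segments pass
;; through untouched, and mapcat splices everything back together.
(mapcat #(if (keyword? (first %))
           (list (apply vector %))
           %)
        (partition-by keyword? [:a :b :c 1 2 [:d :e] 3 4 :f]))
; ([:a :b :c] 1 2 [:d :e] 3 4 [:f])
```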

(Note that we can use vectors here because we are using vector trees. In more serious code it would be wiser to use zip/make-node.)

Now it’s time to try this out on a tree.

Strategy 2: go after the parents
(defn strategy-two [loc]
  (if (zip/end? loc)
    loc
    (if (and (zip/branch? loc)
             (some keyword? (zip/children loc))
             (not (every? keyword? (zip/children loc))))
      (recur (zip/next (zip/edit loc wrap-keywords)))
      (recur (zip/next loc)))))


(strategy-two z2)
; [[1 2 [:a :b] [3 4 [:c :d] 5] [:e]] :end]
(strategy-two zzz)
; [[1 [:a :b] 2 3 [40 50 60]] :end]

In the interest of full disclosure, I messed up the first time through this and fell into the same infinite recursion trap as before. We still need to make sure that the keyword nodes aren’t already wrapped, since otherwise we’ll end up wrapping them forever. That’s what the (not (every? keyword?…)) stuff is all about.

The rest should be pretty self-explanatory by now. We check to see if we are on a parent node, using zip/branch?. We peek at the node’s children to see if any of them are keywords, and then, when the planets are correctly aligned, we edit the node as though it were a simple seq, because zip/edit modifies the node, not the zipper.

This, in my opinion, is one of the tricky parts to zippers: knowing when to be in zipper-space or in node-space, knowing how to leverage one or the other depending on what you need to do. At the same time, having these two options, and being able to quickly switch from one to the other, is what gives Clojure zippers all their power.

Zippers, in Clojure, are one of those things you end up running into that can be rather confusing. A little googling of the topic seems to reveal a universe divided between people who understand zippers perfectly and those who don’t get them at all. Zippers are a core library, so they get mentioned in the Clojure books, but they are also a special case that would take several pages to really explain in detail.

Most of all, they are, for a lot of us, slightly strange, a solution to the problem of “traversing” a tree structure without any implied state. In this post, I’m going to try to help those who don’t immediately understand zippers to at least avoid some of the subtleties that tripped me up.

Why “zipper”?

“I can’t use zip files because I don’t have a Zip drive.” Why does everything have to be called “zip”? What is a zipper, anyway?

With their usual terseness, the docs simply say:

Functional hierarchical zipper, with navigation, editing, and enumeration. See Huet
— Rich Hickey
API for clojure.zip

Huet turns out to be Gérard Huet, who published Functional pearl: the Zipper in 1997. Huet insists on how simple the concept is, and almost asks forgiveness for writing a paper about something that really should have been documented long ago:

This simple idea must have been invented on numerous occasions by creative programmers, and the only justification for presenting what ought to be folklore is that it does not appear to have been published, or even well-known.
— Gérard Huet
Functional Pearl: the Zipper

Huet is unsurprisingly one of those people who just “gets it”. Here is how he explains the Zipper:

The basic idea is simple: the tree is turned inside-out like a returned glove, pointers from the root to the current position being reversed in a path structure. The current location holds both the downward current subtree and the upward path.

Now maybe this kind of spatial and logical torsion is intuitive for you. Reading this now, I actually do understand what he means, and it might be “simple” in a Rich Hickey sense, as in “uncomplected”, but it certainly isn’t that easy to understand.

A few basic things

So here are a few things I would have liked to have understood to begin with, and that I didn’t fully grasp before going through lots of trial and error and frustration.

Zipper vs. tree vs. node

We know what a tree is. In Clojure we would represent it as something like this, with nested vectors for example:

[1 [:a :b] 2 3 [40 50 60]]

This is just a tree, not a zipper, because a zipper is not a tree: it is a tree plus your position in the tree. Or to be more precise: a tree seen from a particular location in the tree.

So to make a zipper out of that tree, we do this:

user> (require '(clojure [zip :as zip]))
user> (zip/vector-zip [1 [:a :b] 2 3 [40 50 60]])
[[1 [:a :b] 2 3 [40 50 60]] nil]

Notice how the result is just the same tree wrapped in a two-element vector? No big deal, right? Can it be that simple?

Well…​ this is Huet’s glove, but still right side out. If you want to see the lining, just do this:

user> (-> [1 [:a :b] 2 3 [40 50 60]]
          zip/vector-zip
          zip/down
          zip/right)
[[:a :b] {:l [1], :pnodes [[1 [:a :b] 2 3 [40 50 60]]], :ppath nil, :r (2 3 [40 50 60])}]

There’s the lining, or at least the beginning of it. All that extra stuff is the zipper looking back up toward the root of the tree, to the left and to the right. Moving around the tree means calling a function like zip/down on the current zipper, and getting back a new zipper with some of the nodes shifted around to represent your new position in the tree. This is functional programming. We are always creating new values, not modifying old values. Position in a tree is typically a form of implicit state: we just are somewhere. Here our location becomes explicit, and the zipper infrastructure makes it so we don’t have to think too much about how all of this happens. We don’t ever need to know about :pnodes or :ppath, we just deal with the zipper interface. (Imagine having to do this “by hand”, associng the left nodes over to the right nodes, dissocing them from the left, etc.)

Two forms of movement

While we are talking about movement, it’s worth pointing out that there are two ways of moving through the zipper. The first way is what we’ve just seen: zip/down, zip/up, zip/left, zip/right and functions like zip/root that take us all the way back to the top.
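For instance (sticking with the same tree), the directional functions compose nicely with `->`, and zip/root hands you back the root node from wherever you happen to be:

```clojure
(require '[clojure.zip :as zip])

(def z (zip/vector-zip [1 [:a :b] 2 3 [40 50 60]]))

(-> z zip/down zip/node)                           ;; => 1
(-> z zip/down zip/right zip/node)                 ;; => [:a :b]
(-> z zip/down zip/right zip/down zip/node)        ;; => :a
(-> z zip/down zip/right zip/down zip/up zip/node) ;; back at [:a :b]

;; zip/root zips all the way up and returns the root NODE, not a zipper.
(zip/root (-> z zip/down zip/right zip/down))      ;; => the whole tree again
```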

The second way of getting around the tree, zipper-style, is to walk, as in tree-walk. One of the added benefits of using zippers is that you get, for free, a depth-first tree traversal, just by calling zip/next as many times as you need.

Going back to our example tree, with zip/next you can do stuff like this:

;; let's make a var to be easier to read.
user> (def zzz (zip/vector-zip [1 [:a :b] 2 3 [40 50 60]]))
user> (-> zzz zip/next zip/node)
1
user> (-> zzz zip/next zip/next zip/node)
[:a :b]
user> (-> zzz zip/next zip/next zip/next zip/node)
:a
user> (-> zzz zip/next zip/next zip/next zip/next zip/node)
:b
user> (-> zzz zip/next zip/next zip/next zip/next zip/next zip/node)
2

(Notice that we call zip/node at the end each time, so that we get the value and not the whole zipper. More on that in a second.)

Each time we add a call to zip/next, we get one more node. Without even trying very hard, we get a complete walk of the tree just by calling zip/next a bunch of times. The function just “knows” where to go to hit each node in the tree exactly once.

End games

Now, when walking a zipper, as in any recursive situation, we don’t want to end up in an infinite loop: zip/next needs to know when to quit. That’s why the Clojure zip library provides an end? function that tells us whether we are there yet.

user> (defn fast-forward [loc]
	(if (zip/end? loc)
	  (zip/node loc)
	  (recur (zip/next loc))))
user> (fast-forward zzz)
[1 [:a :b] 2 3 [40 50 60]]

Here we’ve zipped all the way to the end, and we end up with the whole tree, that is the root node.
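A handy way to see the whole walk at once (a sketch of mine, not part of clojure.zip) is to generate the sequence of locations with iterate and stop just before the end:

```clojure
(require '[clojure.zip :as zip])

(def zzz (zip/vector-zip [1 [:a :b] 2 3 [40 50 60]]))

;; Lazily generate successive locations, stop before the end sentinel,
;; and pull the node out of each one.
(->> (iterate zip/next zzz)
     (take-while (complement zip/end?))
     (map zip/node))
;; => ([1 [:a :b] 2 3 [40 50 60]] 1 [:a :b] :a :b 2 3 [40 50 60] 40 50 60)
```

Notice that the root node itself comes first: zip/next visits every node in the tree exactly once, in preorder.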

Check out this slightly different version of our fast-forward function:

user> (defn fast-forward-z [loc]
	(if (zip/end? loc)
	  loc
	  (recur (zip/next loc))))

user> (fast-forward-z zzz)
[[1 [:a :b] 2 3 [40 50 60]] :end]

This time we get the zipper itself, and not just the node, as in the previous version. And we have this strange little :end keyword as the second element of the zipper vector. The zipper has been “consumed”: we can’t walk any more. zip/next will just keep returning the same thing, and zip/prev will just return nil. The glove has been turned so far inside out that it can’t go any further (yet, strangely, it looks right side out again).

So not only does the zipper represent our position in the tree, it also “knows” where we’ve been, or at least whether we can keep walking or not.

Nodes and locs

So what about nodes and this zip/node function? It’s actually fairly simple: a node is, of course, a tree, a subtree, a branch or a leaf. The node is your data. But it doesn’t know what its parents or siblings are, or how far it is from the root.

On the other hand, a location (usually pronounced loc) contains a node (the node is the first element in the zipper vector), but a location is actually the entire zipper, zipped to a certain point. Sometimes you need the node, and sometimes you need the loc. Somehow, figuring out when to look at a node and when to look at a location was one of the major conceptual hurdles that I encountered.

What tripped me up, I think, looking back, was that zippers require you to go back and forth between the zipper view of the world (that is: the whole tree seen from a particular position) and a node view of the world, where we are back in a good old tree structure looking at individual values.

Say we wanted to find the first keyword (in preorder depth-first traversal) in our tired zzz vector. We could do this:

user> (defn first-keyword [loc]
	(when-not (zip/end? loc)
	  (if (keyword? (zip/node loc))
	    (zip/node loc)
	    (recur (zip/next loc)))))
user> (first-keyword zzz)
:a

The “keywordiness” of a particular location in the tree is distinct from its location. To see if the current node is a keyword, we have to use the zip/node function to slip into nodespace. If, on the other hand, we want to know whether our node has a right or left sibling, for example, we need to be in zipper space. If we want to find nodes of a certain kind that have parents or children of another kind, our tests have to dance between nodes and zippers. This is the kind of thing that seems completely obvious once you understand it, but, at least for me, kept me scratching my head early on.
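As a sketch of that dance (keywords-with-right-sibling is a made-up helper of mine, not part of clojure.zip): the keyword? test lives in node space, while the sibling test lives in zipper space:

```clojure
(require '[clojure.zip :as zip])

(defn keywords-with-right-sibling
  "Walk loc depth-first, returning keyword nodes that have a sibling
  to their right."
  [loc]
  (->> (iterate zip/next loc)
       (take-while (complement zip/end?))
       ;; node space: is this a keyword?
       ;; zipper space: does this location have a right sibling?
       (filter (fn [l] (and (keyword? (zip/node l)) (zip/right l))))
       (map zip/node)))

(keywords-with-right-sibling (zip/vector-zip [1 [:a :b] 2 3 [40 50 60]]))
;; => (:a)  — :a has :b to its right, but :b is the last of its siblings
```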

Zippers are cheap

One last thing that took a while to sink in: zippers are cheap. Huet stresses that zippers are meant to be fast since, in a very clojurey way, only the differences in path and position need to be recorded, leaving most of the original data shared intact.

In a more concrete way, just looking at a zipper, we see that there isn’t much to them. Here is your data:

user> (def your-data [:a :b :c])

Here is your data in a zipper:

user> (zip/vector-zip your-data)
[[:a :b :c] nil]

It’s just a two-item vector. Well, almost: there is some metadata attached, so it doesn’t work to just stuff some data in a raw vector followed by nil. You have to use functions like vector-zip or xml-zip to create them.

But still, the data structure is very lightweight, especially when the zipper is at the root position of the tree. As a result, you can be creative with when you decide to break out a zipper. Suppose for instance that you know you only want to walk one subtree within a larger tree. You need a zipper to navigate down to the top of the subtree that interests you. Once you’ve found it, rather than figuring out how to constrain your search to just that subtree, it is simpler to just take the node containing the subtree and plunk it into a new zipper. Then you can zip/next to your heart’s content, knowing that you’ll never end up in some weird place you didn’t expect.
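Here is what that looks like with our example tree, pulling the [40 50 60] subtree out into its own fresh zipper (a sketch under the same setup as before):

```clojure
(require '[clojure.zip :as zip])

(def z (zip/vector-zip [1 [:a :b] 2 3 [40 50 60]]))

;; Navigate down to the subtree we care about...
(def sub-loc (-> z zip/down zip/right zip/right zip/right zip/right))
(zip/node sub-loc) ;; => [40 50 60]

;; ...and plunk its node into a new zipper: now zip/next can never
;; wander outside the subtree.
(def sub-z (zip/vector-zip (zip/node sub-loc)))
(->> (iterate zip/next sub-z)
     (take-while (complement zip/end?))
     (map zip/node))
;; => ([40 50 60] 40 50 60)
```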

I’ll talk about “editing” zippers some other time, but I recently came upon a problem where I wanted to walk a tree several times with different functions each time. I couldn’t just hold onto a reference to the original zipper because I was modifying it as I went. I needed to be able to chain my zipper-walking functions together in a sequence. The problem was that after each walk, I was at the :end. My functions worked individually but couldn’t be chained together. The original zipper had been walked too far and just couldn’t take it anymore. What to do? hiredman helped me out on IRC with this: just make a new zipper. In my case that meant just calling zip/xml-zip again on the root node of the exhausted zipper.
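In vector-zip terms (my case used xml-zip, but the idea is identical), resetting an exhausted zipper looks like this:

```clojure
(require '[clojure.zip :as zip])

(def zzz (zip/vector-zip [1 [:a :b] 2 3 [40 50 60]]))

;; Walk all the way to the :end...
(def exhausted
  (loop [loc zzz]
    (if (zip/end? loc) loc (recur (zip/next loc)))))

(zip/end? exhausted) ;; => true, this zipper is consumed

;; ...then rebuild a fresh zipper from the (unchanged) root node,
;; ready for another walk.
(def fresh (zip/vector-zip (zip/node exhausted)))
(zip/end? fresh) ;; => false
```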

There are some more gotchas that I’ve run into with zippers, especially when trying to edit trees, so I’ll take that on later. My feeling about zippers is that they are an extremely powerful tool that can be used in lots of unusual situations, not just parsing XML, but anywhere you have a tree that starts to get a little too complicated.