Over the years, I’ve actually written quite a lot of code in PHP. I never chose PHP, and I would never choose it except possibly when a client needs a CMS. Even now, look, no WordPress!

But fairly often, I end up having to use PHP. I could go on and on about why I don’t like using PHP. But it is here and sometimes you can’t escape. That isn’t a reason to turn off your brain though.

One of the things that is great about learning new languages whose basic paradigms are different from what you use every day is that you end up thinking differently. So sometimes you force even, yes, PHP to be more like the language you like.

So here is a little situation that I would have approached much differently before “getting” Clojure.

I had a fairly large “Array”, actually a map or a hash, with about 1200 words that mapped to other words. I suddenly needed to do reverse lookups on the same batch of words. I started going through the myriad “array functions” in the PHP docs and saw that there is array_flip:

array_flip() returns an array in flip order, i.e. keys from array become values and values from array become keys.

Perfect, I thought for a few seconds… But what about duplicates? Well… not really PHP’s fault for this kind of function, but:

If a value has several occurrences, the latest key will be used as its value, and all others will be lost.

And I had duplicates. And I needed to put them into arrays so that some words could map to all of their corresponding words.
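To make the problem concrete, here is a small sketch using a made-up four-word dictionary. The `$dict` sample is an assumption for illustration only, not my real data. It shows array_flip quietly dropping one of the keys that share a value:

```php
<?php
// Hypothetical sample data: two keys share the value "wordA".
$dict = [
    'word'        => 'wordA',
    'wordy'       => 'wordB',
    'wordy-word'  => 'wordA',
    'wordy-wordy' => 'wordC',
];

$flipped = array_flip($dict);

// "wordA" keeps only the last key that pointed to it; "word" is gone.
var_dump($flipped['wordA']);              // string(10) "wordy-word"
var_dump(count($dict), count($flipped));  // int(4) then int(3)
```

What I actually wanted was for "wordA" to map to both "word" and "wordy-word".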

A few years ago, at this point I would have just started writing a loop. But, because of Clojure, you start thinking that there should be a way to describe the kind of data you end up with, or at least to compose some transformations on the original map. Anyhow, this is what I ended up with:

// $this->dict is our original dictionary
$flipped = array_flip($this->dict);

// The second flip gives us something like a copy of the original
// dictionary, but without any duplicates
$nodupes = array_flip($flipped);

// and now we compare the two to see which keys are missing
$lostInTrad = array_diff_key($this->dict, $nodupes);

// now we plug the missing values back into the flipped dictionary
// as arrays of multiple reverse translations
foreach ($lostInTrad as $key => $val) {
    if (!is_array($flipped[$val])) {
        $flipped[$val] = array($flipped[$val]);
    }
    $flipped[$val][] = $key;
}
return $flipped;

So, there is a loop at the end, but only over the duplicates, which in my case was a fairly small subset of the original batch of words. If I really wanted to pursue this, I would do an array merge with a callback.
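One way to read "with a callback" is a fold with array_reduce. The following is only a sketch, not the flip-and-diff approach above: the function name `reverseLookup` is hypothetical, and unlike my version it wraps every value in an array, even when there is only one key, which is closer to what the Clojure version further down produces.

```php
<?php
/**
 * Hypothetical helper: build a reverse lookup in which each value maps
 * to the list of all keys that pointed to it.
 */
function reverseLookup(array $dict): array
{
    return array_reduce(
        array_keys($dict),
        function (array $acc, $key) use ($dict) {
            // Append the key to the list stored under its value;
            // PHP creates the inner array on first use.
            $acc[$dict[$key]][] = $key;
            return $acc;
        },
        []
    );
}

// Made-up sample data, for illustration only.
$reverse = reverseLookup([
    'word'       => 'wordA',
    'wordy'      => 'wordB',
    'wordy-word' => 'wordA',
]);
// $reverse['wordA'] is ['word', 'wordy-word'], $reverse['wordB'] is ['wordy']
```

This walks the whole dictionary rather than only the duplicates, so it trades the small-loop trick for a uniform result shape.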

How would I do the same thing in Clojure?

It was actually harder than I thought it would be. Here is what I ended up with:

(def dict { "word"        "wordA"
            "wordy"       "wordB"
            "wordy-word"  "wordA"
            "wordy-wordy" "wordC"})
(into {}
      (map (fn [[x y]] [x (map first y)])
           (group-by second dict)))

group-by is a pretty cool function that I had never used before. Once you’ve got that, most of the complexity is mangling the results back into the form we need. So not as clean as I expected but still fairly elegant because most of the hard work, the matching of duplicates, is done by a standard function.
