rosetta-serving-dish-4

I visisted Rosetta Code in my comparison between Clojure and Python and now I’ve decided to go back. There are many tasks which are not yet implemented in Clojure, so what are we going to do about that? Here you’ll see 2 very easy solutions and 2 that are a little more complicated.


Preface

Over the past 1.5 month this blog has been read over 80.000 times and a big part of the interest is in the ‘X vs Y’ posts I’ve done, comparing some language with Clojure in regards to a specific task or theme. Many people have voiced their opinions both those in favor and those opposed to such comparisons. Rich Hickey, author of Clojure, has never approved of direct comparisons as he feels that when an expert user of X reviews Y it won’t be a fair review and Y then gets shortchanged. Its a valid point and I definitely don’t want to be unfair towards anybody. In most cases I think I’ve tried to compare headlines such as brevity, expressiveness, support for FP or concurrency etc, but as Rich points out: Rosetta its the place for stand-alone code-comparisons.

Before I move on to Rosetta let me just say a word on Open Source. Normally we consider Open Source to mean free, both as in ‘free beer’ but also ‘freedom’. The truth however, is that someone always pays either with their money or their time. In the case of Clojure, it has come a long way and benefited many of us tremendously and the only guy who has been paying for all of this is Rich Hickey. I want to encourage all of you, business owners and users alike, to read through this page and consider supporting the development of Clojure in 2010: Funding

Rosetta

I’m not a regular at Rosetta, so I’ll share my initial experiences here as I walk through some problems. Rosetta hosts many programming challenges which range from extreme boredom to complicated problems, all of these are implemented in a range of programming languages. Looking through the problems I think the vast majority these problems are less than challenging, which got me thinking: Is it time we give Rosetta some competion? :)

Below you’ll see 4 solutions to problems previously not solved using Clojure — Can you guess why I picked 4?

#1: Sum of the Squares

The challenge is very simple: Provide a function which takes a vector of digits and returns the sum of the squares and 0 for an empty vector.

(defn sum-of-squares [v]
  (reduce + (map #(Math/pow % 2) v)))

Clojure makes this too easy as I can just map (Math/pow(er) x 2) on the vector to get the squares. If the vector is empty, map does no work, 0 is returned.

#2: Parameterized SQL

Rosetta correctly notes that parameterized SQL is a good guard against SQL Injection attacks, so they ask us to provide code which produces a parameterized SQL statement and then executes it. With quality SQL libs like ClojureQL, this is almost too easy:

(require 'clojureql)

(run (make-connection-info "mysql" "localhost" "usr" "psw")
   (update players [name   "Smith, Steven" score  42 active true]
        (= 99 jerseyNum)))

They wanted me specifically to mimic “UPDATE players SET name = ‘Smith, Steve’, score = 42, active = true WHERE jerseyNum = 99″, which I’ve done in Lisp syntax as you see above.

#3: Data Munging

Leaving easy-street we move on to data munging. For this task we get a file called readings.txt, which contains readings from a pollution monitoring station in this format

10-11-12   25.55    1    50.00    0 ....
  date    reading  flag reading flag

The readings are floats and the flags are >= 1 for a good reading and < 1 for a bad reading. For the row to be labeled good, all flags must be greater than or equal to 1.

The task is this:

  • Confirm the general format
  • Identify duplicate timestamps
  • Count rows which only contains good readings

This is another one of those tasks for which Clojure is very well suited. We have some input data, which we can manipulate into a Clojure datastructure and then pull some numbers from it. In terms of modules we need a line-scanner which does the manipulation and a function which outputs the contents of this datastructure. Lets start with the line-scanner.

The line-scanner will go through all the lines individually. I’ll use a hashmap as my accumulator wherein I put the munged data. For every line I first need to know if it’s a duplicate or not. If we use the dates for keys, this is straight forward:

  (let [[date & values] (.split line "\t")]
    (if (contains? (:valid accumulator) date)
      (update-in accumulator [:duplicates] conj date)

I destructure the line argument into ‘date’ and ‘& values’ after having split it on tabs. If the accumulator already contains the date, this is a duplicate and I store it under the key :duplicates.

If the date is not a duplicate, then I just need to know if it contains 48 values and if the date is correctly formatted as dd-mm-yy

      (if (or (not= 48 (count values))
              (empty? (re-seq #"\d{2}-\d{2}-\d{2}" date)))
        (assoc accumulator :mal-formed (conj (:mal-formed accumulator) date))
        (assoc-in accumulator  [:valid date] values)))))

And thats the line scanner. Now I can scan the entire file simply by saying

(reduce scan-line {} lines)

This is the entire program:

(import '(java.io BufferedReader FileReader))

(defn scan-line [accumulator line]
  (let [[date & values] (.split line "\t")]
    (if (contains? (:valid accumulator) date)
      (update-in accumulator [:duplicates] conj date)
      (if (or (not= 48 (count values))
              (empty? (re-seq #"\d{2}-\d{2}-\d{2}" date)))
        (assoc accumulator :mal-formed (conj (:mal-formed accumulator) date))
        (assoc-in accumulator  [:valid date] values)))))

(defn munge [uri]
  (let [lines (->> uri FileReader. BufferedReader. line-seq)
        data  (reduce scan-line {} lines)]
    (println "Duplicates:     "  (:duplicates data))
    (println "Malformed rows: "  (count (:malformed data)))
    (println "Good readings:  "
             (->> (:valid data)
                  (filter (fn [[k v]] (every? #(pos? (Integer. %))
                                              (take-nth 2 (rest v)))))
                  count))
    (println "-----------------------------")
    (println "Total rows:     "  (count lines))))

As you can see, once the data is stuffed into a hashmap things are very simple. Getting the duplicates is just (:duplicates data), counting the malformed entries is (count (:malformed data)) etc. Good readings might be tricky to read, but it’s first taking out the :valid readings, which is lines with alternating readings/flags. These are then passed to a filter, which skips the first item (rest) and then takes every second item (the flags). These are then cast to an Integer and checked with pos? if they are >= 1. The rows where every answer to pos? is true, are filtered out and counted. The generated output is:

user> (munge "readings.txt")
Duplicates:      (1995-03-26 1993-03-28 1992-03-29 1991-03-31 1990-03-25)
Malformed rows:  0
Good readings:   5013
-----------------------------
Total rows:      5471

I can’t say for sure if this is the best way to demonstrate Clojures chops for such a task, so if you have a better take on it, please do share.

#4: The Game of 24

Now we’re actually going to write a game. The game supplies the player with 4 unique digits from 1 — 9. You as the player then has to compose an equation using no more than these digits and (), +, -, *, / and the result of your statement must equal 24.

The components of the task are:

  • Validate user input
  • Evaluate mathematical expression
  • Input loop

So we can start with the easy stuff: Mathematical evaluation. In Clojure we can tap into everything Java has got, which also means Javascript evaluation. In its simplest form, we can instantiate a script engine and call eval:

user> (.eval (.getEngineByName (javax.script.ScriptEngineManager.) "js")
             "5 + 5")        
10.0

Thats actually calling the Rhino Javascript engine directly. So with this in mind we can define the main game loop like so:

(defn play-game [digits greeting]
  (println (format "Your digits: %s\n\n%s\n" digits greeting))
  (let [evaluator  (fn [s]
                     (try
                      (.eval (.getEngineByName (javax.script.ScriptEngineManager.) "js") s)
                      (catch Exception e nil)))
        userinput  (read-line)]
    (cond
      (= "!" userinput)                      (recur (make-digits) "")
      (= "q" userinput)                      (println "Thank you for playing!")
      (not= (sane-input? digits userinput))  (recur digits "You may only use the 4 provided digits!")
      (= 24 (evaluator userinput))           (println "Congratulations, you've solved the equation!")
      :else                                  (recur digits "Please try again"))))

By passing both the digits and the greeting, we save a little typing later on. You’ve already seen the evaluator and read-line does what it says. Our conditional then needs to check if the user has typed “!”. If thats the case he gets a new set of digits and no greeting.

If he doesn’t give sane input, then he gets the same digits and is told to restrict himself to the 4 supplied digits. If neither of those are true, the input is evaluated and we check if returns 24, ending the game. The :else clause covers whatever else might happen, in this case if the expression does not evaluate to 24.

The tricky bit here, is then the input checking, which I’ll split into 2 parts and explain separately.

First, we need to extract the digits and operators from the user-input, which could be “(2+4)*4/2″:

(defn sane-input? [digits i]
  (let [input      (.replaceAll i " " "")
        re-digits  (sort (map #(Integer. %) (re-seq #"\d{1}+" input)))
        operators  (re-seq #"\D+" input)]

The first regex extracts all digits from i(nput) and casts them to Integer. This cant fail because my regex will only return digits. Then the ‘operators’ are selected as anything thats not a digit, which could be anything.

    (when re-digits
      (and (= 4 (count re-digits))
           (every? true? (map #(some (fn [_] (= _ %))  ["+" "/" "-" "*"]) operators))
           (every? true? (map = (sort digits) re-digits))))))

If re-digits isn’t nil, we know that the user has provided some digits which we can work with. First I make sure that only 4 digits are supplied and then I check if every operator is in the allowed operators list and if the sorted list of digits is equal to the sorted list of user provided digits.

The entire game is shown here:

(defn make-digits []  (vec (map #(inc (rand-int %)) (repeat 4 9))))

(defn sane-input? [digits i]
  (let [input      (.replaceAll i " " "")
        re-digits  (sort (map #(Integer. %) (re-seq #"\d{1}+" input)))
        operators  (re-seq #"\D+" input)]
    (when re-digits
      (and (= 4 (count re-digits))
           (every? true? (map #(some (fn [_] (= _ %))  ["+" "/" "-" "*"]) operators))
           (every? true? (map = (sort digits) re-digits))))))

(defn play-game [digits greeting]
  (println (format "Your digits: %s\n\n%s\n" digits greeting))
  (let [evaluator  (fn [s]
                     (try
                      (.eval (.getEngineByName (javax.script.ScriptEngineManager.) "js") s)
                      (catch Exception e nil)))
        userinput  (read-line)]
    (cond
      (= "!" userinput)                      (recur (make-digits) "")
      (= "q" userinput)                      (println "Thank you for playing!")
      (not= (sane-input? digits userinput))  (recur digits "You may only use the 4 provided digits!")
      (= 24 (evaluator userinput))           (println "Congratulations, you've solved the equation!")
      :else                                  (recur digits "Please try again"))))

(play-game
 (make-digits)
 "Produce an equation using only the 4 digits listed above, which equals 24")


Conclusion

I haven’t submitted this to Rosetta, so if anybody wants to improve on it and do so, feel free. I am seriously considering setting up a new site, which would be a mix of Euler and Rosetta. Less mathematically focused than Euler and more challenging than Rosetta, supporting dialogue and comments — good idea you think?

Basically what I would like to see is a more high-level oriented forum, where we could battle it out on concurrency challenges and the likes. One of the things which really help a product/language/platform/vm take on new frontiers is dropping backwards compatibility, and I think many discussions on development would be well served in leaving much of the low-level stuff behind.

You just got my 2 cents, feel free to give me yours :)




Reddit Tweet this! Bookmark on Delicious Share on HackerNews