Reddit Clone in 10 minutes and 91 lines of Clojure

2010-02-02 21:00:47

Recently I had the good pleasure of reading this blog post, which demonstrates how to build a Reddit Clone in 100 lines of Common Lisp. I thought it might be interesting to see a port to Clojure, contrasting a couple of idioms and core functions of both languages.




Preface

Why contrast two Lisps, you ask? Because the subtle differences are interesting to me, and both languages take up about the same amount of code-space. Following the link above, you'll be able to see how Sven writes out his entire clone in a couple of screencasts, demonstrating LispWorks. Whether you watch it or not, I recommend opening up his code in a second tab while reading this post.


The not so subtle differences

Rich Hickey once remarked that cl-Loop and cl-Format were in themselves more complex than the entire Clojure language. In this case Common Lisp has a function which I would very much like to see moved into Clojure, namely format. Format can render your input in more ways than you can think of, automatically figuring out whether to suffix an extra "s" to "sec" and whatnot. It's an impressive function, which we sadly don't have in Clojure yet (update: please see Tom Faulhaber's comment below the article; it turns out Clojure does have cl-format). As you can see from Sven's Reddit Clone, he formats the links like so:

Title posted 1 minute 5 seconds ago Up Down

In the absence of format in Clojure I turn to Joda Time to mimic that behavior. Joda lets me define a PeriodFormatterBuilder, which I can use to coerce durations, which is what the timespan is, into text formatted like above:

(def formatter
     (.toPrinter (doto (org.joda.time.format.PeriodFormatterBuilder.)
                   .appendDays    (.appendSuffix " day "    " days ")
                   .appendHours   (.appendSuffix " hour "   " hours ")
                   .appendMinutes (.appendSuffix " minute " " minutes ")
                   .appendSeconds (.appendSuffix " second " " seconds "))))

This Printer can then be used to coerce a timestamp into the text you see above, by manually making a Duration (datatype) between the timestamp and DateTime/now:

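(import '(org.joda.time DateTime Duration Period))

(defn pprint [stamp]
  (let [retr   (StringBuffer.)
        ;; the elapsed time between stamp and now, as a Period
        period (Period. (Duration. stamp (DateTime.)))]
    (.printTo formatter retr period java.util.Locale/US)
    (str retr)))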

We can test it by making up a timestamp for January 31st at 12:00:00 and 0 milliseconds:

cloneit> (pprint (DateTime. 2010 1 31 12 00 00 00))
"2 hours 52 minutes 4 seconds "

Nice. This in no way emulates all of what cl-format can do, but enough for this exercise.
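
For comparison, here is a minimal sketch of the pluralization trick using cl-format itself. It assumes Clojure 1.2 or later, where the port lives in clojure.pprint (at the time of writing it was clojure.contrib.pprint), and age-text is a made-up helper name. It takes plain numbers, not Joda objects:

```clojure
(require '[clojure.pprint :refer [cl-format]])

;; ~D prints an integer; ~:P looks back at the previous argument
;; and appends "s" unless that argument is 1.
(defn age-text [minutes seconds]
  (cl-format nil "~D minute~:P ~D second~:P ago" minutes seconds))

(age-text 1 5) ;; => "1 minute 5 seconds ago"
(age-text 2 1) ;; => "2 minutes 1 second ago"
```

You would still have to split a timespan into minutes and seconds yourself, which is the part Joda handles in the version above.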


Rendering Links

To render links we first need to agree on a datastructure and for simplicity I'll go with a hash-map where the keys are the URLs and the values are hash-maps containing the properties of that URL. This makes for easy access later. To set up some test data:

(def data  (ref {"http://www.bestinclass.dk" {:title "Best in Class" :points 1 :date (DateTime.)}}))

We know that our users will want to sort the data on various columns, so it makes sense to write a render-links function to which we can pass our sorting criteria. Clojure's sort-by is special in the sense that you can pass it both a function (keyfn), which extracts the data we want to sort by, and a comparator to apply. Render-links thus becomes:

(defn render-links [keyfn cmp]
  (for [link (sort-by keyfn cmp @data)]
    (let [[url {:keys [title points date]}] link]
      [:li
       (link-to url title)
       [:span (format " Posted %s ago. %d %s " (pprint date) points "points")]
       (link-to (str "/up/" url)   "Up")
       (link-to (str "/down/" url) "Down")])))

What that does is walk through every link in the sequence which results from sorting. For every link it extracts the key, which is the URL, as well as the keys in the hash-map attached to that key. The return value is a sequence of vectors starting with [:li ...], which Compojure knows how to convert to HTML.
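
If the destructuring in the let looks dense, this small sketch isolates it. The sample map and the describe function are made up for illustration, and the :date key is left out so it runs without Joda:

```clojure
;; Hypothetical sample data in the same shape as the data ref, minus :date.
(def sample {"http://www.bestinclass.dk" {:title "Best in Class" :points 1}})

(defn describe [entry]
  ;; entry is a [key value] pair; the inner map is unpacked with :keys
  (let [[url {:keys [title points]}] entry]
    (str title " (" points ") -> " url)))

(describe (first sample))
;; => "Best in Class (1) -> http://www.bestinclass.dk"
```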

I think the specific Compojure helpers like (link-to) are pretty self-explanatory, but it's worth noting that if you don't know them all, you could still make a link like so:

[:a {:href "http://www.bestinclass.dk"} "My favorite blog"]

So the entrance fee is pretty low, as you can explore away. Let's say you want to sort all links by the number of points they have; call it like so:

(render-links #(:points (val %))  >)

So that hopefully makes sense right away. You get the key by calling :points on the value of each item, and you sort those using greater-than as the comparator. Sorting by date might be a little more tricky:

(render-links #(.getMillis (Duration. (:date (val %)) (DateTime.))) >)

As you can see, I pull out the age of each item in milliseconds and also compare them using greater-than.
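
To see sort-by with a keyfn and a comparator in isolation, here is a sketch you can paste into a REPL. The sample-links map and urls-by-points are invented names, and the data is a plain map, not the ref used in the app:

```clojure
;; Hypothetical test data: URL -> properties, as in the app.
(def sample-links
  {"http://a.example" {:title "A" :points 3}
   "http://b.example" {:title "B" :points 7}
   "http://c.example" {:title "C" :points 1}})

(defn urls-by-points [links cmp]
  ;; sort-by walks the map as [url props] entries, so the keyfn
  ;; reaches into the value with val
  (map key (sort-by #(:points (val %)) cmp links)))

(urls-by-points sample-links >)
;; => ("http://b.example" "http://a.example" "http://c.example")
(urls-by-points sample-links <)
;; => ("http://c.example" "http://a.example" "http://b.example")
```

Swapping the comparator is all it takes to reverse the order, which is why render-links takes it as a parameter.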


Rendering Home

So to render a main-page almost exactly like the one Sven has, we do the following:

(defn reddit-home []
  (html
   [:head
    [:title "Reddit.Clojure"]]
   [:body
    [:h1 "Reddit.Clojure"]
    [:h3 (format "In exactly %d lines of gorgeous Clojure"
                 (->> (this-file) reader line-seq count))]
    [:a {:href "/"} "Refresh"] [:a {:href "/new/"} "Add link"]
    [:h1 "Highest ranking list"]
    [:ol (render-links #(:points (val %))  >)]
    [:h1 "Latest link"]
    [:ol (render-links #(.getMillis (Duration. (:date (val %)) (DateTime.))) >)]]))

The reason I said 'almost exactly' is that Sven's version outputs "In about 100 lines of Lisp", where mine will dynamically output the exact number of lines. But looking past that small detail, I think it's a very clean representation of that webpage.

Getting the actual line count is a little tricky. Clojure stores the filename relative to the classpath when loading the file, which means that the only way to get the actual filename is to store it while Clojure is loading my file. As soon as Clojure has moved on to the next file, *file* changes:

(defmacro this-file [] (str "src/" *file*))

Hackery you say? A little, but nevertheless it does dynamically output the number of lines in the source file.

Important: If you're running this program from the REPL (i.e. not from a .jar file), this-file won't work because there is no file. Replace it with a dummy value.
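
The line counting itself can be tried without the *file* trick. The sketch below is an assumption-laden variant, not the code from the repo: it uses clojure.java.io (Clojure 1.2 or later), a made-up count-lines helper, and a temp file standing in for the real source file. Unlike the one-liner above, it closes the reader when done:

```clojure
(require '[clojure.java.io :as io])

(defn count-lines
  "Counts the lines of the file at path, closing the reader afterwards."
  [path]
  (with-open [r (io/reader path)]
    (count (line-seq r))))

;; Hypothetical demo file, written to a temp location so this runs anywhere.
(def demo-file (java.io.File/createTempFile "cloneit" ".clj"))
(spit demo-file "(ns demo)\n(def x 1)\n(def y 2)\n")

(count-lines demo-file) ;; => 3
```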


Adding Links

So now we need to enable our users to add new links to the website, and I'll implement the same validation as Sven, i.e. a valid, non-empty URL, a non-empty title, etc. To begin, I'll make a predicate to verify the URL:

(defn invalid-url? [url]
  (or (empty? url)
      (not (try (java.net.URL. url) (catch Exception e nil)))))
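
A few calls at the REPL show what the predicate returns. The definition is repeated here only so the snippet runs on its own:

```clojure
;; Same predicate as above, repeated so the snippet is self-contained.
(defn invalid-url? [url]
  (or (empty? url)
      (not (try (java.net.URL. url) (catch Exception e nil)))))

(invalid-url? "")                          ;; => true  (empty input)
(invalid-url? "bestinclass.dk")            ;; => true  (no protocol, so the URL constructor throws)
(invalid-url? "http://www.bestinclass.dk") ;; => false
```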

That makes our lives a little easier when writing the main logic. Secondly we need to set up a page which contains the input fields:

(defn reddit-new-link [msg]
  (html
   [:head
    [:title "Reddit.Clojure - Submit to our authority"]]
   [:body
    [:h1 "Reddit.Clojure - Submit a new link"]
    [:h3 "Submit a new link"]
    (when msg [:p {:style "color: red;"} msg])
    (form-to [:post "/new/"]
     [:input {:type "Text" :name "url" :value "http://" :size 48 :title "URL"}]
     [:input {:type "Text" :name "title" :value "" :size 48 :title "Title"}]
     (submit-button "Add link"))
    (link-to "/" "Home")]))

That function takes an argument for the sole reason that I want to be able to call it with an error message instructing the user on how to provide good input. You see that right in the middle:

(when msg [:p {:style "color: red;"} msg])

That only kicks in if msg is non-nil, in which case it will output a p-tag with a red font containing msg. Now that we have all the rendering out of the way, we can implement the logic:

(defn add-link [[title url]]
  (redirect-to
   (cond
    (invalid-url? url) "/new/?msg=Invalid URL"
    (empty? title)     "/new/?msg=Invalid Title"
    (@data url)        "/new/?msg=Link already submitted"
    :else
    (dosync
     (alter data assoc url {:title title :date (DateTime.) :points 1})
     "/"))))

Call that function with both a title and a URL (i.e. the user input) and it will run a fall-through validation of that data: if none of the predicates is true, we start an STM transaction in which we associate the URL with the title, a timestamp and an initial point. All the strings you see, as well as the final "/", are the return values of the conditional, one of which then becomes the argument to redirect-to.
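
Note that in add-link the "already submitted" check runs before the transaction starts. If you want the check and the write to see the same snapshot, one possible variation is to move the check inside dosync. This is only a sketch: links and add-link! are hypothetical names, and the timestamp and the redirect are left out so it runs without Joda or Compojure:

```clojure
;; Hypothetical stand-in for the data ref.
(def links (ref {}))

(defn add-link!
  "Adds url unless it is already present. Returns :added or :duplicate."
  [url title]
  (dosync
   ;; the lookup and the assoc happen in one transaction
   (if (contains? @links url)
     :duplicate
     (do (alter links assoc url {:title title :points 1})
         :added))))

(add-link! "http://example.com" "Example") ;; => :added
(add-link! "http://example.com" "Again")   ;; => :duplicate
```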


Rating Posts

Now there are only two things missing: rating and the server setup. With our data stored in a native Clojure structure, it becomes extremely easy to rate an item:

(defn rate [url mfn]
  (dosync
   (when (@data url)
     (alter data update-in [url :points] mfn)))
  (redirect-to "/"))

That function takes the URL in question, as well as a function (modify-fn). The function can be inc, dec, #(+ 5 %) or whatever you'd like; it's just a function of the current points. Calling (when (@data url)) looks up the item specified by the URL; if this is nil (i.e. somebody tried to work around the system), nothing happens. But if there is a URL by that name in the map, we alter the data by updating [url :points] directly within an STM transaction. Because the lookup and the update happen inside the same transaction, concurrent ratings from many users cannot be lost or interleaved.
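
Here is the same update logic stripped of the redirect, so you can poke at it in a REPL without Compojure. The names links and rate! and the sample entry are made up for this sketch:

```clojure
;; Hypothetical stand-in for the data ref, with one link worth 1 point.
(def links (ref {"http://example.com" {:title "Example" :points 1}}))

(defn rate! [url mfn]
  (dosync
   (when (@links url)
     ;; update-in applies mfn to the value found at [url :points]
     (alter links update-in [url :points] mfn))))

(rate! "http://example.com" inc)
(rate! "http://example.com" #(+ 5 %))
(get-in @links ["http://example.com" :points]) ;; => 7

(rate! "http://unknown.example" inc) ;; => nil, the map is left untouched
```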


Finalizing

So with all of the logic and rendering in place, we just need to bundle it in a set of routes which Compojure then serves our visitors:

(defroutes reddit
  (GET  "/"         (reddit-home))
  (GET  "/new/*"    (reddit-new-link (:msg params)))
  (POST "/new/"     (add-link (map #(params %) [:title :url])))
  (GET  "/up/*"     (rate (:* params) inc))
  (GET  "/down/*"   (rate (:* params) dec))
  (GET  "/styles/*" (serve-file "res" (params :*)))
  (ANY "*"  404))

First we serve the main page to visitors hitting the root. If you request the "/new/" address you get our input form, but if you post to it, the logic from (add-link) runs. The result, as you recall, is a redirect, either to the same page with an error or to the front page.

The 4th item serves "/up/" and then feeds the remainder of the URL into the key "*". That allows me to pass it directly to (rate) along with the final parameter inc, which causes the :points property to be incremented by one. The opposite is true for the 5th item.

The call to serve-file allows me to serve static files like CSS, JS, etc.

Finally I have my failsafe (ANY "*" 404), which of course means that all requests other than those I've defined above will receive a 404 response. It's not necessary, just nice to have. Launch these routes on a network interface by calling my main function:

(defn -main [& args]
  (run-server {:port 8080} "/*" (servlet reddit)))

Throw in a call to (include-css "res/reddit.css") and you get this:

[Screenshot: CloneIT]


Deployment

The reason I felt like following Sven's lead in producing a Reddit Clone was that I think Clojure gives you a lot of mileage in this domain, which hopefully a few of you agree with after reading this. I've added the code to a Git repo, which I hope you newcomers will really enjoy:

$ git clone git://github.com/LauJensen/cloneit.git

That gives you the code. Now download Leiningen:

$ wget http://github.com/technomancy/leiningen/raw/stable/bin/lein

Put that on your path and make it executable

$ export PATH=$PATH:/path/to/lein
$ chmod +x lein

And then install it simply by calling

$ lein self-install

Now you're sitting with my code and one of the build tools which Clojurians use. Why is this great? It's great because now you don't have to scour the net to find Clojure, Contrib, Joda, etc.; just run

$ lein deps

And you'll have all of the dependencies on your own system. Want to run my program to experiment with the webservice? No problem:

$ lein compile
$ lein uberjar
$ java -jar cloneit-standalone.jar
2010-01-31 15:22:09.694::INFO:  Logging to STDERR via org.mortbay.log.StdErrLog
cloneit.proxy$javax.servlet.http.HttpServlet$0
2010-01-31 15:22:09.725::INFO:  jetty-6.1.x
2010-01-31 15:22:09.767::INFO:  Started SocketConnector@0.0.0.0:8080
 

Yes - it really is that easy to deploy! Now you have a portable Reddit Clone which will run on Linux, BSD, Mac OS X and even Windows - all with very little effort and fewer than 100 lines of code!

Conclusion

Now you know how to write a webservice, implement Reddit-like functions, build it, handle dependencies and deploy cross-platform. Language-level support for concurrency is becoming invaluable at every turn these days, and with Clojure infrastructure pieces quickly being put in place, Clojure is giving us a lot of mileage. Hope you all have some fun with it.

PS: Big thanks to Sven for getting the ball rolling.

Code here: Github


Hubert
2010-02-02 23:22:56
Hi Lau,

Very nice introduction to Compojure, enjoyed reading it.
This is asking to get connected to some persistence storage :-)

Will keep this post as reference for Compojure, thanks!

Hubert.
Mike
2010-02-02 23:53:50
How can I run this so Jetty will pick up incremental changes so I don't have to rebuild everything?  In other words, I don't want to build the standalone version as I develop.
dnolen
2010-02-03 00:04:45
@Mike, use "lein swank" instead. Then connect to the remote REPL with an IDE or text editor (like Emacs) that supports that behavior. That's all you need to be in incremental change bliss.
Benjamin Flesch
2010-02-03 00:29:53
Hey buddy,

nice code but you've included at least one XSS flaw in here:
/new/?msg=Invalid URL could be /new/?msg=alert(/XSS/); ;-)

- beni
Eric Normand
2010-02-03 00:34:27
Hey, great job of translating Sven's code.  I need to get back into Clojure before it passes me over!

Have you checked out the LispCast videos?
Paul Dorman
2010-02-03 00:40:26
Hi, I've been enjoying your blog and screencasts. Have you looked at the cl-format implementation in Clojure Contrib?

http://code.google.com/p/clojure-contrib/wiki/CommonLispFormat
Scott Rallya
2010-02-03 03:50:21
This, along with your other articles, are probably some of the best resources for getting started with Clojure and Compojure. I look forward to reading the other articles you've written and dabbling in some Clojure myself.
Carson
2010-02-03 04:29:59
Thanks for writing that! I've been reading lately a lot of the API documentation, hearing the talks, etc, but your intro here really helped me pull together a lot of the concepts.

Great job!
Perry T
2010-02-03 04:38:32
Lau,

Tom Faulhaber ported CL's format to Clojure! I'm not in a good place to give you the link, but search for his name & "cl-format" (it's on Github).
Meikel
2010-02-03 09:12:22
Tom's cl-format is included in contrib as clojure.contrib.pprint.
Meikel
2010-02-03 09:12:58
Nice post, BTW.
Lau
2010-02-03 09:29:55
Thanks for the great feedback everyone!

A few people have mentioned the format in contrib and I actually skimmed the source before doing the post. Firstly it seems a little outdated and second it doesn't handle 'time' at all it seems. So although it shares the name, it doesn't share all of the functionality.

@Mike: What DNolen said is correct, but you can also install Emacs/Swank and run M-x swank-clojure-project on the directory, to get everything running dynamically.
Andreas Schipplock
2010-02-03 12:00:35
I'm currently playing with compojure and I liked this read :).
Wolfgang Meyer
2010-02-03 14:15:18
Nice post, makes me want to learn Compojure.

However, isn't there a race condition in add-link? Checking whether a link was already submitted and actually submitting it should maybe happen within one transaction?
Lau
2010-02-03 16:34:39
@Wolfgang: Technically you're right, in order to ensure absolute order the check on (@data url) should occur within the transaction. However, in practical terms it'll make little difference. If 2 users submit at the same time, delivery of their requests is greatly affected by network speeds, traffic, etc., and the end result is the same: only 1 URL will be found in the map.
Tom Faulhaber
2010-02-03 18:03:19
I don't understand what you mean by "outdated." cl-format is a complete (well 99% complete) implementation of the Common Lisp format function spec as defined in "Common Lisp: the Language, 2nd edition"

Common Lisp's format function does not itself have support for dates. If you look at Sven's code, you'll see that he uses special date formatting code from the s-utils library (http://homepage.mac.com/svc/s-utils/). In general, cl-format does not do any special handling for Java objects, but renders them the same way as Clojure's print function would.

Thinking about it, adding specific dispatch functions for Java classes passed to ~a would be easy and valuable. This would let  us handle things like dates  nicely (either Joda or regular Java).

By the way, I have come to the conclusion that any Clojure programmers who have to interact with time at all should just break down and learn Joda time. The built in Java stuff is so messed up that you get tangled up in no time. Joda is big (cause time is *hard*), but is pretty easy to use with a small amount of study.

Thanks for this series, Lau. I've begun to write some Compojure applications and the lack of doc in general is killing me. This stuff helps a lot with that.
Lau
2010-02-03 18:12:18
Hey Tom - I apologize for the premature judgement on your format implementation. Thanks for checking in.
Vincent Murphy
2010-02-10 13:41:31
Later version with registration here:

http://www.bestinclass.dk/index.php/2010/02/reddit-clone-with-user-registration/