Scala Vs Clojure – Let’s get down to business

Of the new languages that are emerging these days, no two are as interesting as Scala and Clojure. Both claim to be functional and geared for concurrency; one is a Lisp, the other a curly-brace language. On paper they stack up fairly well against each other, so let’s investigate how well they are suited for business.

The Facts

I’ve drawn up this table, which compares Scala and Clojure, feature for feature, pound for pound.

Clojure              Scala
Lisp-1               -
Macros               -
Functional           Sorta functional
STM                  Actor model
Multimethods         Object Oriented
Dynamically typed    Statically typed
Lazy evaluation      Strict evaluation

So there are certainly big differences between these two languages. Clojure, being a Lisp, is read as a series of nested expressions, each passing its result to the expression that encloses it, like

(myfunc
 (first
  (iterate inc 1)))

You read that backwards, from the innermost expression outwards:

  1. Generate a stream of all integers (by iterating inc) starting with 1
  2. Take the first from the sequence (that’ll be 1)
  3. Pass that to myfunc
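
For comparison, roughly the same three steps can be sketched in Scala. This is only an illustration under stated assumptions: myfunc is a hypothetical stand-in (here it simply doubles its argument), and Iterator.iterate plays the role of Clojure’s iterate.

```scala
object InsideOut {
  // Hypothetical stand-in for the article's myfunc.
  def myfunc(n: Int): Int = n * 2

  // Lazily generates 1, 2, 3, ...; nothing is computed until an element is requested.
  def naturals: Iterator[Int] = Iterator.iterate(1)(_ + 1)

  // Same steps as the list above: generate, take the first, pass it to myfunc.
  val result: Int = myfunc(naturals.next())
}
```

Here the reading order is the same inside-out order as in the Lisp version, only with method-call syntax instead of nested parentheses.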

That’s how most Lisp code looks. Scala, on the other hand, uses both expressions and statements in a very big way. Although the collections in scala.* are all immutable, I’ve seen a lot of code based on Arrays, which perform well but are mutable. An example written by Martin Odersky (author of Scala):

def sort(xs: Array[Int]): Unit = {
  // Swap two elements of the array in place.
  def swap(i: Int, j: Int): Unit = {
    val t = xs(i); xs(i) = xs(j); xs(j) = t
  }

  // Sort the slice xs(l) .. xs(r) in place.
  def sort1(l: Int, r: Int): Unit = {
    val pivot = xs((l + r) / 2)
    var i = l; var j = r

    while (i <= j) {
      while (xs(i) < pivot) i += 1
      while (xs(j) > pivot) j -= 1
      if (i <= j) {
        swap(i, j)
        i += 1
        j -= 1
      }
    }

    if (l < j) sort1(l, j)
    if (j < r) sort1(i, r)
  }

  sort1(0, xs.length - 1)
}

This is one way to implement a Quick Sort routine in Scala – It’s not the nicest way, but it works. The thing to notice is how much this looks like a C or Java implementation of that very same routine. Values are mutated, nested while-loops, the works. Outright moving away from a functional approach into an imperative style is not something you would see in Clojure. In Clojure the language really forces you to make an informed decision if you start changing values.
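
For contrast, here is a minimal sketch of what a functional Quick Sort could look like in Scala. It is not taken from Odersky’s text; the name sortFunctional and the choice of an immutable List are assumptions made for illustration, and it trades the in-place performance of the Array version for the absence of mutation.

```scala
object FunctionalSort {
  // Hypothetical name; sorts by building new lists instead of mutating an array.
  def sortFunctional(xs: List[Int]): List[Int] = xs match {
    case Nil => Nil
    case pivot :: rest =>
      // Split the remaining elements around the pivot.
      val (smaller, larger) = rest.partition(_ < pivot)
      sortFunctional(smaller) ::: pivot :: sortFunctional(larger)
  }
}
```

No value is ever reassigned here, which is much closer to how the same routine would be approached in Clojure.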

A second thing to notice is that Clojure has explicitly rejected the OOP approach to programming, because

  • It’s born of simulation, but now used everywhere for no good reason
  • It’s got mutability principles baked into it

Scala has done quite the opposite: it has wholeheartedly embraced OOP, making everything an object. That means that in Scala you have no primitives (like ints, chars, etc.); everything is an object. Martin Odersky says what he likes about OOP is that it makes it easy to

  • Adapt
  • Extend

…complex systems. I think he has a point in saying that complex systems are easy to extend when built on OOP. I’ll make the argument, though, that the complexity which comes from an OOP approach makes the building of the system unnecessarily difficult.

Through the rest of this post, keep in mind that some of the differences shouldn’t be viewed as a wrong/right discussion. Scala has a different approach to concurrency than Clojure and although my preference is clear, that doesn’t make Scala’s way wrong or inferior.

Static typing?

Yes, Scala has Static typing but this is coupled with type inference. In practice you need to specify types in your function definitions and return types (update: As RSchulz correctly points out the Scala compiler checks exit points and infers return types), but throughout most of your logic Scala will be able to discern which types you’re dealing with.
For example

scala> 2 + 2
res0: Int = 4

I ask the Scala REPL to calculate 2 + 2 without specifying that these are of type Int. It comes back with the first result of the day (res0) and says: It’s an Int! So that’s not unbearable the way C is, for instance. I would personally hate to have to specify types for all my functions while prototyping: first because I feel it’s unnecessary, and second because I want some of my functions to work with a multitude of types.
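
A small sketch of how this plays out beyond the REPL, under the assumption that a type parameter is an acceptable answer to the “multitude of types” concern. The names firstOr, doubled, word and number are hypothetical and chosen only for this illustration.

```scala
object Inference {
  // The element type A is a type parameter, so one definition serves many types.
  def firstOr[A](xs: Seq[A], default: A): A = xs.headOption.getOrElse(default)

  // No annotations are needed on these values; the compiler infers them.
  val doubled = List(1, 2, 3).map(_ * 2)          // inferred as List[Int]
  val word    = firstOr(List("a", "b"), "none")   // inferred as String
  val number  = firstOr(Seq.empty[Int], 42)       // inferred as Int
}
```

Only the function signature carries types; everything in the body and at the call sites is inferred.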

Before hearing from the Scala community I considered it a fundamental weakness in Scala that you have to submit to its type system. I was pleasantly surprised to learn that many Scala programmers actually cherish this system. One guy asked me “Haven’t you ever passed arguments in the wrong order and expected a different return than what you got?” and I’ll be honest: No, I haven’t. But for people who have a hard time getting argument order and return types right, this system is a great help. So it has its place, I must say.

Sorta functional?

This is a big point. Clojure is functional and pure, it protects you. If you want to mutate transients, for example, and one of those is accessed from an outside thread, you get an exception. Not so with Scala.

Scala is functional in the broadest sense of the word: It has immutable constructs. Scala does not in any way discourage its users from side-effects and mutable code. This is baaaaaaaaaad. Programmers sometimes need the box to be visible in order not to step out of it when it’s unnecessary. I believe this point alone is a good argument that Clojure, being more functional, will consistently produce more solid code than Scala.
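
To make the point concrete, here is a minimal sketch showing both styles side by side in Scala. All names (base, grown, buffer, counter) are hypothetical; the only claim is that the compiler accepts the mutable half as readily as the immutable half.

```scala
import scala.collection.mutable.ArrayBuffer

object Mutability {
  // Immutable style: "adding" builds a new list, the original is untouched.
  val base  = List(1, 2, 3)
  val grown = 0 :: base

  // Mutable style: equally easy to write, and nothing objects to it.
  val buffer = ArrayBuffer(1, 2, 3)
  buffer += 4

  var counter = 0
  counter += 1
}
```

Nothing in the second half stands out syntactically, which is exactly the invisible box described above.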

Let’s do an experiment!

I’ll walk you through the building of a factorial + benchmark in both languages. That’ll give you a feel for what it’s about.

Factorial

Written as: x!

Example 1: 5! = 5 * 4 * 3 * 2 * 1 = 120

Example 2: 0! = 1

That’s factorial in a mathematical sense, now let’s code it. Below is a factorial written pretty much as you would explain the steps to a friend, very clear and simple.

Clojure

(defn factorial  [x]
  (if (zero? x)
    1
    (* x (factorial (dec x)))))

Scala

def factorial(x: BigInt): BigInt = {
  if (x == 0)
    1
  else
    x * factorial(x - 1)
}

So there you have 2 versions, doing the exact same thing. Take an argument x, if x = 0 return 1, if not then return x * the factorial of x – 1. This recurses like so

(factorial 5)
(5   * (factorial 4))
(5 * 4  * (factorial 3))
(5 * 4 * 3  * (factorial 2))
(5 * 4 * 3 * 2 * (factorial 1))
(5 * 4 * 3 * 2 * 1 * (factorial 0))
(5 * 4 * 3 * 2 * 1 * 1)
---------------------
120

So the result is correct, but because we keep building up the stack, calculating 3600! will blow it. Not wanting to look bad, we need to fix this and manually handle our stack. But before we do, please notice how readable the above code is. Anyone with some basic coding skills can look at that and see what’s going on. We don’t want to lose this as we move on.

To fix this, we add an accumulator as an argument so we don’t build up the stack. One way to do this, would look like this:

(defn factorial
  ([x] (factorial 1 x))
  ([acc x]
     (if (zero? x)
       acc
       (recur (* x acc) (dec x)))))

def factorial(x: Int): Int = factorial(x, 1)

def factorial(x: Int, acc: Int): Int = x match {
    case 0 => acc
    case n => factorial(n - 1, x * acc)
}

We are still concise on both parts. Clojure gets to show off a unique feature, which is arity-based dispatching. That means when my function gets 1 argument, it automatically calls itself with an extra argument to supply the accumulator. This is a clever way of embedding helper functions, which are specific to a particular function, inside the main logic.
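
Scala has no arity-based dispatch as such, but a default argument gets reasonably close. The following is a sketch, not a version used in the benchmark: the object name FactorialSketch is hypothetical, and BigInt is assumed for the accumulator so that large inputs don’t overflow.

```scala
import scala.annotation.tailrec

object FactorialSketch {
  // One definition; the accumulator defaults to 1 when the caller omits it.
  // @tailrec makes the compiler reject the method if it is not tail-recursive.
  @tailrec
  def factorial(x: Int, acc: BigInt = 1): BigInt =
    if (x == 0) acc else factorial(x - 1, acc * x)
}
```

Calling FactorialSketch.factorial(5) supplies the accumulator implicitly, much like the one-argument arity in the Clojure version.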

Scala uses a general switch-case construct (pattern matching) which simply checks whether x = 0 and then acts as before. For Clojure we still have code which looks like our original function, but the Scala version has changed a bit. What both have in common is that these implementations are not idiomatic for either language. To get back to the business of manipulating sequences, I’ll show you both of these implemented as idiomatic one-liners that don’t blow the stack:

(defn factorial [x]
  (reduce * (range 1 (inc x))))

def factorial(x: Int): BigInt =
  (1 to x).foldLeft(1: BigInt)(_ * _)

Decide for yourselves what would suit your style of code best. I think both are classy examples.

The Benchmark

Alright, so we’ve written code which earns us the envy of the world, but how does it perform? Firstly, microbenchmarking is an art in itself, and oftentimes a pointless one, so don’t read more into the results than you should! Secondly, this exercise mostly served the purpose of seeing how short the distance was from the idea of doing a benchmark to getting the actual results.

One of the things I’ve always loved about Lisp is the awesome velocity you work with. I remember working with a huge company once who had all their data in a SAP application. In the process of our project we had to run through several hundred thousand lines of data from them. Within minutes of receiving their data I had written an analyzer which output the flaws in their data. I sent it back and it took them days to reach the same results. This is Lisp, can Scala compete?

Off the top of my head I know I’ll want to run about 200 hits on factorial(5000) and calculate the average cpu time. That’ll give us real computations (and not just cached results) and it will also put us through a round of garbage collection. Since Clojure has macros (functions where I control the evaluation myself), it’s very simple to get the time of a computation. There is a time-macro in core and it looks something like this:

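(defmacro time
  [expr]
  ;; start# and ret# are auto-generated symbols; ~expr splices in the timed form
  `(let [start# (. System (nanoTime))
         ret#   ~expr]
     (prn (str "Elapsed time: "
               (/ (double (- (. System (nanoTime)) start#)) 1000000.0)
               " msecs"))
     ret#))

(docstring omitted)

So that’s very simple. Record the start time, evaluate the expression, subtract the start time from the current time and print the difference. Output is like so: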

user> (time (+ 2 2))
"Elapsed time: 0.093447 msecs"
4

Alright, for the purpose of my little benchmark this is not optimal because I need to work with the numbers (i.e. calculate an average) and not have them printed, but fortunately with-out-str rebinds *out* for me, so benchmarking is actually very straightforward:

user> (let [values  (for [i (range 200)] (with-out-str (time (fac 5000))))
            doubles (map #(Double. (nth (.split % " ") 2)) values)]
        (println "Average cpu time: " (/ (apply + doubles)
                                         (count doubles))))
Average cpu time:  143.15819786499998

Line by line, this is what happens:

  • values gets assigned the value of 200 runs on (time (fac 5000)) which is something like “Elapsed time: 150.0000 msecs” “Elapsed time: 165.0000 msecs” “Elapsed time: 144.1234 msecs” etc.
  • doubles gets assigned the result of converting the 3rd element of each string, split on spaces, to a Double. The split will be an array like [“Elapsed” “time:” “144.123” “msecs”], third element being “144.123”
  • Finally I print the result of applying + to all elements (meaning to take the sum) and dividing by the number of elements, which is the average.
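
The parsing steps above can be sketched in Scala as well, to show there is nothing Lisp-specific about them. The three sample strings are hypothetical values in the shape described in the list, not measurements, and the names values, doubles and average simply mirror the Clojure bindings.

```scala
object AverageSketch {
  // Hypothetical captured outputs, shaped like the strings described above.
  val values = Seq(
    "\"Elapsed time: 150.0 msecs\"",
    "\"Elapsed time: 165.0 msecs\"",
    "\"Elapsed time: 144.0 msecs\"")

  // Split on spaces and parse the third token (index 2) as a Double.
  val doubles = values.map(_.split(" ")(2).toDouble)

  // Sum divided by count gives the average.
  val average = doubles.sum / doubles.size
}
```

With these three made-up samples the average comes out at 153.0.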

Clojure weighs in at

132 msecs

Now for Scala. I didn’t want to be presumptuous, so I requested the help of the Scala community in doing this benchmark. Scala is not as straightforward as Clojure in this regard, because it doesn’t have macros. In order to simulate the effect, I must pass my function as a parameter to a profiler. In the current situation it’s not a big problem since we’re passing 1 function; it would have been another story if we were passing an entire body of code.

def time(n: Int, fun: () => Unit) = {
    val start = System.nanoTime
    (0 until n).toList.foreach((_: Int) => fun())
    ((((System.nanoTime - start): Double) / (n: Double)) / 1000000.0)
}

val t = time(200, () => fact(5000))
println("Runtime: " + t + "ms")

Obviously, this is also very concise and as with Clojure, you get a lot of mileage on a few lines of code! Being run outside Emacs, Scala weighs in at:

130 msecs

So in this small example they are 2 msecs apart, leading me to believe they compile to almost exactly the same bytecode. It’s worth noting, though, that Scala’s ‘Range’ isn’t lazy where Clojure’s sequence is. In theory that should have had a very bad effect on performance, but Scala seems to be able to hold its own.
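
As for timing an entire body of code without macros, Scala’s by-name parameters go a long way. The sketch below is an assumption-laden illustration rather than the profiler used for the numbers above: timed is a hypothetical helper that takes an unevaluated block, runs it once, and returns both the result and the elapsed milliseconds.

```scala
object Timing {
  // `body` is a by-name parameter: it is evaluated inside `timed`, not at the call site.
  def timed[A](body: => A): (A, Double) = {
    val start  = System.nanoTime
    val result = body
    val millis = (System.nanoTime - start) / 1000000.0
    (result, millis)
  }
}
```

A call such as Timing.timed { (1 to 1000).sum } then reads almost like the Clojure macro, and because the elapsed time comes back as a number there is no need to capture and parse printed output.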

Scala Outline

Now you’ve seen a bit of Scala and I hope you’re intrigued like I was when I first stumbled upon it. There are a few things which I don’t like.

When I first entered #scala someone said “Martin broke the build again?”. The next night when I logged on another said “Hmm, the build looks broken?” and this goes on and on. I suppose it’s not criminally bad that a development build is broken, but it just doesn’t happen with Clojure. Finally – Since nobody seems to be in regular contact with Martin Odersky, no immediate action was/could-be taken. (at the time of writing this, it just got fixed I think)

Secondly, it was decided that “;” semicolons at the end of each line should not be mandatory – I think because it’s an unnecessary ceremony – but could instead be sprinkled throughout the code. This has led to several examples I’ve come across where people have put an abundance of code on one line, killing clarity:

var y = 5; val x = (y + 5) + 10; y += 2;

I realize though, that to bring this up as a complaint is almost a compliment.

I’m also not a fan of OOP and Scala (the scalable language) gives me a way out by also letting me write small scripts where there are no objects in the user-code.

Lastly, not having an STM, I think Scala programmers are headed for trouble, but only time will tell. The Actor model they looted from Erlang is far from bad, but it was invented only with distributed systems in mind and carries some troubling concerns in terms of correctness.

Lazy vs Strict

Finally, Clojure does lazy evaluation by default, Scala evaluates strictly. In practice this means that when I do this in Clojure

user> (class (range 1000))
clojure.lang.LazySeq

A series of computations are lined up but are never performed – Nothing is computed! If I take out the first 5 items of that LazySeq then those 5 items get computed, nothing else: That’s lazy! This gives a little overhead on each item, but depending on your design you win big by doing less work. It’s not lazy like Haskell though; the difference is that our laziness rests on LazySeq, but that’s a topic for another talk.

In Scala it’s different:

scala> (1 until 1000)
res1: Range = Range(1, 2, 3, 4, 5, 6, 7, 8, 9, ..., 999)

Everything is calculated. This doesn’t divide the camps though, you can still do lazy evaluation in Scala, it’s just not the default. (update: Range is in fact lazy)
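
Here is a minimal sketch of opting into laziness in Scala, using only the standard library. The names (evaluated, firstFive, expensive, lastOfRange) are hypothetical; the counter exists purely to show how much work is actually done.

```scala
object LazySketch {
  // Counts how many elements are actually computed.
  var evaluated = 0

  // Iterator.from(1) is an endless source; map on an Iterator is lazy,
  // so only the five elements that are taken get squared.
  val firstFive: List[Int] =
    Iterator.from(1).map { n => evaluated += 1; n * n }.take(5).toList

  // A lazy val is not evaluated until it is first used.
  lazy val expensive: Int = { evaluated += 100; 42 }

  // `until` excludes its upper bound.
  val lastOfRange: Int = (1 until 1000).last
}
```

Taking five squares from an endless iterator bumps the counter exactly five times, and the lazy val contributes nothing until it is read.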

The Community

Ok, so Scala gets a 3.5 star rating. I deducted 1 star from Scala because Martin Odersky doesn’t take part in the community. I asked how people got in touch with him and I got comments like “He contacted me on Skype once”, “He uses white rats as messengers” etc. He’s not involved and interacting with the community in the same way Rich Hickey (author of Clojure) is, and that really makes all the difference. The last 0.5 star I deducted was because I actually got really unfriendly messages from some of the members – not Common Lisp style evil – but getting there. In all the time I’ve spent on #clojure I remember 1 single instance where somebody dropped a rude childish remark, and that person was immediately corrected by Rich Hickey – So the modesty and etiquette which Rich nurtures is not found on #scala on the same scale – and that’s a real shame.

Why does Clojure then get the full 5 stars? People are as friendly there as on #scala. I’ve often joked that we have no FAQ, we have Chouser (Chris Houser), but there’s some truth to it. Almost every question gets answered, and everyone gets help regardless of the level at which they ask. Secondly, Rich Hickey attends daily, taking debates, giving advice, educating the people. I’ve even seen him do a public code-review on the Google group, which was just an amazing piece of consultant assistance that we got for free. And lastly – never, ever does the language/tone/etiquette drop below common decency and friendliness.

Conclusion

Scala is awesome, Clojure is awesome. Both are breaking new ground in Software Development. If you thrive on OOP and need Static Typing then Scala is for you. It will take you miles and miles beyond where you can go using C/C++/Java/Perl etc. It’s a very welcome addition to the scene of JVM languages.

Both will get you a long way with concurrent programming. I can’t say for now which will go the distance, but I give the STM the best odds. On the recommendation of a Scala community member, I’ll consider doing a post looking only at Actors Vs STM.

If you’re not keen on OOP and can manage a large project without types and leverage the power of Lisp, Clojure is for you. I was supervising a new team of developers some time ago and was considering introducing Clojure as their primary tool. I described the scenario to Rich Hickey and asked for his advice. He said something like this: “If you have a small and sharp team, you should consider it. If not, it’s probably not for you”. Lisp isn’t for everybody and the sooner you reach that conclusion, the better.

In a business setting, what would I use? Well, as I said, Clojure wins big points with macros, which let you control evaluation and elegantly build DSLs, greatly speeding up the entire work process. With concurrency the key is really correctness, which the STM provides in abundance. If you can harness the power of Lisp, by all means do so! Clojure is the businessman’s best friend. However! If your team is not able to adopt and fully appreciate Clojure in all its aspects, I would not hesitate a moment in recommending Scala.

/Lau

note: This article has a sequel

About the author

Lau Jensen is the owner of Best In Class, an avid Clojure Developer well experienced in project management and also the primary author of ClojureQL. You are very welcome to follow his posts on this blog or get in touch directly on lau.jensen@bestinclass.dk

  • aldrich wright

    Can’t you do STM with Scala using Akka? As a matter of fact, ScalaSTM is soon going to be added to the standard Scala library, if i’m not mistaken…..

  • pdxdan

    The ‘sort’ function example you posted by Martin Odersky was from his “Scala by Example” paper as an example of how not to write Scala. The following page of that document shows the much shorter, elegant functional programming approach.
    http://www.scala-lang.org/docu/files/ScalaByExample.pdf

    • Chen “Harvey” Youfu

      That’s impressive

  • salc2

    I think there is a mistake in the Scala example the inner sort1 function invocation sort1(0, xs.length 1) you missed “-”

  • timcharper

    Scala has hygienic macro support. Also, Clojure is not lazy eval, just the lazy sequences are. In which case Scala is just as lazy. Also Scala has `lazy val`.

    Both Scala and Clojure are impure functional languages. Calling Scala “sorta functional”, but Clojure “functional” is not a correct comparison.

    • JeanClaude

      He should have called Clojure strictly functional, and Scala multi-paradigm with functional and OOP.

  • ssmoot

    I realize this is very old. And overall a good overview if a bit biased towards Clojure (which who isn’t? I don’t mean to offend).

    But for anyone reading this, please read pdxdan’s comment.

    Also, there’s a few style issues with the Scala. All the explicit typecasts are rather un-idiomatic. And the only place you’d really use Range.this.toList is in the REPL. (Exactly so you preserve laziness, but also just because it’s shorter.) So I’d submit the following tweaked version for `time`:


    def time(n: Int, fun: => Unit) = {
      val start = System.nanoTime
      1 to n foreach(_ => fun)
      (System.nanoTime - start) / n / 1000000.0
    }

    Then there’s the factorial implementation. You’ll never find anyone who loves pattern-matching more than me, but even I would think twice about that implementation. Pattern matching a method argument almost always means you’ve called the wrong method, or you should’ve written a Function1 instead. IMO. But I’ll leave that alone here.

    And this is really bike-shedding, but I think it’s fair to say the canonical factorial example is tail-recursive. Whenever you’re using a recursive function in fact it’s rather idiomatic to consider whether an accumulator is appropriate. And if so, then a nested function is the idiomatic choice IME as you don’t want to bloat your public API with methods never intended for public consumption. Also, accumulators appear on the left by convention.

    Anyways, that’d look something like this:


    def factorial(n: Int): BigInt = {
      @scala.annotation.tailrec
      def factorial(acc: BigInt, n: Int): BigInt = n match {
        case 0 => acc
        case _ => factorial(acc * n, n - 1)
      }

      factorial(1, n)
    }

  • gurghet

    1 until 1000 is 1 to 999 actually not 1 to 1000

  • Rickety Janes

    Nah, even a noob can get a conversation started, then the good ol’ smarties like you can show up and show us how it’s done…because generally when you smarties write some article ‘for beginners’, it’s incomprehensible what the point is.

  • Rickety Janes

    Or you could just use FSharp and actually get work done wiring SQL databases to math processing to displaying in GUIs…

    • cloxure

      really?

      • Rickety Janes

        oh ya, jijo…

  • JimTheMan

    +1 for functional programming. :)

  • cloxure

    I will write the factorial example like this:

    (defn fact [x]
      (reduce *' (range 1 (inc x))))

    note *' to automatically promote the parameters and the result as bigint.

    using the factorial clojure example (using recursion) will cause a stack error.