Scala Vs Clojure - Let's get down to business

2009-09-14 13:49:53

Of the new languages that are emerging these days, no two are as interesting as Scala and Clojure. Both claim to be functional and geared for concurrency; one is a Lisp, the other a curly-brace language. On paper they stack up fairly well against each other, so let's investigate how well they are suited for business.

The Facts

I've drawn up this table, which compares Scala and Clojure, feature for feature, pound for pound.

Clojure             Scala
Lisp-1
Macros
Functional          Sorta functional
STM                 Actor model
Multimethods        Object Oriented
Dynamically typed   Statically typed
Lazy evaluation     Strict evaluation

So there are certainly big differences between these two languages. Clojure, being a Lisp, is read as a series of nested expressions, each passing its data upstream, like

(myfunc
 (first
  (iterate inc 1)))

You read that backwards,

  1. Generate a stream of all integers (by iterating inc) starting with 1
  2. Take the first from the sequence (that'll be 1)
  3. Pass that to myfunc
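
For comparison, the same three steps can be sketched in Scala. This is only an illustration under a few assumptions: it uses LazyList, which needs Scala 2.13 or later (older versions offered Stream instead), and myfunc is a hypothetical stand-in, since the snippet above never defines it.

```scala
// Sketch only: LazyList requires Scala 2.13+.
object LazyFirst {
  // Hypothetical stand-in for `myfunc` in the Clojure snippet.
  def myfunc(n: Int): Int = n * 10

  // Step 1: an infinite, lazily evaluated sequence 1, 2, 3, ...
  // (the counterpart of `(iterate inc 1)`).
  val naturals: LazyList[Int] = LazyList.iterate(1)(_ + 1)

  // Steps 2 and 3: take the first element and pass it to myfunc.
  val result: Int = myfunc(naturals.head)

  def main(args: Array[String]): Unit = println(result)
}
```

Only the head of the sequence is ever computed, which is why building an "infinite" sequence is harmless here.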

That's how most Lisp code looks. Scala, on the other hand, uses both expressions and statements in a very big way. Although the collections in scala.* are all immutable, I've seen a lot of code based on Arrays, which perform well but are mutable. Here is an example written by Martin Odersky (author of Scala):

def sort(xs: Array[Int]) {
  def swap(i: Int, j: Int) {
    val t = xs(i); xs(i) = xs(j); xs(j) = t
  }
  def sort1(l: Int, r: Int) {
    val pivot = xs((l + r) / 2)
    var i = l; var j = r
    while (i <= j) {
      while (xs(i) < pivot) i += 1
      while (xs(j) > pivot) j -= 1
      if (i <= j) {
        swap(i, j)
        i += 1
        j -= 1
      }
    }
    if (l < j) sort1(l, j)
    if (j < r) sort1(i, r)
  }
  sort1(0, xs.length - 1)
}

This is one way to implement a Quick Sort routine in Scala. It's not the nicest way, but it works. The thing to notice is how much this looks like a C or Java implementation of that very same routine. Values are mutated, while-loops are nested, the works. Moving outright from a functional approach into an imperative style is not something you would see in Clojure. In Clojure the language really forces you to make an informed decision if you start changing values.
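
For contrast, Scala does not force the imperative style. A minimal sketch of a functional Quick Sort over an immutable List might look like the following; the object name FunctionalSort is made up for this illustration, and the version is not in-place, so it trades memory for clarity.

```scala
object FunctionalSort {
  // Quicksort on an immutable List: no vars, no in-place swaps.
  def sort(xs: List[Int]): List[Int] = xs match {
    case Nil => Nil
    case pivot :: rest =>
      // Split the remaining elements around the pivot.
      val (smaller, larger) = rest.partition(_ < pivot)
      // Sort both halves and glue them back together around the pivot.
      sort(smaller) ::: pivot :: sort(larger)
  }
}
```

A call such as FunctionalSort.sort(List(3, 1, 2)) returns a new sorted list and leaves its argument untouched.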

A second thing to notice is that Clojure has explicitly rejected the OOP approach to programming, because

Scala has done quite the opposite: it has wholeheartedly embraced OOP, making everything an object. That means that in Scala you have no primitives (like ints, chars, etc.); everything is an object. Martin Odersky says what he likes about OOP is that it makes it easy to

...complex systems. I think he has a point in saying that complex systems are easy to extend when built on OOP. I'll make the argument, though, that the complexity which comes from an OOP approach makes the building of the system unnecessarily difficult. But that's another story.

Through the rest of this post, keep in mind that some of the differences shouldn't be viewed as a wrong/right discussion. Scala has a different approach to concurrency than Clojure and although my preference is clear, that doesn't make Scala's way wrong or inferior.

Static typing?

Yes, Scala has static typing, but it is coupled with type inference. In practice you need to specify types in your function definitions and return types (update: as RSchulz correctly points out, the Scala compiler checks exit points and infers return types), but throughout most of your logic Scala will be able to discern which types you're dealing with. For example

scala> 2 + 2

res0: Int = 4

I ask the Scala REPL to calculate 2 + 2 without specifying that these are of type Integer. It comes back with the first result of the day (res0) and says: It's an Int! So that's not unbearable the way C is, for instance. I would personally hate to have to specify types for all my functions: first because I feel it's unnecessary, and second because I want most of my functions to work with a multitude of types.
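
To make the inference point concrete, here is a small sketch. The names add and firstOr are invented for this example; the only claims are the ones in the comments: parameter types are written out, while return types and local values are inferred, and a type parameter lets one function serve many types.

```scala
object Inference {
  // Parameter types must be written out; the return type is inferred as Int.
  def add(a: Int, b: Int) = a + b

  // Local values need no annotations at all.
  val total = add(2, 2)              // inferred: Int
  val words = List("a", "bb")        // inferred: List[String]
  val lengths = words.map(_.length)  // inferred: List[Int]

  // A type parameter lets one function work with many element types.
  def firstOr[A](xs: List[A], default: A): A =
    xs.headOption.getOrElse(default)
}
```

Generic functions like firstOr are one way to get some of the "multitude of types" flexibility while keeping the compiler's checks.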

Before hearing the Scala community I considered it a fundamental weakness in Scala that you have to submit to its type system. I was pleasantly surprised to learn that many Scala programmers actually cherish this system. One guy asked me "Haven't you ever passed arguments in the wrong order and expected a different return than what you got?" and I'll be honest: No, I haven't. But for people who have a hard time getting argument order and return types right, this system is a great help. So it has its place, I must say.

Sorta functional?

This is a big point. Clojure is functional and pure, it protects you. If you want to mutate transients, for example, and one of those is accessed from an outside thread, you get an exception. Not so with Scala.

Scala is functional in the broadest sense of the word: It has immutable constructs. Scala does not in any way discourage its users from side effects and mutable code. This is baaaaaaaaaad. Programmers sometimes need the box to be visible in order not to step out of it when it's unnecessary. I believe this point standing alone is a good argument that Clojure, being more functional, will consistently produce more solid code than Scala.
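
A small sketch of what that means in practice: the immutable and the mutable style sit side by side in Scala, and the compiler accepts both without complaint. The object and member names are invented for this illustration.

```scala
import scala.collection.mutable.ArrayBuffer

object Mutability {
  // Immutable by choice: `val` plus an immutable List.
  val xs = List(1, 2, 3)
  val ys = 0 :: xs              // builds a new list; xs is unchanged

  // Equally legal: `var` plus a mutable buffer, no ceremony required.
  var counter = 0
  val buffer = ArrayBuffer(1, 2, 3)

  def mutate(): Unit = {
    counter += 1                // reassigns the variable
    buffer += 4                 // mutates the buffer in place
  }
}
```

Nothing in the language marks mutate() as having side effects; keeping the two styles apart is left to the programmer's discipline.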

Let's do an experiment!

I'll walk you through the building of a factorial + benchmark in both languages. That'll give you a feel for what it's about.

Factorial

Written as: x!

Example 1: 5! = 5 * 4 * 3 * 2 * 1 = 120

Example 2: 0! = 1

That's factorial in a mathematical sense, now let's code it. Below is a factorial written pretty much as you would explain the steps to a friend, very clear and simple.

Clojure


(defn factorial  [x]
  (if (zero? x)
    1
    (* x (factorial (dec x)))))

Scala


def factorial(x: BigInt): BigInt = {
  if (x == 0)
    1
  else
    x * factorial(x - 1)
}

So there you have two versions doing the exact same thing: take an argument x; if x = 0 return 1, otherwise return x * the factorial of x - 1. This recurses like so

(factorial 5)
(5   * (factorial 4))
(5 * 4  * (factorial 3))
(5 * 4 * 3  * (factorial 2))
(5 * 4 * 3 * 2 * (factorial 1))
(5 * 4 * 3 * 2 * 1 * (factorial 0))
(5 * 4 * 3 * 2 * 1 * 1)
---------------------
120

So the result is correct, but because we build up the stack, calculating 3600! will blow it. Not wanting to look bad, we need to fix this and manually handle our stack. But before we do, please notice how readable the above code is. Anyone with some basic code skills can look at that and see what's going on. We don't want to lose this as we move on.

To fix this, we add an accumulator as an argument so we don't build up the stack. One way to do this looks like this:

(defn factorial
  ([x] (factorial 1 x))
  ([acc x]
     (if (zero? x)
       acc
       (recur (* x acc) (dec x)))))

// The accumulator is a BigInt so large factorials don't overflow an Int.
def factorial(x: Int): BigInt = factorial(x, BigInt(1))

def factorial(x: Int, acc: BigInt): BigInt = x match {
    case 0 => acc
    case n => factorial(n - 1, acc * n)
}

We are still concise on both parts. Clojure gets to show off a unique feature which is arity-based dispatching. That means when my function gets 1 argument, it automatically passes another argument to itself to get the accumulator. This is a clever way of embedding helper functions, which are specific to a particular function, inside the main logic.

Scala uses a general pattern match (its take on switch/case) which simply checks whether x is 0 and otherwise acts as before. Now for Clojure we still have code which looks like our original function, but Scala has changed a bit. What both have in common is that these implementations are not idiomatic for either language. To get back to the business of manipulating sequences, I'll show you both of these implemented as idiomatic one-liners that don't blow the stack:

(defn factorial [x]
  (reduce * (range 1 (inc x))))

def factorial(x: Int): BigInt =
  (1 to x).foldLeft(1: BigInt)(_ * _)

Decide for yourselves what would suit your style of code best. I think both are classy examples :)
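
As a side note on the accumulator version, Scala can keep the helper private to the function and have the compiler verify the tail call. The sketch below assumes Scala 2.8 or later for the @tailrec annotation; the names TailFactorial and loop are invented for the example.

```scala
import scala.annotation.tailrec

object TailFactorial {
  def factorial(x: Int): BigInt = {
    // The helper is local, so callers cannot reach the accumulator.
    // @tailrec makes compilation fail if the call is not a tail call.
    // Non-positive input simply returns the accumulator (1).
    @tailrec
    def loop(n: Int, acc: BigInt): BigInt =
      if (n <= 0) acc else loop(n - 1, acc * n)

    loop(x, BigInt(1))
  }
}
```

Because the recursive call is compiled to a loop, TailFactorial.factorial(5000) runs in constant stack space.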

The Benchmark

Alright, so we've written code which earns us the envy of the world, but how does it perform? Firstly, microbenchmarking is an art in itself, and often a pointless one, so don't read more into the results than you should! Secondly, this exercise mostly served to show how short the distance is from the idea of doing a benchmark to getting the actual results. One of the things I've always loved about Lisp is the awesome velocity you work with. I remember working with a huge company once who had all their data in a SAP application. In the process of our project we had to run through several hundred thousand lines of data from them. Within minutes of receiving their data I had written an analyzer which output the flaws in their data. I sent it back and it took them days to reach the same results. This is Lisp, can Scala compete?

Off the top of my head I know I'll want to run about 200 hits on factorial(5000) and calculate the average cpu time. That'll give us real computations (and not just cached results) and it will also put us through a round of garbage collection. Since Clojure has macros (functions where I control the evaluation myself), it's very simple to get the time of a computation. There is a time macro in core and it looks something like this:

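(defmacro time
  [expr]
  `(let [start# (System/nanoTime)   ; record the start time
         ret#   ~expr]              ; evaluate the expression being timed
     (prn (str "Elapsed time: "
               (/ (double (- (System/nanoTime) start#)) 1000000.0)
               " msecs"))
     ret#))                         ; return the expression's value

(lightly simplified)

So that's very simple: record the start time, evaluate the expression, subtract the start time from the current time. Output is like so: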

user> (time (+ 2 2))
"Elapsed time: 0.093447 msecs"
4

Alright, for the purpose of my little benchmark this is not optimal because I need to work with the numbers (i.e. calculate an average) and not have them printed, but fortunately with-out-str rebinds *out* for me, so benchmarking is actually very straightforward:

user> (let [values  (for [i (range 200)] (with-out-str (time (factorial 5000))))
            doubles (map #(Double. (nth (.split % " ") 2)) values)]
        (println "Average cpu time: " (/ (apply + doubles)
                                         (count doubles))))
Average cpu time:  143.15819786499998

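Line by line, this is what happens:

  1. Run the timed factorial call 200 times; with-out-str captures what time prints, so values is a sequence of strings
  2. Split each string on spaces, take the third piece (the number) and parse it as a Double
  3. Sum the doubles, divide by their count and print the average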

Clojure weighs in at

132 msecs

Now for Scala. I didn't want to be presumptuous, so I requested the help of the Scala community in doing this benchmark. Scala is not as straightforward as Clojure in this regard, because it doesn't have macros. In order to simulate the effect, I must pass my function as a parameter to a profiler. In the current situation it's not a big problem since we're passing 1 function; it would have been another story if we were passing an entire body of code.

def time(n: Int, fun: () => Unit) = {
    val start = System.nanoTime
    (0 until n).toList.foreach((_: Int) => fun())
    ((((System.nanoTime - start): Double) / (n: Double)) / 1000000.0)
}

val t = time(200, () => factorial(5000))
println("Runtime: " + t + "ms")

Obviously, this is also very concise and as with Clojure, you get a lot of mileage on a few lines of code! Being run outside Emacs, Scala weighs in at:

130 msecs

So in this small example they are 2 msecs apart, leading me to believe they compile to almost exactly the same bytecode. It's worth noting, though, that Scala's 'Range' isn't lazy where Clojure's sequence is; in theory that should have had a very bad effect on our performance, but it seems to be able to hold its own.
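
On the point about passing an entire body of code: Scala's by-name parameters get fairly close to what the macro does here. The following is only a sketch with invented names (Timing, averageMillis), not the profiler used for the numbers above.

```scala
object Timing {
  // `body: => A` is a by-name parameter: the expression is re-evaluated
  // every time `body` is referenced, so any block of code can be passed in.
  def averageMillis[A](runs: Int)(body: => A): Double = {
    val start = System.nanoTime
    var i = 0
    while (i < runs) { body; i += 1 }
    // Total elapsed nanoseconds, converted to milliseconds per run.
    (System.nanoTime - start) / 1e6 / runs
  }
}
```

With this in place a call reads almost like the Clojure version, for example Timing.averageMillis(200) { factorial(5000) }, assuming a factorial like the ones defined earlier is in scope.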

Scala Outline

Now you've seen a bit of Scala and I hope you're intrigued like I was when I first stumbled upon it. There are a few things which I don't like.

When I first entered #scala someone said "Martin broke the build again?". The next night when I logged on another said "Hmm, the build looks broken?" and this goes on and on. I suppose it's not criminally bad that a development build is broken, but it just doesn't happen with Clojure. Finally - Since nobody seems to be in regular contact with Martin Odersky, no immediate action was/could-be taken. (at the time of writing this, it just got fixed I think)

Secondly, it was decided that semicolons (";") at the end of each line are not mandatory - I think because they're an unnecessary ceremony - but can instead be sprinkled throughout the code. This has led to several examples I've come across where people have put an abundance of code on one line, killing clarity:

var y = 5; val x = (y + 5) + 10; y += 2;

I realize though, that to bring this up as a complaint is almost a compliment.

I'm also not a fan of OOP and Scala (the scalable language) gives me a way out by also letting me write small scripts where there are no objects in the user-code.

Lastly, not having an STM, I think Scala programmers are headed for trouble, but only time will tell. The Actor model they looted from Erlang is far from bad, but it doesn't suit my style and their concerns seem to be on performance where mine are on correctness - I think that's important when we're talking concurrency.

Lazy vs Strict

Finally, Clojure does lazy evaluation by default, Scala evaluates strictly. In practice this means that when I do this in Clojure

user> (range 1000)
clojure.lang.LazySeq

A series of computations is lined up but never performed - nothing is computed! If I take out the first 5 items of that LazySeq then those 5 items get computed, nothing else: that's lazy! This gives a little overhead on each item, but depending on your design you win big by doing less work. It's not lazy like Haskell though; the difference is that our laziness rests on LazySeq, but that's a topic for another talk.

In Scala it's different:

scala> (1 until 1000)
res1: Range = Range(1, 2, 3, 4, 5, 6, 7, 8, 9, ..., 999)

Everything is calculated. This doesn't divide the camps though, you can still do lazy evaluation in Scala, it's just not the default. (update: Range is in fact lazy)
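
Two of the standard ways to opt in to laziness in Scala can be sketched like this. The names are invented for the illustration; lazy val and view are part of the language and standard library.

```scala
object OptInLaziness {
  // Counts how often the lazy value's initializer actually runs.
  var evaluated = 0

  // `lazy val` defers the right-hand side until first access, then caches it.
  lazy val expensive: Int = { evaluated += 1; 42 }

  // A view defers `map` until elements are demanded, so only the
  // first five squares are ever computed here.
  val firstSquares: List[Int] =
    (1 until 1000).view.map(n => n * n).take(5).toList
}
```

Reading expensive twice runs its initializer only once, and the view never touches the remaining elements of the range.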

The Community

Ok, so Scala gets a 3.5 star rating. I reached that conclusion because that crowd is generally very friendly and helpful. After spending a short time on #scala it seems that RSchulz, DRMcIver, jesnor, mapreduce, eikki and a few others are pillars of the community and willing to lend a hand when you need one. I asked theoretical and practical questions on both #clojure and #scala and generally got more well-rounded answers in #clojure. I deducted 1 star from Scala because Martin Odersky doesn't attend the channel. I asked how people got in touch with him and I got comments like "He contacted me on Skype once", "He uses white rats as messengers" etc. He's not involved and interacting with the community in the same way Rich Hickey (author of Clojure) is, and that really makes all the difference. The last 0.5 star I deducted because I actually got really unfriendly messages from some of the members - not Common Lisp style evil - but getting there. In all the time I've spent on #clojure I remember 1 single instance where somebody dropped a rude, childish remark, and that person was immediately corrected by Rich Hickey. So the modesty and etiquette which Rich nurtures are not found on #scala to the same degree - and that's a real shame.

Finally, by implication, Martin Odersky can't take public debates live on IRC like Rich does when he's considering features/extensions. He does however have some activity on certain mailing lists.

Why does Clojure then get the full 5 stars? People are as friendly there as on #scala. I've often joked saying we have no FAQ we have Chouser (Chris Houser), but there's some truth to it. Almost every question gets answered, everyone gets help regardless of the level on which you ask. Secondly Rich Hickey attends daily, taking debates, giving advice, educating the people. I've even seen him do a public code-review on the Google group which was just an amazing piece of consultant assistance that we got for free. And lastly - Never ever, does the language/tone/etiquette drop below common decency and friendliness.

Conclusion

Scala is awesome, Clojure is awesome. Both are breaking new ground in software development. If you thrive on OOP and need static typing then Scala is for you. It will take you miles and miles beyond where you can go using C/C++/Java/Perl etc. It's a very welcome addition to the scene of JVM languages.

Both will get you a long way with concurrent programming. I can't say for now who will go the distance, but I give the STM the best odds. On the recommendation of a Scala community member, I'll consider doing a post only looking at Actors Vs STM.

If you're not keen on OOP and can administer a large project without types and leverage the power of Lisp, Clojure is for you. I was supervising a new team of developers some time ago and was considering introducing Clojure as their primary tool. I described the scenario to Rich Hickey and asked for his advice, and he said something like this: "If you have a small and sharp team, you should consider it. If not, it's probably not for you". Lisp isn't for everybody and the sooner you reach that conclusion, the better.

In a business setting, what would I use? Well, as I said, Clojure wins big points with macros, which let you control evaluation and elegantly build DSLs, greatly speeding up the entire work process. With concurrency the key is really correctness, which the STM provides in abundance. If you can harness the power of Lisp, by all means do so! Clojure is the businessman's best friend. However! If your team is not able to adopt and fully appreciate Clojure in all its aspects I would not hesitate a moment in recommending Scala.

/Lau

note: This article has a sequel

dreish
2009-09-16 01:29:59
Your factorial microbenchmark is mostly measuring the performance of java.math.BigInteger. Did you remember to profile it?
Richard Vowles
2009-09-16 06:37:33
"no two are as interesting as Scala and Clojure"???? maybe to you - those two code samples show just how *nasty* those two languages are. If they are *that* nasty for simple factorial code, give me jjava anytime.
bestinclass
2009-09-16 07:52:13
@Richard: You've got it! :) Like the conclusion states, Lisp isn't for everybody and the same goes for Scala. It requires you to adopt a new mindset, which takes time, the pay-off being an enormous increase in productivity.
bestinclass
2009-09-16 07:54:21
@dreish: No, I haven't used profiling nor added noise or anything which would make the benchmark suitable for decision making. It was simply to show that both languages provide a quick path from idea to code to result.
Nick Wiedenbrueck
2009-09-16 09:26:44
The quick sort example just shows that it's possible (but explicitly not encouraged) to write in an imperative style in Scala. The functional version is about 5 lines of code: http://www.scala-lang.org/node/58
dreish
2009-09-16 14:00:55
I agree you've done that.  I don't understand how anyone can look at a one-line factorial function and call it "nasty".  Both of your final versions strike me as concise and elegant expressions of the solution to the problem, albeit with idioms that may be unfamiliar to programmers who haven't yet spent some time on functional programming.

The biggest problem I've had with Java is the sheer volume of classes and methods that I need to hold in my head at one time in order to have a mental model of what is going on in my program.  Both Clojure and Scala offer tools that can cut down on this verbosity and _simplify_ a complex problem until it can all be tucked away neatly in the corner of my brain.

Clojure seems to me to go further down that path than Scala does, at the cost of perhaps having slightly tricky performance characteristics compared to Scala.  It still seems to be faster than other popular dynamic languages, though, and as your example shows, the performance of underlying systems is often the dominant factor.
nuttycom
2009-09-16 17:13:59
The statement that Range is strict is incorrect; the reason that it appears to be strict in the REPL is that it is fully evaluated by the REPL in the process of printing its value as a string.
dmilith
2009-09-17 12:21:34
Don't forget that both clojure and scala code are translated to imperative Java code anyway (for JVM) ;}
IMHO pure functional languages are a limitation, not a benefit, for the programmer. That's why the hybrid of OOP and functional in Scala is such a powerful choice.
bestinclass
2009-09-17 19:06:02
@nuttycom: Everything eventually compiles to machine code, but there's a world of difference in the code compiled. I strongly disagree that freely mixing side-effects and functional code is anything but a way to make it easier to let errors creep in. Sometimes we programmers need to be whipped into place. It took me months to conform "all" my code to a functional mindset, but it was a good investment.
Sam Stokes
2009-09-18 15:01:11
Scala has both strict and lazy evaluation, at the programmer's option: e.g. a method can be defined to take a mixture of strict and lazy arguments.

An example is Option[T].getOrElse:

Some(3).getOrElse(println("hello"))
// evaluates to 3 : Int, prints nothing

None.getOrElse(println("hello"))
// evaluates to Unit, prints "hello"
Daniel Jomphe
2009-09-24 20:19:07
"I’ve drawn up this table, which compares Scala and Clojure, feature for feature, pound for pound."

Watch out! :)

STM vs Actors is *not* a correct comparison:

As you know, Clojure doesn't only have its MVCC-based STM! I think it's important to mention it. If you really want to compare something of Clojure's to Scala's Actors, it's Clojure's Agents you'll want to compare. Then, you'll want to mention Clojure's other alternatives for other situations. That said, for now, if you want to go distributed with Clojure, you need to use something like Terracotta. If you do so, you end up in a much better position than with Scala's concurrency features. Also, Scala has (in development?) an STM. That said, its STM won't be MVCC-based, so it will still be a far-stretch from the usefulness that comes with Clojure's STM.

Clojure is not lazily evaluated:

Like Scala, Clojure is eagerly evaluated by default. That said, it's true that its Sequences are lazy, and some of its APIs. Like with Scala, it's your choice, with eager being the default.
bestinclass
2009-09-24 21:17:07
@Daniel J: Thanks for chiming in. I realize that the actor model directly relates to the agent functionality in Clojure, but the point of the article was to flash the biggest guns on both ships. Agents have their use in Clojure but to me the STM is the primary feature worth getting familiar with regarding concurrency, though it's a matter of taste/the task at hand. Also adding to that, I don't think anybody is expecting much of Scala's STM and from their own ranks they said "Actors are one of the many flawed libraries in Scala".

Lastly, I've received a few comments on what is lazy and what's not. Reading the text strictly, then yes you are right - but in reality most of our seq abstractions are lazy, letting you be lazy throughout most of your code without struggling for it - that was my point however fuzzy it might have come across :)
Rayne
2009-09-29 07:25:02
Clojure is /not/ a pure functional programming language. In reply to the statement in the post "Clojure is functional and pure, it protects you.", and in reply to dmilith. Also in reply to dmilith, I recommend you actually try and use a functional programming language for a while before you make the decision to say that they are limiting to programmers. You'll be rightfully flamed.
fogus
2009-10-20 15:17:13
Your Clojure vs. Scala lazy REPLs are misleading.  Aside from the previously mentioned Scala range laziness, your Clojure command `(range 1000)` is not representative of your REPL result.  Instead, you must have run `(class (range 1000))` instead.  As it stands, both the Clojure and Scala commands should have the same REPL induced evaluation of the results for range.
fogus
2009-10-20 15:20:06
@dmilith 

"both clojure and scala code are translated to imperative Java code anyway"

What's your point?  Haskell is translated to imperative assembly code full of mutation, but no one cares.  The language provides an abstraction above all of that.
fogus
2009-10-20 16:59:20
One final thing:

"This is a clever way of embedding helper functions, which are specific to a particular function, inside the main logic."

The multi-arity definition forms for functions are not isolated as helper functions to any specific function, but instead exposed as a public function interface.

(defn f ([] 0) ([x] 1) ([x y] 2)) allows for the public access of (f) (f :a) and (f :a :b).  "Isolated" helper functions are typically done with let or letfn.

It's a minor point probably, but maybe clears things up.
Daniel Sobral
2009-10-30 07:48:07
I think it funny you think it is "baaaaaaad" that Scala does not give you an error message if you try to mutate something outside the proper place, but, on the other hand, you don't think it bad that the compiler doesn't give you an error message if you pass some parameter to a function that was expecting something else.

Really, it is the same thing.
Lau
2009-10-30 10:03:33
@Daniel: It's not close to being the same thing. Basing your work on a fundamental principle saying 'mutability is accepted as the default' is giving yourself over to frailty and will cause many problems down the line, especially when concurrency is introduced. Of course Clojure will also break if you pass an incompatible type, there's no other way to go about it, but what I'm opposed to is static typing in general, as it seems overkill for so many many modules.
EdgarSanchez
2009-11-12 22:18:35
You should try F# (http://msdn.com/fsharp): 
let rec fact n = if n = 0 then 1 else n * fact (n - 1)
Lau
2009-11-12 22:21:27
@Edgar: Thanks for stopping by :)

Actually I'd rather stay clear of F#, it seems like a vile attempt to copy (http://www.microsoft.com) some of the beauty of other functional languages, without much success. For my hobby-coding I'm digging heavily into J.
EdgarSanchez
2009-11-12 23:03:39
Hi @Lau,

No doubt F# is heavily based on Ocaml, but could you explain why F# fails to implement functional semantics and beauty?
Lau
2009-11-12 23:06:32
@Edgar:

I have no doubt that it's based on functional semantics, but I've not yet seen some F# code where I thought ' Wow, that's nice and unique' - It's been overly verbose and overly complicated. But this is actually a while back, so maybe it's time to revisit? 

/Lau
Daniel Sobral
2009-11-12 23:15:27
@Lau: F# is OCaml, with a few enhancements and a bit of simplification. Not unlike the relationship of Clojure with Lisp. And the ML family of functional languages is a well respected family.

I think you are doing F# a major disservice.
Lau
2009-11-12 23:17:25
@Daniel & Edgar: Then I apologize. I don't know enough about the current state of affairs in regards to F# to pass judgment - disregard my 2 comments please :)
EdgarSanchez
2009-11-12 23:57:16
@Lau,

No problem! To begin with, check out this:

let rec quickSort list =
    match list with
    | [] -> []
    | h :: tail ->
        let smaller, bigger = List.partition ((>=) h) tail
        quickSort smaller @ [h] @ quickSort bigger

Actually, I like the Haskell version better, but F#'s is not bad :-)
Mike
2009-11-17 22:53:15
> your team is not able to adopt and fully appreciate Clojure in all it’s aspects I would not hesitate a moment in recommending Scala.

This sentence is obviously biased against Scala - the subtle message is, "if you don't appreciate Clojure, your team is not able to adopt", with which I wholeheartedly disagree.

There's no winner or loser in this comparison and it all comes down to preference.  I know Scheme like the back of my hand, and yet vastly prefer a programming language that doesn't force me to excessively use the SHIFT key due to bracket obsessiveness.
Lau
2009-11-17 22:57:00
@Mike: Correct - The subtle hint was that if your team isn't sharp enough to free themselves of OO and Scala's mix 'n' mash attitude towards immutability, then I think they will be happy using it - Yet, there is a higher level to be reached for those who dare :)

If you truly believe that this comes down to preference I recommend that you head over to InfoQ and sit through Rich Hickey's keynote (http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey) from the JVM Language Summit.