Scala Vs Clojure – Round 2: Concurrency

The time for ignoring concurrency has almost passed. If you read my last post you’ll know that I’m looking at Scala Vs Clojure and a big part of both of these languages is their support for concurrency. Scala uses an Erlang inspired actor model, which is a distributed approach to concurrency. Clojure on the other hand leans on it’s STM, a non-distributed approach – Lets put them in the ring together!

The goal

My last post sparked heavy debate in the #scala camp, wherein many frustrations and concerns were expressed by the Scala community. Many responded by publicly flaming me, accusing me of lying etc etc. Before I start this post let me point out to the Scala community that you should welcome reviews not dread them, open up for a discussion rather than trying to silence it. I’ll try to be fair and shed light on both sides in return I expect you to leave a comment if you feel there’s something that I left out – Let the goal be a healthy debate on concurrency not a flamewar which nobody benefits from.

The problem

An old concurrency problem/exercise is ‘The sleeping barber’ and you can either read the full definition here or read my shorter version here:

We simulate a barber-shop with 1 barber, 3 waiting chairs and 1 chair for haircutting.

The barber sleeps between customers and is thus awoken when one enters the shop. If customers come in while he’s working they take a seat, if no seats are available they leave.

This is a perfect problem for comparing these two languages for the following reasons

  • In the O’Reilly book on Scala, they’ve already solved this!
  • Scala’s actor/OOP model is perfect for doing simulation (remember this?)

So we can finally put Clojure at a disadvantage and see if we can leverage its power to find a good solution anyway. If you read the Wikipedia link describing the challenge, you’ll notice that they point to manual locks as a solution, but anyone who’s ever worked with locks wants run away from those! Lets examine 2 escape routes:

Actors

Scala’s Actor model is heavily inspired by Erlang. It’s a distributed method which means that you have various agents working on their individual tasks/data/responsibilities and all they share is basically interfaces – You don’t need to know what’s under the hood. These Actors talk by way of messages:

scala> Redford ! "Go act!"

Now Redford will receive the message “Go act!” – That message lands in Redford’s queue which he walks through one by one, possibly doing some pattern-matching allowing him to act different to this String than he would to a float for instance.

Software Transactional Memory

Clojure uses an STM which is analogous to a database. To access shared memory the Clojure programmers can use mutable references to immutable data – an indirection. Mutable state being the root of all evil bugs must be contained in a thread-safe fashion: Enter the STM. To manipulate the references we wrap them in transactions. These guarantee that our threads get a consistent view of the world by setting values atomically. Closure’s STM implementation uses snapshot isolation to achieve this – in effect all reader-threads get their own snapshot of the ‘database’ when they work, that also means, no reader-locks. In the following code look for the dosync macro, which contains code which runs in a transaction.

Pitfalls of the Barber problem

If you think that the Barber problem looks simple, odds are you haven’t tried to implement it. There are quite a few things which can go wrong when we have several agents working on shared data, primarily

Deadlocks

Race conditions
A deadlock happen when 2 threads are waiting for each other to finish – the result being something like

When two trains approach each other at a crossing, both shall come to a full stop and neither shall start up again until the other has gone.

Race conditions are a result of programs using the 4.th dimension: Time. In the barber shop example we have 3 chairs for waiting customers. So in order to seat a customer, we could do something like this:

if (seats-taken < number-of-seats) then seat_customer else reject_customer

Seems simple enough, but imagine that there is 1 free seat and 2 threads ask at the exact same time if they can seat their customer. Both will get permission. How many seats are then taken? 4 out of 3. If we are to do successful concurrency this must never be allowed to happen.

Let’s do an experiment!

Like I mentioned in the preface, O’reilly has already forged an idiomatic solution to this problem, which you can read about: here. I’ve also built an implementation in Clojure which gets the same result, so lets present them. First you have Clojure, weighing in at 31 lines, on the right Scala weighing in at 102 lines

Clojure

(def queue  (ref (with-meta
                   clojure.lang.PersistentQueue/EMPTY
                   {:tally 0})))
(def seats  3)

(defn debug
  [_ msg n]
  (println msg (apply str (repeat (- 35 (count msg)) \space)) n)
  (flush))

(defn the-shop
  [a]
  (debug "(c) entering shop" a)
  (dosync
   (if (< (count @queue) seats)
     (alter queue conj a)
     (debug "(s) turning away customer" a))))

(defn the-barber
  [st q]
  (Thread/sleep (+ 100 (rand-int 600)))
  (dosync
   (when (peek @q)
     (debug "(b) cutting hair of customer" (peek @q))
     (ref-set queue (with-meta (pop @q)
                      {:tally (inc (:tally (meta @q)))})))))

(add-watcher queue :send (agent 'barber) the-barber)

(doseq [customer (range 1 20)]
; (Thread/sleep (+ 100 (rand-int 200)))
  (send-off (agent customer) the-shop))

(Thread/sleep 2000)
(println "(!) " (:tally (meta @queue)) "customers got haircuts today")

Scala

import scala.actors.Actor
import scala.actors.Actor._
import scala.util.Random
import scala.collection.{immutable, mutable}

case object Haircut

class Customer(val id: Int) extends Actor {
  var shorn = false

  def act() = {
    loop {
      react {
        case Haircut => {
          shorn = true
          println("(c) customer " + id + " got a haircut")
        }
      }
    }
  }
}

class Barber extends Actor {
  private val random = new Random()

  def helpCustomer(customer: Customer) {
    if (self.mailboxSize >= 3) {
      println("(b) not enough seats, turning customer " + customer.id + " away")
    } else {
      println("(b) cutting hair of customer " + customer.id)
      Thread.sleep(100 + random.nextInt(400))
      customer ! Haircut
    }
  }

  def act() {
    loop {
      react {
        case customer: Customer => helpCustomer(customer)
      }
    }
  }
}

class Shop extends Actor {
  val barber = new Barber()
  barber.start

  def act() {
    println("(s) the shop is open")

    loop {
      react {
        case customer: Customer => barber ! customer
      }
    }
  }
}

object BarbershopSimulator {
  private val random = new Random()
  private val customers = new mutable.ArrayBuffer[Customer]()
  private val shop = new Shop()

  def generateCustomers {
    for (i <_- style="color: #afeeee; font-weight: bold;" span="span" _="_" _20="_20" to="to" _1="_1">val customer = new Customer(i)
      customer.start()
      customers += customer
    }

    println("[!] generated " + customers.size + " customers")
  }

  // customers arrive at random intervals
  def trickleCustomers {
    for (customer <_- style="color: #afeeee; font-weight: bold;" span="span" thread.sleeprandom.nextint450="thread.sleeprandom.nextint450" customer="customer" shop="shop" _="_" customers="customers">def tallyCuts {
    // wait for any remaining concurrent actions to complete
    Thread.sleep(2000)

    val shornCount = customers.filter(c => c.shorn).size
    println("[!] " + shornCount + " customers got haircuts today")
  }

  def main(args: Array[String]) {
    println("[!] starting barbershop simulation")
    shop.start()

    generateCustomers
    trickleCustomers
    tallyCuts

    System.exit(0)
  }
}

When either of these is run, you’ll get an output somewhat similar to this:

[c] entering shop                    1
[c] entering shop                    2
[c] entering shop                    3
[b] cutting hair of customer         1
[c] entering shop                    4
[c] entering shop                    5
[s] turning away customer            5
[b] cutting hair of customer         2
[c] entering shop                    6
[b] cutting hair of customer         3
[c] entering shop                    7
[c] entering shop                    8
[s] turning away customer            8
[c] entering shop                    9
[s] turning away customer            9
[b] cutting hair of customer         4
[c] entering shop                    10
[c] entering shop                    11
[s] turning away customer            11
[c] entering shop                    12
[s] turning away customer            12
[b] cutting hair of customer         6
[c] entering shop                    13
[c] entering shop                    14
[s] turning away customer            14
[b] cutting hair of customer         7
[c] entering shop                    15
[b] cutting hair of customer         10
[c] entering shop                    16
[c] entering shop                    17
[s] turning away customer            17
[b] cutting hair of customer         13
[c] entering shop                    18
[c] entering shop                    19
[s] turning away customer            19
[b] cutting hair of customer         15
[b] cutting hair of customer         16
[b] cutting hair of customer         18
[!] 11 customers got haircuts today

General comparison

Normally the first argument you make for the STM is the correctness that comes with it, but where do we stand after exhanging blows? Well O’Reilly says

Were we not trying to simulate a real-world scenario, we would of course remove the call to Thread.sleep() and allow our barber to run full tilt.

Well, since our programs need to work in the real-world I pulled out the stops allowing the customers to hit the shop at 50 ms intervals and this blew O’Reilleys model to bits. Since he let his queue situation rest with the mailBox this happens:

Barber starts cutting 1
Customer 2, 3, 4, 5, 6 show up – Not taking free seats since they’re lying in a mailbox
Barber looks up after finishing the haircut and sends 2, 3, 4, 5 away and cuts #6.
Does that seem fair? Not at all, in fact in a real-word scenario this would ruin a business quite quickly. So the implementation of the actor-model is blown and the concurrency abilities of Scala which we set out to demonstrate has been found buggy in O’Reilly’s code. I’ll proceed to explain this model conceptually, even though this is actually disastrous.

The Scala architecture is like this

  • Extend Actor to make Customer who gets sent to a shop
  • Extend Actor to make a Shop who sends Customers to a barber
  • Extend Actor to make a Barber who receives Customer IDs which he will cut

Once this is implemented it’s a simple matter of traversing a list of customers, sending each to the Shop to get the show rolling.

 

The Clojure architecture is like this

  • Reference a shared resource called Queue, which contains customer IDs
  • Ask the Barber to watch this queue and wake up if there’s people in line
  • Define a shop-function which checks for available seats and either accepts or rejects the customer

Common for both architectures is that the division of labor is easy to reason about, ie. barbers are separate from customers, customers are separate from shops. If you had a problem in the Clojure model with the barbers behaviour you’d instinctly know smell the rat in “the-barber” and similar for Scala you’d go to class Barber.

Let’s have a closer look

I’ll point of a few keys in both approaches, starting with Clojure.

Clojure:

(def queue (ref 
             (with-meta clojure.lang.PersistentQueue/EMPTY
                        {:tally 0})))
(def seats 3)

First we have our definitions. queue is of type PersistentQueue, I figured it was easier to relate to given the task at hand but really I could have used anything from lists to vectors to sets. There are 2 interesting things about my queue though. Firstly it’s a reference and secondly it has meta-data. A reference is an indirection, meaning I can modify it from within a transaction to point somewhere other than where it is – that’s a safe way to mutate values.

Meta-data is a very clever thing, which is not found in Scala. It’s a way of describing your data without affecting the data. In business you’ll usually want to know where your data comes from, if it’s validated, how old it is etc. All of this can be safely tucked away as meta-data – In this case, I’ll keep a tally of how many people got their hair cut.

(code-style freebie: I explicitly define seats as 3 instead of having a 3 directly in my code-logic. Generally you refer to in-code numbers as ‘magic numbers’, because 6 months from now, nobody knows what they’re there for)

(defn the-shop
  [a]
  (debug "[k] entering shop" a)
  (dosync
     (if (< (count @queue) seats) 
        (alter queue conj a) 
        (debug "[s] turning away customer" a)))) 

Our definition of a shop is very simple: Check if there’s a seat available, if yes conj(oin) a to the queue, if no reject. Remember my concern about race-conditions in our problem definition? For Clojurians concerns about race-conditions are usually boiled down to remembering to initiate a transaction by wrapping your code in (dosync …). What happens if you forget? Clojure tells you!

(defn the-barber 
  [st q] 
  (Thread/sleep (+ 100 (rand-int 600))) 
  (dosync 
     (when (peek @q) 
        (debug "[b] cutting hair of customer" (peek @q)) 
        (ref-set queue (with-meta (pop @q) 
                                  {:tally (inc (:tally (meta @q)))}))))) 

This is the main logic and still very simple. I use 2 functions to work on my shared resource: Peek and pop. Peek gives me the head and pop the tail. If there is a head I change my ref to point to the tail instead (having cut the hair of the head, so to speak). While doing this I pass the old meta-data to inc (upping it by 1) and store it back in :tally. Simple?

Scala

Since each Actor in the Scala model has his own mailbox with pending messages O’reilly has chosen to use the barbers mailbox as a representation of the chairs. This means that once a message gets sent to the barber, he checks the size of the mailbox and depending on the result either cuts or rejects.

class Barber extends Actor { 
   private val random = new Random() 

   def helpCustomer(customer: Customer) { 
       if (self.mailboxSize >= 3) {
          println("[b] not enough seats, turning customer " + customer.id + " away")
       } else {
          println("[b] cutting hair of customer " + customer.id)
          Thread.sleep(100 + random.nextInt(400))
          customer ! Haircut
       }
   }

   def act() {
      loop {
        react {
           case customer: Customer => helpCustomer(customer)
        }
      }
   }
}

In the act() function we have a loop which activates in response to an incoming message, it reacts. If the message being sent is of class Customer, it passes that customer as an argument to its helper function helpCustomer. Like I mentioned above, the function then checks its mailbox to see if we can queue more people. If there’s room then the person gets a haircut after an artistic wait of 100 + random(400). As you would expect the OOP approach gives you a division of labor which is great for a white board presentation, but even for a small project such as this we’re generating a lot of code and also our first bug!

def trickleCustomers {
   for (customer def tallyCuts {
       // wait for any remaining concurrent actions to complete
       Thread.sleep(2000)

       val shornCount = customers.filter(c => c.shorn).size
       println("[!] " + shornCount + " customers got haircuts today")
}

Both Customer and Shop are implemented without any special tricks or points of interest, so I’ll close by showing you how nicely you initiate this whole procedure.

Observations

Well, first off there one thing I really hate about certain languages. It’s a phenomenon which on one hand can take a long time to master and on the other can cause you to write poor code by not leveraging the languages full power. It’s called syntax.

Scala has quite a bit of Syntax which you need to study. For instance less-than-minus…

for (item <- collection)

…will traverse a collection letting me work with one item at a time. Another is the exclamation mark, which we now know means to send message. A third could be slash-colon (/:) which means foldLeft and backslash-colon (\:) which means foldRight, ex:

scala> (0 /: (1 to 3)) (_ + _)
res5: Int = 6

I’ve scratched the surface here, fortunately Scala is very well documented and if you want to dig deeper you should read the 180 page language spec: here. If you want to learn the entire syntax for Clojure, I can teach you right now:

() = list

[] = vector

{} = map

#{} = set

All expressions take the form (func arg1 arg2 ... arg-n)

There’s also a syntax for macros, it’s also nicely documented on Clojure.org.

Caveats

So this seems almost too good to be true – has all concerns with concurrency really been nullified with the arrival of the STM and Actors? Not really, but we now have an easier way to reason about these problems, which is central to managing them.

Clojure

Looking at Clojure’s snapshot isolation, we still have to pay attention to write skew. Imagine an interface for managing security patrols on a military base, Area 51 for instance. You have 3 barracks, each containing 45 soldiers and you never want less than 100 soldiers in the barracks at any time. So you have 2 administrators who log on simultaneously, get their individual snapshots of the database and check if they can withdraw 20 solders.

Admin 1: if ((G1 - 20) + G2 + G3) > 100 then dispatchPatrol

Admin 2: if (G1 + (G2 - 20) + G3) > 100 then dispatchPatrol

Since neither of them sees the other transaction because they’re looking at isolated snapshots they both commit since (45 – 20) + 45 + 45 = 115, pulling out 40 solders, leaving the total at 95 – below the threshold. Thats write-skew. Clojure provides (ensure r) to prevent this, but you need to pay attention.

Scala

Scala raises some concerns too. Firstly the OOP model is showing some teeth in the way you are always extending and implementing act() – To me as a new-comer it seems ceremonial and the lines of code agree. But if you’re coding Scala you’ve accepted that this is life.

Secondly, the pattern-matching in the react/receive methods have a big pitfall – If you send messages of a type which is not implemented, what happens? You’ve guessed it, the Actor hordes these messages filling up its queue. In a massive OO project I can imagine this causing quite a few grievances.

Thirdly, with Clojure’s agents/refs you can always read their values at no cost. Scala is a distributed system which means no shared-memory space, so by implication the Actors are always polling. Distributing resources always comes with a cost in terms of resource waste.

Fourth, for Scala programmers they are already dealing with OOP which is making their programs more complex, harder to reason about and understand. Adding to that is the thin/invisible line between functional- and imperative code. On top of those two you add the Actor model, which in it self adds complexity being distributed. If you take the sum of this final concern and try to calculate a development cost, maintenance cost, bug fixing cost – I think you’ll want to scroll up and re-read the Clojure code. All this taken in account, it’s not a huge surprise that the official Sleeping Barber solution from Scala was found buggy.

Conclusion

I’ll make the argument that Scala isn’t doing much for you, if you want to simplify your code and shoot for elegance. In my last post I was focused on smaller examples where you could leverage Scalas functional methods to good effect, but taking on a larger (but still small) project it’s apparent that the end result is looking dangerously similar to ‘what we used to do’. On top of this is the fact that the official solution from the Scala camp was buggy, which gives little grounds for confidence in the Actor model. So the bottom line is this: 32 lines of Clojure blew the 100+ lines Scala Juggernaut out of the water.

I don’t claim that Clojure is problem free, but I let Rich worry about it. So far, I think it’s ready to go the distance even in concurrent waters.

/Lau

About the author

Lau Jensen is the owner of Best In Class, an avid Clojure Developer well experienced in project management and also the primary author of ClojureQL. You are very welcome to follow his posts on this blog or get in touch directly on lau.jensen@bestinclass.dk

  • sidazad

    Excellent post – very informative. I am just starting to learn Clojure (slowly). Being a Python programmer who has to write a concurrent program I am trying to figure the best alternatives among Golang, Python (concurrent.futures) and Clojure/Scala.
    1. Any thoughts why Python may not be the best idea in this case? Using concurrent.futures does avoid the GIL issues but obviously not having designed for concurrency I’d imagine Python would be inferior to Clojure/GoLang.

    2. Any thoughts on GoLang with respect to this post of yours?

    • I haven’t done any work in Go, so I can’t comment. What you’re looking for when doing concurrent programming is a language with strong concurrency semantics (agents, atoms, STM, actors, etc) which fits your purpose coupled with persistant immutable datastructures. You won’t find this cooked into the language itself using Python or Scala (see the Clojure Vs Scala post for details). Actors are great for out-of-process concurrency, ie. messaging, the STM is great for in-process shared memory concurrency. Im sure you can find libraries which emulates this behavior in other languages, but likely in a less oppoinionated design than is the case with Clojure.

    • Have a look at this imply https://gist.github.com/fb7d2a44303481731d89.git which uses Clojures core.async which is heavily inspired by GoLang

  • Thanks for a great walkthrough and a rather thorough discussion on these two concurrency-models. Even being more in the Clojure camp myself, I do think you’ve missed an important part of the actor model: It can (I believe) (non-trivially) be spread to several different machines without much more coding. That I guess would not be as trivial in Clojure.

    Also, I think this problem is a great match for core.async, I’ve done a quick (probably bug-ridden) implementation here: https://gist.github.com/fb7d2a44303481731d89.git

    • There’s no doubt, that Clojures model is designed for in-process concurrency, while Scala is better geared for distributed systems. I only picked up this example because it was the one covered in Programming Scala. Had they gone with an example of distribution, I would have picked up something similar :)

      For some reason, I get a blank page when clicking your gist link.

    • Robert Grant

      This is my worry as well – obviously STM is nicer than the Actor model on a single machine, because that’s its strength, and relates to the problem it’s trying to solve. Move it to multiple machines and I’m pretty sure it’s a different story.

  • Hariharan V

    We used clojure STM for an experiment in one of our systems. Disaster!! STM was horrible when there’s a lot of write collisions. Many a lot of transactions failed after several retries. Not to mention the performance.
    Then we moved on to actor based concurrency with Akka and Scala. So far, no issue.