Dining philosophers – The 4th solution

In my last post I set out to solve a classic concurrency problem called ‘The sleeping barber’, contrasting an STM solution with an Actor-based solution. It occurred to me afterwards that interest in concurrency is huge these days (and for good reason), so I’ve decided to walk through the deadlock/livelock/starvation trap called The dining philosophers.
If you don’t know the challenge, it’s basically this:

You have 5 (or n) philosophers sitting round a table. On the table lie 5 forks. Every philosopher behaves in this way: either he wants to think, or he wants to eat. If he decides to eat, he must secure both forks, the one to his left and the one to his right. The philosophers never communicate with one another.

For a full explanation, read this. Otherwise I hope this image will give you an idea of the scenario:
Dining Philosophers

So that doesn’t sound too complicated, does it? Well, the trick is that this challenge is ideal for testing whether or not you can navigate around dead-/live-locks. If every philosopher picks up one fork, you risk that they’ll all sit and wait on each other forever; that’s a deadlock, nobody makes any progress. A livelock, on the other hand, is when all the processes seem to be making some progress, but never reach their goal.

There’s also the more subtle pitfall of starvation, where 2 excessive eaters (quick processes) can consume the entire meal (CPU time), leaving the remaining 3 philosophers (processes) starved. It’s a nice pun :)

Solutions?

Wikipedia suggests 3 ways to solve this, which I will briefly outline here.

Waiter solution

Since the philosophers don’t speak with each other, one solution is to have them speak to a waiter who keeps an overview of the table; he can then decide who gets forks and when.
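A minimal sketch of the waiter idea in Clojure (all names here are my own, not part of the program below): the waiter is a single agent holding the set of free forks, and since sends to one agent are processed one at a time, every request is serialized through him.

```clojure
;; Hypothetical sketch: the waiter is an agent holding the set of free
;; forks. Sends to a single agent run one at a time, so the waiter can
;; never hand the same fork to two philosophers.
(def free-forks (agent (set (range 5))))

(defn request-forks [forks philosopher left right]
  (if (and (forks left) (forks right))   ; both forks free?
    (disj forks left right)              ; grant both to the philosopher
    forks))                              ; otherwise grant nothing

(defn return-forks [forks left right]
  (conj forks left right))

;; Usage: philosopher 0 asks the waiter for forks 0 and 4 ...
(send free-forks request-forks 0 0 4)
(await free-forks)
;; ... and hands them back when done eating
(send free-forks return-forks 0 4)
(await free-forks)
```

Note that `request-forks` itself is a pure function of the table state, which is what makes it safe to run as an agent action.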

Resource hierarchy solution

The resource hierarchy solution basically consists of ordering the forks by number. They will then always be requested in order and released in the reverse order. Read the Wiki for the full explanation.
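A minimal sketch of the ordering idea (my own toy code, not part of the solution below), using plain JVM monitors via Clojure's `locking`:

```clojure
;; Hypothetical sketch: each fork is a plain Object used as a monitor,
;; and both forks are always locked in ascending id order, which
;; breaks the circular wait that causes deadlock.
(def n-philosophers 5)
(def hforks (vec (repeatedly n-philosophers #(Object.))))

(defn eat [id]
  (let [left    id
        right   (mod (inc id) n-philosophers)
        [lo hi] (sort [left right])]     ; lowest-numbered fork first
    (locking (hforks lo)
      (locking (hforks hi)
        (str "philosopher " id " eats with forks " lo " and " hi)))))

(eat 4) ;=> "philosopher 4 eats with forks 0 and 4"
```

The sort is the whole point: the last philosopher is the only one who grabs his right-hand fork (fork 0) first, so all five can never each hold exactly one fork.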

Chandy / Misra solution

And finally, we have a solution wherein you describe the forks as either clean or dirty, and based on this categorization the philosophers request the forks from one another. In my book, this request equals communication, and so the constraint inherent to the problem has been violated. So this is really no solution at all.

Enter Clojure

I’ve written up a solution in Clojure which is quite idiomatic, runs without any outside coordination and never enters a dead-/live-lock. It weighs in at 34 lines of code, so let’s walk through them.

Firstly, we need to describe our data.

(def rounds         (ref 20000))
(def philosophers   (doall (map #(agent %) (repeat 5 0))))
(def forks          (doall (map #(ref [% true]) (range (count philosophers)))))
(def logger         (agent 0))

Due to a food limitation in the universe, I’ve limited the number of rounds (portions) to 20000. Every time a philosopher eats, this is decremented.

Second, I have my philosophers. They’re all agents (that is, threads) and are spawned by applying the agent fn to 0, giving me a number of agents whose initial state is 0. Notice that this is the only place in the entire program where I write “5”; if you change this number, the number of philosophers and forks will change and the show will go on. That’s quite flexible.

Thirdly, we need to describe the forks, and since they’re our shared resources we (ref ..) them, making them modifiable only from within a transaction. What you’re seeing is a referenced vector for each fork, the first item being its id, the second being a boolean. The boolean describes availability: true = you can pick me up, false = don’t touch.

Lastly, I have a logger which lets me dispatch logging to a separate thread. Because I’m doing I/O (printing) from within my dosyncs this matters: sends to an agent made inside a transaction are held back and only dispatched upon successful commit, so you never see output from retried transactions.
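You can see this behaviour in a tiny repl experiment (my own illustration):

```clojure
;; A send issued inside a dosync is only queued; it is dispatched
;; when (and only if) the transaction commits.
(def log (agent []))

(dosync
  (send log conj :msg)
  ;; inside the transaction the agent still holds its old value,
  ;; because the send above has not been dispatched yet
  (assert (= [] @log)))

(await log)   ; wait for the dispatched send to run
@log ;=> [:msg]
```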

This is the data, now we need some helpers to manipulate it.

(defn debug  [_ id msg r]
  (println id \space msg "(" r ")")
  (flush))

For debugging purposes, I’ve made this little function which takes the id of the philosopher, a message and the number of remaining rounds. That’ll give us an indication of whether or not the program is running as it should. The first argument _ is there because when you send/send-off to an agent, the first argument is automatically the current state of that agent.

(defn my-forks [id]
  (map #(nth (cycle forks) (+ (count forks) %)) [id (dec id)]))

This little helper gives me the state of my forks. It works and it only took 3 seconds to write, and I guess you can tell that by looking at it :) Basically we have our philosophers named 0 – 4, and we have our forks named 0 – 4. So philosopher #0 has forks 0 and 4, philosopher #1 has forks 1 and 0, #2 has 2 and 1, and so on. The trick is that when I supply an id n, I want forks n and n-1 returned, but if I pass id 0, that’ll give me 0 and -1, which doesn’t exist. So my workaround was to produce a cycle of the numbers, as demonstrated here:

user> (take 15 (cycle [1 2 3]))
(1 2 3 1 2 3 1 2 3 1 2 3 1 2 3)

Basically I get an infinite sequence of my arguments cycled. So by offsetting my lookup with the length of the fork sequence, I can pull ids with [id (dec id)], which is [n (n - 1)], like so with n=0:

Clojure cycle fn

So adding the offset to n lets us do a look-behind without causing an IndexOutOfBounds exception, and as for resource waste, there is virtually none: everything after the red “4” is never computed.
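To see the trick in isolation, here is the same lookup on plain numbers (fork-ids is a throwaway name of my own):

```clojure
;; The offset trick on plain numbers: indices are shifted by n, so
;; (dec 0) = -1 safely lands on the last fork instead of blowing up.
(defn fork-ids [id n]
  (map #(nth (cycle (range n)) (+ n %)) [id (dec id)]))

(fork-ids 0 5) ;=> (0 4)
(fork-ids 1 5) ;=> (1 0)
```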

Now being able to identify my forks, I want to know if they’re available:

(defn got-forks?  [id]
  (every? #(= true (second (deref %))) (my-forks id)))

This will work no matter how many forks you have, so if we later want to extend our code to a more complex scenario, nothing written so far will need alterations. (my-forks …) returns a sequence of refs holding values like [1 true] [2 false], and now I want to know if all my forks are available: is every? second value true in (my-forks id)? Because % will be bound to each item returned from my-forks, and all those items are refs, I call (deref %) to get at the value (it could also have been @%).

Great, this is the sensory logic for our philosophers, they can now see what they need to see in order to decide what to do, there’s just one thing missing: How do they interact?

(defn handle-forks [id action]
  (doseq [fork (my-forks id)]
    (ref-set fork [(first @fork) (condp = action :take false :release true)])))

I want to pass an id and an action and have that played out. The trick is that we want to do this for an arbitrary number of forks and with a caller-supplied action. For that reason I wrap my ref-set statement in a doseq, walking over every item (my-forks ..) returns, and instead of manually saying true or false I add a conditional (condp) expression, which applies = to the action and acts accordingly.

Bonus: Notice how in an imperative language, you’d find something like

if action == TAKE:
    ref-set(forks, vector, n, false);
else:
    ref-set(forks, vector, n, true);

With Clojure we can avoid writing (ref-set fork …) twice, because we’re dealing in expressions rather than statements, and the condp’s value rolls up into the surrounding expression.
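Condensed into a repl experiment: because condp evaluates to a value, the availability flag can be computed inline in the new fork vector (fork-state is a name of my own for illustration):

```clojure
;; condp is an expression, so its result slots straight into the vector.
(defn fork-state [fork-id action]
  [fork-id (condp = action
             :take    false     ; taking a fork marks it unavailable
             :release true)])   ; releasing marks it available again

(fork-state 3 :take)    ;=> [3 false]
(fork-state 3 :release) ;=> [3 true]
```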

As a little help for newcomers: if you have a hard time visualizing what the data looks like and how it’s being passed around, load the code below into a repl and experiment like you see me doing in this little snippet:

user> (my-forks 1)
(#&lt;Ref: [1 true]&gt; #&lt;Ref: [0 true]&gt;)
user> (got-forks? 1)
true

The thing to notice is that my-forks returns references; to read them, use either (deref ref) or @ref. Secondly, the values are vectors, meaning (first @ref) gets you the integer and (second @ref) gives you the boolean. If you don’t yet have a well-working Emacs/Slime setup, please read this carefully; it’s easy to follow, but you need to meet the requirements (like git, java, etc): link

Ok, now we’ve got it all. Our philosophers can see the forks, pick them up and release them. Now we need to work on our behavior model, which could potentially be a complex nest of locks and conditionals. Let’s take a look, and I’ll explain:

(defn behave [a id]
  (dosync                           ; Initiate transaction
   (when (pos? (ensure rounds))     ; Is there more food?
     (if (> 5 (rand-int 10))        ; Do I want to eat or think?
       (when (got-forks? id)        ; Are both of my forks available?
         (handle-forks id :take)
         (alter rounds dec)
         (send logger debug id "ate      " @rounds)
         (handle-forks id :release))
       (send logger debug id "thinks   " @rounds))
     (send-off *agent* behave id)))) ; Repeat above

The “;” comments almost say it all. Within a transaction we decide what we want to do. If we think, we think; if we eat, we try to get the forks, and if we’re successful we eat. Are you thinking, was that it? :)

Well, yes it is. You see, by ensuring rounds at the beginning of our transaction, we effectively put a read lock on that reference, preventing other transactions from writing to it while we play. When I modify ‘rounds’ and ‘forks’ from within my loop, Clojure’s STM makes sure that I have a consistent view of the world. This is critical! It means that if I perceive that I’ve got 2 forks available and can eat, then I don’t risk that condition being violated before I interact with my references. Everything happens atomically, even if 2 threads try this at the exact same time. Why?

Because time has been cloned

In our normal perception of time, things can happen at the exact same time. In the STM’s perception of time, they can’t. Our program will run similar to this:

t1=got-forks? yes: grab ’em, eat, release ’em
t2=got-forks? no: retry
t3=got-forks? yes: grab ’em, eat, release ’em

On the STM’s clock these actions are in fact atomic, so the dreaded “grab the first fork, somebody else grabs the second fork, wait, wait, waiiiiiit… deadlock” situation can’t take place.
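A stripped-down toy example of ensure at work (my own illustration, not the program below): it reads the ref and simultaneously guards it against concurrent writes, so the check can’t be invalidated before we commit.

```clojure
;; 10 threads race for 3 portions of food; because ensure protects the
;; pos? check, the counter can never be driven below zero.
(def food (ref 3))

(defn take-portion []
  (dosync
    (when (pos? (ensure food))   ; read *and* guard the ref
      (alter food dec))))

(dorun (apply pcalls (repeat 10 take-portion)))
@food ;=> 0
```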

Now the attentive reader might be thinking that this comes with a huge performance penalty, but it doesn’t. The reason is that Clojure’s primary data structures are all immutable; they can’t change. For this reason ‘reader threads’ never wait: they sit and look at their own snapshot of the world, which they pulled out the moment their transaction started.
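That immutability is easy to see at the repl:

```clojure
;; 'Changing' a persistent vector yields a new vector; the reader's
;; original snapshot is untouched and can be read without any locking.
(def snapshot [1 2 3])
(def updated  (conj snapshot 4))

snapshot ;=> [1 2 3]
updated  ;=> [1 2 3 4]
```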

It’s true, however, that in scenarios with extreme contention you will see transactions retrying, which comes with a penalty. It’s a challenge to model your architecture so as to avoid or minimize this, and it’s a challenge for Rich to resolve it quickly when it does happen.

The GIL springs to mind: Python’s Global Interpreter Lock. That’s a lock if I’ve ever seen one. Since Python’s data is mutable, the only way to help your users through concurrency is to restrict access, so in effect the GIL forces all computations to run serially. In the real world, a multi-threaded Python application will at best leave your quad-core server with 80% idle time and at its worst will cripple your computation because the threads are fighting for the GIL’s attention. This is a case of locks gone bad, where parallel computing and multi-threading become a bad idea.

In contrast, I’ve seen the STM tested on a 600-core Azul system, where all the cores were working hard and in parallel up until a certain point, after which adding more threads caused performance to crater. So that’s a challenge for the STM.

Observations

I think it’s interesting how, in the year 2009, we can solve a concurrency problem like ‘Dining philosophers’ without writing manual locks at all. For Clojurians, concurrency is first and foremost a challenge of applying the tools correctly. If you want to give it a whirl yourself, I’ll paste the entire code below; you can load it in SLIME and it’ll start and finish automatically.

Below you’ll see my solution, but if you have a nice idiomatic take on this problem using another language, you’re very welcome to send it in and I’ll post it here on the site. The only requirements are that it must run flawlessly and not use the GIL 8) Happy hacking!

/Lau

(def rounds        (ref 20000))
(def philosophers  (doall (map #(agent %) (repeat 5 0))))
(def forks         (doall (map #(ref [% true]) (range (count philosophers)))))
(def logger        (agent 0))

(defn debug  [_ id msg r]
  (println id \space msg "(" r ")")
  (flush))

(defn my-forks [id]
  (map #(nth (cycle forks) (+ (count forks) %)) [id (dec id)]))

(defn got-forks?  [id]
  (every? #(= true (second (deref %))) (my-forks id)))

(defn handle-forks [id action]
  (doseq [fork (my-forks id)]
    (ref-set fork [(first @fork) (condp = action :take false :release true)])))

(defn behave [a id]
  (dosync                           ; Initiate transaction
   (when (pos? (ensure rounds))     ; Is there more food?
     (if (> 5 (rand-int 10))        ; Do I want to eat or think?
       (when (got-forks? id)        ; Are both of my forks available?
         (handle-forks id :take)
         (alter rounds dec)
         (send logger debug id "ate    " @rounds)
         (handle-forks id :release))
       (send logger debug id "thinks " @rounds))
     (send-off *agent* behave id))))

(doseq [i (range (count philosophers))]
  (send logger debug i "being sent off to dinner" @rounds)
  (send-off (nth philosophers i) behave i))

About the author

Lau Jensen is the owner of Best In Class, an avid Clojure Developer well experienced in project management and also the primary author of ClojureQL. You are very welcome to follow his posts on this blog or get in touch directly on lau.jensen@bestinclass.dk

  • Andreas Karlsen

    Hi Lau,
    Thanks for taking the time for this writeup.

    I got some minor corrections, due to a mismatch between the code snippets and the final program.
    First off, a &gt got in the way of > in the behave snippet.
    Secondly, you use the fn send-off in the behave snippet, but use send in the final program.

    I enjoyed reading through the concise code, as a novice Clojure programmer :)

  • Pingback: Clojure Concurrency | Compuzzle

  • Paul Bennett

    Hi Lau, I think there are a couple of problems with your solution.

    First of all, you have each philosopher try and grab both forks in the same transaction. This means that a philosopher can only eat when both his neighbors are thinking simultaneously. This leads to a much reduced chance of a philosopher eating, and is a problem of starvation. If you grabbed each fork separately, as it became free, each philosopher would have a much better chance of eating sooner, as this would guarantee that a philosopher will eat as soon as both his neighbors finish eating.

    But then this would expose a much more serious problem: you don’t implement the deadlock-free algorithm correctly. This states that each philosopher takes the lowest-numbered fork first. This prevents all philosophers picking up their right fork simultaneously, which is the deadlocked situation. This happens because (assuming philosopher n has forks n and (mod n+1 (count philosophers)), with fork n on the LH side) philosophers 0 – (n-2) have the lowest-numbered fork on their LH side, while philosopher n-1 has the lowest-numbered fork on the RH side, i.e. fork 0. So philosopher n-1 first picks up his RH fork, while all the others first pick up their LH fork.

    The way you implement my-forks violates this – that is, your algorithm allocates the forks in such a way that all philosophers pick up their RH fork first.

    The only reason your solution does not deadlock is because of the first issue, i.e. you only get forks when both are free simultaneously. If you changed that to pick up forks individually, I’m pretty sure your solution would deadlock.

    For another Clojure solution see https://github.com/wizardpb/diningphils. There is also an implementation of the Chandy-Misra algorithm, as I disagree that this is ‘no solution at all’ :-). It’s just a different set of constraints.