
Concurrency

\(\newcommand{\ess}{Initial} \)

Michel Charpentier provides a nice example of a simple, flawed producer/consumer buffer implementation. The fundamental problem could be discovered with a Stack Overflow search, but it’s interesting to see what really goes wrong. The code uses Java’s “synchronized” keyword to construct atomic put and get operations (a sketch of this kind of buffer appears below). Processes using the buffer are either producers or consumers. Say there are \(P\) producers and \(C\) consumers, where \(P > 0\) and \(C > 0\), and say the buffer holds at most \(k\) elements for some \(k>0\). Let \(B(s)\) be the number of elements in the buffer in the state defined by \(s\), and let \(W(s,producers)\) and \(W(s,consumers)\) indicate, respectively, how many producers and how many consumers are “waiting” in state \(s\). For the initial state:

\[B(\ess) = 0\]

\[W(\ess,producers) = W(\ess,consumers) = 0\]
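For concreteness, the kind of code under discussion looks roughly like the sketch below. This is an illustrative reconstruction rather than Charpentier’s exact listing: synchronized put and get guarded by wait, with a single notify after each successful operation.

// Illustrative sketch (not Charpentier's exact listing) of the kind of
// bounded buffer under discussion: synchronized put/get with wait() and a
// single notify(). The notify() may wake a waiter of the "wrong" kind.
public class Buffer<T> {
    private final Object[] items;
    private int count = 0, head = 0, tail = 0;

    public Buffer(int capacity) {
        items = new Object[capacity];
    }

    public synchronized void put(T x) throws InterruptedException {
        while (count == items.length) {
            wait();                     // producer waits while the buffer is full
        }
        items[tail] = x;
        tail = (tail + 1) % items.length;
        count++;
        notify();                       // wakes *some* waiter: maybe a producer, maybe a consumer
    }

    @SuppressWarnings("unchecked")
    public synchronized T get() throws InterruptedException {
        while (count == 0) {
            wait();                     // consumer waits while the buffer is empty
        }
        T x = (T) items[head];
        head = (head + 1) % items.length;
        count--;
        notify();                       // again, no control over which kind of waiter is woken
        return x;
    }
}

With a loop around wait, a wrongly woken process simply waits again, but the single notification it consumed is gone; the spec that follows makes that loss, and the deadlock it can lead to, explicit.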

At a high level, there are only two events that change system state: put events and get events. Use the notation \(s.e\) for the state produced when event \(e\) drives the system from state \(s\): \(s.put\) is the next state after \(s\) if a put event occurs, and \(s.get\) is the next state after a get event. If we define a state variable in the initial state and inductively after each event, it is defined in every state.

\[ B(s.e) = \begin{cases} B(s)+1&\mbox{if }e=put\mbox{ and }B(s)< k;\\ B(s)-1&\mbox{if }e=get\mbox{ and }B(s) > 0;\\ B(s)&\mbox{otherwise}\end{cases}\]
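For example, with \(k = 2\) (a value picked purely for illustration), starting from the initial state:

\[ B(\ess.put) = 1,\quad B(\ess.put.put) = 2,\quad B(\ess.put.put.put) = 2,\quad B(\ess.put.put.put.get) = 1 \]

The third put leaves \(B\) unchanged because the buffer is already full; that blocked attempt is exactly what the waiting counts below account for.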

The operating system determines which events happen and in which order – that is one of the two sources of “non-determinism” in this system.

The state variable \(W\) is a little more complicated. When a put is attempted while the buffer is full, the producer is made to wait, and similarly a consumer is made to wait when it attempts a get on an empty buffer. So if \(B(s) = k\) then \(W(s.put,producers) = W(s,producers)+1\), and similarly for get when \(B(s)=0\). But what happens when a get or put succeeds? From the code we see that the producer or consumer does a “notify” operation, which wakes up some waiting process; that should subtract 1 from either the number of waiting consumers or the number of waiting producers, if possible. If there is a choice between waking a consumer or a producer, the operating system makes it. This kind of “nondeterminism” often just means that the specification is incomplete or depends on some additional factors. Suppose \(N(s)\in \{producers, consumers, pass\}\) is the logic of the notify operation that picks what gets woken up, or “pass” if there is nothing to wake up:

\[\mbox{If } W(s,producers)+W(s,consumers) > 0 \mbox{ then } N(s)\in\{producers,consumers\}\mbox{ otherwise }N(s)= pass \]

\[ \mbox{if }N(s)=v \mbox{ then } W(s,v) > 0\]
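For example, the two conditions together force the choice whenever only one kind of process is waiting:

\[ \mbox{if } W(s,producers) = 0 \mbox{ and } W(s,consumers) > 0 \mbox{ then } N(s) = consumers \]

Only when producers and consumers are both waiting does \(N\) genuinely get to choose, and that leftover freedom is exactly where the trouble described below comes from.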

Then we can put together the full specification of \(W\), using \(N\) to tidy up the nondeterminism.

\[ W(s.e,v) =\begin{cases} W(s,v)-1&\mbox{if }N(s)=v\\&\mbox{and } ((e = put\mbox{ and }B(s)< k)\\ &\mbox{or } ( e=get\mbox{ and }B(s)>0) )\\ W(s,v)+1&\mbox{if } (v=producers\mbox{ and }e=put\mbox{ and }B(s) = k)\\  &\mbox{or } (v=consumers\mbox{ and }e=get\mbox{ and }B(s) = 0)\\W(s,v)&\mbox{otherwise}\end{cases}\]
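This definition is close to executable. Here is one way to transcribe it into simulation code; the class and method names, the event and target enums, and the coin-flip policy used to settle \(N\)’s remaining choice are all just assumptions made for this sketch:

import java.util.Random;

// A sketch of the state-machine spec, not of the buffer itself: it tracks
// B and the two waiting counts, and resolves N's choice with a policy.
public class BufferSpec {
    public enum Event { PUT, GET }
    public enum Target { PRODUCERS, CONSUMERS, PASS }

    private final int k;            // buffer capacity
    private int b = 0;              // B(s)
    private int wProducers = 0;     // W(s, producers)
    private int wConsumers = 0;     // W(s, consumers)
    private final Random rng = new Random();

    public BufferSpec(int k) { this.k = k; }

    // N(s): must pick a class with a positive waiting count, else pass.
    // The unresolved nondeterminism is settled here by a coin flip.
    private Target notifyChoice() {
        if (wProducers == 0 && wConsumers == 0) return Target.PASS;
        if (wProducers == 0) return Target.CONSUMERS;
        if (wConsumers == 0) return Target.PRODUCERS;
        return rng.nextBoolean() ? Target.PRODUCERS : Target.CONSUMERS;
    }

    // s.e: apply one event to the current state.
    public void step(Event e) {
        boolean succeeds = (e == Event.PUT) ? b < k : b > 0;
        if (succeeds) {
            Target n = notifyChoice();      // notify happens only on success
            if (n == Target.PRODUCERS) wProducers--;
            if (n == Target.CONSUMERS) wConsumers--;
            b += (e == Event.PUT) ? 1 : -1;
        } else if (e == Event.PUT) {
            wProducers++;                   // put on a full buffer: the producer waits
        } else {
            wConsumers++;                   // get on an empty buffer: the consumer waits
        }
    }

    @Override public String toString() {
        return "B=" + b + " W(producers)=" + wProducers + " W(consumers)=" + wConsumers;
    }
}

A driver should only feed in a put event while fewer than \(P\) producers are waiting (and similarly for gets), since a waiting process cannot act; running such a driver with random or adversarial choices is the kind of simulation code mentioned at the end of the post.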

Just developing this spec tells us that there is a problem: over a series of gets, \(N\) may never wake a producer at all if there are enough consumers to wake instead; there is no guarantee. Imagine that \(W(s,producers)= P\) and \(W(s,consumers) > 0\) and \(0 < B(s)< k\). Then there is no guarantee that \(N\) will wake a producer after a get. In fact, if \(k - B(s) < W(s,consumers) < C\), it is possible for the consumers to use up everything in the buffer without ever waking a single producer. For this condition to even be possible it must be the case that \(C > k\). To see that, suppose \(k\) puts in a row fill up the buffer and the remaining put attempts block (the selection of which unblocked consumer or producer gets to run next is up to the OS, so nothing rules this out). Now the OS decides to schedule only consumers: their gets use up all the elements of the buffer, notifying \(k\) producers along the way, and once the buffer is empty all the consumers block, so \(W(s,consumers)=C\). The \(k\) notified producers then refill the buffer, their puts notify \(k\) consumers rather than any of the still-waiting producers, and those producers block again when they try to put once more. The notified consumers now empty the buffer, and at every get \(N\) wakes yet another consumer, so no producer is ever woken while the buffer drains. If \(z\) is that final sequence of gets and \(s\) the state just before it, write \(sz\) for the state obtained by concatenating \(z\) to \(s\); then \(W(sz,consumers) = C - k\), \(W(sz,producers) = P\) and \(B(sz)=0\). There is no way forward: a producer can only be woken by a notify, a notify only happens on a successful put or get, and no put or get can now succeed. A similar problem can happen if \(P > k\). In fact, the obvious way to implement “wait” and “notify” is with a single FIFO queue, which makes this kind of deadlock more likely.
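A tiny instance makes the failure concrete. Take \(k=1\), \(P=1\) and \(C=2\) (values chosen purely for illustration) and follow the spec from the initial state:

\[\begin{array}{ll}
\mbox{both consumers attempt get on the empty buffer and wait:} & B=0,\ W(consumers)=2,\ W(producers)=0\\
\mbox{the producer does a put, which notifies one consumer:} & B=1,\ W(consumers)=1,\ W(producers)=0\\
\mbox{the producer tries another put, finds the buffer full, waits:} & B=1,\ W(consumers)=1,\ W(producers)=1\\
\mbox{the woken consumer does a get; }N\mbox{ may pick the waiting consumer, and does:} & B=0,\ W(consumers)=0,\ W(producers)=1\\
\mbox{both consumers eventually attempt get on the empty buffer and wait:} & B=0,\ W(consumers)=2,\ W(producers)=1
\end{array}\]

Now every process is waiting and no put or get can succeed, so no notify will ever happen again and the producer stays blocked forever. Just before the fatal choice, \(k - B(s) = 0 < W(s,consumers) = 1 < C = 2\), and indeed \(C > k\).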

More information about this method of describing state systems can be found here. 

Charpentier used TLA to look at his example, and comparing that approach to the one used here shows some of the differences. The goal here is to specify the system’s operation mathematically in order to aid intuition about how the system is supposed to work (it also helps if we want to write some simulation code, like the sketch above). That is, the intention is to give programmers a way of sketching out a design abstractly; formal proofs are not always useful in this context. The mathematical basis is state machines, which is also the underlying basis of TLA, but used in a very different way. For one thing, this method stresses events, while TLA, despite its name, really has only one action: an anonymous event that moves the system to the next state, which necessitates non-deterministic state machines. States in TLA are essentially maps from state-variable symbols to values, while here states turn out to be nothing more than finite event sequences. And, of course, TLA is a “formal method”, relying on formal logic, while the methods used here are from working mathematics.

 

