294 Background more complex and involve arrays and atomic snapshots (Aspnes and Herlihy,1990) Lock-freedom is similar to wait-freedom but only guarantees that some thread will complete an operation in a finite number of steps Typical lock-free implementations,such as the well-known Treiber stack (Treiber,1986),HSY elimination-backoff stack (Hendler et al., 2004)and Harris-Michael lock-free list(Harris,2001;Michael,2002), use the atomic compare-and-swap instruction cas in a loop to repeat- edly attempt an update until it succeeds.Figure 1.1(a)shows such an implementation of the counter object.It is lock-free,because whenever inc operations are executed concurrently,there always exists some suc- cessful update.Note that this object is not wait-free.For the following program(2.2.1),the cas instruction in the method called by the left thread may continuously fail due to the continuous updates of x made by the right thread. inc();while(true)inc(); (2.2.1) Wait-freedom and lock-freedom are progress properties for non- blocking implementations,where a delay of a thread cannot prevent other threads from making progress.In contrast,deadlock-freedom and starvation-freedom are progress properties for lock-based implementa- tions.A delay of a thread holding a lock will block other threads which request the lock. Informally people often state deadlock-freedom and starvation- freedom in terms of locks and critical sections (a critical section is the code segment between a lock-acquire and the matching lock-release). For example,in their textbook,Herlihy and Shavit(2008)say deadlock- freedom guarantees that some thread will succeed in acquiring the lock, and starvation-freedom guarantees that every thread attempting to acquire the lock will eventually succeed. However,as noted by Herlihy and Shavit(2011),the above definitions based on locks are unsatisfactory,because it is often difficult to identify a particular field in the object as a lock and a particular code segment as a critical section.Instead,they suggest defining them in terms of method calls.They also notice that the above definitions implicitly assume that every thread holding a lock will eventually release it.This
294 Background more complex and involve arrays and atomic snapshots (Aspnes and Herlihy, 1990). Lock-freedom is similar to wait-freedom but only guarantees that some thread will complete an operation in a finite number of steps. Typical lock-free implementations, such as the well-known Treiber stack (Treiber, 1986), HSY elimination-backoff stack (Hendler et al., 2004) and Harris-Michael lock-free list (Harris, 2001; Michael, 2002), use the atomic compare-and-swap instruction cas in a loop to repeatedly attempt an update until it succeeds. Figure 1.1(a) shows such an implementation of the counter object. It is lock-free, because whenever inc operations are executed concurrently, there always exists some successful update. Note that this object is not wait-free. For the following program (2.2.1), the cas instruction in the method called by the left thread may continuously fail due to the continuous updates of x made by the right thread. inc(); k while(true) inc(); (2.2.1) Wait-freedom and lock-freedom are progress properties for nonblocking implementations, where a delay of a thread cannot prevent other threads from making progress. In contrast, deadlock-freedom and starvation-freedom are progress properties for lock-based implementations. A delay of a thread holding a lock will block other threads which request the lock. Informally people often state deadlock-freedom and starvationfreedom in terms of locks and critical sections (a critical section is the code segment between a lock-acquire and the matching lock-release). For example, in their textbook, Herlihy and Shavit (2008) say deadlockfreedom guarantees that some thread will succeed in acquiring the lock, and starvation-freedom guarantees that every thread attempting to acquire the lock will eventually succeed. However, as noted by Herlihy and Shavit (2011), the above definitions based on locks are unsatisfactory, because it is often difficult to identify a particular field in the object as a lock and a particular code segment as a critical section. Instead, they suggest defining them in terms of method calls. They also notice that the above definitions implicitly assume that every thread holding a lock will eventually release it. This
2.2.Progress Properties 295 assumption requires fair scheduling,i.e.,every thread gets eventually executed.Otherwise,a thread holding a lock may never be scheduled. Then it has no chance to release the lock,even if the critical section terminates (if executed).As a result,all the other threads trying to acquire the lock are permanently blocked. Following Herlihy and Shavit (2011),we say an object is deadlock- free,if in each fair execution there always exists some method call that can finish.Similarly,a starvation-free object guarantees that every method call can finish in fair executions. The counter in Figure 2.1(b)is deadlock-free,because the test-and- set lock (see Figure 2.1(a))guarantees that eventually some thread will succeed in getting the lock via the cas instruction at line 4,and hence the method call of inc in that thread will eventually finish.It is not starvation-free,because there might be a thread that continuously fails to acquire the lock.For the following client program(2.2.2),the cas instruction executed by the left thread could always fail if the right thread infinitely often acquires the lock. inc();print(1);II while(true)inc(); (2.2.2) The counter in Figure 2.1(d)implemented with a ticket lock is starvation-free.Figure 2.1(c)shows the details of the ticket lock imple- mentation.It uses two shared variables owner and next,which are equal initially.The threads attempting to acquire the lock form a waiting queue.In tkL_acq,a thread first atomically increments next and reads its old value to a local variable i (line 19).It waits until the lock's owner equals its ticket number i (lines 20-23),then it acquires the lock.In tkL_rel,the thread releases the lock by incrementing owner (line 26).Then the next waiting thread(the thread with ticket number i+1,if there is one)can acquire the lock.We can see that the ticket lock implementation ensures the first-come-first-served property,and hence every thread calling inc_tkL can eventually acquire the lock and finish its method call. Progress properties of concurrent objects with partial methods.How- ever,the aforementioned progress properties are all proposed for concur- rent objects with total methods only,i.e.,methods that should always
2.2. Progress Properties 295 assumption requires fair scheduling, i.e., every thread gets eventually executed. Otherwise, a thread holding a lock may never be scheduled. Then it has no chance to release the lock, even if the critical section terminates (if executed). As a result, all the other threads trying to acquire the lock are permanently blocked. Following Herlihy and Shavit (2011), we say an object is deadlockfree, if in each fair execution there always exists some method call that can finish. Similarly, a starvation-free object guarantees that every method call can finish in fair executions. The counter in Figure 2.1(b) is deadlock-free, because the test-andset lock (see Figure 2.1(a)) guarantees that eventually some thread will succeed in getting the lock via the cas instruction at line 4, and hence the method call of inc in that thread will eventually finish. It is not starvation-free, because there might be a thread that continuously fails to acquire the lock. For the following client program (2.2.2), the cas instruction executed by the left thread could always fail if the right thread infinitely often acquires the lock. inc(); print(1); || while(true) inc(); (2.2.2) The counter in Figure 2.1(d) implemented with a ticket lock is starvation-free. Figure 2.1(c) shows the details of the ticket lock implementation. It uses two shared variables owner and next, which are equal initially. The threads attempting to acquire the lock form a waiting queue. In tkL_acq, a thread first atomically increments next and reads its old value to a local variable i (line 19). It waits until the lock’s owner equals its ticket number i (lines 20-23), then it acquires the lock. In tkL_rel, the thread releases the lock by incrementing owner (line 26). Then the next waiting thread (the thread with ticket number i+1, if there is one) can acquire the lock. We can see that the ticket lock implementation ensures the first-come-first-served property, and hence every thread calling inc_tkL can eventually acquire the lock and finish its method call. Progress properties of concurrent objects with partial methods. However, the aforementioned progress properties are all proposed for concurrent objects with total methods only, i.e., methods that should always
296 Background return when executed sequentially.They do not apply to objects with partial methods,such as the lock objects in Figure 2.1(a)and(c),which intend to permanently block at certain situations. Specifically,deadlock-freedom and starvation-freedom do not apply. They allow permanent blocking but only if the scheduling is unfair. Consider the following client program(2.2.3)using the TAS lock in Figure 2.1(a).One of the method calls never finishes. L_acq();II L_acq(); (2.2.3) It shows that the test-and-set lock object does not satisfy the traditional deadlock-freedom or starvation-freedom property we just presented. Neither does the ticket lock object in Figure 2.1(c). The problem is that L_acq intends to block when the lock is not available.The non-termination in the above example(2.2.3)is just the intention of a correct lock implementation;otherwise the lock cannot guarantee mutual exclusion. As a result,we need new progress properties for objects with partial methods.Moreover,the new progress properties should be able to distinguish the TAS lock and the ticket lock.Consider the following client program (2.2.4). L_acq();L_rel();II while(true){L_acq();L_rel();} (2.2.4) The call to L_acq()of the left thread may not return under fair schedul- ing with the TAS lock,but it must return in fair executions with the ticket lock.This shows that the TAS lock and the ticket lock have different progress properties,which we will call partial deadlock-freedom (PDF)and partial starvation-freedom (PSF)respectively (formally de- fined in Section 5). The relationships between the progress properties form a lattice shown in Figure 2.2 (where the arrows represent implications).For example,wait-freedom implies lock-freedom and starvation-freedom im- plies deadlock-freedom.PSF and PDF are generalizations of starvation- freedom and deadlock-freedom respectively
296 Background return when executed sequentially. They do not apply to objects with partial methods, such as the lock objects in Figure 2.1(a) and (c), which intend to permanently block at certain situations. Specifically, deadlock-freedom and starvation-freedom do not apply. They allow permanent blocking but only if the scheduling is unfair. Consider the following client program (2.2.3) using the TAS lock in Figure 2.1(a). One of the method calls never finishes. L_acq(); || L_acq(); (2.2.3) It shows that the test-and-set lock object does not satisfy the traditional deadlock-freedom or starvation-freedom property we just presented. Neither does the ticket lock object in Figure 2.1(c). The problem is that L_acq intends to block when the lock is not available. The non-termination in the above example (2.2.3) is just the intention of a correct lock implementation; otherwise the lock cannot guarantee mutual exclusion. As a result, we need new progress properties for objects with partial methods. Moreover, the new progress properties should be able to distinguish the TAS lock and the ticket lock. Consider the following client program (2.2.4). L_acq(); L_rel(); || while(true){ L_acq(); L_rel(); } (2.2.4) The call to L_acq() of the left thread may not return under fair scheduling with the TAS lock, but it must return in fair executions with the ticket lock. This shows that the TAS lock and the ticket lock have different progress properties, which we will call partial deadlock-freedom (PDF) and partial starvation-freedom (PSF) respectively (formally de- fined in Section 5). The relationships between the progress properties form a lattice shown in Figure 2.2 (where the arrows represent implications). For example, wait-freedom implies lock-freedom and starvation-freedom implies deadlock-freedom. PSF and PDF are generalizations of starvationfreedom and deadlock-freedom respectively
2.3.Contextual Refinement and Abstraction Theorems 297 wait-freedom lock-freedom starvation-freedom deadlock-freedom PSF PDF Figure 2.2:Relationships between progress properties. 2.3 Contextual Refinement and Abstraction Theorems It is difficult to use linearizability and progress properties directly in modular verification of client programs of an object,because their definitions fail to describe how the client behaviors are affected.To verify clients,we would like to abstract away the details of the object implementation.This requires a notion of object correctness,telling us that the client behaviors will not change when we replace the object methods'implementations with the corresponding abstract operations (as specifications). Contextual refinement gives the desired notion of correctness.In- formally,an object implementation II is a contextual refinement of another (more abstract)implementation II',written as IIII,if every observable behavior of any client program using II can also be observed when the client uses II'instead.That is, ΠE'if∀C.O(letΠinC)≤O(let'inC) Here(let II in C)denotes the client program C using II,and O returns the set of observable behaviors of the program.Then,when verifying a client ofΠ,we can soundly replaceΠwith its abstraction I'. Filipovic et al.(2009)have proved the abstraction theorem,saying that linearizability is equivalent to a contextual refinement II fin I between II and its atomic specification I.In this contextual refinement, O returns the prefix-closed set of finite traces of I/O events.A trace set is prefir-closed if,for any trace in the set,its prefix must also be in
2.3. Contextual Refinement and Abstraction Theorems 297 wait-freedom lock-freedom starvation-freedom deadlock-freedom PSF PDF Figure 2.2: Relationships between progress properties. 2.3 Contextual Refinement and Abstraction Theorems It is difficult to use linearizability and progress properties directly in modular verification of client programs of an object, because their definitions fail to describe how the client behaviors are affected. To verify clients, we would like to abstract away the details of the object implementation. This requires a notion of object correctness, telling us that the client behaviors will not change when we replace the object methods’ implementations with the corresponding abstract operations (as specifications). Contextual refinement gives the desired notion of correctness. Informally, an object implementation Π is a contextual refinement of another (more abstract) implementation Π0 , written as Π v Π0 , if every observable behavior of any client program using Π can also be observed when the client uses Π0 instead. That is, Π v Π0 iff ∀C. O(let Π in C) ⊆ O(let Π0 in C) Here (let Π in C) denotes the client program C using Π, and O returns the set of observable behaviors of the program. Then, when verifying a client of Π, we can soundly replace Π with its abstraction Π0 . Filipović et al. (2009) have proved the abstraction theorem, saying that linearizability is equivalent to a contextual refinement Π vfin Γ between Π and its atomic specification Γ. In this contextual refinement, O returns the prefix-closed set of finite traces of I/O events. A trace set is prefix-closed if, for any trace in the set, its prefix must also be in
298 Background the set.The superscript in the refinement relation fin means we only observe finite prefixes of execution traces. This basic contertual refinement can distinguish linearizable objects from non-linearizable ones,but it cannot characterize progress properties of objects.For the following example,although no concrete method call of f could finish,f is a basic contextual refinement of F. f(){while(true)skip; F(){skip; The reason is that the basic contextual refinement considers a prefir- closed set of event traces at the abstract side.For instance,consider the following client (2.3.1). print(1);f();print(1); (2.3.1) The prefix-closed set of externally observable event traces of the client calling f is fe,1},while the prefix-closed set produced by the same client calling F is fe,1,11),where e represents the empty trace.The former is a subset of the latter.But if we consider only complete traces instead of also taking prefixes,we will break the subset relation.The client calling f produces a singleton set {1}of complete I/O traces, while calling F produces {11}.The subset no longer holds,telling us that f and F have different termination behaviors. When taking progress properties into account,the corresponding contextual refinement should be sensitive to termination or divergence (non-termination).In particular,one should observe only complete execution traces of I/O events instead of taking the prefix-closed set of traces.Then,as we have seen,for the above example,the sets of observable behaviors of the client(2.3.1)calling f and F will be different. The termination-sensitive contextual refinement does not hold. But this is not the end of the story.To formulate a contextual refine- ment III,we need to fix at least three key aspects:the observable behaviors O,the scheduling and the abstraction II'. First,for multi-threaded client programs,do we observe per-thread termination/divergence or whole-program termination/divergence? Consider the lock-free counter inc in Figure 1.1(a)and the client (2.2.1).If we observe per-thread termination/divergence,the client
298 Background the set. The superscript in the refinement relation vfin means we only observe finite prefixes of execution traces. This basic contextual refinement can distinguish linearizable objects from non-linearizable ones, but it cannot characterize progress properties of objects. For the following example, although no concrete method call of f could finish, f is a basic contextual refinement of F. f(){ while(true) skip; } F(){ skip; } The reason is that the basic contextual refinement considers a prefixclosed set of event traces at the abstract side. For instance, consider the following client (2.3.1). print(1); f(); print(1); (2.3.1) The prefix-closed set of externally observable event traces of the client calling f is {, 1}, while the prefix-closed set produced by the same client calling F is {, 1, 11}, where represents the empty trace. The former is a subset of the latter. But if we consider only complete traces instead of also taking prefixes, we will break the subset relation. The client calling f produces a singleton set {1} of complete I/O traces, while calling F produces {11}. The subset no longer holds, telling us that f and F have different termination behaviors. When taking progress properties into account, the corresponding contextual refinement should be sensitive to termination or divergence (non-termination). In particular, one should observe only complete execution traces of I/O events instead of taking the prefix-closed set of traces. Then, as we have seen, for the above example, the sets O of observable behaviors of the client (2.3.1) calling f and F will be different. The termination-sensitive contextual refinement does not hold. But this is not the end of the story. To formulate a contextual refinement Π v Π0 , we need to fix at least three key aspects: the observable behaviors O, the scheduling and the abstraction Π0 . • First, for multi-threaded client programs, do we observe per-thread termination/divergence or whole-program termination/divergence? Consider the lock-free counter inc in Figure 1.1(a) and the client (2.2.1). If we observe per-thread termination/divergence, the client