A Program Logic for Concurrent Objects under Fair Scheduling

Hongjin Liang    Xinyu Feng
School of Computer Science and Technology & Suzhou Institute for Advanced Study
University of Science and Technology of China
lhj1018@ustc.edu.cn    xyfeng@ustc.edu.cn

Abstract

Existing work on verifying concurrent objects is mostly concerned with safety only, e.g., partial correctness or linearizability. Although there has been recent work verifying lock-freedom of non-blocking objects, much less effort has been devoted to deadlock-freedom and starvation-freedom, the progress properties of blocking objects. These properties are more challenging to verify than lock-freedom because they allow the progress of one thread to depend on the progress of another, assuming fair scheduling.

We propose LiLi, a new rely-guarantee style program logic for verifying linearizability and progress together for concurrent objects under fair scheduling. The rely-guarantee style logic unifies thread-modular reasoning about both starvation-freedom and deadlock-freedom in one framework. It also establishes progress-aware abstraction for concurrent objects, which can be applied when verifying safety and liveness of client code. We have successfully applied the logic to verify starvation-freedom or deadlock-freedom of representative algorithms such as ticket locks, queue locks, lock-coupling lists, optimistic lists and lazy lists.

Categories and Subject Descriptors D.2.4 [Software Engineering]: Software/Program Verification – Correctness proofs, Formal methods; F.3.1 [Logics and Meanings of Programs]: Specifying and Verifying and Reasoning about Programs

General Terms Theory, Verification

Keywords Concurrency, Progress, Program Logic, Rely-Guarantee Reasoning, Refinement

1. Introduction

A concurrent object or library provides a set of methods that allow multiple client threads to manipulate the shared data structure. Blocking synchronization (i.e., mutual exclusion locks), as a straightforward technique to ensure exclusive accesses and to control the interference, has been widely used in object implementations to achieve linearizability, which ensures that the object methods behave as atomic operations in a concurrent setting.

In addition to linearizability, a safety property, object implementations are expected to also satisfy progress properties. The non-blocking progress properties, such as wait-freedom and lock-freedom, which have been studied a lot (e.g., [5, 10, 16, 24]), guarantee the termination of the method calls independently of how the threads are scheduled. Unfortunately these properties are too strong to be satisfied by algorithms with blocking synchronization. For clients using lock-based objects, a delay of a thread holding a lock will block other threads requesting the lock. Thus their progress relies on the assumption that every thread holding the lock will eventually be scheduled to release it. This assumption requires fair scheduling, i.e., every thread eventually gets executed. As summarized by Herlihy and Shavit [14], there are two desirable progress properties for blocking algorithms, both assuming fair scheduling:

• Deadlock-freedom: In each fair execution, there always exists some method call that can finish. It disallows the situation in which multiple threads requesting locks are waiting for each other to release the locks in hand. It ensures the absence of livelock, but not starvation.

• Starvation-freedom: Every method call should finish in fair executions.
It requires that every thread attempting to acquire a lock should eventually succeed and in the end release the lock. Starvation-freedom is stronger than deadlock-freedom. Nevertheless it can often be achieved by using fair locks [13].

Recent program logics for verifying concurrent objects either prove only linearizability and ignore the issue of termination (e.g., [6, 21, 30, 31]), or aim for non-blocking progress properties (e.g., [5, 10, 16, 24]), which cannot be applied to blocking algorithms that progress only under fair scheduling. The fairness assumption introduces complicated interdependencies among the progress properties of threads, making it considerably more challenging to verify the lock-based algorithms than their non-blocking counterparts. We will explain the challenges in detail in Sec. 2.

It is important to note that, although there has been much work on deadlock detection or deadlock-freedom verification (e.g., [4, 20, 32]), deadlock-freedom is often defined there as a safety property, which ensures the lack of circular waiting for locks. It does not prevent livelock or non-termination inside the critical section. Another limitation of this kind of work is that it often assumes built-in lock primitives, and lacks support for ad-hoc synchronization (e.g., mutual exclusion achieved using spin-locks implemented by the programmers). The deadlock-freedom we discuss in this paper is a liveness property and its definition does not rely on built-in lock primitives. We discuss more related work on liveness verification in Sec. 8.

In this paper we propose LiLi, a new rely-guarantee style logic for concurrent objects under fair scheduling. LiLi is the first program logic that unifies verification of linearizability, starvation-freedom and deadlock-freedom in one framework (the name LiLi stands for
Linearizability and Liveness). It supports verification of both mutex-based pessimistic algorithms (including fine-grained ones such as lock-coupling lists) and optimistic ones such as optimistic lists and lazy lists. The unified approach allows us to prove in the same logic that, for instance, the lock-coupling list algorithm is starvation-free if we use fair locks, e.g., ticket locks [25], and is deadlock-free if regular test-and-set (TAS) based spin locks are used. Our work is based on earlier work on concurrency verification, but we make the following new contributions:

• We divide environment interference that affects progress of a thread into two classes, namely blocking and delay. We show that different occurrences of them correspond to the classification of progress into wait-freedom, lock-freedom, starvation-freedom and deadlock-freedom (see Sec. 2.2.1 and Sec. 6). Recognizing the two classes of interference allows us to come up with different mechanisms in our program logic to reason about them separately. Our logic also provides parameterized specifications, which can be instantiated to choose different combinations of the mechanisms. This gives us a unified program logic that can verify different progress properties using the same set of rules.

• We propose two novel mechanisms, definite actions and stratified tokens, to reason about blocking and delay, respectively. They are also our key techniques to avoid circularity in rely-guarantee style liveness reasoning. A definite action characterizes a thread's progress that does not rely on the progress of the environment. Each blocked thread waits for a queue of definite actions. Starvation-freedom requires that the length of the queue be strictly decreasing, while deadlock-freedom allows disciplined queue jumps based on the token-transfer ideas [16, 24]. To avoid circular delay, we further generalize the token-transfer ideas by stratifying tokens into multiple levels, which enables us to verify complex algorithms that involve both nested locks and rollbacks (e.g., the optimistic list algorithm).

• By verifying linearizability and progress together, we can provide progress-aware abstractions for concurrent objects (see Sec. 5). Our logic is based on termination-preserving simulations as the meta-theory, which establish contextual refinements that assume fair scheduling at both the concrete and the abstract levels. We prove the contextual refinements are equivalent to linearizability and starvation-freedom/deadlock-freedom. The refinements allow us to replace object implementations with progress-aware abstract specifications when the client code is verified. As far as we know, our abstraction for deadlock-free (and linearizable) objects has never been proposed before.

• We have applied our logic to verify simple objects with coarse-grained synchronization using TAS locks, ticket locks [25] and various queue locks (including Anderson array-based locks, CLH locks and MCS locks) [13]. For examples with more permissive locking schemes, we have successfully verified the two-lock queues, and various fine-grained and optimistic list algorithms. To the best of our knowledge, we are the first to formally verify the starvation-freedom/deadlock-freedom of lock-coupling lists, optimistic lists and lazy lists.

Notice that with the assumption of fair scheduling, wait-freedom and lock-freedom are equivalent to starvation-freedom and deadlock-freedom, respectively. Therefore our logic can also be applied to verify wait-free and lock-free algorithms.
We discuss this in Sec. 6.

In the rest of this paper, we first analyze the challenges and explain our approach informally in Sec. 2. Then we give the basic technical setting in Sec. 3, and present our logic in Sec. 4, whose soundness theorem, together with the abstraction theorem, is given in Sec. 5. We discuss how our logic supports wait-free and lock-free objects too in Sec. 6. Finally, we summarize the examples we have verified in Sec. 7, and discuss related work and conclude in Sec. 8.

(a) abstract operation INC: <x++>;

dfInc:
  1 local b := false, r;
  2 while (!b) { b := cas(&L, 0, cid); }  // lock L
  3 r := x; x := r + 1;                   // critical section
  4 L := 0;                               // unlock L

(b) deadlock-free implementation dfInc using a test-and-set lock

sfInc:
  1 local i, o, r;
  2 i := getAndInc(next);
  3 o := owner; while (i != o) { o := owner; }
  4 r := x; x := r + 1;                   // critical section
  5 owner := i + 1;

(c) starvation-free implementation sfInc using a ticket lock

Figure 1. Counters.

2. Informal Development

Below we first give an overview of the traditional rely-guarantee logic for safety proofs [18], and the way to encode linearizability verification in the logic. Then we explain the challenges and our ideas in supporting liveness verification under fair scheduling.

2.1 Background

Rely-guarantee reasoning. In rely-guarantee reasoning [18], each thread is verified in isolation under some assumptions on its environment (i.e., the other threads in the system). The judgment is in the form of R, G ⊢ {P} C {Q}, where the pre- and post-conditions P and Q specify the initial and final states respectively. The rely condition R specifies the assumptions on the environment, which are the permitted state transitions that the environment threads may have. The guarantee condition G specifies the possible transitions made by the thread itself. To ensure that parallel threads can collaborate, the guarantee of each thread needs to satisfy the rely of every other thread.

Encoding linearizability verification. As a working example, Fig. 1(b) shows a counter object dfInc implemented with a test-and-set (TAS) lock L. Verifying linearizability of dfInc requires us to prove it has the same abstract behaviors as INC in Fig. 1(a), which increments the counter x atomically.

Following previous work [21, 24, 31], one can extend a rely-guarantee logic to verify linearizability. We use an assertion arem(C) to specify as an auxiliary state the abstract operation C to be fulfilled, and logically execute C at the linearization point (LP) of the concrete implementation. For dfInc, we prove a judgment in the form of R, G ⊢ {P ∧ arem(INC)} dfInc {Q ∧ arem(skip)}. Here R and G specify the object's actions (i.e., lock acquire and release, and the counter updates at both the concrete and the abstract sides) made by the environment and the current thread respectively. P and Q are relational assertions specifying the consistency relation between the program states at the concrete and the abstract sides. The postcondition arem(skip) shows that at the end of dfInc there is no abstract operation to fulfill.

2.2 Challenges of Progress Verification

Progress properties of objects such as deadlock-freedom and starvation-freedom impose various termination requirements on object methods. They must be satisfied with interference from other threads considered, which makes the verification challenging.
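Before turning to the challenges, it may help to see the two counters of Fig. 1(b) and Fig. 1(c) in a concrete language. The following C11 sketch is only illustrative (the development in this paper uses the idealized code of Fig. 1, where cas and getAndInc are primitive; passing the thread id cid explicitly, and the use of sequentially consistent atomics, are our choices here):

#include <stdatomic.h>
#include <stdbool.h>

atomic_int L = 0;               // TAS lock: 0 = free, otherwise the holder's id
atomic_int next = 0, owner = 0; // ticket lock state, as in Fig. 1(c)
int x = 0;                      // the shared counter, protected by the lock in use

// Fig. 1(b): deadlock-free counter with a test-and-set lock.
void dfInc(int cid) {           // cid: the caller's non-zero thread id
  bool b = false;
  while (!b) {                  // keep retrying until our cas succeeds
    int expected = 0;
    b = atomic_compare_exchange_strong(&L, &expected, cid);  // lock L
  }
  int r = x; x = r + 1;         // critical section
  atomic_store(&L, 0);          // unlock L
}

// Fig. 1(c): starvation-free counter with a ticket lock.
void sfInc(void) {
  int i = atomic_fetch_add(&next, 1);              // getAndInc(next)
  while (atomic_load(&owner) != i) { /* spin */ }  // wait until our ticket is served
  int r = x; x = r + 1;         // critical section
  atomic_store(&owner, i + 1);  // release: serve the next ticket
}

Under fair scheduling, a thread spinning in sfInc waits only for the finitely many tickets ahead of its own, whereas a thread spinning in dfInc can lose the cas to later arrivals arbitrarily often; the rest of this section develops this distinction precisely.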
2.2.1 Non-Termination Caused by Interference

In a concurrent setting, an object method may fail to terminate due to interference from its environment. Below we point out that there are two different kinds of interference that may cause thread non-termination, namely blocking and delay. Let's first see a classic deadlocking example.

  DL-12:  lock L1; lock L2; unlock L2; unlock L1;
  DL-21:  lock L2; lock L1; unlock L1; unlock L2;

The methods DL-12 and DL-21 may fail to terminate because of the circular dependency of locks. This non-termination is caused by permanent blocking. That is, when DL-12 tries to acquire L2, it could be blocked if the lock has been acquired by DL-21.

For a second example, the call of the dfInc method (in Fig. 1(b)) by the left thread below may never terminate.

  dfInc();    ‖    while (true) dfInc();

When the left thread tries to acquire the lock, even if the lock is available at that time, the thread could be preempted by the right thread, which gets the lock ahead of the left. Then the left thread would fail at the cas command in the code of dfInc and have to loop at least one more round before termination. This may happen infinitely many times, causing non-termination of the dfInc method on the left. In this case we say the progress of the left method is delayed by its environment's successful acquirement of the lock.

The key difference between blocking and delay is that blocking is caused by the absence of certain environment actions, e.g., releasing a lock, while delay is caused by the occurrence of certain environment actions, e.g., acquiring the lock needed by the current thread (even if the lock is subsequently released). In other words, a blocked thread can progress only if its environment progresses first, while a delayed thread can progress if we suspend the execution of its environment.

Lock-free algorithms disallow blocking (thus they do not rely on fair scheduling), although delay is common, especially in optimistic algorithms. Starvation-free algorithms allow (limited) blocking, but not delay. As the dfInc example shows, delay from non-terminating clients may cause starvation. Deadlock-free algorithms allow both (with restrictions). As the optimistic list in Fig. 2(a) (explained in Sec. 2.3.4) shows, blocking and delay can be intertwined by the combined use of blocking-based synchronization and optimistic concurrency, which makes the reasoning significantly more challenging than reasoning about lock-free algorithms.

How do we come up with general principles to allow blocking and/or delay, but on the other hand to guarantee starvation-freedom or deadlock-freedom?

2.2.2 Avoid Circular Reasoning

Rely-guarantee style logics provide the power of thread-modular verification by circular reasoning. When proving that the behaviors of a thread t guarantee G, we assume that the behaviors of the environment thread t′ satisfy R. Conversely, the proof of thread t′ relies on the assumptions on the behaviors of thread t.

However, circular reasoning is usually unsound in liveness verification [1]. For instance, we could prove termination of each thread in the deadlocking example above, under the assumption that each environment thread eventually releases the lock it owns. How do we avoid the circular reasoning without sacrificing rely-guarantee style thread-modular reasoning?

The deadlocking example shows that we should avoid circular reasoning to rule out circular dependency caused by blocking. Delay may cause circular dependency too. Figure 2(b) shows a thread t using two locks. It first acquires L1 (line 1) and then tests whether L2 is available (line 2). If the test fails, the thread rolls back. It releases L1 (line 4), and then repeats the process of acquiring L1 (line 5) and testing L2 (line 6). Suppose another thread t′ does the opposite: repeatedly acquiring L2 and testing L1. In this example the acquirement of L2 by t′ may cause t to fail its test of the availability of L2. The test could have succeeded if t′ did not interfere, so t′ delays t. Conversely, the acquirement of L1 by t may delay t′. Then the two threads can cause each other to continually roll back, and neither method progresses.

Usually when delay is allowed, we need to make sure that the action delaying other threads is a "good" one in that it makes the executing thread progress (e.g., a step towards termination). This is the case with the "benign delays" in the dfInc example and the optimistic list example. But how do we tell if an action is good or not? The acquirements of locks in Fig. 2(b) do seem to be good because they make the threads progress towards termination. How do we prevent such lock acquirements from delaying others, which may cause circular delay?

2.2.3 Ad-Hoc Synchronization and Dynamic Locks

One may argue that the circularity can be avoided by simply enforcing certain orders of lock acquirements, which has been a standard way to avoid "deadlock cycles" (note this is a safety property, as we explained in Sec. 1). Although lock orders can help liveness reasoning, this approach has many limitations in practice.

First, the approach does not apply to ad-hoc synchronization. For instance, there are no locks in the following deadlocking program.

  x := 1; while (y = 1) skip; x := 0;     ‖     y := 1; while (x = 1) skip; y := 0;

Moreover, sometimes we need to look into the lock implementation to prove starvation-freedom. For instance, the dfInc in Fig. 1(b) using a TAS lock is deadlock-free but not starvation-free. If we replace the TAS lock with a ticket lock, as in sfInc in Fig. 1(c), the counter becomes starvation-free. Again, there are actually no locks in the programs if we have to work at a low abstraction level to look into lock implementations.

Second, it can be difficult to enforce the ordering for fine-grained algorithms on dynamic data structures (e.g., lock-coupling list). Since the data structure is changing dynamically, the set of locks associated with the nodes is dynamic too. To allow a thread to determine dynamically the order of locks, we have to ensure its view of ordering is consistent with that of all the other threads in the system, a challenge for thread-modular verification. Although dynamic locks are supported in some previous work treating deadlock-freedom as a safety property (e.g., [4, 19]), it is unclear how to apply the techniques for general progress reasoning, with possible combinations of locks, ad-hoc synchronization and rollbacks.

2.3 Our Approaches

To address these problems, our logic enforces the following principles to permit restricted forms of blocking and delay, but prevent circular reasoning and non-termination.

First, if a thread is blocked, the events it waits for must eventually occur. To avoid circular reasoning, we find "definite actions" of each thread, which under fair scheduling will definitely happen once enabled, regardless of the interference from the environment. Then each blocked thread needs to show it waits for only a finite number of definite actions from the environment threads. They form an acyclic queue, and there is always at least one of them enabled. This is what we call "definite progress", which is crucial for proving starvation-freedom.

Second, actions of a thread can delay others only if they are making the executing object method move towards termination. Each object method can only execute a finite number of such delaying actions to avoid indefinite delay. This is enforced by assigning a finite number of tokens to each method. A token must be paid to execute a delaying action.
Third, we divide actions of a thread into normal ones (which do not delay others) and delaying ones, and further stratify delaying actions into multiple levels. When a thread is delayed by a level-k action from its environment, it is allowed to execute not only more normal actions, but also more delaying actions at lower levels. Allowing one delaying action to trigger more steps of other delaying actions is necessary for verifying algorithms with nested locks and rollbacks, such as the optimistic lists in Fig. 2(a). The stratification prevents the circular delay in the example of Fig. 2(b).

Fourth, our delaying actions and definite actions are all semantically specified as part of object specifications, therefore we can support ad-hoc synchronization and do not rely on built-in synchronization primitives to enforce ordering of events. Moreover, since the specifications are all parametrized over states, they are expressive enough to support dynamic locks as in lock-coupling lists. Also our "definite progress" condition allows each blocked thread to decide locally and dynamically a queue of definite actions it waits for. There is no need to enforce a global ordering of blocking dependencies agreed by every thread. This also provides thread-modular support of dynamic locks.

Below we give more details about some of these key ideas.

2.3.1 Using Tokens to Prevent Infinite Loops

The key to ensuring termination is to require each loop to terminate. Earlier work [16, 24] requires each round of the loop to consume resources called tokens. The rule for loops is in the following form:

    P ∧ B ⇒ P′ ∗ ♦        R, G ⊢ {P′} C {P}
    ------------------------------------------ (TERM)
        R, G ⊢ {P} while (B) C {P ∧ ¬B}

Here ♦ represents one token, and "∗" is the normal separating conjunction in separation logic. The premise says the precondition P′ of the loop body C has one less token than P, showing that one token needs to be consumed to start this new round of the loop. Since the number of tokens strictly decreases, we know the loop must terminate when the thread has no token.

We use this simple idea to enforce termination of loops, and extend it to handle blocking and delay in a concurrent setting.

2.3.2 Definite Actions and Definite Progress

Our approach to cut the blocking-caused circular dependency is inspired by the implementation of ticket locks, which is used to implement the starvation-free counter sfInc in Fig. 1(c). It uses the shared variables owner and next to guarantee the first-come-first-served property of the lock. Initially owner equals next. To acquire the lock, a thread atomically increments next and reads its old value into a variable i (line 2). The value of i becomes the thread's ticket. The thread waits until owner equals its ticket value i (line 3). Finally the lock is released by incrementing owner (line 5) such that the next waiting thread (the thread with ticket i + 1, if there is one) can now enter the critical section.

We can see that sfInc does not suffer from the circular dependency problem. Intuitively the ticket lock algorithm ensures that the threads requesting the lock always constitute a queue t1, t2, ..., tn. The head thread, t1, gets the ticket number which equals owner and can immediately acquire the lock. Once it releases the lock (by increasing owner), t1 is dequeued. Moreover, for any thread t in this queue, the number of threads ahead of t never increases. Thus t must eventually become the head of the queue and acquire the lock.
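For instance, suppose owner = 3 and next = 6, so the threads holding tickets 3, 4 and 5 are queued; writing t_{(i)} for the thread holding ticket i, the queue evolves schematically as follows (an illustration of the intuition above, not a definition from the logic):

\[
  t_{(3)},\ t_{(4)},\ t_{(5)}
  \;\xrightarrow{\ \mathtt{owner}\,:=\,4\ }\;
  t_{(4)},\ t_{(5)}
  \;\xrightarrow{\ \mathtt{owner}\,:=\,5\ }\;
  t_{(5)}
\]

For the thread holding ticket i, the number of threads ahead of it is i − owner; it never grows, and it shrinks by one at every release.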
Here the dependencies among progress of the threads are in concert with the queue. Following this queue principle, we explicitly specify the queue of progress dependencies in our logic to avoid circular reasoning.

Definite actions. First, we introduce a novel notion called a "definite action" D, which models a thread action that, once enabled, must eventually be finished regardless of what the environment does. In detail, D is in the form of Pd ❀ Qd. It requires in every execution that Qd should eventually hold if Pd holds, and Pd should be preserved (by both the current thread and the environment) until Qd holds. For sfInc, the definite action Pd ❀ Qd of a thread can be defined as follows. Pd says that owner equals the thread's ticket number i, and Qd says that owner has been increased to i + 1. That is, a thread definitely releases the lock after acquiring it. Of course we have to ensure in our logic that D is indeed definite. We will explain in detail the logic rule that enforces it in Sec. 4.2.2.

Definite progress. Then we use definite actions to prove termination of loops. We first need to find an assertion Q specifying the condition under which the thread t can progress on its own, i.e., it is not blocked. Then we enforce the following principles:

1. If Q is continuously true, we need to prove the loop terminates following the idea of the TERM rule;

2. If Q is false, the following must always be true:

(a) There is a finite queue of definite actions of other threads that the thread t is waiting for, among which there is at least one (from a certain thread t′) enabled. The length of the queue is E.

(b) E decreases whenever one of these definite actions is finished;

(c) The expression E is never increased by any threads (no matter whether Q holds or not); and it is non-negative.

We can see E serves as a well-founded metric. By induction over E we know eventually Q holds, which implies the termination of the loop by condition 1 above.

These conditions are enforced in our new inference rule for loops, which extends the TERM rule (in Sec. 2.3.1) and is presented in Sec. 4.2.2. Condition 2 shows the use of definite actions in our reasoning about progress. We call it the "definite progress" condition.

The reasoning above implicitly makes use of the fairness assumption. Fair scheduling ensures that the environment thread t′ mentioned in condition 2(a) is scheduled infinitely often, therefore its definite action will definitely happen. By conditions 2(b) and 2(c) we know E will become smaller. In this way E keeps decreasing until Q holds eventually.

For sfInc, Q is defined as (i = owner) and the metric E is (i − owner). Whenever an environment thread t′ finishes a definite action by releasing the lock, it increases owner, so E decreases. When E is decreased to 0, the current thread is unblocked. Its loop terminates and it succeeds in acquiring the lock.

2.3.3 Allowing Queue Jumps for Deadlock-Free Objects

The method dfInc in Fig. 1(b) implements a deadlock-free counter using the TAS lock. If the current thread t waits for the lock, we know the queue of definite actions it waits for is of length one, because it is possible for the thread to acquire the lock immediately after the lock is released. However, as we explain in Sec. 2.2.1, another thread t′ may preempt t and do a successful cas. Then thread t is blocked and waits for a queue of definite actions again. This delay caused by thread t′ can be viewed as a queue jump in our definite-progress-based reasoning.
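For comparison, recall the sfInc instantiation developed above, collected here in one place (a schematic summary; the formal definitions appear in Sec. 4, and i is the ticket of the thread t being verified):

\[
\begin{array}{l@{\qquad}l}
  D \;\triangleq\; (\mathtt{owner} = i) \;\leadsto\; (\mathtt{owner} = i + 1) & \text{the lock holder definitely releases the lock}\\
  Q \;\triangleq\; (i = \mathtt{owner})                                        & \text{thread $t$ is no longer blocked}\\
  E \;\triangleq\; i - \mathtt{owner}                                          & \text{number of definite actions $t$ still waits for}
\end{array}
\]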
Actually dfInc cannot satisfy the definite progress requirement, because we cannot find a strictly decreasing queue size E. It is not starvation-free. However, the queue jump here is acceptable when verifying deadlock-freedom. This is because thread t′ delays t only if t′
Nevertheless, as explained in Sec. 2.2.2, we have to make sure the queue jump (which is a special form of delay) is a "good" one.

We follow the token-transfer ideas [16, 24] to support disciplined queue jumps. We explicitly specify in the rely/guarantee conditions which steps could delay the progress of other threads (jump their queues). To prohibit unlimited queue jumps without making progress, we assign a finite number m of ◆-tokens to an object method, and require that a thread can do at most m delaying actions before the method finishes.

Whenever a step of thread t′ delays the progress of thread t, we require t′ to consume one ◆-token. At the same time, thread t could increase ♦-tokens so that it can loop more rounds. Besides, we redefine the definite progress condition to allow the metric E (about the length of the queue) to be increased when an environment thread jumps the queue at the cost of a ◆-token.

2.3.4 Allowing Rollbacks for Optimistic Locking

The ideas we just explained already support simple deadlock-free objects such as dfInc in Fig. 1(b), but they cannot be applied to objects with optimistic synchronization, such as optimistic lists [13] and lazy lists [11].

    1  local b := false, p, c;
    2  while (!b) {
    3    (p, c) := find(e);
    4    lock p; lock c;
    5    b := validate(p, c);
    6    if (!b) {
    7      unlock c; unlock p; }
    8  }
    9  update(p, c, e);
    10 unlock c; unlock p;
    (a) optimistic list

    1  lock L1;
    2  local r := L2;
    3  while (r != 0) {
    4    unlock L1;
    5    lock L1;
    6    r := L2;
    7  }
    8  lock L2;
    9  unlock L2;
    10 unlock L1;
    (b) rollback

Figure 2. Examples with multiple locks.

Figure 2(a) shows part of the optimistic list implementation. Each node of the list is associated with a TAS lock, the same lock as in Fig. 1(b). A thread first traverses the list without acquiring any locks (line 3). The traversal find returns two adjacent node pointers p and c. The thread then locks the two nodes (line 4), and calls validate to check if the two nodes are still valid list nodes (line 5). If validation succeeds, then the thread performs its updates (adding or removing elements) to the list (line 9). Otherwise it releases the two node locks (line 7) and restarts the traversal.

For this object, when the validation fails, a thread will release the locks it has acquired and roll back. Thus the thread may acquire the locks an unbounded number of times. Since each lock acquirement will delay other threads requesting the same lock, and each delaying action should consume one ◆-token, it seems that the thread would need an infinite number of ◆-tokens, which we prohibit in the preceding subsection to ensure deadlock-freedom, even though this list object is indeed deadlock-free.

We generalize the token-transfer ideas to allow rollbacks in order to verify this kind of optimistic algorithm, but we still have to be careful to avoid the circular delay caused by the "bad" rollbacks in Fig. 2(b), as we explain in Sec. 2.2.2.

Our solution is to stratify the delaying actions. Each action is now labeled with a level k. The normal actions which cannot delay other threads are at the lowest level 0. The ◆-tokens are stratified accordingly. A thread can roll back and do more actions at level k only when its environment does an action at a higher level k′, at the cost of a k′-level ◆-token. Note that the ◆-tokens at the highest level are strictly decreasing, which means a thread cannot roll back its actions of the highest level. In fact, the numbers of ◆-tokens at all levels constitute a tuple (mk, . . . , m2, m1), which is strictly descending along the dictionary order.

The stratified ◆-tokens naturally prohibit the circular delay problem discussed in Sec. 2.2.2 with the object in Fig. 2(b). The deadlock-free optimistic list in Fig. 2(a) can now be verified. We classify its delaying actions into two levels. Actions at level 2 (the highest level) update the list, which corresponds to line 9 in Fig. 2(a), and each method can do only one such action. Lock acquirements are classified at level 1, so a thread is allowed to roll back and re-acquire the locks when its environment updates the list.
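To make the two-level classification concrete, the sketch below re-renders the loop of Fig. 2(a) in C-like form and annotates where the level-1 and level-2 delaying actions occur. Here node_t, find, validate, update, lock and unlock are assumed helpers (find, validate and update are left abstract, and lock/unlock stand for the per-node TAS lock); this is an illustration of the classification only, not the paper's verified code.

    #include <stdbool.h>

    typedef struct node node_t;

    /* Assumed helpers, left abstract (cf. Fig. 2(a)). */
    void find(int e, node_t **p, node_t **c);   /* traverse without locks   */
    bool validate(node_t *p, node_t *c);        /* are p, c still in the list? */
    void update(node_t *p, node_t *c, int e);   /* add or remove element e  */
    void lock(node_t *n);                       /* per-node TAS lock        */
    void unlock(node_t *n);

    void optimistic_op(int e) {
        bool b = false;
        node_t *p, *c;
        while (!b) {
            find(e, &p, &c);        /* lock-free traversal (line 3)          */
            lock(p); lock(c);       /* level-1 delaying actions (line 4):
                                       they may delay other threads wanting
                                       these locks, and may be repeated after
                                       a rollback, paid for by a level-2
                                       action of the environment             */
            b = validate(p, c);     /* line 5 */
            if (!b) {               /* validation failed: roll back (line 7) */
                unlock(c); unlock(p);
            }
        }
        update(p, c, e);            /* level-2 delaying action (line 9): the
                                       single list update per method call;
                                       level-2 tokens strictly decrease      */
        unlock(c); unlock(p);       /* line 10 */
    }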
3. Programming Language

Figure 3 gives the syntax of the language. A program W consists of multiple parallel threads sharing an object Π. We say the threads are the clients of the object. An object declaration Π is a mapping from a method name f to a pair of argument and code (the method body). The statements C are similar to those in the simple language used for separation logic. The command print(E) generates externally observable events, which allows us to observe behaviors of programs. We assume it is used only in the code of clients. We use ⟨C⟩ to represent an atomic block in which C is executed atomically.

    (MName) f ∈ String    (ThrdID) t ∈ Nat
    (Expr)  E ::= x | n | E + E | . . .
    (BExp)  B ::= true | false | E = E | !B | . . .
    (Instr) c ::= x := E | x := [E] | [E] := E | print(E) | . . .
    (Stmt)  C ::= skip | c | x := f(E) | return E | ⟨C⟩ | C; C
                | if (B) C else C | while (B){ C }
    (Prog)  W ::= skip | let Π in C ∥ . . . ∥ C
    (ODecl) Π, Γ ::= {f1 ❀ (x1, C1), . . . , fn ❀ (xn, Cn)}
    (Store) s, s ∈ Var ⇀ Int    (Heap) h, h ∈ Nat ⇀ Int
    (Mem)   σ, Σ ::= (s, h)     (PState) S ::= (σc, σo, . . .)

Figure 3. The language syntax and state model.

To simplify the presentation, we make several simplifications here. We assume there is only one concurrent object in the system. Each method of the object takes only one argument, and the method body ends with a return E command. Also we assume there is no nested method invocation. For the abstract object specification Γ, each method body is an atomic operation ⟨C⟩ (ahead of the return command), and executing the code is always safe and never blocks.

The model of program states S is defined at the bottom of Fig. 3. To ensure that the client code does not touch the object state, in S we separate the states accessed by clients (σc) and by the object (σo). Here S may also contain auxiliary data about the control stacks for method calls. Execution of programs is modeled as a labeled transition system (W, S) −e→ (W′, S′). Here e is an event (either observable or not) produced by the transition. We give the small-step transition semantics in TR [22].

Below we often write Σ, s and h for the notations at the abstract level, to distinguish them from the concrete-level ones (σ, s and h).
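As a small illustration of the program shape W ::= let Π in C ∥ . . . ∥ C and of the convention that only clients produce observable events, here is a hedged C analogue with two client threads sharing a one-method counter object. The pthread scaffolding and the mutex-based inc are our own stand-ins; any of the counter implementations discussed earlier could be plugged in as the object.

    #include <pthread.h>
    #include <stdio.h>

    /* The object Π: a counter with a single method inc.  A plain mutex is
       used here only to keep the sketch self-contained.                   */
    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    static int x = 0;

    static int inc(int v) {
        pthread_mutex_lock(&m);
        int r = (x = x + v);
        pthread_mutex_unlock(&m);
        return r;
    }

    /* A client thread C_i: it calls the object and prints the result;
       printf plays the role of print(E), the only observable events.      */
    static void *client(void *arg) {
        (void)arg;
        printf("%d\n", inc(1));
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;                 /* W = let Π in C1 ∥ C2 */
        pthread_create(&t1, NULL, client, NULL);
        pthread_create(&t2, NULL, client, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }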
4. Program Logic LiLi

LiLi verifies the linearizability of objects by proving that the method implementations refine abstract atomic operations. The top-level judgment is in the form of D, R, G ⊢ {P} Π ≼ Γ. (The OBJ rule for this judgment is given in Fig. 7 and will be explained later.) To verify an object Π, we give a corresponding object specification Γ (see Fig. 3), which maps method names to atomic commands. In addition, we need to provide an object invariant (P) and rely/guarantee conditions (R and G) for the refinement reasoning in a concurrent