Modular Verification of Linearizability with Non-Fixed Linearization Points

Hongjin Liang    Xinyu Feng
University of Science and Technology of China
lhj1018@mail.ustc.edu.cn    xyfeng@ustc.edu.cn

Abstract

Locating linearization points (LPs) is an intuitive approach for proving linearizability, but it is difficult to apply the idea in Hoare-style logic for formal program verification, especially for verifying algorithms whose LPs cannot be statically located in the code. In this paper, we propose a program logic with a lightweight instrumentation mechanism which can verify algorithms with non-fixed LPs, including the most challenging ones that use the helping mechanism to achieve lock-freedom (as in the HSY elimination-based stack), or have LPs depending on unpredictable future executions (as in the lazy set algorithm), or involve both features. We also develop a thread-local simulation as the meta-theory of our logic, and show it implies contextual refinement, which is equivalent to linearizability. Using our logic we have successfully verified various classic algorithms, some of which are used in the java.util.concurrent package.

Categories and Subject Descriptors  D.2.4 [Software Engineering]: Software/Program Verification – Correctness proofs, Formal methods; F.3.1 [Logics and Meanings of Programs]: Specifying and Verifying and Reasoning about Programs

General Terms  Theory, Verification

Keywords  Concurrency; Rely-Guarantee Reasoning; Linearizability; Refinement; Simulation

1. Introduction

Linearizability is a standard correctness criterion for concurrent object implementations [16]. It requires the fine-grained implementation of an object operation to have the same effect as an instantaneous atomic operation. To prove linearizability, the most intuitive approach is to find a linearization point (LP) in the code of the implementation, and show that it is the single point where the effect of the operation takes place.

However, it is difficult to apply this idea when the LPs are not fixed in the code of object methods. For a large class of lock-free algorithms with the helping mechanism (e.g., the HSY elimination-based stack [14]), the LP of one method might be in the code of some other method. In these algorithms, each thread maintains a descriptor recording all the information required to fulfill its intended operation. When a thread A detects a conflict with another thread B, A may access B's descriptor and help B finish its intended operation first before finishing its own. In this case, B's operation takes effect at a step from A. Thus its LP should not be in its own code, but in the code of thread A.

Besides, in optimistic algorithms and lazy algorithms (e.g., Heller et al.'s lazy set [13]), the LPs might depend on unpredictable future interleavings. In those algorithms, a thread may access the shared states as if no interference would occur, and validate the accesses later. If the validation succeeds, it finishes the operation; otherwise it rolls back and retries. Its LP is usually at a prior state access, but only if the later validation succeeds.

Reasoning about algorithms with non-fixed LPs has been a long-standing problem. Most existing work either supports only simple objects with static LPs in the implementation code (e.g., [2, 5, 19, 30]), or lacks formal soundness arguments (e.g., [32]). In this paper, we propose a program logic for verification of linearizability with non-fixed LPs. For a concrete implementation of an object method, we treat the corresponding abstract atomic operation and the abstract state as auxiliary states, and insert auxiliary commands at the LP to execute the abstract operation simultaneously with the concrete step. We verify the instrumented implementation in an existing concurrent program logic (we will use LRG [8] in this paper), but extend it with new logic rules for the auxiliary commands. We also give a new relational interpretation to the logic assertions, and show that at the LP, the step of the original concrete implementation has the same effect as the abstract operation. We handle non-fixed LPs in the following way:

• To support the helping mechanism, we collect a pending thread pool as auxiliary state, which is a set of threads and their abstract operations that might be helped. We allow the thread that is currently being verified to use auxiliary commands to help execute the abstract operations in the pending thread pool.

• For future-dependent LPs, we introduce a try-commit mechanism to reason with uncertainty. The try clause guesses whether the corresponding abstract operation should be executed and keeps all possibilities, while commit chooses a specific possible case when we know which guess is correct later.

Although our program logic looks intuitive, it is challenging to prove that the logic is sound w.r.t. linearizability. Recent work has shown the equivalence between linearizability and contextual refinement [5, 9, 10]. The latter is often verified by proving simulations between the concrete implementation and the atomic operation [5]. The simulation establishes some correspondence between the executions of the two sides, showing there exists one step in the concrete execution that fulfills the abstract operation. Given the equivalence between linearizability and refinement, we would expect the simulations to justify the soundness of the LP method and to serve as the meta-theory of our logic. However, existing thread-local simulations do not support non-fixed LPs (except the recent work [31], which we will discuss in Sec. 7). We will explain the challenges in detail in Sec. 2.
Our work is inspired by the earlier work on linearizability verification, in particular the use of auxiliary code and states by Vafeiadis [32] and our previous work on the thread-local simulation RGSim [19], but makes the following new contributions:

• We propose the first program logic that has a formal soundness proof for linearizability with non-fixed LPs. Our logic is built upon the unary program logic LRG [8], but we give a relational interpretation of assertions and rely/guarantee conditions. We also introduce new logic rules for auxiliary commands used specifically for linearizability proofs.

• We give a light instrumentation mechanism to relate concrete implementations with abstract operations. The systematic use of auxiliary states and commands makes it possible to execute the abstract operations synchronously with the concrete code. The try-commit clauses allow us to reason about future-dependent uncertainty without resorting to prophecy variables [1, 32], whose existing semantics (e.g., [1]) is unsuitable for Hoare-style verification.

• We design a novel thread-local simulation as the meta-theory for our logic. It generalizes RGSim [19] and other compositional reasoning of refinement (e.g., [5, 30]) with the support for non-fixed LPs.

• Instead of ensuring linearizability directly, the program logic and the simulation both establish contextual refinement, which we prove is equivalent to linearizability. A program logic for contextual refinement is interesting in its own right, since contextual refinement is also a widely accepted (and probably more natural) correctness criterion for library code.

• We successfully apply our logic to verify 12 well-known algorithms. Some of them are used in the java.util.concurrent package, such as the MS non-blocking queue [23] and the Harris-Michael lock-free list [11, 22].

In the rest of this paper, we first analyze the challenges in the logic design and explain our approach informally in Sec. 2. Then we give the basic technical setting in Sec. 3, including a formal operational definition of linearizability. We present our program logic in Sec. 4, and the new simulation relation as the meta-theory in Sec. 5. In Sec. 6 we summarize all the algorithms we have verified and sketch the proofs of three representative algorithms. We discuss related work and conclude in Sec. 7.

2. Challenges and Our Approach

Below we start from a simple program logic for linearizability with fixed LPs, and extend it to support algorithms with non-fixed LPs. We also discuss the problems with the underlying meta-theory, which establishes the soundness of the logic w.r.t. linearizability.

2.1 Basic Logic for Fixed LPs

We first show a simple and intuitive logic which follows the LP approach. As a working example, Fig. 1(a) shows the implementation of push in Treiber stack [29] (let's first ignore the auxiliary code at line 7'). The stack object is implemented as a linked list pointed to by S, and push(v) repeatedly tries to update S to point to the new node using compare-and-swap (cas) until it succeeds.

(a) Treiber Stack

 1 push(int v) {
 2   local x, t, b;
 3   x := new node(v);
 4   do {
 5     t := S;
 6     x.next := t;
 7     <b := cas(&S, t, x);
 7'    if (b) linself; >
 8   } while (!b);
 9 }

(b) HSY Elimination-Based Stack

 1 push(int v) {
 2   local p, him, q;
 3   p := new thrdDescriptor(cid, PUSH, v);
 4   while (true) {
 5     if (tryPush(v)) return;
 6     loc[cid] := p;
 7     him := rand(); q := loc[him];
 8     if (q != null && q.id = him && q.op = POP)
 9       if (cas(&loc[cid], p, null)) {
10         <b := cas(&loc[him], q, p);
10'        if (b) { lin(cid); lin(him); } >
11         if (b) return; }
12     ...
13 } }

(c) Pair Snapshot

 1 readPair(int i, j) {
 2   local a, b, v, w;
 3   while (true) {
 4     <a := m[i].d; v := m[i].v;>
 5     <b := m[j].d; w := m[j].v;
 5'    trylinself; >
 6     if (v = m[i].v) {
 6'      commit(cid ↝ (end, (a, b)));
 7       return (a, b); }
 8 }}
 9 write(int i, d) {
10   <m[i].d := d; m[i].v++;> }

Figure 1. LPs and Instrumented Auxiliary Commands

To verify linearizability, we first locate the LP in the code. The LP of push(v) is at the cas statement when it succeeds (line 7). That is, the successful cas can correspond to the abstract atomic PUSH(v) operation: Stk := v::Stk; and all the other concrete steps cannot. Here we simply represent the abstract stack Stk as a sequence of values, with "::" for concatenation. Then push(v) can be linearized at the successful cas, since it is the single point where the operation takes effect.

We can encode the above reasoning in an existing (unary) concurrent program logic, such as Rely-Guarantee reasoning [17] and CSL [24]. Inspired by Vafeiadis [32], we embed the abstract operation γ and the abstract state θ as auxiliary states on the concrete side, so the program state now becomes (σ, (γ, θ)), where σ is the original concrete state. Then we instrument the concrete implementation with an auxiliary command linself (shorthand for "linearize self") at the LP to update the auxiliary state. Intuitively, linself will execute the abstract operation γ over the abstract state θ, as described in the following operational semantics rule:

    (γ, θ) ⇝ (end, θ′)
    ─────────────────────────────────────────────────
    (linself, (σ, (γ, θ))) −→ (skip, (σ, (end, θ′)))

Here ⇝ encodes the transition of γ at the abstract level, and end is a termination marker. We insert linself into the same atomic block with the concrete statement at the LP, such as line 7' in Fig. 1(a), so that the concrete and abstract sides are executed simultaneously. Here the atomic block ⟨C⟩ means C is executed atomically. Then we reason about the instrumented code using a traditional concurrent logic extended with a new inference rule for linself.

The idea is intuitive, but it cannot handle more advanced algorithms with non-fixed LPs, including the algorithms with the helping mechanism and those whose locations of LPs depend on the future interleavings. Below we analyze the two challenges in detail and explain our solutions using two representative algorithms, the HSY stack and the pair snapshot.
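To make the instrumented semantics concrete, the following Python sketch models the auxiliary state (σ, (γ, θ)) and the linself rule above. The encoding (dictionaries for states, a closure for γ) is our own illustration, not the paper's formalism.

def PUSH(v):
    # Abstract atomic operation PUSH(v): Stk := v::Stk.
    # gamma maps an abstract state theta to (return value, new theta).
    def gamma(theta):
        return None, {"Stk": (v,) + theta["Stk"]}
    return gamma

END = "end"   # termination marker for a finished abstract operation

def linself(state):
    # Mirrors the rule:  (gamma, theta) ~> (end, theta')
    #   (linself, (sigma, (gamma, theta))) --> (skip, (sigma, (end, theta')))
    sigma, (gamma, theta) = state
    assert gamma is not END, "the abstract operation was already linearized"
    _ret, theta2 = gamma(theta)
    return sigma, (END, theta2)

# At line 7' of Treiber's push, linself runs inside the same atomic block
# as the successful cas, so the concrete step and PUSH(v) happen together.
sigma = {"S": "concrete list after cas"}
print(linself((sigma, (PUSH(42), {"Stk": ()}))))
# ({'S': 'concrete list after cas'}, ('end', {'Stk': (42,)}))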
2.2 Support Helping Mechanism with Pending Thread Pool

The HSY elimination-based stack [14] is a typical example using the helping mechanism. Figure 1(b) shows part of its push method implementation. The basic idea behind the algorithm is to let a push and a pop cancel out each other.

At the beginning of the method in Fig. 1(b), the thread allocates its thread descriptor (line 3), which contains the thread id, the name of the operation to be performed, and the argument. The current thread cid first tries to perform Treiber stack's push (line 5). It returns if it succeeds. Otherwise, it writes its descriptor into the global loc array (line 6) to allow other threads to eliminate its push. The elimination array loc[1..n] has one slot for each thread, which holds a pointer to a thread descriptor. The thread randomly reads a slot him in loc (line 7). If the descriptor q says him is doing a pop, cid tries to eliminate itself with him by two cas instructions. The first clears cid's entry in loc so that no other thread could eliminate with cid (line 9). The second attempts to mark the entry of him in loc as "eliminated with cid" (line 10). If it succeeds, this step should be the LP of both the push of cid and the pop of him, with the push happening immediately before the pop.

The helping mechanism allows the current thread to linearize the operations of other threads, which cannot be expressed in the basic logic. It also breaks modularity and makes thread-local verification difficult. For the thread cid, its concrete step could correspond to the steps of both cid and him at the abstract level. For him, a step from its environment could fulfill its abstract operation. We must ensure in the thread-local verification that the two threads cid and him always take consistent views on whether and how the abstract operation of him is done. For example, if we let a concrete step in cid fulfill the abstract pop of him, we must know him is indeed doing a pop and its pop has not been done before. Otherwise, we will not be able to compose cid and him in parallel.

We extend the basic logic to express the helping mechanism. First we introduce a new auxiliary command lin(t) to linearize a specific thread t. For instance, in Fig. 1(b) we insert line 10' at the LP to execute both the push of cid and the pop of him at the abstract level. We also extend the auxiliary state to record the abstract operations of both cid and him. More generally, we embed a pending thread pool U, which maps threads to their abstract operations. It specifies a set of threads whose operations might be helped by others. Then under the new state (σ, (U, θ)), the semantics of lin(t) just executes the thread t's abstract operation in U, similarly to the semantics of linself discussed before.

The shared pending thread pool U allows us to recover thread modularity when verifying the helping mechanism. A concrete step of cid could fulfill the operation of him in U as well as its own abstract operation; and conversely, the thread him running in parallel could check U to know whether its operation has been finished by others (such as cid) or not. We gain consistent abstract information about other threads in the thread-local verification. Note that the need of U itself does not break modularity, because the required information about other threads' abstract operations can be inferred from the concrete state. In the HSY stack example, we know him is doing a pop by looking at its thread descriptor in the elimination array. In this case U can be viewed as an abstract representation of the elimination array.
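Continuing the Python sketch from Sec. 2.1 (again our own illustrative encoding, not the paper's formalism), the auxiliary state now carries the pending thread pool U, and lin(t) executes thread t's pending abstract operation:

END = "end"

def PUSH(v):
    def gamma(theta):
        return None, {"Stk": (v,) + theta["Stk"]}
    return gamma

def POP():
    # Abstract atomic operation POP: return and remove the head of Stk.
    def gamma(theta):
        stk = theta["Stk"]
        return stk[0], {"Stk": stk[1:]}
    return gamma

def lin(t, state):
    # Execute thread t's abstract operation in the pending pool U,
    # under the extended state (sigma, (U, theta)).
    sigma, (U, theta) = state
    gamma = U[t]
    assert gamma is not END, "t's operation has already been done"
    _ret, theta2 = gamma(theta)
    return sigma, ({**U, t: END}, theta2)

# Line 10' of the HSY push: after the successful cas, thread cid runs
# lin(cid) and then lin(him), so the push takes effect immediately
# before the pop it eliminates.
state = ({}, ({"cid": PUSH(7), "him": POP()}, {"Stk": ()}))
state = lin("cid", state)
state = lin("him", state)
print(state)   # both entries are now "end"; the stack is empty again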
2.3 Try-Commit Commands for Future-Dependent LPs

Another challenge is to reason about optimistic algorithms whose LPs depend on the future interleavings.

We give a toy example, the pair snapshot [27], in Fig. 1(c). The object is an array m, each slot of which contains two fields: d for the data and v for the version number. The write(i,d) method (lines 9-10) updates the data stored at address i and increments the version number instantaneously. The readPair(i,j) method intends to perform an atomic read of the two slots i and j in the presence of concurrent writes. It reads the data at slots i and j separately at lines 4 and 5, and validates the first read at line 6. If i's version number has not been increased, the thread knows that when it read j's data at line 5, i's data had not been updated. This means the two reads were at a consistent state, so the thread can return. We can see that the LP of readPair should be at line 5 when the thread reads j's data, but only if the validation at line 6 succeeds. That is, whether we should linearize the operation at line 5 depends on the future, unpredictable behavior of line 6.

As discussed extensively in previous work (e.g., [1, 32]), future-dependent LPs cannot be handled by introducing history variables, which are auxiliary variables storing values or events in the past executions. We have to refer to events coming from the unpredictable future. Thus people propose prophecy variables [1, 32] as the dual of history variables to store future behaviors. But as far as we know, there is no semantics of prophecy variables suitable for Hoare-style local and compositional reasoning.

Instead of resorting to prophecy variables, we follow the speculation idea [31]. For the concrete step at a potential LP (e.g., line 5 of readPair), we execute the abstract operation speculatively and keep both the result and the original abstract configuration. Later, based on the result of the validation (e.g., line 6 in readPair), we keep the appropriate branch and discard the other.

For the logic, we introduce two new auxiliary commands: trylinself does the speculation, and commit(p) commits to the appropriate branch satisfying the assertion p. In Fig. 1(c), we insert lines 5' and 6', where cid ↝ (end, (a, b)) means that the current thread cid should have done its abstract operation and would return (a, b). We also extend the auxiliary state to record the multiple possibilities of abstract operations and abstract states after speculation.

Furthermore, we can combine the speculation idea with the pending thread pool. We allow the abstract operations in the pending thread pool, as well as the current thread, to speculate. Then we can handle some trickier algorithms such as RDCSS [12], in which the location of the LP for thread t may be in the code of some other thread and also depend on the future behaviors of that thread. Please see Sec. 6 for one such example.
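The following Python sketch (our own encoding of the idea, with names matching the auxiliary commands but deliberately simplified types) shows speculation as a list of branches that trylinself extends and commit(p) filters:

def trylinself(specs, gamma):
    # Speculate at a potential LP: keep the original branch and add the
    # branch in which the abstract operation has already executed.
    out = list(specs)
    for status, theta, res in specs:
        if status == "pending":
            res2, theta2 = gamma(theta)
            out.append(("end", theta2, res2))
    return out

def commit(specs, p):
    # Resolve the guess: keep only the branches satisfying assertion p.
    chosen = [s for s in specs if p(s)]
    assert chosen, "no speculation satisfies the commit assertion"
    return chosen

# readPair(i, j): the read of m[j] at line 5 is a potential LP.
theta = {"m": {1: ("a", 0), 2: ("b", 0)}}     # slot -> (data, version)
gamma = lambda th: (("a", "b"), th)           # the abstract atomic read
specs = [("pending", theta, None)]            # one branch before line 5'
specs = trylinself(specs, gamma)              # two branches after line 5'
# Line 6 validated i's version, so line 6' commits to the branch where
# cid has finished and returns (a, b):
specs = commit(specs, lambda s: s[0] == "end" and s[2] == ("a", "b"))
print(specs)   # [('end', {'m': {...}}, ('a', 'b'))]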
2.4 Simulation as Meta-Theory

The LP proof method can be understood as building simulations between the concrete implementations and the abstract atomic operations, such as the simple weak simulation in Fig. 2(a). The lower-level and higher-level arrows are the steps of the implementation and of the abstract operation respectively, and the dashed lines denote the simulation relation. We use dark nodes and white nodes at the abstract level to distinguish whether the operation has been finished or not. The only step at the concrete side corresponding to the single abstract step should be the LP of the implementation (labeled "LP" in the diagram). Since our program logic is based on the LP method, we can expect simulations to justify its soundness. In particular, we want a thread-local simulation which can handle both the helping mechanism and future-dependent LPs and can ensure linearizability.

Figure 2. Simulation Diagrams: (a) Simple Simulation; (b) Pending Thread Pool; (c) Speculation. (Diagrams omitted; their labels mark the LP, the LPs of threads t and t', and a potential LP followed by commit.)

To support helping in the simulation, we should allow the LP step at the concrete level to correspond to an abstract step made by a thread other than the one being verified. This requires information from other threads at the abstract side, and thus makes it difficult to build a thread-local simulation. To address the problem, we introduce the pending thread pool at the abstract level of the simulation, just as in the development of our logic in Sec. 2.2.
The new simulation is shown in Fig. 2(b). We can see that a concrete step of thread t could help linearize the operation of t' in the pending thread pool as well as its own operation. Thus the new simulation intuitively supports the helping mechanism.

As forward simulations, neither of the simulations in Fig. 2(a) and (b) supports future-dependent LPs. For each step along the concrete execution in those simulations, we need to decide immediately whether the step is at the LP, and cannot postpone the decision to the future. As discussed extensively in previous work (e.g., [1, 3, 6, 21]), we have to introduce backward simulations or hybrid simulations to support future-dependent LPs. Here we exploit the speculation idea and develop a forward-backward simulation [21]. As shown in Fig. 2(c), we keep both speculations after the potential LP, where the higher black nodes result from executing the abstract operation and the lower white nodes record the original abstract configuration. Then at the validation step we commit to the correct branch.

Finally, to ensure linearizability, the thread-local simulation has to be compositional. As a counterexample, we can construct a simple simulation (like the one in Fig. 2(a)) between the following implementation C and the abstract atomic increment operation γ, but C is not linearizable w.r.t. γ.

    C : local t; t := x; x := t + 1;        γ : x++

The reason is that the simple simulation is not compositional w.r.t. parallel compositions. To address this problem, we proposed the compositional simulation RGSim [19] in previous work. The idea is to parameterize the simple simulation with the interference from the environment, in the form of rely/guarantee conditions (R and G) [17]. RGSim says that the concrete executions are simulated by the abstract executions under interference from the environment R, and all the related state transitions of the thread being verified should satisfy G. For parallel composition, we check that the guarantee G of each thread is permitted in the rely R of the other. Then the simulation becomes compositional and can ensure linearizability.

We combine the above ideas and develop a new compositional simulation with the support of non-fixed LPs as the meta-theory of our logic. We will discuss our simulation formally in Sec. 5.
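To see concretely why C above is not linearizable w.r.t. γ, the following Python harness (ours, purely for illustration) enumerates the interleavings of two threads running C from x = 0; the lost-update outcome x = 1 is impossible for two atomic x++ operations:

from itertools import permutations

# Each thread runs two steps: a read (t := x) then a write (x := t + 1).
steps = [(1, "read"), (1, "write"), (2, "read"), (2, "write")]

def interleavings():
    for order in set(permutations(steps)):
        # keep the orders in which each thread reads before it writes
        if order.index((1, "read")) < order.index((1, "write")) and \
           order.index((2, "read")) < order.index((2, "write")):
            yield order

results = set()
for order in interleavings():
    x, t = 0, {1: 0, 2: 0}
    for tid, op in order:
        if op == "read":
            t[tid] = x
        else:
            x = t[tid] + 1
    results.add(x)

print(results)  # {1, 2}: the outcome 1 cannot come from two atomic x++
                # steps, so C admits a behavior that two gammas do not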
3. Basic Technical Settings and Linearizability

In this section, we formalize linearizability of an object implementation w.r.t. its specification, and show that linearizability is equivalent to contextual refinement.

3.1 Language and Semantics

As shown in Fig. 3, a program W contains several client threads in parallel, each of which could call the methods declared in the object Π. A method is defined as a pair (x, C), where x is the formal argument and C is the method body. For simplicity, we assume there is only one object in W and each method takes one argument only, but it is easy to extend our work with multiple objects and arguments.

(MName)  f ∈ String
(Expr)   E ::= x | n | E + E | ...
(BExp)   B ::= true | false | E = E | !B | ...
(Instr)  c ::= x := E | x := [E] | [E] := E | print(E) | x := cons(E, ..., E) | dispose(E) | ...
(Stmt)   C ::= skip | c | x := f(E) | return E | noret | ⟨C⟩ | C; C | if (B) C else C | while (B){C}
(Prog)   W ::= skip | let Π in C1 ∥ ... ∥ Cn
(ODecl)  Π ::= {f1 ↝ (x1, C1), ..., fn ↝ (xn, Cn)}

Figure 3. Syntax of the Programming Language

We use a runtime command noret to abort methods that terminate but do not execute return E. It is automatically appended to the method code and is not supposed to be used by programmers. Other commands are mostly standard. Clients can use print(E) to produce observable external events. We do not allow the object's methods to produce external events. To simplify the semantics, we also assume there are no nested method calls.

Figure 4 gives the model of program states. Here we partition a global state S into the client memory σc, the object σo and a thread pool K. A client can only access the client memory σc, unless it calls object methods. The thread pool maps each thread id t to its local call stack frame. A call stack κ could be either empty (◦) when the thread is not executing a method, or a triple (σl, x, C), where σl maps the method's formal argument and local variables (if any) to their values, x is the caller's variable to receive the return value, and C is the caller's remaining code to be executed after the method returns. To give a thread-local semantics, we also define the thread-local view s of the state.

(ThrdID)   t ∈ Nat
(Mem)      σ ∈ (PVar ∪ Nat) ⇀ Int
(CallStk)  κ ::= (σl, x, C) | ◦
(ThrdPool) K ::= {t1 ↝ κ1, ..., tn ↝ κn}
(PState)   S ::= (σc, σo, K)
(LState)   s ::= (σc, σo, κ)
(Evt)      e ::= (t, f, n) | (t, ok, n) | (t, obj, abort) | (t, out, n) | (t, clt, abort)
(ETrace)   H ::= ϵ | e::H

Figure 4. States and Event Traces

Figure 5 gives selected rules of the operational semantics. We show three kinds of transitions: −→ for the top-level program transitions, −→t,Π for the transitions of thread t with the methods' declaration Π, and ⟿t for the steps inside method calls of thread t. To describe the operational semantics for threads, we use an execution context E:

    (ExecContext) E ::= [ ] | E; C

The hole [ ] shows the place where the execution of code occurs. E[C] represents the code resulting from placing C into the hole.

(a) Program Transitions

    (Ci, (σc, σo, K(i))) −e→t,Π (Ci′, (σc′, σo′, κ′))
    ──────────────────────────────────────────────────────────────────────────────────────
    (let Π in C1 ...Ci ...Cn, (σc, σo, K)) −e→ (let Π in C1 ...Ci′ ...Cn, (σc′, σo′, K{i ↝ κ′}))

(b) Thread Transitions

    Π(f) = (y, C)    ⟦E⟧σc = n    x ∈ dom(σc)    κ = ({y ↝ n}, x, E[skip])
    ───────────────────────────────────────────────────────────────────────
    (E[x := f(E)], (σc, σo, ◦)) −(t,f,n)→t,Π (C; noret, (σc, σo, κ))

    f ∉ dom(Π) or ⟦E⟧σc undefined or x ∉ dom(σc)
    ─────────────────────────────────────────────────────
    (E[x := f(E)], (σc, σo, ◦)) −(t,clt,abort)→t,Π abort

    κ = (σl, x, C)    ⟦E⟧σo⊎σl = n    σc′ = σc{x ↝ n}
    ───────────────────────────────────────────────────────────
    (E[return E], (σc, σo, κ)) −(t,ok,n)→t,Π (C, (σc′, σo, ◦))

    ⟦E⟧σc = n
    ────────────────────────────────────────────────────────────────
    (E[print(E)], (σc, σo, ◦)) −(t,out,n)→t,Π (E[skip], (σc, σo, ◦))

    (noret, s) −(t,obj,abort)→t,Π abort

    (C, σo ⊎ σl) ⟿t (C′, σo′ ⊎ σl′)    dom(σl) = dom(σl′)
    ──────────────────────────────────────────────────────────────
    (C, (σc, σo, (σl, x, Cc))) −→t,Π (C′, (σc, σo′, (σl′, x, Cc)))

(c) Thread Transitions Inside Method Calls

    ⟦Ej⟧σ undefined (1 ≤ j ≤ i) or x ∉ dom(σ)
    ──────────────────────────────────────────
    (E[x := cons(E1, ..., Ei)], σ) ⟿t abort

    (C, σ) ⟿t* (skip, σ′)
    ──────────────────────────────
    (E[⟨C⟩], σ) ⟿t (E[skip], σ′)

    (C, σ) ⟿t* abort
    ──────────────────────
    (E[⟨C⟩], σ) ⟿t abort

Figure 5. Selected Rules of Concrete Operational Semantics

We label transitions with the events e defined in Fig. 4. An event could be a method invocation (t, f, n) or return (t, ok, n), a fault (t, obj, abort) produced by the object method code, an output (t, out, n) generated by print(E), or a fault (t, clt, abort) from the client code. The first two events are called object events, and the last two are observable external events. The third one, (t, obj, abort), belongs to both classes. An event trace H is then defined as a finite sequence of events.

3.2 Object Specification and Linearizability

Next we formalize the object specification Γ, which maps method names to their abstract operations γ, as shown in Fig. 6. γ transforms an argument value and an initial abstract object to a return value with a resulting abstract object in a single step. It specifies the intended sequential behaviors of the method. The abstract object representation θ is defined as a mapping from program variables to abstract values. We leave the abstract values unspecified here; they can be instantiated by programmers.

(AbsObj)  θ ∈ PVar ⇀ AbsVal
(MSpec)   γ ∈ Int → AbsObj ⇀ Int × AbsObj
(OSpec)   Γ ::= {f1 ↝ γ1, ..., fn ↝ γn}
(AbsProg) 𝕎 ::= skip | with Γ do C1 ∥ ... ∥ Cn

Figure 6. Object Specification and Abstract Program
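As a concrete reading of Fig. 6, here is a Python sketch (our encoding) of a stack specification Γ: each method name maps to a γ that takes the argument and an abstract object θ and atomically returns the result paired with the new θ:

def push_spec(arg, theta):
    # gamma for push: Stk := arg::Stk, returning no interesting value.
    return None, {"Stk": (arg,) + theta["Stk"]}

def pop_spec(_arg, theta):
    # gamma for pop: return and remove the head; returning "EMPTY" on an
    # empty stack is our own choice for this illustration.
    stk = theta["Stk"]
    if not stk:
        return "EMPTY", theta
    return stk[0], {"Stk": stk[1:]}

Gamma = {"push": push_spec, "pop": pop_spec}

# In the abstract program, a method call runs its gamma in one step,
# emitting the invocation and return events atomically.
theta = {"Stk": ()}
_, theta = Gamma["push"](42, theta)
ret, theta = Gamma["pop"](None, theta)
print(ret, theta)   # 42 {'Stk': ()}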
Then we give an abstract version of programs 𝕎, where clients interact with the abstract object specification Γ instead of its implementation Π. The semantics is almost the same as that of the concrete language shown in Sec. 3.1, except that the abstract atomic operation γ is executed when the method is called, and it now operates over the abstract object θ instead of over the concrete one σo. The abstract operation generates a pair of invocation and return events atomically. Due to the space limit, we give the semantics in the TR [18].

Linearizability.  Linearizability [16] is defined using the notion of histories, which are special event traces H consisting of only object events (i.e., invocations, returns and object faults).

Below we use H(i) for the i-th event of H, and |H| for the length of H. H|t represents the sub-history consisting of all the events whose thread id is t. The predicates is_inv(e) and is_res(e) mean that the event e is a method invocation and a response (i.e., a return or an object fault) respectively. We say a response e2 matches an invocation e1 iff they have the same thread id.

A history H is sequential iff the first event of H is an invocation, and each invocation, except possibly the last, is immediately followed by a matching response. Then H is well-formed iff, for all t, H|t is sequential. H is complete iff it is well-formed and every invocation has a matching response. An invocation is pending if no matching response follows it. We handle pending invocations in an incomplete history H following the standard linearizability definition [16]: we append zero or more response events to H, and drop the remaining pending invocations. Then we get a set of complete histories, which is denoted by completions(H). We give formal definitions of the above concepts in the TR [18].

Definition 1 (Linearizable Histories). H ≤lin H′ iff

1. ∀t. H|t = H′|t;
2. there exists a bijection π : {1, ..., |H|} → {1, ..., |H′|} such that ∀i. H(i) = H′(π(i)) and
   ∀i, j. i < j ∧ is_res(H(i)) ∧ is_inv(H(j)) ⟹ π(i) < π(j).

That is, H is linearizable w.r.t. H′ if the latter is a permutation of the former, preserving the order of events in the same threads and the order of the non-overlapping method calls.
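Definition 1 can be read directly as a brute-force check. The Python sketch below (ours; exponential, for illustration on small histories only) searches for the bijection π over histories of (tid, kind, payload) events:

from itertools import permutations

def is_res(e): return e[1] == "res"
def is_inv(e): return e[1] == "inv"

def lin_leq(H, Hp):
    if len(H) != len(Hp):
        return False
    # condition 1: equal per-thread sub-histories
    tids = {e[0] for e in H} | {e[0] for e in Hp}
    if any([e for e in H if e[0] == t] != [e for e in Hp if e[0] == t]
           for t in tids):
        return False
    # condition 2: find a bijection pi with H(i) = H'(pi(i)) that keeps
    # every response ordered before any later invocation
    for pi in permutations(range(len(Hp))):
        if any(H[i] != Hp[pi[i]] for i in range(len(H))):
            continue
        if all(pi[i] < pi[j]
               for i in range(len(H)) for j in range(len(H))
               if i < j and is_res(H[i]) and is_inv(H[j])):
            return True
    return False

# Two overlapping pushes may be linearized in either order:
H  = [(1,"inv","push 1"), (2,"inv","push 2"), (1,"res","ok"), (2,"res","ok")]
Hp = [(2,"inv","push 2"), (2,"res","ok"), (1,"inv","push 1"), (1,"res","ok")]
print(lin_leq(H, Hp))   # True
# But a call that returned before another began may not be reordered:
H2 = [(1,"inv","push 1"), (1,"res","ok"), (2,"inv","push 2"), (2,"res","ok")]
print(lin_leq(H2, Hp))  # False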
We use H⟦W, (σc, σo)⟧ to represent the set of histories produced by the executions of W with the initial client memory σc, the object σo, and empty call stacks for all threads, and use H⟦𝕎, (σc, θ)⟧ to generate histories from the abstract program 𝕎 with the initial client memory σc and the abstract object θ.

A legal sequential history H is a history generated by some client using the specification Γ and an initial abstract object θ:

    Γ ⊢ (θ, H)  ≝  ∃n, C1, ..., Cn, σc. H ∈ H⟦(with Γ do C1 ∥ ... ∥ Cn), (σc, θ)⟧

Then an object is linearizable iff all its completed concurrent histories are linearizable w.r.t. some legal sequential histories.

Definition 2 (Linearizability of Objects). The object's implementation Π is linearizable w.r.t. its specification Γ under a refinement mapping φ, denoted by Π ≤φ Γ, iff

    ∀n, C1, ..., Cn, σc, σo, θ, H.
      H ∈ H⟦(let Π in C1 ∥ ... ∥ Cn), (σc, σo)⟧ ∧ (φ(σo) = θ)
      ⟹ ∃Hc, H′. Hc ∈ completions(H) ∧ Γ ⊢ (θ, H′) ∧ Hc ≤lin H′

Here the mapping φ relates concrete objects to abstract ones:

    (RefMap) φ ∈ Mem ⇀ AbsObj

The side condition φ(σo) = θ in the above definition requires the initial concrete object σo to be a well-formed data structure representing a valid object θ.

3.3 Contextual Refinement and Linearizability

Next we define the contextual refinement between the concrete object and its specification, and prove its equivalence to linearizability. This equivalence will serve as the basis of our logic's soundness w.r.t. linearizability.

Informally, the contextual refinement says that, for any set of client threads, the program W has no more observable behaviors than the corresponding abstract program 𝕎. Below we use O⟦W, (σc, σo)⟧ to represent the set of observable event traces generated during the executions of W with the initial state (σc, σo) (and empty stacks). It is defined similarly to H⟦W, (σc, σo)⟧, but now the traces consist of observable events only (output events, client faults or object faults). The observable event traces O⟦𝕎, (σc, θ)⟧ generated by the abstract program are defined similarly.
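Since Definition 2 quantifies over completions(H), here is a Python sketch of that construction as described above, with two simplifications of ours: appended responses use a single generic return value, and pending invocations are detected assuming a well-formed history.

from itertools import combinations

def is_pending(H, i):
    # In a well-formed history, H[i] is a pending invocation iff no
    # later event of the same thread follows it.
    e = H[i]
    return e[1] == "inv" and all(f[0] != e[0] for f in H[i+1:])

def completions(H, ret="ok"):
    # Append responses for some subset of the pending invocations and
    # drop the rest, yielding a set of complete histories.
    pend = [i for i in range(len(H)) if is_pending(H, i)]
    out = []
    for k in range(len(pend) + 1):
        for keep in combinations(pend, k):
            drop = set(pend) - set(keep)
            Hc = [e for i, e in enumerate(H) if i not in drop]
            Hc += [(H[i][0], "res", ret) for i in keep]
            out.append(Hc)
    return out

H = [(1, "inv", "push 1"), (2, "inv", "pop"), (1, "res", "ok")]
for Hc in completions(H):
    print(Hc)
# Thread 2's pop is pending: one completion drops it, the other closes
# it with a response; Definition 2 then asks some completion to be
# linearizable w.r.t. a legal sequential history.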