6 J.Comput.Sci.Technol.,Nov.2013,Vol.28,No.6 Next we introduce the equivalent transformation below it and two adjacent condensation nodes (with rules of shape graphs.They are used to represent trans- e1,a and e2,a2 below respectively).Similarly,it is formations between shape graphs preserving semantic an equivalent transformation.from two adjacent un- equivalence.We give the proof after introducing se- constrained condensation nodes to one unconstrained mantics of shape graphs. condensation node. Another derived rule is for an equivalent transfor- 2.3.1 Equivalent Transformation Rules for (Circular) mation between one structure node and one condensa- Singly-Linked List tion node with 1.true below it. Basic Rules. Another kind of derived rule is related to an al- Rules based on predicate definitions of(circular) ternative inductive definition of list(s,e,a).We can singly-linked list.For example,rule in Fig.5(a)is a rule use the disjunction of right-hand sides without edges based on definition of list(s,e,a)in Fig.4. labeled with p&in rules in Figs.5(f)and 5(g)as the Rules adding or removing a condensation node right-hand side of the definition.Similar rules can be whose expression e equals zero.Rules in Figs.5(b),5(c given for list(s). are this kind of rules.For rule in Fig.5(b),a similar rule 2.3.2 Equivalent Transformation Rules for (Circular) is not included in Fig.5,in which the condensation node Doubly-Linked List interchanges its position with the structure node in the right-hand side window.Moreover,if we change the Basic Rules structure node into predicate node,null node or dan- Rules based on predicate definitions of (circular) gling node (remove its out-edge),the result is also a doubly-linked lists. basic rule of this kind. Rules adding or removing a condensation node Rules folding and unfolding a condensation node whose expression e equals zero(e.g.,rule in Fig.6(a)). whose expression e is greater than zero. Rule in Rules folding and unfolding a condensation node Fig.5(d)is one of this kind of rules.Similarly,we can whose expression e is greater than zero (e.g.,rule in give a rule by interchanging the positions of the struc- Fig.6(b)).Due to the symmetric status of labels l and ture node and the condensation node in the right-hand r,rules in Figs.6(a)and 6(b)can also be applied to side window. cases where the condensation node is pointed by edge Rules folding and unfolding an unconstrained con- r from the structure node. densation node.Rule in Fig.5(e)is one of this kind of Rules folding and unfolding a condensation node, rules.Rules similar to rule in Fig.5(e)can be given if which is a marginal node (head node or tail node)in the nt edge sourced from the condensation node points a doubly-linked list.They are slightly different from to other kinds of nodes. rules in Figs.6(a)and 6(b). Rules folding and unfolding predicate nodes.Rules Rules folding and unfolding an unconstrained con- in Figs.5(f)and 5(g)are two of this kind of rules. densation node (e.g.,rule in Fig.6(c)).Similar rules can Derived Rules. be given if edge r sourced from the condensation node One derived rule states an equivalent transforma- points to other kinds of nodes.or the condensation node tion between a condensation node with e1+e2,a1A a2 itself is a marginal node a→(e==0) a→(e>=l) (b (c) Fig.6.Part of the equivalent transformation rules for (circular)doubly-linked list
6 J. Comput. Sci. & Technol., Nov. 2013, Vol.28, No.6 Next we introduce the equivalent transformation rules of shape graphs. They are used to represent transformations between shape graphs preserving semantic equivalence. We give the proof after introducing semantics of shape graphs. 2.3.1 Equivalent Transformation Rules for (Circular) Singly-Linked List Basic Rules. • Rules based on predicate definitions of (circular) singly-linked list. For example, rule in Fig.5(a) is a rule based on definition of list(s, e, a) in Fig.4. • Rules adding or removing a condensation node whose expression e equals zero. Rules in Figs. 5(b), 5(c) are this kind of rules. For rule in Fig.5(b), a similar rule is not included in Fig.5, in which the condensation node interchanges its position with the structure node in the right-hand side window. Moreover, if we change the structure node into predicate node, null node or dangling node (remove its out-edge), the result is also a basic rule of this kind. • Rules folding and unfolding a condensation node whose expression e is greater than zero. Rule in Fig.5(d) is one of this kind of rules. Similarly, we can give a rule by interchanging the positions of the structure node and the condensation node in the right-hand side window. • Rules folding and unfolding an unconstrained condensation node. Rule in Fig.5(e) is one of this kind of rules. Rules similar to rule in Fig.5(e) can be given if the nt edge sourced from the condensation node points to other kinds of nodes. • Rules folding and unfolding predicate nodes. Rules in Figs. 5(f) and 5(g) are two of this kind of rules. Derived Rules. • One derived rule states an equivalent transformation between a condensation node with e1 + e2, a1 ∧ a2 below it and two adjacent condensation nodes (with e1, a1 and e2, a2 below respectively). Similarly, it is an equivalent transformation, from two adjacent unconstrained condensation nodes to one unconstrained condensation node. • Another derived rule is for an equivalent transformation between one structure node and one condensation node with 1, true below it. • Another kind of derived rule is related to an alternative inductive definition of list(s, e, a). We can use the disjunction of right-hand sides without edges labeled with pk in rules in Figs. 5(f) and 5(g) as the right-hand side of the definition. Similar rules can be given for list(s). 2.3.2 Equivalent Transformation Rules for (Circular) Doubly-Linked List Basic Rules. • Rules based on predicate definitions of (circular) doubly-linked lists. • Rules adding or removing a condensation node whose expression e equals zero (e.g., rule in Fig.6(a)). • Rules folding and unfolding a condensation node whose expression e is greater than zero (e.g., rule in Fig.6(b)). Due to the symmetric status of labels l and r, rules in Figs. 6(a) and 6(b) can also be applied to cases where the condensation node is pointed by edge r from the structure node. • Rules folding and unfolding a condensation node, which is a marginal node (head node or tail node) in a doubly-linked list. They are slightly different from rules in Figs. 6(a) and 6(b). • Rules folding and unfolding an unconstrained condensation node (e.g., rule in Fig.6(c)). Similar rules can be given if edge r sourced from the condensation node points to other kinds of nodes, or the condensation node itself is a marginal node. Fig.6. Part of the equivalent transformation rules for (circular) doubly-linked list
Zhao-Peng Li et al.:A Shape Graph Logic and A Shape System Derived Rules. 1)From equivalent transformation rules in the form Rules can be given in a way similar to derived rules of W+WI VW2,we can obtain two implication rules of(circular)singly-linked lists(except rules related to W1→V and W2→W. alternative inductive definitions). 2)From equivalent transformation rules in the form No other edges (except the two edges labeled with ofWi台W2 with side condition(e1==e2Aa1)→ l and r)point to the condensation node is an impor- a2)A((e2 ==e1Aa2)=a1),we can obtain two implica- tant principle in given definitions and rules for (circu- tion rules W→W2andW2→Wi with side condition lar)doubly-linked lists.The reason is that if we allow (e1==e2∧a1)→a2and(e2==e1Aa2)→a1 respec- other edges to point to condensation nodes,we do not tively. know which nodes they should point to in the unfolded 3)There are implication transformation rules for shape graph. shape graphs changing condensation nodes with e,a to unconstrained condensation nodes. 2.3.3 Equivalent Transformation Rules for Binary For equivalent transformation rules in the form Tree of WiW2,if there are condensation nodes with Besides the rules based on the predicate tree(s) e1,a and e2,a2 in Wi and W2 respectively and e there are rules for condensation nodes similar to those is less than e2 under constraint ai A a2,we can ob- of singly-linked lists.The difference is that there is one tain an implication rule W W3.Wi(W3)differs additional edge from each structure node or condensa- from Wi(W2)in that the former has only replaced the tion node to a tree predicate node. condensation node with an unconstrained condensation node and other nodes and edges are the same with the 2.3.4 Equivalent Transformation Rules for latter. Substitutions in e and a of Condensation Node For equivalent transformation rules in the form of Rules in Figs.7(a)and 7(b)are rules for condensa- Wi+W2,if there is a condensation node with e,a in tion nodes of(circular)singly-linked lists and(circular) W2 and no condensation node in Wi,we can obtain an doubly-linked lists respectively.Similar rules can be implication rule W=W3.W3 differs from W2 in the given for condensation nodes of binary trees or for con- above mentioned way. densation nodes as the marginal nodes in doubly-linked lists. 2.4 Semantics of Shape Graph For a programming language with dynamic mem- nt ory allocation such as C,nodes denote stack slots or (e1==e2人a1)→a2) A(e2==eAa2)→a) heap blocks(except for the null and dangling node), elal e2,a2 and edges represent values of corresponding pointers.A (a) shape graph without condensation nodes and predicate nodes is a graphical representation of machine state. (e1==e2a1)→a2) A general shape graph is a graphical representation of A(e2==e1Aa2)→a1) machine state set. e1,a1 e2,a2 First,we define the semantics of nodes and edges. (b) In a machine with a stack and a heap,denotations (or meanings)of nodes and edges in shape graphs are given Fig.7.Part of the equivalent transformation rules for substitu- as follows. tions in condensation node. 1)A declaration node represents a declared pointer We can do transformations on shape graphs using whose name is the label of the out-edge of this declara- these equivalent transformation rules.For example,for tion node.Different declaration nodes denote distinct a proper context XI and rule Wi+W2,we can apply stack slots. this rule to shape graph X[Wi]to get X[W2]and vice 2)A structure node represents a structure variable versa.That is,XWi XW2.Similarly,we can created by calling malloc.The number of out-edges is get X[W]XWi]V X[W2]from corresponding rule the same with the number of the field pointers in the W÷WVW2. corresponding structure variable.Labels of these edges Next,implication transformation rules are intro- are corresponding names of field pointers.A structure duced.They are transformation rules for shape graphs node denotes a heap block whose cell number is its out- preserving implication in semantics. Corresponding degree and abstract addresses of cells are corresponding proof will be given in Subsection 2.4. labels
Zhao-Peng Li et al.: A Shape Graph Logic and A Shape System 7 Derived Rules. Rules can be given in a way similar to derived rules of (circular) singly-linked lists (except rules related to alternative inductive definitions). No other edges (except the two edges labeled with l and r) point to the condensation node is an important principle in given definitions and rules for (circular) doubly-linked lists. The reason is that if we allow other edges to point to condensation nodes, we do not know which nodes they should point to in the unfolded shape graph. 2.3.3 Equivalent Transformation Rules for Binary Tree Besides the rules based on the predicate tree(s), there are rules for condensation nodes similar to those of singly-linked lists. The difference is that there is one additional edge from each structure node or condensation node to a tree predicate node. 2.3.4 Equivalent Transformation Rules for Substitutions in e and a of Condensation Node Rules in Figs. 7(a) and 7(b) are rules for condensation nodes of (circular) singly-linked lists and (circular) doubly-linked lists respectively. Similar rules can be given for condensation nodes of binary trees or for condensation nodes as the marginal nodes in doubly-linked lists. Fig.7. Part of the equivalent transformation rules for substitutions in condensation node. We can do transformations on shape graphs using these equivalent transformation rules. For example, for a proper context X[] and rule W1 ⇔ W2, we can apply this rule to shape graph X[W1] to get X[W2] and vice versa. That is, X[W1] ⇔ X[W2]. Similarly, we can get X[W] ⇔ X[W1] ∨ X[W2] from corresponding rule W ⇔ W1 ∨ W2. Next, implication transformation rules are introduced. They are transformation rules for shape graphs preserving implication in semantics. Corresponding proof will be given in Subsection 2.4. 1) From equivalent transformation rules in the form of W ⇔ W1 ∨ W2, we can obtain two implication rules W1 ⇒ W and W2 ⇒ W. 2) From equivalent transformation rules in the form of W1 ⇔ W2 with side condition ((e1 == e2 ∧ a1) ⇒ a2)∧((e2 == e1∧a2) ⇒ a1), we can obtain two implication rules W1 ⇒ W2 and W2 ⇒ W1 with side condition (e1 == e2 ∧a1) ⇒ a2 and (e2 == e1 ∧a2) ⇒ a1 respectively. 3) There are implication transformation rules for shape graphs changing condensation nodes with e, a to unconstrained condensation nodes. • For equivalent transformation rules in the form of W1 ⇔ W2, if there are condensation nodes with e1, a1 and e2, a2 in W1 and W2 respectively and e1 is less than e2 under constraint a1 ∧ a2, we can obtain an implication rule W0 1 ⇒ W0 2 . W0 1 (W0 2 ) differs from W1(W2) in that the former has only replaced the condensation node with an unconstrained condensation node and other nodes and edges are the same with the latter. • For equivalent transformation rules in the form of W1 ⇔ W2, if there is a condensation node with e, a in W2 and no condensation node in W1, we can obtain an implication rule W1 ⇒ W0 2 . W0 2 differs from W2 in the above mentioned way. 2.4 Semantics of Shape Graph For a programming language with dynamic memory allocation such as C, nodes denote stack slots or heap blocks (except for the null and dangling node), and edges represent values of corresponding pointers. A shape graph without condensation nodes and predicate nodes is a graphical representation of machine state. A general shape graph is a graphical representation of machine state set. First, we define the semantics of nodes and edges. In a machine with a stack and a heap, denotations (or meanings) of nodes and edges in shape graphs are given as follows. 1) A declaration node represents a declared pointer whose name is the label of the out-edge of this declaration node. Different declaration nodes denote distinct stack slots. 2) A structure node represents a structure variable created by calling malloc. The number of out-edges is the same with the number of the field pointers in the corresponding structure variable. Labels of these edges are corresponding names of field pointers. A structure node denotes a heap block whose cell number is its outdegree and abstract addresses of cells are corresponding labels
J.Comput.Sci.Technol.,Nov.2013,Vol.28,No.6 Cells of other types have little relationship with our sf:Abs Value×FieldVar→Abs Value U{W,D}.The discussion on shapes,so they are not considered here domain of sd is the set of declared pointers.sd maps Unconstrained condensation nodes are regarded as con- declared pointers to their abstract values.sf maps the densation nodes with expression n and assertion n>0 abstract address of a heap block and a field pointer in the following part of this subsection. name to the abstract value of the field pointer (A.D 3)A condensation node with e,a below represents n are two special abstract values for null and dangling structure variables and denotes n separated heap blocks pointers).Next s or (sd,sf)is used to represent the (assume the value of e is n). machine state. 4)Null nodes and dangling nodes do not represent Under such a model,a shape graph without predi- any program object.So they do not denote anything cate nodes and condensation nodes is a graphic repre- of the machine. sentation of the machine state based on the semantics of 5)A predicate node represents several structure the nodes and edges.A generic shape graph is a graphic variables allocated dynamically.It denotes a number representation of a certain set of machine state. of separated heap blocks.Based on the rules of predi- Definition 4.The set of machine states SG rep- cate unfolding in Subsection 2.3,the relations among resented by shape graph G can be defined using the fol- these blocks are determined according to semantics of lowing rules: edges. 1)If there is no predicate node or condensation node Edges do not represent any program object.So they in G,then there is only one state in SG.G represents do not denote any locations in the machine memory. this state directly where all labels of out-edges of decla- They are used to represent the values of corresponding ration nodes and the values of corresponding out-edges pointers. form function sd and structure nodes plus the labels of 1)An edge pointing to a structure node represents their out-edges form function sf. the address of the heap block denoted by the node. 2)If there is a condensation node with e and a below 2)Edges pointing to null nodes are used to ex- it and e has k values (implied by a),under these k cases press that the values of corresponding pointers are the condensation node can be unfolded completely into null.Edges pointing to dangling nodes are used to ex- press that corresponding pointers are dangling point- shape graphs G1,...,Gk.Then,SG]=S[G1]U...U SIGkl.If a implies that e could equal 0,1,...,and the ers.Hence,it does not affect the meaning of the shape condensation node can be unfolded completely into the graph,whether multiple edges point to one null(dan shape graph Gn with n structure nodes (n20),then gling)node,or they point to different null (dangling) SIG]=SIGo]U SIG1]U...USIGn]U.... nodes. 3)For an edge pointing to a condensation node with 3)If there is a predicate node in G and its corre- e and a,it represents the address of the heap block sponding definition body is G'or G1 V G2,then SG] denoted by the first structure node (ordered by the di- =SIG']or SIG]SIGi]USIG2l. rection of edges)of the unfolded condensation node if Similarly,we can define the semantics of GVG2. e >0.Otherwise,if e =0,it should be determined The syntax for the access path in shape graphs is the after removing this node using corresponding transfor- same as that in PointerC in favor of discussing the rela- mation rules. tions between them.Clearly,only access paths from a 4)The denotation of edges pointing predicate nodes declaration node to the other node are considered.We can be determined after unfolding corresponding predi will introduce in the following two cases. cate nodes using transformation rules. Full representation of access paths.In this case, Next we present the semantics of shape graphs fo- the access path does not cross any condensation node, cusing on pointers by omitting variables of other types and the labels of the edges on the path should be listed and their corresponding memory cells.For non-null in turn to represent such an access path.For example, and non-dangling pointers,their values indicate the ab- if the labels are p,left,right in turn,the access path is stract addresses of the heap blocks pointed by them written as p->left->right. The addresses of heap blocks dynamically allocated Condensed representation of access paths.In this are different abstract values.In our abstract machine case,the access path crosses one or more condensation stack slots and heap cells are accessed via names of nodes.For instance,if the path includes an out-edge declared pointers and names of field pointers respec- labeled left of a condensation node which stands for n tively.Thus,the abstract state of the machine (the number of nodes,then (->left)"should appear in the machine state for short)can be represented by two access path,such as hd(->nrt)m and hd(->nrt)m->nrt functions:sd Dec Var Abs Value U{N,D}and in Fig.3(b)
8 J. Comput. Sci. & Technol., Nov. 2013, Vol.28, No.6 Cells of other types have little relationship with our discussion on shapes, so they are not considered here. Unconstrained condensation nodes are regarded as condensation nodes with expression n and assertion n > 0 in the following part of this subsection. 3) A condensation node with e, a below represents n structure variables and denotes n separated heap blocks (assume the value of e is n). 4) Null nodes and dangling nodes do not represent any program object. So they do not denote anything of the machine. 5) A predicate node represents several structure variables allocated dynamically. It denotes a number of separated heap blocks. Based on the rules of predicate unfolding in Subsection 2.3, the relations among these blocks are determined according to semantics of edges. Edges do not represent any program object. So they do not denote any locations in the machine memory. They are used to represent the values of corresponding pointers. 1) An edge pointing to a structure node represents the address of the heap block denoted by the node. 2) Edges pointing to null nodes are used to express that the values of corresponding pointers are null. Edges pointing to dangling nodes are used to express that corresponding pointers are dangling pointers. Hence, it does not affect the meaning of the shape graph, whether multiple edges point to one null (dangling) node, or they point to different null (dangling) nodes. 3) For an edge pointing to a condensation node with e and a, it represents the address of the heap block denoted by the first structure node (ordered by the direction of edges) of the unfolded condensation node if e > 0. Otherwise, if e = 0, it should be determined after removing this node using corresponding transformation rules. 4) The denotation of edges pointing predicate nodes can be determined after unfolding corresponding predicate nodes using transformation rules. Next we present the semantics of shape graphs focusing on pointers by omitting variables of other types and their corresponding memory cells. For non-null and non-dangling pointers, their values indicate the abstract addresses of the heap blocks pointed by them. The addresses of heap blocks dynamically allocated are different abstract values. In our abstract machine, stack slots and heap cells are accessed via names of declared pointers and names of field pointers respectively. Thus, the abstract state of the machine (the machine state for short) can be represented by two functions: sd : DecVar → AbsValue ∪ {N , D} and sf : AbsValue × FieldVar → AbsValue ∪ {N , D}. The domain of sd is the set of declared pointers. sd maps declared pointers to their abstract values. sf maps the abstract address of a heap block and a field pointer name to the abstract value of the field pointer (N , D are two special abstract values for null and dangling pointers). Next s or hsd, sf i is used to represent the machine state. Under such a model, a shape graph without predicate nodes and condensation nodes is a graphic representation of the machine state based on the semantics of the nodes and edges. A generic shape graph is a graphic representation of a certain set of machine state. Definition 4. The set of machine states SJGK represented by shape graph G can be defined using the following rules: 1) If there is no predicate node or condensation node in G, then there is only one state in SJGK. G represents this state directly where all labels of out-edges of declaration nodes and the values of corresponding out-edges form function sd and structure nodes plus the labels of their out-edges form function sf . 2) If there is a condensation node with e and a below it and e has k values (implied by a), under these k cases the condensation node can be unfolded completely into shape graphs G1, . . . , Gk. Then, SJGK = SJG1K ∪ · · · ∪ SJGkK. If a implies that e could equal 0, 1, . . ., and the condensation node can be unfolded completely into the shape graph Gn with n structure nodes (n > 0), then SJGK = SJG0K ∪ SJG1K ∪ · · · ∪ SJGnK ∪ · · ·. 3) If there is a predicate node in G and its corresponding definition body is G0 or G1 ∨ G2, then SJGK = SJG0 K or SJGK = SJG1K ∪ SJG2K. Similarly, we can define the semantics of G1 ∨ G2. The syntax for the access path in shape graphs is the same as that in PointerC in favor of discussing the relations between them. Clearly, only access paths from a declaration node to the other node are considered. We will introduce in the following two cases. • Full representation of access paths. In this case, the access path does not cross any condensation node, and the labels of the edges on the path should be listed in turn to represent such an access path. For example, if the labels are p, left, right in turn, the access path is written as p->left->right. • Condensed representation of access paths. In this case, the access path crosses one or more condensation nodes. For instance, if the path includes an out-edge labeled left of a condensation node which stands for n number of nodes, then (->left) n should appear in the access path, such as hd(->nxt) m and hd(->nxt) m->nxt in Fig.3(b)