141:6 Yue Li,Tian Tan,Anders Moller,and Yannis Smaragdakis iterator of class Collection;instead,a wrapper Iterator object,(created on line 8)in which object 1or2 is stored,flows out of this Our method. Object wrapping (Definition 2.2)occurs in line 15:objects 1 and (pointed to by obj)are stored into the next field of the object pointed to by this,and this points to the receiver object of the constructor call in line 8,which is also pointed to by itr in line 8.As a wrapper object(that stores object 1or2)flows out of an Our method of the same class,by Definition 2.4,the solid blue arrows in Figure 4 form a wrapped flow. With a context-insensitive analysis,objects 1and are merged in the same points-to set and further propagated according to this wrapped flow.However,unlike a direct flow,imprecision is not introduced until the access operation(e.g.,the next calls in lines 28 and 34)is applied on the flowing-out wrapper object,causing variables o1 and o2 to point to spurious objects.The wrapper objects carry the flowing-in objects,which originate from outside the class,so context sensitivity can separate the merged objects all along their flow through the Collection class. The example also helps illustrate some subtleties of the flow definitions.Note that the precision loss patterns are expressed relative to a class:for each of the three patterns,the IN method and the Our method must be in the same class,although the value flow may involve other classes,as described in Definitions 2.3-2.5.Intuitively,if the precision loss flows introduced in each class (through method calls on the objects of the class)could be identified and then avoided by use of context sensitivity,the imprecision of the whole program could be accordingly controlled via such a divide-and-conquer scheme.In addition,this design choice enables an efficient and elegant algorithm for identifying occurrences of the patterns in a given program,by considering each class one by one,as explained in Section 3. Therefore,the dashed arrows(bottom right of Figure 4)formed by calling the next method in lines 28 and 34,do not belong to the wrapped flow,because the calls happen after the wrapper objects flow out from the Our method of class Collection.Thus,as explained in Section 2.1, only methods add and iterator (in Collection)and the constructor Iterator (in Iterator)are included in the wrapped flow and thus considered precision-critical.However,if we consider IN and Our methods from the point of view of class Iterator,then method next is also precision-critical, since it is involved in a direct flow together with the Iterator constructor,much like the setter and getter methods in Section 2.1. 2.3 Pattern 3:Unwrapped Flow We use a synchronized box example(based on classes SynchronizedSet and Set in the JDK but heavily simplified)to illustrate an unwrapped flow,as shown in Figure 5.Class SyncBox encapsulates class Box by providing synchronization in the encapsulating method getItem(lines 6-12).Two objects 1 and 2 are stored into two Box objects(represented by and pointed to by b1 and b2 in lines 27 and 32),which are further stored into two SyncBox objects(lines 28 and 33). After executing the code,o1 in line 29(resp.o2 in line 34)points to object 1(resp.only. However,if any of the four methods of classes SyncBox and Box are analyzed context-insensitively, o1 and o2 will both imprecisely point to both objects 1and Let us examine how this imprecision is connected to the unwrapped flow pattern. As shown on the right-hand side of Figure 5,similar to the direct flow in Figure 3,two Box objects 1and 2(pointed to by b1 and b2,respectively)flow into the body of class SyncBox through its constructor,which acts as an IN method,and then further to b in line 8.Unlike in a direct flow, the flowing-in objects 1 and 2 do not flow out of the Our method getItem of class SyncBox; instead,the two unwrapped objects 1and (respectively stored inand 2)are the ones that flow out of this Our method. Proc.ACM Program.Lang.,Vol.2,No.OOPSLA,Article 141.Publication date:November 2018
141:6 Yue Li, Tian Tan, Anders Mùller, and Yannis Smaragdakis iterator of class Collection; instead, a wrapper Iterator object, , (created on line 8) in which object 1 or 2 is stored, flows out of this Out method. Object wrapping (Definition 2.2) occurs in line 15: objects 1 and 2 (pointed to by obj) are stored into the next field of the object pointed to by this, and this points to the receiver object of the constructor call in line 8, which is also pointed to by itr in line 8. As a wrapper object (that stores object 1 or 2 ) flows out of an Out method of the same class, by Definition 2.4, the solid blue arrows in Figure 4 form a wrapped flow. With a context-insensitive analysis, objects 1 and 2 are merged in the same points-to set and further propagated according to this wrapped flow. However, unlike a direct flow, imprecision is not introduced until the access operation (e.g., the next calls in lines 28 and 34) is applied on the flowing-out wrapper object, causing variables o1 and o2 to point to spurious objects. The wrapper objects carry the flowing-in objects, which originate from outside the class, so context sensitivity can separate the merged objects all along their flow through the Collection class. The example also helps illustrate some subtleties of the flow definitions. Note that the precision loss patterns are expressed relative to a class: for each of the three patterns, the In method and the Out method must be in the same class, although the value flow may involve other classes, as described in Definitions 2.3Ð2.5. Intuitively, if the precision loss flows introduced in each class (through method calls on the objects of the class) could be identified and then avoided by use of context sensitivity, the imprecision of the whole program could be accordingly controlled via such a divide-and-conquer scheme. In addition, this design choice enables an efficient and elegant algorithm for identifying occurrences of the patterns in a given program, by considering each class one by one, as explained in Section 3. Therefore, the dashed arrows (bottom right of Figure 4) formed by calling the next method in lines 28 and 34, do not belong to the wrapped flow, because the calls happen after the wrapper objects flow out from the Out method of class Collection. Thus, as explained in Section 2.1, only methods add and iterator (in Collection) and the constructor Iterator (in Iterator) are included in the wrapped flow and thus considered precision-critical. However, if we consider In and Out methods from the point of view of class Iterator, then method next is also precision-critical, since it is involved in a direct flow together with the Iterator constructor, much like the setter and getter methods in Section 2.1. 2.3 Pattern 3: Unwrapped Flow We use a synchronized box example (based on classes SynchronizedSet and Set in the JDK but heavily simplified) to illustrate an unwrapped flow, as shown in Figure 5. Class SyncBox encapsulates class Box by providing synchronization in the encapsulating method getItem (lines 6ś12). Two objects 1 and 2 are stored into two Box objects (represented by and pointed to by b1 and b2 in lines 27 and 32), which are further stored into two SyncBox objects (lines 28 and 33). After executing the code, o1 in line 29 (resp. o2 in line 34) points to object 1 (resp. 2 ) only. However, if any of the four methods of classes SyncBox and Box are analyzed context-insensitively, o1 and o2 will both imprecisely point to both objects 1 and 2 . Let us examine how this imprecision is connected to the unwrapped flow pattern. As shown on the right-hand side of Figure 5, similar to the direct flow in Figure 3, two Box objects 1 and 2 (pointed to by b1 and b2, respectively) flow into the body of class SyncBox through its constructor, which acts as an In method, and then further to b in line 8. Unlike in a direct flow, the flowing-in objects 1 and 2 do not flow out of the Out method getItem of class SyncBox; instead, the two unwrapped objects 1 and 2 (respectively stored in 1 and 2 ) are the ones that flow out of this Out method. Proc. ACM Program. Lang., Vol. 2, No. OOPSLA, Article 141. Publication date: November 2018
Precision-Guided Context Sensitivity for Pointer Analysis 141:7 Unwrapped Flow 1 class SyncBox 19 Object getItem()( Object box: 28 Object it this.item: (26) s1 s2 (31) 3 SyncBox(Boxbox){ 21 return it; this.box box: 22 (27) b1 b2 32) 23 box (3) 6 Object getItem(){ 24 //Usage Code synchronized(this){ 25 void main(){ this.box (4) 8 Box b this.box: 26 String s1 new String("A"): 9 object o b.getItem(): 27 Box b1 new Box(s1); 回 (8) 10 return o: 28 SyncBox sb1 new SyncBox(b1): 11 29 Object o1 sb1.getItem(): (20) Eh因1tem①② 12 30 13} 31 String s2 new String("B"): 14 class Box{ 32 Box b2 new Box(s2) ǎ (20) 15 Object item: 33 SyncBox sb2 new SyncBox(b2); (9) 16 Box(Object item){ 34 Object o2 sb2.getItem() 1) this.item item; 351 (29)01 02(34) 18 ①② ①@ Fig.5.Example of unwrapped flow. Object unwrapping(Definition 2.2)occurs in line 20,as a result of the call in line 9:the Box objects and 2pointed to by b)are the receiver objects of this virtual call,and this in line 20 will also point to them during pointer analysis.The load operation in line 20 lets the unwrapped objects(and 2)flow to it (line 20),and finally to o1 and o2 (lines 29 and 34)through consecutive method return values(line 21line 9 and then line 10-lines 29 and 34).As the unwrapped objects(retrieved from the flowing-in objects)flow out of an Our method of the same class,by Definition 2.5,the green arrows(in Figure 5)form an unwrapped flow. We can observe that objects and 2](and hence the unwrapped objects1and 2 they contain) are merged in the same points-to set and further propagated according to this unwrapped flow. Although the flowing-in objects do not flow out of an Our method of the same class to introduce imprecision,the unwrapped objects do,causing the receiving variables,in this case o1 and o2(lines 29 and 34),to point to spurious objects. Note that the program points where the unwrapped objects are stored in the flowing-in objects (lines 26-27 and 31-32)do not belong in the unwrapped flow,as the objects have not yet entered the IN method of class SyncBox.Thus,only constructor SyncBox,method getItem(in SyncBox), and method getItem(in Box)belong in the unwrapped flow and are considered precision-critical. However,as in the explanation of the wrapped flow example in Section 2.2,if we consider IN and Our methods from the point of view of class Box,its constructor,Box,will still be analyzed context-sensitively as it is part of a direct flow (together with the getItem method in Box). Finally,some imprecision cannot be described by one pattern alone but only by combinations. Consider the example of an object W that flows into an IN method,where an object O is unwrapped from W.Then O is wrapped into another wrapper object,W',which flows out from an Our method of the same class.Imprecision may arise in this case,and although none of the three basic flow patterns in isolation match this flow,it is captured by a combination of unwrapped and wrapped flows.ZIPPER identifies not only occurrences of the three patterns but also such combinations.Our experiments(Section 4)show that the patterns and their combinations account for essentially all the imprecision that may appear in context-insensitive analysis. Proc.ACM Program.Lang.,Vol.2,No.OOPSLA,Article 141.Publication date:November 2018
Precision-Guided Context Sensitivity for Pointer Analysis 141:7 o1 o2 (34) b1 s1 s2 b this.box box Unwrapped Flow 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 this.item class SyncBox { Object box; SyncBox(Box box) { this.box = box; } Object getItem() { synchronized(this) { Box b = this.box; Object o = b.getItem(); return o; } } } class Box { Object item; Box(Object item) { this.item = item; } //Usage Code void main() { String s1 = new String("A"); Box b1 = new Box(s1); SyncBox sb1 = new SyncBox(b1); Object o1 = sb1.getItem(); String s2 = new String("B"); Box b2 = new Box(s2); SyncBox sb2 = new SyncBox(b2); Object o2 = sb2.getItem(); } Object getItem() { Object it = this.item; return it; } } b2 it o (26) (31) (27) (32) (3) (4) (8) (20) (20) (9) (29) 1 2 1 2 1 2 1 2 1 2 Fig. 5. Example of unwrapped flow. Object unwrapping (Definition 2.2) occurs in line 20, as a result of the call in line 9: the Box objects ( 1 and 2 pointed to by b) are the receiver objects of this virtual call, and this in line 20 will also point to them during pointer analysis. The load operation in line 20 lets the unwrapped objects ( 1 and 2 ) flow to it (line 20), and finally to o1 and o2 (lines 29 and 34) through consecutive method return values (line 21 → line 9 and then line 10 → lines 29 and 34). As the unwrapped objects (retrieved from the flowing-in objects) flow out of an Out method of the same class, by Definition 2.5, the green arrows (in Figure 5) form an unwrapped flow. We can observe that objects 1 and 2 (and hence the unwrapped objects 1 and 2 they contain) are merged in the same points-to set and further propagated according to this unwrapped flow. Although the flowing-in objects do not flow out of an Out method of the same class to introduce imprecision, the unwrapped objects do, causing the receiving variables, in this case o1 and o2 (lines 29 and 34), to point to spurious objects. Note that the program points where the unwrapped objects are stored in the flowing-in objects (lines 26ś27 and 31ś32) do not belong in the unwrapped flow, as the objects have not yet entered the In method of class SyncBox. Thus, only constructor SyncBox, method getItem (in SyncBox), and method getItem (in Box) belong in the unwrapped flow and are considered precision-critical. However, as in the explanation of the wrapped flow example in Section 2.2, if we consider In and Out methods from the point of view of class Box, its constructor, Box, will still be analyzed context-sensitively as it is part of a direct flow (together with the getItem method in Box). Finally, some imprecision cannot be described by one pattern alone but only by combinations. Consider the example of an objectW that flows into an In method, where an object O is unwrapped fromW . Then O is wrapped into another wrapper object,W ′ , which flows out from an Out method of the same class. Imprecision may arise in this case, and although none of the three basic flow patterns in isolation match this flow, it is captured by a combination of unwrapped and wrapped flows. Zipper identifies not only occurrences of the three patterns but also such combinations. Our experiments (Section 4) show that the patterns and their combinations account for essentially all the imprecision that may appear in context-insensitive analysis. Proc. ACM Program. Lang., Vol. 2, No. OOPSLA, Article 141. Publication date: November 2018
141:8 Yue Li,Tian Tan,Anders Moller,and Yannis Smaragdakis ZIPPER Object Flow Graph Context-Insensitive OFG Precision Flow Graph Graph (OFG) hich meth Context-Sensitive Reachability Pointer Analysis (PFG) Construction Construction inter Analysis on PFG Fig.6.Overview of ZIPPER. 3 ZIPPER This section introduces ZIPPER:our approach to identifying precision-critical methods based on the precision loss patterns of Section 2.Even if the patterns successfully characterize the main causes of precision loss in context-insensitive analysis,two challenges remain.First,the precision loss patterns are defined in dynamic execution terms,while ZIPPER has to capture the potential for these patterns using static information.Second,useful static information has to be computable from a mere context-insensitive analysis,in order to guide a context-sensitive one.That is,the potential for precision loss has to be detected from an analysis that already exhibits this loss.The ZIPPER approach is defined with these goals in mind,and manages to make context-sensitive pointer analysis run faster while preserving most of its precision. We present the overview of ZIPPER in Section 3.1 and the concepts of object flow graphs and precision flow graphs in Sections 3.2 and 3.3,respectively. 3.1 Overview of ZIPPER The goal of ZIPPER is to efficiently recognize the precision-critical methods in a given program.The central part of ZIPPER is the notion of precision flow graphs(PFGs)that allow us to express all three precision loss patterns in a uniform way,in the sense that each kind of flow can be represented by a path in a PFG.Intuitively,a PFG is much like the right-hand side graphs of Figures 3-5,but replacing the field expressions by the abstract objects and their fields.Via the PFGs,we can convert the problem of identifying precision-critical methods to an abstract graph computation.All methods that are involved in one of the three kinds of flows can be efficiently extracted by solving a simple graph reachability problem on the PFGs. Constructing the PFGs requires information about how objects flow in the program.We leverage the concept of object flow graphs(OFGs)[Tonella and Potrich 2005]as explained in Section 3.2. The OFG for a program allows tracing the flow of objects through local assignments,calls and returns,and field load and store operations in the program.Therefore,it can naturally express the direct flow pattern,in a static analysis that approximates the dynamic flows of objects.However, the original OFG formulation does not represent wrapped and unwrapped flows,thus we cannot directly use it to identify precision-critical methods.For this reason,we build the PFGs on top of the OFG to uniformly express all three precision loss patterns. Figure 6 shows the overall structure of ZIPPER,which itself contains three components:the object flow graph construction,the precision flow graph construction,and the graph reachability computation.First,a fast but imprecise context-insensitive pointer analysis is performed as a pre-analysis for ZIPPER.To simplify the discussion,we assume that the pre-analysis abstracts objects by their allocation-sites [Chase et al.1990],but our technique also works for other object abstractions [Kanvar and Khedker 2016].This pre-analysis provides the information for the OFG construction,in the form of a relation pt(v)that captures the points-to set for each variable v.Based on the OFG,a PFG is constructed for each class.Afterwards,ZIPPER computes graph reachability on each PFG to determine which methods are precision-critical.Finally,a selective context-sensitive Proc.ACM Program.Lang.,Vol.2,No.OOPSLA,Article 141.Publication date:November 2018
141:8 Yue Li, Tian Tan, Anders Mùller, and Yannis Smaragdakis Object Flow Graph (OFG) Construction Precision Flow Graph (PFG) Construction Graph Reachability on PFG OFG PFG ZIPPER Context-Insensitive Pointer Analysis Context-Sensitive Pointer Analysis which methods need contexts Fig. 6. Overview of Zipper. 3 ZIPPER This section introduces Zipper: our approach to identifying precision-critical methods based on the precision loss patterns of Section 2. Even if the patterns successfully characterize the main causes of precision loss in context-insensitive analysis, two challenges remain. First, the precision loss patterns are defined in dynamic execution terms, while Zipper has to capture the potential for these patterns using static information. Second, useful static information has to be computable from a mere context-insensitive analysis, in order to guide a context-sensitive one. That is, the potential for precision loss has to be detected from an analysis that already exhibits this loss. The Zipper approach is defined with these goals in mind, and manages to make context-sensitive pointer analysis run faster while preserving most of its precision. We present the overview of Zipper in Section 3.1 and the concepts of object flow graphs and precision flow graphs in Sections 3.2 and 3.3, respectively. 3.1 Overview of Zipper The goal of Zipper is to efficiently recognize the precision-critical methods in a given program. The central part of Zipper is the notion of precision flow graphs (PFGs) that allow us to express all three precision loss patterns in a uniform way, in the sense that each kind of flow can be represented by a path in a PFG. Intuitively, a PFG is much like the right-hand side graphs of Figures 3ś5, but replacing the field expressions by the abstract objects and their fields. Via the PFGs, we can convert the problem of identifying precision-critical methods to an abstract graph computation. All methods that are involved in one of the three kinds of flows can be efficiently extracted by solving a simple graph reachability problem on the PFGs. Constructing the PFGs requires information about how objects flow in the program. We leverage the concept of object flow graphs (OFGs) [Tonella and Potrich 2005] as explained in Section 3.2. The OFG for a program allows tracing the flow of objects through local assignments, calls and returns, and field load and store operations in the program. Therefore, it can naturally express the direct flow pattern, in a static analysis that approximates the dynamic flows of objects. However, the original OFG formulation does not represent wrapped and unwrapped flows, thus we cannot directly use it to identify precision-critical methods. For this reason, we build the PFGs on top of the OFG to uniformly express all three precision loss patterns. Figure 6 shows the overall structure of Zipper, which itself contains three components: the object flow graph construction, the precision flow graph construction, and the graph reachability computation. First, a fast but imprecise context-insensitive pointer analysis is performed as a pre-analysis for Zipper. To simplify the discussion, we assume that the pre-analysis abstracts objects by their allocation-sites [Chase et al. 1990], but our technique also works for other object abstractions [Kanvar and Khedker 2016]. This pre-analysis provides the information for the OFG construction, in the form of a relation pt(v) that captures the points-to set for each variable v. Based on the OFG, a PFG is constructed for each class. Afterwards, Zipper computes graph reachability on each PFG to determine which methods are precision-critical. Finally, a selective context-sensitive Proc. ACM Program. Lang., Vol. 2, No. OOPSLA, Article 141. Publication date: November 2018
Precision-Guided Context Sensitivity for Pointer Analysis 141:9 Variable node o.f Object field node Program Object Flow Graph Local assignment b=a; a a b; ① a ① (a(p) Assignments in b c.m(a) c.f a; ② ② method calls/returns m(p){return r:} (C→mthi3 d c; ③ (r b) c.f d; ④ ④ Field load b=a.f; where o e pt(a) @.f -b e=d.f;⑤ Field store a.f =b; b 0.f assuming oe pt(c)and o E pt(d) where o e pt(a) Fig.7.Object flow graph construction,with an example. pointer analysis is performed,guided by ZIPPER's results,so that the pointer analysis applies context sensitivity to only the precision-critical methods reported by ZIPPER. 3.2 Object Flow Graphs The object flow graph(OFG)of a program,as in its original form by Tonella and Potrich [2005],is a directed graph that expresses how objects flow in the program.The nodes in the OFG represent program pointers,which can point to objects,and the edges represent basic object flow among the pointers.More precisely,the OFG contains a node for each variable in the program and for each field of each abstract object.Objects are abstracted in the same way as in the pre-analysis, as described in Section 3.1:we here assume allocation-site abstraction is being used,which is the most common choice,but the technique also works for other choices.An edge ab in the OFG means that the objects pointed by pointer a may flow to(and also be pointed to by)pointer b. Another way to view the OFG is that it is the subset constraint graph in an Andersen-style points-to analysis [Andersen 1994;Sridharan et al.2013]. Tonella and Potrich [2005]propose to build the OFG with more precision by cloning the variables of a method for each of its receiver objects(conceptually like object sensitivity [Milanova et al. 2002,2005]),so that the flow involved in different receiver objects of the same method can be distinguished.However,this is unnecessary for ZIPPER,since it builds the OFG based on the results of a context-insensitive analysis,and all flow queries are done at the class level instead of the object level,as explained in Section 2.Therefore,we perform no such cloning. Due to the close connection between OFGs and Andersen-style analysis,constructing the OFG is trivial,based on the points-to relation pt(v)provided by the context-insensitive pre-analysis. Figure 7 illustrates this construction.The left-hand side of Figure 7 lists(from left to right)the four basic object flows,the related Java statements that induce the flows,and the corresponding graph edges in the OFG. Consider the code fragment and its corresponding OFG on the right-hand side of Figure 7.There are five statements labeled 1-5,and each statement causes an edge(with the same label)to be added to the OFG.With the OFG,the object flow information can be directly obtained simply by checking graph reachability without the need to explicitly track alias information among variables or field accesses.For example,variable e is reachable from b in the OFG,which means that the objects pointed to by b may flow to (and also be pointed to by)e. As a result,direct flows can be expressed naturally by the paths in the OFG,however,that is not the case for wrapped and unwrapped flows.In the next section,we describe how to augment the OFG to express all three kinds of flows. Proc.ACM Program.Lang.,Vol.2,No.OOPSLA,Article 141.Publication date:November 2018
Precision-Guided Context Sensitivity for Pointer Analysis 141:9 Local assignment Assignments in method calls/returns Field load Field store b = a; b = c.m(a); m(p){ return r;} … b = a.f; a.f = b; p a b r a b b b v Variable node o.f Object field node Program Object Flow Graph a = b; c.f = a; d = c; c.f = d; e = d.f; a 1 2 3 4 5 b d e c 1 2 3 4 5 c mthis where o ∈ pt(a) o.f o.f o.f where o ∈ pt(a) assuming o ∈ pt(c) and o ∈ pt(d) Fig. 7. Object flow graph construction, with an example. pointer analysis is performed, guided by Zipper’s results, so that the pointer analysis applies context sensitivity to only the precision-critical methods reported by Zipper. 3.2 Object Flow Graphs The object flow graph (OFG) of a program, as in its original form by Tonella and Potrich [2005], is a directed graph that expresses how objects flow in the program. The nodes in the OFG represent program pointers, which can point to objects, and the edges represent basic object flow among the pointers. More precisely, the OFG contains a node for each variable in the program and for each field of each abstract object. Objects are abstracted in the same way as in the pre-analysis, as described in Section 3.1: we here assume allocation-site abstraction is being used, which is the most common choice, but the technique also works for other choices. An edge a→b in the OFG means that the objects pointed by pointer a may flow to (and also be pointed to by) pointer b. Another way to view the OFG is that it is the subset constraint graph in an Andersen-style points-to analysis [Andersen 1994; Sridharan et al. 2013]. Tonella and Potrich [2005] propose to build the OFG with more precision by cloning the variables of a method for each of its receiver objects (conceptually like object sensitivity [Milanova et al. 2002, 2005]), so that the flow involved in different receiver objects of the same method can be distinguished. However, this is unnecessary for Zipper, since it builds the OFG based on the results of a context-insensitive analysis, and all flow queries are done at the class level instead of the object level, as explained in Section 2. Therefore, we perform no such cloning. Due to the close connection between OFGs and Andersen-style analysis, constructing the OFG is trivial, based on the points-to relation pt(v) provided by the context-insensitive pre-analysis. Figure 7 illustrates this construction. The left-hand side of Figure 7 lists (from left to right) the four basic object flows, the related Java statements that induce the flows, and the corresponding graph edges in the OFG. Consider the code fragment and its corresponding OFG on the right-hand side of Figure 7. There are five statements labeled 1 ś 5 , and each statement causes an edge (with the same label) to be added to the OFG. With the OFG, the object flow information can be directly obtained simply by checking graph reachability without the need to explicitly track alias information among variables or field accesses. For example, variable e is reachable from b in the OFG, which means that the objects pointed to by b may flow to (and also be pointed to by) e. As a result, direct flows can be expressed naturally by the paths in the OFG, however, that is not the case for wrapped and unwrapped flows. In the next section, we describe how to augment the OFG to express all three kinds of flows. Proc. ACM Program. Lang., Vol. 2, No. OOPSLA, Article 141. Publication date: November 2018
141:10 Yue Li,Tian Tan,Anders Moller,and Yannis Smaragdakis Algorithm 1:PFGBUILDER OFG (Object Flow Graph) Input c (Input class) (Set of statements in the program) Output:PFGe (Precision Flow Graph for class c) 1PFGc←-{},VisitedNodes←{h,WUEdges←-{} 2 foreach m∈INe do foreach parameter p of m do LDFs(Np)where Np is the OFG node for p 5 return PFGc 6 Function DFs(N) 1 if NE VisitedNodes then 8 return 9 add N to VisitedNodes 10 if N is a variable node Na then 11 foreach b a.fes do /Handling unwrapped flow 12 Ladd Na一to WUEdges 13 foreach b.f aes do /Handling wrapped flow 14 foreach o∈pt(b)do 15 Ladd N No]to WUEdges 16 foreach N→N'∈OFGU WUEdges do 17 addN→W'to PFGc DEs(N) 3.3 Precision Flow Graphs and Graph Reachability We first explain how to construct precision flow graphs(PFGs)and then how to identify precision- critical methods by performing graph reachability on each PFG. Precision Flow Graph Construction.As explained in Section 3.2,one OFG is built for the entire program.Since the PFGs serve to express the three kinds of precision loss patterns,which are all defined relative to a class,as explained in Section 2,we construct one PFG for each class in the program.As the OFG can already describe direct flow(Section 3.2),the task of building the PFG is to add edges that can express the other two kinds of flows:wrapped and unwrapped flows. Algorithm 1(PFGBUILDER)shows how to build PFGe for a given class c.For simplicity,we represent the PFG and the OFG by their sets of graph edges,and the graph nodes are implicitly those that appear in the edge sets. Three sets are initialized to empty sets in line 1:the PFG edges,the set of visited nodes,and WUEdges,which denotes a set of extra edges for wrapped and unwrapped flows.As all three kinds of flows begin from the parameters of an IN method(see Section 2),the algorithm starts by iterating through those methods(lines 2-3,where INe denotes the set of IN methods of the input class c). Function DFs(line 6)traverses the input OFG and adds the edges for wrapped and unwrapped flows.As a result,the returned PFGe(line 5)includes all the nodes that can be reached from each parameter of IN methods of c,through direct,wrapped,and unwrapped flows,or combinations of these.Specifically,unwrapped and wrapped flows are handled in lines 11-12 and lines 13-15, respectively,by adding the corresponding edges to WUEdges.Finally,the generated PFG includes Proc.ACM Program.Lang.,Vol.2,No.OOPSLA,Article 141.Publication date:November 2018
141:10 Yue Li, Tian Tan, Anders Mùller, and Yannis Smaragdakis Algorithm 1: PfgBuilder Input : OFG (Object Flow Graph) c (Input class) S (Set of statements in the program) Output : PFGc (Precision Flow Graph for class c) 1 PFGc ← { }, VisitedNodes ← { }, WUEdges ← { } 2 foreach m ∈ Inc do 3 foreach parameter p of m do 4 Dfs(Np ) where Np is the OFG node for p 5 return PFGc 6 Function Dfs(N) 7 if N ∈ VisitedNodes then 8 return 9 add N to VisitedNodes 10 if N is a variable node Na then 11 foreach b = a.f ∈ S do // Handling unwrapped flow 12 add Na → Nb to WUEdges 13 foreach b.f = a ∈ S do // Handling wrapped flow 14 foreach o ∈ pt(b) do 15 add Na → N[o] to WUEdges 16 foreach N → N ′ ∈ OFG ∪ WUEdges do 17 add N → N ′ to PFGc 18 Dfs(N ′ ) 3.3 Precision Flow Graphs and Graph Reachability We first explain how to construct precision flow graphs (PFGs) and then how to identify precisioncritical methods by performing graph reachability on each PFG. Precision Flow Graph Construction. As explained in Section 3.2, one OFG is built for the entire program. Since the PFGs serve to express the three kinds of precision loss patterns, which are all defined relative to a class, as explained in Section 2, we construct one PFG for each class in the program. As the OFG can already describe direct flow (Section 3.2), the task of building the PFG is to add edges that can express the other two kinds of flows: wrapped and unwrapped flows. Algorithm 1 (PfgBuilder) shows how to build PFGc for a given classc. For simplicity, we represent the PFG and the OFG by their sets of graph edges, and the graph nodes are implicitly those that appear in the edge sets. Three sets are initialized to empty sets in line 1: the PFG edges, the set of visited nodes, and WUEdges, which denotes a set of extra edges for wrapped and unwrapped flows. As all three kinds of flows begin from the parameters of an In method (see Section 2), the algorithm starts by iterating through those methods (lines 2ś3, where Inc denotes the set of In methods of the input class c). Function Dfs (line 6) traverses the input OFG and adds the edges for wrapped and unwrapped flows. As a result, the returned PFGc (line 5) includes all the nodes that can be reached from each parameter of In methods of c, through direct, wrapped, and unwrapped flows, or combinations of these. Specifically, unwrapped and wrapped flows are handled in lines 11ś12 and lines 13ś15, respectively, by adding the corresponding edges to WUEdges. Finally, the generated PFG includes Proc. ACM Program. Lang., Vol. 2, No. OOPSLA, Article 141. Publication date: November 2018