1:6 Yue Li,Tian Tan,Anders Moller,and Yannis Smaragdakis Direct Wrapped Unwrapped Flow Flow Flow IN Method Objects Value flow via assignments S field load/store operations, OUT Method or method calls/returns Fig.2.Three general patterns of value flow that cause precision loss in context-insensitive analysis. the two objects are mixed together,so the analysis conservatively concludes that both x2 and y2 may point to both the A object and the B object. In contrast,a context-sensitive analysis would analyze m twice,corresponding to the two different call sites,and thereby conclude that x2 can only point to an A object and y2 can only point to a B object.The price of that extra precision is that the method needs to be analyzed multiple times, so context sensitivity should ideally only be applied when the precision gain outweighs the extra analysis time. To characterize the relevant value flows,we first introduce some terminology. Definition 3.1(IN and Our methods).Given a class C and a method M that is declared in C or inherited from C's super-classes,if M contains one or more parameters then M is an IN method of C,and if M's return type is non-void then M is an Our method of C.(In the example in Figure 1,m is both an IN and an Our method of the surrounding class.) Definition 3.2(Object wrapping and unwrapping).If an object O is stored in a field of an object W (or in an array entry of W,in case W is an array),then O is wrapped into W.Conversely,if an object O is loaded from a field of an object W(or from an array entry of W in case W is an array),then O is unwrapped from W.(The simple example in Figure 1 contains no wrapping or unwrapping.) With these definitions in place,we can describe the three precision-loss patterns as different kinds of value flows,depicted in Figure 2. Definition 3.3(Direct flow).If,in some execution of the program,an object O is passed as a parameter to an IN method M of class C,and then flows(via a series of assignments,field load/store operations,method calls,or returns)to the return value of an Our method,M2,of the same class C,then we say the program has direct flow from M to M2.(The example in Figure 1 is a simple instance of this pattern.) Definition 3.4(Wrapped flow).If,in some execution of the program,an object O is passed as a parameter to an IN method Mi of class C and then flows to a store operation that wraps O into an object W,where W subsequently flows to the result of an Our method,M2,of the same class C,then we say the program has wrapped flow from Mi to M2.More generally,the wrapped flow pattern also covers value flow through multiple object wrapping steps,for example when W is itself wrapped into another object W,which flows to the return value of M2. Definition 3.5(Unwrapped flow).If,in some execution of the program,an object O is passed as a parameter to an IN method Mi of class C and then flows to a load operation that unwraps an object U from O,where U subsequently flows to the return value of an OuT method,M2,of the same class C,then we say the program has unwrapped flow from M to M2.As in the previous definition, unwrapped flow also covers value flow through multiple object unwrapping steps. ACM Trans.Program.Lang.Syst.,Vol.1,No.1,Article 1.Publication date:January 2020
1:6 Yue Li, Tian Tan, Anders Møller, and Yannis Smaragdakis Direct Flow Wrapped Flow Unwrapped Flow IN Method OUT Method Value flow via assignments, field load/store operations, or method calls/returns Objects Fig. 2. Three general patterns of value flow that cause precision loss in context-insensitive analysis. the two objects are mixed together, so the analysis conservatively concludes that both x2 and y2 may point to both the A object and the B object. In contrast, a context-sensitive analysis would analyze m twice, corresponding to the two different call sites, and thereby conclude that x2 can only point to an A object and y2 can only point to a B object. The price of that extra precision is that the method needs to be analyzed multiple times, so context sensitivity should ideally only be applied when the precision gain outweighs the extra analysis time. To characterize the relevant value flows, we first introduce some terminology. Definition 3.1 (In and Out methods). Given a class 𝐶 and a method 𝑀 that is declared in 𝐶 or inherited from 𝐶’s super-classes, if 𝑀 contains one or more parameters then 𝑀 is an In method of 𝐶, and if 𝑀’s return type is non-void then 𝑀 is an Out method of 𝐶. (In the example in Figure 1, m is both an In and an Out method of the surrounding class.) Definition 3.2 (Object wrapping and unwrapping). If an object 𝑂 is stored in a field of an object 𝑊 (or in an array entry of𝑊 , in case𝑊 is an array), then 𝑂 is wrapped into 𝑊 . Conversely, if an object 𝑂 is loaded from a field of an object 𝑊 (or from an array entry of 𝑊 in case 𝑊 is an array), then 𝑂 is unwrapped from 𝑊 . (The simple example in Figure 1 contains no wrapping or unwrapping.) With these definitions in place, we can describe the three precision-loss patterns as different kinds of value flows, depicted in Figure 2. Definition 3.3 (Direct flow). If, in some execution of the program, an object 𝑂 is passed as a parameter to an In method 𝑀1 of class 𝐶, and then flows (via a series of assignments, field load/store operations, method calls, or returns) to the return value of an Out method, 𝑀2, of the same class 𝐶, then we say the program has direct flow from 𝑀1 to 𝑀2. (The example in Figure 1 is a simple instance of this pattern.) Definition 3.4 (Wrapped flow). If, in some execution of the program, an object 𝑂 is passed as a parameter to an In method 𝑀1 of class 𝐶 and then flows to a store operation that wraps 𝑂 into an object 𝑊 , where 𝑊 subsequently flows to the result of an Out method, 𝑀2, of the same class 𝐶, then we say the program has wrapped flow from 𝑀1 to 𝑀2. More generally, the wrapped flow pattern also covers value flow through multiple object wrapping steps, for example when 𝑊 is itself wrapped into another object 𝑊 ′ , which flows to the return value of 𝑀2. Definition 3.5 (Unwrapped flow). If, in some execution of the program, an object 𝑂 is passed as a parameter to an In method 𝑀1 of class 𝐶 and then flows to a load operation that unwraps an object 𝑈 from 𝑂, where 𝑈 subsequently flows to the return value of an Out method, 𝑀2, of the same class 𝐶, then we say the program has unwrapped flow from 𝑀1 to 𝑀2. As in the previous definition, unwrapped flow also covers value flow through multiple object unwrapping steps. ACM Trans. Program. Lang. Syst., Vol. 1, No. 1, Article 1. Publication date: January 2020
A Principled Approach to Selective Context Sensitivity for Pointer Analysis 1:7 Direct Flow 1 class Person 15} O @ String name;String id; 16 /Usage Code (19)name1 name2 (24) void setName(String nm){ 17 void main(){ this.name nm: 18 Person p1 new Person(); (3) updateID(): 19 String name1 new String("A"): this:name (4) 6 20 p1.setName(name1): 7 void updateID(){ 21 String id1 p1.getID(): newName (8) 8 String newName this.name: 22 9 this.id newName: 23 Person p2 new Person(); this.id (9) 101 24 String name2 new String("B"): 11 String getID(){ 25 p2.setName(name2): id (12) 12 String id this.id: 26 String id2 p2.getID(): 13 return id: 27 (21)1d1 1d2(26) 14 ①② ①② Fig.3.Example of direct flow.(The line number for each variable/field reference on the right-hand side is shown in parentheses.) 3.1 Pattern 1:Direct Flow The setter and getter example shown in Figure 3 demonstrates how direct flow is an indication of precision loss for a context-insensitive analysis.The Person class provides methods setName and getID to modify a person's name and retrieve his or her ID.Whenever a person's name is modified, the ID is updated accordingly (line 5). After executing this code,id1 in line 21(resp.id2 in line 26)points to object 1in line 19 (resp.2 in line 24)only.However,if the three methods of Person are analyzed using a context-insensitive pointer analysis,then idl and id2 will both imprecisely point to objects 1and2.Let us examine how this imprecision is connected to the direct flow pattern. The right-hand side of Figure 3 illustrates how two objects1and 2,respectively pointed to by name1 and name2,first flow from their creation sites in lines 19 and 24 to the parameter nm of the IN method setName in line 3,and then to id in line 12 through a series of store and load operations (line 4-line 8-line 9-line 12),and finally out of the Our method getID to id1 and id2 in lines 21 and 26.Hence,by Definition 3.3,the red arrows in Figure 3 form a direct flow. Notice that with a context-insensitive analysis,objects 1and 2are merged in the same points-to set and further propagated according to this direct flow.In the analysis,the merged objects will flow out of the Our method,causing id1 and id2 to point to spurious objects.Such imprecision will only get worse when some operations are further applied on id1 and id2(not shown in this example),possibly polluting other parts of the program. One way to avoid the imprecision is to apply context sensitivity to the methods that participate in the direct flow.We consider these to be precision-critical methods,since analyzing just one of them context-insensitively will likely introduce imprecision.With a context-sensitive analysis(for most variants of context sensitivity),in Figure 3,all variables and field references along the direct flow will be analyzed separately.For example,object sensitivity will use the two allocation sites at lines 18 and 23 as contexts.Accordingly,the merged paths along this direct flow are separated by the two contexts,like unzipping a zipper-hence the name of our technique.A similar strategy of separating merged paths also applies to wrapped and unwrapped flows,as shown next. 3.2 Pattern 2:Wrapped Flow The collection and iterator example shown in Figure 4 demonstrates how the wrapped flow pattern yields precision loss for a context-insensitive analysis.To keep the example simple,the collection ACM Trans.Program.Lang.Syst.,Vol.1,No.1,Article 1.Publication date:January 2020
A Principled Approach to Selective Context Sensitivity for Pointer Analysis 1:7 Direct Flow name1 1 2 1 2 class Person { 1 String name; String id; void setName(String nm) { this.name = nm; updateID(); } void updateID() { String newName = this.name; this.id = newName; } String getID() { String id = this.id; return id; } 1 2 3 4 5 6 7 8 9 10 11 12 13 14 } // Usage Code void main() { Person p1 = new Person(); String name1 = new String("A"); p1.setName(name1); String id1 = p1.getID(); Person p2 = new Person(); String name2 = new String("B"); p2.setName(name2); String id2 = p2.getID(); } 15 16 17 18 19 20 21 22 23 24 25 26 27 name2 nm this.name newName this.id id id1 id2 2 (24) (3) (4) (8) (9) (12) (21) (26) (19) Fig. 3. Example of direct flow. (The line number for each variable/field reference on the right-hand side is shown in parentheses.) 3.1 Pattern 1: Direct Flow The setter and getter example shown in Figure 3 demonstrates how direct flow is an indication of precision loss for a context-insensitive analysis. The Person class provides methods setName and getID to modify a person’s name and retrieve his or her ID. Whenever a person’s name is modified, the ID is updated accordingly (line 5). After executing this code, id1 in line 21 (resp. id2 in line 26) points to object 1 in line 19 (resp. 2 in line 24) only. However, if the three methods of Person are analyzed using a context-insensitive pointer analysis, then id1 and id2 will both imprecisely point to objects 1 and 2 . Let us examine how this imprecision is connected to the direct flow pattern. The right-hand side of Figure 3 illustrates how two objects 1 and 2 , respectively pointed to by name1 and name2, first flow from their creation sites in lines 19 and 24 to the parameter nm of the In method setName in line 3, and then to id in line 12 through a series of store and load operations (line 4 → line 8 → line 9 → line 12), and finally out of the Out method getID to id1 and id2 in lines 21 and 26. Hence, by Definition 3.3, the red arrows in Figure 3 form a direct flow. Notice that with a context-insensitive analysis, objects 1 and 2 are merged in the same points-to set and further propagated according to this direct flow. In the analysis, the merged objects will flow out of the Out method, causing id1 and id2 to point to spurious objects. Such imprecision will only get worse when some operations are further applied on id1 and id2 (not shown in this example), possibly polluting other parts of the program. One way to avoid the imprecision is to apply context sensitivity to the methods that participate in the direct flow. We consider these to be precision-critical methods, since analyzing just one of them context-insensitively will likely introduce imprecision. With a context-sensitive analysis (for most variants of context sensitivity), in Figure 3, all variables and field references along the direct flow will be analyzed separately. For example, object sensitivity will use the two allocation sites at lines 18 and 23 as contexts. Accordingly, the merged paths along this direct flow are separated by the two contexts, like unzipping a zipper—hence the name of our technique. A similar strategy of separating merged paths also applies to wrapped and unwrapped flows, as shown next. 3.2 Pattern 2: Wrapped Flow The collection and iterator example shown in Figure 4 demonstrates how the wrapped flow pattern yields precision loss for a context-insensitive analysis. To keep the example simple, the collection ACM Trans. Program. Lang. Syst., Vol. 1, No. 1, Article 1. Publication date: January 2020
1:8 Yue Li,Tian Tan,Anders Moller,and Yannis Smaragdakis Wrapped Flow class Collection{ 18 Object next(){ ( ② 2 Object elem; 19 return this.next: (25)s1 s2(31) 3 void add(Object el){ 20 this.elem =el: 21 (3) 22 //Usage Code 6 Iterator iterator(){ 23 void main(){ this.elem (4) Object e this.elem: 24 Collection c1 new Collection(): Iterator itr 25 String s1 new String("A"): ⊙ new Iterator(e); 26 c1.add(s1): 9 return itr; 27 Iterator i1 c1.iterator(): (8)itr obi (14) 10 } 28 Object o1 i1.next() 11} 29 this next(③X②) (15) 12 class Iterator 30 Collection c2 new Collection(): 13 Object next: 1 String s2 new String("B"): 17 (27)(33) 14 Iterator(Object obj){ 32 c2.add(s2): ①回0回 15 this.next obj: 33 Iterator 12 c2.iterator(); 16 34 Object o2 12.next(); 01 (28)(34) 17 35 ①@①② Fig.4.Example of wrapped flow. only stores one element,however the code pattern is directly analogous to realistic code,for arbitrarily-sized collections.Class Collection provides an add method to add an element to the collection and an iterator method to return an iterator that has a pointer,next,pointing to the collection element(as set in line 15).The element is passed as an argument to the newly created iterator(line 8),which establishes a connection between the collection and its iterator.Two objects D(line 25)and (line 31)are stored in two different collections,c1(line 26)and c2 (line 32).The two objects are then accessed by the iterators of the collections(lines 28 and 34). After executing the code,o1 in line 28(resp.o2 in line 34)points to object 1(resp.)only.How- ever,if any of the four methods of Collection and Iterator are analyzed context-insensitively,o1 and o2 will both imprecisely point to both objects1and 2.Let us examine how this imprecision is connected to the wrapped flow pattern. As shown on the right-hand side of Figure 4,similarly to the direct flow example in Figure 3, objects 1 and2 flow into the IN method add of class Collection,and then further to lines 7, 8,and 14.Unlike a direct flow,the objects 1 and2 do not directly flow out of the Our method iterator of class Collection;instead,a wrapper Iterator object,(created on line 8)in which object 1 or 2 is stored,flows out of this Our method. Object wrapping(Definition 3.2)occurs in line 15:objects 1and (pointed to by obj)are stored into the next field of the object pointed to by this,and this points to the receiver object of the constructor call in line 8,which is also pointed to by itr in line 8.As a wrapper object(that stores object1or )flows out of an OUT method of the same class,by Definition 3.4,the solid blue arrows in Figure 4 form a wrapped flow. With a context-insensitive analysis,objects 1 and 2are merged in the same points-to set and further propagated according to this wrapped flow.However,unlike a direct flow,imprecision is not introduced until the access operation (e.g.,the next calls in lines 28 and 34)is applied on the flowing-out wrapper object,causing variables o1 and o2 to point to spurious objects.The wrapper objects carry the flowing-in objects,which originate from outside the class,so context sensitivity can separate the merged objects all along their flow through the Collection class. The example also helps illustrate some subtleties of the flow definitions.Note that the precision loss patterns are expressed relative to a class:for each of the three patterns,the IN method and ACM Trans.Program.Lang.Syst.,Vol.1,No.1,Article 1.Publication date:January 2020
1:8 Yue Li, Tian Tan, Anders Møller, and Yannis Smaragdakis s1 s2 o1 o2 itr this.elem el Wrapped Flow 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 Object next() { return this.next; } } //Usage Code void main() { Collection c1 = new Collection(); String s1 = new String("A"); c1.add(s1); Iterator i1 = c1.iterator(); Object o1 = i1.next(); Collection c2 = new Collection(); String s2 = new String("B"); c2.add(s2); Iterator i2 = c2.iterator(); Object o2 = i2.next(); } class Collection { Object elem; void add(Object el) { this.elem = el; } Iterator iterator(){ Object e = this.elem; Iterator itr = new Iterator(e); return itr; } } class Iterator { Object next; Iterator(Object obj) { this.next = obj; } e obj this.next 1 2 i1 i2 1 2 1 2 1 2 1 2 1 2 (25) (31) (3) (4) (7) (14) (15) (33) (34) (8) (27) (28) Fig. 4. Example of wrapped flow. only stores one element, however the code pattern is directly analogous to realistic code, for arbitrarily-sized collections. Class Collection provides an add method to add an element to the collection and an iterator method to return an iterator that has a pointer, next, pointing to the collection element (as set in line 15). The element is passed as an argument to the newly created iterator (line 8), which establishes a connection between the collection and its iterator. Two objects 1 (line 25) and 2 (line 31) are stored in two different collections, c1 (line 26) and c2 (line 32). The two objects are then accessed by the iterators of the collections (lines 28 and 34). After executing the code, o1 in line 28 (resp. o2 in line 34) points to object 1 (resp. 2 ) only. However, if any of the four methods of Collection and Iterator are analyzed context-insensitively, o1 and o2 will both imprecisely point to both objects 1 and 2 . Let us examine how this imprecision is connected to the wrapped flow pattern. As shown on the right-hand side of Figure 4, similarly to the direct flow example in Figure 3, objects 1 and 2 flow into the In method add of class Collection, and then further to lines 7, 8, and 14. Unlike a direct flow, the objects 1 and 2 do not directly flow out of the Out method iterator of class Collection; instead, a wrapper Iterator object, , (created on line 8) in which object 1 or 2 is stored, flows out of this Out method. Object wrapping (Definition 3.2) occurs in line 15: objects 1 and 2 (pointed to by obj) are stored into the next field of the object pointed to by this, and this points to the receiver object of the constructor call in line 8, which is also pointed to by itr in line 8. As a wrapper object (that stores object 1 or 2 ) flows out of an Out method of the same class, by Definition 3.4, the solid blue arrows in Figure 4 form a wrapped flow. With a context-insensitive analysis, objects 1 and 2 are merged in the same points-to set and further propagated according to this wrapped flow. However, unlike a direct flow, imprecision is not introduced until the access operation (e.g., the next calls in lines 28 and 34) is applied on the flowing-out wrapper object, causing variables o1 and o2 to point to spurious objects. The wrapper objects carry the flowing-in objects, which originate from outside the class, so context sensitivity can separate the merged objects all along their flow through the Collection class. The example also helps illustrate some subtleties of the flow definitions. Note that the precision loss patterns are expressed relative to a class: for each of the three patterns, the In method and ACM Trans. Program. Lang. Syst., Vol. 1, No. 1, Article 1. Publication date: January 2020
A Principled Approach to Selective Context Sensitivity for Pointer Analysis 19 Unwrapped Flow 1 class SyncBox 19 Object getItem()( Object box: 28 Object it this.item: (26) s1 s2 (31) 3 SyncBox(Boxbox){ 21 return it; this.box box: 22 (27) b1 b2 32) 23 box (3) 6 Object getItem(){ 24 //Usage Code synchronized(this){ 25 void main(){ this.box (4) 8 Box b this.box: 26 String s1 new String("A"): 9 object o b.getItem(): 27 Box b1 new Box(s1); 回 (8) 10 return o: 28 SyncBox sb1 new SyncBox(b1): 11 29 Object o1 sb1.getItem(): (20) Eh因1tem①② 12 30 13} 31 String s2 new String("B"): (20) 14 class Box{ 32 Box b2 new Box(s2): ǎ 15 Object item: 33 SyncBox sb2 new SyncBox(b2): (9) 16 Box(Object item){ 34 Object o2 sb2.getItem() 1) this.item item; 353 (29)01 02(34) 18 ①0 ①@ Fig.5.Example of unwrapped flow. the Our method must be in the same class,although the value flow may involve other classes,as described in Definitions 3.3-3.5.Intuitively,if the precision-loss flows introduced in each class (through method calls on the objects of the class)could be identified and then avoided by use of context sensitivity,the imprecision of the whole program could be accordingly controlled via such a divide-and-conquer scheme.In addition,this design choice enables an efficient and elegant algorithm for identifying occurrences of the patterns in a given program,by considering each class one by one,as explained in Section 4. Therefore,the dashed arrows(bottom right of Figure 4)formed by calling the next method in lines 28 and 34,do not belong to the wrapped flow,because the calls happen after the wrapper objects flow out from the Our method of class Collection.Thus,as explained in Section 3.1, only methods add and iterator (in Collection)and the constructor Iterator(in Iterator)are included in the wrapped flow and thus considered precision-critical.However,if we consider IN and Our methods from the point of view of class Iterator,then method next is also precision-critical, since it is involved in a direct flow together with the Iterator constructor,much like the setter and getter methods in Section 3.1. 3.3 Pattern 3:Unwrapped Flow We use a synchronized box example(based on classes SynchronizedSet and Set in the JDK but heavily simplified)to illustrate an unwrapped flow,as shown in Figure 5.Class SyncBox encapsulates class Box by providing synchronization in the encapsulating method getItem(lines 6-12).Two objects 1 and 2are stored into two Box objects(represented by and pointed to by b1 and b2 in lines 27 and 32),which are further stored into two SyncBox objects(lines 28 and 33). After executing the code,o1 in line 29 (resp.o2 in line 34)points to object1(resp.2)only. However,if any of the four methods of classes SyncBox and Box are analyzed context-insensitively, o1 and o2 will both imprecisely point to both objects1and 2.Let us examine how this imprecision is connected to the unwrapped flow pattern. As shown on the right-hand side of Figure 5,similar to the direct flow in Figure 3,two Box objects 1and 2(pointed to by b1 and b2,respectively)flow into the body of class SyncBox through its constructor,which acts as an IN method,and then further to b in line 8.Unlike in a direct flow, ACM Trans.Program.Lang.Syst.,Vol.1,No.1,Article 1.Publication date:January 2020
A Principled Approach to Selective Context Sensitivity for Pointer Analysis 1:9 o1 o2 (34) b1 s1 s2 b this.box box Unwrapped Flow 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 this.item class SyncBox { Object box; SyncBox(Box box) { this.box = box; } Object getItem() { synchronized(this) { Box b = this.box; Object o = b.getItem(); return o; } } } class Box { Object item; Box(Object item) { this.item = item; } //Usage Code void main() { String s1 = new String("A"); Box b1 = new Box(s1); SyncBox sb1 = new SyncBox(b1); Object o1 = sb1.getItem(); String s2 = new String("B"); Box b2 = new Box(s2); SyncBox sb2 = new SyncBox(b2); Object o2 = sb2.getItem(); } Object getItem() { Object it = this.item; return it; } } b2 it o (26) (31) (27) (32) (3) (4) (8) (20) (20) (9) (29) 1 2 1 2 1 2 1 2 1 2 Fig. 5. Example of unwrapped flow. the Out method must be in the same class, although the value flow may involve other classes, as described in Definitions 3.3—3.5. Intuitively, if the precision-loss flows introduced in each class (through method calls on the objects of the class) could be identified and then avoided by use of context sensitivity, the imprecision of the whole program could be accordingly controlled via such a divide-and-conquer scheme. In addition, this design choice enables an efficient and elegant algorithm for identifying occurrences of the patterns in a given program, by considering each class one by one, as explained in Section 4. Therefore, the dashed arrows (bottom right of Figure 4) formed by calling the next method in lines 28 and 34, do not belong to the wrapped flow, because the calls happen after the wrapper objects flow out from the Out method of class Collection. Thus, as explained in Section 3.1, only methods add and iterator (in Collection) and the constructor Iterator (in Iterator) are included in the wrapped flow and thus considered precision-critical. However, if we consider In and Out methods from the point of view of class Iterator, then method next is also precision-critical, since it is involved in a direct flow together with the Iterator constructor, much like the setter and getter methods in Section 3.1. 3.3 Pattern 3: Unwrapped Flow We use a synchronized box example (based on classes SynchronizedSet and Set in the JDK but heavily simplified) to illustrate an unwrapped flow, as shown in Figure 5. Class SyncBox encapsulates class Box by providing synchronization in the encapsulating method getItem (lines 6–12). Two objects 1 and 2 are stored into two Box objects (represented by and pointed to by b1 and b2 in lines 27 and 32), which are further stored into two SyncBox objects (lines 28 and 33). After executing the code, o1 in line 29 (resp. o2 in line 34) points to object 1 (resp. 2 ) only. However, if any of the four methods of classes SyncBox and Box are analyzed context-insensitively, o1 and o2 will both imprecisely point to both objects 1 and 2 . Let us examine how this imprecision is connected to the unwrapped flow pattern. As shown on the right-hand side of Figure 5, similar to the direct flow in Figure 3, two Box objects 1 and 2 (pointed to by b1 and b2, respectively) flow into the body of class SyncBox through its constructor, which acts as an In method, and then further to b in line 8. Unlike in a direct flow, ACM Trans. Program. Lang. Syst., Vol. 1, No. 1, Article 1. Publication date: January 2020
1:10 Yue Li,Tian Tan,Anders Moller,and Yannis Smaragdakis the flowing-in objects and 2]do not flow out of the Our method getItem of class SyncBox; instead,the two unwrapped objects 1 and 2(respectively stored in1 and 2)are the ones that flow out of this Our method. Object unwrapping (Definition 3.2)occurs in line 20,as a result of the call in line 9:the Box objects (1and 2pointed to by b)are the receiver objects of this virtual call,and this in line 20 will also point to them during pointer analysis.The load operation in line 20 lets the unwrapped objects(1 and 2)flow to it (line 20),and finally to o1 and o2 (lines 29 and 34)through consecutive method return values(line 21-line 9 and then line 10-lines 29 and 34).As the unwrapped objects(retrieved from the flowing-in objects)flow out of an Our method of the same class,by Definition 3.5,the green arrows(in Figure 5)form an unwrapped flow. We can observe that objects and 2](and hence the unwrapped objects 1and they contain) are merged in the same points-to set and further propagated according to this unwrapped flow. Although the flowing-in objects do not flow out of an Our method of the same class to introduce imprecision,the unwrapped objects do,causing the receiving variables,in this case o1 and o2(lines 29 and 34),to point to spurious objects. Note that the program points where the unwrapped objects are stored in the flowing-in objects (lines 26-27 and 31-32)do not belong in the unwrapped flow,as the objects have not yet entered the IN method of class SyncBox.Thus,only constructor SyncBox,method getItem(in SyncBox), and method getItem(in Box)belong in the unwrapped flow and are considered precision-critical. However,as in the explanation of the wrapped flow example in Section 3.2,if we consider IN and Our methods from the point of view of class Box,its constructor,Box,will still be analyzed context-sensitively as it is part of a direct flow(together with the getItem method in Box). 3.4 Combination of Flows Some imprecision cannot be described by one pattern alone but only by combinations.Consider the example of an object W that flows into an IN method,where an object O is unwrapped from W.Then O is wrapped into another wrapper object,W',which flows out from an Our method of the same class.Imprecision may arise in this case,and although none of the three basic flow patterns in isolation match this flow,it is captured by a combination of unwrapped and wrapped flows.ZIPPER identifies not only occurrences of the three patterns but also such combinations.Our experiments(Section 6)show that the patterns and their combinations account for essentially all the imprecision that may appear in context-insensitive analysis. 4 ZIPPER This section introduces ZIPPER:our approach for identifying precision-critical methods based on the precision loss patterns of Section 3.Even if the patterns successfully characterize the main causes of precision loss in context-insensitive analysis,two challenges remain.First,the precision loss patterns are defined in dynamic execution terms,while ZIPPER has to capture the potential for these patterns using static information.Second,useful static information has to be computable from a mere context-insensitive analysis,in order to guide a context-sensitive one.That is,the potential for precision loss has to be detected from an analysis that already exhibits this loss.The ZIPPER approach is defined with these goals in mind,and manages to make context-sensitive pointer analysis run faster while preserving most of its precision. We present the overview of ZIPPER in Section 4.1 and the concepts of object flow graphs and precision flow graphs in Sections 4.2 and 4.3,respectively. ACM Trans.Program.Lang.Syst.,Vol.1,No.1,Article 1.Publication date:January 2020
1:10 Yue Li, Tian Tan, Anders Møller, and Yannis Smaragdakis the flowing-in objects 1 and 2 do not flow out of the Out method getItem of class SyncBox; instead, the two unwrapped objects 1 and 2 (respectively stored in 1 and 2 ) are the ones that flow out of this Out method. Object unwrapping (Definition 3.2) occurs in line 20, as a result of the call in line 9: the Box objects ( 1 and 2 pointed to by b) are the receiver objects of this virtual call, and this in line 20 will also point to them during pointer analysis. The load operation in line 20 lets the unwrapped objects ( 1 and 2 ) flow to it (line 20), and finally to o1 and o2 (lines 29 and 34) through consecutive method return values (line 21 → line 9 and then line 10 → lines 29 and 34). As the unwrapped objects (retrieved from the flowing-in objects) flow out of an Out method of the same class, by Definition 3.5, the green arrows (in Figure 5) form an unwrapped flow. We can observe that objects 1 and 2 (and hence the unwrapped objects 1 and 2 they contain) are merged in the same points-to set and further propagated according to this unwrapped flow. Although the flowing-in objects do not flow out of an Out method of the same class to introduce imprecision, the unwrapped objects do, causing the receiving variables, in this case o1 and o2 (lines 29 and 34), to point to spurious objects. Note that the program points where the unwrapped objects are stored in the flowing-in objects (lines 26–27 and 31–32) do not belong in the unwrapped flow, as the objects have not yet entered the In method of class SyncBox. Thus, only constructor SyncBox, method getItem (in SyncBox), and method getItem (in Box) belong in the unwrapped flow and are considered precision-critical. However, as in the explanation of the wrapped flow example in Section 3.2, if we consider In and Out methods from the point of view of class Box, its constructor, Box, will still be analyzed context-sensitively as it is part of a direct flow (together with the getItem method in Box). 3.4 Combination of Flows Some imprecision cannot be described by one pattern alone but only by combinations. Consider the example of an object 𝑊 that flows into an In method, where an object 𝑂 is unwrapped from 𝑊 . Then 𝑂 is wrapped into another wrapper object, 𝑊 ′ , which flows out from an Out method of the same class. Imprecision may arise in this case, and although none of the three basic flow patterns in isolation match this flow, it is captured by a combination of unwrapped and wrapped flows. Zipper identifies not only occurrences of the three patterns but also such combinations. Our experiments (Section 6) show that the patterns and their combinations account for essentially all the imprecision that may appear in context-insensitive analysis. 4 ZIPPER This section introduces Zipper: our approach for identifying precision-critical methods based on the precision loss patterns of Section 3. Even if the patterns successfully characterize the main causes of precision loss in context-insensitive analysis, two challenges remain. First, the precision loss patterns are defined in dynamic execution terms, while Zipper has to capture the potential for these patterns using static information. Second, useful static information has to be computable from a mere context-insensitive analysis, in order to guide a context-sensitive one. That is, the potential for precision loss has to be detected from an analysis that already exhibits this loss. The Zipper approach is defined with these goals in mind, and manages to make context-sensitive pointer analysis run faster while preserving most of its precision. We present the overview of Zipper in Section 4.1 and the concepts of object flow graphs and precision flow graphs in Sections 4.2 and 4.3, respectively. ACM Trans. Program. Lang. Syst., Vol. 1, No. 1, Article 1. Publication date: January 2020