Chapter 2.Background:Context-Sensitive Points-to Analysis 14 method containing the allocation site as the heap context for the abstract object created at that site(we write""to denote an unused arguments): heapCtrSelector(c,)=c It implies that the domains of method context (C)and heap context (HC)are the same.For brevity,we omit heapCtrSelector in the rest of this section and focus on mtdCtaSelector. 2.2.3.1 Call-Site-Sensitivity For a method,call-site-sensitivity [71,99,34,also referred to as k-CFA 71,52] or call-string-sensitivity [70,60,80],distinguishes its contexts by sequences of call sites on the call stacks of method invocations that lead to the method.In this case, the domain of method context C consists of sequences of call sites: C=IUIUI2... When dealing with a method call,call-site-sensitivity concatenates the context c of the method containing the current call site and the call site l itself as the method context for the callee method (is a concatenation operator): mtdCtrSelector(c,1,_)=c+l 2.2.3.2 Object-Sensitivity In contrast to call-site-sensitivity,object-sensitivity distinguishes method contexts by the receiver objects at each call site 53,55,75.The intuition behind object- sensitivity is that in object-oriented languages such as Java,one critical role of instance methods is to manipulate the receiver objects on which they are invoked. Therefore object-sensitivity aims to separately analyze the operations performed on different (receiver)objects to improve the analysis precision
Chapter 2. Background: Context-Sensitive Points-to Analysis 14 method containing the allocation site as the heap context for the abstract object created at that site (we write “ ” to denote an unused arguments): heapCtxSelector(c, ) = c It implies that the domains of method context (C) and heap context (HC) are the same. For brevity, we omit heapCtxSelector in the rest of this section and focus on mtdCtxSelector. 2.2.3.1 Call-Site-Sensitivity For a method, call-site-sensitivity [71, 99, 34], also referred to as k-CFA [71, 52] or call-string-sensitivity [70, 60, 80], distinguishes its contexts by sequences of call sites on the call stacks of method invocations that lead to the method. In this case, the domain of method context C consists of sequences of call sites: C = I 0 ∪ I 1 ∪ I 2 ... When dealing with a method call, call-site-sensitivity concatenates the context c of the method containing the current call site and the call site l itself as the method context for the callee method (++ is a concatenation operator): mtdCtxSelector(c, l, , ) = c ++ l 2.2.3.2 Object-Sensitivity In contrast to call-site-sensitivity, object-sensitivity distinguishes method contexts by the receiver objects at each call site [53, 55, 75]. The intuition behind objectsensitivity is that in object-oriented languages such as Java, one critical role of instance methods is to manipulate the receiver objects on which they are invoked. Therefore object-sensitivity aims to separately analyze the operations performed on different (receiver) objects to improve the analysis precision
Chapter 2.Background:Context-Sensitive Points-to Analysis 15 In object-sensitivity,the context elements are(abstract)heap objects: C=HUH UH2... The mtdCtrSelector for object-sensitivity is defined as above.At a call site,it selects the receiver object o;together with its heap context hc as the context of the callee method: mtdCtrSelector(_,,hc,oi)=hc++oi In this way,the method contexts of a method totally depend on its receiver objects (with their heap contexts),and the methods invoked on different objects will be analyzed under different contexts. 2.2.3.3 Type-Sensitivity Type-sensitivity [75]is a kind of practical context-sensitivity as it achieves a large part of the precision of object-sensitivity while being more efficient.Technically, it is an approximation of object-sensitivity.Object-sensitivity uses the receiver objects at a call site as the contexts for the callee method.Instead,type-sensitivity replaces the receiver objects in the contexts by the class types which enclose the allocation sites of the receiver objects.Therefore,type-sensitivity approximates object-sensitivity by merging the objects allocated in the same types as contexts. Thus the domain of context in type-sensitivity is composed by sequences of class types: C=TOUTI UT2... To formalize type-sensitivity,we define a function enclosing Type:H->T which maps a heap object to a class type enclosing the allocation site of the object.Now we can instantiate mtdCtrSelector for type-sensitivity:
Chapter 2. Background: Context-Sensitive Points-to Analysis 15 In object-sensitivity, the context elements are (abstract) heap objects: C = H0 ∪ H1 ∪ H2 ... The mtdCtxSelector for object-sensitivity is defined as above. At a call site, it selects the receiver object oi together with its heap context hc as the context of the callee method: mtdCtxSelector( , , hc, oi) = hc ++ oi In this way, the method contexts of a method totally depend on its receiver objects (with their heap contexts), and the methods invoked on different objects will be analyzed under different contexts. 2.2.3.3 Type-Sensitivity Type-sensitivity [75] is a kind of practical context-sensitivity as it achieves a large part of the precision of object-sensitivity while being more efficient. Technically, it is an approximation of object-sensitivity. Object-sensitivity uses the receiver objects at a call site as the contexts for the callee method. Instead, type-sensitivity replaces the receiver objects in the contexts by the class types which enclose the allocation sites of the receiver objects. Therefore, type-sensitivity approximates object-sensitivity by merging the objects allocated in the same types as contexts. Thus the domain of context in type-sensitivity is composed by sequences of class types: C = T 0 ∪ T 1 ∪ T 2 ... To formalize type-sensitivity, we define a function enclosingType : H → T which maps a heap object to a class type enclosing the allocation site of the object. Now we can instantiate mtdCtxSelector for type-sensitivity:
Chapter 2.Background:Context-Sensitive Points-to Analysis 16 mtdCtrSelector(_,,hc,oi)=hc++enclosingType(oi) As explained in the beginning of Section 2.2.3,heapCtrSelector selects the con- text of the method which contains the allocation site o;as o;'s heap context he.So in mtdCtaSelector,the heap context hc of o;also consists of a sequence of types
Chapter 2. Background: Context-Sensitive Points-to Analysis 16 mtdCtxSelector( , , hc, oi) = hc ++ enclosingT ype(oi) As explained in the beginning of Section 2.2.3, heapCtxSelector selects the context of the method which contains the allocation site oi as oi ’s heap context hc. So in mtdCtxSelector, the heap context hc of oi also consists of a sequence of types
Chapter 3 BEAN:Precise Points-to Analysis via Object Allocation Graph 3.1 Overview Two major dimensions for improving precision of points-to analysis are flow- sensitivity and context-sensitivity.For C/C++programs,flow-sensitivity is needed by many clients [28,37,105,104,86].For object-oriented programs,e.g.,Java programs,however,context-sensitivity is known to deliver tractable and useful precision [39,41,42,43,72,75,79],in general. We have introduced several kinds of commonly-used context-sensitivity in Chap- ter 2.Among all the context abstractions proposed,object-sensitivity is generally the most precise one in practice and is regarded as arguably the best for points- to analysis in object-oriented languages [39,75,34].This can be seen from its widespread adoption in a number of points-to analysis frameworks for Java,such as Doop [22,14],CHORD [16]and WALA [98].In addition,object-sensitivity has also been embraced by many other program analysis tasks,including typestate 17
Chapter 3 Bean: Precise Points-to Analysis via Object Allocation Graph 3.1 Overview Two major dimensions for improving precision of points-to analysis are flowsensitivity and context-sensitivity. For C/C++ programs, flow-sensitivity is needed by many clients [28, 37, 105, 104, 86]. For object-oriented programs, e.g., Java programs, however, context-sensitivity is known to deliver tractable and useful precision [39, 41, 42, 43, 72, 75, 79], in general. We have introduced several kinds of commonly-used context-sensitivity in Chapter 2. Among all the context abstractions proposed, object-sensitivity is generally the most precise one in practice and is regarded as arguably the best for pointsto analysis in object-oriented languages [39, 75, 34]. This can be seen from its widespread adoption in a number of points-to analysis frameworks for Java, such as Doop [22, 14], Chord [16] and Wala [98]. In addition, object-sensitivity has also been embraced by many other program analysis tasks, including typestate 17
Chapter 3.BEAN:Precise Points-to Analysis via Object Allocation Graph 18 verification [25,106],data race detection [57],information flow analysis [6,27,50], and program slicing [43] Despite its success,a k-object-sensitive points-to analysis,denoted k-obj,which uses a sequence of k allocation sites (as k context elements)to represent a calling context of a method call,may end up using some context elements redundantly in the sense that these redundant context elements fail to induce a finer partition of the space of (concrete)calling contexts for the method call.As a result,many opportunities for making further precision improvements are missed. In this chapter,we introduce BEAN,a general approach for improving the preci- sion of a k-object-sensitive points-to analysis,denoted k-obj,for Java,by avoiding redundant context elements in k-obj while still maintaining a k-limiting context abstraction.BEAN can also be considered as a new context-sensitivity approach, which breaks the informed opinion about the consecutive context elements as inher- ited in the traditional k-CFA-based context-sensitivity.The novelty of BEAN lies in identifying redundant context elements by solving a graph problem on an Object Allocation Graph (OAG),which is built based on a pre-analysis (e.g.,a context- insensitive Andersen's analysis)performed initially on a program,and then avoid them in the subsequent k-object-sensitive analysis.By construction,BEAN is gen- erally more precise than k-obj,with a precision that is guaranteed to be as good as k-obj in the worst case. We have implemented BEAN and applied it to refine two state-of-the-art(whole- program)points-to analyses,2-obj and S-2-obj [34],provided in Doop [22],result- ing in two BEAN-directed points-to analyses,B-2-obj and B-S-2-0bj,respectively. We have considered may-alias and may-fail-cast,two representative clients used elsewhere [23,75,79]for measuring the precision of a points-to analysis on a set of nine large Java programs from the DaCapo benchmark suite [8].Our results show
Chapter 3. Bean: Precise Points-to Analysis via Object Allocation Graph 18 verification [25, 106], data race detection [57], information flow analysis [6, 27, 50], and program slicing [43]. Despite its success, a k-object-sensitive points-to analysis, denoted k-obj, which uses a sequence of k allocation sites (as k context elements) to represent a calling context of a method call, may end up using some context elements redundantly in the sense that these redundant context elements fail to induce a finer partition of the space of (concrete) calling contexts for the method call. As a result, many opportunities for making further precision improvements are missed. In this chapter, we introduce Bean, a general approach for improving the precision of a k-object-sensitive points-to analysis, denoted k-obj, for Java, by avoiding redundant context elements in k-obj while still maintaining a k-limiting context abstraction. Bean can also be considered as a new context-sensitivity approach, which breaks the informed opinion about the consecutive context elements as inherited in the traditional k-CFA-based context-sensitivity. The novelty of Bean lies in identifying redundant context elements by solving a graph problem on an Object Allocation Graph (OAG), which is built based on a pre-analysis (e.g., a contextinsensitive Andersen’s analysis) performed initially on a program, and then avoid them in the subsequent k-object-sensitive analysis. By construction, Bean is generally more precise than k-obj, with a precision that is guaranteed to be as good as k-obj in the worst case. We have implemented Bean and applied it to refine two state-of-the-art (wholeprogram) points-to analyses, 2-obj and S-2-obj [34], provided in Doop [22], resulting in two Bean-directed points-to analyses, B-2-obj and B-S-2-obj, respectively. We have considered may-alias and may-fail-cast, two representative clients used elsewhere [23, 75, 79] for measuring the precision of a points-to analysis on a set of nine large Java programs from the DaCapo benchmark suite [8]. Our results show