Making Pointer Analysis More Precise by Unleashing the Power of Selective Context Sensitivity TIAN TAN,Nanjing University,China YUE LI',Nanjing University,China XIAOXING MA,Nanjing University,China CHANG XU,Nanjing University,China YANNIS SMARAGDAKIS,University of Athens,Greece Traditional context-sensitive pointer analysis is hard to scale for large and complex Java programs.To address this issue,a series of selective context-sensitivity approaches have been proposed and exhibit promising results. In this work,we move one step further towards producing highly-precise pointer analyses for hard-to-analyze Java programs by presenting the Unity-Relay framework,which takes selective context sensitivity to the next 147 level.Briefly,Unity-Relay is a one-two punch:given a set of different selective context-sensitivity approaches, say S=S1,...,Sn,Unity-Relay first provides a mechanism(called Unity)to combine and maximize the precision of all components of S.When Unity fails to scale,Unity-Relay offers a scheme(called Relay)to pass and accumulate the precision from one approach Si in S to the next,Si+1,leading to an analysis that is more precise than all approaches in S. As a proof-of-concept,we instantiate Unity-Relay into a tool called BATON and extensively evaluate it on a set of hard-to-analyze Java programs,using general precision metrics and popular clients.Compared with the state of the art,BATON achieves the best precision for all metrics and clients for all evaluated programs. The difference in precision is often dramatic-up to 71%of alias pairs reported by previously-best algorithms are found to be spurious and eliminated. CCS Concepts:.Theory of computation-Program analysis. Additional Key Words and Phrases:Pointer Analysis,Alias Analysis,Context Sensitivity,Java ACM Reference Format: Tian Tan,Yue Li,Xiaoxing Ma,Chang Xu,and Yannis Smaragdakis.2021.Making Pointer Analysis More Precise by Unleashing the Power of Selective Context Sensitivity.Proc.ACM Program.Lang.5,OOPSLA, Article 147(October 2021),27 pages.https://doi.org/10.1145/3485524 1 INTRODUCTION Pointer analysis is important for an array of real-world applications such as bug detection [Chandra et al.2009;Naik et al.2006],security analysis [Arzt et al.2014;Livshits and Lam 2005],program verification [Fink et al.2008;Pradel et al.2012]and program understanding [Li et al.2016;Sridharan "Corresponding author Authors'addresses:Tian Tan,State Key Laboratory for Novel Software Technology,Nanjing University,China,tiantan@ nju.edu.cn;Yue Li,State Key Laboratory for Novel Software Technology,Nanjing University,China,yueli@nju.edu.cn; Xiaoxing Ma,State Key Laboratory for Novel Software Technology,Nanjing University,China,xxm@nju.edu.cn;Chang Xu, State Key Laboratory for Novel Software Technology,Nanjing University,China,changxu@nju.edu.cn;Yannis Smaragdakis, Department of Informatics and Telecommunications,University of Athens,Greece,yannis@smaragd.org. ©0 Y This work is licensed under a Creative Commons Attribution 4.0 International License. 2021 Copyright held by the owner/author(s). 2475-1421/2021/10-ART147 https:/doi.org/10.1145/3485524 Proc.ACM Program.Lang.,Vol.5,No.OOPSLA,Article 147.Publication date:October 2021
147 Making Pointer Analysis More Precise by Unleashing the Power of Selective Context Sensitivity TIAN TAN, Nanjing University, China YUE LI∗ , Nanjing University, China XIAOXING MA, Nanjing University, China CHANG XU, Nanjing University, China YANNIS SMARAGDAKIS, University of Athens, Greece Traditional context-sensitive pointer analysis is hard to scale for large and complex Java programs. To address this issue, a series of selective context-sensitivity approaches have been proposed and exhibit promising results. In this work, we move one step further towards producing highly-precise pointer analyses for hard-to-analyze Java programs by presenting the Unity-Relay framework, which takes selective context sensitivity to the next level. Briefly, Unity-Relay is a one-two punch: given a set of different selective context-sensitivity approaches, say 𝑆 = 𝑆1, . . . , 𝑆𝑛, Unity-Relay first provides a mechanism (called Unity) to combine and maximize the precision of all components of 𝑆. When Unity fails to scale, Unity-Relay offers a scheme (called Relay) to pass and accumulate the precision from one approach 𝑆𝑖 in 𝑆 to the next, 𝑆𝑖+1, leading to an analysis that is more precise than all approaches in 𝑆. As a proof-of-concept, we instantiate Unity-Relay into a tool called Baton and extensively evaluate it on a set of hard-to-analyze Java programs, using general precision metrics and popular clients. Compared with the state of the art, Baton achieves the best precision for all metrics and clients for all evaluated programs. The difference in precision is often dramaticÐup to 71% of alias pairs reported by previously-best algorithms are found to be spurious and eliminated. CCS Concepts: • Theory of computation → Program analysis. Additional Key Words and Phrases: Pointer Analysis, Alias Analysis, Context Sensitivity, Java ACM Reference Format: Tian Tan, Yue Li, Xiaoxing Ma, Chang Xu, and Yannis Smaragdakis. 2021. Making Pointer Analysis More Precise by Unleashing the Power of Selective Context Sensitivity. Proc. ACM Program. Lang. 5, OOPSLA, Article 147 (October 2021), 27 pages. https://doi.org/10.1145/3485524 1 INTRODUCTION Pointer analysis is important for an array of real-world applications such as bug detection [Chandra et al. 2009; Naik et al. 2006], security analysis [Arzt et al. 2014; Livshits and Lam 2005], program verification [Fink et al. 2008; Pradel et al. 2012] and program understanding [Li et al. 2016; Sridharan ∗Corresponding author Authors’ addresses: Tian Tan, State Key Laboratory for Novel Software Technology, Nanjing University, China, tiantan@ nju.edu.cn; Yue Li, State Key Laboratory for Novel Software Technology, Nanjing University, China, yueli@nju.edu.cn; Xiaoxing Ma, State Key Laboratory for Novel Software Technology, Nanjing University, China, xxm@nju.edu.cn; Chang Xu, State Key Laboratory for Novel Software Technology, Nanjing University, China, changxu@nju.edu.cn; Yannis Smaragdakis, Department of Informatics and Telecommunications, University of Athens, Greece, yannis@smaragd.org. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). © 2021 Copyright held by the owner/author(s). 2475-1421/2021/10-ART147 https://doi.org/10.1145/3485524 Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 147. Publication date: October 2021. This work is licensed under a Creative Commons Attribution 4.0 International License
147:2 Tian Tan,Yue Li,Xiaoxing Ma,Chang Xu,and Yannis Smaragdakis et al.2007].A more precise pointer analysis is always favored by its clients as it implies,e.g.,fewer false alarms of bugs and vulnerabilities,more accurate code navigation,or potential for optimization. Context sensitivity has been demonstrated to be a major scheme for improving precision of Java pointer analysis [Fegade and Wimmer 2020;Feng et al.2015;Kanvar and Khedker 2016;Lhotak and Hendren 2008;Lhotak and Hendren 2006;Milanova et al.2002,2005;Tan et al.2017;Thakur and Nandivada 2019,2020;Thiessen and Lhotak 2017;Whaley and Lam 2004;Xu and Rountev 2008].With years of development,traditional context sensitivity performs well for small/medium programs,e.g.,2obj(object-sensitive pointer analysis with context length being two)can analyze most DaCapo benchmarks rapidly and precisely [Kastrinis and Smaragdakis 2013;Smaragdakis et al.2011];however,it is hard for it to scale to large and complex Java programs with good precision [Smaragdakis et al.2014].(An analysis not scaling means that it blows up in complexity and does not terminate even under very high time limits.) To address this problem,a series of research work [Hassanshahi et al.2017;Jeon et al.2019; Jeong et al.2017;Li et al.2018a,b,2020;Lu and Xue 2019;Minseok Jeon and Oh 2020;Oh et al.2014, 2015;Smaragdakis et al.2014;Wei and Ryder 2015]propose to apply context sensitivity selectively in the sense that contexts no longer apply to all methods,while context elements(e.g.,objects, types,call-sites)and context lengths may vary for different selected methods.Other,non-selected, methods are treated context-insensitively,in order to enable good scalability. Conversely,to gain better precision,selective context-sensitivity approaches usually have to sacrifice efficiency by allowing more methods to be analyzed context-sensitively,sometimes with longer context lengths or more precise but heavier context elements.In other words,these analyses need to make careful precision and efficiency trade-offs,where one more step towards precision may put the analysis at the risk of unscalability [Li et al.2020].As a result,seeking further precision improvement becomes hard,as limited efficiency margins are left to play with.Under such constraints,if the approach for more precise pointer analysis is not designed well,it may introduce significant overhead with minor or no precision improvement(e.g.,by selecting for sensitive treatment many precision-useless methods).More commonly,the analysis will choose to scale,but will sacrifice precision(e.g.,by selecting very few methods to treat context sensitively). Problem.For many real-world applications where points-to or alias information is required,such as certain bug detectors,security analyzers,etc.,good precision is much favored,even if the price is to run the analysis for a long time (e.g.,several hours)[Christakis and Bird 2016].Thus,the challenge we seek to address is to achieve,given a long but reasonable time allowance,precise pointer analysis results for large and complex Java programs,for which traditional context-sensitive analyses fail to scale,and selective context-sensitive analyses scale but with limited precision Method.We introduce a general(meta-)framework called Unity-Relay to produce highly precise pointer analyses for hard-to-analyze Java programs,by systematically exploiting and utilizing any selective context sensitivity technique.Unity-Relay is a one-two punch,with precision as its first priority: Unity-Relay takes as inputs a program P and a set of different selective context-sensitivity approaches,S; it first provides a method(called Unity)to unify all approaches in S(playing the role of a meta-heuristic),maximizing precision by applying the most stringent context-sensitivity selection made by any approach in S; if Unity fails to scale,the Relay method is employed.It divides the scalability burden of Unity into different passes(each pass is a run of context-sensitive pointer analysis guided by an approach in S),with the precision achieved by one pass being transmitted to and Proc.ACM Program.Lang.,Vol.5.No.OOPSLA,Article 147.Publication date:October 2021
147:2 Tian Tan, Yue Li, Xiaoxing Ma, Chang Xu, and Yannis Smaragdakis et al. 2007]. A more precise pointer analysis is always favored by its clients as it implies, e.g., fewer false alarms of bugs and vulnerabilities, more accurate code navigation, or potential for optimization. Context sensitivity has been demonstrated to be a major scheme for improving precision of Java pointer analysis [Fegade and Wimmer 2020; Feng et al. 2015; Kanvar and Khedker 2016; Lhoták and Hendren 2008; Lhoták and Hendren 2006; Milanova et al. 2002, 2005; Tan et al. 2017; Thakur and Nandivada 2019, 2020; Thiessen and Lhoták 2017; Whaley and Lam 2004; Xu and Rountev 2008]. With years of development, traditional context sensitivity performs well for small/medium programs, e.g., 2obj (object-sensitive pointer analysis with context length being two) can analyze most DaCapo benchmarks rapidly and precisely [Kastrinis and Smaragdakis 2013; Smaragdakis et al. 2011]; however, it is hard for it to scale to large and complex Java programs with good precision [Smaragdakis et al. 2014]. (An analysis not scaling means that it blows up in complexity and does not terminate even under very high time limits.) To address this problem, a series of research work [Hassanshahi et al. 2017; Jeon et al. 2019; Jeong et al. 2017; Li et al. 2018a,b, 2020; Lu and Xue 2019; Minseok Jeon and Oh 2020; Oh et al. 2014, 2015; Smaragdakis et al. 2014; Wei and Ryder 2015] propose to apply context sensitivity selectively in the sense that contexts no longer apply to all methods, while context elements (e.g., objects, types, call-sites) and context lengths may vary for different selected methods. Other, non-selected, methods are treated context-insensitively, in order to enable good scalability. Conversely, to gain better precision, selective context-sensitivity approaches usually have to sacrifice efficiency by allowing more methods to be analyzed context-sensitively, sometimes with longer context lengths or more precise but heavier context elements. In other words, these analyses need to make careful precision and efficiency trade-offs, where one more step towards precision may put the analysis at the risk of unscalability [Li et al. 2020]. As a result, seeking further precision improvement becomes hard, as limited efficiency margins are left to play with. Under such constraints, if the approach for more precise pointer analysis is not designed well, it may introduce significant overhead with minor or no precision improvement (e.g., by selecting for sensitive treatment many precision-useless methods). More commonly, the analysis will choose to scale, but will sacrifice precision (e.g., by selecting very few methods to treat context sensitively). Problem. For many real-world applications where points-to or alias information is required, such as certain bug detectors, security analyzers, etc., good precision is much favored, even if the price is to run the analysis for a long time (e.g., several hours) [Christakis and Bird 2016]. Thus, the challenge we seek to address is to achieve, given a long but reasonable time allowance, precise pointer analysis results for large and complex Java programs, for which traditional context-sensitive analyses fail to scale, and selective context-sensitive analyses scale but with limited precision. Method. We introduce a general (meta-)framework called Unity-Relay to produce highly precise pointer analyses for hard-to-analyze Java programs, by systematically exploiting and utilizing any selective context sensitivity technique. Unity-Relay is a one-two punch, with precision as its first priority: • Unity-Relay takes as inputs a program P and a set of different selective context-sensitivity approaches, 𝑆; • it first provides a method (called Unity) to unify all approaches in 𝑆 (playing the role of a meta-heuristic), maximizing precision by applying the most stringent context-sensitivity selection made by any approach in 𝑆; • if Unity fails to scale, the Relay method is employed. It divides the scalability burden of Unity into different passes (each pass is a run of context-sensitive pointer analysis guided by an approach in 𝑆), with the precision achieved by one pass being transmitted to and Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 147. Publication date: October 2021
Making Pointer Analysis More Precise by Unleashing the Power of Selective Context Sensitivity 147:3 accumulated in the next pass:in each pass,the points-to results retrieved from the previous pass are used to on-the-fly filter intermediate points-to results,keeping inferred points-to sets small and,thus,enhancing scalability. Given any selective context-sensitive pointer analysis A(guided by an approach in S),if A is sound and scalable for P,no matter how precise A is,Unity-Relay can be expected to produce a sound and scalable analysis with a precision that is guaranteed to be at least as good as A's,in the worst case.(Experimental results show that the precision is always better.)This property also implies that even when new(possibly special-purpose)selective context-sensitivity insights are developed in the future,Unity-Relay will still be able to leverage them to produce a more precise analysis in practice. Results.As a proof-of-concept,we instantiate the Unity-Relay framework in a tool called BATON. We extensively evaluate BAToN on a set of hard-to-analyze Java programs(including the toughest programs evaluated in the past literature of selective context-sensitive pointer analysis for Java), using three general precision metrics and four popular clients(also the most complete set of precision metrics/clients used in recent literature).Compared to the state of the art,experimental results show that BAToN is able to improve precision significantly,and achieves the best precision for all metrics and clients for all evaluated programs.To the best of our knowledge,this is the first time that this level of analysis precision has been attained for these hard-to-analyze programs. Moreover,because of Unity-Relay's nature as a general meta-framework,we expect it to see more future instantiations and to unleash the power of more selective context-sensitivity approaches,to produce more precise pointer analyses. In summary,this work makes the following contributions: We present Unity-Relay,a simple and practical framework to produce precise pointer analyses for hard-to-analyze Java programs,by unleashing the power of selective context sensitivity(Section 3). We formulate the Unity-Relay framework,prove its soundness and precision guarantees, and discuss its scalability (Section 4). We present BAroN,a tool instantiated from the Unity-Relay framework,which will be released as an open-source tool(Section 5). We conduct extensive experiments by comparing BAroN with the state of the art,to demon- strate its effectiveness in the real world(the experimental results can be obtained using the accompanying artifact [Tan et al.2021])(Section 6).BATON substantially improves on the precision of previous algorithms in the literature-e.g.,eliminating over 33%of alias pairs (soundly determined to be spurious)on average,and up to 71%,compared to the best past contender evaluated. 2 BACKGROUND We assume that the reader is familiar with the high-level concepts of pointer analysis,as introduced in several surveys [Smaragdakis and Balatsouras 2015;Sridharan et al.2013].This section describes some necessary background knowledge about context sensitivity for whole-program pointer analysis for Java Traditional Context Sensitivity.A method is analyzed context-sensitively means that the static abstraction of different dynamic instantiations of variables and heap objects within the method will be treated separately during analysis under different abstract entities called contexts.Accordingly, spurious object flows will be reduced,thus making the analysis more precise.Given a method m,each of its contexts typically consists of a list of consecutive context elements in the form of Proc.ACM Program.Lang.,Vol.5,No.OOPSLA,Article 147.Publication date:October 2021
Making Pointer Analysis More Precise by Unleashing the Power of Selective Context Sensitivity 147:3 accumulated in the next pass: in each pass, the points-to results retrieved from the previous pass are used to on-the-fly filter intermediate points-to results, keeping inferred points-to sets small and, thus, enhancing scalability. Given any selective context-sensitive pointer analysis 𝐴 (guided by an approach in 𝑆), if 𝐴 is sound and scalable for P, no matter how precise 𝐴 is, Unity-Relay can be expected to produce a sound and scalable analysis with a precision that is guaranteed to be at least as good as 𝐴’s, in the worst case. (Experimental results show that the precision is always better.) This property also implies that even when new (possibly special-purpose) selective context-sensitivity insights are developed in the future, Unity-Relay will still be able to leverage them to produce a more precise analysis in practice. Results. As a proof-of-concept, we instantiate the Unity-Relay framework in a tool called Baton. We extensively evaluate Baton on a set of hard-to-analyze Java programs (including the toughest programs evaluated in the past literature of selective context-sensitive pointer analysis for Java), using three general precision metrics and four popular clients (also the most complete set of precision metrics/clients used in recent literature). Compared to the state of the art, experimental results show that Baton is able to improve precision significantly, and achieves the best precision for all metrics and clients for all evaluated programs. To the best of our knowledge, this is the first time that this level of analysis precision has been attained for these hard-to-analyze programs. Moreover, because of Unity-Relay’s nature as a general meta-framework, we expect it to see more future instantiations and to unleash the power of more selective context-sensitivity approaches, to produce more precise pointer analyses. In summary, this work makes the following contributions: • We present Unity-Relay, a simple and practical framework to produce precise pointer analyses for hard-to-analyze Java programs, by unleashing the power of selective context sensitivity (Section 3). • We formulate the Unity-Relay framework, prove its soundness and precision guarantees, and discuss its scalability (Section 4). • We present Baton, a tool instantiated from the Unity-Relay framework, which will be released as an open-source tool (Section 5). • We conduct extensive experiments by comparing Baton with the state of the art, to demonstrate its effectiveness in the real world (the experimental results can be obtained using the accompanying artifact [Tan et al. 2021]) (Section 6). Baton substantially improves on the precision of previous algorithms in the literatureÐe.g., eliminating over 33% of alias pairs (soundly determined to be spurious) on average, and up to 71%, compared to the best past contender evaluated. 2 BACKGROUND We assume that the reader is familiar with the high-level concepts of pointer analysis, as introduced in several surveys [Smaragdakis and Balatsouras 2015; Sridharan et al. 2013]. This section describes some necessary background knowledge about context sensitivity for whole-program pointer analysis for Java. Traditional Context Sensitivity. A method is analyzed context-sensitively means that the static abstraction of different dynamic instantiations of variables and heap objects within the method will be treated separately during analysis under different abstract entities called contexts. Accordingly, spurious object flows will be reduced, thus making the analysis more precise. Given a method m, each of its contexts typically consists of a list of consecutive context elements in the form of Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 147. Publication date: October 2021
147:4 Tian Tan,Yue Li,Xiaoxing Ma,Chang Xu,and Yannis Smaragdakis [c1,c2,...to model different run-time conditions.For example,for call-site sensitivity,c1 is the call site of m,and c2 is the call site of the method containing cl,etc.However,for efficiency, only the most recent I context elements are kept(I is also called the context length)and I is usually limited to a small number for whole-program analysis in practice [Kastrinis and Smaragdakis 2013;Smaragdakis et al.2011].There are three mainstream context-sensitivity variants for Java pointer analysis:call-site sensitivity,object sensitivity,and type sensitivity.Henceforth,we use 3obj to denote object sensitivity with context length being 3.Longer context indicates more precision:more spurious data flows can be possibly eliminated.Thus,3obj is more precise than 2obj.For different kinds of context elements with the same length,past work has demonstrated that object sensitivity typically outperforms call-site sensitivity,in terms of both precision and efficiency [Milanova et al.2005;Smaragdakis et al.2011],and is more precise but also more expensive than type sensitivity [Smaragdakis et al.2011;Tan et al.2017].Traditionally,context-sensitive pointer analysis uniformly applies context sensitivity to every method in a program to maintain high precision.However,this approach often does not work for large and complicated programs as such treatment is too costly to scale [Li et al.2018b;Smaragdakis et al.2014] Selective Context Sensitivity.Selective context sensitivity proposes to apply context sensitivity selectively,only for precision-useful/non-scalability-threat methods.The analysis precision of such methods will help improve the overall program analysis precision,while not incurring so much cost as to make the analysis hard to scale.The remaining methods are analyzed context-insensitively so that the space and time cost originally incurred for these methods(in traditional context sensitivity)is saved,leaving room to produce precise and scalable analyses even for hard-to-analyze programs.However,it is challenging to accurately determine which methods are precision-useful but not scalability-threat in general.To deal with this problem,in the past years,various selective context-sensitivity approaches have been presented,based on different insights and policies.For example,the target methods could be selected according to expert experience [WALA 2018]. program patterns [Li et al.2018a,2020],abstracted memory capacity [Li et al.2018b],parameterized heuristics [Hassanshahi et al.2017;Smaragdakis et al.2014],or machine learning approaches [Jeon et al.2018;Jeong et al.2017].In addition,in selective context sensitivity,some approaches apply the same context sensitivity to the selected methods [Hassanshahi et al.2017:Li et al.2018a.2020: Smaragdakis et al.2014],while in other approaches context elements and lengths may vary for different subsets of the selected methods [Jeon et al.2019;Jeong et al.2017;Li et al.2018b].We refer the readers to Section 7 for more details. 3 THE UNITY-RELAY FRAMEWORK,INFORMALLY We first give an overview of the Unity-Relay framework(Section 3.1),and then explain the key ideas of Unity (Section 3.2)and Relay(Section 3.3),respectively. 3.1 Overview Figure 1 shows a pictorial overview of the Unity-Relay framework:given a program P and a set of selective context-sensitive(C.S.)approach/strategy components,S,the framework will yield highly precise pointer analysis results for P. Unity-Relay is a one-two punch.It first calls its Unity component.Taking as input the results obtained by running each approach in S for P,the context-sensitivity selector of Unity(C.S.Selector in Figure 1)unifies them and makes new configuration about which portion of P should be analyzed by what context-sensitivity variants (ie.,what are the context elements and lengths)to maximize the precision,according to the unity principle,as introduced in Section 3.2.Intuitively,the Unity Proc.ACM Program.Lang.,Vol.5.No.OOPSLA,Article 147.Publication date:October 2021
147:4 Tian Tan, Yue Li, Xiaoxing Ma, Chang Xu, and Yannis Smaragdakis [c1,c2,...] to model different run-time conditions. For example, for call-site sensitivity, c1 is the call site of m, and c2 is the call site of the method containing c1, etc. However, for efficiency, only the most recent 𝑙 context elements are kept (𝑙 is also called the context length) and 𝑙 is usually limited to a small number for whole-program analysis in practice [Kastrinis and Smaragdakis 2013; Smaragdakis et al. 2011]. There are three mainstream context-sensitivity variants for Java pointer analysis: call-site sensitivity, object sensitivity, and type sensitivity. Henceforth, we use 3obj to denote object sensitivity with context length being 3. Longer context indicates more precision: more spurious data flows can be possibly eliminated. Thus, 3obj is more precise than 2obj. For different kinds of context elements with the same length, past work has demonstrated that object sensitivity typically outperforms call-site sensitivity, in terms of both precision and efficiency [Milanova et al. 2005; Smaragdakis et al. 2011], and is more precise but also more expensive than type sensitivity [Smaragdakis et al. 2011; Tan et al. 2017]. Traditionally, context-sensitive pointer analysis uniformly applies context sensitivity to every method in a program to maintain high precision. However, this approach often does not work for large and complicated programs as such treatment is too costly to scale [Li et al. 2018b; Smaragdakis et al. 2014]. Selective Context Sensitivity. Selective context sensitivity proposes to apply context sensitivity selectively, only for precision-useful/non-scalability-threat methods. The analysis precision of such methods will help improve the overall program analysis precision, while not incurring so much cost as to make the analysis hard to scale. The remaining methods are analyzed context-insensitively so that the space and time cost originally incurred for these methods (in traditional context sensitivity) is saved, leaving room to produce precise and scalable analyses even for hard-to-analyze programs. However, it is challenging to accurately determine which methods are precision-useful but not scalability-threat in general. To deal with this problem, in the past years, various selective context-sensitivity approaches have been presented, based on different insights and policies. For example, the target methods could be selected according to expert experience [WALA 2018], program patterns [Li et al. 2018a, 2020], abstracted memory capacity [Li et al. 2018b], parameterized heuristics [Hassanshahi et al. 2017; Smaragdakis et al. 2014], or machine learning approaches [Jeon et al. 2018; Jeong et al. 2017]. In addition, in selective context sensitivity, some approaches apply the same context sensitivity to the selected methods [Hassanshahi et al. 2017; Li et al. 2018a, 2020; Smaragdakis et al. 2014], while in other approaches context elements and lengths may vary for different subsets of the selected methods [Jeon et al. 2019; Jeong et al. 2017; Li et al. 2018b]. We refer the readers to Section 7 for more details. 3 THE UNITY-RELAY FRAMEWORK, INFORMALLY We first give an overview of the Unity-Relay framework (Section 3.1), and then explain the key ideas of Unity (Section 3.2) and Relay (Section 3.3), respectively. 3.1 Overview Figure 1 shows a pictorial overview of the Unity-Relay framework: given a program P and a set of selective context-sensitive (C.S.) approach/strategy components, 𝑆, the framework will yield highly precise pointer analysis results for P. Unity-Relay is a one-two punch. It first calls its Unity component. Taking as input the results obtained by running each approach in 𝑆 for P, the context-sensitivity selector of Unity (C.S. Selector in Figure 1) unifies them and makes new configuration about which portion of P should be analyzed by what context-sensitivity variants (i.e., what are the context elements and lengths) to maximize the precision, according to the unity principle, as introduced in Section 3.2. Intuitively, the Unity Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 147. Publication date: October 2021
Making Pointer Analysis More Precise by Unleashing the Power of Selective Context Sensitivity 147:5 Points-To info in previous Unity Relay pass PTpre Program P C.S.Selector of C.S.Selector of Unity (Figure 2) Trigger Relay (Figure 3) when Selective C.S. Selected Selected C.S. C.S. Approaches Unity Selective C.S. Selective C.S fails Pointer Analysis Pointer Analysis filtered by PTpre Precise Pointer Analysis Results for P Fig.1.Overview of the Unity-Relay Framework. principle maintains for each program element the most precise context that any component context- sensitivity approach would use.Then the new selected context-sensitivity information(Selected C.S. in Figure 1)will guide the selective context-sensitive pointer analysis to analyze P,and its results will be the output of Unity-Relay if the analysis is able to finish running within the time limit; otherwise(i.e,if Unity is too expensive to scale),Unity-Relay will trigger its Relay component to analyze P. Generally,given n approaches in S,Relay will run the selective context-sensitive pointer analysis n times(ie.,we have n passes in Relay).In each pass,as shown in Figure 1,the selective context- sensitive pointer analysis is guided by the context-sensitivity information(Selected C.S.)output from the C.S.selector of Relay according to the relay principle as illustrated in Section 3.3.Between successive passes,the precision of the earlier pass will be transmitted to and accumulated in the next pass.This is achieved by soundly filtering the pointer analysis results in the current pass by using the points-to information(points-to sets of variables)from the previous pass(PTpre in Figure 1).As a result,in Relay,the scalability burden is shared in each analysis run(i.e.,each pass): a component strategy is not burdened by others,but instead each strategy that completes helps both the precision and the scalability of the rest.Although overall precision is not guaranteed to reach that of Unity,precision improvements are reaped in a stable and accumulative way. 3.2 Unity Recall that every selective context-sensitive analysis represents a precision and efficiency trade-off, where precision bumps up against the limits of analysis scalability.In other words,for past selective context-sensitive analyses,the degree of precision(typically:the proportion of the program to be analyzed context-sensitively)is chosen to be approximately at a point where further improving it would render the analysis non-scalable.According to previous work,treating context-sensitively even a small set of methods may significantly hinder scalability,for the "wrong"choice of meth- ods [Li et al.2020].Now it seems that we are trapped:if a small step towards precision may hit the scalability wall(in the precision and efficiency balance made by each selective context-sensitivity approach),how can we reap a further noticeable precision improvement in a general,policy-agnostic meta-framework?To escape from the trap,in Unity,we propose to take a big step forward towards higher precision,by allowing many more methods to be analyzed under context sensitivity,with Proc.ACM Program.Lang.,Vol.5,No.OOPSLA,Article 147.Publication date:October 2021
Making Pointer Analysis More Precise by Unleashing the Power of Selective Context Sensitivity 147:5 C.S. Selector of Unity (Figure 2) Selective C.S. Pointer Analysis Selected C.S. C.S. Selector of Relay (Figure 3) Selective C.S. Pointer Analysis filtered by pai Selected C.S. Points-To info in previous Unity Relay Trigger when Unity fails Precise Pointer Analysis Results for P Program P Selective C.S. Approaches pass PTpre PTpre Fig. 1. Overview of the Unity-Relay Framework. principle maintains for each program element the most precise context that any component contextsensitivity approach would use. Then the new selected context-sensitivity information (Selected C.S. in Figure 1) will guide the selective context-sensitive pointer analysis to analyze P, and its results will be the output of Unity-Relay if the analysis is able to finish running within the time limit; otherwise (i.e., if Unity is too expensive to scale), Unity-Relay will trigger its Relay component to analyze P. Generally, given 𝑛 approaches in 𝑆, Relay will run the selective context-sensitive pointer analysis 𝑛 times (i.e., we have 𝑛 passes in Relay). In each pass, as shown in Figure 1, the selective contextsensitive pointer analysis is guided by the context-sensitivity information (Selected C.S.) output from the C.S. selector of Relay according to the relay principle as illustrated in Section 3.3. Between successive passes, the precision of the earlier pass will be transmitted to and accumulated in the next pass. This is achieved by soundly filtering the pointer analysis results in the current pass by using the points-to information (points-to sets of variables) from the previous pass (PT𝑝𝑟𝑒 in Figure 1). As a result, in Relay, the scalability burden is shared in each analysis run (i.e., each pass): a component strategy is not burdened by others, but instead each strategy that completes helps both the precision and the scalability of the rest. Although overall precision is not guaranteed to reach that of Unity, precision improvements are reaped in a stable and accumulative way. 3.2 Unity Recall that every selective context-sensitive analysis represents a precision and efficiency trade-off, where precision bumps up against the limits of analysis scalability. In other words, for past selective context-sensitive analyses, the degree of precision (typically: the proportion of the program to be analyzed context-sensitively) is chosen to be approximately at a point where further improving it would render the analysis non-scalable. According to previous work, treating context-sensitively even a small set of methods may significantly hinder scalability, for the łwrongž choice of methods [Li et al. 2020]. Now it seems that we are trapped: if a small step towards precision may hit the scalability wall (in the precision and efficiency balance made by each selective context-sensitivity approach), how can we reap a further noticeable precision improvement in a general, policy-agnostic meta-framework? To escape from the trap, in Unity, we propose to take a big step forward towards higher precision, by allowing many more methods to be analyzed under context sensitivity, with Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 147. Publication date: October 2021