Yue Li, Tian Tan, and Jingling Xue. Understanding and Analyzing Java Reflection. ACM Trans. Softw. Eng. Methodol., Vol. 28, No. 2, Article 7. Publication date: February 2019
Application: Eclipse (v4.2.2)
Class: org.eclipse.osgi.framework.internal.core.FrameworkCommandInterpreter
123 public Object execute(String cmd) { ...
155     Object[] parameters = new Object[] {this}; ...
167     for (int i = 0; i < size; i++) {
174         method = target.getClass().getMethod("_" + cmd, parameterTypes);
175         retval = method.invoke(target, parameters); ... }
228 }

Fig. 10. Self-inferencing property for a reflective method invocation, deduced from the number and dynamic types of the components of the one-dimensional array argument, parameters, at an invoke() call site.

the descriptor of the target method in line 175 must have only one argument, and its actual type must be FrameworkCommandInterpreter or one of its supertypes.

Application: Eclipse (v4.2.2)
Class: org.eclipse.osgi.framework.internal.core.Framework
1652 public static Field getField(Class clazz, ...) {
1653     Field[] fields = clazz.getDeclaredFields(); ...
1654     for (int i = 0; i < fields.length; i++) { ...
1658         return fields[i]; } ...
1662 }
1682 private static void forceContentHandlerFactory(...) {
1683     Field factoryField = getField(URLConnection.class, ...);
1687     java.net.ContentHandlerFactory factory = (java.net.ContentHandlerFactory) factoryField.get(null); ...
1709 }

Fig. 11. Self-inferencing property for a reflective field access, deduced from the cast applied to the value returned by, and the null argument used at, a get() call site.

Example 2.2 (Reflective Field Access (Figure 11)). In line 1683, factoryField is obtained as a Field object from an array of Field objects created in line 1653 for all the fields in URLConnection. In line 1687, the object returned from get() is cast to java.net.ContentHandlerFactory. From the cast and the null argument, we know that the call to get() can access only those static fields of URLConnection whose type is java.net.ContentHandlerFactory, one of its supertypes, or one of its subtypes.
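The deduction in Example 2.2 can be illustrated with a minimal sketch that uses run-time reflection as a stand-in for what a static analysis would compute. The class FieldGetInference and the helper candidates are hypothetical names introduced only for this illustration; they are not part of the approach described in the article. The helper keeps the static fields of a class whose declared type is the cast type, one of its supertypes, or one of its subtypes.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.net.ContentHandlerFactory;
import java.net.URLConnection;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; names are assumptions, not the authors' implementation.
class FieldGetInference {

    // Candidate targets of "(castType) f.get(null)" where f is declared in owner.
    static List<Field> candidates(Class<?> owner, Class<?> castType) {
        List<Field> result = new ArrayList<>();
        for (Field f : owner.getDeclaredFields()) {
            // The null receiver implies that the accessed field is static.
            boolean isStatic = Modifier.isStatic(f.getModifiers());
            // The cast implies that the field type is the cast type,
            // one of its supertypes, or one of its subtypes.
            Class<?> type = f.getType();
            boolean compatible = type.isAssignableFrom(castType)
                    || castType.isAssignableFrom(type);
            if (isStatic && compatible) {
                result.add(f);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Mirrors the call site in line 1687 of Figure 11.
        for (Field f : candidates(URLConnection.class, ContentHandlerFactory.class)) {
            System.out.println(f);
        }
    }
}
```

The set of fields printed depends on the JDK in use, so no particular output is claimed here. The point is only that the filter discards every instance field and every static field of an unrelated type.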
Without the self-inferencing property at the get() call site, an analysis must conservatively assume that all the fields in URLConnection may be accessed.

Example 2.3 (Reflective Field Modification (Figure 12)). Like the case in Figure 11, the field object in line 290 is also read from an array of field objects created in line 302. This code pattern appears one more time in line 432 in the same class, i.e., org.eclipse.osgi.util.NLS (not shown here). According to the two arguments, null and value, provided at set() (line 290), we can deduce that the target field (to be modified in line 290) is static (from null) and its declared type must be java.lang.String or one of its supertypes (from the type of value).

Application: Eclipse (v4.2.2)
Class: org.eclipse.osgi.util.NLS
300 static void load(final String bundleName, Class<?> clazz) {
302     final Field[] fieldArray = clazz.getDeclaredFields();
336     computeMissingMessages(..., fieldArray, ...); ...
339 }
267 static void computeMissingMessages(..., Field[] fieldArray, ...) {
272     for (int i = 0; i < numFields; i++) {
273         Field field = fieldArray[i];
284         String value = "NLS missing message: " + ...;
290         field.set(null, value); } ...
295 }

Fig. 12. Self-inferencing property for a reflective field modification, deduced from the null argument and the dynamic type of the value argument at a set() call site.

Definition 2.4 (Self-Inferencing Property). For each reflective-action call site, its self-inferencing property comprises the information that can be used to infer its reflective targets, which consists of (1) the information about its arguments (including the receiver object), namely their number and their types, (2) the possible downcasts on its returned values, and (3) the possible string values statically resolved at its corresponding class-retrieving and member-retrieving call sites.

We argue that the self-inferencing property is inherent in most Java reflection code due to the characteristics of object-oriented programming and the Java reflection API. For example, the declared type of the object reflectively returned by get() and invoke() or created by newInstance() is always java.lang.Object. Therefore, the returned object must first be cast to a specific type before it is used as a regular object, except when its dynamic type is java.lang.Object or it will be used only as a receiver for the methods inherited from java.lang.Object; otherwise, the compilation would fail.
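The compile-time constraint just described can be seen in a small sketch. The class CastConstraintDemo, its field greeting, and its two methods are hypothetical and exist only for this illustration. Each reflective result has the static type java.lang.Object, so each use beyond the methods of Object needs a downcast, and that downcast is exactly the hint an analysis can read.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Hypothetical illustration; all names here are assumptions.
class CastConstraintDemo {

    public static String greeting = "hello";

    static int lengthViaReflection() throws ReflectiveOperationException {
        Field f = CastConstraintDemo.class.getField("greeting");
        Object raw = f.get(null);        // get() is declared to return Object
        // raw.length();                 // would not compile: Object has no length()
        String s = (String) raw;         // the downcast exposes the field's type

        Method m = String.class.getMethod("length");
        Object boxed = m.invoke(s);      // invoke() is declared to return Object
        return (Integer) boxed;          // downcast again, then unbox
    }

    static String viaNewInstance() throws ReflectiveOperationException {
        Class<?> c = Class.forName("java.lang.StringBuilder");
        Object o = c.getDeclaredConstructor().newInstance();
        String viaObject = o.toString(); // inherited from Object: no cast needed
        StringBuilder sb = (StringBuilder) o; // cast needed to call append()
        return sb.append(viaObject).append("ok").toString();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(lengthViaReflection());
        System.out.println(viaNewInstance());
    }
}
```

Removing either cast makes the corresponding method fail to compile, which is why such casts are present in most reflection code and are available to an analysis at no extra cost.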
As another example, the descriptor of a target method reflectively called at invoke() must be consistent with what is specified by its second argument (e.g., parameters in line 175 of Figure 10); otherwise, exceptions would be thrown at runtime. These constraints should be exploited to enable resolving reflection in a disciplined way.

The self-inferencing property not only helps resolve reflective calls more effectively when the values of string arguments are partially known (e.g., when either a class name or a member name is known), but also provides an opportunity to resolve some reflective calls even if the string values are fully unknown. For example, in some Android apps, class and method names for reflective calls are encrypted for benign or malicious obfuscation, which “makes it impossible for any static analysis to recover the reflective call” [48]. However, this appears to be too pessimistic in our setting because, in addition to the string values, other self-inferencing hints may be available to facilitate reflection resolution. For example, given (A)invoke(o, {...}), the class type of the target method can be inferred from the dynamic type of o (by pointer analysis), and the declared return type and descriptor of the target method can also be deduced from A and {...}, respectively, as discussed above.
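The inference for (A)invoke(o, {...}) can be sketched as a filter over the methods of the receiver's class. This is an illustration under stated assumptions, not the analysis described in the article: the class InvokeInference and the helper candidates are hypothetical, the dynamic type of o is passed in directly rather than obtained by pointer analysis, only public methods are considered, and boxing of primitive types is ignored for brevity.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; names and simplifications are assumptions.
class InvokeInference {

    // receiverType: dynamic type of o; castType: A; argTypes: dynamic types in {...}.
    static List<Method> candidates(Class<?> receiverType, Class<?> castType,
                                   Class<?>[] argTypes) {
        List<Method> result = new ArrayList<>();
        for (Method m : receiverType.getMethods()) {
            Class<?>[] params = m.getParameterTypes();
            // The number of array components fixes the number of parameters.
            if (params.length != argTypes.length) {
                continue;
            }
            // Each declared parameter type must accept the matching argument type.
            boolean argsMatch = true;
            for (int i = 0; i < params.length && argsMatch; i++) {
                argsMatch = params[i].isAssignableFrom(argTypes[i]);
            }
            // The declared return type must be related to the cast type A.
            Class<?> ret = m.getReturnType();
            boolean returnMatches = ret.isAssignableFrom(castType)
                    || castType.isAssignableFrom(ret);
            if (argsMatch && returnMatches) {
                result.add(m);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Models (String) invoke(o, {aString}) where o is a String at run time.
        Class<?>[] argTypes = {String.class};
        for (Method m : candidates(String.class, String.class, argTypes)) {
            System.out.println(m.getName());
        }
    }
}
```

No method name is needed by the filter, which is the situation described above for encrypted names. Methods such as length() (wrong arity) or startsWith(String) (unrelated return type) are excluded, while concat(String) remains a candidate.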