Reaching Definitions

Home | DMP | Project | Journal | Contact
Relevant Java Classes | Delete Call Site | Template Call Graph | Research | Final Report
Previous (Flow-Sensitive Analysis of Variable Accesses)
Next (Incremental Analysis)

Type inference systems employing reaching definitions do not base their reasoning about control flow solely on the syntax of the target program but on the control flow graph built during inference. Thus, more precise approximations of control flow from assignments to variable reads are made available to hone the precision of the type inference itself. One can take advantage of control flow information by first computing reaching definitions and then defining types for variable accesses based on these reaching definitions.

WHAT ARE REACHING DEFINITIONS?
A variable definition is either an initialization or an assignment to the variable. Examples (where x is the variable being defined) include the following independent statements:

int x = 4; x = "a string"; x = factorial(5); x = b.abs();
Reaching definitions pertain to a particular variable read (or use) and are the definitions that may determine the value of the variable at that point in the program. In other words, the set of definitions reaching a point P in the program summarizes the relevant properties of control flow leading to P. While exact computation of reaching definitions has not yet been achieved, more conservative approximation algorithms have been successfully implemented.

USING REACHING DEFINITIONS IN TYPE INFERENCE
The reaching definitions approach to type inference takes the view that the type of a variable access is based on the set of definitions reaching it. More formally, Agesen defines the type of a variable read as "the union of the types of the definitions (of the relevant variable) reaching the read". When the set of reaching definitions for a variable access is a true subset of all definitions of the accessed variable, reaching definitions can be used to infer a smaller and more precise type than those determined by other inference approaches.

COMPUTATION OF REACHING DEFINITIONS
Type inference creates a global control flow graph in which templates represent the invocation of methods with particular argument combinations. With this structure in place, existing algorithms for the interprocedural computation of reaching definitions may be applied. The algorithms, in general, propagate definitions forward along the edges of the flow graph:
Let y be a variable in the target program, and assume that a definition has been assigned to y. If another definition that assigns to y is propagated to a program point P, the second definition replaces or "kills" the former and is the only definition propagated out of P (assuming, of course, that the second definition is a strong update rather than a weak update; in that case, both definitions are propagated from P). At a point where control flow merges, the union of the incoming definitions is propagated out from that point.
Propagation should be confined to valid paths - the subset of all paths in the control flow graph for which call/return edges are paired correctly. (See Pande, H.M. and B.G. Ryder. Static Type Determination and Aliasing for C++. Technical Report LCSR-TR-236, Rutgers University, December 1994; Reps, T., S. Horwitz, and M. Sagiv. Precise Interprocedural Dataflow Analysis via Graph Reachability. In Conference Record of POPL '95: 22nd ACM Symposium on Principles of Programming Languages, p. 49-61, San Francisco, California, January 1995; AND/OR Sharir, M. and A. Pnueli. Two Approaches to Interprocedural Dataflow Analysis. In Muchnick, S.S. and N.D. Jones (Eds.). Program Flow Analysis: Theory and Applications, Prentice-Hall, Englewood Cliffs, New Jersey, 1981, p. 189-236.) Other issues in reaching definitions analysis for object-oriented programming are mentioned in the following subsections.

POLYVARIANCE
According to Agesen, precise type inference and reaching definitions both require a polyvariant analysis - one that can create multiple templates for a single method. Fortunately, the templates created by a polyvariant type inference system provide the basis for polyvariant computation of reaching definitions; definitions are propagated through a structure derived from the set of templates rather than one derived from the syntax of the target program.

INTEGRATION OF TYPE INFERENCE AND REACHING DEFINITIONS
Type inference and the computation of reaching definitions are mutually dependent; each process relies on the other for improvement. One standard approach to this dilemma is to integrate the two. To maintain correctness, the types of variable accesses must grow monotonically during the dual analysis. (In other words, types cannot be removed from a type variable.) Because type inference ensures that the type sets of definitions grow monotonically, overall monotonicity is a given, as long as the set of reaching definitions for each read increases monotonically as well. Initially, the control flow graph has no edges, and thus no use has reaching definitions. As type inference proceeds and edges are added to the graph, definitions are permitted to flow along more and more paths. Implementations of reaching definitions must maintain the invariant that the addition of an edge to the control flow graph does not hinder any definition from reaching an access it could reach beforehand.

REACHING DEFINITIONS IN FLEX
FLEX provides two classes named ReachingDefs; the template call graph implementation uses the one in the package harpoon.Analysis. ReachingDefs is an abstract class implemented by several classes, but our focus is on ReachingDefsImpl. The constructors create a ReachingDefsImpl object for a provided HCode argument and perform reaching definitions analysis. (An HCode object corresponds to a list of instructions, with each statement represented by an HCodeElement). The method with which we are most concerned during type inference is reachingDefs:
public Set reachingDefs(HCodeElement hce, Temp t)
Invoking this method on a ReachingDefsImpl object returns the set of HCodeElements that provide definitions for the Temp t that reach HCodeElement hce. If hce is unreachable, the method returns the empty set. (Temps represent temporary variables. Each Temp is guaranteed to have a unique name, regardless of the different contexts in which a programmer may have reused a variable name.)
Relationships to keep in mind:

Definition - DEF - initialization or assignment
Use - USE - variable read/access
Reaching definitions - ReachingDefs - def-use chains

Home | DMP | Project | Journal | Contact
Relevant Java Classes | Delete Call Site | Template Call Graph | Research | Final Report
Previous (Flow-Sensitive Analysis of Variable Accesses)
Next (Incremental Analysis)