Despite significant progress in the theory and practice of program analysis, the analysis of heap data has not reached the same level of maturity as the analysis of static and stack data. The spatial and temporal structure of stack and static data is well understood, while that of heap data seems arbitrary and is unbounded. We devise bounded representations which summarize properties of the heap data. This summarization is based on the structure of the program which manipulates the heap. The resulting summary representations are certain kinds of graphs called access graphs. The boundedness of these representations and the monotonicity of the operations that manipulate them make it possible to compute them through data flow analysis.
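To make the idea of a bounded summary concrete, here is a toy sketch. It is not the construction used in the talk: it simply assumes that every step of an access path is identified by a field name together with the program point where the dereference occurs, so that a loop repeating the same dereference reuses one node instead of lengthening the path. The function name `summarize` and the statement number 3 are hypothetical.

```python
# Toy illustration only (an assumption, not the author's algorithm):
# a node is a (field, program_point) pair, so the number of distinct nodes
# is bounded by the size of the program, not by the length of any path.

def summarize(root, paths):
    """Merge access paths rooted at `root` into a set of graph edges."""
    edges = set()
    for path in paths:
        prev = root
        for step in path:            # step = (field, program_point)
            edges.add((prev, step))  # repeated steps add no new edge
            prev = step
    return edges

# Paths x.next, x.next.next, x.next.next.next, all created by a
# hypothetical statement 3 (`x = x.next`) executed inside a loop.
paths = [[("next", 3)] * n for n in range(1, 4)]
graph = summarize("x", paths)
```

However many times the loop runs, the summary in this sketch has only two edges: one from `x` to the `next` node and one self-loop on that node. A finite representation of this kind is what allows a fixed-point data flow computation to terminate.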
An important application which benefits from heap reference analysis is garbage collection, where liveness is currently approximated conservatively by reachability from program variables. As a consequence, current garbage collectors leave a lot of garbage uncollected, a fact which has been confirmed by several empirical studies. We propose the first ever end-to-end static analysis to distinguish live objects from reachable objects. We use this information to make dead objects unreachable by modifying the program. This application is interesting because it requires discovering data flow information representing complex semantics.
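The gap between reachable and live can be illustrated with a small hand-written sketch. It assumes that an analysis has already determined that `head.next` is never used again. The `Node` class, the variable names, and the inserted assignment are hypothetical, and Python is used only for illustration.

```python
import gc
import weakref

class Node:
    def __init__(self, payload, next=None):
        self.payload = payload
        self.next = next

head = Node("head", Node("tail"))

# A weak reference lets us observe the tail object without keeping it alive.
tail_probe = weakref.ref(head.next)

# Assumption: from here on the program reads only head.payload.
# The tail is still reachable through head.next, so a reachability-based
# collector must retain it even though it is dead.
still_reachable = tail_probe() is not None

# The kind of statement a liveness-based transformation might insert:
head.next = None

gc.collect()  # request a collection so the effect is observable
tail_collected = tail_probe() is None
```

Before the assignment the tail is dead but reachable. After it the tail is unreachable, and an ordinary collector can reclaim it without any change to the collector itself.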
Our approach is also applicable to plugging memory leaks in C/C++.
About Uday Khedker
Uday Khedker is an Associate Professor of Computer Science at the Indian Institute of Technology (IIT) Bombay. He is interested in the area of optimising compilers. His current topics of interest include Interprocedural Data Flow Analysis, Heap Reference Analysis, Static Inferencing of Flow Sensitive Polymorphic Types, and Compiler Verification. A recent research thrust involves cleaning up the GNU Compiler Collection (GCC) to simplify its deployment, retargeting, and enhancement. Other goals include increasing its trustworthiness as well as the quality of generated code. Two focussed activities involve (a) incorporating his group's precise, general, and efficient interprocedural data flow analysis algorithms in GCC, and (b) simplifying machine descriptions by changing the instruction selection mechanism in GCC.
His group has conducted a workshop on GCC Internals (http://www.cse.iitb.ac.in/~uday/gcc-workshop) to explain their findings to the GCC community.