Dynamic Program Analysis
for Secure and Reliable Computing

Rajiv Gupta & Neelam Gupta
Dr. Min Feng (2012)
Dr. Chen Tian (2010)
Dr. Dennis Jeffrey (2009)
Prof. Vijayanand Nagarajan (2009)
Dr. Sriraman Tallam (2007)
Prof. Xiangyu Zhang (2006)
Prof. Youtao Zhang (2002)
Dr. Clara Jaramillo (2000)
Prof. Ras Bodik (1999)
Dr. Evelyn Duesterwald (1996)

Funding

National Science Foundation, CNS, Computer Systems Research Program,
CCF-0810906/0614707, 9/2006-8/2011.
National Science Foundation, CNS, Computer Research Infrastructure Program,
CNS-0751949/0708199, 9/2007-9/2010.
National Science Foundation, CCF, Computer Processes and Artifacts Cluster,
CCF-0753470/0541382, 1/2006-1/2010.
Microsoft Research, Phoenix/SSCLI program, Redmond, Washington
4/2006-3/2007.
IBM, Eclipse Innovation Award,
1/2005-12/2005.
Microsoft Research, Redmond, Washington
9/2003-8/2006.
IBM, Eclipse Innovation Award,
1/2003-12/2003.

Description

The goal of this project is to develop scalable analysis techniques that make use of dynamic information from program runs as well as some limited static information. The analysis developed is being exploited to improve the Security and Reliability of software. We have developed techniques for collecting and storing large amounts of profile data and mining this data for information useful in automated fault location and automated testing of software. More specifically our contributions include the following:

Scalable Collection of Dynamic Information is being achieved through integrated use of checkpointing and logging as well as fine-grained tracing of software. We have developed a two step approach in which during initial program execution checkpointing and logging is used. If the program exhibits incorrect behavior, the log is analyzed to identify portion of execution that needs to traced using the replay phase to collect information useful in debugging. We have developed the Whole Execution Trace representation that integrates many different kinds of profiles that have been used for characterizing program behavior into one single representation. This includes address, value, control flow, as well as data and control dependence profiles. Collection, storage, and analysis of complete profiles is a challenging task. We has developed a compact and rapidly traversible representation of whole execution profiles. Whole execution traces produced by execution of a couple of billion instructions require a gigabyte of storage. We have further scaled this approach so it can be applied to long running programs by proposing Execution Reduction, a technique that combines checkpointing/logging with fine-grained tracing.
Dynamic Slicing & Matching are powerful analysis that are useful for many applications. The large amounts of information collected to enable computation of dynamic program slices can easily cause the machine to run out of memory. In our recent work we have experimentally evaluated various precise and imprecise dynamic slicing algorithms to show the drawbacks of imprecise slicing. We have shown that whole execution trace representation enables rapid dynamic program slicing of realistic program runs without running out of memory. Dynamic matching of whole execution traces of two program versions is being carried out to enable a wide range of applications including those of piracy detection, some debugging scenarios, and software maintenance. Unlike some static approaches for matching program versions, our approach does not require access to source code of the two program versions. Rich program execution histories enable us to rapidly produce highly accurate matches. Our work differs from static differencing techniques as it enables matching of code sequences that dynamically behave the same even though they statically appear to be different.
Fault Location & Correction. After developing practical implementations of dynamic slicing we have employed it for fault location. Three different types of dynamic slices are being used for fault location: backward slices of erroneous values; forward slices of minimal fault inducing inputs; and bidirectional slices of critical predicates. We have also developed techniques for pruning dynamic slices based upon positive evidence indicating that some of the statements in the dynamic slices may be expected to be correct because they are responsible for producing correct results. Location of faults is also important for security because faults can often be used to launch attacks. We have also developed a new approach to locate and correct an erroneous statement in a faulty function. Our approach combines ideas from software testing and weakest preconditions used in correctness proof methods to locate a likely erroneous statement. Our recent work is aimed at developing techniques for automatically identifying corrupted program state and determining corrections to the corrupted state. Our results show that this approach performs fault location with a very high degree of precision.
Software Testing is an important activity undertaken to improve the reliability of software. We have developed a greedy heuristic algorithm that iteratively exploits the implications among the test cases and implications among the requirements to generate a minimized suite that covers the same set of requirements as that of the unminimized suite. To address the problem of fault loss due to removing test cases from test suite, we have also developed a technique to selectively retain coverage redundant test cases that are likely to expose new faults. We have developed a new structural coverage criterion called invariant-coverage criterion for generating inputs to detect likely program invariants. We have developed numerical iterative techniques for automatically generating test data for a given path in a program.

Software

WET Dynamic Analysis Infrastructure: http://wet.cs.ucr.edu

Publications

Scalable Profiling

V. Nagarajan, D. Jeffrey, R. Gupta, and N. Gupta
ONTRAC: A System for Efficient ONline TRACing for Debugging,
International Conference on Software Maintenance (ICSM),
pages 445-454, Paris, September 2007.
X. Zhang, N. Gupta, and R. Gupta,
``Whole Execution Traces and Their Use in Debugging,''
The Compiler Design Handbook: Optimizations and Machine Code Generation,
Second Edition, Chapter 4, CRC Press, pages 4-1--4-17, Dec. 2007.
S. Tallam and R. Gupta,
Unified Control Flow and Dependence Traces,
ACM Transactions on Architecture and Code Optimization (TACO),
Vol. 4, No. 3, 31 pages, September 2007.
Y. Lin, Y. Zhang, and R. Gupta,
The Design and Evaluation of Path Matching Schemes on Compressed Control Flow Traces,
Journal of Systems and Software (JSS),
Vol. 80, No. 3, pages 396-409, 2007.
X. Zhang and R. Gupta,
Whole Execution Traces and their Applications,
ACM Transactions on Architecture and Code Optimization (TACO),
Vol. 2, No. 3, pages 301-334, Sept. 2005.
S. Tallam, R. Gupta, and X. Zhang,
Extended Whole Program Paths,
International Conference on Parallel Architectures and Compilation Techniques (PACT),
pages 17-26, St. Loius, Missouri, September 2005.
X. Zhang and R. Gupta,
Whole Execution Traces,
IEEE/ACM 37th International Symposium on Microarchitecture (MICRO),
pages 105-116, Portland, Oregan, December 2004.
S. Tallam, X. Zhang, and R. Gupta,
Extending Path Profiling across Loop Backedges and Procedure Boundaries,
Second Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO),
pages 251-262, San Jose, CA, March 2004.
Y. Zhang and R. Gupta,
Timestamped Whole Program Path Representation and its Applications,
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI),
pages 180-190, Snowbird, Utah, June 2001.

Dynamic Slicing and Matching
M. Feng and R. Gupta,
Detecting Virus Mutations Via Dynamic Matching,
International Conference on Software Maintenance (ICSM),
pages 105-114, Edmonton, September 2009.
V. Nagarajan, R. Gupta, X. Zhang, M. Madou, B. De Sutter, and K. De Bosschere
Matching Control Flow of Program Versions,
International Conference on Software Maintenance (ICSM),
pages 84-93, Paris, September 2007.
X. Zhang, S. Tallam, and R. Gupta
Dynamic Slicing Long Running Programs through Execution Fast Forwarding,
14th ACM SIGSOFT Symposium on Foundations of Software Engineering (FSE),
pages 81-91, Portland, Oregon, November 2006.
X. Zhang, R. Gupta, and Y. Zhang
Cost and Precision Tradeoffs of Dynamic Slicing Algorithms,
ACM Transactions on Programming Languages and Systems (TOPLAS),
Vol. 27, No. 4, pages 631-661, July 2005.
X. Zhang and R. Gupta,
Matching Execution Histories of Program Versions,
Joint 10th European Software Engineering Conference and
13th ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC-FSE),
pages 197-206, Lisbon, Portugal, September 2005.
X. Zhang and R. Gupta,
Cost Effective Dynamic Program Slicing,
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI),
pages 94-106, Washington D.C., June 2004.
X. Zhang, R. Gupta, and Y. Zhang
Effective Forward Computation of Dynamic Slices Using Reduced Ordered Binary Decision Diagrams,
IEEE/ACM International Conference on Software Engineering (ICSE),
pages 502-511, Edinburgh, UK, May 2004.
Recipient of ICSE 2003 Distinguished Paper Award.
X. Zhang, R. Gupta, and Y. Zhang
Precise Dynamic Slicing Algorithms,
IEEE/ACM International Conference on Software Engineering (ICSE),
pages 319-329, Portland, Oregon, May 2003.
Nominated for Best Paper Award.
N. Gupta and P. Rao,
``Program Execution Based Module Cohesion Measurement,''
16th IEEE International Conference on Automated Software Engineering (ASE),
pages 144-153, San Diego, USA, November 2001.
R. Gupta and M.L. Soffa,
Hybrid Slicing: An Approach for Refining Static Slices using Dynamic Information,
ACM SIGSOFT 3rd Symposium on the Foundations of Software Engineering (FSE),
pages 29-40, Washington, DC, October 1995.
Extended version in ACM Transactions on Software Engineering and Methodology (TOSEM),
Vol. 6, No. 4, pages 370-397, October 1997 (with J. Howard).
E. Duesterwald, R. Gupta, and M.L. Soffa,
Distributed Slicing and Partial Re-execution for Distributed Programs,
5th Workshop on Languages and Compilers for Parallel Computing (LCPC),
LNCS 757 Springer Verlag, pages 497-511, Yale University, New Haven, Connecticut, August 1992.

Fault Location and Correction
M. Feng and R. Gupta,
Learning Universal Probabilistic Models for Fault Localization,
Ninth ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE),
pages 81-88, Toronto, Canada, June 2010.
D. Jeffrey, V. Nagarajan, and R. Gupta
Execution Suppression: An Automated Iterative Technique for locating Memory Errors,
ACM Transactions on Programming Languages and Systems (TOPLAS),
Vol. 32, No. 5, 36 pages, May 2010.
D. Jeffrey, N. Gupta, and R. Gupta
Effective and Efficient Localization of Multiple Faults Using Value Replacement,
International Conference on Software Maintenance (ICSM),
pages 221-230, Edmonton, September 2009.
D. Jeffrey, M. Feng, N. Gupta, and R. Gupta
BugFix: A Learning-Based Tool to Assist Developers in Fixing Bugs,
IEEE 17th International Conference on Program Comprehension (ICPC),
pages 70-79, Vancouver, Canada, May 2009.
D. Jeffrey, N. Gupta, and R. Gupta
Identifying the Root Causes of Memory Bugs Using Corrupted Memory Location Suppression,
International Conference on Software Maintenance (ICSM),
pages 356-365, Beijing, China, September 2008.
D. Jeffrey, N. Gupta, and R. Gupta
Fault Localization using Value Replacement,
International Symposium on Software Testing and Analysis (ISSTA),
pages 167-178, Seattle, July 2008.
N. Gupta and R. Gupta
ExPert: Dynamic Analysis based Fault Location via Execution Perturbations,
NSF Next Generation Software Workshop (NSFNGS),
colocated with IPDPS, pages 1-6, March 2007.
X. Zhang, S. Tallam, N. Gupta, and R. Gupta
Towards Locating Execution Omission Errors,
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI),
pages 415-424, San Diego, June 2007.
X. Zhang, N. Gupta, and R. Gupta,
Locating Faulty Code By Multiple Points Slicing,
Software - Practice & Experience (SP&E) Journal,
Vol. 37, Issue 9, pages 935-961, July 2007.
X. Zhang, N. Gupta, and R. Gupta
A Study of Effectiveness of Dynamic Slicing in Locating Real Faults,
Empirical Software Engineering (ESE) Journal,
Vol. 12, No. 2, pages 143-160, April 2007.
X. Zhang, N. Gupta, and R. Gupta
Pruning Dynamic Slices With Confidence,
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI),
pages 169-180, Ottawa, Canada, June 2006.
X. Zhang, N. Gupta, and R. Gupta
Locating Faults Through Automated Predicate Switching,
IEEE/ACM International Conference on Software Engineering (ICSE),
pages 272-281, Shanghai, China, May 2006.
N. Gupta, H. He, X. Zhang, and R. Gupta,
Locating Faulty Code Using Failure-Inducing Chops,
20th IEEE/ACM International Conference on Automated Software Engineering (ASE),
pages 263-272, Long Beach, California, Nov. 2005.
X. Zhang, H. He, N. Gupta, and R. Gupta
Experimental Evaluation of Using Dynamic Slices for Fault Location,
SIGSOFT-SIGPLAN Sixth International Symposium on Automated and Analysis-Driven Debugging (AADEBUG),
pages 33-42, Monterey, California, September 2005.
H. He and N. Gupta,
``Automated Debugging using Path-Based Weakest Preconditions,''
Fundamental Approaches to Software Engineering (FASE),
ETAPS Joint Conference 2004, Barcelona, Spain, March 29-31, 2004.
X. Zhang and R. Gupta,
Hiding Program Slices for Software Security,
First Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO),
pages 325-336, San Francisco, CA, March 2003.
C. Jaramillo, R. Gupta, and M.L. Soffa,
Comparison Checking: An Approach to Avoid Debugging of Optimized Code,
Joint 7th European Software Engineering Conference and
7th ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC-FSE),
LNCS 1687, Springer Verlag, pages 268-284, Toulouse, France, Sept. 1999.

Software Testing
D. Jeffrey and N. Gupta,
``Experiments with Test Case Prioritization Using Relevant Slices,"
Journal on Systems and Software (JSS), for a Special Issue on Model-Based Software Testing,
Vol. 81, No. 2, pages 196-221, Feb. 2008.
D. Jeffrey and N. Gupta,
``Improving Fault Detection Capability by Selectively Retaining Test Cases During Test Suite Reduction,''
IEEE Transactions on Software Engineering,
pages 108-123, Vol. 33, No. 2, Feb. 2007.
Recipient of Best Paper Award.
D. Jeffrey and N. Gupta,
``Prioritizing Test Cases Using Relevant Slices,''
30th Annual International Computer Software and Applications Conference (COMPSAC),
pages 411-418, Chicago, USA, September 18-21, 2006.
S. Tallam and N. Gupta,
``A Concept Analysis Inspired Greedy Algorithm for Test Suite Minimization,''
ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE),
Lisbon, Portugal, September 5-6, 2005.
D. Jeffrey and N. Gupta,
``Test Suite Reduction with Selective Redundancy,''
21st IEEE International Conference on Software Maintenance (ICSM),
pages 549-558, Budapest, Hungary, September 25-30, 2005.
N. Gupta, Y-J. Cho, and M.Z. Hossain,
``Experiments with UNA for Solving Linear Constraints in Real Variables,''
ACM Symposium on Applied Computing (SAC),
Nicosia, Cyprus, March 14-17, 2004.
N. Gupta and Z.V. Heidepriem,
``A New Structural Coverage Criteria for Dynamic Detection of Program Invariants,''
18th IEEE International Conference on Automated Software Engineering (ASE),
Montreal, Quebec, Canada, October 6-10, 2003.
N. Gupta,
``Generating Test Data for Dynamically Discovering Likely Program Invariants,''
ICSE 2003 Workshop on Dynamic Analysis (WODA),
Portland, Oregon, USA, May 3-10, 2003.
S. Visvanathan and N. Gupta,
``Generating Test Data for Functions with Pointer Inputs,''
17th IEEE International Conference on Automated Software Engineering (ASE),
pages 149-160, Edinburgh, UK, September 2002.
N. Gupta and R. Gupta,
``Data Flow Testing,''
The Compiler Design Handbook: Optimizations and Machine Code Generation,
First Edition, Chapter 7, pages 247-267, CRC Press, September 2002.
N. Gupta
``A Hierarchy of Structural Coverage Criteria for Testing Multithreaded Programs,''
International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA),
pages 1641-1646, Las Vegas, Nevada, USA, June 2001.
N. Gupta, A.P. Mathur, and M.L. Soffa,
``Generating Test Data for Branch Coverage,''
15th IEEE International Conference on Automated Software Engineering (ASE),
Grenoble, France, September 2000.
N. Gupta, A.P. Mathur, and M.L. Soffa,
``UNA Based Iterative Test Data Generation and its Evaluation,''
14th IEEE International Conference on Automated Software Engineering (ASE),
pages 224-232, Cocoa Beach, Florida, USA, October 1999.
N. Gupta, A.P. Mathur, and M.L. Soffa,
``Automated Test Data Generation Using an Iterative Relaxation Method,''
ACM SIGSOFT Sixth International Symposium on Foundations of Software Engineering (FSE),
pages 231-244, Orlando, Florida, USA, November 1998
M.J. Harrold, R. Gupta, and M.L. Soffa,
``A Methodology for Controlling the Size of a Test Suite,''
ACM Transactions on Software Engineering and Methodology (TOSEM),
Vol. 2, No. 3, pages 270-285, July 1993.
R. Gupta, M.J. Harrold, and M.L. Soffa,
An Approach to Regression Testing using Slicing,
IEEE-CS International Conference on Software Maintenance (ICSM),
pages 299-308, Orlando, Florida, November 1992.

Profile-Guided and Demand-Driven Analysis
R. Bodik, R. Gupta, and M.L. Soffa,
Retrospective -- Complete Removal of Redundant Expressions,
20 Years of the ACM/SIGPLAN Conference on Programming Language Design and
Implementation (PLDI) (1979-1999): A Selection, ACM SIGPLAN Notices,
Vol. 39, No. 4, pages 596-597, April 2004.
R. Bodik, R. Gupta, and V. Sarkar,
ABCD: Eliminating Array Bounds Checks on Demand,
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI),
pages 321-333, Vancouver B.C., Canada, June 2000.
Selected for 20 Years of PLDI.
R. Bodik, R. Gupta and M.L. Soffa,
Complete Removal of Redundant Expressions,
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI),
pages 1-14, Montreal, Canada, June 1998.
R. Gupta, D. Berson, and J.Z. Fang,
Resource-Sensitive Profile-Directed Data Flow Analysis for Code Optimization,
IEEE/ACM 30th International Symposium on Microarchitecture (MICRO),
pages 558-568, Research Triangle Park, North Carolina, December 1997.
E. Duesterwald, R. Gupta, and M.L. Soffa,
Demand-Driven Computation of Interprocedural Data Flow,
ACM SIGPLAN-SIGACT 22nd Symposium on Principles of Programming Languages (POPL),
pages 37-48, San Francisco, California, January 1995.
Also published in ACM Transactions on Programming Languages and Systems (TOPLAS),
Vol. 19, No. 6, pages 992-1030, November 1997.
R. Bodik, R. Gupta, and M.L. Soffa,
Refining Data Flow Information using Infeasible Paths,
Joint 6th European Software Engineering Conference and
5th ACM SIGSOFT Symposium on the Foundations of Software Engineering,
LNCS 1301, Springer Verlag, pages 361-377, Zurich, Switzerland, September 1997.
R. Bodik and R. Gupta,
Partial Dead Code Elimination using Slicing Transformations,
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI),
pages 159-170, Las Vegas, Nevada, June 1997.
R. Gupta,
A Fresh Look at Optimizing Array Bound Checks,
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI),
pages 272-282, White Plains, NY, June 1990.
Also published in ACM Letters on Programming Languages and Systems (LOPLAS),
Vol.2, Nos.1-4, pages 135-150, March-December 1994.
R. Gupta,
``Generalized Dominators and Post-Dominators,''
ACM SIGPLAN-SIGACT 19th Symposium on Principles of Programming Languages (POPL),
pages 246-257, Albuquerque, New Mexico, January 1992.

Dynamic Program Analysis for Secure and Reliable Computing

Funding

Description

Software

WET Dynamic Analysis Infrastructure: http://wet.cs.ucr.edu

Publications

Scalable Profiling

Dynamic Slicing and Matching

Fault Location and Correction

Software Testing

Profile-Guided and Demand-Driven Analysis

Dynamic Program Analysis
for Secure and Reliable Computing