Abstract
Software engineering tools often deal with the source code of programs retrieved from the web or from source code repositories. Typically, these tools have access to only a subset of a program's source code (one file or a subset of files), which makes it difficult to build a complete and typed intermediate representation (IR). Indeed, for incomplete object-oriented programs, it is not always possible to fully disambiguate the syntactic constructs or to recover the declared type of certain expressions, because the declarations of many types and class members are not accessible.
We present a framework that performs partial type inference and uses heuristics to recover the declared type of expressions and resolve ambiguities in partial Java programs. Our framework produces a complete and typed IR suitable for further static analysis. We have implemented this framework and used it in an empirical study on four large open-source systems; the study shows that our system recovers most declared types with a low error rate, even when only one class is accessible.
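To make the kind of ambiguity described above concrete, consider a dotted name such as `a.b.c` in a single accessible file: without the missing declarations, `a` may be a local variable, a type, or a package prefix. The sketch below is only an illustration under stated assumptions and is not the heuristic used by the framework. The class `PartialNameResolver`, the enum `Kind`, and the method `classifyFirstSegment` are hypothetical names, and the fallback on Java naming conventions is an assumed rule chosen for this example.

```java
import java.util.Set;

public class PartialNameResolver {

    /** Possible readings of the first segment of a dotted name such as a.b.c. */
    public enum Kind { LOCAL_VARIABLE, KNOWN_TYPE, UNKNOWN_TYPE, PACKAGE_PREFIX }

    /**
     * Classifies the first segment of a dotted name using only information
     * visible in the accessible file. Hypothetical heuristic for illustration;
     * it is not taken from the paper.
     */
    public static Kind classifyFirstSegment(String dottedName,
                                            Set<String> localVariables,
                                            Set<String> importedTypes) {
        int dot = dottedName.indexOf('.');
        String first = dot < 0 ? dottedName : dottedName.substring(0, dot);
        if (first.isEmpty()) {
            throw new IllegalArgumentException("empty first segment: " + dottedName);
        }
        // A variable declared in the file takes precedence over a type name.
        if (localVariables.contains(first)) {
            return Kind.LOCAL_VARIABLE;
        }
        // A simple type name introduced by an import in the same file.
        if (importedTypes.contains(first)) {
            return Kind.KNOWN_TYPE;
        }
        // No declaration is visible: fall back on naming conventions (assumption).
        return Character.isUpperCase(first.charAt(0))
                ? Kind.UNKNOWN_TYPE
                : Kind.PACKAGE_PREFIX;
    }

    public static void main(String[] args) {
        Set<String> locals = Set.of("chart");   // variables declared in the file
        Set<String> imports = Set.of("Plot");   // simple names of imported types

        System.out.println(classifyFirstSegment("chart.title.text", locals, imports));
        System.out.println(classifyFirstSegment("Plot.DEFAULT", locals, imports));
        System.out.println(classifyFirstSegment("Axis.create", locals, imports));
        System.out.println(classifyFirstSegment("org.example.Axis", locals, imports));
    }
}
```

Checking local variables before types mirrors the way Java itself prefers a variable over a type when a simple name could denote either. The convention-based fallback can be wrong (for example, for a lowercase class name), which is why such guesses come with a nonzero error rate.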
Enabling static analysis for partial Java programs
OOPSLA '08: Proceedings of the 23rd ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications