ABSTRACT
Programming languages evolve over time, adding features to simplify common tasks and make the language easier to use. For example, the Java Language Specification has four editions, and a fifth is currently being drafted. Although new language features are driven by an assumed need in the community (often with direct requests for such features), there is little empirical evidence of how developers adopt these features once they are released. In this paper, we analyze over 31k open-source Java projects comprising over 9 million Java files, which when parsed contain over 18 billion AST nodes. We analyze this corpus to find uses of new Java language features over time. Our study yields several insights: there are millions of places where features could have been used but were not; developers convert existing code to use new features; and we found thousands of instances of potential resource handling bugs.
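To make "places features could potentially be used" concrete, the following is a minimal sketch that contrasts a manually closed resource with the try-with-resources statement introduced in Java SE 7. The abstract does not name the features studied or show any code, so this pairing is an assumption chosen for illustration, and the names ResourceHandlingSketch, TrackedResource, readManually, and readWithTryWithResources are hypothetical.

```java
public class ResourceHandlingSketch {

    // Hypothetical resource that records whether it has been closed.
    public static final class TrackedResource implements AutoCloseable {
        private boolean closed = false;

        public String read() {
            return "data";
        }

        @Override
        public void close() {
            closed = true;
        }

        public boolean isClosed() {
            return closed;
        }
    }

    // Older style: the caller must remember to close the resource in a
    // finally block. Omitting the finally block would leave the resource
    // open whenever read() throws.
    public static String readManually(TrackedResource source) {
        TrackedResource resource = source;
        try {
            return resource.read();
        } finally {
            resource.close();
        }
    }

    // Java SE 7 style: try-with-resources closes the resource automatically
    // when the block exits, whether normally or through an exception.
    public static String readWithTryWithResources(TrackedResource source) {
        try (TrackedResource resource = source) {
            return resource.read();
        }
    }

    public static void main(String[] args) {
        TrackedResource first = new TrackedResource();
        System.out.println(readManually(first) + " closed=" + first.isClosed());

        TrackedResource second = new TrackedResource();
        System.out.println(readWithTryWithResources(second) + " closed=" + second.isClosed());
    }
}
```

Both methods behave the same here. The first is the kind of code that could be converted to the newer form, under the assumption stated above.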
Index Terms
- Mining billions of AST nodes to study actual and potential usage of Java language features