Intelligent Code Completion with Bayesian Networks

Published: 02 December 2015

Abstract

Code completion is an integral part of modern Integrated Development Environments (IDEs). Developers often use it to explore Application Programming Interfaces (APIs). It also reduces the amount of typing required and helps avoid typos. Traditional code completion systems propose all type-correct methods to the developer. The resulting list is often very long and contains many irrelevant items. Prior work has proposed more intelligent code completion systems that reduce the list of proposed methods to the relevant ones.

This work extends one of these existing approaches, the Best Matching Neighbor (BMN) algorithm. We introduce Bayesian networks as an alternative underlying model, use additional context information for more precise recommendations, and apply clustering techniques to reduce model sizes. We compare our new approach, Pattern-based Bayesian Networks (PBN), to the existing BMN algorithm. We extend previously used evaluation methodologies and, in addition to prediction quality, also evaluate model size and inference speed.

Our results show that the additional context information we collect improves prediction quality, especially for queries that do not contain method calls. We also show that PBN achieves prediction quality comparable to that of BMN, while its model size and inference speed scale better with large input sizes.
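
To make the idea concrete, the following is a minimal sketch, not the authors' implementation, of how a Bayesian model over usage patterns can rank method proposals. It assumes a naive-Bayes-style structure in which a hidden pattern node conditions both the enclosing method context and each method call. Every distinct usage is treated as its own pattern (no clustering is applied), and the smoothing constant `EPS`, the function names `train` and `recommend`, and the toy usages of a text widget are all hypothetical.

```python
from collections import Counter

# Hypothetical smoothing constant: evidence that contradicts a pattern
# makes it unlikely, but never gives it probability zero.
EPS = 1e-3


def train(usages):
    """Build a toy model from observed usages.

    usages: list of (context, frozenset_of_calls) pairs.
    Returns a list of (prior, context, calls), one entry per distinct usage.
    """
    counts = Counter(usages)
    total = sum(counts.values())
    return [(n / total, ctx, calls) for (ctx, calls), n in counts.items()]


def recommend(model, context, observed_calls, all_methods):
    """Rank methods not yet called by their probability given the query."""
    # Unnormalized posterior of each pattern:
    # prior * P(context | pattern) * product of P(call | pattern).
    weights = []
    for prior, ctx, calls in model:
        w = prior * ((1 - EPS) if ctx == context else EPS)
        for m in observed_calls:
            w *= (1 - EPS) if m in calls else EPS
        weights.append(w)
    z = sum(weights)

    # Marginalize over patterns to score each candidate method.
    scores = {}
    for m in all_methods - observed_calls:
        scores[m] = sum(
            (w / z) * ((1 - EPS) if m in calls else EPS)
            for w, (_, _, calls) in zip(weights, model)
        )
    return sorted(scores.items(), key=lambda kv: -kv[1])


# Toy usages of a hypothetical text widget: (enclosing method, calls made).
USAGES = (
    [("createContents", frozenset({"<init>", "setText"}))] * 3
    + [("createContents", frozenset({"<init>", "setText", "setLayoutData"}))]
    + [("widgetSelected", frozenset({"getText"}))] * 2
)
METHODS = {"<init>", "setText", "setLayoutData", "getText"}
MODEL = train(USAGES)

# A query without any method calls: only the context is known.
by_context = recommend(MODEL, "widgetSelected", set(), METHODS)
# A query with one observed call in a different context.
by_calls = recommend(MODEL, "createContents", {"<init>"}, METHODS)
```

Under these assumptions, the query that contains no calls but has the context `widgetSelected` ranks `getText` first, while the query with `<init>` in `createContents` ranks `setText` first. The first case illustrates how context information alone can drive a recommendation.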


Published in

ACM Transactions on Software Engineering and Methodology, Volume 25, Issue 1
December 2015
339 pages
ISSN: 1049-331X
EISSN: 1557-7392
DOI: 10.1145/2852270

            Copyright © 2015 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 2 December 2015
• Accepted: 1 March 2015
• Revised: 1 January 2015
• Received: 1 February 2014