Abstract
Research at the intersection of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit the abundance of patterns of code. In this article, we survey this work. We contrast programming languages against natural languages and discuss how these similarities and differences drive the design of probabilistic models. We present a taxonomy based on the underlying design principles of each model and use it to navigate the literature. Then, we review how researchers have adapted these models to application areas and discuss cross-cutting and application-specific challenges and opportunities.
Supplemental Material
Available for Download
Supplemental movie, appendix, image and software files for, A Survey of Machine Learning for Big Code and Naturalness
- Mithun Acharya, Tao Xie, Jian Pei, and Jun Xu. 2007. Mining API patterns as partial orders from source code: From usage scenarios to specifications. In Proceedings of the Joint Meeting of the European Software Engineering Conference and the Symposium on the Foundations of Software Engineering (ESEC/FSE’07). Google ScholarDigital Library
- Karan Aggarwal, Mohammad Salameh, and Abram Hindle. 2015. Using Machine Translation for Converting Python 2 to Python 3 Code. Technical Report.Google Scholar
- Alex A. Alemi, Francois Chollet, Geoffrey Irving, Christian Szegedy, and Josef Urban. 2016. DeepMath--Deep sequence models for premise selection. In Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS’16). Google ScholarDigital Library
- Miltiadis Allamanis, Earl T. Barr, Christian Bird, Premkumar Devanbu, Mark Marron, and Charles Sutton. 2016. Mining Semantic Loop Idioms from Big Code. Technical Report. Retrieved from https://www.microsoft.com/en-us/research/publication/mining-semantic-loop-idioms-big-code/.Google Scholar
- Miltiadis Allamanis, Earl T. Barr, Christian Bird, and Charles Sutton. 2014. Learning natural coding conventions. In Proceedings of the International Symposium on Foundations of Software Engineering (FSE’14). Google ScholarDigital Library
- Miltiadis Allamanis, Earl T. Barr, Christian Bird, and Charles Sutton. 2015. Suggesting accurate method and class names. In Proceedings of the Joint Meeting of the European Software Engineering Conference and the Symposium on the Foundations of Software Engineering (ESEC/FSE’15). Google ScholarDigital Library
- Miltiadis Allamanis and Marc Brockschmidt. 2017. SmartPaste: Learning to adapt source code. arXiv Preprint arXiv:1705.07867 (2017).Google Scholar
- Miltiadis Allamanis, Marc Brockschmidt, and Mahmoud Khademi. 2018. Learning to represent programs with graphs. In Proceedings of the International Conference on Learning Representations (ICLR’18).Google Scholar
- Miltiadis Allamanis, Pankajan Chanthirasegaran, Pushmeet Kohli, and Charles Sutton. 2017. Learning continuous semantic representations of symbolic expressions. In Proceedings of the International Conference on Machine Learning (ICML’17).Google Scholar
- Miltiadis Allamanis, Hao Peng, and Charles Sutton. 2016. A convolutional attention network for extreme summarization of source code. In Proceedings of the International Conference on Machine Learning (ICML’16).Google Scholar
- Miltiadis Allamanis and Charles Sutton. 2013. Mining source code repositories at massive scale using language modeling. In Proceedings of the Working Conference on Mining Software Repositories (MSR’13). Google ScholarDigital Library
- Miltiadis Allamanis and Charles Sutton. 2014. Mining idioms from source code. In Proceedings of the International Symposium on Foundations of Software Engineering (FSE’14). Google ScholarDigital Library
- Miltiadis Allamanis, Daniel Tarlow, Andrew Gordon, and Yi Wei. 2015. Bimodal modelling of source code and natural language. In Proceedings of the International Conference on Machine Learning (ICML’15). Google ScholarDigital Library
- Sven Amann, Sebastian Proksch, Sarah Nadi, and Mira Mezini. 2016. A study of visual studio usage in practice. In Proceedings of the International Conference on Software Analysis, Evolution, and Reengineering (SANER’16).Google ScholarCross Ref
- Gene M. Amdahl. 1967. Validity of the single processor approach to achieving large scale computing capabilities. In Proceedings of the Spring Joint Computer Conference. Google ScholarDigital Library
- Matthew Amodio, Swarat Chaudhuri, and Thomas Reps. 2017. Neural attribute machines for program generation. arXiv Preprint arXiv:1705.09231 (2017).Google Scholar
- Jacob Andreas, Marcus Rohrbach, Trevor Darrell, and Dan Klein. 2016. Learning to compose neural networks for question answering. In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT’16).Google ScholarCross Ref
- Daniel Arp, Michael Spreitzenbarth, Malte Hubner, Hugo Gascon, and Konrad Rieck. 2014. DREBIN: Effective and explainable detection of android malware in your pocket. In Proceedings of the Network and Distributed System Security Symposium.Google ScholarCross Ref
- Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In Proceedings of the International Conference on Learning Representations (ICLR’15).Google Scholar
- Matej Balog, Alexander L. Gaunt, Marc Brockschmidt, Sebastian Nowozin, and Daniel Tarlow. 2017. DeepCoder: Learning to write programs. In Proceedings of the International Conference on Learning Representations (ICLR’17).Google Scholar
- Antonio Valerio Miceli Barone and Rico Sennrich. 2017. A parallel corpus of Python functions and documentation strings for automated code documentation and code generation. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers) 2 (2017), 314--319.Google Scholar
- Rohan Bavishi, Michael Pradel, and Koushik Sen. 2017. Context2Name: A deep learning-based approach to infer natural variable names from usage contexts. TU Darmstadt, Department of Computer Science.Google Scholar
- Tony Beltramelli. 2018. pix2code: Generating code from a graphical user interface screenshot. In Proceedings of the ACM SIGCHI Symposium on Engineering Interactive Computing Systems. ACM, 3 pages. Google ScholarDigital Library
- Al Bessey, Ken Block, Ben Chelf, Andy Chou, Bryan Fulton, Seth Hallem, Charles Henri-Gros, Asya Kamsky, Scott McPeak, and Dawson Engler. 2010. A few billion lines of code later: Using static analysis to find bugs in the real world. Communications of the ACM 53, 2 (2010), 66--75. Google ScholarDigital Library
- Sahil Bhatia and Rishabh Singh. 2018. Automated correction for syntax errors in programming assignments using recurrent neural networks. In Proceedings of the International Conference on Software Engineering (ICSE’18).Google Scholar
- Avishkar Bhoopchand, Tim Rocktäschel, Earl Barr, and Sebastian Riedel. 2016. Learning Python code suggestion with a sparse pointer network. arXiv Preprint arXiv:1611.08307 (2016).Google Scholar
- Benjamin Bichsel, Veselin Raychev, Petar Tsankov, and Martin Vechev. 2016. Statistical deobfuscation of android applications. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. Google ScholarDigital Library
- Pavol Bielik, Veselin Raychev, and Martin Vechev. 2015. Programming with “big code”: Lessons, techniques and applications. In Proceedings of the LIPIcs-Leibniz International Proceedings in Informatics.Google Scholar
- Pavol Bielik, Veselin Raychev, and Martin Vechev. 2016. PHOG: Probabilistic model for code. In Proceedings of the International Conference on Machine Learning (ICML’16). Google ScholarDigital Library
- David M. Blei. 2012. Probabilistic topic models. Communications of the ACM 55, 4 (2012), 77--84. Google ScholarDigital Library
- Marc Brockschmidt, Yuxin Chen, Pushmeet Kohli, Siddharth Krishna, and Daniel Tarlow. 2017. Learning shape analysis. In Proceedings of the International Static Analysis Symposium. Springer.Google ScholarCross Ref
- Peter John Brown. 1979. Software Portability: An Advanced Course. CUP Archive.Google Scholar
- Marcel Bruch, Martin Monperrus, and Mira Mezini. 2009. Learning from examples to improve code completion systems. In Proceedings of the Joint Meeting of the European Software Engineering Conference and the Symposium on the Foundations of Software Engineering (ESEC/FSE’09). Google ScholarDigital Library
- Raymond P. L. Buse and Westley Weimer. 2012. Synthesizing API usage examples. In Proceedings of the International Conference on Software Engineering (ICSE’12). Google ScholarDigital Library
- Joshua Charles Campbell, Abram Hindle, and José Nelson Amaral. 2014. Syntax errors just aren’t natural: Improving error reporting with language models. In Proceedings of the Working Conference on Mining Software Repositories (MSR’14).Google ScholarDigital Library
- Lei Cen, Christoher S. Gates, Luo Si, and Ninghui Li. 2015. A probabilistic discriminative model for Android malware detection with decompiled source code. IEEE Transactions on Dependable and Secure Computing 12, 4 (2015), 400--412.Google ScholarDigital Library
- Luigi Cerulo, Massimiliano Di Penta, Alberto Bacchelli, Michele Ceccarelli, and Gerardo Canfora. 2015. Irish: A hidden markov model to detect coded information islands in free text. Science of Computer Programming 105 (2015), 26--43. Google ScholarDigital Library
- Kwonsoo Chae, Hakjoo Oh, Kihong Heo, and Hongseok Yang. 2017. Automatically generating features for learning program analysis heuristics for C-like languages. In Proceedings of the Conference on Object-Oriented Programming, Systems, Languages 8 Applications (OOPSLA’17).Google ScholarDigital Library
- Varun Chandola, Arindam Banerjee, and Vipin Kumar. 2009. Anomaly detection: A survey. ACM Computing Surveys (CSUR) 41, 3 (2009), 15. Google ScholarDigital Library
- Stanley F. Chen and Joshua Goodman. 1999. An empirical study of smoothing techniques for language modeling. Computer Speech and Language 13, 4 (1999), 359--394. Google ScholarDigital Library
- Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014. On the properties of neural machine translation: Encoder--Decoder approaches. In Syntax, Semantics and Structure in Statistical Translation (2014).Google ScholarCross Ref
- Edmund Clarke, Daniel Kroening, and Karen Yorav. 2003. Behavioral consistency of C and verilog programs using bounded model checking. In Proceedings of the 40th Annual Design Automation Conference. Google ScholarDigital Library
- Trevor Cohn, Phil Blunsom, and Sharon Goldwater. 2010. Inducing tree-substitution grammars. Journal of Machine Learning Research 11, Nov (2010), 3053--3096. Google ScholarDigital Library
- Christopher S. Corley, Kostadin Damevski, and Nicholas A. Kraft. 2015. Exploring the use of deep learning for feature location. In Proceedings of the 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME’15). Google ScholarDigital Library
- Patrick Cousot, Radhia Cousot, Jerôme Feret, Laurent Mauborgne, Antoine Miné, David Monniaux, and Xavier Rival. 2005. The ASTRÉE analyzer. In ESPO. Springer.Google Scholar
- William Croft. 2008. Evolutionary linguistics. Ann. Rev. Anthropol. (2008).Google Scholar
- Chris Cummins, Pavlos Petoumenos, Zheng Wang, and Hugh Leather. 2017. End-to-end deep learning of optimization heuristics. In Proceedings of the 26th International Conference on Parallel Computing Technologies (PACT'17). IEEE, 219--232.Google ScholarCross Ref
- Chris Cummins, Pavlos Petoumenos, Zheng Wang, and Hugh Leather. 2017. Synthesizing benchmarks for predictive modeling. In Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization (CGO’17). IEEE, 86--99. Google ScholarCross Ref
- Hoa Khanh Dam, Truyen Tran, and Trang Pham. 2016. A deep language model for software code. arXiv Preprint arXiv:1608.02715 (2016).Google Scholar
- Florian Deißenböck and Markus Pizka. 2006. Concise and consistent naming. Software Quality Journal 14, 3 (2006), 261--282. Google ScholarDigital Library
- Yuntian Deng, Anssi Kanervisto, Jeffrey Ling, and Alexander M. Rush. 2017. Image-to-markup generation with coarse-to-fine attention. In Proceedings of the International Conference on Machine Learning (ICML’17). 980--989.Google Scholar
- Premkumar Devanbu. 2015. New initiative: The naturalness of software. In Proceedings of the International Conference on Software Engineering (ICSE’15). Google ScholarDigital Library
- Jacob Devlin, Jonathan Uesato, Surya Bhupatiraju, Rishabh Singh, Abdel rahman Mohamed, and Pushmeet Kohli. 2017. Robustfill: Neural program learning under noisy I/O. In Proceedings of the International Conference on Machine Learning (ICML’17).Google Scholar
- Robert Dyer, Hoan Anh Nguyen, Hridesh Rajan, and Tien N. Nguyen. 2013. Boa: A language and infrastructure for analyzing ultra-large-scale software repositories. In Proceedings of the International Conference on Software Engineering (ICSE’13). Google ScholarDigital Library
- Kevin Ellis, Daniel Ritchie, Armando Solar-Lezama, and Joshua B. Tenenbaum. 2017. Learning to infer graphics programs from hand-drawn images. arXiv Preprint arXiv:1707.09627 (2017).Google Scholar
- Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf. 2001. Bugs as deviant behavior: A general approach to inferring errors in systems code. In ACM SIGOPS Operating Systems Review. Google ScholarDigital Library
- Michael D. Ernst. 2017. Natural language is a programming language: Applying natural language processing to software development. In Proceedings of the LIPIcs-Leibniz International Proceedings in Informatics.Google Scholar
- Ethan Fast, Daniel Steffee, Lucy Wang, Joel R. Brandt, and Michael S. Bernstein. 2014. Emergent, crowd-scale programming practice in the IDE. In Proceedings of the Annual ACM Conference on Human Factors in Computing Systems. Google ScholarDigital Library
- John K. Feser, Marc Brockschmidt, Alexander L. Gaunt, and Daniel Tarlow. 2017. Neural functional programming. InProceedings of the International Conference on Learning Representations (ICLR’17).Google Scholar
- Matthew Finifter, Adrian Mettler, Naveen Sastry, and David Wagner. 2008. Verifiable functional purity in java. In Proceedings of the 15th ACM Conference on Computer and Communications Security. ACM, 161--174. Google ScholarDigital Library
- Eclipse Foundation. Code Recommenders. Retrieved June 2017 from www.ecli pse.org/recommenders.Google Scholar
- Jaroslav Fowkes, Pankajan Chanthirasegaran, Razvan Ranca, Miltos Allamanis, Mirella Lapata, and Charles Sutton. 2017. Autofolding for source code summarization. IEEE Transactions on Software Engineering 43, 12 (2017), 1095--1109. Google ScholarDigital Library
- Jaroslav Fowkes and Charles Sutton. 2015. Parameter-free probabilistic API mining at GitHub Scale. In Proceedings of the International Symposium on Foundations of Software Engineering (FSE’15). Google ScholarDigital Library
- Christine Franks, Zhaopeng Tu, Premkumar Devanbu, and Vincent Hellendoorn. 2015. Cacheca: A cache language model based code suggestion tool. In Proceedings of the International Conference on Software Engineering (ICSE’15). Google ScholarDigital Library
- Wei Fu and Tim Menzies. 2017. Easy over hard: A case study on deep learning. In Proceedings of the International Symposium on Foundations of Software Engineering (FSE’17). Google ScholarDigital Library
- Mark Gabel and Zhendong Su. 2008. Javert: Fully automatic mining of general temporal properties from dynamic traces. In Proceedings of the International Symposium on Foundations of Software Engineering (FSE’08). Google ScholarDigital Library
- Mark Gabel and Zhendong Su. 2010. A study of the uniqueness of source code. In Proceedings of the International Symposium on Foundations of Software Engineering (FSE’10). Google ScholarDigital Library
- Rosalva E. Gallardo-Valencia and Susan Elliott Sim. 2009. Internet-scale code search. In Proceedings of the 2009 ICSE Workshop on Search-Driven Development-Users, Infrastructure, Tools and Evaluation. Google ScholarDigital Library
- Alexander L. Gaunt, Marc Brockschmidt, Rishabh Singh, Nate Kushman, Pushmeet Kohli, Jonathan Taylor, and Daniel Tarlow. 2016. TerpreT: A probabilistic programming language for program induction. arXiv Preprint arXiv:1608.04428 (2016).Google Scholar
- Spandana Gella, Mirella Lapata, and Frank Keller. 2016. Unsupervised visual sense disambiguation for verbs using multimodal embeddings. In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT’16).Google ScholarCross Ref
- Elena L. Glassman, Jeremy Scott, Rishabh Singh, Philip J. Guo, and Robert C. Miller. 2015. OverCode: Visualizing variation in student solutions to programming problems at scale. ACM Transactions on Computer-Human Interaction (TOCHI) 22, 2 (2015), 7 pages. Google ScholarDigital Library
- Ian Goodfellow, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning. MIT Press. Google ScholarDigital Library
- Andrew D. Gordon, Thomas A. Henzinger, Aditya V. Nori, and Sriram K. Rajamani. 2014. Probabilistic programming. In Proceedings of the International Conference on Software Engineering (ICSE’14).Google Scholar
- Orlena Gotel, Jane Cleland-Huang, Jane Huffman Hayes, Andrea Zisman, Alexander Egyed, Paul Grünbacher, Alex Dekhtyar, Giuliano Antoniol, Jonathan Maletic, and Patrick Mäder. 2012. Traceability fundamentals. In Software and Systems Traceability. Springer, 3--22.Google Scholar
- Alex Graves, Greg Wayne, and Ivo Danihelka. 2014. Neural Turing machines. arXiv Preprint arXiv:1410.5401 (2014).Google Scholar
- Xiaodong Gu, Hongyu Zhang, Dongmei Zhang, and Sunghun Kim. 2016. Deep API learning. In Proceedings of the International Symposium on Foundations of Software Engineering (FSE’16). Google ScholarDigital Library
- Sumit Gulwani and Mark Marron. 2014. NLyze: Interactive programming by natural language for spreadsheet data analysis and manipulation. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. Google ScholarDigital Library
- Sumit Gulwani, Oleksandr Polozov, Rishabh Singh, and others. 2017. Program synthesis. In Foundations and Trends® in Programming Languages 4, 1--2 (2017), 1--119.Google ScholarCross Ref
- Jin Guo, Jinghui Cheng, and Jane Cleland-Huang. 2017. Semantically enhanced software traceability using deep learning techniques. In Proceedings of the International Conference on Software Engineering (ICSE’17). Google ScholarDigital Library
- Rahul Gupta, Aditya Kanade, and Shirish Shevade. 2018. Deep reinforcement learning for programming language correction. arXiv Preprint arXiv:1801.10467 (2018).Google Scholar
- Rahul Gupta, Soham Pal, Aditya Kanade, and Shirish Shevade. 2017. DeepFix: Fixing common C language errors by deep learning. In Proceedings of the Conference of Artificial Intelligence (AAAI’17).Google Scholar
- Tihomir Gvero and Viktor Kuncak. 2015. Synthesizing java expressions from free-form queries. In Proceedings of the Conference on Object-Oriented Programming, Systems, Languages 8 Applications (OOPSLA’15). Google ScholarDigital Library
- Alon Halevy, Peter Norvig, and Fernando Pereira. 2009. The unreasonable effectiveness of data. IEEE Intelligent Systems 24, 2 (2009), 8--12. Google ScholarDigital Library
- Vincent J. Hellendoorn and Premkumar Devanbu. 2017. Are deep neural networks the best choice for modeling source code? In Proceedings of the International Symposium on Foundations of Software Engineering (FSE’17). Google ScholarDigital Library
- Vincent J. Hellendoorn, Premkumar T. Devanbu, and Alberto Bacchelli. 2015. Will they like this?: Evaluating code contributions with language models. In Proceedings of the Working Conference on Mining Software Repositories (MSR’15). Google ScholarDigital Library
- Felix Hill, KyungHyun Cho, Anna Korhonen, and Yoshua Bengio. 2016. Learning to understand phrases by embedding the dictionary. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL’16).Google ScholarCross Ref
- Abram Hindle, Earl T. Barr, Mark Gabel, Zhendong Su, and Premkumar Devanbu. 2016. On the naturalness of software. Communications of the ACM 59, 5 (2016), 122--131. Google ScholarDigital Library
- Abram Hindle, Earl T. Barr, Zhendong Su, Mark Gabel, and Premkumar Devanbu. 2012. On the naturalness of software. In Proceedings of the International Conference on Software Engineering (ICSE’12). Google ScholarDigital Library
- G. E. Hinton, J. L. McClelland, and D. E. Rumelhart. 1986. Distributed representations. In Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1. MIT Press, 77--109. Google ScholarDigital Library
- C. A. R. Hoare. 1969. An axiomatic basis for computer programming. Commun. ACM 12, 10 (Oct. 1969), 576--580. Google ScholarDigital Library
- Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735--1780. Google ScholarDigital Library
- Reid Holmes, Robert J. Walker, and Gail C. Murphy. 2005. Strathcona example recommendation tool. In ACM SIGSOFT Software Engineering Notes 30, 5 (2005), 237--240. Google ScholarDigital Library
- Chun-Hung Hsiao, Michael Cafarella, and Satish Narayanasamy. 2014. Using web corpus statistics for program analysis. In ACM SIGPLAN Notices 49, 10 (2014), 49--65. Google ScholarDigital Library
- Xing Hu, Yuhan Wei, Ge Li, and Zhi Jin. 2017. CodeSum: Translate program language to natural language. arXiv Preprint arXiv:1708.01837 (2017).Google Scholar
- Andrew Hunt and David Thomas. 2000. The Pragmatic Programmer: From Journeyman to Master. Addison-Wesley Professional. Google ScholarDigital Library
- Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, and Luke Zettlemoyer. 2016. Summarizing source code using a neural attention model. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL’16).Google ScholarCross Ref
- Siyuan Jiang, Ameer Armaly, and Collin McMillan. 2017. Automatically generating commit messages from diffs using neural machine translation. In Proceedings of the International Conference on Automated Software Engineering (ASE’17). Google ScholarDigital Library
- Daniel D. Johnson. 2016. Learning graphical state transitions. In Proceedings of the International Conference on Learning Representations (ICLR’16).Google Scholar
- Dan Jurafsky. 2000. Speech 8 Language Processing (3rd. ed.). Pearson Education.Google Scholar
- René Just, Darioush Jalali, and Michael D. Ernst. 2014. Defects4J: A database of existing faults to enable controlled testing studies for Java programs. In Proceedings of the International Symposium on Software Testing and Analysis (ISSTA’14). Google ScholarDigital Library
- Neel Kant. 2018. Recent advances in neural program synthesis. arXiv Preprint arXiv:1802.02353 (2018).Google Scholar
- Svetoslav Karaivanov, Veselin Raychev, and Martin Vechev. 2014. Phrase-based statistical translation of programming languages. In Proceedings of the 2014 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming 8 Software. ACM, 173--184. Google ScholarDigital Library
- Andrej Karpathy, Justin Johnson, and Fei-Fei Li. 2015. Visualizing and understanding recurrent networks. arXiv Preprint arXiv:1506.02078 (2015).Google Scholar
- Reinhard Kneser and Hermann Ney. 1995. Improved backing-off for m-gram language modeling. In Procdedings of the 1995 International Conference on Acoustics, Speech, and Signal Processing (ICASSP’95). 1 (1995), 181--184.Google ScholarCross Ref
- Donald Ervin Knuth. 1984. Literate programming. The Computer Journal 27, 2 (1984), 97--111. Google ScholarDigital Library
- Ugur Koc, Parsa Saadatpanah, Jeffrey S. Foster, and Adam A. Porter. 2017. Learning a classifier for false positive error reports emitted by static code analysis tools. In Proceedings of the 1st ACM SIGPLAN International Workshop on Machine Learning and Programming Languages. Google ScholarDigital Library
- Rainer Koschke. 2007. Survey of research on software clones. In Dagstuhl Seminar Proceedings. Schloss Dagstuhl-Leibniz-Zentrum für Informatik.Google Scholar
- Ted Kremenek, Andrew Y. Ng, and Dawson R. Engler. 2007. A factor graph model for software bug finding. In Proceedings of the International Joint Conference on Artifical intelligence (IJCAI’07). Google ScholarDigital Library
- Roland Kuhn and Renato De Mori. 1990. A cache-based natural language model for speech recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 12, 6 (1990), 570--583. Google ScholarDigital Library
- Nate Kushman and Regina Barzilay. 2013. Using semantic unification to generate regular expressions from natural language. In Proceedings of Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT’13).Google Scholar
- Tessa Lau. 2001. Programming by Demonstration: A Machine Learning Approach. Ph.D. Dissertation. University of Washington. Google ScholarDigital Library
- Quoc V. Le and Tomas Mikolov. 2014. Distributed representations of sentences and documents. In Proceedings of the International Conference on Machine Learning (ICML’14). Google ScholarDigital Library
- Tien-Duy B. Le, Mario Linares-Vásquez, David Lo, and Denys Poshyvanyk. 2015. Rclinker: Automated linking of issue reports and commits leveraging rich contextual information. In Proceedings of the International Conference on Program Comprehension (ICPC’15). Google ScholarDigital Library
- Dor Levy and Lior Wolf. 2017. Learning to align the source code to the compiled object code. In Proceedings of the International Conference on Machine Learning (ICML’17).Google Scholar
- Yujia Li, Daniel Tarlow, Marc Brockschmidt, and Richard Zemel. 2016. Gated graph sequence neural networks. In Proceedings of the International Conference on Learning Representations (ICLR’16).Google Scholar
- Percy Liang, Michael I. Jordan, and Dan Klein. 2010. Learning programs: A hierarchical bayesian approach. In Proceedings of the International Conference on Machine Learning (ICML’10). Google ScholarDigital Library
- Ben Liblit, Mayur Naik, Alice X. Zheng, Alex Aiken, and Michael I. Jordan. 2005. Scalable statistical bug isolation. In ACM SIGPLAN Notices 40, 6 (2005), 15--26. Google ScholarDigital Library
- Xi Victoria Lin, Chenglong Wang, Deric Pang, Kevin Vu, Luke Zettlemoyer, and Michael D. Ernst. 2017. Program Synthesis from Natural Language using Recurrent Neural Networks. Technical Report UW-CSE-17-03-01. University of Washington Department of Computer Science and Engineering, Seattle, WA.Google Scholar
- Xi Victoria Lin, Chenglong Wang, Luke Zettlemoyer, and Michael D. Ernst. 2018. NL2Bash: A corpus and semantic parser for natural language interface to the linux operating system. In Proceedings of the International Conference on Language Resources and Evaluation.Google Scholar
- Wang Ling, Edward Grefenstette, Karl Moritz Hermann, Tomas Kocisky, Andrew Senior, Fumin Wang, and Phil Blunsom. 2016. Latent predictor networks for code generation. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL’16).Google ScholarCross Ref
- Han Liu. 2016. Towards better program obfuscation: Optimization via language models. In Proceedings of the 38th International Conference on Software Engineering Companion. Google ScholarDigital Library
- Benjamin Livshits, Aditya V. Nori, Sriram K. Rajamani, and Anindya Banerjee. 2009. Merlin: Specification inference for explicit information flow problems. In Proceedings of the Symposium on Programming Language Design and Implementation (PLDI’09). Google ScholarDigital Library
- Sarah M. Loos, Geoffrey Irving, Christian Szegedy, and Cezary Kaliszyk. 2017. Deep network guided proof search. In Proceedings of the International Conference on Logic for Programming Artificial Intelligence and Reasoning (LPAR’17).Google Scholar
- Pablo Loyola, Edison Marrese-Taylor, and Yutaka Matsuo. 2017. A neural architecture for generating natural language descriptions from source code changes. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 2 (2017), 287--292.Google ScholarCross Ref
- Yanxin Lu, Swarat Chaudhuri, Chris Jermaine, and David Melski. 2017. Data-Driven program completion. arXiv Preprint arXiv:1705.09042 (2017).Google Scholar
- Chris Maddison and Daniel Tarlow. 2014. Structured generative models of natural source code. In Proceedings of the International Conference on Machine Learning (ICML’14). Google ScholarDigital Library
- Ravi Mangal, Xin Zhang, Aditya V. Nori, and Mayur Naik. 2015. A user-guided approach to program analysis. In Proceedings of the International Symposium on Foundations of Software Engineering (FSE’15). Google ScholarDigital Library
- Collin Mcmillan, Denys Poshyvanyk, Mark Grechanik, Qing Xie, and Chen Fu. 2013. Portfolio: Searching for relevant functions and their usages in millions of lines of code. ACM Transactions on Software Engineering and Methodology (TOSEM) 22, 4 (2013), 37 pages. Google ScholarDigital Library
- Aditya Menon, Omer Tamuz, Sumit Gulwani, Butler Lampson, and Adam Kalai. 2013. A machine learning framework for programming by example. In Proceedings of the International Conference on Machine Learning (ICML’13). Google ScholarDigital Library
- Kim Mens and Angela Lozano. 2014. Source code-based recommendation systems. In Recommendation Systems in Software Engineering. Springer, 93--130.Google Scholar
- Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv Preprint arXiv:1301.3781 (2013).Google Scholar
- Lili Mou, Ge Li, Lu Zhang, Tao Wang, and Zhi Jin. 2016. Convolutional neural networks over tree structures for programming language processing. In Proceedings of the Conference of Artificial Intelligence (AAAI’16). Google ScholarDigital Library
- Dana Movshovitz-Attias and William W. Cohen. 2013. Natural language models for predicting programming comments. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL’13).Google Scholar
- Dana Movshovitz-Attias and William W. Cohen. 2015. KB-LDA: Jointly learning a knowledge base of hierarchy, relations, and facts. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL’15).Google Scholar
- Vijayaraghavan Murali, Letao Qi, Swarat Chaudhuri, and Chris Jermaine. 2018. Neural sketch learning for conditional program generation. In Proceedings of the International Conference on Learning Representations (ICLR).Google Scholar
- Vijayaraghavan Murali, Swarat Chaudhuri, and Chris Jermaine. 2017. Bayesian specification learning for finding API usage errors. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering. ACM, 151--162. Google ScholarDigital Library
- Arvind Neelakantan, Quoc V. Le, and Ilya Sutskever. 2015. Neural programmer: Inducing latent programs with gradient descent. In Proceedings of the International Conference on Learning Representations (ICLR’15).Google Scholar
- Graham Neubig. 2016. Survey of methods to generate natural language from source code. Retrieved from http://www.languageandcode.org/nlse2015/neubig15nlse-survey.pdf.Google Scholar
- Anh Tuan Nguyen and Tien N. Nguyen. 2015. Graph-based statistical language model for code. In Proceedings of the International Conference on Software Engineering (ICSE’15).Google Scholar
- Anh Tuan Nguyen, Tung Thanh Nguyen, and Tien N. Nguyen. 2013. Lexical statistical machine translation for language migration. In Proceedings of the International Symposium on Foundations of Software Engineering (FSE’13).Google Scholar
- Anh T. Nguyen, Tung Thanh Nguyen, and Tien N. Nguyen. 2015. Divide-and-conquer approach for multi-phase statistical migration for source code. In Proceedings of the International Conference on Automated Software Engineering (ASE’15).Google Scholar
- Trong Duc Nguyen, Anh Tuan Nguyen, and Tien N. Nguyen. 2016. Mapping API elements for code migration with vector representations. In Proceedings of the International Conference on Software Engineering (ICSE’16).Google Scholar
- Trong Duc Nguyen, Anh Tuan Nguyen, Hung Dang Phan, and Tien N. Nguyen. 2017. Exploring API embedding for API usages and applications. In Proceedings of the International Conference on Software Engineering (ICSE’17).Google Scholar
- Tung Thanh Nguyen, Anh Tuan Nguyen, Hoan Anh Nguyen, and Tien N. Nguyen. 2013. A statistical semantic language model for source code. In Proceedings of the Joint Meeting of the European Software Engineering Conference and the Symposium on the Foundations of Software Engineering (ESEC/FSE’13).Google Scholar
- Haoran Niu, Iman Keivanloo, and Ying Zou. 2017. Learning to rank code examples for code search engines. Empirical Software Engineering (ESEM’16) 22, 1 (2017), 259--291. Google ScholarDigital Library
- Yusuke Oda, Hiroyuki Fudaba, Graham Neubig, Hideaki Hata, Sakriani Sakti, Tomoki Toda, and Satoshi Nakamura. 2015. Learning to generate pseudo-code from source code using statistical machine translation. In Proceedings of the International Conference on Automated Software Engineering (ASE’15).Google ScholarDigital Library
- Hakjoo Oh, Hongseok Yang, and Kwangkeun Yi. 2015. Learning a strategy for adapting a program analysis via bayesian optimisation. In Proceedings of the Conference on Object-Oriented Programming, Systems, Languages 8 Applications (OOPSLA’15). Google ScholarDigital Library
- Cyrus Omar. 2013. Structured statistical syntax tree prediction. In Proceedings of the Conference on Systems, Programming, Languages and Applications: Software for Humanity (SPLASH’13). Google ScholarDigital Library
- Cyrus Omar, Ian Voysey, Michael Hilton, Joshua Sunshine, Claire Le Goues, Jonathan Aldrich, and Matthew A. Hammer. 2017. Toward semantic foundations for program editors. arXiv preprint arXiv:1703.08694.Google Scholar
- Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL’02). Google ScholarDigital Library
- Emilio Parisotto, Abdel-rahman Mohamed, Rishabh Singh, Lihong Li, Dengyong Zhou, and Pushmeet Kohli. 2017. Neuro-symbolic program synthesis. In Proceedings of the International Conference on Learning Representations (ICLR’17).Google Scholar
- Terence Parr and Jurgen J. Vinju. 2016. Towards a universal code formatter through machine learning. In Proceedings of the International Conference on Software Language Engineering (SLE’16). Google ScholarDigital Library
- Jibesh Patra and Michael Pradel. 2016. Learning to Fuzz: Application-Independent Fuzz Testing with Probabilistic, Generative Models of Input Data. TU Darmstadt, Department of Computer Science, TUD-CS-2016-14664.Google Scholar
- Hung Viet Pham, Phong Minh Vu, Tung Thanh Nguyen, and others. 2016. Learning API usages from bytecode: A statistical approach. In Proceedings of the International Conference on Software Engineering (ICSE’16).Google Scholar
- Chris Piech, Jonathan Huang, Andy Nguyen, Mike Phulsuksombati, Mehran Sahami, and Leonidas J. Guibas. 2015. Learning program embeddings to propagate feedback on student code. In Proceedings of the International Conference on Machine Learning (ICML’15). Google ScholarDigital Library
- Matt Post and Daniel Gildea. 2009. Bayesian learning of a tree substitution grammar. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL’09). Google ScholarDigital Library
- Michael Pradel and Koushik Sen. 2017. Deep learning to find bugs. TU Darmstadt, Department of Computer Science.Google Scholar
- Sebastian Proksch, Sven Amann, Sarah Nadi, and Mira Mezini. 2016. Evaluating the evaluations of code recommender systems: A reality check. In Proceedings of the International Conference on Automated Software Engineering (ASE’16). Google ScholarDigital Library
- Sebastian Proksch, Johannes Lerch, and Mira Mezini. 2015. Intelligent code completion with bayesian networks. ACM Transactions on Software Engineering and Methodology (TOSEM) 25, 1 (2015), 3. Google ScholarDigital Library
- Yewen Pu, Karthik Narasimhan, Armando Solar-Lezama, and Regina Barzilay. 2016. sk_p: A neural program corrector for MOOCs. In Proceedings of the Conference on Systems, Programming, Languages and Applications: Software for Humanity (SPLASH’16). Google ScholarDigital Library
- Chris Quirk, Raymond Mooney, and Michel Galley. 2015. Language to code: Learning semantic parsers for if-this-then-that recipes. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL’15).Google ScholarCross Ref
- Maxim Rabinovich, Mitchell Stern, and Dan Klein. 2017. Abstract syntax networks for code generation and semantic parsing. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL’17).Google ScholarCross Ref
- Baishakhi Ray, Vincent Hellendoorn, Saheel Godhane, Zhaopeng Tu, Alberto Bacchelli, and Premkumar Devanbu. 2016. On the naturalness of buggy code. In Proceedings of the International Conference on Software Engineering (ICSE’16). Google ScholarDigital Library
- Veselin Raychev, Pavol Bielik, Martin Vechev, and Andreas Krause. 2016. Learning programs from noisy data. In Proceedings of the Symposium on Principles of Programming Languages (POPL’16). Google ScholarDigital Library
- Veselin Raychev, Martin Vechev, and Andreas Krause. 2015. Predicting program properties from “big code.” In Proceedings of the Symposium on Principles of Programming Languages (POPL’15). Google ScholarDigital Library
- Veselin Raychev, Martin Vechev, and Eran Yahav. 2014. Code completion with statistical language models. In Proceedings of the Symposium on Programming Language Design and Implementation (PLDI’14). Google ScholarDigital Library
- Scott Reed and Nando de Freitas. 2016. Neural programmer-interpreters. In Proceedings of the International Conference on Learning Representations (ICLR’16).Google Scholar
- Sebastian Riedel, Matko Bosnjak, and Tim Rocktäschel. 2017. Programming with a differentiable forth interpreter. In Proceedings of the International Conference on Machine Learning (ICML’17).Google Scholar
- Martin Robillard, Robert Walker, and Thomas Zimmermann. 2010. Recommendation systems for software engineering. IEEE Software 27, 4 (2010), 80--86. Google ScholarDigital Library
- Martin P. Robillard, Walid Maalej, Robert J. Walker, and Thomas Zimmermann. 2014. Recommendation Systems in Software Engineering. Springer. Google ScholarDigital Library
- Tim Rocktäschel and Sebastian Riedel. 2017. End-to-end differentiable proving. In Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS’17).Google Scholar
- Caitlin Sadowski, Kathryn T. Stolee, and Sebastian Elbaum. 2015. How developers search for code: A case study. In Proceedings of the International Symposium on Foundations of Software Engineering (FSE’15). Google ScholarDigital Library
- Juliana Saraiva, Christian Bird, and Thomas Zimmermann. 2015. Products, developers, and milestones: How should I build my N-gram language model. In Proceedings of the International Symposium on Foundations of Software Engineering (FSE’15). Google ScholarDigital Library
- Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016. Neural machine translation of rare words with subword units. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL’16).Google ScholarCross Ref
- Abhishek Sharma, Yuan Tian, and David Lo. 2015. NIRMAL: Automatic identification of software relevant tweets leveraging language model. In Proceedings of the International Conference on Software Analysis, Evolution, and Reengineering (SANER’15).Google ScholarCross Ref
- Rishabh Singh and Sumit Gulwani. 2015. Predicting a correct program in programming by example. In Proceedings of the International Conference on Computer Aided Verification.Google ScholarCross Ref
- Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS’14). Google ScholarDigital Library
- Suresh Thummalapenta and Tao Xie. 2007. Parseweb: A programmer assistant for reusing open source code on the web. In Proceedings of the International Conference on Automated Software Engineering (ASE’07). Google ScholarDigital Library
- Christoph Treude and Martin P. Robillard. 2016. Augmenting API documentation with insights from stack overflow. In Proceedings of the International Conference on Software Engineering (ICSE’16). Google ScholarDigital Library
- Zhaopeng Tu, Zhendong Su, and Premkumar Devanbu. 2014. On the localness of software. In Proceedings of the International Symposium on Foundations of Software Engineering (FSE’14). Google ScholarDigital Library
- Bogdan Vasilescu, Casey Casalnuovo, and Premkumar Devanbu. 2017. Recovering clear, natural identifiers from obfuscated JS names. In Proceedings of the International Symposium on Foundations of Software Engineering (FSE’17). Google ScholarDigital Library
- Lisa Wang, Angela Sy, Larry Liu, and Chris Piech. 2017. Deep knowledge tracing on programming exercises. In Proceedings of the Conference on Learning @ Scale. Google ScholarDigital Library
- Song Wang, Devin Chollak, Dana Movshovitz-Attias, and Lin Tan. 2016. Bugram: Bug detection with n-gram language models. In Proceedings of the International Conference on Automated Software Engineering (ASE’16). Google ScholarDigital Library
- Song Wang, Taiyue Liu, and Lin Tan. 2016. Automatically learning semantic features for defect prediction. In Proceedings of the International Conference on Software Engineering (ICSE’16). Google ScholarDigital Library
- Xin Wang, Chang Liu, Richard Shin, Joseph E. Gonzalez, and Dawn Song. 2016. Neural Code Completion. Retrieved from https://openreview.net/pdf?id=rJbPBt9lg.Google Scholar
- Andrzej Wasylkowski, Andreas Zeller, and Christian Lindig. 2007. Detecting object usage anomalies. In Proceedings of the Joint Meeting of the European Software Engineering Conference and the Symposium on the Foundations of Software Engineering (ESEC/FSE’07). Google ScholarDigital Library
- Martin White, Michele Tufano, Christopher Vendome, and Denys Poshyvanyk. 2016. Deep learning code fragments for code clone detection. In Proceedings of the International Conference on Automated Software Engineering (ASE’16). Google ScholarDigital Library
- Martin White, Christopher Vendome, Mario Linares-Vásquez, and Denys Poshyvanyk. 2015. Toward deep learning software repositories. In Proceedings of the Working Conference on Mining Software Repositories (MSR’15). Google ScholarDigital Library
- Chadd C. Williams and Jeffrey K. Hollingsworth. 2005. Automatic mining of source code repositories to improve bug finding techniques. IEEE Transactions on Software Engineering 31, 6 (2005), 466--480. Google ScholarDigital Library
- Ian H. Witten, Eibe Frank, Mark A. Hall, and Christopher J. Pal. 2016. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann. Google ScholarDigital Library
- W. Eric Wong, Ruizhi Gao, Yihao Li, Rui Abreu, and Franz Wotawa. 2016. A survey on software fault localization. IEEE Transactions on Software Engineering 42, 8 (2016), 707--740. Google ScholarDigital Library
- Tao Xie and Jian Pei. 2006. MAPO: Mining API usages from open source repositories. In Proceedings of the Working Conference on Mining Software Repositories (MSR’06). Google ScholarDigital Library
- Chang Xu, Dacheng Tao, and Chao Xu. 2013. A survey on multi-view learning. arXiv Preprint arXiv:1304.5634 (2013).Google Scholar
- Shir Yadid and Eran Yahav. 2016. Extracting code from programming tutorial videos. In Proceedings of the 2016 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software. Google ScholarDigital Library
- Eran Yahav. 2015. Programming with “big code.” In Asian Symposium on Programming Languages and Systems. Springer, 3--8.Google ScholarCross Ref
- Pengcheng Yin and Graham Neubig. 2017. A syntactic neural model for general-purpose code generation. Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL’17).Google ScholarCross Ref
- Wojciech Zaremba and Ilya Sutskever. 2014. Learning to execute. arXiv Preprint arXiv:1410.4615 (2014).Google Scholar
- Alice X. Zheng, Michael I. Jordan, Ben Liblit, and Alex Aiken. 2003. Statistical debugging of sampled programs. In Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS’03). Google ScholarDigital Library
- Alice X. Zheng, Michael I. Jordan, Ben Liblit, Mayur Naik, and Alex Aiken. 2006. Statistical debugging: Simultaneous identification of multiple bugs. In Proceedings of the International Conference on Machine Learning (ICML’06). Google ScholarDigital Library
- Victor Zhong, Caiming Xiong, and Richard Socher. 2017. Seq2SQL: Generating structured queries from natural language using reinforcement learning. arXiv Preprint arXiv:1709.00103 (2017).Google Scholar
- Thomas Zimmermann, Andreas Zeller, Peter Weissgerber, and Stephan Diehl. 2005. Mining version histories to guide software changes. IEEE Transactions on Software Engineering 31, 6 (2005), 429--445. Google ScholarDigital Library
Index Terms
- A Survey of Machine Learning for Big Code and Naturalness
Recommendations
The adverse effects of code duplication in machine learning models of code
Onward! 2019: Proceedings of the 2019 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and SoftwareThe field of big code relies on mining large corpora of code to perform some learning task towards creating better tools for software engineers. A significant threat to this approach was recently identified by Lopes et al. (2017) who found a large ...
Machine learning on big data
Machine learning (ML) is continuously unleashing its power in a wide range of applications. It has been pushed to the forefront in recent years partly owing to the advent of big data. ML algorithms have never been better promised while challenged by big ...
Severity Classification of Code Smells Using Machine-Learning Methods
AbstractCode smell detection can be very useful for minimizing maintenance costs and improving software quality. Code smells help developers/programmers, researchers to subjectively interpret design defects in different ways. Code smells instances can ...
Comments