Soft Computing

Published in:

09-01-2021 | Methodologies and Application

Recommending pull request reviewers based on code changes

Authors: Xin Ye, Yongjie Zheng, Wajdi Aljedaani, Mohamed Wiem Mkaouer

Published in: Soft Computing | Issue 7/2021

Abstract

Pull-based development supports collaborative distributed development and enables developers to collaborate on projects hosted on GitHub. A developer who wants to contribute to a project forks the repository, makes modifications in the fork, and sends a pull request asking the development team to merge the code changes into the official repository. When the development team receives a pull request, its members review the changes and decide whether to accept them. However, efficiently finding suitable pull request reviewers is a challenge. In this paper, we propose a multi-instance-based deep neural network model to recommend reviewers for pull requests. Given a pull request, our model extracts three features: the pull request title, the commit message, and the code change. The proposed model extracts the three features automatically from the code changes of every commit in the pull request. The features of different commits are then merged to predict the likelihood that a reviewer candidate is the appropriate reviewer. We use CNN and LSTM networks to learn the features, since the pull request title and commit message have a different structure from the code change, which is written in a programming language. To test the effectiveness of our model, we performed a set of experiments using 43,986 pull requests extracted from 12 open-source projects. We compare our model with two baseline approaches, CoreDevRec and Majority Classes. Experiments demonstrate that our model outperforms both state-of-the-art baselines. For instance, for the TensorFlow project, our model's accuracy in determining the appropriate reviewers is 50.80%, 74.70%, and 84.04% in Top-1, Top-3, and Top-5 recommendation, respectively.
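
The Top-1, Top-3, and Top-5 figures reported above can be read as Top-k accuracy. The abstract does not define the metric, so the following Python sketch rests on an assumption: a pull request counts as a hit when at least one of its actual reviewers appears among the k highest-scoring candidates. The function name `top_k_accuracy`, the reviewer names, and the scores are hypothetical and serve only as illustration; they are not taken from the paper.

```python
from typing import Dict, Sequence, Set


def top_k_accuracy(
    scores: Sequence[Dict[str, float]],
    actual: Sequence[Set[str]],
    k: int,
) -> float:
    """Fraction of pull requests whose top-k candidates include an actual reviewer.

    scores[i] maps each reviewer candidate to a predicted likelihood for
    pull request i; actual[i] is the set of people who really reviewed it.
    Assumption: one matching reviewer in the top k is enough for a hit.
    """
    if len(scores) != len(actual):
        raise ValueError("scores and actual must have the same length")
    if not scores:
        return 0.0

    hits = 0
    for candidate_scores, reviewers in zip(scores, actual):
        # Rank candidates from most to least likely.
        ranked = sorted(candidate_scores, key=candidate_scores.get, reverse=True)
        if reviewers.intersection(ranked[:k]):
            hits += 1
    return hits / len(scores)


# Hypothetical predictions for four pull requests and three candidates.
scores = [
    {"alice": 0.7, "bob": 0.2, "carol": 0.1},
    {"alice": 0.5, "bob": 0.3, "carol": 0.2},
    {"alice": 0.1, "bob": 0.6, "carol": 0.3},
    {"alice": 0.2, "bob": 0.1, "carol": 0.7},
]
actual = [{"alice"}, {"carol"}, {"carol"}, {"bob"}]

for k in (1, 2, 3):
    print(f"Top-{k} accuracy: {top_k_accuracy(scores, actual, k):.2f}")
```

Under this reading, a Top-5 accuracy of 84.04% would mean that for about 84 of every 100 pull requests, an actual reviewer is among the five candidates the model ranks highest.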

https://github.com/tensorflow/tensorflow/pull/29912.

https://github.com/tensorflow/tensorflow/pull/29561.

https://github.com/tensorflow/tensorflow/commit/f1ffa02.

https://dumps.wikimedia.org/enwiki/.

https://developer.github.com/v3/.

Balachandran V (2013) Reducing human effort and improving quality in peer code reviews using automatic static analysis and reviewer recommendation. In: 2013 35th international conference on software engineering (ICSE), IEEE, pp 931–940

Bissyandé TF, Lo D, Jiang L, Réveillere L, Klein J, Le Traon Y (2013) Got issues? Who cares about it? A large scale investigation of issue trackers from github. In: 2013 IEEE 24th international symposium on software reliability engineering (ISSRE), IEEE, pp 188–197

Goodfellow I, Bengio Y, Courville A (2016) Deep learning, vol 1. MIT Press, Cambridge

Gousios G, Pinzger M, Deursen Av (2014) An exploratory study of the pull-based software development model. In: Proceedings of the 36th international conference on software engineering, pp 345–355

Gousios G, Zaidman A, Storey MA, Van Deursen A (2015) Work practices and challenges in pull-based development: the integrator’s perspective. In: 2015 IEEE/ACM 37th IEEE international conference on software engineering, IEEE, vol 1, pp 358–368

Gu X, Zhang H, Zhang D, Kim S (2016) Deep api learning. In: Proceedings of the 2016 24th ACM SIGSOFT international symposium on foundations of software engineering, pp 631–642

Hoang T, Dam HK, Kamei Y, Lo D, Ubayashi N (2019) Deepjit: an end-to-end deep learning framework for just-in-time defect prediction. In: 2019 IEEE/ACM 16th international conference on mining software repositories (MSR), IEEE, pp 34–45

Huo X, Li M, Zhou ZH, et al (2016) Learning unified features from natural and programming languages for locating buggy source code. In: IJCAI, pp 1606–1612

Jiang J, He JH, Chen XY (2015) Coredevrec: automatic core member recommendation for contribution evaluation. J Comput Sci Technol 30(5):998–1016

Jiang J, Yang Y, He J, Blanc X, Zhang L (2017) Who should comment on this pull request? analyzing attributes for more accurate commenter recommendation in pull-based development. Inf Softw Technol 84:48–62

Kingma DP, Ba J (2014) Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980

Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: International conference on machine learning, pp 1188–1196

Lee JB, Ihara A, Monden A, Matsumoto Ki (2013) Patch reviewer recommendation in oss projects. In: APSEC (2), pp 1–6

Levy O, Goldberg Y (2014) Neural word embedding as implicit matrix factorization. In: Advances in neural information processing systems, pp 2177–2185

Li HY, Shi ST, Thung F, Huo X, Xu B, Li M, Lo D (2019) Deepreview: automatic code review using deep multi-instance learning. In: Pacific-Asia conference on knowledge discovery and data mining, Springer, pp 318–330

de Lima Júnior ML, Soares DM, Plastino A, Murta L (2015) Developers assignment for analyzing pull requests. In: Proceedings of the 30th annual ACM symposium on applied computing, pp 1567–1572

de Lima Júnior ML, Soares DM, Plastino A, Murta L (2018) Automatic assignment of integrators to pull requests: the importance of selecting appropriate attributes. J Syst Softw 144:181–196

Manning CD, Schütze H, Raghavan P (2008) Introduction to information retrieval. Cambridge University Press, Cambridge

Nair V, Hinton GE (2010) Rectified linear units improve restricted boltzmann machines. In: ICML

Pagliardini M, Gupta P, Jaggi M (2017) Unsupervised learning of sentence embeddings using compositional n-gram features. arXiv preprint arXiv:1703.02507

Rahman MM, Roy CK, Collins JA (2016) Correct: code reviewer recommendation in github based on cross-project and technology experience. In: Proceedings of the 38th international conference on software engineering companion, pp 222–231

Soares DM, de Lima Júnior ML, Plastino A, Murta L (2018) What factors influence the reviewer assignment to pull requests? Inf Softw Technol 98:32–43

Thongtanunam P, Tantithamthavorn C, Kula RG, Yoshida N, Iida H, Matsumoto Ki (2015) Who should review my code? a file location-based code-reviewer recommendation approach for modern code review. In: 2015 IEEE 22nd international conference on software analysis, evolution, and reengineering (SANER), IEEE, pp 141–150

Tsay J, Dabbish L, Herbsleb J (2014) Influence of social and technical factors for evaluating contribution in github. In: Proceedings of the 36th international conference on Software engineering, pp 356–366

Voorhees EM et al (1999) The trec-8 question answering track report. Trec 99:77–82

Willett P (2006) The porter stemming algorithm: then and now. Program

Xia X, Lo D, Wang X, Yang X (2015) Who should review this change?: Putting text and file location analyses together for more accurate recommendations. In: 2015 IEEE international conference on software maintenance and evolution (ICSME), IEEE, pp 261–270

Yang C, Zhang X, Lb Z, Fan Q, Wang T, Yu Y, Yin G, Hm W (2018) Revrec: a two-layer reviewer recommendation algorithm in pull-based development model. J Central South Univ 25(5):1129–1143

Ye X, Fang F, Wu J, Bunescu R, Liu C (2018) Bug report classification using lstm architecture for more accurate software defect locating. In: 2018 17th IEEE international conference on machine learning and applications (ICMLA), IEEE, pp 1438–1445

Yu Y, Wang H, Yin G, Ling CX (2014a) Reviewer recommender of pull-requests in github. In: 2014 IEEE international conference on software maintenance and evolution, IEEE, pp 609–612

Yu Y, Wang H, Yin G, Ling CX (2014b) Who should review this pull-request: reviewer recommendation to expedite crowd collaboration. In: 2014 21st Asia-Pacific software engineering conference, IEEE, vol 1, pp 335–342

Title: Recommending pull request reviewers based on code changes
Authors: Xin Ye
Yongjie Zheng
Wajdi Aljedaani
Mohamed Wiem Mkaouer
Publication date: 09-01-2021
Publisher: Springer Berlin Heidelberg
Published in: Soft Computing / Issue 7/2021
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI: https://doi.org/10.1007/s00500-020-05559-3
