Published in: Software Quality Journal 3/2021

28 August 2020

Classification of application reviews into software maintenance tasks using data mining techniques

Authors: Assem Al-Hawari, Hassan Najadat, Raed Shatnawi

Abstract

Mobile application reviews are a rich source of information for software engineers: they convey user requirements and technical feedback that can help avoid major programming issues. Previous research has used traditional data mining techniques to classify user reviews into several software maintenance tasks. In this paper, we use associative classification (AC) algorithms and compare the performance of different classifiers on this task. We also propose a new AC approach for review mining (ACRM). Review classification requires preprocessing steps that apply natural language processing and text analysis. We also study the influence of two feature selection techniques (information gain and chi-square) on the classifiers. Association rules give a better understanding of users’ intent because they discover hidden patterns in the words and features related to a maintenance task and present them as class association rules (CARs). To test the classifiers, we used two datasets that classify reviews into four different maintenance tasks. Results show that the highest accuracy was achieved by AC algorithms on both datasets. ACRM has the highest precision, recall, F-score, and accuracy. Feature selection significantly improves the classifiers’ performance.
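
To make the idea of class association rules concrete, the following is a minimal Python sketch of how rules of the form "words → maintenance task" could be mined and used for classification. It is not the ACRM algorithm, whose details the abstract does not give. The toy reviews, the labels "bug" and "feature", the thresholds, and the function names `mine_cars` and `classify` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: (tokens after preprocessing, maintenance-task label).
REVIEWS = [
    ({"app", "crash", "login"}, "bug"),
    ({"crash", "update"}, "bug"),
    ({"crash", "startup"}, "bug"),
    ({"add", "dark", "mode"}, "feature"),
    ({"please", "add", "export"}, "feature"),
    ({"add", "crash"}, "bug"),
]


def mine_cars(reviews, min_support=0.3, min_confidence=0.6, max_len=2):
    """Return rules as (itemset, label, support, confidence), best first."""
    n = len(reviews)
    itemset_counts = Counter()  # reviews containing the itemset
    rule_counts = Counter()     # reviews containing the itemset with this label
    for tokens, label in reviews:
        for k in range(1, max_len + 1):
            for itemset in combinations(sorted(tokens), k):
                itemset_counts[itemset] += 1
                rule_counts[(itemset, label)] += 1

    rules = []
    for (itemset, label), count in rule_counts.items():
        support = count / n                          # P(itemset and label)
        confidence = count / itemset_counts[itemset]  # P(label | itemset)
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, label, support, confidence))

    # Rank by confidence, then support, then shorter itemsets first.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0]), r[0]))
    return rules


def classify(tokens, rules, default="unknown"):
    """Predict the label of the highest-ranked rule whose itemset matches."""
    for itemset, label, _support, _confidence in rules:
        if set(itemset) <= tokens:
            return label
    return default


RULES = mine_cars(REVIEWS)
for itemset, label, support, confidence in RULES:
    print(f"{set(itemset)} -> {label} (support={support:.2f}, confidence={confidence:.2f})")
print(classify({"add", "widget"}, RULES))
```

The sketch enumerates itemsets by brute force, which is only workable on tiny inputs; frequent-pattern miners such as Apriori or FP-growth exist to avoid that cost. Ranking rules by confidence and then support is one common convention in associative classification, chosen here for simplicity.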

Metadata
Title
Classification of application reviews into software maintenance tasks using data mining techniques
Authors
Assem Al-Hawari
Hassan Najadat
Raed Shatnawi
Publication date
28 August 2020
Publisher
Springer US
Published in
Software Quality Journal, Issue 3/2021
Print ISSN: 0963-9314
Electronic ISSN: 1573-1367
DOI
https://doi.org/10.1007/s11219-020-09529-8
