
An Empirical Study of the Impact of Data Splitting Decisions on the Performance of AIOps Solutions

Published: 23 July 2021

Abstract

AIOps (Artificial Intelligence for IT Operations) leverages machine learning models to help practitioners handle the massive data produced during the operation of large-scale systems. However, due to the nature of operation data, AIOps modeling faces several data-splitting challenges, such as imbalanced data, data leakage, and concept drift. In this work, we study the data leakage and concept drift challenges in the context of AIOps and evaluate the impact of different modeling decisions on these challenges. Specifically, we perform a case study on two commonly studied AIOps applications: (1) predicting job failures based on trace data from a large-scale cluster environment and (2) predicting disk failures based on disk monitoring data from a large-scale cloud storage environment. First, we observe that the data leakage issue exists in AIOps solutions. Using a time-based splitting of training and validation datasets can significantly reduce such data leakage, making it more appropriate than random splitting in the AIOps context. Second, we show that AIOps solutions suffer from concept drift. Periodically updating AIOps models can help mitigate the impact of such concept drift, while the performance benefit and the modeling cost of increasing the update frequency depend largely on the application data and the models used. Our findings encourage future studies and practices on developing AIOps solutions to pay attention to their data-splitting decisions in order to handle the data leakage and concept drift challenges.
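The two splitting strategies contrasted in the abstract can be illustrated with a minimal, self-contained sketch. The record schema, window size, and the toy majority-label "model" below are illustrative assumptions for demonstration only, not the paper's actual pipeline or data.

```python
import random

# Hypothetical timestamped operation records: (timestamp, features, label).
random.seed(0)
records = [(t, {"load": random.random()}, random.random() < 0.3)
           for t in range(1000)]

# Random splitting: training samples may postdate validation samples,
# letting information from the "future" leak into training.
shuffled = random.sample(records, len(records))
rand_train, rand_valid = shuffled[:800], shuffled[800:]

# Time-based splitting: train strictly on earlier data, validate on later.
records.sort(key=lambda r: r[0])
time_train, time_valid = records[:800], records[800:]
assert max(r[0] for r in time_train) < min(r[0] for r in time_valid)

# Periodic updating: retrain on a sliding window of recent periods to
# mitigate concept drift. The "model" here is just the majority label.
def train(window):
    return sum(r[2] for r in window) > len(window) / 2

period = 100
for start in range(0, len(records) - period, period):
    window = records[start:start + period]      # most recent period
    model = train(window)                        # refreshed model
    test = records[start + period:start + 2 * period]
    # ... evaluate `model` on `test` before the next update ...
```

With random splitting, the timestamp assertion above would generally fail: some training records would fall after the earliest validation record, which is exactly the leakage the time-based split avoids.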



Published in

ACM Transactions on Software Engineering and Methodology, Volume 30, Issue 4
Continuous Special Section: AI and SE
October 2021, 613 pages
ISSN: 1049-331X; EISSN: 1557-7392
DOI: 10.1145/3461694
Editor: Mauro Pezzè

          Copyright © 2021 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 23 July 2021
          • Accepted: 1 January 2021
          • Revised: 1 December 2020
          • Received: 1 June 2020
Published in TOSEM Volume 30, Issue 4


          Qualifiers

          • research-article
          • Research
          • Refereed
