Research Article (Open Access)

Assuring the Machine Learning Lifecycle: Desiderata, Methods, and Challenges

Published: 25 May 2021

Abstract

Machine learning has evolved into an enabling technology for a wide range of highly successful applications. The potential for this success to continue and accelerate has placed machine learning (ML) at the top of research, economic, and political agendas. Such unprecedented interest is fuelled by a vision of ML applicability extending to healthcare, transportation, defence, and other domains of great societal importance. Achieving this vision requires the use of ML in safety-critical applications that demand levels of assurance beyond those needed for current ML applications. Our article provides a comprehensive survey of the state of the art in the assurance of ML, i.e., in the generation of evidence that ML is sufficiently safe for its intended use. The survey covers the methods capable of providing such evidence at different stages of the machine learning lifecycle, i.e., of the complex, iterative process that starts with the collection of the data used to train an ML component for a system, and ends with the deployment of that component within the system. The article begins with a systematic presentation of the ML lifecycle and its stages. We then define assurance desiderata for each stage, review existing methods that contribute to achieving these desiderata, and identify open challenges that require further research.


References

  1. Mahdieh Abbasi, Arezoo Rajabi, Azadeh Sadat Mozafari, Rakesh B. Bobba, and Christian Gagne. 2018. Controlling over-generalization and its effect on adversarial examples generation and detection. arXiv:1808.08282. Retrieved from https://arxiv.org/abs/1808.08282.Google ScholarGoogle Scholar
  2. Amina Adadi and Mohammed Berrada. 2018. Peeking inside the black-box: A survey on Explainable Artificial Intelligence (XAI). IEEE Access 6 (2018), 52138--52160.Google ScholarGoogle ScholarCross RefCross Ref
  3. Ajaya Adhikari, D. M. Tax, Riccardo Satta, and Matthias Fath. 2018. Example and Feature importance-based Explanations for Black-box Machine Learning Models. arXiv:1812.09044. Retrieved from https://arxiv.org/abs/1812.09044.Google ScholarGoogle Scholar
  4. Rocío Alaiz-Rodríguez and Nathalie Japkowicz. 2008. Assessing the impact of changing environments on classifier performance. In Proceedings of the Conference of the Canadian Society for Computational Studies of Intelligence. Springer, 13--24.Google ScholarGoogle ScholarCross RefCross Ref
  5. Rob Alexander, Heather Rebecca Hawkins, and Andrew John Rae. 2015. Situation Coverage—A Coverage Criterion for Testing Autonomous Robots. Technical Report YCS-2015-496. Department of Computer Science, University of York.Google ScholarGoogle Scholar
  6. Hassan Abu Alhaija, Siva Karthik Mustikovela, Lars Mescheder, Andreas Geiger, and Carsten Rother. 2018. Augmented reality meets computer vision: Efficient data generation for urban driving scenes. Int. J. Comput. Vis. 126, 9 (2018), 961--972.Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Maksym Andriushchenko and Matthias Hein. 2019. Provably robust boosted decision stumps and trees against adversarial attacks. In Advances in Neural Information Processing Systems. 13017--13028.Google ScholarGoogle Scholar
  8. D. Anguita, A. Ghio, L. Oneto, X. Parra, and J. L. Reyes-Ortiz. 2012. Human activity recognition on smartphones using a multiclass hardware-friendly support vector machine. In Proceedings of the International Workshop on Ambient Assisted Living. 216--223.Google ScholarGoogle Scholar
  9. Adina Aniculaesei, Daniel Arnsberger, Falk Howar, and Andreas Rausch. 2016. Towards the verification of safety-critical autonomous systems in dynamic environments. In Proceedings of the Workshop on Verification and Validation of Cyber-Physical Systems (V2CPS@IFM’16). 79--90.Google ScholarGoogle ScholarCross RefCross Ref
  10. Antreas Antoniou, Amos Storkey, and Harrison Edwards. 2017. Data augmentation generative adversarial networks. arXiv:1711.04340. Retrieved from https://arxiv.org/abs/1711.04340.Google ScholarGoogle Scholar
  11. Maziar Arjomandi, Shane Agostino, Matthew Mammone, Matthieu Nelson, and Tong Zhou. 2006. Classification of Unmanned Aerial Vehicles. Report for Mechanical Engineering Class. Technical Report. University of Adelaide, Australia.Google ScholarGoogle Scholar
  12. Rob Ashmore and Matthew Hill. 2018. Boxing clever: Practical techniques for gaining insights into training data and monitoring distribution shift. In Proceedings of the International Conference on Computer Safety, Reliability, and Security. Springer, 393--405.Google ScholarGoogle ScholarCross RefCross Ref
  13. Rob Ashmore and Elizabeth Lennon. 2017. Progress towards the assurance of non-traditional software. In Developments in System Safety Engineering, Proceedings of the 25th Safety-Critical Systems Symposium. 33--48.Google ScholarGoogle Scholar
  14. Rob Ashmore and Bhopinder Madahar. 2019. Rethinking diversity in the context of autonomous systems. In Engineering Safe Autonomy, Proceedings of the 27th Safety-Critical Systems Symposium. 175--192.Google ScholarGoogle Scholar
  15. Kamyar Azizzadenesheli, Anqi Liu, Fanny Yang, and Animashree Anandkumar. 2019. Regularized learning for domain adaptation under label shifts. arXiv:1903.09734. Retrieved from https://arxiv.org/abs/1903.09734.Google ScholarGoogle Scholar
  16. R. K. E. Bellamy, K. Dey, M. Hind, S. C. Hoffman, S. Houde, K. Kannan, P. Lohia, J. Martino, S. Mehta, A. Mojsilović, S. Nagar, K. N. Ramamurthy, J. Richards, D. Saha, P. Sattigeri, M. Singh, K. R. Varshney, and Y. Zhang. 2019. AI fairness 360: An extensible toolkit for detecting and mitigating algorithmic bias. IBM J. Res. Dev. 63, 4/5 (2019), 4:1--4:15.Google ScholarGoogle ScholarCross RefCross Ref
  17. James Bergstra and Yoshua Bengio. 2012. Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13(Feb.2012), 281--305.Google ScholarGoogle Scholar
  18. Steffen Bickel, Michael Brückner, and Tobias Scheffer. 2009. Discriminative learning under covariate shift. J. Mach. Learn. Res. 10, 9 (2009), 2137--2155.Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Arijit Bishnu, Sameer Desai, Arijit Ghosh, Mayank Goswami, and Paul Subhabrata. 2015. Uniformity of point samples in metric spaces using gap ratio. In Proceedings of the 12th Annual Conference on Theory and Applications of Models of Computation. 347--358.Google ScholarGoogle ScholarCross RefCross Ref
  20. Christopher M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer.Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Robin Bloomfield and Peter Bishop. 2010. Safety and assurance cases: Past, present and possible future—An Adelard perspective. In Making Systems Safer. Springer, 51--67.Google ScholarGoogle Scholar
  22. Barry Boehm and Wilfred J. Hansen. 2000. Spiral Development: Experience, Principles, and Refinements. Technical Report CMU/SEI-2000-SR-008. Carnegie Mellon University.Google ScholarGoogle Scholar
  23. Chris Bogdiukiewicz, Michael Butler, Thai Son Hoang, Martin Paxton, James Snook, Xanthippe Waldron, and Toby Wilkinson. 2017. Formal development of policing functions for intelligent systems. In Proceedings of the 28th International Symposium on Software Reliability Engineering. IEEE, 194--204.Google ScholarGoogle ScholarCross RefCross Ref
  24. Andrew P. Bradley. 1997. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recogn. 30, 7 (1997), 1145--1159.Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Houssem Ben Braiek and Foutse Khomh. 2018. On Testing Machine Learning Programs. arXiv:1812.02257. Retrieved from https://arxiv.org/abs/1812.02257.Google ScholarGoogle Scholar
  26. Carla E. Brodley and Mark A. Friedl. 1999. Identifying mislabeled training data. J. Artif. Intell. Res. 11 (1999), 131--167.Google ScholarGoogle ScholarCross RefCross Ref
  27. Atilla Bulmus, Axel Freiwald, and Chris Wunderlich. 2017. Over the Air Software Update Realization within Generic Modules with Microcontrollers Using External Serial FLASH. Technical Report. SAE Technical Paper.Google ScholarGoogle Scholar
  28. Jonathod Byrd and Zachary Lipton. 2019. What is the effect of importance weighting in deep learning? arXiv:1812.03372. Retrieved from https://arxiv.org/abs/1812.03372.Google ScholarGoogle Scholar
  29. Radu Calinescu, Danny Weyns, Simos Gerasimou, Muhammad Usman Iftikhar, Ibrahim Habli, and Tim Kelly. 2018. Engineering trustworthy self-adaptive software with dynamic assurance cases. IEEE Trans. Softw. Eng. 44, 11 (2018), 1039--1069.Google ScholarGoogle ScholarCross RefCross Ref
  30. Cristian S. Calude and Giuseppe Longo. 2017. The deluge of spurious correlations in big data. Found. Sci. 22, 3 (2017), 595--612.Google ScholarGoogle ScholarCross RefCross Ref
  31. Richard Carlsson, Björn Gustavsson, Erik Johansson, Thomas Lindgren, Sven-Olof Nyström, Mikael Pettersson, and Robert Virding. 2000. Core Erlang 1.0 Language Specification. Technical Report. Information Technology Department, Uppsala University.Google ScholarGoogle Scholar
  32. Paul Caseley. 2016. Claims and architectures to rationate on automatic and autonomous functions. In Proceedings of the 11th International Conference on System Safety and Cyber-Security. IET, 1--6.Google ScholarGoogle ScholarCross RefCross Ref
  33. Nitesh V. Chawla, Aleksandar Lazarevic, Lawrence O. Hall, and Kevin W. Bowyer. 2003. SMOTEBoost: Improving prediction of the minority class in boosting. In Proceedings of the European Conference on Principles of Data Mining and Knowledge Discovery. 107--119.Google ScholarGoogle Scholar
  34. Liming Chen and Algirdas Avizienis. 1978. N-version programming: A fault-tolerance approach to reliability of software operation. In Proceedings of the 8th IEEE International Symposium on Fault-Tolerant Computing, Vol. 1. 3--9.Google ScholarGoogle Scholar
  35. Xinyun Chen, Chang Liu, Bo Li, Kimberley Lu, and Dawn Song. 2017. Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning. arXiv:1712.05526. Retrieved from https://arxiv.org/abs/1712.05526.Google ScholarGoogle Scholar
  36. Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, Zakaria Anil, Rohan an Haque, Lichan Hong, Vihan Jain, Xiabing Liu, and Hemal Shah. 2016. Wide & deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. ACM, 7--10.Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Patryk Chrabaszcz, Ilya Loshchilov, and Frank Hutter. 2018. Back to basics: Benchmarking canonical evolution strategies for playing Atari. arXiv:1802.08842. Retrieved from https://arxiv.org/abs/1802.08842.Google ScholarGoogle Scholar
  38. David A. Cieslak and Nitesh V. Chawla. 2009. A framework for monitoring classifiers performance: When and why failure occurs? Knowl. Inf. Syst. 18, 1 (2009), 83--108.Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Adnan Darwiche. 2018. Human-level intelligence or animal-like abilities? Comm. ACM 61, 10 (2018), 56--67.Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. Imagenet: A large-scale hierarchical image database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 248--255.Google ScholarGoogle ScholarCross RefCross Ref
  41. Yue Deng, Feng Bao, Youyong Kong, Zhiquan Ren, and Qionghai Dai. 2017. Deep direct reinforcement learning for financial signal representation and trading. IEEE Trans. Neural Netw. Learn. Syst. 28, 3 (2017), 653--664.Google ScholarGoogle ScholarCross RefCross Ref
  42. Finale Doshi-Velez and Been Kim. 2017. Towards a rigorous science of interpretable machine learning. arXiv:1702.08608. Retrieved from https://arxiv.org/abs/1702.08608.Google ScholarGoogle Scholar
  43. Tommaso Dreossi, Daniel J. Fremont, Shromona Ghosh, Edward Kim, Hadi Ravanbakhsh, Marcell Vazquez-Chanlatte, and Sanjit A Seshia. 2019. VERIFAI: A toolkit for the design and analysis of artificial intelligence-based systems. arXiv:1902.04245. Retrieved from https://arxiv.org/abs/1902.04245.Google ScholarGoogle Scholar
  44. Tommaso Dreossi, Shromona Ghosh, Xiangyu Yue, Kurt Keutzer, Alberto Sangiovanni-Vincentelli, and Sanjit A Seshia. 2018. Counterexample-guided data augmentation. arXiv:1805.06962. Retrieved from https://arxiv.org/abs/1805.06962.Google ScholarGoogle Scholar
  45. Tommaso Dreossi, Somesh Jha, and Sanjit A. Seshia. 2018. Semantic adversarial deep learning. arXiv:1804.07045. Retrieved from https://arxiv.org/abs/1804.07045.Google ScholarGoogle Scholar
  46. Chris Drummond and Robert C. Holte. 2006. Cost curves: An improved method for visualizing classifier performance. Mach. Learn. 65, 1 (2006), 95--130.Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. Souradeep Dutta, Xin Chen, Susmit Jha, Sriram Sankaranarayanan, and Ashish Tiwari. 2019. Sherlock-A tool for verification of neural network feedback systems: Demo abstract. In Proceedings of the 22nd ACM International Conference on Hybrid Systems: Computation and Control. 262--263.Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Ruediger Ehlers. 2017. Formal verification of piece-wise linear feed-forward neural networks. In Proceedings of the International Symposium on Automated Technology for Verification and Analysis. Springer, 269--286.Google ScholarGoogle ScholarCross RefCross Ref
  49. Alhussein Fawzi, Hamza Fawzi, and Omar Fawzi. 2018. Adversarial vulnerability for any classifier. arXiv:1802.08686. Retrieved from https://arxiv.org/abs/1802.08686.Google ScholarGoogle Scholar
  50. Alhussein Fawzi, Omar Fawzi, and Pascal Frossard. 2015. Fundamental limits on adversarial robustness. In Proceedings of the ICML Workshop on Deep Learning.Google ScholarGoogle Scholar
  51. Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, and Pascal Frossard. 2016. Robustness of classifiers: From adversarial to random noise. In Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS’16). Curran Associates Inc., Red Hook, NY, 1632--1640.Google ScholarGoogle Scholar
  52. Michael Feldman, Sorelle A. Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. 2015. Certifying and removing disparate impact. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 259--268.Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. Angelo Ferrando, Louise A. Dennis, Davide Ancona, Michael Fisher, and Viviana Mascardi. 2018. Verifying and validating autonomous systems: Towards an integrated approach. In Proceedings of the International Conference on Runtime Verification. Springer, 263--281.Google ScholarGoogle ScholarCross RefCross Ref
  54. Peter Flach. 2019. Performance evaluation in machine learning: The good, the bad, the ugly and the way forward. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence. 9808--9814.Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. Michael Forsting. 2017. Machine learning will change medicine. J. Nucl. Med. 58, 3 (2017), 357--358.Google ScholarGoogle ScholarCross RefCross Ref
  56. Yoav Freund, Robert Schapire, and Naoki Abe. 1999. A short introduction to boosting. J. Jpn. Soc. Artif. Intell. 14, 771--780 (1999), 1612.Google ScholarGoogle Scholar
  57. Yoav Freund and Robert E. Schapire. 1997. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55, 1 (1997), 119--139.Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. Timon Gehr, Matthew Mirman, Dana Drachsler-Cohen, Petar Tsankov, Swarat Chaudhuri, and Martin Vechev. 2018. AI2: Safety and robustness certification of neural networks with abstract interpretation. In Proceedings of the 2018 IEEE Symposium on Security and Privacy. IEEE, 3--18.Google ScholarGoogle ScholarCross RefCross Ref
  59. Aurélien Géron. 2017. Hands-on Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O’Reilly Media, Inc.Google ScholarGoogle Scholar
  60. Ian Goodfellow, Yoshua Bengio, Aaron Courville, and Yoshua Bengio. 2016. Deep Learning. Vol. 1. MIT Press.Google ScholarGoogle ScholarDigital LibraryDigital Library
  61. Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. 2014. Explaining and harnessing adversarial examples. arXiv:1412.6572. Retrieved from https://arxiv.org/abs/1412.6572.Google ScholarGoogle Scholar
  62. Tianyu Gu, Brendan Dolan-Gavitt, and Siddharth Garg. 2017. BadNets: Identifying vulnerabilities in the machine learning model supply chain. arXiv:1708.06733. Retrieved from https://arxiv.org/abs/1708.06733.Google ScholarGoogle Scholar
  63. Guo Haixiang, Li Yijing, Jennifer Shang, Gu Mingyun, Huang Yuanyue, and Gong Bing. 2017. Learning from class-imbalanced data: Review of methods and applications. Expert Syst. Appl. 73 (2017), 220--239.Google ScholarGoogle ScholarDigital LibraryDigital Library
  64. Jeff Heaton. 2016. An empirical analysis of feature engineering for predictive modeling. In Proceedings of SoutheastCon’16. IEEE, 1--6.Google ScholarGoogle ScholarCross RefCross Ref
  65. Constance L. Heitmeyer, Ralph D. Jeffords, and Bruce G. Labaw. 1996. Automated consistency checking of requirements specifications. ACM Trans. Softw. Eng. Methodol. 5, 3 (1996), 231--261.Google ScholarGoogle ScholarDigital LibraryDigital Library
  66. Parker Hill, Babak Zamirai, Shengshuo Lu, Yu-Wei Chao, Michael Laurenzano, Mehrzad Samadi, Marios C. Papaefthymiou, Scott A. Mahlke, Thomas F. Wenisch, Jia Deng, Lingjia Tang, and Jason Mars. [n.d.]. Rethinking Numerical Representations for Deep Neural Networks. arXiv:1808.02513. Retrieved from https://arxiv.org/abs/1808.02513.Google ScholarGoogle Scholar
  67. Geoffrey E. Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. [n.d.]. Improving neural networks by preventing co-adaptation of feature detectors. arXiv:1207.0580. Retrieved from https://arxiv.org/abs/1207.0580.Google ScholarGoogle Scholar
  68. Xiaowei Huang, Marta Kwiatkowska, Sen Wang, and Min Wu. 2017. Safety verification of deep neural networks. In Proceedings of the 29th International Conference on Computer Aided Verification, Rupak Majumdar and Viktor Kuncak (Eds.), Lecture Notes in Computer Science, Vol. 10426. Springer, 3--29.Google ScholarGoogle Scholar
  69. Zhongling Huang, Zongxu Pan, and Bin Lei. 2017. Transfer learning with deep convolutional neural network for SAR target classification with limited labeled data. Remote Sens. 9, 9 (2017), 907.Google ScholarGoogle ScholarCross RefCross Ref
  70. Casidhe Hutchison, Milda Zizyte, Patrick E. Lanigan, David Guttendorf, Michael Wagner, Claire Le Goues, and Philip Koopman. 2018. Robustness testing of autonomy software. In Proceedings of the 40th IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice. 276--285.Google ScholarGoogle ScholarDigital LibraryDigital Library
  71. Frank Hutter, Jörg Lücke, and Lars Schmidt-Thieme. 2015. Beyond manual tuning of hyperparameters. Künstl. Intell. 29, 4 (2015), 329--337.Google ScholarGoogle ScholarCross RefCross Ref
  72. Didac Gil De La Iglesia and Danny Weyns. 2015. MAPE-K formal templates to rigorously design behaviors for self-adaptive systems. ACM Trans. Auton. Adapt. Syst. 10, 3 (2015), 15.Google ScholarGoogle Scholar
  73. Sergey Ioffe and Christian Szegedy. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv:1502.03167. Retrieved from https://arxiv.org/abs/1502.03167.Google ScholarGoogle Scholar
  74. Bandar Seri Iskandar. 2017. Terrorism detection based on sentiment analysis using machine learning. J. Eng. Appl. Sci. 12, 3 (2017), 691--698.Google ScholarGoogle Scholar
  75. ISO. 2018. Road Vehicles—Functional Safety: Part 6. Technical Report BS ISO 26262-6:2018. ISO.Google ScholarGoogle Scholar
  76. Nathalie Japkowicz. 2001. Concept-learning in the presence of between-class and within-class imbalances. In Proceedings of the Conference of the Canadian Society for Computational Studies of Intelligence. Springer, 67--77.Google ScholarGoogle ScholarCross RefCross Ref
  77. Nikita Johnson and Tim Kelly. 2019. Devil’s in the detail: Through-life safety and security co-assurance using SSAF. In Proceedings of the 38th International Conference on Computer Safety, Reliability, and Security. Springer, 299--314.Google ScholarGoogle ScholarDigital LibraryDigital Library
  78. Taylor T. Johnson, Stanley Bak, Marco Caccamo, and Lui Sha. 2016. Real-time reachability for verified Simplex design. ACM Trans. Embed. Comput. Syst. 15, 2 (2016), 1--27.Google ScholarGoogle ScholarDigital LibraryDigital Library
  79. M. H. Kabir, M. R. Hoque, H. Seo, and S. H. Yang. 2015. Machine learning based adaptive context-aware system for smart home environment. Int. J. Smart Home 9, 11 (2015), 55--62.Google ScholarGoogle ScholarCross RefCross Ref
  80. Faisal Kamiran and Toon Calders. 2012. Data preprocessing techniques for classification without discrimination. Knowl. Inf. Syst. 33, 1 (2012), 1--33.Google ScholarGoogle ScholarDigital LibraryDigital Library
  81. Guy Katz, Clark Barrett, David L. Dill, Kyle Julian, and Mykel J. Kochenderfer. 2017. Reluplex: An efficient SMT solver for verifying deep neural networks. In Proceedings of the International Conference on Computer Aided Verification. Springer, 97--117.Google ScholarGoogle Scholar
  82. Guy Katz, Derek A. Huang, Duligur Ibeling, Kyle Julian, Christopher Lazarus, Rachel Lim, Parth Shah, Shantanu Thakoor, Haoze Wu, Aleksandar Zeljić, et al. 2019. The marabou framework for verification and analysis of deep neural networks. In Proceedings of the International Conference on Computer Aided Verification. Springer, 443--452.Google ScholarGoogle ScholarCross RefCross Ref
  83. Shachar Kaufman, Saharon Rosset, Claudia Perlich, and Ori Stitelman. 2012. Leakage in data mining: Formulation, detection, and avoidance. ACM Trans. Knowl. Discov. Data 6, 4 (2012), 15.Google ScholarGoogle ScholarDigital LibraryDigital Library
  84. Jeffrey O. Kephart and David M. Chess. 2003. The vision of autonomic computing. Computer 36, 1 (2003), 41--50.Google ScholarGoogle ScholarDigital LibraryDigital Library
  85. Muhammad Taimoor Khan, Dimitrios Serpanos, and Howard Shrobe. 2016. A rigorous and efficient run-time security monitor for real-time critical embedded system applications. In Proceedings of the 3rd World Forum on Internet of Things. IEEE, 100--105.Google ScholarGoogle ScholarCross RefCross Ref
  86. Udayan Khurana, Horst Samulowitz, and Deepak Turaga. 2018. Feature engineering for predictive modeling using reinforcement learning. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence. 3407--3414.Google ScholarGoogle Scholar
  87. Roger E. Kirk. 2007. Experimental design. Wiley Online Library.Google ScholarGoogle Scholar
  88. Tom Ko, Vijayaditya Peddinti, Daniel Povey, and Sanjeev Khudanpur. 2015. Audio augmentation for speech recognition. In Proceedings of the 16th Annual Conference of the International Speech Communication Association.Google ScholarGoogle ScholarCross RefCross Ref
  89. Patrick Koch, Brett Wujek, Oleg Golovidov, and Steven Gardner. 2017. Automated hyperparameter tuning for effective machine learning. In Proceedings of the SAS Global Forum Conference.Google ScholarGoogle Scholar
  90. Matthieu Komorowski, Leo A. Celi, Omar Badawi, Anthony C. Gordon, and A. Aldo Faisal. 2018. The artificial intelligence clinician learns optimal treatment strategies for sepsis in intensive care. Nat. Med. 24, 11 (2018), 1716--1720.Google ScholarGoogle ScholarCross RefCross Ref
  91. Philip Koopman and Frank Fratrik. 2019. How many operational design domains, objects, and events? In Proceedings of the AAAI Workshop on Artificial Intelligence Safety.Google ScholarGoogle Scholar
  92. Philip Koopman, Aaron Kane, and Jen Black. 2019. Credible autonomy safety argumentation. In Proceedings of the 27th Safety-Critical Systems Symposium.Google ScholarGoogle Scholar
  93. S. B. Kotsiantis, Dimitris Kanellopoulos, and P. E. Pintelas. 2006. Data preprocessing for supervised leaning. Int. J. Comput. Sci. 1, 2 (2006), 111--117.Google ScholarGoogle Scholar
  94. S. B. Kotsiantis, D. Kanellopoulos, and P. E. Pintelas. 2007. Data preprocessing for supervised leaning. Int. J. Comput. Electr. Autom. Contr. Inf. Eng. 1, 12 (2007), 4104--4109.Google ScholarGoogle Scholar
  95. Samantha Krening, Brent Harrison, Karen M. Feigh, Charles Lee Isbell, Mark Riedl, and Andrea Thomaz. 2017. Learning from explanations using sentiment and advice in RL. IEEE Trans. Cogn. Dev. Syst. 9, 1 (2017), 44--55.Google ScholarGoogle ScholarCross RefCross Ref
  96. Isaac Lage, Andrew Ross, Kim Been, Samuel Gershman, and Finale Doshi-Velez. 2018. Human-in-the-loop interpretability prior. In Proceedings of the Conference on Neural Information Processing Systems. 10180--10189.Google ScholarGoogle Scholar
  97. Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. 1998. Gradient-based learning applied to document recognition. Proc. IEEE 86, 11 (1998), 2278--2323.Google ScholarGoogle ScholarCross RefCross Ref
  98. Joseph Lemley, Filip Jagodzinski, and Razvan Andonie. 2016. Big holes in big data: A Monte Carlo algorithm for detecting large hyper-rectangles in high dimensional data. In Proceedings of the IEEE Computer Software and Applications Conference. 563--571.Google ScholarGoogle ScholarCross RefCross Ref
  99. Zachary C. Lipton. 2016. The mythos of model interpretability. arXiv:1606.03490. Retrieved from https://arxiv.org/abs/1606.03490.Google ScholarGoogle Scholar
  100. Yingqi Liu, Wen-Chuan Lee, Guanhong Tao, Shiqing Ma, Yousra Aafer, and Xiangyu Zhang. 2019. ABS: Scanning neural networks for back-doors by artificial brain stimulation. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. 1265--1282.Google ScholarGoogle ScholarDigital LibraryDigital Library
  101. Victoria López, Alberto Fernández, Salvador García, Vasile Palade, and Francisco Herrera. 2013. An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics. Inf. Sci. 250 (2013), 113--141.Google ScholarGoogle ScholarCross RefCross Ref
  102. Gustavo A. Lujan-Moreno, Phillip R. Howard, Omar G. Rojas, and Douglas C. Montgomery. 2018. Design of experiments and response surface methodology to tune machine learning hyperparameters, with a random forest case-study. Expert Syst. Appl. 109 (2018), 195--205.Google ScholarGoogle ScholarDigital LibraryDigital Library
  103. Lei Ma, Felix Juefei-Xu, Minhui Xue, Bo Li, Li Li, Yang Liu, and Jianjun Zhao. 2019. DeepCT: Tomographic combinatorial testing for deep learning systems. In Proceedings of the 26th IEEE International Conference on Software Analysis, Evolution and Reengineering. IEEE, 614--618.Google ScholarGoogle ScholarCross RefCross Ref
  104. Lei Ma, Felix Juefei-Xu, Fuyuan Zhang, Jiyuan Sun, Minhui Xue, Bo Li, Chunyang Chen, Ting Su, Li Li, Yang Liu, Jianjun Zhao, and Yadong Wang. 2018. DeepGauge: Multi-granularity testing criteria for deep learning systems. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. ACM, 120--131.Google ScholarGoogle ScholarDigital LibraryDigital Library
  105. Mathilde Machin, Jérémie Guiochet, Hélène Waeselynck, Jean-Paul Blanquart, Matthieu Roy, and Lola Masson. 2018. SMOF: A safety monitoring framework for autonomous systems. IEEE Trans. Syst. Man Cybernet. Syst. 48, 5 (2018), 702--715.Google ScholarGoogle ScholarCross RefCross Ref
  106. Aravindh Mahendran and Andrea Vedaldi. 2015. Understanding deep image representations by inverting them. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 5188--5196.Google ScholarGoogle ScholarCross RefCross Ref
  107. Spyros Makridakis. 2017. The forthcoming Artificial Intelligence (AI) revolution: Its impact on society and firms. Futures 90 (2017), 46--60.Google ScholarGoogle ScholarCross RefCross Ref
  108. Pedro Marcelino. 2018. Transfer learning from pre-trained models. In Towards Data Science (2018).Google ScholarGoogle Scholar
  109. George Mason, Radu Calinescu, Daniel Kudenko, and Alec Banks. 2017. Assured reinforcement learning with formally verified abstract policies. In Proceedings of the 9th International Conference on Agents and Artificial Intelligence. 105--117.Google ScholarGoogle ScholarCross RefCross Ref
  110. Michael Maurer, Ivan Breskovic, Vincent C. Emeakaroha, and Ivona Brandic. 2011. Revealing the MAPE loop for the autonomic management of cloud infrastructures. In Proceedings of the Symposium on Computers and Communications. IEEE, 147--152.Google ScholarGoogle ScholarDigital LibraryDigital Library
  111. Markus Maurer, J. Christian Gerdes, Barbara Lenz, and Hermann Winner. 2016. Autonomous Driving: Technical, Legal and Social Aspects. Springer Nature.Google ScholarGoogle Scholar
  112. Christopher Meyer and Jörg Schwenk. 2013. SoK: Lessons learned from SSL/TLS attacks. In Proceedings of the International Workshop on Information Security Applications. Springer, 189--209.Google ScholarGoogle Scholar
  113. Microsoft. 2019. How to choose algorithms for Azure Machine Learning Studio. Retrieved February 2019 from https://docs.microsoft.com/en-us/azure/machine-learning/studio/algorithm-choice.Google ScholarGoogle Scholar
  114. Tom M. Mitchell. 1997. Machine Learning. McGraw–Hill.Google ScholarGoogle Scholar
  115. Model Zoos Caffe 2019. Caffe Model Zoo. Retrieved March 2019 from http://caffe.berkeleyvision.org/model_zoo.html.Google ScholarGoogle Scholar
  116. Model Zoos Github 2019. Model Zoos of machine and deep learning technologies. Retrieved March 2019 from https://github.com/collections/ai-model-zoos.
  117. Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, and Pascal Frossard. 2017. Universal adversarial perturbations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1765--1773.
  118. Jose G. Moreno-Torres, Troy Raeder, Rocío Alaiz-Rodríguez, Nitesh V. Chawla, and Francisco Herrera. 2012. A unifying view on dataset shift in classification. Pattern Recogn. 45, 1 (2012), 521--530.
  119. Pamela A. Munro and Barbara G. Kanki. 2003. An analysis of ASRS maintenance reports on the use of minimum equipment lists. In Proceedings of the 12th International Symposium on Aviation Psychology.
  120. Kevin P. Murphy. 2012. Machine Learning: A Probabilistic Perspective. The MIT Press.
  121. Partha Niyogi and Federico Girosi. 1996. On the relationship between generalization error, hypothesis complexity, and sample complexity for radial basis functions. Neural Comput. 8, 4 (1996), 819--842.
  122. Object Management Group. 2018. Structured Assurance Case Metamodel (SACM). Version 2.0.
  123. Augustus Odena and Ian Goodfellow. 2018. TensorFuzz: Debugging neural networks with coverage-guided fuzzing. arXiv:1807.10875. Retrieved from https://arxiv.org/abs/1807.10875.
  124. Maxime Oquab, Leon Bottou, Ivan Laptev, and Josef Sivic. 2014. Learning and transferring mid-level image representations using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1717--1724.
  125. Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z. Berkay Celik, and Ananthram Swami. 2017. Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security. ACM, 506--519.
  126. Kexin Pei, Yinzhi Cao, Junfeng Yang, and Suman Jana. 2017. DeepXplore: Automated whitebox testing of deep learning systems. In Proceedings of the 26th Symposium on Operating Systems Principles. ACM, 1--18.
  127. Teresa Placho, Christoph Schmittner, Arndt Bonitz, and Oliver Wana. 2020. Management of automotive software updates. Microprocess. Microsyst. 78 (2020), 103257.
  128. Michael J. Pont and Royan H. L. Ong. 2002. Using watchdog timers to improve the reliability of single-processor embedded systems: Seven new patterns and a case study. In Proceedings of the 1st Nordic Conference on Pattern Languages of Programs.
  129. Lutz Prechelt. 1998. Early stopping - but when? In Neural Networks: Tricks of the Trade. Springer, 55--69.
  130. Philipp Probst, Bernd Bischl, and Anne-Laure Boulesteix. 2018. Tunability: Importance of hyperparameters of machine learning algorithms. arXiv:1802.09596. Retrieved from https://arxiv.org/abs/1802.09596.
  131. Foster Provost and Tom Fawcett. 2001. Robust classification for imprecise environments. Mach. Learn. 42, 3 (2001), 203--231.
  132. Foster J. Provost, Tom Fawcett, and Ron Kohavi. 1998. The case against accuracy estimation for comparing induction algorithms. In Proceedings of the 15th International Conference on Machine Learning. 445--453.
  133. R-Bloggers Data Analysis 2019. How to Use Data Analysis for Machine Learning. Retrieved February 2019 from https://www.r-bloggers.com/how-to-use-data-analysis-for-machine-learning-example-part-1.
  134. Stephan Rabanser, Stephan Günnemann, and Zachary C. Lipton. 2019. Failing loudly: An empirical study of methods for detecting dataset shift. In Advances in Neural Information Processing Systems 32 (2019).
  135. Jan Ramon, Kurt Driessens, and Tom Croonenborghs. 2007. Transfer learning in reinforcement learning problems through partial policy recycling. In Proceedings of the European Conference on Machine Learning. Springer, 699--707.
  136. Francesco Ranzato and Marco Zanella. 2019. Robustness verification of support vector machines. In Proceedings of the International Static Analysis Symposium. Springer, 271--295.
  137. Jorge-L. Reyes-Ortiz, Luca Oneto, Albert Samà, Xavier Parra, and Davide Anguita. 2016. Transition-aware human activity recognition using smartphones. Neurocomputing 171 (2016), 754--767.
  138. Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. Why should I trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1135--1144.
  139. Francesco Ricci, Lior Rokach, and Bracha Shapira. 2015. Recommender systems: Introduction and challenges. In Recommender Systems Handbook (2015), 1--34.
  140. German Ros, Laura Sellart, Joanna Materzynska, David Vazquez, and Antonio M. Lopez. 2016. The SYNTHIA dataset: A large collection of synthetic images for semantic segmentation of urban scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3234--3243.
  141. Andrew Slavin Ross and Finale Doshi-Velez. 2018. Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence. 1660--1669.
  142. Saharon Rosset, Claudia Perlich, Grzegorz Świrszcz, Prem Melville, and Yan Liu. 2010. Medical data mining: Insights from winning two competitions. Data Min. Knowl. Discov. 20, 3 (2010), 439--468.
  143. RTCA. 2011. Software Considerations in Airborne Systems and Equipment Certification. Technical Report DO-178C.
  144. Cynthia Rudin. 2019. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1, 5 (2019), 206--215.
  145. Stuart J. Russell and Peter Norvig. 2016. Artificial Intelligence: A Modern Approach. Pearson Education Limited.
  146. Jerome Sacks, William J. Welch, Toby J. Mitchell, and Henry P. Wynn. 1989. Design and analysis of computer experiments. Stat. Sci. (1989), 409--423.
  147. Omer Sagi and Lior Rokach. 2018. Ensemble learning: A survey. WIREs Data Min. Knowl. Discov. 8, 4 (2018), e1249.
  148. Ahmed Salem, Michael Backes, and Yang Zhang. 2020. Don’t trigger me! A triggerless backdoor attack against deep neural networks. arXiv:2010.03282. Retrieved from https://arxiv.org/abs/2010.03282.
  149. Robert G. Sargent. 2009. Verification and validation of simulation models. In Proceedings of the Winter Simulation Conference. 162--176.
  150. Lawrence K. Saul and Sam T. Roweis. 2003. Think globally, fit locally: Unsupervised learning of low dimensional manifolds. J. Mach. Learn. Res. 4 (Jun. 2003), 119--155.
  151. Christoph Schorn, Andre Guntoro, and Gerd Ascheid. 2018. Efficient on-line error detection and mitigation for deep neural network accelerators. In Proceedings of the International Conference on Computer Safety, Reliability, and Security. Springer, 205--219.
  152. Scikit-Taxonomy 2019. Scikit: Choosing the right estimator. Retrieved February 2019 from https://scikit-learn.org/stable/tutorial/machine_learning_map/index.html.
  153. Noam Segev, Maayan Harel, Shie Mannor, Koby Crammer, and Ran El-Yaniv. 2017. Learn on source, refine on target: A model transfer learning framework with random forests. IEEE Trans. Pattern Anal. Mach. Intell. 39, 9 (2017), 1811--1824.
  154. Daniel Selsam, Percy Liang, and David L. Dill. 2017. Developing bug-free machine learning systems with formal mathematics. In Proceedings of the 34th International Conference on Machine Learning, Volume 70. JMLR.org, 3047--3056.
  155. Victor S. Sheng and Jing Zhang. 2019. Machine learning with crowdsourcing: A brief summary of the past research and future directions. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 9837--9843.
  156. Andy Shih, Arthur Choi, and Adnan Darwiche. 2018. Formal verification of Bayesian network classifiers. In Proceedings of the International Conference on Probabilistic Graphical Models. 427--438.
  157. Padhraic Smyth. 1996. Bounds on the mean classification error rate of multiple experts. Pattern Recogn. Lett. 17, 12 (1996), 1253--1257.
  158. Marina Sokolova and Guy Lapalme. 2009. A systematic analysis of performance measures for classification tasks. Inf. Process. Manage. 45, 4 (2009), 427--437.
  159. Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1 (2014), 1929--1958.
  160. Sanatan Sukhija, Narayanan C. Krishnan, and Deepak Kumar. 2018. Supervised heterogeneous transfer learning using random forests. In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data. ACM, 157--166.
  161. Youcheng Sun, Min Wu, Wenjie Ruan, Xiaowei Huang, Marta Kwiatkowska, and Daniel Kroening. 2018. Concolic testing for deep neural networks. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. ACM, 109--119.
  162. Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. 2013. Intriguing properties of neural networks. arXiv:1312.6199. Retrieved from https://arxiv.org/abs/1312.6199.
  163. A. Taber and E. Normand. 1993. Single event upset in avionics. IEEE Trans. Nucl. Sci. 40, 2 (1993), 120--126.
  164. Mariarosaria Taddeo, Tom McCutcheon, and Luciano Floridi. 2019. Trusting artificial intelligence in cybersecurity is a double-edged sword. Nat. Mach. Intell. (2019), 557--560.
  165. Luke Taylor and Geoff Nitschke. 2017. Improving deep learning using generic data augmentation. arXiv:1708.06020. Retrieved from https://arxiv.org/abs/1708.06020.
  166. Chris Thornton, Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. 2013. Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 847--855.
  167. Yuchi Tian, Kexin Pei, Suman Jana, and Baishakhi Ray. 2018. DeepTest: Automated testing of deep-neural-network-driven autonomous cars. In Proceedings of the 40th International Conference on Software Engineering. ACM, 303--314.
  168. John Törnblom and Simin Nadjm-Tehrani. 2018. Formal verification of random forests in safety-critical applications. In Proceedings of the International Workshop on Formal Techniques for Safety-Critical Systems. Springer, 55--71.
  169. Hoang-Dung Tran, Stanley Bak, Weiming Xiang, and Taylor T. Johnson. 2020. Verification of deep convolutional neural networks using ImageStars. arXiv:2004.05511. Retrieved from https://arxiv.org/abs/2004.05511.
  170. Hoang-Dung Tran, Xiaodong Yang, Diego Manzanas Lopez, Patrick Musau, Luan Viet Nguyen, Weiming Xiang, Stanley Bak, and Taylor T. Johnson. 2020. NNV: The neural network verification tool for deep neural networks and learning-enabled cyber-physical systems. arXiv:2004.05519. Retrieved from https://arxiv.org/abs/2004.05519.
  171. John W. Tukey. 1977. Exploratory Data Analysis. Addison-Wesley, Reading, MA.
  172. Jasper van der Waa, Jurriaan van Diggelen, Mark A. Neerincx, and Stephan Raaijmakers. 2018. ICM: An intuitive model independent and accurate certainty measure for machine learning. In Proceedings of the International Conference on Agents and Artificial Intelligence (ICAART’18). 314--321.
  173. Perry Van Wesel and Alwyn E. Goodloe. 2017. Challenges in the verification of reinforcement learning algorithms. NASA Technical Memorandum.
  174. Kiri Wagstaff. 2012. Machine learning that matters. arXiv:1206.4656. Retrieved from https://arxiv.org/abs/1206.4656.
  175. Kiri L. Wagstaff and Benjamin Bornstein. 2009. K-means in space: A radiation sensitivity evaluation. In Proceedings of the 26th Annual International Conference on Machine Learning. 1097--1104.
  176. Li Wan, Matthew Zeiler, Sixin Zhang, Yann LeCun, and Rob Fergus. 2013. Regularization of neural networks using DropConnect. In Proceedings of the International Conference on Machine Learning. 1058--1066.
  177. Binghui Wang and Neil Zhenqiang Gong. 2018. Stealing hyperparameters in machine learning. In Proceedings of the 2018 IEEE Symposium on Security and Privacy. IEEE, 36--52.
  178. Bolun Wang, Yuanshun Yao, Shawn Shan, Huiying Li, Bimal Viswanath, Haitao Zheng, and Ben Y. Zhao. 2019. Neural cleanse: Identifying and mitigating backdoor attacks in neural networks. In Proceedings of the 2019 IEEE Symposium on Security and Privacy. IEEE, 707--723.
  179. Ke Wang, Senqiang Zhou, Ada Wai-Chee Fu, and Jeffrey Xu Yu. 2003. Mining changes of classification by correspondence tracing. In Proceedings of the 2003 SIAM International Conference on Data Mining. SIAM, 95--106.
  180. Lu Wang, Xuanqing Liu, Jinfeng Yi, Zhi-Hua Zhou, and Cho-Jui Hsieh. 2019. Evaluating the robustness of nearest neighbor classifiers: A primal-dual perspective. arXiv:1906.03972. Retrieved from https://arxiv.org/abs/1906.03972.
  181. Yihan Wang, Huan Zhang, Hongge Chen, Duane Boning, and Cho-Jui Hsieh. 2020. On ℓp-norm robustness of ensemble stumps and trees. arXiv:2008.08755. Retrieved from https://arxiv.org/abs/2008.08755.
  182. Gary M. Weiss. 2004. Mining with rarity: A unifying framework. ACM SIGKDD Explor. Newsl. 6, 1 (2004), 7--19.
  183. Karl Weiss, Taghi M. Khoshgoftaar, and DingDing Wang. 2016. A survey of transfer learning. J. Big Data 3, 1 (2016), 9.
  184. Reinhard Wilhelm, Jakob Engblom, Andreas Ermedahl, Niklas Holsti, Stephan Thesing, David Whalley, Guillem Bernat, Christian Ferdinand, Reinhold Heckmann, Tulika Mitra, Frank Mueller, Isabelle Puaut, Peter Puschner, Jan Staschulat, and Per Stenström. 2008. The worst-case execution-time problem—Overview of methods and survey of tools. ACM Trans. Embed. Comput. Syst. 7, 3 (2008), 36.
  185. Sebastien C. Wong, Adam Gatt, Victor Stamatescu, and Mark D. McDonnell. 2016. Understanding data augmentation for classification: When to warp? In Proceedings of the International Conference on Digital Image Computing: Techniques and Applications. IEEE, 1--6.
  186. Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, and Jeffrey Dean. 2016. Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv:1609.08144. Retrieved from https://arxiv.org/abs/1609.08144.
  187. Steven R. Young, Derek C. Rose, Thomas P. Karnowski, Seung-Hwan Lim, and Robert M. Patton. 2015. Optimizing deep learning hyper-parameters through an evolutionary algorithm. In Proceedings of the Workshop on Machine Learning in High-Performance Computing Environments. ACM, 4.
  188. X. Yuan, Y. Chen, Y. Zhao, Y. Long, X. Liu, K. Chen, S. Zhang, H. Huang, X. Wang, and C. A. Gunter. 2018. CommanderSong: A systematic approach for practical adversarial voice recognition. arXiv:1801.08535. Retrieved from https://arxiv.org/abs/1801.08535.
  189. Matei Zaharia, Andrew Chen, Aaron Davidson, Ali Ghodsi, Sue Ann Hong, Andy Konwinski, Siddharth Murching, Tomas Nykodym, Paul Ogilvie, Mani Parkhe, Fen Xie, and Corey Zumar. 2018. Accelerating the machine learning lifecycle with MLflow. Data Eng. 41, 4 (2018), 39--45.
  190. Mengshi Zhang, Yuqun Zhang, Lingming Zhang, Cong Liu, and Sarfraz Khurshid. 2018. DeepRoad: GAN-based metamorphic autonomous driving system testing. arXiv:1802.02295. Retrieved from https://arxiv.org/abs/1802.02295.
  191. Shichao Zhang, Chengqi Zhang, and Qiang Yang. 2003. Data preparation for data mining. Appl. Artif. Intell. 17, 5--6 (2003), 375--381.
  192. Stephan Zheng, Yang Song, Thomas Leung, and Ian Goodfellow. 2016. Improving the robustness of deep neural networks via stability training. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4480--4488.
  193. Zhun Zhong, Liang Zheng, Guoliang Kang, Shaozi Li, and Yi Yang. 2017. Random erasing data augmentation. arXiv:1708.04896. Retrieved from https://arxiv.org/abs/1708.04896.

Published in ACM Computing Surveys, Volume 54, Issue 5 (June 2022), 719 pages. ISSN: 0360-0300. EISSN: 1557-7341. DOI: 10.1145/3467690.

Copyright © 2021 ACM. Publisher: Association for Computing Machinery, New York, NY, United States.

Publication history: Received 1 May 2019; Revised 1 December 2020; Accepted 1 February 2021; Published 25 May 2021.
