ABSTRACT
Despite the power that machine learning and artificial intelligence (AI) models bring to applications, much of AI development is currently a fairly ad hoc process. Software engineering and AI development share many of the same languages and tools, but AI development as an engineering practice is still in its early stages. Mining software repositories of AI models enables insight into the current state of AI development. However, much of the relevant metadata around models is not easily extractable directly from repositories and requires deduction or domain knowledge. This paper presents a library called AIMMX that enables simplified AI Model Metadata eXtraction from software repositories. The library has five modules for extracting AI model-specific metadata: model name, associated datasets, references, AI frameworks used, and model domain. We evaluated AIMMX against 7,998 open-source models from three sources: model zoos, arXiv AI papers, and state-of-the-art AI papers. AIMMX extracted metadata with 87% precision and 83% recall. As preliminary examples of how AI model metadata extraction enables studies and tools that advance engineering support for AI development, this paper presents an exploratory analysis of data and method reproducibility over the models in the evaluation dataset, and a catalog tool for discovering and managing models. Our analysis suggests that data reproducibility may be relatively poor, with only 42% of models in our sample citing their datasets, while method reproducibility is more common, at 72% of models in our sample, particularly among state-of-the-art models. Our collected models are searchable in a catalog that uses the extracted metadata to enable advanced discovery features for efficiently finding models.
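To make the reported precision and recall figures concrete, the following is a minimal sketch of how such metrics could be computed for one model's extracted metadata against a hand-labeled set. The function name `precision_recall` and the example framework values are hypothetical illustrations, not part of AIMMX or the paper's evaluation data, and the paper may aggregate its metrics differently.

```python
def precision_recall(extracted, expected):
    """Return (precision, recall) for extracted values against labeled values.

    Hypothetical helper for illustration; not an AIMMX function.
    """
    extracted, expected = set(extracted), set(expected)
    true_positives = len(extracted & expected)
    # Precision: share of extracted values that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of labeled values that were extracted.
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Illustrative values only, not taken from the paper's evaluation.
extracted_frameworks = {"tensorflow", "keras", "mxnet"}
labeled_frameworks = {"tensorflow", "keras", "pytorch", "caffe"}
precision, recall = precision_recall(extracted_frameworks, labeled_frameworks)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this sketch an empty extraction or an empty label set yields 0.0 rather than raising an error, which is one possible convention among several.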
Index Terms
- AIMMX: Artificial Intelligence Model Metadata Extractor