ABSTRACT
Distributed infrastructures for computation and analytics are evolving towards an interconnected ecosystem in which complex scientific workflows execute across hybrid systems spanning from IoT Edge devices to Clouds, and sometimes to supercomputers (the Computing Continuum). Understanding the performance trade-offs of large-scale workflows deployed on such a complex Edge-to-Cloud Continuum is challenging. It requires systematic experimentation that is reproducible, so that other researchers can replicate the study and its conclusions on different infrastructures. This boils down to the tedious process of reconciling the many experimental requirements and constraints with low-level infrastructure design choices.
To address the limitations of the main state-of-the-art approaches for distributed, collaborative experimentation, such as Google Colab, Kaggle, and Code Ocean, we propose KheOps, a collaborative environment specifically designed to enable cost-effective reproducibility and replicability of Edge-to-Cloud experiments. KheOps is composed of three core elements: (1) an experiment repository; (2) a notebook environment; and (3) a multi-platform experiment methodology.
We illustrate KheOps with a real-life Edge-to-Cloud application. The evaluations explore the point of view of the authors of an experiment described in an article (who aim to make their experiments reproducible) and the perspective of their readers (who aim to replicate the experiment). The results show how KheOps helps authors systematically perform repeatable and reproducible experiments on the Grid'5000 + FIT IoT-LAB testbeds. Furthermore, KheOps helps readers cost-effectively replicate the authors' experiments on different infrastructures, such as the Chameleon Cloud + CHI@Edge testbeds, and reach the same conclusions with high accuracy (> 88% for all performance metrics).