2019 | OriginalPaper | Chapter

Best Practices in Structuring Data Science Projects

Abstract

The goal of Data Science projects is to extract knowledge and insights from collected data. The focus is on the novelty and usability of the insights obtained. However, the impact of a project can be seriously reduced if the results are not communicated well. In this paper, we describe a way of managing and describing the outcomes of Data Science projects so that they convey the insights gained as effectively as possible. We focus on the main artifact of non-verbal communication, namely the project structure. In particular, we surveyed three sources of information on how to structure projects: common management methodologies, community best practices, and data sharing platforms. The survey resulted in a list of recommendations on how to build project artifacts so that they are clear, intuitive, and logical. We also provide hints on tools that help manage such structures efficiently. The paper is intended to motivate and support an informed decision on how to structure a Data Science project to facilitate better communication of its outcomes.

Metadata
Title: Best Practices in Structuring Data Science Projects
Author: Jedrzej Rybicki
Copyright Year: 2019
DOI: https://doi.org/10.1007/978-3-319-99993-7_31
