2018 | OriginalPaper | Chapter

T-Brain: A Collaboration Platform for Data Scientists

Authors: Chao-Chun Yeh, Sheng-An Chang, Yi-Chin Chu, Xuan-Yi Lin, Yichiao Sun, Jiazheng Zhou, Shih-Kun Huang

Published in: Security with Intelligent Computing and Big-data Services

Publisher: Springer International Publishing


Abstract

Now that mobile services generate data easily and rapidly, and cloud computing lets computing power grow on demand, data scientists who work with very large datasets can solve challenging problems. Intelligent applications in areas such as Go, healthcare, and self-driving vehicles have improved greatly in recent years. Beyond these, more complex problems remain, such as weather impact analysis, financial crisis prediction, and crime prevention. To tackle them, experts from many disciplines have to collaborate. In this paper, we propose a collaboration platform and a system design that let data scientists share data, write analytic scripts, and discuss topics related to those problems. At present, eleven datasets have been collected, including spam mail, malware data, honeynet logs, Hadoop workload logs, and other open data. Based on those datasets, an improved local cache design improves average response time by 92.36% and request availability by 70%. With the platform, many education and competition activities can be held successfully.
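
The abstract reports the cache gains only as percentages and does not describe the cache itself. As a rough illustration, and not the authors' implementation, the Python sketch below shows one common shape for a local cache (read-through: serve repeated requests locally, call the backend only on a miss) together with the usual way a relative improvement percentage is computed. The names `LocalCache`, `fetch`, and `relative_improvement` are hypothetical and do not come from the paper.

```python
from typing import Callable, Dict


class LocalCache:
    """Read-through cache: serve repeated keys locally, call the backend on a miss.

    Illustrative only; the paper's actual cache design is not given in the abstract.
    """

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch  # hypothetical backend loader (e.g. a dataset store)
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = self._fetch(key)  # only reached on a cache miss
        self._store[key] = value
        return value


def relative_improvement(baseline: float, improved: float) -> float:
    """Percentage reduction of `improved` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline * 100.0
```

Under this definition, a reported figure such as a 92.36% improvement in average response time would mean the cached response time is about 7.64% of the uncached one. Whether the paper uses exactly this definition is an assumption.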


Metadata
Title
T-Brain: A Collaboration Platform for Data Scientists
Authors
Chao-Chun Yeh
Sheng-An Chang
Yi-Chin Chu
Xuan-Yi Lin
Yichiao Sun
Jiazheng Zhou
Shih-Kun Huang
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-76451-1_28
