
2016 | Original Paper | Book Chapter

A Holistic Approach to Testing Biomedical Hypotheses and Analysis of Biomedical Data

Written by: Krzysztof Psiuk-Maksymowicz, Aleksander Płaczek, Roman Jaksik, Sebastian Student, Damian Borys, Dariusz Mrozek, Krzysztof Fujarewicz, Andrzej Świerniak

Published in: Beyond Databases, Architectures and Structures. Advanced Technologies for Data Mining and Knowledge Discovery

Publisher: Springer International Publishing

Abstract

Testing biomedical hypotheses relies on advanced, usually multi-step analysis of biomedical data. This requires sophisticated analytical methods and data structures that can store the intermediate results needed in subsequent steps. However, biomedical data, especially reference data, often change over time, and new analytical methods are created every year. As a result, iterative analyses must be repeated with new methods and new reference data sets, which in turn causes frequent changes to the underlying data structures. This instability of data structures can be mitigated by adopting the data lake concept instead of traditional database systems.
The aim of this paper is to present a system for researchers dealing with various types of biomedical data. The system provides functionality for data analysis and for testing different biomedical hypotheses. We treat the problem holistically, giving the researcher freedom to configure their own multi-step analysis. This is made possible by using a multiversion, dynamic-schema data warehouse, performing parallel calculations in a virtualized computational environment, and delivering data through MapReduce-based ETL processes.
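
To make the idea of a MapReduce-based ETL process more concrete, the following is a minimal, single-process sketch in Python. It is not the authors' implementation: the record layout, the field names (`sample`, `gene`, `value`), and the functions `map_record`, `shuffle`, `reduce_group`, and `run_etl` are all hypothetical and chosen only for illustration. The map step extracts and cleans raw records, the shuffle step groups values by key, and the reduce step aggregates each group into a row that could be loaded into a warehouse table.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw records (assumed layout, not taken from the paper).
RAW_RECORDS = [
    {"sample": "S1", "gene": "TP53", "value": "2.0"},
    {"sample": "S2", "gene": "TP53", "value": "4.0"},
    {"sample": "S1", "gene": "BRCA1", "value": "1.5"},
    {"sample": "S2", "gene": "BRCA1", "value": "n/a"},  # unparsable value
]


def map_record(record):
    """Map step: extract a (gene, value) pair, skipping malformed records."""
    try:
        yield record["gene"], float(record["value"])
    except (KeyError, ValueError):
        # Records with a missing field or a non-numeric value emit nothing.
        return


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Reduce step: aggregate one group into a single output row."""
    return {"gene": key, "n": len(values), "mean_value": mean(values)}


def run_etl(records):
    """Run map, shuffle, and reduce in sequence; rows are sorted by gene."""
    pairs = (pair for record in records for pair in map_record(record))
    groups = shuffle(pairs)
    return [reduce_group(key, values) for key, values in sorted(groups.items())]


if __name__ == "__main__":
    for row in run_etl(RAW_RECORDS):
        print(row)
```

In a real deployment the map and reduce functions would run in parallel on a cluster framework; here they run sequentially so that the data flow is easy to follow.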

Metadata
Title
A Holistic Approach to Testing Biomedical Hypotheses and Analysis of Biomedical Data
Written by
Krzysztof Psiuk-Maksymowicz
Aleksander Płaczek
Roman Jaksik
Sebastian Student
Damian Borys
Dariusz Mrozek
Krzysztof Fujarewicz
Andrzej Świerniak
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-34099-9_34