nach oben

Erschienen in:

2014 | OriginalPaper | Buchkapitel

14. Analysis of Multiple DNA Microarray Datasets

verfasst von : Veselka Boeva, Elena Tsiporkova, Elena Kostadinova

Erschienen in: Springer Handbook of Bio-/Neuroinformatics

Verlag: Springer Berlin Heidelberg

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

In contrast to conventional clustering algorithms, where a single dataset is used to produce a clustering solution, we introduce herein a MapReduce approach for clustering of datasets generated in multiple-experiment settings. It is inspired by the map-reduce functions commonly used in functional programming and consists of two distinctive phases. Initially, the selected clustering algorithm is applied (mapped) to each experiment separately. This produces a list of different clustering solutions, one per experiment. These are further transformed (reduced) by portioning the cluster centers into a single clustering solution. The obtained partition is not disjoint in terms of the different participating genes, and it is further analyzed and refined by applying formal concept analysis.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Vorheriges Kapitel Case-Based Reasoning for Biomedical Informatics and Medicine

Nächstes Kapitel Fuzzy Logic and Rule-Based Methods in Bioinformatics

14.1.

T.C. Havens, J.M Keller, M Popescu, J.C Bezdek, E. MacNeal Rehrig, H.M Appel, J.C Schultz: Fuzzy cluster analysis of bioinformatics data composed of microarray expression data and gene ontology annotations, Proc. North Am. Fuzzy Inf. Process. Soc. (2008) pp. 1–6

14.2.

D. Huang, W. Pan: Incorporating biological knowledge into distance-based clustering analysis of microarray gene expression data, Bioinformatics 22(10), 1259–1268 (2006)CrossRef

14.3.

J. Kasturi, R. Acharya: Clustering of diverse genomic data using information fusion, Bioinformatics 21(4), 423–429 (2005)CrossRef

14.4.

G. Li, Z. Wang: Incorporating heterogeneous biological data sources in clustering gene expression data, Health 1, 17–23 (2009)CrossRef

14.5.

R. Kustra, A. Zagdanski: Incorporating gene ontology in clustering gene expression data, Proc. 19th IEEE Symp. Comput.-Based Med. Syst. (2006) pp. 555–563

14.6.

E. Johnson, H. Kargupta: Collective hierarchical clustering from distributed, heterogeneous data, LNCS 1759, 221–244 (1999)

14.7.

A. Strehl, J. Ghosh: Cluster ensembles – A knowledge reuse framework for combining multiple partitions, J. Mach. Learn. Res. 3, 583–617 (2002)MathSciNet

14.8.

A. Topchy, K. Jain, W. Punch: Clustering ensembles: Models of consensus and weak partitions, IEEE Trans. Pattern Anal. Mach. Intell. 27, 1866–1881 (2005)CrossRef

14.9.

A.P. Dempster, N.M. Laird, D.B. Rubin: Maximum likelihood from incomplete data via the EM algorithm, J. Roy. Stat. Soc. B 39(1), 1–38 (1977)MathSciNetMATH

14.10.

E. Kostadinova, V. Boeva, N. Lavesson: Clustering of multiple microarray experiments using information integration, LNCS 6865, 123–137 (2011)

14.11.

B. Ganter, G. Stumme, R. Wille (Eds.): Formal Concept Analysis: Foundations and Applications, Lect. Notes Artif. Intell., Vol. 3626 (Springer, Berlin, Heidelberg 2005)

14.12.

J. Besson, C. Robardet, J.-F. Boulicaut: Constraint-based mining of formal concepts in transactional data, LNCS 3056, 615–624 (2004)

14.13.

J. Besson, C. Robardet, J.-F. Boulicaut, S. Rome: Constraint-based concept mining and its application to microarray data analysis, Intell. Data Anal. 9(1), 59–82 (2005)

14.14.

D.P. Potter: A combinatorial approach to scientific exploration of gene expression data: An integrative method using formal concept analysis for the comparative analysis of microarray data. Ph.D. Thesis (Department of Mathematics, Virginia Tech 2005)

14.15.

V. Choi, Y. Huang, V. Lam, D. Potter, R. Laubenbacher, K. Duca: Using formal concept analysis for microarray data comparison, J. Bioinf. Comput. Biol. 6(1), 65–75 (2008)CrossRef

14.16.

M. Kaytoue-Uberall, S. Duplessis, A. Napoli: Using Formal Concept Analysis for the Extraction of Groups of Coexpressed Genes CCIS 14 (Springer, Berlin, Heidelberg 2008) pp. 445–455

14.17.

G. Rustici, J. Mata, K. Kivinen, P. Lió, C.J. Penkett, G. Burns, J. Hayles, A. Brazma, P. Nurse, J. Bähler: Periodic gene expression program of the fission yeast cell cycle, Nat. Genet. 36, 809–817 (2004)CrossRef

14.18.

E. Tsiporkova, V. Boeva: Two-pass imputation algorithm for missing value estimation in gene expression time series, J. Bioinf. Comput. Biol. 5(5), 1005–1022 (2007)CrossRef

14.19.

V. Boeva, E. Tsiporkova: A multipurpose time series data standardization method. Intelligent systems: From theory to practice, Stud. Comput. Intell. 299, 445–460 (2010)CrossRef

14.20.

A.K. Jain, M.N. Murty, P.J. Flynn: Data clustering: A review, ACM Comput. Surv. 31(3), 264–323 (1999)CrossRef

14.21.

M. Ester, H.P. Kriegel, J. Sander, X. Xu: A density-based algorithm for discovering clusters in large spatial databases with noise, Proc. 2nd ACM SIGKDD, Portland (1996) pp. 226–231

14.22.

M. Eisen, P.T Spollman, P.O Brown, D. Botstein: Cluster analysis and display of genome-wide expression patterns, Proc. Natl. Acad. Sci. USA 95, 14863–14868 (1998)CrossRef

14.23.

S. Datta, S. Datta: Comparisons and validation of statistical clustering techniques for microarray gene expression data, Bioinformatics 19, 459–466 (2003)CrossRef

14.24.

J.B. MacQueen: Some methods for classification and analysis of multivariate observations, Proc. 5th Berkeley Symp. Math. Stat. Prob. 1, 281–297 (1967)MathSciNet

14.25.

L. Kaufman, P.J. Rousseeuw: Fitting Groups in Data: An Introduction to Cluster Analysis (Wiley, New York 1990)CrossRef

14.26.

G. Babu, M. Murty: A near optimal initial seed value selection in k-means algorithm using a genetic algorithm, Pattern Recognit. Lett. 14, 763–769 (1993)CrossRefMATH

14.27.

S.S. Khan, A. Ahmad: Cluster center initialization algorithm for k-means clustering, Pattern Recognit. Lett. 25, 1293–1302 (2004)CrossRef

14.28.

M. Al-Daoud: A new algorithm for cluster initialization, World Acad. Sci. Eng. Technol. 4, 74–76 (2005)

14.29.

V. Boeva, E. Tsiporkova, E. Kostadinova: Analysis of multiple DNA microarrays (2012), available online at http://cst.tu-plovdiv.bg/bi/SupplementaryMaterial_MapReduce-FCA.pdf

14.30.

M. Halkidi, Y. Batistakis, M. Vazirgiannis: On clustering validation techniques, J. Intell. Inf. Syst. 17(2/3), 107–145 (2001)CrossRefMATH

14.31.

S. Theodoridis, K. Koutroubas: Pattern Recognition (Academic, New York 1999)

14.32.

A.K. Jain, R.C. Dubes: Algorithms for Clustering Data (Prentice Hall, Englewood Cliffs 2006)

14.33.

J. Handl, J. Knowles, D.B. Bell: Computational cluster validation in post-genomic data analysis, Bioinformatics 21, 3201–3212 (2005)CrossRef

14.34.

P. Rousseeuw: Silhouettes: A graphical aid to the interpretation and validation of cluster analysis, J. Comput. Appl. Math. 20, 53–65 (1987)CrossRefMATH

14.35.

M. de Hoon: Open clustering software (Laboratory of DNA Information Analysis, Human Genome Center, Institute of Medical Science, University of Tokyo, 2012) available at http://bonsai.hgc.jp/∼mdehoon/software/cluster/software.htm

14.36.

S. Maere, K. Heymans, M. Kuiper: BiNGO: A Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks, Bioinformatics 21, 3448–3449 (2005)CrossRef

Titel: Analysis of Multiple DNA Microarray Datasets
verfasst von: Veselka Boeva
Elena Tsiporkova
Elena Kostadinova
Verlag: Springer Berlin Heidelberg
Buch: Springer Handbook of Bio-/Neuroinformatics
Print ISBN: 978-3-642-30573-3

Electronic ISBN: 978-3-642-30574-0

Copyright-Jahr: 2014
DOI: https://doi.org/10.1007/978-3-642-30574-0_14

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"