2015 | Original Paper | Book chapter

Classification and Clustering on Microarray Data for Gene Functional Prediction Using R

Written by: Liliana López Kleine, Rosa Montaño, Francisco Torres-Avilés

Publisher: Springer New York

Abstract

Gene expression data (microarray and RNA-sequencing data), as well as other kinds of genomic data, can be obtained from publicly available sources. Here, we explain how to apply multivariate clustering and classification methods to gene expression data. These methods have become very popular, are implemented in freely available software, and can be used to predict the participation of gene products in a specific functional category of interest. Given the availability of both the data and the methods, every biological study should apply them to gain knowledge about the organism studied and the functional category of interest. Special emphasis is placed on nonlinear kernel classification methods.
Metadata
Title
Classification and Clustering on Microarray Data for Gene Functional Prediction Using R
Written by
Liliana López Kleine
Rosa Montaño
Francisco Torres-Avilés
Copyright year
2015
Publisher
Springer New York
DOI
https://doi.org/10.1007/7651_2015_240