Published in: Neural Computing and Applications 7-8/2013

01.06.2013 | Original Article

Bi-clustering continuous data with self-organizing map

Authors: Khalid Benabdeslem, Kais Allab


Abstract

In this paper, we present a new SOM-based bi-clustering approach for continuous data, called Bi-SOM (Bi-clustering based on Self-Organizing Map). The main goal of bi-clustering is to group the rows and columns of a given data matrix simultaneously. In addition, we address three issues related to this task: (1) the topological visualization of bi-clusters with respect to their neighborhood relation, (2) the optimization of these bi-clusters into macro-blocks, and (3) dimensionality reduction by iteratively eliminating noise blocks. Finally, experiments on several data sets are reported to validate our approach in comparison with other bi-clustering methods.
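
To make the idea of grouping rows and columns of a data matrix more concrete, here is a minimal Python sketch. It is not the Bi-SOM algorithm described in the article: as a simplifying assumption, it clusters rows and columns separately with a tiny batch-trained one-dimensional SOM and then reads the bi-clusters off as blocks. The function names (`best_unit`, `train_som_1d`, `cluster`, `block_means`), the parameter values and the toy matrix are all hypothetical and chosen only for illustration.

```python
import math

def best_unit(vector, prototypes):
    """Index of the prototype closest to `vector` (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda u: sum((a - b) ** 2 for a, b in zip(vector, prototypes[u])),
    )

def train_som_1d(vectors, n_units, n_epochs=20, sigma_start=1.0, sigma_end=0.1):
    """Batch-train a 1-D SOM (assumed settings) and return its prototypes."""
    n, dim = len(vectors), len(vectors[0])
    # Deterministic initialisation: evenly spaced input vectors become prototypes.
    step = (n - 1) / max(n_units - 1, 1)
    prototypes = [list(vectors[round(u * step)]) for u in range(n_units)]
    for epoch in range(n_epochs):
        # The neighbourhood radius shrinks geometrically over the epochs.
        t = epoch / max(n_epochs - 1, 1)
        sigma = sigma_start * (sigma_end / sigma_start) ** t
        bmus = [best_unit(v, prototypes) for v in vectors]
        updated = []
        for u in range(n_units):
            # Gaussian neighbourhood weight between unit u and each vector's best unit.
            weights = [math.exp(-((u - b) ** 2) / (2 * sigma ** 2)) for b in bmus]
            total = sum(weights)
            if total > 0:
                updated.append([
                    sum(w * v[d] for w, v in zip(weights, vectors)) / total
                    for d in range(dim)
                ])
            else:
                updated.append(prototypes[u])  # keep the old prototype
        prototypes = updated
    return prototypes

def cluster(vectors, n_units):
    """Assign each vector to its best-matching unit after training."""
    prototypes = train_som_1d(vectors, n_units)
    return [best_unit(v, prototypes) for v in vectors]

def block_means(data, row_labels, col_labels, n_row_units, n_col_units):
    """Mean value of each (row cluster, column cluster) block; None if empty."""
    sums = [[0.0] * n_col_units for _ in range(n_row_units)]
    counts = [[0] * n_col_units for _ in range(n_row_units)]
    for i, row in enumerate(data):
        for j, x in enumerate(row):
            sums[row_labels[i]][col_labels[j]] += x
            counts[row_labels[i]][col_labels[j]] += 1
    return [
        [s / c if c else None for s, c in zip(sum_row, count_row)]
        for sum_row, count_row in zip(sums, counts)
    ]

# Toy continuous matrix with two obvious blocks (made-up values).
data = [
    [5.0, 5.2, 0.1, 0.0],
    [4.8, 5.1, 0.0, 0.2],
    [5.1, 4.9, 0.2, 0.1],
    [0.1, 0.0, 5.0, 4.9],
    [0.2, 0.1, 5.2, 5.1],
    [0.0, 0.2, 4.9, 5.0],
]
columns = [list(col) for col in zip(*data)]

row_labels = cluster(data, n_units=2)     # [0, 0, 0, 1, 1, 1]
col_labels = cluster(columns, n_units=2)  # [0, 0, 1, 1]
means = block_means(data, row_labels, col_labels, 2, 2)
print(row_labels, col_labels, means)
```

Each pair of a row cluster and a column cluster defines one block of the matrix. In this toy case the two diagonal blocks have means near 5 and the off-diagonal blocks have means near 0. Clustering the two dimensions independently is only a stand-in for a genuinely simultaneous approach.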


Metadata
Title
Bi-clustering continuous data with self-organizing map
Authors
Khalid Benabdeslem
Kais Allab
Publication date
01.06.2013
Publisher
Springer-Verlag
Published in
Neural Computing and Applications / Issue 7-8/2013
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-012-1047-6
