Skip to main content
Erschienen in: Network Modeling Analysis in Health Informatics and Bioinformatics 3/2012

01.09.2012 | Original Article

Merging network patterns: a general framework to summarize biomedical network data

verfasst von: Yang Xiang, David Fuhry, Kamer Kaya, Ruoming Jin, Ümit V. Çatalyürek, Kun Huang

Erschienen in: Network Modeling Analysis in Health Informatics and Bioinformatics | Ausgabe 3/2012

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The ability to summarize a large number of network patterns discovered from biomedical data provides valuable information for use in many applications. We show that several variants of the problem are all NP-hard, and merging network patterns is a practical solution for these applications. In this work, we propose an algorithmic framework for merging network patterns. We have developed fast algorithms under this general framework which supports several types of biomedical network data. In addition, our empirical study demonstrates that our algorithms are efficient in merging a large number of biomedical network patterns and can be configured for various knowledge discovery purposes.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
Publicly available at: http://​human-phenotype-ontology.​org. The data for this work, downloaded on Feb 23, 2012, contains 1854 genes and 5220 phenotypes.
 
2
If a gene name has multiple entries in the microarray, only an entry with the maximum express level is considered. Gene names without an entry (due to alias or other reasons) in the microarray will be omitted.
 
4
For GSE1456 and GSE2034, there is not survival data but relapse data. Therefore “no-relapse" is considered as survival for our statistical study in this work.
 
5
All patients in GSE2034 are LN-Negative.
 
6
Website: http://​toppgene.​cchmc.​org, visited on April 26, 2012, “correction” set to be “Bonferroni”.
 
7
 
Literatur
Zurück zum Zitat Abello J, Resende MGC, Sudarsky S (2002) Massive quasi-clique detection. In: LATIN, pp 598–612 Abello J, Resende MGC, Sudarsky S (2002) Massive quasi-clique detection. In: LATIN, pp 598–612
Zurück zum Zitat Almansoori W, Gao S, Jarada TN, Elsheikh AM, Murshed AN, Jida J, Alhajj R, Rokne J (2012) Link prediction and classification in social networks and its application in healthcare and systems biology. Netw Model Anal Health Inform Bioinforma. doi: 10.1007/s13721-012-0005-7 Almansoori W, Gao S, Jarada TN, Elsheikh AM, Murshed AN, Jida J, Alhajj R, Rokne J (2012) Link prediction and classification in social networks and its application in healthcare and systems biology. Netw Model Anal Health Inform Bioinforma. doi: 10.​1007/​s13721-012-0005-7
Zurück zum Zitat Bastian M, Heymann S, Gephi MJ (2009) An open source software for exploring and manipulating networks. In: ICWSM Bastian M, Heymann S, Gephi MJ (2009) An open source software for exploring and manipulating networks. In: ICWSM
Zurück zum Zitat Berkhin P (2006) A survey of clustering data mining techniques. Grouping Multidimensional Data Berkhin P (2006) A survey of clustering data mining techniques. Grouping Multidimensional Data
Zurück zum Zitat Bron C, Kerbosch J (1973) Algorithm 457: finding all cliques of an undirected graph. Commun ACM 16(9):575–577MATHCrossRef Bron C, Kerbosch J (1973) Algorithm 457: finding all cliques of an undirected graph. Commun ACM 16(9):575–577MATHCrossRef
Zurück zum Zitat Burdick D, Calimlim M, Flannick J, Gehrke J, Yiu T (2005) Mafia: a maximal frequent itemset algorithm. IEEE Trans Knowl Data Eng 17(11):1490–1504CrossRef Burdick D, Calimlim M, Flannick J, Gehrke J, Yiu T (2005) Mafia: a maximal frequent itemset algorithm. IEEE Trans Knowl Data Eng 17(11):1490–1504CrossRef
Zurück zum Zitat Carroll JS, Meyer CA, Song J, Li W et al (2006) Genome-wide analysis of estrogen receptor binding sites. Nat Genet 38(11):1289–1297CrossRef Carroll JS, Meyer CA, Song J, Li W et al (2006) Genome-wide analysis of estrogen receptor binding sites. Nat Genet 38(11):1289–1297CrossRef
Zurück zum Zitat Desmedt C, Haibe-Kains B, Wirapati P et al (2008) Biological processes associated with breast cancer clinical outcome depend on the molecular subtypes. Clin Cancer Res 14(16):5158–5165CrossRef Desmedt C, Haibe-Kains B, Wirapati P et al (2008) Biological processes associated with breast cancer clinical outcome depend on the molecular subtypes. Clin Cancer Res 14(16):5158–5165CrossRef
Zurück zum Zitat Du N, Wang B, Wu B, Wang Y (2008) Overlapping community detection in bipartite networks. In: Web intelligence, pp 176–179 Du N, Wang B, Wu B, Wang Y (2008) Overlapping community detection in bipartite networks. In: Web intelligence, pp 176–179
Zurück zum Zitat Duan R, Pettie S, Su H-H (2011) Scaling algorithms for approximate and exact maximum weight matching Duan R, Pettie S, Su H-H (2011) Scaling algorithms for approximate and exact maximum weight matching
Zurück zum Zitat Geerts F, Goethals B, Mielikäinen T (2004) Tiling databases. In: Discovery science, pp 278–289 Geerts F, Goethals B, Mielikäinen T (2004) Tiling databases. In: Discovery science, pp 278–289
Zurück zum Zitat Han J, Kamber M, Pei J (2011) Data mining: concepts and techniques, 3rd edn. Elsevier Han J, Kamber M, Pei J (2011) Data mining: concepts and techniques, 3rd edn. Elsevier
Zurück zum Zitat Hartigan JA (1972) Direct clustering of a data matrix. J Am Stat Assoc 67(337):123–129CrossRef Hartigan JA (1972) Direct clustering of a data matrix. J Am Stat Assoc 67(337):123–129CrossRef
Zurück zum Zitat Jin R, Hong H, Wang H, Ruan N, Xiang Y (2010) Computing label-constraint reachability in graph databases. In: SIGMOD conference, pp 123–134 Jin R, Hong H, Wang H, Ruan N, Xiang Y (2010) Computing label-constraint reachability in graph databases. In: SIGMOD conference, pp 123–134
Zurück zum Zitat Jin R, Ruan N, Xiang Y, Lee VE (2012) A highway-centric labeling approach for answering distance queries on large sparse graphs. In: SIGMOD conference Jin R, Ruan N, Xiang Y, Lee VE (2012) A highway-centric labeling approach for answering distance queries on large sparse graphs. In: SIGMOD conference
Zurück zum Zitat Jin R, Xiang Y, Hong H, Huang K (2010) Block interaction: a generative summarization scheme for frequent patterns. In: UP ’10 proceedings of the ACM SIGKDD workshop on useful patterns, pp 55–64 Jin R, Xiang Y, Hong H, Huang K (2010) Block interaction: a generative summarization scheme for frequent patterns. In: UP ’10 proceedings of the ACM SIGKDD workshop on useful patterns, pp 55–64
Zurück zum Zitat Johnson DS, Yannakakis M, Papadimitriou CH (1988) On generating all maximal independent sets. Inf Process Lett 27(3):119–123MathSciNetMATHCrossRef Johnson DS, Yannakakis M, Papadimitriou CH (1988) On generating all maximal independent sets. Inf Process Lett 27(3):119–123MathSciNetMATHCrossRef
Zurück zum Zitat Karp R (1972) Reducibility among combinatorial problems. In: Miller R, Thatcher J (eds) Complexity of computer computations. Plenum Press, New York, pp 85–103 Karp R (1972) Reducibility among combinatorial problems. In: Miller R, Thatcher J (eds) Complexity of computer computations. Plenum Press, New York, pp 85–103
Zurück zum Zitat Kutalik Z, Beckmann JS, Bergmann S (2008) A modular approach for integrative analysis of large-scale gene-expression and drug-response data. Nat Biotechnol 26(5):531–539CrossRef Kutalik Z, Beckmann JS, Bergmann S (2008) A modular approach for integrative analysis of large-scale gene-expression and drug-response data. Nat Biotechnol 26(5):531–539CrossRef
Zurück zum Zitat Li J, Liu Y, Gao H (2011) Efficient algorithms for summarizing graph patterns. IEEE Trans Knowl Data Eng 23(9):1388–1405CrossRef Li J, Liu Y, Gao H (2011) Efficient algorithms for summarizing graph patterns. IEEE Trans Knowl Data Eng 23(9):1388–1405CrossRef
Zurück zum Zitat Li J, Liu G, Li H, Wong L (2007) Maximal biclique subgraphs and closed pattern pairs of the adjacency matrix: a one-to-one correspondence and mining algorithms. IEEE Trans Knowl Data Eng 19(12):1625–1637CrossRef Li J, Liu G, Li H, Wong L (2007) Maximal biclique subgraphs and closed pattern pairs of the adjacency matrix: a one-to-one correspondence and mining algorithms. IEEE Trans Knowl Data Eng 19(12):1625–1637CrossRef
Zurück zum Zitat Li J, Sim K, Liu G, Wong L (2008) Maximal quasi-bicliques with balanced noise tolerance: concepts and co-clustering applications. In: SDM, pp 72–83 Li J, Sim K, Liu G, Wong L (2008) Maximal quasi-bicliques with balanced noise tolerance: concepts and co-clustering applications. In: SDM, pp 72–83
Zurück zum Zitat Lucchese C, Orlando S, Perego R (2010) A generative pattern model for mining binary datasets. In: SAC, pp 1109–1110 Lucchese C, Orlando S, Perego R (2010) A generative pattern model for mining binary datasets. In: SAC, pp 1109–1110
Zurück zum Zitat Makino K, Uno T (2004) New algorithms for enumerating all maximal cliques. In: SWAT, pp 260–272 Makino K, Uno T (2004) New algorithms for enumerating all maximal cliques. In: SWAT, pp 260–272
Zurück zum Zitat Micali S, Vazirani VV (1980) An \(\mathcal{O}(\sqrt{|V|}{|E|})\) algorithm for finding maximum matching in general graphs. In: FOCS, pp 17–27 Micali S, Vazirani VV (1980) An \(\mathcal{O}(\sqrt{|V|}{|E|})\) algorithm for finding maximum matching in general graphs. In: FOCS, pp 17–27
Zurück zum Zitat Miettinen P, Mielikäinen T, Gionis A, Das G, Mannila H (2008) The discrete basis problem. IEEE Trans Knowl Data Eng 20(10):1348–1362CrossRef Miettinen P, Mielikäinen T, Gionis A, Das G, Mannila H (2008) The discrete basis problem. IEEE Trans Knowl Data Eng 20(10):1348–1362CrossRef
Zurück zum Zitat Mirkin B (1996) Mathematical classification and clustering. Kluwer Academic Publishers, Dordrecht Mirkin B (1996) Mathematical classification and clustering. Kluwer Academic Publishers, Dordrecht
Zurück zum Zitat Mucha M, Sankowski P (2004) Maximum matchings via gaussian elimination. In: FOCS, pp 248–255 Mucha M, Sankowski P (2004) Maximum matchings via gaussian elimination. In: FOCS, pp 248–255
Zurück zum Zitat Mushlin RA, Gallagher S, Kershenbaum A, Rebbeck TR (2009) Clique-finding for heterogeneity and multidimensionality in biomarker epidemiology research: the chamber algorithm. PloS one 4(3):4862CrossRef Mushlin RA, Gallagher S, Kershenbaum A, Rebbeck TR (2009) Clique-finding for heterogeneity and multidimensionality in biomarker epidemiology research: the chamber algorithm. PloS one 4(3):4862CrossRef
Zurück zum Zitat Ravetti MG, Moscato P (2008) Identification of a 5-protein biomarker molecular signature for predicting alzheimer’s disease. PLoS One 3(9):3111CrossRef Ravetti MG, Moscato P (2008) Identification of a 5-protein biomarker molecular signature for predicting alzheimer’s disease. PLoS One 3(9):3111CrossRef
Zurück zum Zitat Reyal F, van Vliet MH, Armstrong NJ et al (2008) A comprehensive analysis of prognostic signatures reveals the high predictive capacity of the proliferation, immune response and rna splicing modules in breast cancer. Breast Cancer Res 10(6):R93CrossRef Reyal F, van Vliet MH, Armstrong NJ et al (2008) A comprehensive analysis of prognostic signatures reveals the high predictive capacity of the proliferation, immune response and rna splicing modules in breast cancer. Breast Cancer Res 10(6):R93CrossRef
Zurück zum Zitat Robinson PN, Köhler S, Bauer S, Seelow D, Horn D, Mundlos S (2008) The human phenotype ontology: a tool for annotating and analyzing human hereditary disease. Am J Hum Genet 83(5):610–615CrossRef Robinson PN, Köhler S, Bauer S, Seelow D, Horn D, Mundlos S (2008) The human phenotype ontology: a tool for annotating and analyzing human hereditary disease. Am J Hum Genet 83(5):610–615CrossRef
Zurück zum Zitat Robinson PN, Mundlos S (2010) The human phenotype ontology. Clin Genet 77(6):525–534CrossRef Robinson PN, Mundlos S (2010) The human phenotype ontology. Clin Genet 77(6):525–534CrossRef
Zurück zum Zitat Tsukiyama S, Ide M, Ariyoshi H, Shirakawa I (1977) A new algorithm for generating all the maximal independent sets. SIAM J Comput 6:505MathSciNetMATHCrossRef Tsukiyama S, Ide M, Ariyoshi H, Shirakawa I (1977) A new algorithm for generating all the maximal independent sets. SIAM J Comput 6:505MathSciNetMATHCrossRef
Zurück zum Zitat Uppalapati P, Xiang Y, Huang K (2010) Predicting prognostic markers for glioma using gene co-expression network analysis. In: Proceedings of the first ACM international conference on bioinformatics and computational biology, pp 546–551 Uppalapati P, Xiang Y, Huang K (2010) Predicting prognostic markers for glioma using gene co-expression network analysis. In: Proceedings of the first ACM international conference on bioinformatics and computational biology, pp 546–551
Zurück zum Zitat van de Vijver MJ, He YD, van ’t Veer LJ et al (2002) A gene-expression signature as a predictor of survival in breast cancer. New Engl J Med 347(25):1999–2009CrossRef van de Vijver MJ, He YD, van ’t Veer LJ et al (2002) A gene-expression signature as a predictor of survival in breast cancer. New Engl J Med 347(25):1999–2009CrossRef
Zurück zum Zitat van’t Veer LJ, Dai H, van de Vijver MJ et al (2002) Gene expression profiling predicts clinical outcome of breast cancer. Nature 415(6871):530–536CrossRef van’t Veer LJ, Dai H, van de Vijver MJ et al (2002) Gene expression profiling predicts clinical outcome of breast cancer. Nature 415(6871):530–536CrossRef
Zurück zum Zitat Wang Y, Klijn PGM, Zhang Y et al (2005) Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet 365(9460):671–679 Wang Y, Klijn PGM, Zhang Y et al (2005) Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet 365(9460):671–679
Zurück zum Zitat Wirapati P, Sotiriou C, Kunkel S et al (2008) Meta-analysis of gene expression profiles in breast cancer: toward a unified understanding of breast cancer subtyping and prognosis signatures. Breast Cancer Res 10(4):R65CrossRef Wirapati P, Sotiriou C, Kunkel S et al (2008) Meta-analysis of gene expression profiles in breast cancer: toward a unified understanding of breast cancer subtyping and prognosis signatures. Breast Cancer Res 10(4):R65CrossRef
Zurück zum Zitat Xiang Y, Zhang CQ, Huang K (2012) Predicting glioblastoma prognosis networks using weighted gene co-expression network analysis on tcga data. BMC Bioinform 13(Suppl 2):S12 Xiang Y, Zhang CQ, Huang K (2012) Predicting glioblastoma prognosis networks using weighted gene co-expression network analysis on tcga data. BMC Bioinform 13(Suppl 2):S12
Zurück zum Zitat Xiang Y, Jin R, Fuhry D, Dragan FF (2011) Summarizing transactional databases with overlapped hyperrectangles. Data Min Knowl Discov 23(2):215–251MathSciNetMATHCrossRef Xiang Y, Jin R, Fuhry D, Dragan FF (2011) Summarizing transactional databases with overlapped hyperrectangles. Data Min Knowl Discov 23(2):215–251MathSciNetMATHCrossRef
Zurück zum Zitat Xiang Y, Payne P, Huang K (2012) Transactional database transformation and its application in prioritizing human disease genes. IEEE/ACM Trans Comput Biol Bioinform 9(1):294–304CrossRef Xiang Y, Payne P, Huang K (2012) Transactional database transformation and its application in prioritizing human disease genes. IEEE/ACM Trans Comput Biol Bioinform 9(1):294–304CrossRef
Zurück zum Zitat Zhang J, Xiang Y, Ding L et al (2010) Using gene co-expression network analysis to predict biomarkers for chronic lymphocytic leukemia. BMC Bioinform 11(Suppl 9):S5CrossRef Zhang J, Xiang Y, Ding L et al (2010) Using gene co-expression network analysis to predict biomarkers for chronic lymphocytic leukemia. BMC Bioinform 11(Suppl 9):S5CrossRef
Metadaten
Titel
Merging network patterns: a general framework to summarize biomedical network data
verfasst von
Yang Xiang
David Fuhry
Kamer Kaya
Ruoming Jin
Ümit V. Çatalyürek
Kun Huang
Publikationsdatum
01.09.2012
Verlag
Springer Vienna
Erschienen in
Network Modeling Analysis in Health Informatics and Bioinformatics / Ausgabe 3/2012
Print ISSN: 2192-6662
Elektronische ISSN: 2192-6670
DOI
https://doi.org/10.1007/s13721-012-0009-3