Skip to main content
Top
Published in: Network Modeling Analysis in Health Informatics and Bioinformatics 3/2012

01-09-2012 | Original Article

Merging network patterns: a general framework to summarize biomedical network data

Authors: Yang Xiang, David Fuhry, Kamer Kaya, Ruoming Jin, Ümit V. Çatalyürek, Kun Huang

Published in: Network Modeling Analysis in Health Informatics and Bioinformatics | Issue 3/2012

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

The ability to summarize a large number of network patterns discovered from biomedical data provides valuable information for use in many applications. We show that several variants of the problem are all NP-hard, and merging network patterns is a practical solution for these applications. In this work, we propose an algorithmic framework for merging network patterns. We have developed fast algorithms under this general framework which supports several types of biomedical network data. In addition, our empirical study demonstrates that our algorithms are efficient in merging a large number of biomedical network patterns and can be configured for various knowledge discovery purposes.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Footnotes
1
Publicly available at: http://​human-phenotype-ontology.​org. The data for this work, downloaded on Feb 23, 2012, contains 1854 genes and 5220 phenotypes.
 
2
If a gene name has multiple entries in the microarray, only an entry with the maximum express level is considered. Gene names without an entry (due to alias or other reasons) in the microarray will be omitted.
 
4
For GSE1456 and GSE2034, there is not survival data but relapse data. Therefore “no-relapse" is considered as survival for our statistical study in this work.
 
5
All patients in GSE2034 are LN-Negative.
 
6
Website: http://​toppgene.​cchmc.​org, visited on April 26, 2012, “correction” set to be “Bonferroni”.
 
7
 
Literature
go back to reference Abello J, Resende MGC, Sudarsky S (2002) Massive quasi-clique detection. In: LATIN, pp 598–612 Abello J, Resende MGC, Sudarsky S (2002) Massive quasi-clique detection. In: LATIN, pp 598–612
go back to reference Almansoori W, Gao S, Jarada TN, Elsheikh AM, Murshed AN, Jida J, Alhajj R, Rokne J (2012) Link prediction and classification in social networks and its application in healthcare and systems biology. Netw Model Anal Health Inform Bioinforma. doi: 10.1007/s13721-012-0005-7 Almansoori W, Gao S, Jarada TN, Elsheikh AM, Murshed AN, Jida J, Alhajj R, Rokne J (2012) Link prediction and classification in social networks and its application in healthcare and systems biology. Netw Model Anal Health Inform Bioinforma. doi: 10.​1007/​s13721-012-0005-7
go back to reference Bastian M, Heymann S, Gephi MJ (2009) An open source software for exploring and manipulating networks. In: ICWSM Bastian M, Heymann S, Gephi MJ (2009) An open source software for exploring and manipulating networks. In: ICWSM
go back to reference Berkhin P (2006) A survey of clustering data mining techniques. Grouping Multidimensional Data Berkhin P (2006) A survey of clustering data mining techniques. Grouping Multidimensional Data
go back to reference Bron C, Kerbosch J (1973) Algorithm 457: finding all cliques of an undirected graph. Commun ACM 16(9):575–577MATHCrossRef Bron C, Kerbosch J (1973) Algorithm 457: finding all cliques of an undirected graph. Commun ACM 16(9):575–577MATHCrossRef
go back to reference Burdick D, Calimlim M, Flannick J, Gehrke J, Yiu T (2005) Mafia: a maximal frequent itemset algorithm. IEEE Trans Knowl Data Eng 17(11):1490–1504CrossRef Burdick D, Calimlim M, Flannick J, Gehrke J, Yiu T (2005) Mafia: a maximal frequent itemset algorithm. IEEE Trans Knowl Data Eng 17(11):1490–1504CrossRef
go back to reference Carroll JS, Meyer CA, Song J, Li W et al (2006) Genome-wide analysis of estrogen receptor binding sites. Nat Genet 38(11):1289–1297CrossRef Carroll JS, Meyer CA, Song J, Li W et al (2006) Genome-wide analysis of estrogen receptor binding sites. Nat Genet 38(11):1289–1297CrossRef
go back to reference Desmedt C, Haibe-Kains B, Wirapati P et al (2008) Biological processes associated with breast cancer clinical outcome depend on the molecular subtypes. Clin Cancer Res 14(16):5158–5165CrossRef Desmedt C, Haibe-Kains B, Wirapati P et al (2008) Biological processes associated with breast cancer clinical outcome depend on the molecular subtypes. Clin Cancer Res 14(16):5158–5165CrossRef
go back to reference Du N, Wang B, Wu B, Wang Y (2008) Overlapping community detection in bipartite networks. In: Web intelligence, pp 176–179 Du N, Wang B, Wu B, Wang Y (2008) Overlapping community detection in bipartite networks. In: Web intelligence, pp 176–179
go back to reference Duan R, Pettie S, Su H-H (2011) Scaling algorithms for approximate and exact maximum weight matching Duan R, Pettie S, Su H-H (2011) Scaling algorithms for approximate and exact maximum weight matching
go back to reference Geerts F, Goethals B, Mielikäinen T (2004) Tiling databases. In: Discovery science, pp 278–289 Geerts F, Goethals B, Mielikäinen T (2004) Tiling databases. In: Discovery science, pp 278–289
go back to reference Han J, Kamber M, Pei J (2011) Data mining: concepts and techniques, 3rd edn. Elsevier Han J, Kamber M, Pei J (2011) Data mining: concepts and techniques, 3rd edn. Elsevier
go back to reference Hartigan JA (1972) Direct clustering of a data matrix. J Am Stat Assoc 67(337):123–129CrossRef Hartigan JA (1972) Direct clustering of a data matrix. J Am Stat Assoc 67(337):123–129CrossRef
go back to reference Jin R, Hong H, Wang H, Ruan N, Xiang Y (2010) Computing label-constraint reachability in graph databases. In: SIGMOD conference, pp 123–134 Jin R, Hong H, Wang H, Ruan N, Xiang Y (2010) Computing label-constraint reachability in graph databases. In: SIGMOD conference, pp 123–134
go back to reference Jin R, Ruan N, Xiang Y, Lee VE (2012) A highway-centric labeling approach for answering distance queries on large sparse graphs. In: SIGMOD conference Jin R, Ruan N, Xiang Y, Lee VE (2012) A highway-centric labeling approach for answering distance queries on large sparse graphs. In: SIGMOD conference
go back to reference Jin R, Xiang Y, Hong H, Huang K (2010) Block interaction: a generative summarization scheme for frequent patterns. In: UP ’10 proceedings of the ACM SIGKDD workshop on useful patterns, pp 55–64 Jin R, Xiang Y, Hong H, Huang K (2010) Block interaction: a generative summarization scheme for frequent patterns. In: UP ’10 proceedings of the ACM SIGKDD workshop on useful patterns, pp 55–64
go back to reference Karp R (1972) Reducibility among combinatorial problems. In: Miller R, Thatcher J (eds) Complexity of computer computations. Plenum Press, New York, pp 85–103 Karp R (1972) Reducibility among combinatorial problems. In: Miller R, Thatcher J (eds) Complexity of computer computations. Plenum Press, New York, pp 85–103
go back to reference Kutalik Z, Beckmann JS, Bergmann S (2008) A modular approach for integrative analysis of large-scale gene-expression and drug-response data. Nat Biotechnol 26(5):531–539CrossRef Kutalik Z, Beckmann JS, Bergmann S (2008) A modular approach for integrative analysis of large-scale gene-expression and drug-response data. Nat Biotechnol 26(5):531–539CrossRef
go back to reference Li J, Liu Y, Gao H (2011) Efficient algorithms for summarizing graph patterns. IEEE Trans Knowl Data Eng 23(9):1388–1405CrossRef Li J, Liu Y, Gao H (2011) Efficient algorithms for summarizing graph patterns. IEEE Trans Knowl Data Eng 23(9):1388–1405CrossRef
go back to reference Li J, Liu G, Li H, Wong L (2007) Maximal biclique subgraphs and closed pattern pairs of the adjacency matrix: a one-to-one correspondence and mining algorithms. IEEE Trans Knowl Data Eng 19(12):1625–1637CrossRef Li J, Liu G, Li H, Wong L (2007) Maximal biclique subgraphs and closed pattern pairs of the adjacency matrix: a one-to-one correspondence and mining algorithms. IEEE Trans Knowl Data Eng 19(12):1625–1637CrossRef
go back to reference Li J, Sim K, Liu G, Wong L (2008) Maximal quasi-bicliques with balanced noise tolerance: concepts and co-clustering applications. In: SDM, pp 72–83 Li J, Sim K, Liu G, Wong L (2008) Maximal quasi-bicliques with balanced noise tolerance: concepts and co-clustering applications. In: SDM, pp 72–83
go back to reference Lucchese C, Orlando S, Perego R (2010) A generative pattern model for mining binary datasets. In: SAC, pp 1109–1110 Lucchese C, Orlando S, Perego R (2010) A generative pattern model for mining binary datasets. In: SAC, pp 1109–1110
go back to reference Makino K, Uno T (2004) New algorithms for enumerating all maximal cliques. In: SWAT, pp 260–272 Makino K, Uno T (2004) New algorithms for enumerating all maximal cliques. In: SWAT, pp 260–272
go back to reference Micali S, Vazirani VV (1980) An \(\mathcal{O}(\sqrt{|V|}{|E|})\) algorithm for finding maximum matching in general graphs. In: FOCS, pp 17–27 Micali S, Vazirani VV (1980) An \(\mathcal{O}(\sqrt{|V|}{|E|})\) algorithm for finding maximum matching in general graphs. In: FOCS, pp 17–27
go back to reference Miettinen P, Mielikäinen T, Gionis A, Das G, Mannila H (2008) The discrete basis problem. IEEE Trans Knowl Data Eng 20(10):1348–1362CrossRef Miettinen P, Mielikäinen T, Gionis A, Das G, Mannila H (2008) The discrete basis problem. IEEE Trans Knowl Data Eng 20(10):1348–1362CrossRef
go back to reference Mirkin B (1996) Mathematical classification and clustering. Kluwer Academic Publishers, Dordrecht Mirkin B (1996) Mathematical classification and clustering. Kluwer Academic Publishers, Dordrecht
go back to reference Mucha M, Sankowski P (2004) Maximum matchings via gaussian elimination. In: FOCS, pp 248–255 Mucha M, Sankowski P (2004) Maximum matchings via gaussian elimination. In: FOCS, pp 248–255
go back to reference Mushlin RA, Gallagher S, Kershenbaum A, Rebbeck TR (2009) Clique-finding for heterogeneity and multidimensionality in biomarker epidemiology research: the chamber algorithm. PloS one 4(3):4862CrossRef Mushlin RA, Gallagher S, Kershenbaum A, Rebbeck TR (2009) Clique-finding for heterogeneity and multidimensionality in biomarker epidemiology research: the chamber algorithm. PloS one 4(3):4862CrossRef
go back to reference Ravetti MG, Moscato P (2008) Identification of a 5-protein biomarker molecular signature for predicting alzheimer’s disease. PLoS One 3(9):3111CrossRef Ravetti MG, Moscato P (2008) Identification of a 5-protein biomarker molecular signature for predicting alzheimer’s disease. PLoS One 3(9):3111CrossRef
go back to reference Reyal F, van Vliet MH, Armstrong NJ et al (2008) A comprehensive analysis of prognostic signatures reveals the high predictive capacity of the proliferation, immune response and rna splicing modules in breast cancer. Breast Cancer Res 10(6):R93CrossRef Reyal F, van Vliet MH, Armstrong NJ et al (2008) A comprehensive analysis of prognostic signatures reveals the high predictive capacity of the proliferation, immune response and rna splicing modules in breast cancer. Breast Cancer Res 10(6):R93CrossRef
go back to reference Robinson PN, Köhler S, Bauer S, Seelow D, Horn D, Mundlos S (2008) The human phenotype ontology: a tool for annotating and analyzing human hereditary disease. Am J Hum Genet 83(5):610–615CrossRef Robinson PN, Köhler S, Bauer S, Seelow D, Horn D, Mundlos S (2008) The human phenotype ontology: a tool for annotating and analyzing human hereditary disease. Am J Hum Genet 83(5):610–615CrossRef
go back to reference Robinson PN, Mundlos S (2010) The human phenotype ontology. Clin Genet 77(6):525–534CrossRef Robinson PN, Mundlos S (2010) The human phenotype ontology. Clin Genet 77(6):525–534CrossRef
go back to reference Tsukiyama S, Ide M, Ariyoshi H, Shirakawa I (1977) A new algorithm for generating all the maximal independent sets. SIAM J Comput 6:505MathSciNetMATHCrossRef Tsukiyama S, Ide M, Ariyoshi H, Shirakawa I (1977) A new algorithm for generating all the maximal independent sets. SIAM J Comput 6:505MathSciNetMATHCrossRef
go back to reference Uppalapati P, Xiang Y, Huang K (2010) Predicting prognostic markers for glioma using gene co-expression network analysis. In: Proceedings of the first ACM international conference on bioinformatics and computational biology, pp 546–551 Uppalapati P, Xiang Y, Huang K (2010) Predicting prognostic markers for glioma using gene co-expression network analysis. In: Proceedings of the first ACM international conference on bioinformatics and computational biology, pp 546–551
go back to reference van de Vijver MJ, He YD, van ’t Veer LJ et al (2002) A gene-expression signature as a predictor of survival in breast cancer. New Engl J Med 347(25):1999–2009CrossRef van de Vijver MJ, He YD, van ’t Veer LJ et al (2002) A gene-expression signature as a predictor of survival in breast cancer. New Engl J Med 347(25):1999–2009CrossRef
go back to reference van’t Veer LJ, Dai H, van de Vijver MJ et al (2002) Gene expression profiling predicts clinical outcome of breast cancer. Nature 415(6871):530–536CrossRef van’t Veer LJ, Dai H, van de Vijver MJ et al (2002) Gene expression profiling predicts clinical outcome of breast cancer. Nature 415(6871):530–536CrossRef
go back to reference Wang Y, Klijn PGM, Zhang Y et al (2005) Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet 365(9460):671–679 Wang Y, Klijn PGM, Zhang Y et al (2005) Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet 365(9460):671–679
go back to reference Wirapati P, Sotiriou C, Kunkel S et al (2008) Meta-analysis of gene expression profiles in breast cancer: toward a unified understanding of breast cancer subtyping and prognosis signatures. Breast Cancer Res 10(4):R65CrossRef Wirapati P, Sotiriou C, Kunkel S et al (2008) Meta-analysis of gene expression profiles in breast cancer: toward a unified understanding of breast cancer subtyping and prognosis signatures. Breast Cancer Res 10(4):R65CrossRef
go back to reference Xiang Y, Zhang CQ, Huang K (2012) Predicting glioblastoma prognosis networks using weighted gene co-expression network analysis on tcga data. BMC Bioinform 13(Suppl 2):S12 Xiang Y, Zhang CQ, Huang K (2012) Predicting glioblastoma prognosis networks using weighted gene co-expression network analysis on tcga data. BMC Bioinform 13(Suppl 2):S12
go back to reference Xiang Y, Jin R, Fuhry D, Dragan FF (2011) Summarizing transactional databases with overlapped hyperrectangles. Data Min Knowl Discov 23(2):215–251MathSciNetMATHCrossRef Xiang Y, Jin R, Fuhry D, Dragan FF (2011) Summarizing transactional databases with overlapped hyperrectangles. Data Min Knowl Discov 23(2):215–251MathSciNetMATHCrossRef
go back to reference Xiang Y, Payne P, Huang K (2012) Transactional database transformation and its application in prioritizing human disease genes. IEEE/ACM Trans Comput Biol Bioinform 9(1):294–304CrossRef Xiang Y, Payne P, Huang K (2012) Transactional database transformation and its application in prioritizing human disease genes. IEEE/ACM Trans Comput Biol Bioinform 9(1):294–304CrossRef
go back to reference Zhang J, Xiang Y, Ding L et al (2010) Using gene co-expression network analysis to predict biomarkers for chronic lymphocytic leukemia. BMC Bioinform 11(Suppl 9):S5CrossRef Zhang J, Xiang Y, Ding L et al (2010) Using gene co-expression network analysis to predict biomarkers for chronic lymphocytic leukemia. BMC Bioinform 11(Suppl 9):S5CrossRef
Metadata
Title
Merging network patterns: a general framework to summarize biomedical network data
Authors
Yang Xiang
David Fuhry
Kamer Kaya
Ruoming Jin
Ümit V. Çatalyürek
Kun Huang
Publication date
01-09-2012
Publisher
Springer Vienna
Published in
Network Modeling Analysis in Health Informatics and Bioinformatics / Issue 3/2012
Print ISSN: 2192-6662
Electronic ISSN: 2192-6670
DOI
https://doi.org/10.1007/s13721-012-0009-3

Premium Partner