Skip to main content
Erschienen in: Knowledge and Information Systems 1/2015

01.01.2015 | Regular Paper

Detecting multiple stochastic network motifs in network data

verfasst von: Kai Liu, William K. Cheung, Jiming Liu

Erschienen in: Knowledge and Information Systems | Ausgabe 1/2015

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Network motifs are referred to as the interaction patterns that occur significantly more often in a complex network than in the corresponding randomized networks. They have been found effective in characterizing many real-world networks. A number of network motif detection algorithms have been proposed in the literature where the interactions in a motif are mostly assumed to be deterministic, i.e., either present or missing. With the conjecture that the real-world networks are resulted from interaction patterns which should be stochastic in nature, the use of stochastic models is proposed in this paper to achieve more robust motif detection. In particular, we propose the use of a finite mixture model to detect multiple stochastic network motifs. A component-wise expectation maximization (CEM) algorithm is derived for the finite mixture of stochastic network motifs so that both the optimal number of motifs and the motif parameters can be automatically estimated. For performance evaluation, we applied the proposed algorithm to both synthetic networks and a number of online social network data sets and demonstrated that it outperformed the deterministic motif detection algorithm FANMOD as well as the conventional EM algorithm in term of its robustness against noise. Also, how to interpret the detected stochastic network motifs to gain insights on the interaction patterns embedded in the network data is discussed. In addition, the algorithm’s computational complexity and runtime performance are presented for efficiency evaluation.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Fußnoten
3
The formulation presented in this section is based on directed graphs, but can be easily modified for undirected graphs as well.
 
4
The relative errors \(e_{\lambda }\) and \(e_{\Theta }\) for the mixture models inferred based on different Z-score thresholds for initialization were calculated by comparing the model parameters \(\lambda \) and \(\Theta \) with the reference case of having the Z-score threshold set to 2.
 
Literatur
3.
Zurück zum Zitat Kleinberg J (1999) Hubs, authorities, and communities. ACM Comput Surv 31(4):5–7CrossRef Kleinberg J (1999) Hubs, authorities, and communities. ACM Comput Surv 31(4):5–7CrossRef
4.
Zurück zum Zitat Leskovec J, Lang K, Mahoney M (2010) Empirical comparison of algorithms for network community detection. In: Proceedings of the 19th international conference on World Wide Web, pp 631–640 Leskovec J, Lang K, Mahoney M (2010) Empirical comparison of algorithms for network community detection. In: Proceedings of the 19th international conference on World Wide Web, pp 631–640
5.
Zurück zum Zitat Papadopoulos S, Kompatsiaris Y, Vakali A, Spyridonos P (2012) Community detection in social media. Data Min Knowl Discov 24(3):515–554CrossRef Papadopoulos S, Kompatsiaris Y, Vakali A, Spyridonos P (2012) Community detection in social media. Data Min Knowl Discov 24(3):515–554CrossRef
6.
Zurück zum Zitat Comar P, Tan PN, Jain AK (2012) Simultaneous classification and community detection on heterogeneous network data. Data Min Knowl Discov 25(3):420–449CrossRefMathSciNetMATH Comar P, Tan PN, Jain AK (2012) Simultaneous classification and community detection on heterogeneous network data. Data Min Knowl Discov 25(3):420–449CrossRefMathSciNetMATH
7.
Zurück zum Zitat Shen-Orr S, Milo R, Mangan S, Alon U (2002) Network motifs in the transcriptional regulation network of Escherichia coli. Nature Genet 31(1):64–68CrossRef Shen-Orr S, Milo R, Mangan S, Alon U (2002) Network motifs in the transcriptional regulation network of Escherichia coli. Nature Genet 31(1):64–68CrossRef
8.
Zurück zum Zitat Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U (2002) Network motifs: simple building blocks of complex networks. Science 298(5594):824–827CrossRef Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U (2002) Network motifs: simple building blocks of complex networks. Science 298(5594):824–827CrossRef
9.
Zurück zum Zitat Milo R, Itzkovitz S, Kashtan N, Levitt R, Shen-Orr S, Ayzenshtat I, Sheffer M, Alon U (2004) Superfamilies of evolved and designed networks. Science 303(5663):1538–1541CrossRef Milo R, Itzkovitz S, Kashtan N, Levitt R, Shen-Orr S, Ayzenshtat I, Sheffer M, Alon U (2004) Superfamilies of evolved and designed networks. Science 303(5663):1538–1541CrossRef
10.
Zurück zum Zitat Kashtan N, Itzkovitz S, Milo R, Alon U (2004) Efficient sampling algorithm for estimating subgraph concentrations and detecting network motifs. Bioinformatics 20(11):1746–1758CrossRef Kashtan N, Itzkovitz S, Milo R, Alon U (2004) Efficient sampling algorithm for estimating subgraph concentrations and detecting network motifs. Bioinformatics 20(11):1746–1758CrossRef
11.
Zurück zum Zitat Wernicke S (2006) Efficient detection of network motifs. IEEE/ACM Trans Comput Biol Bioinform 3(4):347–359CrossRef Wernicke S (2006) Efficient detection of network motifs. IEEE/ACM Trans Comput Biol Bioinform 3(4):347–359CrossRef
12.
Zurück zum Zitat Berg J, Michael L (2004) Local graph alignment and motif search in biological networks. Proc Natl Acad Sci 101(41):14689–14694CrossRef Berg J, Michael L (2004) Local graph alignment and motif search in biological networks. Proc Natl Acad Sci 101(41):14689–14694CrossRef
13.
Zurück zum Zitat Jiang R, Tu Z, Chen T, Sun F (2006) Network motif identification in stochastic networks. Proc Natl Acad Sci 103(25):9404–9409CrossRef Jiang R, Tu Z, Chen T, Sun F (2006) Network motif identification in stochastic networks. Proc Natl Acad Sci 103(25):9404–9409CrossRef
14.
Zurück zum Zitat Zhao Q, Tian Y, He Q, Oliver N, Jin R, Lee W (2010) Communication motifs: A tool to characterize social communications. In: Proccedings of the 19th ACM international conference on information and, knowledge management, pp 1645–1648 Zhao Q, Tian Y, He Q, Oliver N, Jin R, Lee W (2010) Communication motifs: A tool to characterize social communications. In: Proccedings of the 19th ACM international conference on information and, knowledge management, pp 1645–1648
15.
Zurück zum Zitat Kovanen L, Karsai M, Kaski K, Kertész J, Saramäki J (2011) Temporal motifs in time-dependent networks. J Stat Mech Theory Exp 11(11):5–23 Kovanen L, Karsai M, Kaski K, Kertész J, Saramäki J (2011) Temporal motifs in time-dependent networks. J Stat Mech Theory Exp 11(11):5–23
16.
Zurück zum Zitat Juszczyszyn K, Kazienko P, Musial K (2008) Local topology of social network based on motif analysis. In: Proceedings of knowledge-based intelligent information and engineering systems, pp 97–105 Juszczyszyn K, Kazienko P, Musial K (2008) Local topology of social network based on motif analysis. In: Proceedings of knowledge-based intelligent information and engineering systems, pp 97–105
17.
Zurück zum Zitat Musial K, Juszczyszyn K (2009) Motif-based analysis of social position influence on interconnection patterns in complex social network. In: Procceedings of the 1st Asian conference on intelligent information and database systems, pp 34–39 Musial K, Juszczyszyn K (2009) Motif-based analysis of social position influence on interconnection patterns in complex social network. In: Procceedings of the 1st Asian conference on intelligent information and database systems, pp 34–39
18.
Zurück zum Zitat Braha D, Bar-Yam Y (2009) Time-dependent complex networks: dynamic centrality, dynamic motifs, and cycles of social interactions. Theory, models and applications, adaptive networks, pp 39–50 Braha D, Bar-Yam Y (2009) Time-dependent complex networks: dynamic centrality, dynamic motifs, and cycles of social interactions. Theory, models and applications, adaptive networks, pp 39–50
19.
Zurück zum Zitat Leskovec J, Huttenlocher D, Kleinberg J (2010) Signed networks in social media. In: Proceedings of the 28th international conference on human factors in computing systems, pp 1361–1370 Leskovec J, Huttenlocher D, Kleinberg J (2010) Signed networks in social media. In: Proceedings of the 28th international conference on human factors in computing systems, pp 1361–1370
20.
Zurück zum Zitat Clauset A (2005) Finding local community structure in networks. Phys Rev E 72(2):132–137CrossRef Clauset A (2005) Finding local community structure in networks. Phys Rev E 72(2):132–137CrossRef
21.
Zurück zum Zitat Faust K (2007) Very local structure in social networks. Sociol Method 37(1):209–256CrossRef Faust K (2007) Very local structure in social networks. Sociol Method 37(1):209–256CrossRef
22.
Zurück zum Zitat Kossinets G, Kleinberg J, Watts D (2008) The structure of information pathways in a social communication network. In: Proceeding of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, pp 435–443 Kossinets G, Kleinberg J, Watts D (2008) The structure of information pathways in a social communication network. In: Proceeding of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, pp 435–443
24.
Zurück zum Zitat Gallos LK, Diego R, Fredrik L, Shlomo H, Hernán MA (2012) How people interact in evolving online affiliation networks. Phys Rev X 2(3):14–31 Gallos LK, Diego R, Fredrik L, Shlomo H, Hernán MA (2012) How people interact in evolving online affiliation networks. Phys Rev X 2(3):14–31
25.
Zurück zum Zitat Allan E, Turkett W, Fulp E (2009) Using network motifs to identify application protocols. In: Global telecommunications conference, pp 1–7 Allan E, Turkett W, Fulp E (2009) Using network motifs to identify application protocols. In: Global telecommunications conference, pp 1–7
26.
Zurück zum Zitat Callaghan D, Harrigan M, Carthy J, Cunningham P (2012) Network analysis of recurring YouTube spam campaigns. In: Proceedings of the 6th international AAAI conference on weblogs and social media, pp 531–534 Callaghan D, Harrigan M, Carthy J, Cunningham P (2012) Network analysis of recurring YouTube spam campaigns. In: Proceedings of the 6th international AAAI conference on weblogs and social media, pp 531–534
27.
Zurück zum Zitat Liu K, Cheung WK, Liu J (2011) Stochastic network motif detection in social media. In: Proccedings of ICDM workshop on data mining in networks, pp 949–956 Liu K, Cheung WK, Liu J (2011) Stochastic network motif detection in social media. In: Proccedings of ICDM workshop on data mining in networks, pp 949–956
28.
Zurück zum Zitat Liu K, Cheung W K, Liu J (2012) Detecting multiple stochastic network motifs in network data. In: Proceedings of the 16th Pacific–Asia conference on knowledge discovery and data mining, pp 205–217 Liu K, Cheung W K, Liu J (2012) Detecting multiple stochastic network motifs in network data. In: Proceedings of the 16th Pacific–Asia conference on knowledge discovery and data mining, pp 205–217
29.
Zurück zum Zitat Figueiredo M, Jain AK (2002) Unsupervised learning of finite mixture models. IEEE Trans Pattern Anal Mach Intell 24(3):381–396CrossRef Figueiredo M, Jain AK (2002) Unsupervised learning of finite mixture models. IEEE Trans Pattern Anal Mach Intell 24(3):381–396CrossRef
30.
Zurück zum Zitat Bruno F, Palopoli L, Rombo S (2010) New trends in graph mining: structural and node-colored network motifs. Int J Knowl Discov Bioinform 1(1):81–99CrossRef Bruno F, Palopoli L, Rombo S (2010) New trends in graph mining: structural and node-colored network motifs. Int J Knowl Discov Bioinform 1(1):81–99CrossRef
31.
Zurück zum Zitat Wong E, Baur B, Quader S, Huang C (2012) Biological network motif detection: principles and practice. Briefings Bioinform 12(2):202–215CrossRef Wong E, Baur B, Quader S, Huang C (2012) Biological network motif detection: principles and practice. Briefings Bioinform 12(2):202–215CrossRef
32.
Zurück zum Zitat Schreiber F, Schwöbbermeyer H (2005) Frequency concepts and pattern detection for the analysis of motifs in networks. Transactions on computational systems biology, pp 89–104 Schreiber F, Schwöbbermeyer H (2005) Frequency concepts and pattern detection for the analysis of motifs in networks. Transactions on computational systems biology, pp 89–104
33.
Zurück zum Zitat Schreiber F, Schwöbbermeyer H (2005) MAVisto: a tool for the exploration of network motifs. Bioinformatics 21(17):3572–3574CrossRef Schreiber F, Schwöbbermeyer H (2005) MAVisto: a tool for the exploration of network motifs. Bioinformatics 21(17):3572–3574CrossRef
34.
Zurück zum Zitat Chen J, Hsu W, Lee M, Ng S (2006) Nemofinder: dissecting genome-wide protein–protein interactions with meso-scale network motifs. In: Proccedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, pp 106–115 Chen J, Hsu W, Lee M, Ng S (2006) Nemofinder: dissecting genome-wide protein–protein interactions with meso-scale network motifs. In: Proccedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, pp 106–115
35.
Zurück zum Zitat Kashani Z, Ahrabian H, Elahi E et al (2009) Kavosh: a new algorithm for finding network motifs. BMC Bioinform 10(1):318–329CrossRef Kashani Z, Ahrabian H, Elahi E et al (2009) Kavosh: a new algorithm for finding network motifs. BMC Bioinform 10(1):318–329CrossRef
36.
Zurück zum Zitat Grochow J, Kellis M (2007) Network motif discovery using subgraph enumeration and symmetry-breaking. Research in computational, molecular biology, pp 92–106 Grochow J, Kellis M (2007) Network motif discovery using subgraph enumeration and symmetry-breaking. Research in computational, molecular biology, pp 92–106
37.
Zurück zum Zitat Omidi S, Schreiber F, Masoudi-Nejad A (2009) Moda: an efficient algorithm for network motif discovery in biological networks. Genes Genet Syst 84(5):385–395CrossRef Omidi S, Schreiber F, Masoudi-Nejad A (2009) Moda: an efficient algorithm for network motif discovery in biological networks. Genes Genet Syst 84(5):385–395CrossRef
38.
Zurück zum Zitat Fortin S (1996) The graph isomorphism problem. Technical Report 96–20, University of Alberta, Edomonton, Alberta, Canada Fortin S (1996) The graph isomorphism problem. Technical Report 96–20, University of Alberta, Edomonton, Alberta, Canada
39.
Zurück zum Zitat McKay B (2007) Nauty user’s guide (version 2.4). Computer Science Dept, Australian National University McKay B (2007) Nauty user’s guide (version 2.4). Computer Science Dept, Australian National University
40.
Zurück zum Zitat Vazquez A, Dobrin R, Sergi D, Eckmann J, Oltvai Z, Barabási A (2004) The topological relationship between the large-scale attributes and local interaction patterns of complex networks. Proc Natl Acad Sci 101(52):17940–17945CrossRef Vazquez A, Dobrin R, Sergi D, Eckmann J, Oltvai Z, Barabási A (2004) The topological relationship between the large-scale attributes and local interaction patterns of complex networks. Proc Natl Acad Sci 101(52):17940–17945CrossRef
41.
Zurück zum Zitat Tsourakakis CE (2011) Counting triangles in real-world networks using projections. Knowl Inf Syst 26(3):501–520CrossRef Tsourakakis CE (2011) Counting triangles in real-world networks using projections. Knowl Inf Syst 26(3):501–520CrossRef
42.
Zurück zum Zitat Kolountzakis M, Miller G, Peng R, Tsourakakis C (2012) Efficient triangle counting in large graphs via degree-based vertex partitioning. Internet Math 8(2):161–185CrossRefMathSciNetMATH Kolountzakis M, Miller G, Peng R, Tsourakakis C (2012) Efficient triangle counting in large graphs via degree-based vertex partitioning. Internet Math 8(2):161–185CrossRefMathSciNetMATH
43.
Zurück zum Zitat Seshadhri C, Pinar A, Kolda T (2013) Triadic measures on graphs: the power of wedge sampling. CoRR, abs/1202.5230 Seshadhri C, Pinar A, Kolda T (2013) Triadic measures on graphs: the power of wedge sampling. CoRR, abs/1202.5230
44.
Zurück zum Zitat Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodol) 39(1):1–38MathSciNetMATH Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodol) 39(1):1–38MathSciNetMATH
45.
Zurück zum Zitat Celeux G, Chrétien S, Forbes F, Mkhadri A (2001) A component-wise EM algorithm for mixtures. J Comput Graph Stat 10(4):697–712CrossRef Celeux G, Chrétien S, Forbes F, Mkhadri A (2001) A component-wise EM algorithm for mixtures. J Comput Graph Stat 10(4):697–712CrossRef
46.
Zurück zum Zitat Wallace C, Dowe D (1999) Minimum message length and kolmogorov complexity. Comput J 42(4):270–283CrossRefMATH Wallace C, Dowe D (1999) Minimum message length and kolmogorov complexity. Comput J 42(4):270–283CrossRefMATH
47.
Zurück zum Zitat Cover T, Thomas J, Wiley J et al (1991) Elements of information theory. Wiley, New YorkCrossRefMATH Cover T, Thomas J, Wiley J et al (1991) Elements of information theory. Wiley, New YorkCrossRefMATH
48.
Zurück zum Zitat Leskovec J, Adamic L, Huberman B (2007) The dynamics of viral marketing. ACM Trans Web 1(1):5–44CrossRef Leskovec J, Adamic L, Huberman B (2007) The dynamics of viral marketing. ACM Trans Web 1(1):5–44CrossRef
49.
Zurück zum Zitat Cartwright D, Harary F (1956) Structural balance: a generalization of Heider’s theory. Psychol Rev 63(5):277–293CrossRef Cartwright D, Harary F (1956) Structural balance: a generalization of Heider’s theory. Psychol Rev 63(5):277–293CrossRef
50.
Zurück zum Zitat Kumar N, Satoor S, Buck I (2009) Fast parallel expectation maximization for gaussian mixture models on GPUs using CUDA. In: Proccedings of the 11th IEEE international conference on high performance computing and communications, pp 103–109 Kumar N, Satoor S, Buck I (2009) Fast parallel expectation maximization for gaussian mixture models on GPUs using CUDA. In: Proccedings of the 11th IEEE international conference on high performance computing and communications, pp 103–109
Metadaten
Titel
Detecting multiple stochastic network motifs in network data
verfasst von
Kai Liu
William K. Cheung
Jiming Liu
Publikationsdatum
01.01.2015
Verlag
Springer London
Erschienen in
Knowledge and Information Systems / Ausgabe 1/2015
Print ISSN: 0219-1377
Elektronische ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-013-0680-4

Weitere Artikel der Ausgabe 1/2015

Knowledge and Information Systems 1/2015 Zur Ausgabe

Premium Partner