Skip to main content
Erschienen in: Data Mining and Knowledge Discovery 5/2017

30.03.2017

Lagrangian relaxations for multiple network alignment

verfasst von: Eric Malmi, Sanjay Chawla, Aristides Gionis

Erschienen in: Data Mining and Knowledge Discovery | Ausgabe 5/2017

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

We propose a principled approach for the problem of aligning multiple partially overlapping networks. The objective is to map multiple graphs into a single graph while preserving vertex and edge similarities. The problem is inspired by the task of integrating partial views of a family tree (genealogical network) into one unified network, but it also has applications, for example, in social and biological networks. Our approach, called Flan, introduces the idea of generalizing the facility location problem by adding a non-linear term to capture edge similarities and to infer the underlying entity network. The problem is solved using an alternating optimization procedure with a Lagrangian relaxation. Flan has the advantage of being able to leverage prior information on the number of entities, so that when this information is available, Flan is shown to work robustly without the need to use any ground truth data for fine-tuning method parameters. Additionally, we present three multiple-network extensions to an existing state-of-the-art pairwise alignment method called Natalie. Extensive experiments on synthetic, as well as real-world datasets on social networks and genealogical networks, attest to the effectiveness of the proposed approaches which clearly outperform a popular multiple network alignment method called IsoRankN.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
The code is available at: https://​github.​com/​ekQ/​flan.
 
2
Like in the case of Natalie, we assume an ordering of graphs and consider aligning vertex i with itself or any vertex from graphs \(g'=1,\ldots ,g-1\). In other words, we avoid considering simultaneously vertex i as an entity for vertex j and j as an entity for i, which we have observed to result in larger duality gaps.
 
3
The implementation of the feasibility heuristics is available at: https://​github.​com/​ekQ/​flan.
 
4
For simplicity, we write “\(\min _{B} \text {objective}\)” although the objective is being minimized only w.r.t. elements \(B_{jl}\), where \((j,\ell ) \notin E_I\).
 
5
The implementation is available at https://​www.​cs.​purdue.​edu/​homes/​dgleich/​codes/​netalign/​ and has been used in Bayati et al. (2013) and Malmi et al. (2016).
 
Literatur
Zurück zum Zitat Althaus E, Canzar S (2008) A Lagrangian relaxation approach for the multiple sequence alignment problem. J Comb Optim 16(2):127–154MathSciNetCrossRefMATH Althaus E, Canzar S (2008) A Lagrangian relaxation approach for the multiple sequence alignment problem. J Comb Optim 16(2):127–154MathSciNetCrossRefMATH
Zurück zum Zitat Bayati M, Gleich DF, Saberi A, Wang Y (2013) Message-passing algorithms for sparse network alignment. ACM Trans Knowl Discov Data 7(1):3CrossRef Bayati M, Gleich DF, Saberi A, Wang Y (2013) Message-passing algorithms for sparse network alignment. ACM Trans Knowl Discov Data 7(1):3CrossRef
Zurück zum Zitat Bezdek JC, Hathaway RJ (2003) Convergence of alternating optimization. Neural Parallel Sci Comput 11(4):351–368MathSciNetMATH Bezdek JC, Hathaway RJ (2003) Convergence of alternating optimization. Neural Parallel Sci Comput 11(4):351–368MathSciNetMATH
Zurück zum Zitat Bhattacharya I, Getoor L (2007) Collective entity resolution in relational data. ACM Trans Knowl Discov Data 1(1):5CrossRef Bhattacharya I, Getoor L (2007) Collective entity resolution in relational data. ACM Trans Knowl Discov Data 1(1):5CrossRef
Zurück zum Zitat Christen P (2012) Data matching: concepts and techniques for record linkage, entity resolution, and duplicate detection. Springer, BerlinCrossRef Christen P (2012) Data matching: concepts and techniques for record linkage, entity resolution, and duplicate detection. Springer, BerlinCrossRef
Zurück zum Zitat Christen P, Vatsalan D, Fu Z (2015) Advanced record linkage methods and privacy aspects for population reconstruction—a survey and case studies. In: Population reconstruction. Springer, pp 87–110 Christen P, Vatsalan D, Fu Z (2015) Advanced record linkage methods and privacy aspects for population reconstruction—a survey and case studies. In: Population reconstruction. Springer, pp 87–110
Zurück zum Zitat Clark C, Kalita J (2014) A comparison of algorithms for the pairwise alignment of biological networks. Bioinformatics 30(16):2351–2359CrossRef Clark C, Kalita J (2014) A comparison of algorithms for the pairwise alignment of biological networks. Bioinformatics 30(16):2351–2359CrossRef
Zurück zum Zitat Conte D, Foggia P, Sansone C, Vento M (2004) Thirty years of graph matching in pattern recognition. IJPRAI 18(3):265–298 Conte D, Foggia P, Sansone C, Vento M (2004) Thirty years of graph matching in pattern recognition. IJPRAI 18(3):265–298
Zurück zum Zitat Cornuejols G, Fisher ML, Nemhauser GL (1977) Location of bank accounts to optimize float: an analytic study of exact and approximate algorithms. Manag Sci 23(8):789–810MathSciNetCrossRefMATH Cornuejols G, Fisher ML, Nemhauser GL (1977) Location of bank accounts to optimize float: an analytic study of exact and approximate algorithms. Manag Sci 23(8):789–810MathSciNetCrossRefMATH
Zurück zum Zitat Efremova J, Ranjbar-Sahraei B, Rahmani H, Oliehoek FA, Calders T, Tuyls K, Weiss G (2015) Multi-source entity resolution for genealogical data. In: Population reconstruction. Springer, pp 129–154 Efremova J, Ranjbar-Sahraei B, Rahmani H, Oliehoek FA, Calders T, Tuyls K, Weiss G (2015) Multi-source entity resolution for genealogical data. In: Population reconstruction. Springer, pp 129–154
Zurück zum Zitat El-Kebir M, Heringa J, Klau GW (2015) Natalie 2.0: sparse global network alignment as a special case of quadratic assignment. Algorithms 8(4):1035–1051MathSciNetCrossRef El-Kebir M, Heringa J, Klau GW (2015) Natalie 2.0: sparse global network alignment as a special case of quadratic assignment. Algorithms 8(4):1035–1051MathSciNetCrossRef
Zurück zum Zitat Elmsallati A, Clark C, Kalita J (2015) Global alignment of protein–protein interaction networks: a survey. IEEE/ACM Trans Comput Biol Bioinform PP(99):1-1. doi:10.1109/TCBB.2015.2474391 Elmsallati A, Clark C, Kalita J (2015) Global alignment of protein–protein interaction networks: a survey. IEEE/ACM Trans Comput Biol Bioinform PP(99):1-1. doi:10.​1109/​TCBB.​2015.​2474391
Zurück zum Zitat Goga O, Loiseau P, Sommer R, Teixeira R, Gummadi KP (2015) On the reliability of profile matching across large online social networks. In: Proceedings of the 21st ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 1799–1808 Goga O, Loiseau P, Sommer R, Teixeira R, Gummadi KP (2015) On the reliability of profile matching across large online social networks. In: Proceedings of the 21st ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 1799–1808
Zurück zum Zitat Hu J, Kehr B, Reinert K (2013) NetCoffee: a fast and accurate global alignment approach to identify functionally conserved proteins in multiple networks. Bioinformatics 30(4):540–548CrossRef Hu J, Kehr B, Reinert K (2013) NetCoffee: a fast and accurate global alignment approach to identify functionally conserved proteins in multiple networks. Bioinformatics 30(4):540–548CrossRef
Zurück zum Zitat Klau GW (2009) A new graph-based method for pairwise global network alignment. BMC Bioinform 10(Suppl 1):S59CrossRef Klau GW (2009) A new graph-based method for pairwise global network alignment. BMC Bioinform 10(Suppl 1):S59CrossRef
Zurück zum Zitat Kouki P, Marcum C, Koehly L, Getoor L (2016) Entity resolution in familial networks. In: Proceedings of the 12th workshop on mining and learning with graphs Kouki P, Marcum C, Koehly L, Getoor L (2016) Entity resolution in familial networks. In: Proceedings of the 12th workshop on mining and learning with graphs
Zurück zum Zitat Sahraeian SME, Yoon BJ (2013) SMETANA: accurate and scalable algorithm for probabilistic alignment of large-scale biological networks. PLOS ONE 8(7):e67,995CrossRef Sahraeian SME, Yoon BJ (2013) SMETANA: accurate and scalable algorithm for probabilistic alignment of large-scale biological networks. PLOS ONE 8(7):e67,995CrossRef
Zurück zum Zitat Shor NZ (2012) Minimization methods for non-differentiable functions, vol 3. Springer, New York Shor NZ (2012) Minimization methods for non-differentiable functions, vol 3. Springer, New York
Zurück zum Zitat Singh R, Xu J, Berger B (2008) Global alignment of multiple protein interaction networks with application to functional orthology detection. Proc Natl Acad Sci 105(35):12763–12768CrossRef Singh R, Xu J, Berger B (2008) Global alignment of multiple protein interaction networks with application to functional orthology detection. Proc Natl Acad Sci 105(35):12763–12768CrossRef
Zurück zum Zitat Singla P, Domingos P (2006) Entity resolution with markov logic. In: Proceedings of the sixth international conference on data mining, ICDM’06. IEEE, pp 572–582 Singla P, Domingos P (2006) Entity resolution with markov logic. In: Proceedings of the sixth international conference on data mining, ICDM’06. IEEE, pp 572–582
Zurück zum Zitat Vazirani VV (2001) Approximation algorithms. Springer, New YorkMATH Vazirani VV (2001) Approximation algorithms. Springer, New YorkMATH
Zurück zum Zitat Winkler WE (1990) String comparator metrics and enhanced decision rules in the fellegi–sunter model of record linkage. In: Proceedings of the section on survey research methods. American Statistical Association, pp 354–359 Winkler WE (1990) String comparator metrics and enhanced decision rules in the fellegi–sunter model of record linkage. In: Proceedings of the section on survey research methods. American Statistical Association, pp 354–359
Zurück zum Zitat Zhai Y, Liu B (2005) Web data extraction based on partial tree alignment. In: Proceedings of the 14th international conference on world wide web. ACM, pp 76–85 Zhai Y, Liu B (2005) Web data extraction based on partial tree alignment. In: Proceedings of the 14th international conference on world wide web. ACM, pp 76–85
Zurück zum Zitat Zhang J, Yu PS (2015) Multiple anonymized social networks alignment. In: Proceedings of the IEEE international conference on data mining, ICDM’15. IEEE Zhang J, Yu PS (2015) Multiple anonymized social networks alignment. In: Proceedings of the IEEE international conference on data mining, ICDM’15. IEEE
Metadaten
Titel
Lagrangian relaxations for multiple network alignment
verfasst von
Eric Malmi
Sanjay Chawla
Aristides Gionis
Publikationsdatum
30.03.2017
Verlag
Springer US
Erschienen in
Data Mining and Knowledge Discovery / Ausgabe 5/2017
Print ISSN: 1384-5810
Elektronische ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-017-0505-2

Weitere Artikel der Ausgabe 5/2017

Data Mining and Knowledge Discovery 5/2017 Zur Ausgabe

Premium Partner