
2019 | OriginalPaper | Book chapter

Towards Heterogeneous Network Alignment: Design and Implementation of a Large-Scale Data Processing Framework

Written by: Marianna Milano, Pierangelo Veltri, Mario Cannataro, Pietro H. Guzzi

Published in: Euro-Par 2018: Parallel Processing Workshops

Publisher: Springer International Publishing


Abstract

Networks are widely recognised as an important tool for modelling and analysing biological data and the interplay of bio-molecules. Consequently, many algorithms for the analysis and comparison of networks (such as alignment algorithms) have been developed. More recently, many approaches have tried to integrate the interplay of different molecules, such as genes, transcription factors and microRNAs, into a single model. A possible formalism for modelling such a scenario comes from node-coloured networks (or heterogeneous networks), implemented as node/edge-coloured graphs. This creates the need for alignment algorithms able to analyse heterogeneous networks. To the best of our knowledge, none of the existing algorithms can mine heterogeneous networks. We propose a two-step alignment strategy that takes as input two heterogeneous networks (node-coloured graphs) and a similarity function among the nodes of the two networks, extending the previous formulations. We first build a single alignment graph. We then mine this graph, extracting relevant subgraphs. Although the approach is simple, the analysis of such networks relies on graph and subgraph isomorphism, and the size of the data keeps growing. The use of a high-performance data analytics framework is therefore needed. We present HetNetAligner, a framework built on top of Apache Spark. We also implemented our algorithm and tested it on some selected heterogeneous biological networks. Preliminary results confirm that our method may extract relevant knowledge from biological data while reducing the computational time.
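
The two-step strategy described in the abstract can be illustrated with a minimal sketch. This is not the HetNetAligner implementation: the abstract does not specify how the alignment graph is built or mined, so the sketch assumes that alignment-graph nodes are pairs of same-coloured nodes whose similarity reaches a threshold, that two pairs are linked when both input networks contain the corresponding edge, and that mining is approximated by connected components. The function names, the `threshold` parameter and the toy networks are all hypothetical, and plain Python is used in place of Apache Spark.

```python
from itertools import combinations


def build_alignment_graph(g1, g2, colour1, colour2, similarity, threshold=0.5):
    """Step 1 (assumed rule): build one alignment graph from two coloured graphs.

    g1, g2: dict mapping each node to the set of its neighbours (undirected).
    colour1, colour2: dict mapping each node to its colour (node type).
    similarity: function (u, v) -> float, supplied by the caller.
    """
    # Alignment-graph nodes: cross-network pairs with equal colour
    # and similarity at or above the threshold.
    pairs = [(u, v) for u in g1 for v in g2
             if colour1[u] == colour2[v] and similarity(u, v) >= threshold]
    adj = {p: set() for p in pairs}
    # Alignment-graph edges: both networks contain the corresponding edge.
    for (u1, v1), (u2, v2) in combinations(pairs, 2):
        if u1 != u2 and v1 != v2 and u2 in g1[u1] and v2 in g2[v1]:
            adj[(u1, v1)].add((u2, v2))
            adj[(u2, v2)].add((u1, v1))
    return adj


def connected_components(adj):
    """Step 2 (stand-in for the mining step): extract connected components."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


# Toy heterogeneous networks (hypothetical data): genes, TFs and microRNAs.
g1 = {"geneA": {"tf1"}, "tf1": {"geneA", "mir1"}, "mir1": {"tf1"}}
g2 = {"geneX": {"tfY"}, "tfY": {"geneX"}, "mirZ": set()}
colour1 = {"geneA": "gene", "tf1": "tf", "mir1": "mirna"}
colour2 = {"geneX": "gene", "tfY": "tf", "mirZ": "mirna"}

# Toy similarity: every same-coloured pair is considered fully similar.
adj = build_alignment_graph(g1, g2, colour1, colour2, lambda u, v: 1.0)
# Keep only components with at least two matched pairs as candidate alignments.
aligned = [c for c in connected_components(adj) if len(c) > 1]
print(aligned)
```

In this toy run the gene and transcription-factor pairs end up in one component because the gene-TF edge exists in both networks, while the microRNA pair stays isolated because its edge is present only in the first network. A real local aligner would typically use a more selective mining step, for example a clustering algorithm, rather than connected components.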


Metadata
Title
Towards Heterogeneous Network Alignment: Design and Implementation of a Large-Scale Data Processing Framework
Written by
Marianna Milano
Pierangelo Veltri
Mario Cannataro
Pietro H. Guzzi
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-10549-5_54
