Published in: Neural Computing and Applications 6/2016

1 August 2016 | Original Article

Learning node labels with multi-category Hopfield networks

Authors: Marco Frasca, Simone Bassis, Giorgio Valentini


Abstract

In several real-world node label prediction problems on graphs, in fields ranging from computational biology to World Wide Web analysis, nodes can be partitioned into categories different from the classes to be predicted, on the basis of their characteristics or their common properties. Such partitions may provide further information about node classification that classical machine learning algorithms do not take into account. We introduce a novel family of parametric Hopfield networks (m-category Hopfield networks) and a novel algorithm (Hopfield multi-category, HoMCat), designed to exploit the presence of property-based partitions of nodes into multiple categories. Moreover, the proposed model adopts a cost-sensitive learning strategy to prevent the marked decay in performance usually observed when instance labels are unbalanced, that is, when one class of labels is heavily underrepresented relative to the other. We validate the proposed model on both synthetic and real-world data, in the context of multi-species function prediction, where the classes to be predicted are the Gene Ontology terms and the categories are the different species in the multi-species protein network. We carried out an extensive experimental validation, which on the one hand compares HoMCat with several state-of-the-art graph-based algorithms, and on the other hand reveals that exploiting meaningful prior partitions of input data can substantially improve classification performance.
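
To make the general idea concrete, the following is a minimal sketch of Hopfield-style label inference on a graph in which each node category has its own activation threshold. It is not the HoMCat algorithm from the article: the function names, the choice of one threshold per category, the neutral starting value 0 for unlabelled nodes, and the toy graph are all assumptions made for illustration only.

```python
def build_weights(n, edges):
    """Build a symmetric n x n weight matrix with zero diagonal from (i, j, w) edges."""
    w = [[0.0] * n for _ in range(n)]
    for i, j, v in edges:
        w[i][j] = v
        w[j][i] = v
    return w


def hopfield_label_inference(weights, labels, categories, thresholds, max_sweeps=100):
    """Asynchronous Hopfield-style updates with labelled nodes clamped.

    labels: +1, -1, or None (unlabelled) for each node.
    categories: category index of each node (hypothetical partition).
    thresholds: one activation threshold per category (illustrative parameter).
    """
    n = len(weights)
    # Labelled nodes keep their label; unlabelled nodes start neutral (assumption).
    state = [lab if lab is not None else 0 for lab in labels]
    free = [i for i, lab in enumerate(labels) if lab is None]
    for _ in range(max_sweeps):
        changed = False
        for i in free:
            # Weighted sum of the current states of the neighbours of node i.
            field = sum(weights[i][j] * state[j] for j in range(n) if j != i)
            new = 1 if field - thresholds[categories[i]] > 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:  # a full sweep without changes: fixed point reached
            break
    return state


# Toy graph: two tight triangles joined by one weak edge (2, 3).
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
         (2, 3, 0.1)]
weights = build_weights(6, edges)
labels = [1, None, None, -1, None, None]   # one known label per triangle
categories = [0, 0, 0, 1, 1, 1]            # hypothetical node categories

balanced = hopfield_label_inference(weights, labels, categories, [0.0, 0.0])
strict = hopfield_label_inference(weights, labels, categories, [1.5, 0.0])
print(balanced)
print(strict)
```

With equal thresholds each triangle adopts the label of its single labelled node. Raising the threshold for category 0 makes positive predictions harder in that category only, which is one simple way a category-dependent parameter can change the outcome; the article's actual parametrisation and cost-sensitive learning procedure may differ.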

Metadata
Title
Learning node labels with multi-category Hopfield networks
Authors
Marco Frasca
Simone Bassis
Giorgio Valentini
Publication date
1 August 2016
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 6/2016
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-015-1965-1
