Published in: Neural Computing and Applications 6/2016

01-08-2016 | Original Article

Learning node labels with multi-category Hopfield networks

Authors: Marco Frasca, Simone Bassis, Giorgio Valentini

Abstract

In several real-world node label prediction problems on graphs, in fields ranging from computational biology to World Wide Web analysis, nodes can be partitioned into categories different from the classes to be predicted, on the basis of their characteristics or their common properties. Such partitions may provide further information about node classification that classical machine learning algorithms do not take into account. We introduce a novel family of parametric Hopfield networks (m-category Hopfield networks) and a novel algorithm (Hopfield multi-category, HoMCat), designed to appropriately exploit the presence of property-based partitions of nodes into multiple categories. Moreover, the proposed model adopts a cost-sensitive learning strategy to prevent the marked decay in performance usually observed when instance labels are unbalanced, that is, when one class of labels is heavily underrepresented relative to the other. We validate the proposed model on both synthetic and real-world data, in the context of multi-species function prediction, where the classes to be predicted are the Gene Ontology terms and the categories are the different species in the multi-species protein network. We carried out an intensive experimental validation, which on the one hand compares HoMCat with several state-of-the-art graph-based algorithms, and on the other hand reveals that exploiting meaningful prior partitions of input data can substantially improve classification performance.
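
The general idea of a Hopfield network whose parameters depend on a node's category can be illustrated with a toy sketch. This is not the HoMCat algorithm, whose details the abstract does not give. It only assumes that each category has its own positive activation value, negative activation value, and threshold (the hypothetical names `pos_value`, `neg_value`, and `theta`), that labeled nodes are clamped, and that unlabeled nodes are updated asynchronously until a fixed point is reached. The graph, labels, and parameter values are invented for illustration.

```python
def run_category_hopfield(weights, categories, labels,
                          pos_value, neg_value, theta, max_sweeps=100):
    """Toy Hopfield-style label prediction with per-category parameters.

    weights    -- symmetric n x n list of lists with a zero diagonal
    categories -- category index of each node
    labels     -- True / False for labeled nodes, None for unlabeled ones
    pos_value, neg_value, theta -- hypothetical per-category parameters
    Returns one boolean prediction per node.
    """
    n = len(weights)
    state = []
    for i in range(n):
        c = categories[i]
        if labels[i] is None:
            state.append(0.0)            # unlabeled nodes start neutral
        elif labels[i]:
            state.append(pos_value[c])   # clamped positive node
        else:
            state.append(neg_value[c])   # clamped negative node

    unlabeled = [i for i in range(n) if labels[i] is None]
    for _ in range(max_sweeps):
        changed = False
        for i in unlabeled:              # asynchronous update, one node at a time
            c = categories[i]
            net = sum(weights[i][j] * state[j] for j in range(n) if j != i)
            new = pos_value[c] if net - theta[c] > 0 else neg_value[c]
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:                  # fixed point reached
            break
    return [s > 0 for s in state]


def build_weights(n, edges):
    """Build a symmetric weight matrix from (i, j, weight) triples."""
    w = [[0.0] * n for _ in range(n)]
    for i, j, value in edges:
        w[i][j] = value
        w[j][i] = value
    return w


# Invented example: two categories of three nodes, joined by one weak edge.
W = build_weights(6, [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 0.2),
                      (3, 4, 1.0), (4, 5, 1.0)])
CATEGORIES = [0, 0, 0, 1, 1, 1]
LABELS = [True, None, None, False, None, False]

same_params = run_category_hopfield(
    W, CATEGORIES, LABELS,
    pos_value=[1.0, 1.0], neg_value=[-1.0, -1.0], theta=[0.0, 0.0])

# Raising the threshold of category 0 alone changes its predictions.
strict_cat0 = run_category_hopfield(
    W, CATEGORIES, LABELS,
    pos_value=[1.0, 1.0], neg_value=[-1.0, -1.0], theta=[0.9, 0.0])
```

In this sketch, choosing a larger positive activation value or a lower threshold for a category would be one way to compensate for scarce positive labels, in the spirit of the cost-sensitive strategy the abstract mentions. How HoMCat actually learns its parameters is not stated here.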

Metadata
Title
Learning node labels with multi-category Hopfield networks
Authors
Marco Frasca
Simone Bassis
Giorgio Valentini
Publication date
01-08-2016
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 6/2016
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-015-1965-1
