skip to main content
10.1145/2783258.2783417acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
research-article

Deep Graph Kernels

Published:10 August 2015Publication History

ABSTRACT

In this paper, we present Deep Graph Kernels, a unified framework to learn latent representations of sub-structures for graphs, inspired by latest advancements in language modeling and deep learning. Our framework leverages the dependency information between sub-structures by learning their latent representations. We demonstrate instances of our framework on three popular graph kernels, namely Graphlet kernels, Weisfeiler-Lehman subtree kernels, and Shortest-Path graph kernels. Our experiments on several benchmark datasets show that Deep Graph Kernels achieve significant improvements in classification accuracy over state-of-the-art graph kernels.

Skip Supplemental Material Section

Supplemental Material

p1365.mp4

mp4

212.6 MB

References

  1. A. Andreeva, D. Howorth, S. E. Brenner, T. J. Hubbard, C. Chothia, and A. G. Murzin. Scop database in 2004: refinements integrate structure and sequence family data. Nucleic acids research, 32 (suppl 1): D226--D229, 2004.Google ScholarGoogle Scholar
  2. Y. Bengio, R. Ducharme, P. Vincent, and C. Janvin. A neural probabilistic language model. The Journal of Machine Learning Research, 3: 1137--1155, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. K. M. Borgwardt and H.-P. Kriegel. Shortest-path kernels on gras. In Proc. Intl. Conf. Data Mining, pages 74--81, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. K. M. Borgwardt, C. S. Ong, S. Schönauer, S. V. N. Vishwanathan, A. J. Smola, and H.-P. Kriegel. Protein function prediction via gra kernels. In Proceedings of Intelligent Systems in Molecular Biology (ISMB), Detroit, USA, 2005.Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. C. Chang and C. Lin. LIBSVM: a library for support vector machines, 2001.Google ScholarGoogle Scholar
  6. F. Costa and K. De Grave. Fast neighborhood subgra pairwise distance kernel. In Proceedings of the 26th International Conference on Machine Learning, pages 255--262, 2010.Google ScholarGoogle Scholar
  7. A. K. Debnath, R. L. Lopez de Compadre, G. Debnath, A. J. Shusterman, and C. Hansch. Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. correlation with molecular orbital energies and hydroobicity. J Med Chem, 34: 786--797, 1991.Google ScholarGoogle ScholarCross RefCross Ref
  8. A. Feragen, N. Kasenburg, J. Petersen, M. de Bruijne, and K. Borgwardt. Scalable kernels for gras with continuous attributes. In Advances in Neural Information Processing Systems, pages 216--224, 2013.Google ScholarGoogle Scholar
  9. N. K. Fox, S. E. Brenner, and J.-M. Chandonia. Scope: Structural classification of proteins-extended, integrating scop and astral data and classification of new structures. Nucleic acids research, 42 (D1): D304--D309, 2014.Google ScholarGoogle Scholar
  10. T. Gärtner, P. Flach, and S. Wrobel. On graph kernels: Hardness results and efficient alternatives. In B. Schölkopf and M. K. Warmuth, editors, Proc. Annual Conf. Computational Learning Theory, pages 129--143. Springer, 2003.Google ScholarGoogle Scholar
  11. D. Haussler. Convolution kernels on discrete structures. Technical Report UCS-CRL-99-10, UC Santa Cruz, 1999.Google ScholarGoogle Scholar
  12. T. Hofmann, B. Schölkopf, and A. J. Smola. Kernel methods in machine learning. Technical Report 156, Max-Planck-Institut für biologische Kybernetik, 2006. To appear in the Annals of Statistics.Google ScholarGoogle Scholar
  13. T. Horvath, T. Gärtner, and S. Wrobel. Cyclic pattern kernels for predictive gra mining. In Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDD), pages 158--167, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. H. Kashima, K. Tsuda, and A. Inokuchi. Marginalized kernels between labeled gras. In Proc. Intl. Conf. Machine Learning, pages 321--328, San Francisco, CA, 2003. Morgan Kaufmann.Google ScholarGoogle Scholar
  15. T. Kin, T. Kato, and K. Tsuda. Protein classification via kernel matrix completion. In B. Schölkopf, K. Tsuda, and J.-P. Vert, editors, Kernel Methods in Computational Biology, pages 261--274, Cambridge, MA; USA, 2004. MIT Press.Google ScholarGoogle Scholar
  16. J. Leskovec, J. Kleinberg, and C. Faloutsos. Gras over time: densification laws, shrinking diameters and possible explanations. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pages 177--187. ACM, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. C. Leslie, E. Eskin, and W. S. Noble. The spectrum kernel: A string kernel for SVM protein classification. In Proceedings of the Pacific Symposium on Biocomputing, pages 564--575, Singapore, 2002. World Scientific Publishing.Google ScholarGoogle Scholar
  18. O. Levy and Y. Goldberg. Neural word embedding as implicit matrix factorization. In M. Welling, Z. Ghahramani, C. Cortes, N. Lawrence, and K. Weinberger, editors, Advances in Neural Information Processing Systems 27, pages 2177--2185, 2014.Google ScholarGoogle Scholar
  19. B. D. McKay. Nauty user's guide (version 2.4). Computer Science Dept., Australian National University, 2007.Google ScholarGoogle Scholar
  20. T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013.Google ScholarGoogle Scholar
  21. T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean. Distributed representations of words and rases and their compositionality. In C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Weinberger, editors, Advances in Neural Information Processing Systems 26, 2013.Google ScholarGoogle Scholar
  22. B. Perozzi, R. Al-Rfou, and S. Skiena. Deepwalk: Online learning of social representations. arXiv preprint arXiv:1403.6652, 2014.Google ScholarGoogle Scholar
  23. N. Przulj. Biological network comparison using gralet degree distribution. In 2006 European Conference on Computational Biology (ECCB), September 2006.Google ScholarGoogle Scholar
  24. J. Ramon and T. Gärtner. Expressivity versus efficiency of gra kernels. Technical report, First International Workshop on Mining Gras, Trees and Sequences (held with ECML/PKDD'03), 2003.Google ScholarGoogle Scholar
  25. R. Řehůřek and P. Sojka. Software Framework for Topic Modelling with Large Corpora. In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pages 45--50, Valletta, Malta, May 2010. ELRA.Google ScholarGoogle Scholar
  26. B. Schölkopf and A. J. Smola. Learning with Kernels. MIT Press, Cambridge, MA, 2002.Google ScholarGoogle Scholar
  27. N. Shervashidze and K. Borgwardt. Fast subtree kernels on gras. In Neural Information Processing Systems, 2010.Google ScholarGoogle Scholar
  28. N. Shervashidze, S. V. N. Vishwanathan, T. Petri, K. Mehlhorn, and K. Borgwardt. Efficient gralet kernels for large gra comparison. In M. Welling and D. van Dyk, editors, Proc. Intl. Conference on Artificial Intelligence and Statistics. Society for Artificial Intelligence and Statistics, 2009.Google ScholarGoogle Scholar
  29. A. Shrivastava and P. Li. A new space for comparing gras. In Advances in Social Networks Analysis and Mining (ASONAM), 2014 IEEE/ACM International Conference on, pages 62--71. IEEE, 2014.Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. A. J. Smola and R. Kondor. Kernels and regularization on gras. In B. Schölkopf and M. K. Warmuth, editors, Proc. Annual Conf. Computational Learning Theory, Lecture Notes in Comput. Sci., pages 144--158, Heidelberg, Germany, 2003. Springer-Verlag.Google ScholarGoogle ScholarCross RefCross Ref
  31. H. Toivonen, A. Srinivasan, R. D. King, S. Kramer, and C. Helma. Statistical evaluation of the predictive toxicology challenge 2000--2001. Bioinformatics, 19 (10): 1183--1193, July 2003.Google ScholarGoogle ScholarCross RefCross Ref
  32. S. V. N. Vishwanathan, K. M. Borgwardt, O. Guttman, and A. J. Smola. Kernel extrapolation. Neurocomputing, 69 (7-9): 721--729, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. S. V. N. Vishwanathan, N. N. Schraudol, I. R. Kondor, and K. M. Borgwardt. Gra kernels. Journal of Machine Learning Research, 2010. In press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. N. Wale, I. A. Watson, and G. Karypis. Comparison of descriptor spaces for chemical compound retrieval and classification. Knowledge and Information Systems, 14 (3): 347--375, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Deep Graph Kernels

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader