skip to main content
article

A survey of kernels for structured data

Published:01 July 2003Publication History
Skip Abstract Section

Abstract

Kernel methods in general and support vector machines in particular have been successful in various learning tasks on data represented in a single table. Much 'real-world' data, however, is structured - it has no natural representation in a single table. Usually, to apply kernel methods to 'real-world' data, extensive pre-processing is performed to embed the data into areal vector space and thus in a single table. This survey describes several approaches of defining positive definite kernels on structured instances directly.

References

  1. N. Aronszajn. Theory of reproducing kernels. Transactions of the American Mathematical Society, 68, 1950.]]Google ScholarGoogle Scholar
  2. K. Bennett and C. Campbell. Support vector machines: Hype or hallelujah? SIGKDD Explorations, 2(2), 2000. http://www.acm.org/sigs/sigkdd/explorations/issue2-2/bennett.pdf.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. B. E. Boser, I. M. Guyon, and V. N. Vapnik. A training algorithm for optimal margin classifiers. In D. Haussler, editor, Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, pages 144--152. ACM Press, July 1992.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. M. Collins and N. Duffy. Convolution kernels for natural language. In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems, volume 14. MIT Press, 2002.]]Google ScholarGoogle Scholar
  5. N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines (and Other Kernel-Based Learning Methods). Cambridge University Press, 2000.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. T. G. Dietterich, R. H. Lathrop, and T. Lozano-Pérez. Solving the multiple instance problem with axis-parallel rectangles. Artificial Intelligence, 89(1--2):31--71, 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. R. Durbin, S. Eddy, A. Krogh, and G. Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.]]Google ScholarGoogle ScholarCross RefCross Ref
  8. S. Džeroski and N. Lavrač, editors. Relational Data Mining. Springer-Verlag, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. S. Džeroski, S. Schulze-Kremer, K. Heidtke, K. Siems, D. Wettschereck, and H. Blockeel. Diterpene structure elucidation from 13C NMR spectra with inductive logic programming. Applied Artificial Intelligence, 12(5):363--383, July-Aug. 1998. Special Issue on First-Order Knowledge Discovery in Databases.]]Google ScholarGoogle ScholarCross RefCross Ref
  10. T. Gärtner. Exponential and geometric kernels for graphs. In NIPS Workshop on Unreal Data: Principles of Modeling Nonvectorial Data, 2002.]]Google ScholarGoogle Scholar
  11. T. Gärtner, K. Driessens, and J. Ramon. Graph kernels and gaussian processes for relational reinforcement learning. In Proceedings of the 13th International Conference on Inductive Logic Programming, 2003.]]Google ScholarGoogle ScholarCross RefCross Ref
  12. T. Gärtner, P. A. Flach, A. Kowalczyk, and A. J. Smola. Multi-instance kernels. In C. Sammut and A. Hoffmann, editors, Proceedings of the 19th International Conference on Machine Learning, pages 179--186. Morgan Kaufmann, June 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. T. Gärtner, P. A. Flach, and S. Wrobel. On graph kernels: Hardness results and efficient alternatives. In Proceedings of the 16th Annual Conference on Computa tional Learning Theory and the 7th Kernel Workshop, 2003.]]Google ScholarGoogle ScholarCross RefCross Ref
  14. T. Gärtner, J. W. Lloyd, and P. A. Flach. Kernels for structured data. In Proceedings of the 12th International Conference on Inductive Logic Programming. Springer-Verlag, 2002.]]Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. D. Haussler. Convolution kernels on discrete structures. Technical report, Department of Computer Science, University of California at Santa Cruz, 1999.]]Google ScholarGoogle Scholar
  16. T. Jaakkola, M. Diekhans, and D. Haussler. A discriminative framework for detecting remote protein homologies. Journal of Computational Biology, 7(1, 2), 2000.]]Google ScholarGoogle ScholarCross RefCross Ref
  17. T. Jaakkola and D. Haussler. Exploiting generative models in discriminative classifiers. In Advances in Neural Information Processing Systems, volume 10, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. T. Jaakkola and D. Haussler. Probabilistic kernel regression models. In Proceedings of the 1999 Conference on AI and Statistics, 1999.]]Google ScholarGoogle Scholar
  19. T. Joachims. Learning to Classify Text using Support Vector Machines. Kluwer Academic Publishers, 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. R. Karchin, K. Karplus, and D. Haussler. Classifying g-protein coupled receptors with support vector machines. Bioinformatics, 18(1):147--159, 2002.]]Google ScholarGoogle ScholarCross RefCross Ref
  21. H. Kashima and A. Inokuchi. Kernels for graph classification. In ICDM Workshop on Active Mining, 2002.]]Google ScholarGoogle Scholar
  22. H. Kashima and T. Koyanagi. Kernels for semistructured data. In C. Sammut and A. Hoffmann, editors, Proceedings of the 19th International Conference on Machine Learning. Morgan Kaufmann, 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. H. Kashima, K. Tsuda, and A. Inokuchi. Marginalized kernels between labeled graphs. In Proceedings of the 20th International Conference on Machine Learning, 2003.]]Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. R. I. Kondor and J. Lafferty. Diffusion kernels on graphs and other discrete input spaces. In C. Sammut and A. Hoffmann, editors, Proceedings of the 19th International Conference on Machine Learning, pages 315--322. Morgan Kaufmann, 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. S. Kramer, N. Lavrač, and P. A. Flach. Propositionalization approaches to relational data mining. In Džeroski and Lavrač {8}, chapter 11.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. M.-A. Krogel and S. Wrobel. Transformation-based learning using multirelational aggregation. In C. Rouveirol and M. Sebag, editors, Proceedings of the 11th International Conference on Inductive Logic Programming. Springer-Verlag, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. C. Leslie, E. Eskin, andW. Noble. The spectrum kernel: A string kernel for svm protein classification. In Proceedings of the Pacific Symposium on Biocomputing, pages 564--575, 2002.]]Google ScholarGoogle Scholar
  28. C. Leslie, E. Eskin, J. Weston, and W. Noble. Mismatch string kernels for svm protein classification. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems, volume 15. MIT Press, 2003.]]Google ScholarGoogle Scholar
  29. J. W. Lloyd. Logic for Learning. Springer-Verlag, 2002.]]Google ScholarGoogle Scholar
  30. H. Lodhi, C. Saunders, J. Shawe-Taylor, N. Cristianini, and C. Watkins. Text classification using string kernels. Journal of Machine Learning Research, 2, 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. H. Lodhi, J. Shawe-Taylor, N. Christianini, and C. Watkins. Text classification using string kernels. In T. Leen, T. Dietterich, and V. Tresp, editors, Advances in Neural Information Processing Systems, volume 13. MIT Press, 2001.]]Google ScholarGoogle Scholar
  32. K.-R. Müller, S. Mika, G. Rätsch, K. Tsuda, and B. Schölkopf. An introduction to kernel-based learning algorithms. IEEE Transactions on Neural Networks, 2(2), 2001.]]Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. G. Paass, E. Leopold, M. Larson, J. Kindermann, and S. Eickeler. Svm classification using sequences of phonemes and syllables. In T. Elomaa, H. Mannila, and H. Toivonen, editors, Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery, pages 373--384. Springer-Verlag, 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. P. Pavlidis, T. Furey, M. Liberto, D. Haussler, and W. Grundy. Promoter region-based classification of genes. In Proceedings of the Pacific Symposium on Biocomputing, pages 151--163, 2001.]]Google ScholarGoogle Scholar
  35. L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257--285, Feb. 1989.]]Google ScholarGoogle ScholarCross RefCross Ref
  36. C. Saunders, J. Shawe-Taylor, and A. Vinokourov. String kernels, fisher kernels and finite state automata. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems, volume 15. MIT Press, 2003.]]Google ScholarGoogle Scholar
  37. B. Schölkopf and A. J. Smola. Learning with Kernels. MIT Press, 2002.]]Google ScholarGoogle Scholar
  38. N. Smith and M. Gales. Speech recognition using SVMs. In T. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems, volume 14. MIT Press, 2002.]]Google ScholarGoogle Scholar
  39. K. Tsuda, M. Kawanabe, G. Rätsch, S. Sonnenburg, and K.-R. Müller. A new discriminative kernel from probabilistic models. In T. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems, volume 14. MIT Press, 2002.]]Google ScholarGoogle Scholar
  40. K. Tsuda, T. Kin, and K. Asai. Marginalized kernels for biological sequences. Bioinformatics, 2002.]]Google ScholarGoogle Scholar
  41. V. Vapnik. The Nature of Statistical Learning Theory. Springer-Verlag, 1995.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. J.-P. Vert. A tree kernel to analyze phylogenetic profiles. Bioinformatics, 2002.]]Google ScholarGoogle Scholar
  43. J.-P. Vert and M. Kanehisa. Graph driven features extraction from microarray data using diffusion kernels and kernel cca. In S. Becker, S. Thrun, and K. Ober mayer, editors, Advances in Neural Information Processing Systems, volume 15. MIT Press, 2003.]]Google ScholarGoogle Scholar
  44. S. Vishwanathan and A. Smola. Fast kernels for string and tree matching. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems, volume 15. MIT Press, 2003.]]Google ScholarGoogle Scholar
  45. C. Watkins. Dynamic alignment kernels. Technical report, Department of Computer Science, Royal Holloway, University of London, 1999.]]Google ScholarGoogle Scholar
  46. C. Watkins. Kernels from matching operations. Technical report, Department of Computer Science, Royal Holloway, University of London, 1999.]]Google ScholarGoogle Scholar
  47. A, Zien, G. Ratsch, S. Mika, B. Schölkopf, T. Lengauer, and K.-R. Muller. Engineering support vector machine kernels that recognize translation initiation sites. Bioinforrnatics, 16(9):799--807, 2000.]]Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. A survey of kernels for structured data
    Index terms have been assigned to the content through auto-classification.

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader