Abstract
Kernel methods in general, and support vector machines in particular, have been successful in various learning tasks on data represented in a single table. Much 'real-world' data, however, is structured: it has no natural representation in a single table. To apply kernel methods to such data, extensive pre-processing is usually performed to embed the data into a real vector space and thus into a single table. This survey describes several approaches to defining positive definite kernels directly on structured instances.
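As a small illustration of what defining a kernel directly on structured instances can look like, the sketch below computes a k-mer (spectrum-style) kernel between two strings. It is not taken from the survey text: the function names `spectrum_features` and `spectrum_kernel` and the default `k = 2` are assumptions chosen for the example. The value is the inner product of substring count vectors, so the function is positive definite by construction, and no fixed-width table over all possible substrings is ever built.

```python
from collections import Counter


def spectrum_features(s: str, k: int = 2) -> Counter:
    """Count every contiguous substring of length k in s (hypothetical helper)."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


def spectrum_kernel(s: str, t: str, k: int = 2) -> int:
    """Inner product of the k-mer count vectors of s and t.

    Only k-mers that actually occur in s are visited, so the feature
    space over all possible k-mers is never materialised.
    """
    fs = spectrum_features(s, k)
    ft = spectrum_features(t, k)
    # A Counter returns 0 for missing keys, so unshared k-mers contribute nothing.
    return sum(count * ft[kmer] for kmer, count in fs.items())
```

A kernel method such as a support vector machine only needs these pairwise values, so strings of different lengths can be compared without padding or truncation.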