ABSTRACT
We propose a local, generative model for similarity-based classification. The method is applicable to the case that only pairwise similarities between samples are available. The classifier models the local class-conditional distribution using a maximum entropy estimate and empirical moment constraints. The resulting exponential class conditional-distributions are combined with class prior probabilities and misclassification costs to form the local similarity discriminant analysis (local SDA) classifier. We compare the performance of local SDA to a non-local version, to the local nearest centroid classifier, the nearest centroid classifier, k-NN, and to the recently-developed potential support vector machine (PSVM). Results show that local SDA is competitive with k-NN and the computationally-demanding PSVM while offering the advantages of a generative classifier.
- Belongie, S., Malik, J., & Puzicha, J. (2002). Shape matching and object recognition using shape contexts. IEEE Trans. on Pattern Analysis and Machine Intelligence, 24, 509--522. Google ScholarDigital Library
- Bicego, M., Murino, V., Pelillo, M., & Torsello, A. (2006). Special issue on similarity-based classification. Pattern Recognition, 39. Google ScholarDigital Library
- Cost, S., & Salzberg, S. (1993). A weighted nearest neighbor algorithm for learning with symbolic features. Machine Learning, 10, 57--78. Google ScholarDigital Library
- Cover, T., & Thomas, J. (1991). Elements of information theory. New York: John Wiley and Sons. Google ScholarDigital Library
- Devroye, L., Gyorfi, L., & Lugosi, G. (1996). A probabilistic theory of pattern recognition. New York: Springer-Verlag Inc.Google Scholar
- Duin, R. P. W., Pekalska, E., & de Ridder, D. (1999). Relational discriminant analysis. Pattern Recognition Letters, 20, 1175--1181. Google ScholarDigital Library
- Friedlander, M. P., & Gupta, M. R. (2006). On minimizing distortion and relative entropy. IEEE Trans. on Information Theory, 52, 238--245. Google ScholarDigital Library
- Gati, I., & Tversky, A. (1984). Weighting common and distinctive features in perceptual and conceptual judgments. Cognitive Psychology, 341--370.Google Scholar
- Goldfarb, L. (1985). A new approach to pattern recognition. Progress in Pattern Recognition, 2, 241--402.Google Scholar
- Graepel, T., Herbrich, R., & Obermayer, K. (1999). Classification on pairwise proximity data. Advances in Neural Information Processing Systems 11, 438--444. Google ScholarDigital Library
- Gupta, M. R., Cazzanti, L., & Koppal, A. J. (2007). Maximum entropy generative models for similarity-based learning. IEEE Intl. Symp. on Information Theory, to appear.Google ScholarCross Ref
- Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning. New York: Springer-Verlag.Google Scholar
- Hochreiter, S., Mozer, M. C., & Obermayer, K. (2003). Coulomb classifiers: Generalizing support vector machines via an analogy to electrostatic systems. Advances in Neural Information Processing Systems 15, 545--552.Google Scholar
- Hochreiter, S., & Obermayer, K. (2006). Support vector machines for dyadic data. Neural Computation, 18, 1472--1510. Google ScholarDigital Library
- Hoffmann, T., & Buhmann, J. (1997). Pairwise data clustering by deterministic annealing. IEEE Trans. on Pattern Analysis and Machine Intelligence, 19. Google ScholarDigital Library
- Jacobs, D. W., Weinshall, D., & Gdalyahu, Y. (2000). Classification with nonmetric distances: Image retrieval and class representation. IEEE Trans. on Pattern Analysis and Machine Intelligence, 22, 583--600. Google ScholarDigital Library
- Jaynes, E. T. (1982). On the rationale for maximum entropy methods. Proc. of the IEEE, 70, 939--952.Google ScholarCross Ref
- Jaynes, E. T. (2003). Probability theory: the logic of science. Cambridge University Press.Google Scholar
- Mitani, Y., & Hamamoto, Y. (2000). Classifier design based on the use of nearest neighbor samples. Proc. of the Intl. Conf. on Pattern Recognition, 769--772.Google ScholarCross Ref
- Mitani, Y., & Hamamoto, Y. (2006). A local mean-based nonparametric classifier. Pattern Recognition Letters, 27, 1151--1159. Google ScholarDigital Library
- Newman, D. J., Hettich, S., Blake, C. L., & Merz, C. J. (1998). UCI repository of machine learning databases.Google Scholar
- Pekalska, E., Paclíc, P., & Duin, R. P. W. (2001). A generalized kernel approach to dissimilarity-based classification. Journal of Machine Learning Research, 175--211. Google ScholarDigital Library
- Rosch, E. (1973). Natural categories. Cognitive Psychology, 328--350.Google Scholar
- Simard, P., Cun, Y. L., & Denker, J. (1993). Efficient pattern recognition using a new transformation distance. Advances in Neural Information Processing Systems 5, 50--68. Google ScholarDigital Library
- Stanfill, C., & Waltz, D. (1986). Toward memory-based reasoning. Communications of the ACM, 29, 1213--1228. Google ScholarDigital Library
- Tversky, A. (1977). Features of similarity. Psychological Review, 327--352.Google Scholar
- Tversky, A., & Gati, I. (1978). Studies of similarity. In E. Rosch and B. Lloyd (Eds.), Cognition and categorization. Hillsdale, N.J.: Earlbaum.Google Scholar
- Van Campenhout, J., & Cover, T. (1981). Maximum entropy and conditional probability. IEEE Trans. on Information Theory, 27, 483--489.Google ScholarDigital Library
- Weinshall, D., Jacobs, D. W., & Gdalyahu, Y. (1999). Classification in non-metric spaces. Advances in Neural Information Processing Systems 11, 838--844. Google ScholarDigital Library
- Zhang, H., Berg, A. C., Maire, M., & Malik, J. (2006). SVM-KNN: discriminative nearest neighbor classification for visual category recognition. Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, 2126--2136. Google ScholarDigital Library
- Local similarity discriminant analysis
Recommendations
Regularizing the Local Similarity Discriminant Analysis Classifier
ICMLA '09: Proceedings of the 2009 International Conference on Machine Learning and ApplicationsWe investigate parameter-based and distribution-based approaches to regularizing the generative, similarity-based classifier called local similarity discriminant analysis classifier (local SDA). We argue that regularizing distributions rather than ...
Local Tangent Space Discriminant Analysis
We propose a novel supervised dimensionality reduction method named local tangent space discriminant analysis (TSD) which is capable of utilizing the geometrical information from tangent spaces. The proposed method aims to seek an embedding space where ...
Joint Global and Local Structure Discriminant Analysis
Linear discriminant analysis (LDA) only considers the global Euclidean geometrical structure of data for dimensionality reduction. However, previous works have demonstrated that the local geometrical structure is effective for dimensionality reduction. ...
Comments