Published in: Pattern Analysis and Applications 3-4/2008

1 September 2008 | Theoretical Advances

Model-based classification with dissimilarities: a maximum likelihood approach

Authors: Eugène-Patrice Ndong Nguéma, Guillaume Saint-Pierre

Abstract

Most classification problems concern objects lying in a Euclidean space, but in some situations only the dissimilarities between objects are known. We are concerned with supervised classification from an observed dissimilarity table, where the task is to classify new unobserved or implicit objects (known only through their dissimilarities to the previously classified objects forming the training data set) into predefined classes. This work concentrates on developing model-based classifiers for dissimilarities that take into account the measurement error with respect to the Euclidean distance. Basically, it is assumed that the unobserved objects are unknown parameters to be estimated in a Euclidean space, and that the observed dissimilarity table is a random perturbation, of Gaussian type, of their Euclidean distances. Allowing the distribution of these perturbations to vary across pairs of classes in the population leads to more flexible classification methods than the usual algorithms. Model parameters are estimated from the training data set by the maximum likelihood (ML) method, and allocation is done by assigning a new implicit object to the group in the population, and to the position in the Euclidean space, that maximize the conditional group likelihood under the estimated parameters. This point of view can be expected to be useful in classifying dissimilarity tables that are no longer Euclidean because of measurement error or instabilities of various types. Two possible structures are postulated for the error, resulting in two different model-based classifiers. First results on real or simulated data sets show interesting behavior of the two proposed algorithms, and the respective effects of the dissimilarity type and of the intrinsic dimension of the data are investigated. With respect to these two aspects, one of the constructed classifiers appears to be very promising. Interestingly, the intrinsic dimension of the data seems to have a much less adverse effect on our classifiers than initially feared, at least for small to moderate dimensions.
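
The allocation rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several points are assumptions made only for illustration: the error is taken to be additive and Gaussian with zero mean and a variance that depends on the pair of classes (the abstract mentions two error structures without specifying them); the training positions `X` and the variance table `var` are treated as already estimated; class labels are the integers 0, 1, ... so that they can index `var`; no class prior is used; and a simple gradient ascent with step halving stands in for whatever optimizer one would actually choose. All function and variable names are hypothetical.

```python
import math

def log_lik(y, d_new, X, labels, k, var):
    """Log-likelihood of d_new if the new object sits at y and belongs to class k."""
    total = 0.0
    for x_i, g_i, d_i in zip(X, labels, d_new):
        v = var[k][g_i]            # assumed error variance for the class pair (k, g_i)
        r = math.dist(y, x_i)      # Euclidean distance implied by position y
        total += -0.5 * math.log(2 * math.pi * v) - (d_i - r) ** 2 / (2 * v)
    return total

def gradient(y, d_new, X, labels, k, var):
    """Gradient of log_lik with respect to the position y."""
    grad = [0.0] * len(y)
    for x_i, g_i, d_i in zip(X, labels, d_new):
        r = math.dist(y, x_i)
        if r < 1e-12:              # direction is undefined on top of a training point
            continue
        w = (d_i - r) / (var[k][g_i] * r)
        for a in range(len(y)):
            grad[a] += w * (y[a] - x_i[a])
    return grad

def best_position(d_new, X, labels, k, var, start, iters=500):
    """Gradient ascent with step halving; the log-likelihood never decreases."""
    y = list(start)
    f = log_lik(y, d_new, X, labels, k, var)
    for _ in range(iters):
        grad = gradient(y, d_new, X, labels, k, var)
        step = 1.0
        while step > 1e-12:
            cand = [ya + step * ga for ya, ga in zip(y, grad)]
            f_cand = log_lik(cand, d_new, X, labels, k, var)
            if f_cand > f:
                y, f = cand, f_cand
                break
            step /= 2
        else:
            break                  # no improving step found: stop
    return y, f

def classify(d_new, X, labels, var):
    """Return the class with the largest maximised log-likelihood, plus all fits."""
    classes = sorted(set(labels))
    starts = []
    for c in classes:              # one starting point per class centroid
        pts = [x for x, g in zip(X, labels) if g == c]
        starts.append([sum(col) / len(pts) for col in zip(*pts)])
    fits = {}
    for k in classes:
        runs = [best_position(d_new, X, labels, k, var, s) for s in starts]
        fits[k] = max(runs, key=lambda run: run[1])
    return max(fits, key=lambda k: fits[k][1]), fits

# Toy data (made up): two classes of three training positions in the plane.
X = [(0, 0), (1, 0), (0, 1), (4, 4), (5, 4), (4, 5)]
labels = [0, 0, 0, 1, 1, 1]
var = [[0.01, 1.0], [1.0, 0.01]]   # small within-class, large between-class variance

# New object truly at (0.5, 0.5); its dissimilarities to class 1 are perturbed.
noise = [0.0, 0.0, 0.0, -1.0, 1.0, 0.0]
d_new = [math.dist((0.5, 0.5), x) + e for x, e in zip(X, noise)]

k_hat, fits = classify(d_new, X, labels, var)
print(k_hat, fits[k_hat][0])
```

In this toy setting the perturbed dissimilarities to the two class-1 objects at (4, 4) and (5, 4) differ by more than the distance between those objects, so no position in the plane can reproduce them. Under the class-1 hypothesis that mismatch is weighted by the small within-class variance and is heavily penalized, whereas under the class-0 hypothesis it is absorbed by the large between-class variance, so the object should be allocated to class 0 at a position close to (0.5, 0.5).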

Metadata
Title
Model-based classification with dissimilarities: a maximum likelihood approach
Authors
Eugène-Patrice Ndong Nguéma
Guillaume Saint-Pierre
Publication date
1 September 2008
Publisher
Springer-Verlag
Published in
Pattern Analysis and Applications / Issue 3-4/2008
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-008-0105-2
