Skip to main content
Erschienen in: Knowledge and Information Systems 3/2017

25.05.2016 | Regular Paper

Toward value difference metric with attribute weighting

verfasst von: Chaoqun Li, Liangxiao Jiang, Hongwei Li, Jia Wu, Peng Zhang

Erschienen in: Knowledge and Information Systems | Ausgabe 3/2017

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

In distance metric learning, recent work has shown that value difference metric (VDM) with a strong attribute independence assumption outperforms other existing distance metrics. However, an open question is whether VDM with a less restrictive assumption can perform even better. Many approaches have been proposed to improve VDM by weakening the assumption. In this paper, we make a comprehensive survey on the existing improved approaches and then propose a new approach to improve VDM by attribute weighting. We name the proposed new distance function as attribute-weighted value difference metric (AWVDM). Moreover, we propose a modified attribute-weighted value difference metric (MAWVDM) by incorporating the learned attribute weights into the conditional probability estimates of AWVDM. AWVDM and MAWVDM significantly outperform VDM and inherit the computational simplicity of VDM simultaneously. Experimental results on a large number of UCI data sets validate the performance of AWVDM and MAWVDM.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Aha D (1992) Tolerating noisy, irrelevant, and novel attributes in instance-based learning algorithms. Int J Man Mach Stud 36(2):267–287CrossRef Aha D (1992) Tolerating noisy, irrelevant, and novel attributes in instance-based learning algorithms. Int J Man Mach Stud 36(2):267–287CrossRef
2.
Zurück zum Zitat Aha D, Kibler D, Albert MK (1991) Instance-based learning algorithms. Mach Learn 6:37–66 Aha D, Kibler D, Albert MK (1991) Instance-based learning algorithms. Mach Learn 6:37–66
3.
Zurück zum Zitat Alcalá-Fdez J, Fernandez A, Luengo J, Derrac J, García S, Sánchez L, Herrera F (2011) KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J Multiple-Valued Logic Soft Comput 17(2–3):255–287 Alcalá-Fdez J, Fernandez A, Luengo J, Derrac J, García S, Sánchez L, Herrera F (2011) KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J Multiple-Valued Logic Soft Comput 17(2–3):255–287
4.
Zurück zum Zitat Atkeson CG, Moore AW, Schaal S (1997) Locally weighted learning. Artif Intell Rev 11:11–73CrossRef Atkeson CG, Moore AW, Schaal S (1997) Locally weighted learning. Artif Intell Rev 11:11–73CrossRef
5.
Zurück zum Zitat Bian W, Tao D (2012) Constrained empirical risk minimization framework for distance metric learning. IEEE Trans Neural Netw Learn Syst 23(8):1194–1205CrossRef Bian W, Tao D (2012) Constrained empirical risk minimization framework for distance metric learning. IEEE Trans Neural Netw Learn Syst 23(8):1194–1205CrossRef
6.
Zurück zum Zitat Blanzieri E, Ricci F (1999) Probability based metrics for nearestneighbor classification and case-based reasoning. In: Proceedings of the 3rd international conference on case-based reasoning. Springer, pp 14–28 Blanzieri E, Ricci F (1999) Probability based metrics for nearestneighbor classification and case-based reasoning. In: Proceedings of the 3rd international conference on case-based reasoning. Springer, pp 14–28
7.
Zurück zum Zitat Cattral R, Oppacher F, Deugo D (2002) Evolutionary data mining with automatic rule generalization. Recent advances in computers, computing and communications. WSEAS Press, pp 296–300 Cattral R, Oppacher F, Deugo D (2002) Evolutionary data mining with automatic rule generalization. Recent advances in computers, computing and communications. WSEAS Press, pp 296–300
8.
Zurück zum Zitat Chen C, Zhang J, Fleischer R (2010) Distance approximating dimension reduction of riemannian manifolds. IEEE Trans Syst Man Cybern Part B: Cybern 40(1):208–217CrossRef Chen C, Zhang J, Fleischer R (2010) Distance approximating dimension reduction of riemannian manifolds. IEEE Trans Syst Man Cybern Part B: Cybern 40(1):208–217CrossRef
9.
Zurück zum Zitat Chen C, Zhuang Y, Nie F, Yang Y, Wu F, Xiao J (2011) Learning a 3D human pose distance metric from geometric pose descriptor. IEEE Trans Vis Comput Graphics 17(11):1676–1689CrossRef Chen C, Zhuang Y, Nie F, Yang Y, Wu F, Xiao J (2011) Learning a 3D human pose distance metric from geometric pose descriptor. IEEE Trans Vis Comput Graphics 17(11):1676–1689CrossRef
10.
Zurück zum Zitat Cheng V, Li CH, Kwok JT, Li CK (2004) Dissimilarity learning for nominal data. Pattern Recogn 37(7):1471–1477CrossRef Cheng V, Li CH, Kwok JT, Li CK (2004) Dissimilarity learning for nominal data. Pattern Recogn 37(7):1471–1477CrossRef
11.
Zurück zum Zitat Cleary JG, Trigg LE (1995) K*: An instance-based learner using an entropic distance measure. In: Proceedings of the 12th international conference on machine learning. Morgan Kaufmann, Tahoe City, pp 108–114 Cleary JG, Trigg LE (1995) K*: An instance-based learner using an entropic distance measure. In: Proceedings of the 12th international conference on machine learning. Morgan Kaufmann, Tahoe City, pp 108–114
12.
Zurück zum Zitat Cost S, Salzberg S (1993) A weighted nearest neighbor algorithm for learning with symbolic features. Mach Learn 10:57–78 Cost S, Salzberg S (1993) A weighted nearest neighbor algorithm for learning with symbolic features. Mach Learn 10:57–78
13.
Zurück zum Zitat Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30MathSciNetMATH Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30MathSciNetMATH
14.
Zurück zum Zitat Deufemia V, Risi M, Tortora G (2014) Sketched symbol recognition using latent-dynamic conditional random fields and distance-based clustering. Pattern Recogn 47(3):1159–1171CrossRef Deufemia V, Risi M, Tortora G (2014) Sketched symbol recognition using latent-dynamic conditional random fields and distance-based clustering. Pattern Recogn 47(3):1159–1171CrossRef
15.
Zurück zum Zitat Diday E (1974) Recent progress in distance and similarity measures in pattern recognition. In: Proceedings of the 2th international joint conference on pattern recognition, pp 534–539 Diday E (1974) Recent progress in distance and similarity measures in pattern recognition. In: Proceedings of the 2th international joint conference on pattern recognition, pp 534–539
16.
Zurück zum Zitat Frank A, Asuncion A (2010) UCI machine learning repository. University of California, Irvine Frank A, Asuncion A (2010) UCI machine learning repository. University of California, Irvine
17.
Zurück zum Zitat Frank E, Hall M, Pfahringer B (2003) Locally weighted naive bayes. In: Proceedings of the 19th conference on uncertainty in artificial intelligence (UAI’03). Morgan Kaufmann, San Francisco, pp 249–256 Frank E, Hall M, Pfahringer B (2003) Locally weighted naive bayes. In: Proceedings of the 19th conference on uncertainty in artificial intelligence (UAI’03). Morgan Kaufmann, San Francisco, pp 249–256
18.
Zurück zum Zitat Garcia S, Herrera F (2008) An extension on statistical comparisons of classifiers over multiple data sets for all pairwise comparisons. J Mach Learn Res 9:2677–2694MATH Garcia S, Herrera F (2008) An extension on statistical comparisons of classifiers over multiple data sets for all pairwise comparisons. J Mach Learn Res 9:2677–2694MATH
19.
Zurück zum Zitat Grossman D, Domingos P (2004) Learning bayesian network classifiers by maximizing conditional likelihood. In: Proceedings of the 21st international conference on machine learning. ACM, pp 361–368 Grossman D, Domingos P (2004) Learning bayesian network classifiers by maximizing conditional likelihood. In: Proceedings of the 21st international conference on machine learning. ACM, pp 361–368
20.
Zurück zum Zitat Guo Y, Greiner R (2005) Discriminative model selection for belief net structures. In: Proceedings of the 12th National Conference on Artificial Intelligence, AAAI, pp 770–776 Guo Y, Greiner R (2005) Discriminative model selection for belief net structures. In: Proceedings of the 12th National Conference on Artificial Intelligence, AAAI, pp 770–776
21.
Zurück zum Zitat Hall M (2007) A decision tree-based attribute weighting filter for naive bayes. Knowl-Based Syst 20:120–126CrossRef Hall M (2007) A decision tree-based attribute weighting filter for naive bayes. Knowl-Based Syst 20:120–126CrossRef
22.
Zurück zum Zitat Hall MA (2000) Correlation-based feature selection for discrete and numeric class machine learning. In: Proceedings of the 17th international conference on machine learning. Morgan Kaufmann, Stanford, pp 359–366 Hall MA (2000) Correlation-based feature selection for discrete and numeric class machine learning. In: Proceedings of the 17th international conference on machine learning. Morgan Kaufmann, Stanford, pp 359–366
23.
Zurück zum Zitat Hinneburg A, Aggarwal C, Keim D (2000) What is the nearest neighbor in high dimensional spaces? In: Proceedings of the 26th international conference on very large data bases. Cairo, pp 506–515 Hinneburg A, Aggarwal C, Keim D (2000) What is the nearest neighbor in high dimensional spaces? In: Proceedings of the 26th international conference on very large data bases. Cairo, pp 506–515
24.
Zurück zum Zitat Jiang L, Cai Z, Zhang H, Wang D (2013) Naive bayes text classifiers: a locally weighted learning approach. J Exp Theor Artif Intell 25(2):273–286CrossRef Jiang L, Cai Z, Zhang H, Wang D (2013) Naive bayes text classifiers: a locally weighted learning approach. J Exp Theor Artif Intell 25(2):273–286CrossRef
25.
Zurück zum Zitat Jiang L, Li C (2013) An augmented value difference measure. Pattern Recogn Lett 34(10):1169–1174CrossRef Jiang L, Li C (2013) An augmented value difference measure. Pattern Recogn Lett 34(10):1169–1174CrossRef
26.
Zurück zum Zitat Jiang L, Li C, Wang S, Zhang L (2016) Deep feature weighting for naive bayes and its application to text classification. Eng Appl Artif Intell 52:26–39CrossRef Jiang L, Li C, Wang S, Zhang L (2016) Deep feature weighting for naive bayes and its application to text classification. Eng Appl Artif Intell 52:26–39CrossRef
27.
Zurück zum Zitat Jiang L, Li C, Zhang H, Cai Z (2014) A novel distance function: Frequency difference metric. Int J Pattern Recognit Artif Intell 28(2):1451002CrossRef Jiang L, Li C, Zhang H, Cai Z (2014) A novel distance function: Frequency difference metric. Int J Pattern Recognit Artif Intell 28(2):1451002CrossRef
28.
Zurück zum Zitat Jiang L, Wang D, Cai Z (2012) Discriminatively weighted naive bayes and its application in text classification. Int J Artif Intell Tools 21:1250007CrossRef Jiang L, Wang D, Cai Z (2012) Discriminatively weighted naive bayes and its application in text classification. Int J Artif Intell Tools 21:1250007CrossRef
29.
Zurück zum Zitat Jiang L, Zhang H (2006) Learning naive bayes for probability estimation by feature selection. In: Proceedings of the 19th Canadian conference on artificial intelligence. Springer, pp 503–514 Jiang L, Zhang H (2006) Learning naive bayes for probability estimation by feature selection. In: Proceedings of the 19th Canadian conference on artificial intelligence. Springer, pp 503–514
30.
Zurück zum Zitat Kasif S, Salzberg S, Waltz D, Rachlin J, Aha D (1998) A probabilistic framework for memory-based reasoning. Artif Intell 104:287–311MathSciNetCrossRefMATH Kasif S, Salzberg S, Waltz D, Rachlin J, Aha D (1998) A probabilistic framework for memory-based reasoning. Artif Intell 104:287–311MathSciNetCrossRefMATH
31.
Zurück zum Zitat Li C, Jiang L, Li H (2014) Local value difference metric. Pattern Recogn Lett 49:62–68CrossRef Li C, Jiang L, Li H (2014) Local value difference metric. Pattern Recogn Lett 49:62–68CrossRef
32.
33.
Zurück zum Zitat Li C, Jiang L, Li H, Wang S (2013) Attribute weighted value difference metric. In: Proceedings of the 25th IEEE international conference on tools with artificial intelligence. IEEE, pp 575–580 Li C, Jiang L, Li H, Wang S (2013) Attribute weighted value difference metric. In: Proceedings of the 25th IEEE international conference on tools with artificial intelligence. IEEE, pp 575–580
34.
Zurück zum Zitat Li C, Li H (2011) One dependence value difference metric. Knowl-Based Syst 24(5):589–594CrossRef Li C, Li H (2011) One dependence value difference metric. Knowl-Based Syst 24(5):589–594CrossRef
35.
Zurück zum Zitat Li C, Li H (2012) A modified short and fukunaga metric based on the attribute independence assumption. Pattern Recogn Lett 33(9):1213–1218CrossRef Li C, Li H (2012) A modified short and fukunaga metric based on the attribute independence assumption. Pattern Recogn Lett 33(9):1213–1218CrossRef
36.
Zurück zum Zitat Li C, Li H (2013) Selective value difference metric. J Comput 8(9):2232–2238 Li C, Li H (2013) Selective value difference metric. J Comput 8(9):2232–2238
37.
Zurück zum Zitat Liu B, Wang M, Hong R, Zha Z, Hua X (2010) Joint learning of labels and distance metric. IEEE Trans Syst Man Cybern Part B: Cybern 40(3):973–978CrossRef Liu B, Wang M, Hong R, Zha Z, Hua X (2010) Joint learning of labels and distance metric. IEEE Trans Syst Man Cybern Part B: Cybern 40(3):973–978CrossRef
38.
Zurück zum Zitat Ma L, Yang X, Tao D (2014) Person re-identification over camera networks using multi-task distance metric learning. IEEE Trans Image Process 23(8):3656–3670MathSciNetCrossRef Ma L, Yang X, Tao D (2014) Person re-identification over camera networks using multi-task distance metric learning. IEEE Trans Image Process 23(8):3656–3670MathSciNetCrossRef
39.
Zurück zum Zitat Mitchell TM (1997) Machine learning, 1st edn. McGraw-Hill, New YorkMATH Mitchell TM (1997) Machine learning, 1st edn. McGraw-Hill, New YorkMATH
40.
Zurück zum Zitat Myles JP, Hand DJ (1990) The multi-class metric problem in nearest neighbour discrimination rules. Pattern Recogn 23(11):1291–1297CrossRef Myles JP, Hand DJ (1990) The multi-class metric problem in nearest neighbour discrimination rules. Pattern Recogn 23(11):1291–1297CrossRef
41.
Zurück zum Zitat Nadeau C, Bengio Y (2003) Inference for the generalization error. Mach Learn 52(3):239–281CrossRefMATH Nadeau C, Bengio Y (2003) Inference for the generalization error. Mach Learn 52(3):239–281CrossRefMATH
42.
Zurück zum Zitat Noh YK, Zhang BT, Lee DD (2010) Generative local metric learning for nearest neighbor classification. In: Proceedings of the 24th annual conference on neural information processing systems. Curran Associates, Inc., pp 1822–1830 Noh YK, Zhang BT, Lee DD (2010) Generative local metric learning for nearest neighbor classification. In: Proceedings of the 24th annual conference on neural information processing systems. Curran Associates, Inc., pp 1822–1830
43.
Zurück zum Zitat Qiu C, Jiang L, Li C (2015) Not always simple classification: learning superparent for class probability estimation. Expert Syst Appl 42(13):5433–5440CrossRef Qiu C, Jiang L, Li C (2015) Not always simple classification: learning superparent for class probability estimation. Expert Syst Appl 42(13):5433–5440CrossRef
44.
Zurück zum Zitat Sangineto E (2013) Pose and expression independent facial landmark localization using dense-surf and the hausdorff distance. IEEE Trans Pattern Anal Mach Intell 35(3):624–638CrossRef Sangineto E (2013) Pose and expression independent facial landmark localization using dense-surf and the hausdorff distance. IEEE Trans Pattern Anal Mach Intell 35(3):624–638CrossRef
45.
Zurück zum Zitat Short RD, Fukunaga K (1981) The optimal distance measure for nearest neighbour classification. IEEE Trans Inf Theory 27:622–627CrossRefMATH Short RD, Fukunaga K (1981) The optimal distance measure for nearest neighbour classification. IEEE Trans Inf Theory 27:622–627CrossRefMATH
46.
Zurück zum Zitat Stanfill C, Waltz D (1986) Toward memory-based reasoning. Commun ACM 29:1213–1228CrossRef Stanfill C, Waltz D (1986) Toward memory-based reasoning. Commun ACM 29:1213–1228CrossRef
47.
Zurück zum Zitat Wilson DR, Martinez TR (1997) Improved heterogeneous distance functions. J Artif Intell Res 6:1–34MathSciNetMATH Wilson DR, Martinez TR (1997) Improved heterogeneous distance functions. J Artif Intell Res 6:1–34MathSciNetMATH
48.
Zurück zum Zitat Witten IH, Frank E, Hall MA (2011) Data mining: practical machine learning tools and techniques, 3rd edn. Morgan Kaufmann, San Francisco Witten IH, Frank E, Hall MA (2011) Data mining: practical machine learning tools and techniques, 3rd edn. Morgan Kaufmann, San Francisco
49.
Zurück zum Zitat Yang, L. (2006), Distance metric learning: a comprehensive survey, Technical report, Department of Computer Science and Engineering, Michigan State University Yang, L. (2006), Distance metric learning: a comprehensive survey, Technical report, Department of Computer Science and Engineering, Michigan State University
50.
Zurück zum Zitat Yu J, Rui Y, Tang YY, Tao D (2014) High-order distance-based multiview stochastic learning in image classification. IEEE Trans Cybern 44(12):2431–2442CrossRef Yu J, Rui Y, Tang YY, Tao D (2014) High-order distance-based multiview stochastic learning in image classification. IEEE Trans Cybern 44(12):2431–2442CrossRef
51.
Zurück zum Zitat Yu J, Tao D, Li J, Cheng J (2014) Semantic preserving distance metric learning and applications. Inf Sci 281:674–686MathSciNetCrossRef Yu J, Tao D, Li J, Cheng J (2014) Semantic preserving distance metric learning and applications. Inf Sci 281:674–686MathSciNetCrossRef
52.
Zurück zum Zitat Yu J, Wang M, Tao D (2012) Semi-supervised multiview distance metric learning for cartoon synthesis. IEEE Trans Image Process 21(11):4636–464MathSciNetCrossRef Yu J, Wang M, Tao D (2012) Semi-supervised multiview distance metric learning for cartoon synthesis. IEEE Trans Image Process 21(11):4636–464MathSciNetCrossRef
53.
Zurück zum Zitat Zaidi NA, Cerquides J, Carman MJ, Webb GI (2013) Alleviating naive bayes attribute independence assumption by attribute weighting. J Mach Learn Res 14:1947–1988MathSciNetMATH Zaidi NA, Cerquides J, Carman MJ, Webb GI (2013) Alleviating naive bayes attribute independence assumption by attribute weighting. J Mach Learn Res 14:1947–1988MathSciNetMATH
54.
Zurück zum Zitat Zhang H, Sheng S (2004) Learning weighted naive bayes with accurate ranking. In: Proceedings of the 4th IEEE international conference on data mining. IEEE, pp 567–570 Zhang H, Sheng S (2004) Learning weighted naive bayes with accurate ranking. In: Proceedings of the 4th IEEE international conference on data mining. IEEE, pp 567–570
Metadaten
Titel
Toward value difference metric with attribute weighting
verfasst von
Chaoqun Li
Liangxiao Jiang
Hongwei Li
Jia Wu
Peng Zhang
Publikationsdatum
25.05.2016
Verlag
Springer London
Erschienen in
Knowledge and Information Systems / Ausgabe 3/2017
Print ISSN: 0219-1377
Elektronische ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-016-0960-x

Weitere Artikel der Ausgabe 3/2017

Knowledge and Information Systems 3/2017 Zur Ausgabe

Premium Partner