Knowledge and Information Systems

25 May 2016 | Regular Paper

Toward value difference metric with attribute weighting

Authors: Chaoqun Li, Liangxiao Jiang, Hongwei Li, Jia Wu, Peng Zhang

Published in: Knowledge and Information Systems | Issue 3/2017

Abstract

In distance metric learning, recent work has shown that the value difference metric (VDM), which relies on a strong attribute independence assumption, outperforms other existing distance metrics. An open question, however, is whether VDM can perform even better under a less restrictive assumption. Many approaches have been proposed to improve VDM by weakening this assumption. In this paper, we present a comprehensive survey of the existing improved approaches and then propose a new approach that improves VDM by attribute weighting. We call the resulting distance function the attribute-weighted value difference metric (AWVDM). We further propose a modified attribute-weighted value difference metric (MAWVDM), which incorporates the learned attribute weights into the conditional probability estimates of AWVDM. AWVDM and MAWVDM significantly outperform VDM while retaining its computational simplicity. Experimental results on a large number of UCI data sets validate the performance of AWVDM and MAWVDM.
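
The abstract does not state the formulas, so the following is only a minimal sketch of the general idea. It assumes the commonly used simplified form of VDM for nominal attributes, in which the distance between two instances is the sum, over attributes and classes, of the absolute difference between the class-conditional probabilities of their attribute values, and it multiplies each attribute's term by a weight. The function names `fit_vdm` and `weighted_vdm` and the caller-supplied `weights` list are hypothetical. How the paper learns the weights, and how MAWVDM folds them into the probability estimates, is not reproduced here.

```python
from collections import Counter, defaultdict


def fit_vdm(X, y):
    """Estimate P(class | attribute a has value v) by simple counting.

    X is a list of tuples of nominal values, y the list of class labels.
    No smoothing is applied (an assumption made to keep the sketch short).
    """
    classes = sorted(set(y))
    n_attrs = len(X[0])
    # counts[a][v][c] = number of training rows with value v on attribute a and class c
    counts = [defaultdict(Counter) for _ in range(n_attrs)]
    for row, label in zip(X, y):
        for a, v in enumerate(row):
            counts[a][v][label] += 1
    probs = []
    for a in range(n_attrs):
        table = {}
        for v, ctr in counts[a].items():
            total = sum(ctr.values())
            table[v] = {c: ctr[c] / total for c in classes}
        probs.append(table)
    return classes, probs


def weighted_vdm(x, z, classes, probs, weights=None):
    """Sum over attributes of weight * sum over classes |P(c|x_a) - P(c|z_a)|.

    With weights=None every attribute gets weight 1.0, i.e. plain VDM
    in the simplified form assumed above. Attribute values that never
    occurred in training raise KeyError in this sketch.
    """
    n_attrs = len(probs)
    if weights is None:
        weights = [1.0] * n_attrs
    dist = 0.0
    for a in range(n_attrs):
        px = probs[a][x[a]]
        pz = probs[a][z[a]]
        dist += weights[a] * sum(abs(px[c] - pz[c]) for c in classes)
    return dist


if __name__ == "__main__":
    X = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
    y = [0, 0, 1, 1]
    classes, probs = fit_vdm(X, y)
    print(weighted_vdm(("a", "x"), ("b", "y"), classes, probs))
    print(weighted_vdm(("a", "x"), ("b", "y"), classes, probs, weights=[0.5, 1.0]))
```

In this toy data set the first attribute determines the class and the second carries no class information, so only the first attribute contributes to the distance, and lowering its weight shrinks the distance proportionally.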

Title: Toward value difference metric with attribute weighting
Authors: Chaoqun Li, Liangxiao Jiang, Hongwei Li, Jia Wu, Peng Zhang
Publication date: 25 May 2016
Publisher: Springer London
Published in: Knowledge and Information Systems, Issue 3/2017
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI: https://doi.org/10.1007/s10115-016-0960-x
