Published in: International Journal of Machine Learning and Cybernetics 2/2014

01-04-2014 | Original Article

Bayesian Citation-KNN with distance weighting

Authors: Liangxiao Jiang, Zhihua Cai, Dianhong Wang, Harry Zhang


Abstract

Multi-instance (MI) learning, in which each learning example is represented by a bag of instances instead of a single instance, is receiving growing attention in machine learning research. K-nearest-neighbor (KNN) is a simple and effective classification model in traditional supervised learning. Two of its variants, Bayesian-KNN (BKNN) and Citation-KNN (CKNN), have been proposed and are widely used for solving multi-instance classification problems. However, CKNN still classifies unseen bags by a simple majority vote among the references and citers. In this paper, we propose an improved algorithm called Bayesian Citation-KNN (BCKNN). For each unseen bag, BCKNN first finds its \( k \) references and \( q \) citers, and then applies a Bayesian approach to the \( k \) references and a distance-weighted majority vote to the \( q \) citers. Experimental results on several benchmark datasets show that BCKNN is generally better than the previous BKNN and CKNN. Moreover, BCKNN keeps almost the same order of computational overhead as CKNN.
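
The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions the abstract does not specify: the bag distance is taken to be the minimal Hausdorff distance, the Bayesian step is replaced by a Laplace-smoothed class posterior over the \( k \) references, citers are weighted by inverse squared distance, and the two scores are simply added. The names `min_hausdorff` and `bcknn_sketch` are hypothetical.

```python
import math
from collections import defaultdict


def min_hausdorff(bag_a, bag_b):
    """Minimal Hausdorff distance: the smallest Euclidean distance
    between any instance of bag_a and any instance of bag_b (assumed metric)."""
    return min(math.dist(a, b) for a in bag_a for b in bag_b)


def bcknn_sketch(train_bags, train_labels, query_bag, k=2, q=2, eps=1e-9):
    """Classify query_bag from its k references and q-citers (illustrative only)."""
    n = len(train_bags)
    d_query = [min_hausdorff(query_bag, bag) for bag in train_bags]

    # References: the k training bags nearest to the query bag.
    refs = sorted(range(n), key=lambda i: d_query[i])[:k]

    # Citers: training bags that count the query bag among their q nearest
    # neighbours, i.e. fewer than q other training bags are strictly closer.
    citers = []
    for i in range(n):
        closer = sum(
            1
            for j in range(n)
            if j != i and min_hausdorff(train_bags[i], train_bags[j]) < d_query[i]
        )
        if closer < q:
            citers.append(i)

    classes = sorted(set(train_labels))

    # Assumed stand-in for the Bayesian step: Laplace-smoothed posterior
    # estimated from the labels of the references.
    ref_counts = defaultdict(int)
    for i in refs:
        ref_counts[train_labels[i]] += 1
    ref_score = {
        c: (ref_counts[c] + 1) / (len(refs) + len(classes)) for c in classes
    }

    # Distance-weighted vote among citers (assumed weight: 1 / d^2).
    citer_weight = defaultdict(float)
    for i in citers:
        citer_weight[train_labels[i]] += 1.0 / (d_query[i] ** 2 + eps)
    total = sum(citer_weight.values())
    citer_score = {
        c: (citer_weight[c] / total if total else 0.0) for c in classes
    }

    # Assumed combination rule: add the two normalised scores.
    combined = {c: ref_score[c] + citer_score[c] for c in classes}
    return max(classes, key=lambda c: combined[c])


# Toy data: two positive bags near the origin, two negative bags near (10, 10).
train_bags = [[(0, 0), (1, 1)], [(0.5, 0.5)], [(10, 10), (11, 11)], [(10.5, 10.5)]]
train_labels = [1, 1, 0, 0]
pred_near_origin = bcknn_sketch(train_bags, train_labels, [(0.2, 0.1), (5, 5)])
pred_near_far = bcknn_sketch(train_bags, train_labels, [(10.2, 10.1)])
```

Because the citer search compares every training bag with every other, the sketch costs a quadratic number of bag-distance evaluations per query; caching the pairwise training distances would be the natural first optimisation.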

Metadata
Title: Bayesian Citation-KNN with distance weighting
Authors: Liangxiao Jiang, Zhihua Cai, Dianhong Wang, Harry Zhang
Publication date: 01-04-2014
Publisher: Springer Berlin Heidelberg
Published in: International Journal of Machine Learning and Cybernetics, Issue 2/2014
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI: https://doi.org/10.1007/s13042-013-0152-x
