Published in: International Journal of Machine Learning and Cybernetics 1/2019

19-08-2017 | Original Article

Learning from label proportions with pinball loss

Authors: Yong Shi, Limeng Cui, Zhensong Chen, Zhiquan Qi



Abstract

Learning from label proportions is a relatively new learning problem that has drawn much attention in recent years. Unlike standard supervised learning, it groups instances into bags and uses only the label proportion of each bag rather than the label of each instance. Because obtaining instance labels is not always feasible, it has been widely applied in areas such as modeling voting behavior and spam filtering. However, learning from label proportions still faces considerable challenges, including the interference of noise and the improper partition of bags. In this paper, we propose a novel method for learning from label proportions based on the pinball loss, called “pSVM-pin”, to address these issues. The pinball loss is introduced to produce an effective classifier that reduces the impact of noise. Experimental results demonstrate the accuracy of pSVM-pin relative to competing methods.
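As a rough illustration of the two ingredients named in the abstract, the sketch below computes the label proportion of a bag and evaluates a pinball loss on a margin term. The abstract gives no formulas, so the loss uses the form commonly seen for SVM classifiers with pinball loss, namely L_tau(u) = u for u >= 0 and -tau * u otherwise, with u = 1 - y * f(x). The function names and toy values are assumptions made for illustration; this is not the authors' pSVM-pin implementation.

```python
from typing import Sequence


def bag_proportion(labels: Sequence[int]) -> float:
    """Fraction of positive (+1) labels in a bag of +1/-1 labels."""
    return sum(1 for y in labels if y == 1) / len(labels)


def hinge_loss(u: float) -> float:
    """Hinge loss on u = 1 - y * f(x): zero once the margin is satisfied."""
    return max(0.0, u)


def pinball_loss(u: float, tau: float) -> float:
    """Pinball loss on u = 1 - y * f(x) (assumed form, see lead-in).

    Unlike the hinge loss, points beyond the margin (u < 0) are still
    penalised, with slope tau, so the loss depends on how the scores are
    distributed and not only on the points nearest the boundary.
    """
    return u if u >= 0 else -tau * u


if __name__ == "__main__":
    # Hypothetical bag: the learner would see only the proportion,
    # not the individual labels.
    bag = [1, -1, -1, 1]
    print("bag proportion:", bag_proportion(bag))

    # Compare the two losses on a few margin terms.
    for u in (1.5, 0.0, -2.0):
        print(u, hinge_loss(u), pinball_loss(u, tau=0.5))
```

With tau = 0 the pinball loss in this form coincides with the hinge loss, which is one way to see it as a generalisation of the standard SVM loss.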

Metadata
Title
Learning from label proportions with pinball loss
Authors
Yong Shi
Limeng Cui
Zhensong Chen
Zhiquan Qi
Publication date
19-08-2017
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 1/2019
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-017-0708-2
