Skip to main content
Erschienen in: Knowledge and Information Systems 1/2019

06.07.2018 | Regular Paper

Emerging topics and challenges of learning from noisy data in nonstandard classification: a survey beyond binary class noise

verfasst von: Ronaldo C. Prati, Julián Luengo, Francisco Herrera

Erschienen in: Knowledge and Information Systems | Ausgabe 1/2019

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The problem of class noisy instances is omnipresent in different classification problems. However, most of research focuses on noise handling in binary classification problems and adaptations to multiclass learning. This paper aims to contextualize noise labels in the context of non-binary classification problems, including multiclass, multilabel, multitask, multi-instance ordinal and data stream classification. Practical considerations for analyzing noise under these classification problems, as well as trends, open-ended problems and future research directions are analyzed. We believe this paper could help expand research on class noise handling and help practitioners to better identify the particular aspects of noise in challenging classification scenarios.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Fußnoten
1
We opt to use artificially generated data to have control about data characteristics.
 
2
In this particular simple problem, some algorithms are more resilient to class noise for low noise rates, although we can observe this trend for higher noise rates.
 
3
A possible explanation for this behavior of J48 is related to the uncertainty in leaf boundaries in the tree
 
4
Multilabel classification should not be confused with multiclass classification (Sect. 3.1), which is the problem of categorizing instances into one of more than two classes.
 
7
This corresponds to noise at random. For the noise completely at random case, we have \(p_1 = p_2 = \ldots = p_n = p_e\).
 
Literatur
1.
Zurück zum Zitat Abellán J, Masegosa AR (2010) Bagging decision trees on data sets with classification noise. In: International symposium on foundations of information and knowledge systems. Springer, pp 248–265 Abellán J, Masegosa AR (2010) Bagging decision trees on data sets with classification noise. In: International symposium on foundations of information and knowledge systems. Springer, pp 248–265
2.
Zurück zum Zitat Amores J (2013) Multiple instance classification: review, taxonomy and comparative study. Artif Intell 201:81–105MathSciNetMATH Amores J (2013) Multiple instance classification: review, taxonomy and comparative study. Artif Intell 201:81–105MathSciNetMATH
3.
Zurück zum Zitat Angluin D, Laird P (1988) Learning from noisy examples. Mach Learn 2(4):343–370 Angluin D, Laird P (1988) Learning from noisy examples. Mach Learn 2(4):343–370
4.
Zurück zum Zitat Baranauskas JA (2015) The number of classes as a source for instability of decision tree algorithms in high dimensional datasets. Artif Intell Rev 43(2):301–310 Baranauskas JA (2015) The number of classes as a source for instability of decision tree algorithms in high dimensional datasets. Artif Intell Rev 43(2):301–310
5.
Zurück zum Zitat Bartlett PL, Jordan MI, McAuliffe JD (2006) Convexity, classification, and risk bounds. J Am Stat Assoc 101(473):138–156MathSciNetMATH Bartlett PL, Jordan MI, McAuliffe JD (2006) Convexity, classification, and risk bounds. J Am Stat Assoc 101(473):138–156MathSciNetMATH
6.
Zurück zum Zitat Beigman E, Klebanov BB (2009) Learning with annotation noise. In: Proceedings of the joint conference of the 47th annual meeting of the ACL and the 4th international joint conference on natural language processing of the AFNLP: volume 1–volume 1, ACL ’09, pp 280–287 Beigman E, Klebanov BB (2009) Learning with annotation noise. In: Proceedings of the joint conference of the 47th annual meeting of the ACL and the 4th international joint conference on natural language processing of the AFNLP: volume 1–volume 1, ACL ’09, pp 280–287
7.
Zurück zum Zitat Ben-David A, Sterling L, Tran T (2009) Adding monotonicity to learning algorithms may impair their accuracy. Expert Syst Appl 36(3):6627–6634 Ben-David A, Sterling L, Tran T (2009) Adding monotonicity to learning algorithms may impair their accuracy. Expert Syst Appl 36(3):6627–6634
8.
Zurück zum Zitat Bi Y, Jeske DR (2010) The efficiency of logistic regression compared to normal discriminant analysis under class-conditional classification noise. J Multivar Anal 101(7):1622–1637MathSciNetMATH Bi Y, Jeske DR (2010) The efficiency of logistic regression compared to normal discriminant analysis under class-conditional classification noise. J Multivar Anal 101(7):1622–1637MathSciNetMATH
9.
Zurück zum Zitat Bouchachia A (2011) Fuzzy classification in dynamic environments. Soft Comput 15(5):1009–1022 Bouchachia A (2011) Fuzzy classification in dynamic environments. Soft Comput 15(5):1009–1022
10.
Zurück zum Zitat Brefeld U, Scheffer T (2004) Co-Em support vector learning. In: International conference on machine learning (ICML), p 16 Brefeld U, Scheffer T (2004) Co-Em support vector learning. In: International conference on machine learning (ICML), p 16
11.
Zurück zum Zitat Breve FA, Zhao L, Quiles MG (2015) Particle competition and cooperation for semi-supervised learning with label noise. Neurocomputing 160:63–72 Breve FA, Zhao L, Quiles MG (2015) Particle competition and cooperation for semi-supervised learning with label noise. Neurocomputing 160:63–72
12.
Zurück zum Zitat Brodley CE, Friedl MA (1999) Identifying mislabeled training data. J Artif Intell Res 11:131–167MATH Brodley CE, Friedl MA (1999) Identifying mislabeled training data. J Artif Intell Res 11:131–167MATH
14.
Zurück zum Zitat Chandola V, Banerjee A, Kumar V (2009) Anomaly detection: a survey. ACM Comput Surv 41(3):15 Chandola V, Banerjee A, Kumar V (2009) Anomaly detection: a survey. ACM Comput Surv 41(3):15
15.
Zurück zum Zitat Chapelle O, Shivaswamy P, Vadrevu S, Weinberger K, Zhang Y, Tseng B (2010) Multi-task learning for boosting with application to web search ranking. In: ACM SIGKDD international conference on knowledge discovery and data mining (KDD). ACM, pp 1189–1198 Chapelle O, Shivaswamy P, Vadrevu S, Weinberger K, Zhang Y, Tseng B (2010) Multi-task learning for boosting with application to web search ranking. In: ACM SIGKDD international conference on knowledge discovery and data mining (KDD). ACM, pp 1189–1198
16.
Zurück zum Zitat Charte F, Rivera AJ, del Jesús MJ, Herrera F (2015) Addressing imbalance in multilabel classification: measures and random resampling algorithms. Neurocomputing 163:3–16 Charte F, Rivera AJ, del Jesús MJ, Herrera F (2015) Addressing imbalance in multilabel classification: measures and random resampling algorithms. Neurocomputing 163:3–16
17.
Zurück zum Zitat Chen K, Kämäräinen J-K (2016) Learning with ambiguous label distribution for apparent age estimation. In: Asian conference on computer vision. Springer, pp 330–343 Chen K, Kämäräinen J-K (2016) Learning with ambiguous label distribution for apparent age estimation. In: Asian conference on computer vision. Springer, pp 330–343
18.
Zurück zum Zitat Chen P-Y, Chen C-C, Yang C-H, Chang S-M, Lee K-J (2017) milr: Multiple-instance logistic regression with lasso penalty. R J 9(1):446–457 Chen P-Y, Chen C-C, Yang C-H, Chang S-M, Lee K-J (2017) milr: Multiple-instance logistic regression with lasso penalty. R J 9(1):446–457
19.
Zurück zum Zitat Cheng W, Hüllermeier E, Dembczynski KJ (2010) Bayes optimal multilabel classification via probabilistic classifier chains. In: International conference on machine learning (ICML), pp 279–286 Cheng W, Hüllermeier E, Dembczynski KJ (2010) Bayes optimal multilabel classification via probabilistic classifier chains. In: International conference on machine learning (ICML), pp 279–286
20.
Zurück zum Zitat Cheplygina V, Tax DM, Loog M (2015) Multiple instance learning with bag dissimilarities. Pattern Recognit 48(1):264–275 Cheplygina V, Tax DM, Loog M (2015) Multiple instance learning with bag dissimilarities. Pattern Recognit 48(1):264–275
21.
Zurück zum Zitat Chevaleyre Y, Zucker J-D (2000) Noise-tolerant rule induction from multi-instance data. In: ICML 2000, workshop on attribute-value and relational learning Chevaleyre Y, Zucker J-D (2000) Noise-tolerant rule induction from multi-instance data. In: ICML 2000, workshop on attribute-value and relational learning
22.
Zurück zum Zitat Daniels HA, Velikova MV (2006) Derivation of monotone decision models from noisy data. IEEE Trans Syst Man Cybern C 36(5):705–710 Daniels HA, Velikova MV (2006) Derivation of monotone decision models from noisy data. IEEE Trans Syst Man Cybern C 36(5):705–710
23.
Zurück zum Zitat de Faria ER, de Leon Ferreira ACP, Gama J et al (2016) Minas: multiclass learning algorithm for novelty detection in data streams. Data Min Knowl Discov 30(3):640–680MathSciNet de Faria ER, de Leon Ferreira ACP, Gama J et al (2016) Minas: multiclass learning algorithm for novelty detection in data streams. Data Min Knowl Discov 30(3):640–680MathSciNet
24.
Zurück zum Zitat Dembczyński K, Waegeman W, Cheng W, Hüllermeier E (2012) On label dependence and loss minimization in multi-label classification. Mach Learn 88(1–2):5–45MathSciNetMATH Dembczyński K, Waegeman W, Cheng W, Hüllermeier E (2012) On label dependence and loss minimization in multi-label classification. Mach Learn 88(1–2):5–45MathSciNetMATH
25.
Zurück zum Zitat Dietterich TG, Bakiri G (1995) Solving multiclass learning problems via error-correcting output codes. J Artif Intell Res 2:263–286MATH Dietterich TG, Bakiri G (1995) Solving multiclass learning problems via error-correcting output codes. J Artif Intell Res 2:263–286MATH
26.
Zurück zum Zitat Ditzler G, Roveri M, Alippi C, Polikar R (2015) Learning in nonstationary environments: a survey. IEEE Comput Intell Mag 10(4):12–25 Ditzler G, Roveri M, Alippi C, Polikar R (2015) Learning in nonstationary environments: a survey. IEEE Comput Intell Mag 10(4):12–25
27.
Zurück zum Zitat Du J, Cai Z (2015) Modelling class noise with symmetric and asymmetric distributions. In: AAAI conference on artificial intelligence (AAAI), pp 2589–2595 Du J, Cai Z (2015) Modelling class noise with symmetric and asymmetric distributions. In: AAAI conference on artificial intelligence (AAAI), pp 2589–2595
28.
Zurück zum Zitat Evgeniou T, Micchelli CA, Pontil M (2005) Learning multiple tasks with kernel methods. J Mach Learn Res 6:615–637MathSciNetMATH Evgeniou T, Micchelli CA, Pontil M (2005) Learning multiple tasks with kernel methods. J Mach Learn Res 6:615–637MathSciNetMATH
29.
Zurück zum Zitat Feelders A (2010) Monotone relabeling in ordinal classification. In: IEEE international conference on data mining (ICDM). IEEE, pp 803–808 Feelders A (2010) Monotone relabeling in ordinal classification. In: IEEE international conference on data mining (ICDM). IEEE, pp 803–808
30.
Zurück zum Zitat Frénay B, Verleysen M (2014) Classification in the presence of label noise: a survey. IEEE Trans Neural Netw Learn Syst 25(5):845–869MATH Frénay B, Verleysen M (2014) Classification in the presence of label noise: a survey. IEEE Trans Neural Netw Learn Syst 25(5):845–869MATH
31.
Zurück zum Zitat Friedman JH (1989) Regularized discriminant analysis. J Am Stat Assoc 84(405):165–175MathSciNet Friedman JH (1989) Regularized discriminant analysis. J Am Stat Assoc 84(405):165–175MathSciNet
32.
Zurück zum Zitat Gaba A, Winkler RL (1992) Implications of errors in survey data: a Bayesian model. Manag Sci 38(7):913–925MATH Gaba A, Winkler RL (1992) Implications of errors in survey data: a Bayesian model. Manag Sci 38(7):913–925MATH
33.
Zurück zum Zitat Gaber MM, Gama J, Krishnaswamy S, Gomes JB, Stahl F (2014) Data stream mining in ubiquitous environments: state-of-the-art and current directions. Wiley Interdiscip Rev Data Min Knowl Discov 4(2):116–138 Gaber MM, Gama J, Krishnaswamy S, Gomes JB, Stahl F (2014) Data stream mining in ubiquitous environments: state-of-the-art and current directions. Wiley Interdiscip Rev Data Min Knowl Discov 4(2):116–138
34.
Zurück zum Zitat Gaber MM, Zaslavsky A, Krishnaswamy S (2005) Mining data streams: a review. ACM Sigmod Record 34(2):18–26MATH Gaber MM, Zaslavsky A, Krishnaswamy S (2005) Mining data streams: a review. ACM Sigmod Record 34(2):18–26MATH
35.
Zurück zum Zitat Galimberti G, Soffritti G, Maso MD et al (2012) Classification trees for ordinal responses in r: the rpartscore package. J Stat Softw 47(10):1 Galimberti G, Soffritti G, Maso MD et al (2012) Classification trees for ordinal responses in r: the rpartscore package. J Stat Softw 47(10):1
36.
Zurück zum Zitat Gama J, Zliobaite I, Bifet A, Pechenizkiy M, Bouchachia A (2014) A survey on concept drift adaptation. ACM Comput Surv 46(4):44:1–44:37MATH Gama J, Zliobaite I, Bifet A, Pechenizkiy M, Bouchachia A (2014) A survey on concept drift adaptation. ACM Comput Surv 46(4):44:1–44:37MATH
37.
Zurück zum Zitat Gamberger D, Boskovic R, Lavrac N, Groselj C (1999) Experiments with noise filtering in a medical domain. In: International conference on machine learning (ICML). Morgan Kaufmann Publishers, pp 143–151 Gamberger D, Boskovic R, Lavrac N, Groselj C (1999) Experiments with noise filtering in a medical domain. In: International conference on machine learning (ICML). Morgan Kaufmann Publishers, pp 143–151
38.
Zurück zum Zitat Gamberger D, Lavrač N, Džeroski S (1996) Noise elimination in inductive concept learning: a case study in medical diagnosis. In: International workshop on algorithmic learning theory (ALT). Springer, pp 199–212 Gamberger D, Lavrač N, Džeroski S (1996) Noise elimination in inductive concept learning: a case study in medical diagnosis. In: International workshop on algorithmic learning theory (ALT). Springer, pp 199–212
39.
Zurück zum Zitat Gao B-B, Xing C, Xie C-W, Wu J, Geng X (2017) Deep label distribution learning with label ambiguity. IEEE Trans Image Process 26(6):2825–2838MathSciNetMATH Gao B-B, Xing C, Xie C-W, Wu J, Geng X (2017) Deep label distribution learning with label ambiguity. IEEE Trans Image Process 26(6):2825–2838MathSciNetMATH
40.
Zurück zum Zitat Gao J, Fan W, Han J (2007) On appropriate assumptions to mine data streams: analysis and practice. In: IEEE international conference on data mining (ICDM). IEEE, pp 143–152 Gao J, Fan W, Han J (2007) On appropriate assumptions to mine data streams: analysis and practice. In: IEEE international conference on data mining (ICDM). IEEE, pp 143–152
41.
Zurück zum Zitat García S, Luengo J, Herrera F (2015) Data preprocessing in data mining. Springer, Berlin García S, Luengo J, Herrera F (2015) Data preprocessing in data mining. Springer, Berlin
42.
Zurück zum Zitat Garofalakis M, Gehrke J, Rastogi R (2016) Data stream management: processing high-speed data streams. Springer, Berlin Garofalakis M, Gehrke J, Rastogi R (2016) Data stream management: processing high-speed data streams. Springer, Berlin
43.
Zurück zum Zitat Geng X (2016) Label distribution learning. IEEE Trans Knowl Data Eng 28(7):1734–1748 Geng X (2016) Label distribution learning. IEEE Trans Knowl Data Eng 28(7):1734–1748
44.
Zurück zum Zitat Ghosh A, Manwani N, Sastry P (2015) Making risk minimization tolerant to label noise. Neurocomputing 160:93–107 Ghosh A, Manwani N, Sastry P (2015) Making risk minimization tolerant to label noise. Neurocomputing 160:93–107
45.
Zurück zum Zitat Gibaja E, Ventura S (2015) A tutorial on multilabel learning. ACM Comput Surv 47(3):52 Gibaja E, Ventura S (2015) A tutorial on multilabel learning. ACM Comput Surv 47(3):52
46.
Zurück zum Zitat Gomes JB, Gaber MM, Sousa PA, Menasalvas E (2014) Mining recurring concepts in a dynamic feature space. IEEE Trans Neural Netw Learn Syst 25(1):95–110 Gomes JB, Gaber MM, Sousa PA, Menasalvas E (2014) Mining recurring concepts in a dynamic feature space. IEEE Trans Neural Netw Learn Syst 25(1):95–110
47.
Zurück zum Zitat Gutiér rez PA, García S (2016) Current prospects on ordinal and monotonic classification. Prog AI 5(3):171–179 Gutiér rez PA, García S (2016) Current prospects on ordinal and monotonic classification. Prog AI 5(3):171–179
48.
Zurück zum Zitat Gutiérrez PA, Perez-Ortiz M, Sanchez-Monedero J, Fernández-Navarro F, Hervas-Martinez C (2016) Ordinal regression methods: survey and experimental study. IEEE Trans Knowl Data Eng 28(1):127–146 Gutiérrez PA, Perez-Ortiz M, Sanchez-Monedero J, Fernández-Navarro F, Hervas-Martinez C (2016) Ordinal regression methods: survey and experimental study. IEEE Trans Knowl Data Eng 28(1):127–146
49.
Zurück zum Zitat He Z, Li X, Zhang Z, Wu F, Geng X, Zhang Y, Yang M-H, Zhuang Y (2017) Data-dependent label distribution learning for age estimation. IEEE Trans Image Process 26(8):3846–3858MathSciNet He Z, Li X, Zhang Z, Wu F, Geng X, Zhang Y, Yang M-H, Zhuang Y (2017) Data-dependent label distribution learning for age estimation. IEEE Trans Image Process 26(8):3846–3858MathSciNet
50.
Zurück zum Zitat Hernández-González J, Inza I, Lozano JA (2016) Weak supervision and other non-standard classification problems: a taxonomy. Pattern Recognit Lett 69:49–55 Hernández-González J, Inza I, Lozano JA (2016) Weak supervision and other non-standard classification problems: a taxonomy. Pattern Recognit Lett 69:49–55
51.
Zurück zum Zitat Herrera F, Charte F, Rivera AJ, del Jesus MJ (2016) Multilabel classification: problem analysis, metrics and techniques. Springer, Berlin Herrera F, Charte F, Rivera AJ, del Jesus MJ (2016) Multilabel classification: problem analysis, metrics and techniques. Springer, Berlin
52.
Zurück zum Zitat Herrera F, Ventura S, Bello R, Cornelis C, Zafra A, Sánchez-Tarragó D, Vluymans S (2016) Multiple instance learning: foundations and algorithms. Springer, BerlinMATH Herrera F, Ventura S, Bello R, Cornelis C, Zafra A, Sánchez-Tarragó D, Vluymans S (2016) Multiple instance learning: foundations and algorithms. Springer, BerlinMATH
53.
Zurück zum Zitat Hornung R (2017) Ordinal forests. Technical report 212. University of Munich, Department of Statistics Hornung R (2017) Ordinal forests. Technical report 212. University of Munich, Department of Statistics
54.
Zurück zum Zitat Hu Q, Che X, Zhang L, Zhang D, Guo M, Yu D (2012) Rank entropy-based decision trees for monotonic classification. IEEE Trans Knowl Data Eng 24(11):2052–2064 Hu Q, Che X, Zhang L, Zhang D, Guo M, Yu D (2012) Rank entropy-based decision trees for monotonic classification. IEEE Trans Knowl Data Eng 24(11):2052–2064
55.
Zurück zum Zitat Ipeirotis PG, Provost F, Sheng VS, Wang J (2014) Repeated labeling using multiple noisy labelers. Data Min Knowl Discov 28(2):402–441MathSciNetMATH Ipeirotis PG, Provost F, Sheng VS, Wang J (2014) Repeated labeling using multiple noisy labelers. Data Min Knowl Discov 28(2):402–441MathSciNetMATH
56.
Zurück zum Zitat Jabbari S, Holte RC, Zilles S (2012) Pac-learning with general class noise models. In: Annual conference on artificial intelligence. Springer, pp 73–84 Jabbari S, Holte RC, Zilles S (2012) Pac-learning with general class noise models. In: Annual conference on artificial intelligence. Springer, pp 73–84
57.
Zurück zum Zitat Josse J, Wager S (2016) Bootstrap-based regularization for low-rank matrix estimation. J Mach Learn Res 17(1):4227–4255MathSciNetMATH Josse J, Wager S (2016) Bootstrap-based regularization for low-rank matrix estimation. J Mach Learn Res 17(1):4227–4255MathSciNetMATH
58.
Zurück zum Zitat Khardon R, Wachman G (2007) Noise tolerant variants of the perceptron algorithm. J Mach Learn Res 8:227–248MATH Khardon R, Wachman G (2007) Noise tolerant variants of the perceptron algorithm. J Mach Learn Res 8:227–248MATH
59.
Zurück zum Zitat Krawczyk B, Woźniak M (2015) One-class classifiers with incremental learning and forgetting for data streams with concept drift. Soft Comput 19(12):3387–3400 Krawczyk B, Woźniak M (2015) One-class classifiers with incremental learning and forgetting for data streams with concept drift. Soft Comput 19(12):3387–3400
60.
Zurück zum Zitat Kubat M (2015) Similarities: nearest neighbor classifiers. In: An introduction to machine learning. Springer, pp 43–64 Kubat M (2015) Similarities: nearest neighbor classifiers. In: An introduction to machine learning. Springer, pp 43–64
61.
Zurück zum Zitat Lachenbruch PA (1979) Note on initial misclassification effects on the quadratic discriminant function. Technometrics 21(1):129–132MATH Lachenbruch PA (1979) Note on initial misclassification effects on the quadratic discriminant function. Technometrics 21(1):129–132MATH
62.
Zurück zum Zitat Lawrence ND, Schölkopf B (2001) Estimating a kernel fisher discriminant in the presence of label noise. In: International conference on machine learning (ICML), pp 306–313 Lawrence ND, Schölkopf B (2001) Estimating a kernel fisher discriminant in the presence of label noise. In: International conference on machine learning (ICML), pp 306–313
63.
Zurück zum Zitat Leisch F, Weingessel A, Hornik K (1998) On the generation of correlated artificial binary data. SFB Adaptive information systems and modelling in economics and management science, 13. Working paper series, WU Vienna University of Economics and Business, Vienna Leisch F, Weingessel A, Hornik K (1998) On the generation of correlated artificial binary data. SFB Adaptive information systems and modelling in economics and management science, 13. Working paper series, WU Vienna University of Economics and Business, Vienna
64.
Zurück zum Zitat Leung T, Song Y, Zhang J (2011) Handling label noise in video classification via multiple instance learning. In: IEEE international conference on computer vision (ICCV). IEEE, pp 2056–2063 Leung T, Song Y, Zhang J (2011) Handling label noise in video classification via multiple instance learning. In: IEEE international conference on computer vision (ICCV). IEEE, pp 2056–2063
65.
Zurück zum Zitat Li S-T, Chen C-C (2015) A regularized monotonic fuzzy support vector machine model for data mining with prior knowledge. IEEE Trans Fuzzy Syst 23(5):1713–1727 Li S-T, Chen C-C (2015) A regularized monotonic fuzzy support vector machine model for data mining with prior knowledge. IEEE Trans Fuzzy Syst 23(5):1713–1727
66.
Zurück zum Zitat Li W, Vasconcelos N (2015) Multiple instance learning for soft bags via top instances. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 4277–4285 Li W, Vasconcelos N (2015) Multiple instance learning for soft bags via top instances. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 4277–4285
68.
Zurück zum Zitat Lin H-T, Li L (2012) Reduction from cost-sensitive ordinal ranking to weighted binary classification. Neural Comput 24(5):1329–1367MATH Lin H-T, Li L (2012) Reduction from cost-sensitive ordinal ranking to weighted binary classification. Neural Comput 24(5):1329–1367MATH
69.
Zurück zum Zitat Little RJ, Rubin DB (2002) Statistical analysis with missing data. Wiley, New YorkMATH Little RJ, Rubin DB (2002) Statistical analysis with missing data. Wiley, New YorkMATH
70.
Zurück zum Zitat Liu B (2015) Sentiment analysis: mining opinions, sentiments, and emotions. Cambridge University Press, Cambridge Liu B (2015) Sentiment analysis: mining opinions, sentiments, and emotions. Cambridge University Press, Cambridge
71.
Zurück zum Zitat Lorena AC, Garcia L PF, de Carvalho ACPLF (2015) Adapting noise filters for ranking. In: Brazilian conference on intelligent systems (BRACIS), pp 299–304 Lorena AC, Garcia L PF, de Carvalho ACPLF (2015) Adapting noise filters for ranking. In: Brazilian conference on intelligent systems (BRACIS), pp 299–304
72.
Zurück zum Zitat Luengo J, Shim S-O, Alshomrani S, Altalhi A, Herrera F (2018) CNC-NOS: class noise cleaning by ensemble filtering and noise scoring. Knowl Based Syst 140:27–49 Luengo J, Shim S-O, Alshomrani S, Altalhi A, Herrera F (2018) CNC-NOS: class noise cleaning by ensemble filtering and noise scoring. Knowl Based Syst 140:27–49
73.
Zurück zum Zitat Ma L, Destercke S, Wang Y (2016) Online active learning of decision trees with evidential data. Pattern Recognit 52:33–45MATH Ma L, Destercke S, Wang Y (2016) Online active learning of decision trees with evidential data. Pattern Recognit 52:33–45MATH
74.
Zurück zum Zitat Maloof MA, Michalski RS (2000) Selecting examples for partial memory learning. Mach Learn 41(1):27–52MATH Maloof MA, Michalski RS (2000) Selecting examples for partial memory learning. Mach Learn 41(1):27–52MATH
75.
Zurück zum Zitat Manwani N, Sastry P (2013) Noise tolerance under risk minimization. IEEE Trans Cybern 43(3):1146–1151 Manwani N, Sastry P (2013) Noise tolerance under risk minimization. IEEE Trans Cybern 43(3):1146–1151
76.
Zurück zum Zitat Maron O (1998) Learning from ambiguity. PhD thesis, Massachusetts Institute of Technology Maron O (1998) Learning from ambiguity. PhD thesis, Massachusetts Institute of Technology
77.
Zurück zum Zitat Maron O, Lozano-Pérez T (1998) A framework for multiple-instance learning. Adv Neural Inf Process Syst 10:570–576 Maron O, Lozano-Pérez T (1998) A framework for multiple-instance learning. Adv Neural Inf Process Syst 10:570–576
78.
Zurück zum Zitat Masud M, Gao J, Khan L, Han J, Thuraisingham BM (2011) Classification and novel class detection in concept-drifting data streams under time constraints. IEEE Trans Knowl Data Eng 23(6):859–874 Masud M, Gao J, Khan L, Han J, Thuraisingham BM (2011) Classification and novel class detection in concept-drifting data streams under time constraints. IEEE Trans Knowl Data Eng 23(6):859–874
79.
Zurück zum Zitat Masud MM, Chen Q, Gao J, Khan L, Han J, Thuraisingham B (2010) Classification and novel class detection of data streams in a dynamic feature space. In: European conference on machine learning and principles and practice of knowledge discovery (ECML/PKDD). Springer, pp 337–352 Masud MM, Chen Q, Gao J, Khan L, Han J, Thuraisingham B (2010) Classification and novel class detection of data streams in a dynamic feature space. In: European conference on machine learning and principles and practice of knowledge discovery (ECML/PKDD). Springer, pp 337–352
80.
Zurück zum Zitat Masud MM, Chen Q, Khan L, Aggarwal CC, Gao J, Han J, Srivastava A, Oza NC (2013) Classification and adaptive novel class detection of feature-evolving data streams. IEEE Trans Knowl Data Eng 25(7):1484–1497 Masud MM, Chen Q, Khan L, Aggarwal CC, Gao J, Han J, Srivastava A, Oza NC (2013) Classification and adaptive novel class detection of feature-evolving data streams. IEEE Trans Knowl Data Eng 25(7):1484–1497
81.
Zurück zum Zitat McLachlan G (1972) Asymptotic results for discriminant analysis when the initial samples are misclassified. Technometrics 14(2):415–422MATH McLachlan G (1972) Asymptotic results for discriminant analysis when the initial samples are misclassified. Technometrics 14(2):415–422MATH
82.
Zurück zum Zitat Miao Q, Cao Y, Xia G, Gong M, Liu J, Song J (2016) Rboost: label noise-robust boosting algorithm based on a nonconvex loss function and the numerically stable base learners. IEEE Trans Neural Netw Learn Syst 27(11):2216–2228MathSciNet Miao Q, Cao Y, Xia G, Gong M, Liu J, Song J (2016) Rboost: label noise-robust boosting algorithm based on a nonconvex loss function and the numerically stable base learners. IEEE Trans Neural Netw Learn Syst 27(11):2216–2228MathSciNet
83.
Zurück zum Zitat Michalek JE, Tripathi RC (1980) The effect of errors in diagnosis and measurement on the estimation of the probability of an event. J Am Stat Assoc 75(371):713–721MathSciNetMATH Michalek JE, Tripathi RC (1980) The effect of errors in diagnosis and measurement on the estimation of the probability of an event. J Am Stat Assoc 75(371):713–721MathSciNetMATH
84.
Zurück zum Zitat Milstein I, David AB, Potharst R (2013) Generating noisy monotone ordinal datasets. Artif Intell Rev 3(1):p30 Milstein I, David AB, Potharst R (2013) Generating noisy monotone ordinal datasets. Artif Intell Rev 3(1):p30
85.
Zurück zum Zitat Minku LL, White AP, Yao X (2010) The impact of diversity on online ensemble learning in the presence of concept drift. IEEE Trans Knowl Data Eng 22(5):730–742 Minku LL, White AP, Yao X (2010) The impact of diversity on online ensemble learning in the presence of concept drift. IEEE Trans Knowl Data Eng 22(5):730–742
86.
Zurück zum Zitat Miranda ALB, Garcia LPF, Carvalho ACPLF, Lorena AC (2009) Use of classification algorithms in noise detection and elimination. In: Corchado E, Wu X, Oja E, Herrero Á, Baruque B (eds) Proceedings of the hybrid artificial intelligence systems: 4th international conference, HAIS 2009, Salamanca, Spain. Springer, Berlin, pp 424–471 Miranda ALB, Garcia LPF, Carvalho ACPLF, Lorena AC (2009) Use of classification algorithms in noise detection and elimination. In: Corchado E, Wu X, Oja E, Herrero Á, Baruque B (eds) Proceedings of the hybrid artificial intelligence systems: 4th international conference, HAIS 2009, Salamanca, Spain. Springer, Berlin, pp 424–471
87.
Zurück zum Zitat Montañes E, Senge R, Barranquero J, Quevedo JR, del Coz JJ, Hüllermeier E (2014) Dependent binary relevance models for multi-label classification. Pattern Recognit 47(3):1494–1508 Montañes E, Senge R, Barranquero J, Quevedo JR, del Coz JJ, Hüllermeier E (2014) Dependent binary relevance models for multi-label classification. Pattern Recognit 47(3):1494–1508
88.
Zurück zum Zitat Napierała K, Stefanowski J, Wilk S (2010) Learning from imbalanced data in presence of noisy and borderline examples. In: International conference on rough sets and current trends in computing. Springer, pp 158–167 Napierała K, Stefanowski J, Wilk S (2010) Learning from imbalanced data in presence of noisy and borderline examples. In: International conference on rough sets and current trends in computing. Springer, pp 158–167
89.
Zurück zum Zitat Natarajan N, Dhillon IS, Ravikumar PK, Tewari A (2013) Learning with noisy labels. In: Advances in neural information processing systems (NIPS), pp 1196–1204 Natarajan N, Dhillon IS, Ravikumar PK, Tewari A (2013) Learning with noisy labels. In: Advances in neural information processing systems (NIPS), pp 1196–1204
90.
Zurück zum Zitat Nettleton DF, Orriols-Puig A, Fornells A (2010) A study of the effect of different types of noise on the precision of supervised learning techniques. Artif Intell Rev 33(4):275–306 Nettleton DF, Orriols-Puig A, Fornells A (2010) A study of the effect of different types of noise on the precision of supervised learning techniques. Artif Intell Rev 33(4):275–306
91.
Zurück zum Zitat Nicholson B, Sheng VS, Zhang J (2016) Label noise correction and application in crowdsourcing. Expert Syst Appl 66:149–162 Nicholson B, Sheng VS, Zhang J (2016) Label noise correction and application in crowdsourcing. Expert Syst Appl 66:149–162
92.
Zurück zum Zitat Nowak S, Rüger S (2010) How reliable are annotations via crowdsourcing: a study about inter-annotator agreement for multi-label image annotation. In: International conference on multimedia information retrieval (ICMR). ACM, pp 557–566 Nowak S, Rüger S (2010) How reliable are annotations via crowdsourcing: a study about inter-annotator agreement for multi-label image annotation. In: International conference on multimedia information retrieval (ICMR). ACM, pp 557–566
93.
Zurück zum Zitat Okamoto S, Yugami N (2003) Effects of domain characteristics on instance-based learning algorithms. Theor Comput Sci 298(1):207–233MathSciNetMATH Okamoto S, Yugami N (2003) Effects of domain characteristics on instance-based learning algorithms. Theor Comput Sci 298(1):207–233MathSciNetMATH
94.
Zurück zum Zitat Ozuysal M, Calonder M, Lepetit V, Fua P (2010) Fast keypoint recognition using random ferns. IEEE Trans Pattern Anal Mach Intell 32(3):448–461 Ozuysal M, Calonder M, Lepetit V, Fua P (2010) Fast keypoint recognition using random ferns. IEEE Trans Pattern Anal Mach Intell 32(3):448–461
95.
Zurück zum Zitat Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359 Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359
96.
Zurück zum Zitat Pathak D, Shelhamer E, Long J, Darrell T (2015) Fully convolutional multi-class multiple instance learning. In: International conference on learning representations (ICLR) workshop. arXiv:1412.7144 Pathak D, Shelhamer E, Long J, Darrell T (2015) Fully convolutional multi-class multiple instance learning. In: International conference on learning representations (ICLR) workshop. arXiv:​1412.​7144
97.
Zurück zum Zitat Pearl J (1988) Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, BurlingtonMATH Pearl J (1988) Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, BurlingtonMATH
98.
Zurück zum Zitat Pérez CJ, González-Torre FJG, Martín J, Ruiz M, Rojano C (2007) Misclassified multinomial data: a Bayesian approach. RACSAM 101(1):71–80MathSciNetMATH Pérez CJ, González-Torre FJG, Martín J, Ruiz M, Rojano C (2007) Misclassified multinomial data: a Bayesian approach. RACSAM 101(1):71–80MathSciNetMATH
99.
Zurück zum Zitat Perez PS, Nozawa SR, Macedo AA, Baranauskas JA (2016) Windowing improvements towards more comprehensible models. Knowl Based Syst 92:9–22 Perez PS, Nozawa SR, Macedo AA, Baranauskas JA (2016) Windowing improvements towards more comprehensible models. Knowl Based Syst 92:9–22
100.
Zurück zum Zitat Prati RC, Batista GEAPA, Silva DF (2015) Class imbalance revisited: a new experimental setup to assess the performance of treatment methods. Knowl Inf Syst 45(1):247–270 Prati RC, Batista GEAPA, Silva DF (2015) Class imbalance revisited: a new experimental setup to assess the performance of treatment methods. Knowl Inf Syst 45(1):247–270
101.
Zurück zum Zitat Qi Z, Yang M, Zhang ZM, Zhang Z (2012) Mining noisy tagging from multi-label space. In: ACM international conference on information and knowledge management (CIKM). ACM, pp 1925–1929 Qi Z, Yang M, Zhang ZM, Zhang Z (2012) Mining noisy tagging from multi-label space. In: ACM international conference on information and knowledge management (CIKM). ACM, pp 1925–1929
102.
Zurück zum Zitat Qu W, Zhang Y, Zhu J, Qiu Q (2009) Mining multi-label concept-drifting data streams using dynamic classifier ensemble. In: Asian conference on machine learning (ACML). Springer, pp 308–321 Qu W, Zhang Y, Zhu J, Qiu Q (2009) Mining multi-label concept-drifting data streams using dynamic classifier ensemble. In: Asian conference on machine learning (ACML). Springer, pp 308–321
103.
Zurück zum Zitat Quinlan JR (1986) Induction of decision trees. Mach Learn 1(1):81–106 Quinlan JR (1986) Induction of decision trees. Mach Learn 1(1):81–106
104.
Zurück zum Zitat Quinlan JR (1993) C4. 5: programs for machine learning. Elsevier, New York Quinlan JR (1993) C4. 5: programs for machine learning. Elsevier, New York
105.
Zurück zum Zitat Rademaker M, De Baets B, De Meyer H (2012) Optimal monotone relabelling of partially non-monotone ordinal data. Optim Methods Softw 27(1):17–31MathSciNetMATH Rademaker M, De Baets B, De Meyer H (2012) Optimal monotone relabelling of partially non-monotone ordinal data. Optim Methods Softw 27(1):17–31MathSciNetMATH
106.
Zurück zum Zitat Rakitsch B, Lippert C, Borgwardt K, Stegle O (2013) It is all in the noise: efficient multi-task Gaussian process inference with structured residuals. In: Advances in neural information processing systems (NIPS), pp 1466–1474 Rakitsch B, Lippert C, Borgwardt K, Stegle O (2013) It is all in the noise: efficient multi-task Gaussian process inference with structured residuals. In: Advances in neural information processing systems (NIPS), pp 1466–1474
107.
Zurück zum Zitat Ralaivola L, Denis F, Magnan CN (2006) CN = CPCN. In: Proceedings of the 23rd international conference on Machine learning. ACM, pp 721–728 Ralaivola L, Denis F, Magnan CN (2006) CN = CPCN. In: Proceedings of the 23rd international conference on Machine learning. ACM, pp 721–728
108.
Zurück zum Zitat Read J, Pfahringer B, Holmes G, Frank E (2011) Classifier chains for multi-label classification. Mach Learn 85(3):333–359MathSciNet Read J, Pfahringer B, Holmes G, Frank E (2011) Classifier chains for multi-label classification. Mach Learn 85(3):333–359MathSciNet
109.
Zurück zum Zitat Rider AK, Johnson RA, Davis DA, Hoens TR, Chawla NV (2013) Classifier evaluation with missing negative class labels. In: International symposium on intelligent data analysis. Springer, pp 380–391 Rider AK, Johnson RA, Davis DA, Hoens TR, Chawla NV (2013) Classifier evaluation with missing negative class labels. In: International symposium on intelligent data analysis. Springer, pp 380–391
110.
111.
Zurück zum Zitat Sabzevari M, Martínez-Muñoz G, Suárez A (2018) A two-stage ensemble method for the detection of class-label noise. Neurocomputing 275:2374–2383 Sabzevari M, Martínez-Muñoz G, Suárez A (2018) A two-stage ensemble method for the detection of class-label noise. Neurocomputing 275:2374–2383
112.
Zurück zum Zitat Sáez JA, Galar M, Luengo J, Herrera F (2014) Analyzing the presence of noise in multi-class problems: alleviating its influence with the one-vs-one decomposition. Knowl Inf Syst 38(1):179–206 Sáez JA, Galar M, Luengo J, Herrera F (2014) Analyzing the presence of noise in multi-class problems: alleviating its influence with the one-vs-one decomposition. Knowl Inf Syst 38(1):179–206
113.
Zurück zum Zitat Sáez JA, Galar M, Luengo J, Herrera F (2016) INFFC: an iterative class noise filter based on the fusion of classifiers with noise sensitivity control. Inform Fusion 27:19–32 Sáez JA, Galar M, Luengo J, Herrera F (2016) INFFC: an iterative class noise filter based on the fusion of classifiers with noise sensitivity control. Inform Fusion 27:19–32
114.
Zurück zum Zitat Sáez JA, Luengo J, Stefanowski J, Herrera F (2015) Smote-ipf: addressing the noisy and borderline examples problem in imbalanced classification by a re-sampling method with filtering. Inf Sci 291:184–203 Sáez JA, Luengo J, Stefanowski J, Herrera F (2015) Smote-ipf: addressing the noisy and borderline examples problem in imbalanced classification by a re-sampling method with filtering. Inf Sci 291:184–203
115.
Zurück zum Zitat Sánchez JS, Pla F, Ferri FJ (1997) Prototype selection for the nearest neighbour rule through proximity graphs. Pattern Recognit Lett 18(6):507–513 Sánchez JS, Pla F, Ferri FJ (1997) Prototype selection for the nearest neighbour rule through proximity graphs. Pattern Recognit Lett 18(6):507–513
116.
Zurück zum Zitat Scott C (2015) A rate of convergence for mixture proportion estimation, with application to learning from noisy labels. In: International conference on artificial intelligence and statistics (AISTATS), pp 838–846 Scott C (2015) A rate of convergence for mixture proportion estimation, with application to learning from noisy labels. In: International conference on artificial intelligence and statistics (AISTATS), pp 838–846
117.
Zurück zum Zitat Sluban B, Gamberger D, Lavrač N (2014) Ensemble-based noise detection: noise ranking and visual performance evaluation. Data Min Knowl Discov 28(2):265–303MathSciNetMATH Sluban B, Gamberger D, Lavrač N (2014) Ensemble-based noise detection: noise ranking and visual performance evaluation. Data Min Knowl Discov 28(2):265–303MathSciNetMATH
118.
Zurück zum Zitat Street WN, Kim Y (2001) A streaming ensemble algorithm (sea) for large-scale classification. In: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 377–382 Street WN, Kim Y (2001) A streaming ensemble algorithm (sea) for large-scale classification. In: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 377–382
119.
Zurück zum Zitat Sulis E, Farías DIH, Rosso P, Patti V, Ruffo G (2016) Figurative messages and affect in twitter: differences between #irony, #sarcasm and #not. Knowl Based Syst 108:132–143 Sulis E, Farías DIH, Rosso P, Patti V, Ruffo G (2016) Figurative messages and affect in twitter: differences between #irony, #sarcasm and #not. Knowl Based Syst 108:132–143
120.
Zurück zum Zitat Sun B, Chen S, Wang J, Chen H (2016) A robust multi-class AdaBoost algorithm for mislabeled noisy data. Knowl Based Syst 102:87–102 Sun B, Chen S, Wang J, Chen H (2016) A robust multi-class AdaBoost algorithm for mislabeled noisy data. Knowl Based Syst 102:87–102
121.
Zurück zum Zitat Sun S (2013) A survey of multi-view machine learning. Neural Comput Appl 23(7–8):2031–2038 Sun S (2013) A survey of multi-view machine learning. Neural Comput Appl 23(7–8):2031–2038
122.
Zurück zum Zitat Sun Y, Tang K, Minku LL, Wang S, Yao X (2016) Online ensemble learning of data streams with gradually evolved classes. IEEE Trans Knowl Data Eng 28(6):1532–1545 Sun Y, Tang K, Minku LL, Wang S, Yao X (2016) Online ensemble learning of data streams with gradually evolved classes. IEEE Trans Knowl Data Eng 28(6):1532–1545
123.
Zurück zum Zitat Tan M, Shi Q, van den Hengel A, Shen C, Gao J, Hu F, Zhang Z (2015) Learning graph structure for multi-label image classification via clique generation. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 4100–4109 Tan M, Shi Q, van den Hengel A, Shen C, Gao J, Hu F, Zhang Z (2015) Learning graph structure for multi-label image classification via clique generation. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 4100–4109
124.
Zurück zum Zitat Teng C-M (1999) Correcting noisy data. In: Proceedings of the sixteenth international conference on machine learning. Morgan Kaufmann Publishers, San Francisco, CA, USA, pp 239–248 Teng C-M (1999) Correcting noisy data. In: Proceedings of the sixteenth international conference on machine learning. Morgan Kaufmann Publishers, San Francisco, CA, USA, pp 239–248
125.
Zurück zum Zitat Tu H-H, Lin H-T (2010) One-sided support vector regression for multiclass cost-sensitive classification. In: International conference on machine learning (ICML), pp 1095–1102 Tu H-H, Lin H-T (2010) One-sided support vector regression for multiclass cost-sensitive classification. In: International conference on machine learning (ICML), pp 1095–1102
126.
Zurück zum Zitat Van Hulse J, Khoshgoftaar T (2009) Knowledge discovery from imbalanced and noisy data. Data Knowl Eng 68(12):1513–1542 Van Hulse J, Khoshgoftaar T (2009) Knowledge discovery from imbalanced and noisy data. Data Knowl Eng 68(12):1513–1542
127.
Zurück zum Zitat Vens C, Struyf J, Schietgat L, Džeroski S, Blockeel H (2008) Decision trees for hierarchical multi-label classification. Mach Learn 73(2):185–214 Vens C, Struyf J, Schietgat L, Džeroski S, Blockeel H (2008) Decision trees for hierarchical multi-label classification. Mach Learn 73(2):185–214
128.
Zurück zum Zitat Wang S, Yao X (2012) Multiclass imbalance problems: analysis and potential solutions. IEEE Trans Syst Man Cybern B 42(4):1119–1130 Wang S, Yao X (2012) Multiclass imbalance problems: analysis and potential solutions. IEEE Trans Syst Man Cybern B 42(4):1119–1130
129.
Zurück zum Zitat Wei Y, Zheng Y, Yang Q (2016) Transfer knowledge between cities. In: ACM SIGKDD conference on knowledge discovery and data mining (KDD). ACM, pp 1905–1914 Wei Y, Zheng Y, Yang Q (2016) Transfer knowledge between cities. In: ACM SIGKDD conference on knowledge discovery and data mining (KDD). ACM, pp 1905–1914
130.
Zurück zum Zitat Xiao H, Xiao H, Eckert C (2012) Adversarial label flips attack on support vector machines. In: Proceedings of the 20th european conference on artificial intelligence. IOS Press, pp 870–875 Xiao H, Xiao H, Eckert C (2012) Adversarial label flips attack on support vector machines. In: Proceedings of the 20th european conference on artificial intelligence. IOS Press, pp 870–875
131.
Zurück zum Zitat Xiao T, Xia T, Yang Y, Huang C, Wang X (2015) Learning from massive noisy labeled data for image classification. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 2691–2699 Xiao T, Xia T, Yang Y, Huang C, Wang X (2015) Learning from massive noisy labeled data for image classification. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 2691–2699
132.
Zurück zum Zitat Xing C, Geng X, Xue H (2016) Logistic boosting regression for label distribution learning, In: ‘Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4489–4497 Xing C, Geng X, Xue H (2016) Logistic boosting regression for label distribution learning, In: ‘Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4489–4497
133.
Zurück zum Zitat Xu K, Liao SS, Li J, Song Y (2011) Mining comparative opinions from customer reviews for competitive intelligence. Decis Support Syst 50(4):743–754 Xu K, Liao SS, Li J, Song Y (2011) Mining comparative opinions from customer reviews for competitive intelligence. Decis Support Syst 50(4):743–754
134.
Zurück zum Zitat Xu L, Wang Z, Shen Z, Wang Y, Chen E (2014) Learning low-rank label correlations for multi-label classification with missing labels. In: International conference on data mining (ICDM). IEEE, pp 1067–1072 Xu L, Wang Z, Shen Z, Wang Y, Chen E (2014) Learning low-rank label correlations for multi-label classification with missing labels. In: International conference on data mining (ICDM). IEEE, pp 1067–1072
135.
Zurück zum Zitat Xu M, Zhou Z-H (2017) Incomplete label distribution learning. In: Proceedings of the 26th international joint conference on artificial intelligence. AAAI Press, pp 3175–3181 Xu M, Zhou Z-H (2017) Incomplete label distribution learning. In: Proceedings of the 26th international joint conference on artificial intelligence. AAAI Press, pp 3175–3181
136.
Zurück zum Zitat Xu X, Li B (2007) Multiple class multiple-instance learning and its application to image categorization. Int J Image Graph 7(03):427–444 Xu X, Li B (2007) Multiple class multiple-instance learning and its application to image categorization. Int J Image Graph 7(03):427–444
137.
Zurück zum Zitat Yang C-Y, Wang J-J, Chou J-J, Lian F-L (2015) Confirming robustness of fuzzy support vector machine via \(\xi \)-\(\alpha \) bound. Neurocomputing 162:256–266 Yang C-Y, Wang J-J, Chou J-J, Lian F-L (2015) Confirming robustness of fuzzy support vector machine via \(\xi \)-\(\alpha \) bound. Neurocomputing 162:256–266
138.
Zurück zum Zitat Yogatama D, Mann G (2014) Efficient transfer learning method for automatic hyperparameter tuning, In: Artificial intelligence and statistics, pp 1077–1085 Yogatama D, Mann G (2014) Efficient transfer learning method for automatic hyperparameter tuning, In: Artificial intelligence and statistics, pp 1077–1085
139.
Zurück zum Zitat Yuan X-T, Liu X, Yan S (2012) Visual classification with multitask joint sparse representation. IEEE Trans Image Process 21(10):4349–4360MathSciNetMATH Yuan X-T, Liu X, Yan S (2012) Visual classification with multitask joint sparse representation. IEEE Trans Image Process 21(10):4349–4360MathSciNetMATH
140.
Zurück zum Zitat Zeng X, Martinez T (2008) Using decision trees and soft labeling to filter mislabeled data. J Intell Syst 17(4):331–354 Zeng X, Martinez T (2008) Using decision trees and soft labeling to filter mislabeled data. J Intell Syst 17(4):331–354
141.
Zurück zum Zitat Zhang C, Wu C, Blanzieri E, Zhou Y, Wang Y, Du W, Liang Y (2009) Methods for labeling error detection in microarrays based on the effect of data perturbation on the regression model. Bioinformatics 25(20):2708–2714 Zhang C, Wu C, Blanzieri E, Zhou Y, Wang Y, Du W, Liang Y (2009) Methods for labeling error detection in microarrays based on the effect of data perturbation on the regression model. Bioinformatics 25(20):2708–2714
142.
Zurück zum Zitat Zhang P, Zhu X, Shi Y, Guo L, Wu X (2011) Robust ensemble learning for mining noisy data streams. Decis Support Syst 50(2):469–479 Zhang P, Zhu X, Shi Y, Guo L, Wu X (2011) Robust ensemble learning for mining noisy data streams. Decis Support Syst 50(2):469–479
143.
Zurück zum Zitat Zhang W, Rekaya R, Bertrand K (2006) A method for predicting disease subtypes in presence of misclassification among training samples using gene expression: application to human breast cancer. Bioinformatics 22(3):317–325 Zhang W, Rekaya R, Bertrand K (2006) A method for predicting disease subtypes in presence of misclassification among training samples using gene expression: application to human breast cancer. Bioinformatics 22(3):317–325
144.
Zurück zum Zitat Zhang Z, Zhou J (2010) Transfer estimation of evolving class priors in data stream classification. Pattern Recognit 43(9):3151–3161MATH Zhang Z, Zhou J (2010) Transfer estimation of evolving class priors in data stream classification. Pattern Recognit 43(9):3151–3161MATH
145.
Zurück zum Zitat Zhou J, Liu J, Narayan VA, Ye J, Initiative ADN et al (2013) Modeling disease progression via multi-task learning. Neuroimage 78:233–248 Zhou J, Liu J, Narayan VA, Ye J, Initiative ADN et al (2013) Modeling disease progression via multi-task learning. Neuroimage 78:233–248
146.
Zurück zum Zitat Zhou Z-H, Zhang M-L, Huang S-J, Li Y-F (2012) Multi-instance multi-label learning. Artif Intell 176(1):2291–2320MathSciNetMATH Zhou Z-H, Zhang M-L, Huang S-J, Li Y-F (2012) Multi-instance multi-label learning. Artif Intell 176(1):2291–2320MathSciNetMATH
147.
Zurück zum Zitat Zhu X, Wu X (2004a) Class noise vs. attribute noise: a quantitative study. Artif Intell Rev 22(3):177–210MATH Zhu X, Wu X (2004a) Class noise vs. attribute noise: a quantitative study. Artif Intell Rev 22(3):177–210MATH
148.
Zurück zum Zitat Zhu X, Wu X (2004b) Cost-guided class noise handling for effective cost-sensitive learning. In: IEEE international conference on data mining (ICDM), IEEE, pp 297–304 Zhu X, Wu X (2004b) Cost-guided class noise handling for effective cost-sensitive learning. In: IEEE international conference on data mining (ICDM), IEEE, pp 297–304
149.
Zurück zum Zitat Zhu X, Wu X, Chen Q (2003) Eliminating class noise in large datasets. In: International conference on machine learning (ICML), vol 3, pp 920–927 Zhu X, Wu X, Chen Q (2003) Eliminating class noise in large datasets. In: International conference on machine learning (ICML), vol 3, pp 920–927
150.
Zurück zum Zitat Zhu X, Wu X, Chen Q (2006) Bridging local and global data cleansing: Identifying class noise in large, distributed data datasets. Data Min Knowl Discov 12(2–3):275–308MathSciNet Zhu X, Wu X, Chen Q (2006) Bridging local and global data cleansing: Identifying class noise in large, distributed data datasets. Data Min Knowl Discov 12(2–3):275–308MathSciNet
151.
Zurück zum Zitat Zhu X, Wu X, Khoshgoftaar TM, Shi Y (2007) An empirical study of the noise impact on cost-sensitive learning. In: International joint conference on artificial intelligence (IJCAI), vol 7, pp 1168–1173 Zhu X, Wu X, Khoshgoftaar TM, Shi Y (2007) An empirical study of the noise impact on cost-sensitive learning. In: International joint conference on artificial intelligence (IJCAI), vol 7, pp 1168–1173
152.
Zurück zum Zitat Zhu Y, Shasha D (2002) Statstream: statistical monitoring of thousands of data streams in real time. In: International conference on very large data bases (VLDB), VLDB Endowment, pp 358–369 Zhu Y, Shasha D (2002) Statstream: statistical monitoring of thousands of data streams in real time. In: International conference on very large data bases (VLDB), VLDB Endowment, pp 358–369
153.
Zurück zum Zitat Žliobaitė I, Bifet A, Pfahringer B, Holmes G (2014) Active learning with drifting streaming data. IEEE Trans Neural Netw Learn Syst 25(1):27–39 Žliobaitė I, Bifet A, Pfahringer B, Holmes G (2014) Active learning with drifting streaming data. IEEE Trans Neural Netw Learn Syst 25(1):27–39
Metadaten
Titel
Emerging topics and challenges of learning from noisy data in nonstandard classification: a survey beyond binary class noise
verfasst von
Ronaldo C. Prati
Julián Luengo
Francisco Herrera
Publikationsdatum
06.07.2018
Verlag
Springer London
Erschienen in
Knowledge and Information Systems / Ausgabe 1/2019
Print ISSN: 0219-1377
Elektronische ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-018-1244-4

Weitere Artikel der Ausgabe 1/2019

Knowledge and Information Systems 1/2019 Zur Ausgabe