Published in: Data Mining and Knowledge Discovery 2/2014

01.03.2014

Repeated labeling using multiple noisy labelers

Authors: Panagiotis G. Ipeirotis, Foster Provost, Victor S. Sheng, Jing Wang



Abstract

This paper addresses the repeated acquisition of labels for data items when the labeling is imperfect. We examine the improvement (or lack thereof) in data quality via repeated labeling, and focus especially on the improvement of training labels for supervised induction of predictive models. With the outsourcing of small tasks becoming easier, for example via Amazon’s Mechanical Turk, it often is possible to obtain less-than-expert labeling at low cost. With low-cost labeling, preparing the unlabeled part of the data can become considerably more expensive than labeling. We present repeated-labeling strategies of increasing complexity, and show several main results. (i) Repeated-labeling can improve label quality and model quality, but not always. (ii) When labels are noisy, repeated labeling can be preferable to single labeling even in the traditional setting where labels are not particularly cheap. (iii) As soon as the cost of processing the unlabeled data is not free, even the simple strategy of labeling everything multiple times can give considerable advantage. (iv) Repeatedly labeling a carefully chosen set of points is generally preferable, and we present a set of robust techniques that combine different notions of uncertainty to select data points for which quality should be improved. The bottom line: the results show clearly that when labeling is not perfect, selective acquisition of multiple labels is a strategy that data miners should have in their repertoire; for certain label-quality/cost regimes, the benefit is substantial.
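The abstract's point (i), that repeated labeling can improve label quality but not always, can be made concrete for the simplest strategy, majority voting. The following is a minimal sketch (not code from the paper): with independent labelers of equal accuracy p on a binary task, the quality of a majority-voted label is a binomial tail sum, which rises above p only when p > 0.5.

```python
from math import comb

def majority_vote_quality(p, n):
    """Probability that a majority vote over n independent labelers,
    each correct with probability p, yields the correct binary label.
    Assumes odd n so there are no ties."""
    need = n // 2 + 1  # smallest number of correct votes that wins
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(need, n + 1))

# With p = 0.7, quality improves with more labels: 0.7, 0.784, ~0.837
print(majority_vote_quality(0.7, 1),
      majority_vote_quality(0.7, 3),
      majority_vote_quality(0.7, 5))
```

For p below 0.5 the same sum falls below p, matching the paper's caveat that repeated labeling does not always help.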


Footnotes
1
This setting is in direct contrast to the setting motivating active learning and semi-supervised learning, where unlabeled points are relatively inexpensive, but labeling is expensive.
 
4
The test set has perfect quality with zero noise.
 
5
We do not assume that the quality is the same across all examples. In fact, LU indirectly relies on the assumption that the labeling quality is different across examples.
 
6
As a shorthand we will simply call that Label Uncertainty (LU).
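Footnotes 5 and 6 concern scoring the uncertainty of an example's current label from the multiset of labels observed for it. As an illustrative sketch (the assumptions here are mine and may differ from the paper's exact formulation): place a uniform Beta prior on the example's unknown probability of being positive, and score uncertainty as the posterior mass on the losing side of 0.5.

```python
from math import comb

def beta_cdf_half(a, b):
    """Regularized incomplete beta I_{0.5}(a, b) for integer a, b >= 1,
    computed via the equivalent binomial tail sum."""
    n = a + b - 1
    return sum(comb(n, j) for j in range(a, n + 1)) * 0.5**n

def label_uncertainty(pos, neg):
    """Sketch of a label-uncertainty score: posterior Beta(pos+1, neg+1)
    on the probability of the positive class, given `pos` positive and
    `neg` negative observed labels; return the posterior probability
    that the majority-voted label is wrong."""
    p_below_half = beta_cdf_half(pos + 1, neg + 1)
    return min(p_below_half, 1 - p_below_half)

# A 1-1 split is maximally uncertain; a 5-0 split is nearly certain.
print(label_uncertainty(1, 1), label_uncertainty(5, 0))
```

Note how the score differs across examples even when all labelers have the same accuracy, which is the property footnote 5 alludes to.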
 
7
We do not use selective labeling strategies for this experiment, as we want to keep the labeling allocation strategy constant, and independent of the two uncertainty scoring strategies. The goal is to see which uncertainty score best separates the correctly from the incorrectly labeled examples.
 
8
Since the Proposition and proof sketch are mainly to give theoretical motivation to MU, let’s assume that the induction algorithm is no worse than a standard classification tree learner.
 
9
Subsequent to these experiments, we also experimented with other approaches for combining probabilities from multiple sources, following the discussion in Clemen and Winkler (1990). For our experiments, taking the geometric mean was the best performing and most robust approach for combining the uncertainty scores, even after transforming the uncertainty scores into proper probability estimates.
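The geometric-mean combination described in this footnote is straightforward; the sketch below assumes scores already normalized to [0, 1] (the function name and calling convention are illustrative, not from the paper).

```python
from math import prod

def combined_uncertainty(scores):
    """Combine per-example uncertainty scores (e.g., label uncertainty
    and model uncertainty) via their geometric mean, the approach the
    footnote reports as most robust."""
    return prod(scores) ** (1.0 / len(scores))

# A low score from either source pulls the combination down sharply,
# unlike an arithmetic mean:
print(combined_uncertainty([0.25, 1.0]))  # 0.5
print(combined_uncertainty([0.9, 0.01]))
```

This conjunctive behavior, requiring both notions of uncertainty to be high before an example is prioritized, is one reason the geometric mean can be more robust than averaging.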
 
10
From Provost and Danyluk (1995): “No two experts, of the five experts surveyed, agreed upon diagnoses more than 65 % of the time. This might be evidence for the differences that exist between sites, as the experts surveyed had gained their expertise at different locations. If not, however, it raises questions about the correctness of the expert data.”
 
References
Baram Y, El-Yaniv R, Luz K (2004) Online choice of active learning algorithms. J Mach Learn Res 5:255–291
Boutell MR, Luo J, Shen X, Brown CM (2004) Learning multi-label scene classification. Pattern Recognit 37(9):1757–1771
Brodley CE, Friedl MA (1999) Identifying mislabeled training data. J Artif Intell Res 11:131–167
Clemen RT, Winkler RL (1990) Unanimity and compromise among probability forecasters. Manag Sci 36(7):767–779
Cohn DA, Atlas LE, Ladner RE (1994) Improving generalization with active learning. Mach Learn 15(2):201–221
Dawid AP, Skene AM (1979) Maximum likelihood estimation of observer error-rates using the EM algorithm. Appl Stat 28(1):20–28
Domingos P (1999) MetaCost: a general method for making classifiers cost-sensitive. In: Proceedings of the fifth ACM SIGKDD international conference on knowledge discovery and data mining (KDD-99). pp 155–164
Donmez P, Carbonell JG (2008) Proactive learning: cost-sensitive active learning with multiple imperfect oracles. In: Proceedings of the 17th ACM conference on information and knowledge management (CIKM 2008). pp 619–628
Donmez P, Carbonell JG, Schneider J (2009) Efficiently learning the accuracy of labeling sources for selective sampling. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining (KDD 2009). pp 259–268
Donmez P, Carbonell JG, Schneider J (2010) A probabilistic framework to learn from multiple annotators with time-varying accuracy. In: Proceedings of the SIAM international conference on data mining (SDM 2010). pp 826–837
Elkan C (2001) The foundations of cost-sensitive learning. In: Proceedings of the seventeenth international joint conference on artificial intelligence (IJCAI-01). pp 973–978
Gelman A, Carlin JB, Stern HS, Rubin DB (2003) Bayesian data analysis, 2nd edn. Chapman and Hall/CRC, Boca Raton
Ipeirotis PG, Provost F, Wang J (2010) Quality management on Amazon Mechanical Turk. In: Proceedings of the ACM SIGKDD workshop on human computation (HCOMP 2010). pp 64–67
Jin R, Ghahramani Z (2002) Learning with multiple labels. In: Advances in neural information processing systems 15 (NIPS 2002). pp 897–904
Kapoor A, Greiner R (2005) Learning and classifying under hard budgets. In: ECML 2005, 16th European conference on machine learning. pp 170–181
Lizotte DJ, Madani O, Greiner R (2003) Budgeted learning of naive-Bayes classifiers. In: 19th conference on uncertainty in artificial intelligence (UAI 2003). pp 378–385
Margineantu DD (2005) Active cost-sensitive learning. In: Proceedings of the nineteenth international joint conference on artificial intelligence (IJCAI-05). pp 1622–1613
Mason W, Watts DJ (2009) Financial incentives and the performance of crowds. In: Proceedings of the human computation workshop (HCOMP 2009). pp 77–85
McCallum A (1999) Multi-label text classification with a mixture model trained by EM. In: AAAI'99 workshop on text learning
Melville P, Saar-Tsechansky M, Provost FJ, Mooney RJ (2004) Active feature-value acquisition for classifier induction. In: Proceedings of the 4th IEEE international conference on data mining (ICDM 2004). pp 483–486
Melville P, Provost FJ, Mooney RJ (2005) An expected utility approach to active feature-value acquisition. In: Proceedings of the 5th IEEE international conference on data mining (ICDM 2005). pp 745–748
Morrison CT, Cohen PR (2005) Noisy information value in utility-based decision making. In: Proceedings of the 1st international workshop on utility-based data mining (UBDM'05). pp 34–38
Provost F (2005) Toward economic machine learning and utility-based data mining. In: Proceedings of the 1st international workshop on utility-based data mining (UBDM'05). p 1
Provost F, Danyluk AP (1995) Learning from bad data. In: Proceedings of the ML-95 workshop on applying machine learning in practice. pp 27–33
Quinlan JR (1986) Induction of decision trees. Mach Learn 1(1):81–106
Quinlan JR (1992) C4.5: programs for machine learning. Morgan Kaufmann Publishers Inc, San Mateo
Raykar VC, Yu S, Zhao LH, Jerebko A, Florin C, Valadez GH, Bogoni L, Moy L (2009) Supervised learning from multiple experts: whom to trust when everyone lies a bit. In: Proceedings of the 26th annual international conference on machine learning (ICML 2009). pp 889–896
Raykar VC, Yu S, Zhao LH, Valadez GH, Florin C, Bogoni L, Moy L (2010) Learning from crowds. J Mach Learn Res 11(7):1297–1322
Rebbapragada U, Brodley CE (2007) Class noise mitigation through instance weighting. In: 18th European conference on machine learning (ECML'07). pp 708–715
Saar-Tsechansky M, Provost F (2004) Active sampling for class probability estimation and ranking. Mach Learn 54(2):153–178
Saar-Tsechansky M, Melville P, Provost F (2009) Active feature-value acquisition. Manag Sci 55(4):664–684
Sheng VS, Provost F, Ipeirotis P (2008) Get another label? Improving data quality and data mining using multiple, noisy labelers. In: Proceedings of the fourteenth ACM SIGKDD international conference on knowledge discovery and data mining (KDD-2008). pp 614–622
Silverman BW (1980) Some asymptotic properties of the probabilistic teacher. IEEE Trans Inf Theory 26(2):246–249
Smyth P (1995) Learning with probabilistic supervision. In: Petsche T (ed) Computational learning theory and natural learning systems, vol III: selecting good models. MIT Press, Cambridge
Smyth P (1996) Bounds on the mean classification error rate of multiple experts. Pattern Recognit Lett 17(12):1253–1257
Smyth P, Burl MC, Fayyad UM, Perona P (1994a) Knowledge discovery in large image databases: dealing with uncertainties in ground truth. In: Knowledge discovery in databases: papers from the 1994 AAAI workshop (KDD-94). pp 109–120
Smyth P, Fayyad UM, Burl MC, Perona P, Baldi P (1994b) Inferring ground truth from subjective labelling of Venus images. In: Advances in neural information processing systems 7 (NIPS 1994). pp 1085–1092
Snow R, O'Connor B, Jurafsky D, Ng AY (2008) Cheap and fast–but is it good? Evaluating non-expert annotations for natural language tasks. In: Proceedings of the conference on empirical methods in natural language processing (EMNLP'08). pp 254–263
Ting KM (2002) An instance-weighting method to induce cost-sensitive trees. IEEE Trans Knowl Data Eng 14(3):659–665
Turney PD (1995) Cost-sensitive classification: empirical evaluation of a hybrid genetic decision tree induction algorithm. J Artif Intell Res 2:369–409
Turney PD (2000) Types of cost in inductive concept learning. In: Proceedings of the ICML-2000 workshop on cost-sensitive learning. pp 15–21
Verbaeten S, Assche AV (2003) Ensemble methods for noise elimination in classification problems. In: Fourth international workshop on multiple classifier systems. Springer, pp 317–325
von Ahn L, Dabbish L (2004) Labeling images with a computer game. In: Proceedings of the 2004 conference on human factors in computing systems (CHI 2004). pp 319–326
Weiss GM, Provost FJ (2003) Learning when training data are costly: the effect of class distribution on tree induction. J Artif Intell Res 19:315–354
Whitehill J, Ruvolo P, Wu T-f, Bergsma J, Movellan J (2009) Whose vote should count more: optimal integration of labels from labelers of unknown expertise. In: Advances in neural information processing systems 22 (NIPS 2009). pp 2035–2043
Whittle P (1973) Some general points in the theory of optimal experimental design. J R Stat Soc Ser B 35(1):123–130
Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann Publishing, San Francisco
Zadrozny B, Langford J, Abe N (2003) Cost-sensitive learning by cost-proportionate example weighting. In: Proceedings of the 3rd IEEE international conference on data mining (ICDM 2003). pp 435–442
Zheng Z, Padmanabhan B (2006) Selectively acquiring customer information: a new data acquisition problem and an active learning-based solution. Manag Sci 52(5):697–712
Zhu X, Wu X (2005) Cost-constrained data acquisition for intelligent data preparation. IEEE Trans Knowl Data Eng 17(11):1542–1556
Metadata
Title
Repeated labeling using multiple noisy labelers
Authors
Panagiotis G. Ipeirotis
Foster Provost
Victor S. Sheng
Jing Wang
Publication date
01.03.2014
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 2/2014
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-013-0306-1
