ABSTRACT
Worker reliability is a longstanding issue in crowdsourcing, and automatically discovering high-quality workers is an important practical problem. Most previous work focuses on jointly estimating each worker's quality and each task's true answer. In practice, however, worker quality on some tasks is associated with explicit characteristics of the worker, such as education level, major, and age. This raises the question: how can we automatically discover the worker attributes relevant to a given task, and then exploit them to improve data quality? In this paper, we propose a general crowd targeting framework that, for a given task, automatically discovers whether any group of workers defined by their attributes has higher quality on average, and, if such groups exist, targets them for future work on the same task. Our crowd targeting framework is complementary to traditional worker quality estimation approaches. It is also more budget efficient, because it targets potentially good workers before they actually do the task. Experiments on real datasets show that the accuracy of the final prediction can be improved significantly for the same budget (or, in some cases, even less). Our framework applies to many real-world tasks and can be easily integrated into current crowdsourcing platforms.
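The discovery step described above can be illustrated with a minimal sketch: collect a small pilot batch of labeled answers, group workers by an attribute value, and flag groups whose average accuracy clearly exceeds the overall mean. All data, names, and the fixed-margin threshold here are illustrative assumptions; the paper's actual framework would use a proper statistical test (e.g. ANOVA) rather than a margin.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical pilot data: (worker_attribute_value, answer_was_correct)
# gathered from an initial batch of the task. Values are illustrative only.
pilot = [
    ("bachelor", 1), ("bachelor", 1), ("bachelor", 0),
    ("master",   1), ("master",   1), ("master",   1), ("master", 0),
    ("none",     0), ("none",     1), ("none",     0),
]

def discover_target_groups(records, margin=0.1):
    """Return attribute values whose mean accuracy exceeds the overall
    mean by at least `margin` -- a crude stand-in for a significance
    test such as one-way ANOVA."""
    overall = mean(correct for _, correct in records)
    by_group = defaultdict(list)
    for attr, correct in records:
        by_group[attr].append(correct)
    return {group for group, outcomes in by_group.items()
            if mean(outcomes) >= overall + margin}

# Future task assignments would then be routed to these groups.
print(discover_target_groups(pilot))  # → {'master'}
```

In this toy run the overall accuracy is 0.6, so only the "master" group (0.75) clears the 0.1 margin; targeting then simply means preferring workers from the selected groups when assigning future instances of the same task.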
The wisdom of minority: discovering and targeting the right group of workers for crowdsourcing