ABSTRACT
Crowdsourcing has recently become popular among machine learning researchers and social scientists as an effective way to collect large-scale experimental data from distributed workers. A key problem in extracting useful information from these cheap but potentially unreliable answers is identifying reliable workers and unambiguous tasks. For objective tasks with a single correct answer, previous work can estimate worker reliability and task clarity under the single gold standard assumption. For subjective tasks that admit multiple reasonable answers around which workers cluster, a phenomenon called schools of thought, existing models cannot be applied directly. In this work, we present a statistical model that estimates worker reliability and task clarity without resorting to the single gold standard assumption. The model explicitly characterizes the grouping behavior that forms schools of thought through a rank-1 factorization of a worker-task group-size matrix. Instead of performing an intermediate inference step, which can be expensive and unstable, we present an algorithm that analytically computes the sizes of the different groups. We perform extensive empirical studies on real data collected from Amazon Mechanical Turk. Our method discovers the schools of thought, produces reasonable estimates of worker reliability and task clarity, and is robust to hyperparameter changes. Furthermore, the estimated worker reliability can be used to improve gold standard prediction for objective tasks.
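To make the rank-1 idea concrete, below is a minimal sketch (not the paper's estimator) of factorizing a worker-task group-size matrix with a truncated SVD: each entry is assumed to hold the size of the group a worker joined on a task, and the leading left and right singular vectors serve as rough per-worker and per-task scores. The function and variable names (`rank1_factorize`, `worker_score`, `task_score`) are illustrative, not from the paper.

```python
import numpy as np

def rank1_factorize(group_size):
    """Best rank-1 approximation of a worker-by-task group-size matrix.

    group_size[i, j] is assumed to be the size of the group (school of
    thought) that worker i joined on task j. The leading singular vectors
    give per-worker and per-task scores whose outer product is the best
    rank-1 reconstruction of the matrix in the least-squares sense.
    """
    U, s, Vt = np.linalg.svd(group_size, full_matrices=False)
    worker_score = np.abs(U[:, 0]) * np.sqrt(s[0])  # one score per worker
    task_score = np.abs(Vt[0, :]) * np.sqrt(s[0])   # one score per task
    return worker_score, task_score

# Toy usage: 4 workers x 3 tasks; larger entries mean the worker's answer
# agreed with a larger group on that task.
if __name__ == "__main__":
    G = np.array([[5., 4., 6.],
                  [5., 3., 6.],
                  [2., 1., 3.],
                  [4., 4., 5.]])
    workers, tasks = rank1_factorize(G)
    print("worker scores:", np.round(workers, 2))
    print("task scores:  ", np.round(tasks, 2))
```

In this reading, a consistently high worker score suggests a worker who tends to side with large groups, and a high task score suggests a task on which workers largely agree; the paper's actual model couples these quantities statistically rather than via a plain SVD.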