ABSTRACT
Information retrieval systems require human-contributed relevance labels for their training and evaluation. Increasingly, such labels are collected under the anonymous, uncontrolled conditions of crowdsourcing, leading to output of varied quality. While a range of quality assurance and control techniques has been developed to reduce noise during or after task completion, little is known about the workers themselves and about possible relationships between workers' characteristics and the quality of their work. In this paper, we ask what the relatively well- or poorly-performing crowds, working under specific task conditions, actually look like in terms of worker characteristics such as demographics or personality traits. Our findings show that the face of a crowd is in fact indicative of the quality of its work.
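The kind of analysis the abstract alludes to, relating worker characteristics to labeling accuracy, can be pictured with a minimal sketch. This is not the authors' code: it assumes gold relevance labels are available, scores each worker's accuracy against them, and aggregates accuracy by a self-reported attribute (the `age_group` field and all data names here are hypothetical).

```python
from collections import defaultdict

# Hypothetical records: (worker_id, doc_id, label) from a crowdsourcing task,
# plus gold labels and self-reported worker attributes. All names are
# illustrative assumptions, not the paper's actual data schema.
judgments = [
    ("w1", "d1", 1), ("w1", "d2", 0), ("w2", "d1", 0), ("w2", "d2", 0),
]
gold = {"d1": 1, "d2": 0}
worker_profile = {"w1": {"age_group": "25-34"}, "w2": {"age_group": "18-24"}}

def worker_accuracy(judgments, gold):
    """Fraction of a worker's labels that agree with the gold labels."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for worker, doc, label in judgments:
        if doc in gold:
            total[worker] += 1
            correct[worker] += int(label == gold[doc])
    return {w: correct[w] / total[w] for w in total}

def accuracy_by_trait(acc, profiles, trait):
    """Mean labeling accuracy per value of a worker characteristic."""
    buckets = defaultdict(list)
    for worker, a in acc.items():
        buckets[profiles[worker][trait]].append(a)
    return {value: sum(v) / len(v) for value, v in buckets.items()}

acc = worker_accuracy(judgments, gold)
print(accuracy_by_trait(acc, worker_profile, "age_group"))
```

On this toy data the script prints a mean accuracy per age bracket; the paper's question is whether such per-group differences are systematic enough that a crowd's composition predicts the quality of its labels.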