ABSTRACT
Crowdsourcing platforms offer unprecedented opportunities for creating evaluation benchmarks, but suffer from varied output quality from crowd workers who possess different levels of competence and aspiration. This raises new challenges for quality control and requires an in-depth understanding of how workers' characteristics relate to the quality of their work.
In this paper, we use behavioral observations (HIT completion time, fraction of useful labels, label accuracy) to define five worker types: Spammer, Sloppy, Incompetent, Competent, Diligent. Using data collected from workers engaged in the crowdsourced evaluation of the INEX 2010 Book Track Prove It task, we relate the worker types to label accuracy and personality trait information along the 'Big Five' personality dimensions.
We expect that these new insights about the types of crowd workers and the quality of their work will inform how to design HITs to attract the best workers to a task and explain why certain HIT designs are more effective than others.
Index Terms
- Worker types and personality traits in crowdsourcing relevance labels