ABSTRACT
In recent years, crowdsourcing has become essential in a wide range of Web applications. One of its biggest challenges is the quality of crowd answers: workers have wide-ranging levels of expertise, and the worker community may contain faulty workers. Although various techniques for quality control have been proposed, a post-processing phase in which crowd answers are validated is still required. Validation is typically conducted by experts, whose availability is limited and whose time is costly. We therefore develop a probabilistic model that identifies the most beneficial validation questions in terms of both improvement of result correctness and detection of faulty workers. Our approach guides the expert's work by collecting input on the most problematic cases, thereby achieving a set of high-quality answers even if the expert does not validate the complete answer set. A comprehensive evaluation on both real-world and synthetic datasets demonstrates that our techniques save up to 50% of expert effort compared to baseline methods when striving for perfect result correctness. In absolute terms, in most cases we achieve close to perfect correctness after expert input has been sought for only 20% of the questions.
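The core idea of guiding an expert to "the most problematic cases" can be illustrated with a minimal sketch. The snippet below is an assumption-laden simplification, not the paper's actual probabilistic model: it aggregates binary worker answers by simple voting and selects, as the next validation question, the one whose vote distribution has maximum Shannon entropy (i.e., where the crowd disagrees most). The function and variable names (`next_validation_question`, `votes`) are hypothetical.

```python
import math

def answer_confidence(votes):
    """votes: dict mapping question id -> list of 0/1 worker answers.
    Returns, per question, the fraction of workers answering 1
    (a crude stand-in for P(answer = 1))."""
    return {q: sum(a) / len(a) for q, a in votes.items()}

def entropy(p):
    """Shannon entropy (in bits) of a Bernoulli(p) answer distribution."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def next_validation_question(votes, validated):
    """Pick the not-yet-validated question whose aggregated answer is
    most uncertain; this is the question where expert input is expected
    to be most beneficial under this simplified model."""
    probs = answer_confidence(votes)
    candidates = [(entropy(p), q) for q, p in probs.items() if q not in validated]
    if not candidates:
        return None
    return max(candidates)[1]

# Toy run: three questions with varying worker agreement.
votes = {
    "q1": [1, 1, 1, 1],  # unanimous      -> entropy 0.0
    "q2": [1, 0, 1, 0],  # evenly split   -> entropy 1.0
    "q3": [1, 1, 1, 0],  # mostly agree   -> entropy ~0.81
}
print(next_validation_question(votes, validated=set()))  # -> q2
```

In the paper's setting the selection criterion additionally accounts for detecting faulty workers, so the expert's answer would also be fed back to update per-worker reliability estimates; the uncertainty-driven loop above only captures the "improve result correctness" half of the objective.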
Index Terms: Minimizing Efforts in Validating Crowd Answers