Prediction of Hourly Earnings and Completion Time on a Crowdsourcing Platform

ABSTRACT
We study the problem of predicting future hourly earnings and task completion time for a crowdsourcing platform user who sees the list of available tasks and wants to select one of them to execute. Namely, for each task shown in the list, one needs an estimate of the performance (i.e., hourly earnings and completion time) the user will achieve if she selects this task. We address this problem on real crowd tasks completed on one of the global crowdsourcing marketplaces by (1) conducting a survey and an A/B test on real users, whose results confirm the dominance of monetary incentives and the importance to users of knowing their hourly earnings; (2) performing an in-depth analysis of user behavior, which shows that the prediction problem is challenging: (a) users and projects are highly heterogeneous, and (b) there exists a so-called "learning effect" when a user starts a new task; and (3) developing a solution to the user performance prediction problem that improves prediction quality by up to 25% for hourly earnings and up to 32% for completion time w.r.t. a naive baseline based solely on the historical performance of users on tasks. In our experimentation, we use data on 18 million real crowdsourcing tasks performed by 161 thousand users on the crowd platform; we publish this dataset. The hourly earnings prediction has been deployed in Yandex.Toloka.
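To make the comparison point concrete, below is a minimal sketch in Python (not the authors' implementation) of a history-only baseline of the kind the abstract describes: it predicts a user's hourly earnings on a project purely from past performance, falling back from pair-level history to user-level history to a global mean when data is sparse. All names here (naive_baseline, user_id, project_id, reward, seconds) are hypothetical illustration choices, not the paper's or the dataset's schema.

```python
from collections import defaultdict
from statistics import mean

def naive_baseline(history):
    """history: iterable of dicts with hypothetical keys
    user_id, project_id, reward (payment per task), seconds (time spent)."""
    per_pair = defaultdict(list)   # hourly rates per (user, project) pair
    per_user = defaultdict(list)   # hourly rates per user, across projects
    all_rates = []                 # hourly rates across the whole platform
    for t in history:
        if t["seconds"] <= 0:
            continue  # skip degenerate records
        rate = t["reward"] / (t["seconds"] / 3600.0)  # hourly earnings on one task
        per_pair[(t["user_id"], t["project_id"])].append(rate)
        per_user[t["user_id"]].append(rate)
        all_rates.append(rate)

    global_mean = mean(all_rates) if all_rates else 0.0

    def predict(user_id, project_id):
        """Fall back from pair-level to user-level to global history."""
        pair = (user_id, project_id)
        if per_pair[pair]:
            return mean(per_pair[pair])
        if per_user[user_id]:
            return mean(per_user[user_id])
        return global_mean

    return predict

# Hypothetical usage: two completed tasks, then a prediction for the same pair.
predict = naive_baseline([
    {"user_id": "u1", "project_id": "p1", "reward": 0.05, "seconds": 30},
    {"user_id": "u1", "project_id": "p1", "reward": 0.04, "seconds": 40},
])
print(predict("u1", "p1"))  # mean of 6.0 and 3.6 USD/hour -> 4.8
```

The same fallback structure would apply to completion-time prediction by replacing the per-task hourly rate with the per-task completion time.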