Prediction of Hourly Earnings and Completion Time on a Crowdsourcing Platform

ABSTRACT
We study the problem of predicting future hourly earnings and task completion time for a crowdsourcing platform user who sees the list of available tasks and wants to select one of them to execute. Namely, for each task shown in the list, one needs an estimate of the performance (i.e., hourly earnings and completion time) the user will achieve if she selects this task. We address this problem on real crowd tasks completed on one of the global crowdsourcing marketplaces by (1) conducting a survey and an A/B test on real users, whose results confirm the dominance of monetary incentives and the importance to users of knowing their hourly earnings; (2) performing an in-depth analysis of user behavior, which shows that the prediction problem is challenging: (a) users and projects are highly heterogeneous, and (b) there exists a so-called "learning effect" when a user starts a new task; and (3) developing a solution to the user performance prediction problem that improves prediction quality by up to 25% for hourly earnings and up to 32% for completion time w.r.t. a naive baseline based solely on the historical performance of users on tasks. In our experimentation, we use data on 18 million real crowdsourcing tasks performed by 161 thousand users on the crowd platform; we publish this dataset. The hourly earnings prediction has been deployed in Yandex.Toloka.
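To make the comparison point concrete, below is a minimal sketch in Python (not the authors' implementation) of a history-only baseline of the kind the abstract describes: it predicts a user's hourly earnings on a project purely from past performance, falling back from pair-level history to user-level history to a global mean when data is sparse. All names here (naive_baseline, user_id, project_id, reward, seconds) are hypothetical illustration choices, not the paper's or the dataset's schema.

```python
from collections import defaultdict
from statistics import mean

def naive_baseline(history):
    """history: iterable of dicts with hypothetical keys
    user_id, project_id, reward (payment per task), seconds (time spent)."""
    per_pair = defaultdict(list)   # hourly rates per (user, project) pair
    per_user = defaultdict(list)   # hourly rates per user, across projects
    all_rates = []                 # hourly rates across the whole platform
    for t in history:
        if t["seconds"] <= 0:
            continue  # skip degenerate records
        rate = t["reward"] / (t["seconds"] / 3600.0)  # hourly earnings on one task
        per_pair[(t["user_id"], t["project_id"])].append(rate)
        per_user[t["user_id"]].append(rate)
        all_rates.append(rate)

    global_mean = mean(all_rates) if all_rates else 0.0

    def predict(user_id, project_id):
        """Fall back from pair-level to user-level to global history."""
        pair = (user_id, project_id)
        if per_pair[pair]:
            return mean(per_pair[pair])
        if per_user[user_id]:
            return mean(per_user[user_id])
        return global_mean

    return predict

# Hypothetical usage: two completed tasks, then a prediction for the same pair.
predict = naive_baseline([
    {"user_id": "u1", "project_id": "p1", "reward": 0.05, "seconds": 30},
    {"user_id": "u1", "project_id": "p1", "reward": 0.04, "seconds": 40},
])
print(predict("u1", "p1"))  # mean of 6.0 and 3.6 USD/hour -> 4.8
```

The same fallback structure would apply to completion-time prediction by replacing the per-task hourly rate with the per-task completion time.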