Skip to main content
Top

2018 | OriginalPaper | Chapter

Incorporating Worker Similarity for Label Aggregation in Crowdsourcing

Authors : Jiyi Li, Yukino Baba, Hisashi Kashima

Published in: Artificial Neural Networks and Machine Learning – ICANN 2018

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

For the quality control in the crowdsourcing tasks, requesters usually assign a task to multiple workers to obtain redundant answers and then aggregate them to obtain the more reliable answer. Because of the existence of the non-experts in the crowds, one of the problems in the label aggregation is how to differ experts with higher ability from non-experts with lower ability and strengthen the influences of these experts. Most of the existing label aggregation approaches tend to strengthen the workers who provide majority answers and regard them with high ability. In addition, we find that the similarity among worker labels is possible to be effective for this issue because two experts are more probable to reach consensus than two non-experts. We thus propose a novel probabilistic model which can incorporate the similarity information of workers. The experimental results on a number of real datasets show that our approach can outperform the existing models including a probabilistic model without incorporating the similarity. We also make an empirical study on the influence of worker ability, label sparsity and redundancy to the performance of label aggregation approaches, and provide a suggestion on the strategy of collecting the labels in crowdsourcing.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Bachrach, Y., Minka, T., Guiver, J., Graepel, T.: How to grade a test without knowing the answers: a bayesian graphical model for adaptive crowdsourcing and aptitude testing. In: Proceedings of the 29th International Conference on International Conference on Machine Learning ICML 2012, pp. 819–826 (2012) Bachrach, Y., Minka, T., Guiver, J., Graepel, T.: How to grade a test without knowing the answers: a bayesian graphical model for adaptive crowdsourcing and aptitude testing. In: Proceedings of the 29th International Conference on International Conference on Machine Learning ICML 2012, pp. 819–826 (2012)
2.
go back to reference Dawid, A.P., Skene, A.M.: Maximum likelihood estimation of observer error-rates using the EM algorithm. J. R. Stat. Soc. Ser. C (Appl. Stat.) 28(1), 20–28 (1979)CrossRef Dawid, A.P., Skene, A.M.: Maximum likelihood estimation of observer error-rates using the EM algorithm. J. R. Stat. Soc. Ser. C (Appl. Stat.) 28(1), 20–28 (1979)CrossRef
3.
go back to reference Karger, D.R., Oh, S., Shah, D.: Iterative learning for reliable crowdsourcing systems. In: Proceedings of the 24th International Conference on Neural Information Processing Systems NIPS 2011, pp. 1953–1961 (2011) Karger, D.R., Oh, S., Shah, D.: Iterative learning for reliable crowdsourcing systems. In: Proceedings of the 24th International Conference on Neural Information Processing Systems NIPS 2011, pp. 1953–1961 (2011)
4.
go back to reference Li, H.W., Zhao, B., Fuxman, A.: The wisdom of minority: discovering and targeting the right group of workers for crowdsourcing. In: Proceedings of the 23rd International Conference on World Wide Web, pp. 165–176 (2014) Li, H.W., Zhao, B., Fuxman, A.: The wisdom of minority: discovering and targeting the right group of workers for crowdsourcing. In: Proceedings of the 23rd International Conference on World Wide Web, pp. 165–176 (2014)
5.
go back to reference Li, J., Baba, Y., Kashima, H.: Hyper questions: unsupervised targeting of a few experts in crowdsourcing. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management CIKM 2017, pp. 1069–1078 (2017) Li, J., Baba, Y., Kashima, H.: Hyper questions: unsupervised targeting of a few experts in crowdsourcing. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management CIKM 2017, pp. 1069–1078 (2017)
6.
go back to reference Liu, Q., Peng, J., Ihler, A.: Variational inference for crowdsourcing. In: Proceedings of the 25th International Conference on Neural Information Processing Systems NIPS 2012, pp. 692–700 (2012) Liu, Q., Peng, J., Ihler, A.: Variational inference for crowdsourcing. In: Proceedings of the 25th International Conference on Neural Information Processing Systems NIPS 2012, pp. 692–700 (2012)
7.
go back to reference Ma, F.L., et al.: Faitcrowd: fine grained truth discovery for crowdsourced data aggregation. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining KDD 2015, pp. 745–754 (2015) Ma, F.L., et al.: Faitcrowd: fine grained truth discovery for crowdsourced data aggregation. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining KDD 2015, pp. 745–754 (2015)
8.
go back to reference Mozafari, B., Sarkar, P., Franklin, M.J., Jordan, M.I., Madden, S.: Active learning for crowd-sourced databases. CoRR abs/1209.3686 (2012) Mozafari, B., Sarkar, P., Franklin, M.J., Jordan, M.I., Madden, S.: Active learning for crowd-sourced databases. CoRR abs/1209.3686 (2012)
9.
go back to reference Pang, B., Lee, L.: A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics ACL 2004 (2004) Pang, B., Lee, L.: A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics ACL 2004 (2004)
10.
go back to reference Snow, R., O’Connor, B., Jurafsky, D., Ng, A.Y.: Cheap and fast–but is it good?: evaluating non-expert annotations for natural language tasks. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing EMNLP 2008, pp. 254–263 (2008) Snow, R., O’Connor, B., Jurafsky, D., Ng, A.Y.: Cheap and fast–but is it good?: evaluating non-expert annotations for natural language tasks. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing EMNLP 2008, pp. 254–263 (2008)
11.
go back to reference Venanzi, M., Teacy, W., Rogers, A., Jennings, N.R.: Weather sentiment-amazon mechanical turk dataset (2015) Venanzi, M., Teacy, W., Rogers, A., Jennings, N.R.: Weather sentiment-amazon mechanical turk dataset (2015)
12.
go back to reference Welinder, P., Branson, S., Belongie, S., Perona, P.: The multidimensional wisdom of crowds. In: Proceedings of the 23rd International Conference on Neural Information Processing Systems NIPS 2010, pp. 2424–2432 (2010) Welinder, P., Branson, S., Belongie, S., Perona, P.: The multidimensional wisdom of crowds. In: Proceedings of the 23rd International Conference on Neural Information Processing Systems NIPS 2010, pp. 2424–2432 (2010)
13.
go back to reference Whitehill, J., Ruvolo, P., Wu, T., Bergsma, J., Movellan, J.: Whose vote should count more: optimal integration of labels from labelers of unknown expertise. In: Proceedings of the 22nd International Conference on Neural Information Processing Systems NIPS 2009, pp. 2035–2043 (2009) Whitehill, J., Ruvolo, P., Wu, T., Bergsma, J., Movellan, J.: Whose vote should count more: optimal integration of labels from labelers of unknown expertise. In: Proceedings of the 22nd International Conference on Neural Information Processing Systems NIPS 2009, pp. 2035–2043 (2009)
14.
go back to reference Zhou, D.Y., Platt, J.C., Basu, S., Mao, Y.: Learning from the wisdom of crowds by minimax entropy. In: Proceedings of the 25th International Conference on Neural Information Processing Systems NIPS 2012, pp. 2195–2203 (2012) Zhou, D.Y., Platt, J.C., Basu, S., Mao, Y.: Learning from the wisdom of crowds by minimax entropy. In: Proceedings of the 25th International Conference on Neural Information Processing Systems NIPS 2012, pp. 2195–2203 (2012)
Metadata
Title
Incorporating Worker Similarity for Label Aggregation in Crowdsourcing
Authors
Jiyi Li
Yukino Baba
Hisashi Kashima
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-01421-6_57

Premium Partner