Skip to main content

2018 | OriginalPaper | Buchkapitel

Incorporating Worker Similarity for Label Aggregation in Crowdsourcing

verfasst von : Jiyi Li, Yukino Baba, Hisashi Kashima

Erschienen in: Artificial Neural Networks and Machine Learning – ICANN 2018

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

For the quality control in the crowdsourcing tasks, requesters usually assign a task to multiple workers to obtain redundant answers and then aggregate them to obtain the more reliable answer. Because of the existence of the non-experts in the crowds, one of the problems in the label aggregation is how to differ experts with higher ability from non-experts with lower ability and strengthen the influences of these experts. Most of the existing label aggregation approaches tend to strengthen the workers who provide majority answers and regard them with high ability. In addition, we find that the similarity among worker labels is possible to be effective for this issue because two experts are more probable to reach consensus than two non-experts. We thus propose a novel probabilistic model which can incorporate the similarity information of workers. The experimental results on a number of real datasets show that our approach can outperform the existing models including a probabilistic model without incorporating the similarity. We also make an empirical study on the influence of worker ability, label sparsity and redundancy to the performance of label aggregation approaches, and provide a suggestion on the strategy of collecting the labels in crowdsourcing.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Bachrach, Y., Minka, T., Guiver, J., Graepel, T.: How to grade a test without knowing the answers: a bayesian graphical model for adaptive crowdsourcing and aptitude testing. In: Proceedings of the 29th International Conference on International Conference on Machine Learning ICML 2012, pp. 819–826 (2012) Bachrach, Y., Minka, T., Guiver, J., Graepel, T.: How to grade a test without knowing the answers: a bayesian graphical model for adaptive crowdsourcing and aptitude testing. In: Proceedings of the 29th International Conference on International Conference on Machine Learning ICML 2012, pp. 819–826 (2012)
2.
Zurück zum Zitat Dawid, A.P., Skene, A.M.: Maximum likelihood estimation of observer error-rates using the EM algorithm. J. R. Stat. Soc. Ser. C (Appl. Stat.) 28(1), 20–28 (1979)CrossRef Dawid, A.P., Skene, A.M.: Maximum likelihood estimation of observer error-rates using the EM algorithm. J. R. Stat. Soc. Ser. C (Appl. Stat.) 28(1), 20–28 (1979)CrossRef
3.
Zurück zum Zitat Karger, D.R., Oh, S., Shah, D.: Iterative learning for reliable crowdsourcing systems. In: Proceedings of the 24th International Conference on Neural Information Processing Systems NIPS 2011, pp. 1953–1961 (2011) Karger, D.R., Oh, S., Shah, D.: Iterative learning for reliable crowdsourcing systems. In: Proceedings of the 24th International Conference on Neural Information Processing Systems NIPS 2011, pp. 1953–1961 (2011)
4.
Zurück zum Zitat Li, H.W., Zhao, B., Fuxman, A.: The wisdom of minority: discovering and targeting the right group of workers for crowdsourcing. In: Proceedings of the 23rd International Conference on World Wide Web, pp. 165–176 (2014) Li, H.W., Zhao, B., Fuxman, A.: The wisdom of minority: discovering and targeting the right group of workers for crowdsourcing. In: Proceedings of the 23rd International Conference on World Wide Web, pp. 165–176 (2014)
5.
Zurück zum Zitat Li, J., Baba, Y., Kashima, H.: Hyper questions: unsupervised targeting of a few experts in crowdsourcing. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management CIKM 2017, pp. 1069–1078 (2017) Li, J., Baba, Y., Kashima, H.: Hyper questions: unsupervised targeting of a few experts in crowdsourcing. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management CIKM 2017, pp. 1069–1078 (2017)
6.
Zurück zum Zitat Liu, Q., Peng, J., Ihler, A.: Variational inference for crowdsourcing. In: Proceedings of the 25th International Conference on Neural Information Processing Systems NIPS 2012, pp. 692–700 (2012) Liu, Q., Peng, J., Ihler, A.: Variational inference for crowdsourcing. In: Proceedings of the 25th International Conference on Neural Information Processing Systems NIPS 2012, pp. 692–700 (2012)
7.
Zurück zum Zitat Ma, F.L., et al.: Faitcrowd: fine grained truth discovery for crowdsourced data aggregation. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining KDD 2015, pp. 745–754 (2015) Ma, F.L., et al.: Faitcrowd: fine grained truth discovery for crowdsourced data aggregation. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining KDD 2015, pp. 745–754 (2015)
8.
Zurück zum Zitat Mozafari, B., Sarkar, P., Franklin, M.J., Jordan, M.I., Madden, S.: Active learning for crowd-sourced databases. CoRR abs/1209.3686 (2012) Mozafari, B., Sarkar, P., Franklin, M.J., Jordan, M.I., Madden, S.: Active learning for crowd-sourced databases. CoRR abs/1209.3686 (2012)
9.
Zurück zum Zitat Pang, B., Lee, L.: A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics ACL 2004 (2004) Pang, B., Lee, L.: A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics ACL 2004 (2004)
10.
Zurück zum Zitat Snow, R., O’Connor, B., Jurafsky, D., Ng, A.Y.: Cheap and fast–but is it good?: evaluating non-expert annotations for natural language tasks. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing EMNLP 2008, pp. 254–263 (2008) Snow, R., O’Connor, B., Jurafsky, D., Ng, A.Y.: Cheap and fast–but is it good?: evaluating non-expert annotations for natural language tasks. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing EMNLP 2008, pp. 254–263 (2008)
11.
Zurück zum Zitat Venanzi, M., Teacy, W., Rogers, A., Jennings, N.R.: Weather sentiment-amazon mechanical turk dataset (2015) Venanzi, M., Teacy, W., Rogers, A., Jennings, N.R.: Weather sentiment-amazon mechanical turk dataset (2015)
12.
Zurück zum Zitat Welinder, P., Branson, S., Belongie, S., Perona, P.: The multidimensional wisdom of crowds. In: Proceedings of the 23rd International Conference on Neural Information Processing Systems NIPS 2010, pp. 2424–2432 (2010) Welinder, P., Branson, S., Belongie, S., Perona, P.: The multidimensional wisdom of crowds. In: Proceedings of the 23rd International Conference on Neural Information Processing Systems NIPS 2010, pp. 2424–2432 (2010)
13.
Zurück zum Zitat Whitehill, J., Ruvolo, P., Wu, T., Bergsma, J., Movellan, J.: Whose vote should count more: optimal integration of labels from labelers of unknown expertise. In: Proceedings of the 22nd International Conference on Neural Information Processing Systems NIPS 2009, pp. 2035–2043 (2009) Whitehill, J., Ruvolo, P., Wu, T., Bergsma, J., Movellan, J.: Whose vote should count more: optimal integration of labels from labelers of unknown expertise. In: Proceedings of the 22nd International Conference on Neural Information Processing Systems NIPS 2009, pp. 2035–2043 (2009)
14.
Zurück zum Zitat Zhou, D.Y., Platt, J.C., Basu, S., Mao, Y.: Learning from the wisdom of crowds by minimax entropy. In: Proceedings of the 25th International Conference on Neural Information Processing Systems NIPS 2012, pp. 2195–2203 (2012) Zhou, D.Y., Platt, J.C., Basu, S., Mao, Y.: Learning from the wisdom of crowds by minimax entropy. In: Proceedings of the 25th International Conference on Neural Information Processing Systems NIPS 2012, pp. 2195–2203 (2012)
Metadaten
Titel
Incorporating Worker Similarity for Label Aggregation in Crowdsourcing
verfasst von
Jiyi Li
Yukino Baba
Hisashi Kashima
Copyright-Jahr
2018
DOI
https://doi.org/10.1007/978-3-030-01421-6_57