DOI: 10.1145/2556195.2556227
research-article

Trust, but verify: predicting contribution quality for knowledge base construction and curation

Published: 24 February 2014

ABSTRACT

The largest publicly available knowledge repositories, such as Wikipedia and Freebase, owe their existence and growth to volunteer contributors around the globe. While the majority of contributions are correct, errors can still creep in, due to editors' carelessness, misunderstanding of the schema, malice, or even lack of accepted ground truth. If left undetected, inaccuracies often degrade the experience of users and the performance of applications that rely on these knowledge repositories. We present a new method, CQUAL, for automatically predicting the quality of contributions submitted to a knowledge base. Significantly expanding upon previous work, our method holistically exploits a variety of signals, including the user's domains of expertise as reflected in her prior contribution history, and the historical accuracy rates of different types of facts. In a large-scale human evaluation, our method exhibits precision of 91% at 80% recall. Our model verifies whether a contribution is correct immediately after it is submitted, significantly alleviating the need for post-submission human reviewing.
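
The headline figure above, precision at a fixed recall, can be made concrete with a small sketch. This is not the authors' evaluation code: the function name `precision_at_recall`, its parameters, and the toy scores are hypothetical, and it assumes only that the model assigns each contribution a confidence score and that human judges have labeled each contribution correct or incorrect.

```python
from typing import List, Tuple


def precision_at_recall(
    scores: List[float], labels: List[bool], target_recall: float
) -> Tuple[float, float]:
    """Return (precision, threshold) at the highest score threshold
    whose recall is at least target_recall.

    scores: model confidence that each contribution is correct (assumed).
    labels: human judgment for each contribution (True = correct).
    """
    total_correct = sum(labels)
    if total_correct == 0:
        raise ValueError("need at least one correct contribution")

    # Rank contributions from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_positives = 0
    for i, (score, is_correct) in enumerate(ranked, start=1):
        true_positives += is_correct
        # A threshold cannot split tied scores, so wait for the end of a tie.
        if i < len(ranked) and ranked[i][0] == score:
            continue
        if true_positives / total_correct >= target_recall:
            return true_positives / i, score

    raise ValueError("target_recall must be at most 1")


# Toy data, invented for illustration only.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [True, True, False, True, False]
precision, threshold = precision_at_recall(toy_scores, toy_labels, 1.0)
print(f"precision={precision:.2f} at threshold={threshold}")
```

Read this way, "precision of 91% at 80% recall" means that when the threshold is set so that 80% of the correct contributions are accepted, 91% of the accepted contributions are correct.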


    • Published in

      WSDM '14: Proceedings of the 7th ACM international conference on Web search and data mining
      February 2014
      712 pages
      ISBN: 9781450323512
      DOI: 10.1145/2556195

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      WSDM '14 Paper Acceptance Rate: 64 of 355 submissions, 18%
      Overall Acceptance Rate: 498 of 2,863 submissions, 17%
