DOI: 10.1145/2983323.2983767
research-article

Empowering Truth Discovery with Multi-Truth Prediction

Published: 24 October 2016

ABSTRACT

Truth discovery is the problem of detecting true values from conflicting data provided by multiple sources about the same data items. Because sources' reliability is unknown a priori, a truth discovery method usually estimates it as part of the truth discovery process. A major limitation of existing truth discovery methods is that they commonly assume exactly one true value for each data item and therefore cannot handle the more general case in which a data item has multiple true values (multi-truth). Because the number of true values may vary from one data item to another, truth discovery methods must be able to detect varying numbers of true values from multi-source data. In this paper, we propose a multi-truth discovery approach that addresses these challenges by providing a generic framework for enhancing existing truth discovery methods. In particular, we treat the number of true values as an important clue for facilitating multi-truth discovery. We present the procedure and components of our approach and propose three models to implement it: the byproduct model, the joint model, and the synthesis model. We further propose two extensions that improve truth discovery accuracy by leveraging the implications of similar numerical values and the co-occurrence of values in sources' claims. Experimental studies on real-world datasets demonstrate the effectiveness of our approach.
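
To make the setting concrete, the following is a minimal sketch of iterative truth discovery in which the predicted number of true values per data item decides how many values are accepted. It is not the byproduct, joint, or synthesis model from the paper, whose details the abstract does not give. The function name `discover_multi_truth`, the trust-weighted average used to predict the number of true values, and the smoothed-precision trust update are all assumptions chosen for illustration.

```python
from collections import defaultdict


def discover_multi_truth(claims, rounds=10, initial_trust=0.8):
    """Toy multi-truth discovery (illustrative only, not the paper's models).

    claims maps source -> {item -> set of claimed values}.
    Returns (truths, trust), where truths maps item -> set of accepted values.
    """
    trust = {source: initial_trust for source in claims}
    truths = {}
    for _ in range(rounds):
        votes = defaultdict(lambda: defaultdict(float))  # item -> value -> weighted votes
        count_sum = defaultdict(float)   # item -> trust-weighted sum of claim sizes
        weight_sum = defaultdict(float)  # item -> total trust of sources covering it
        for source, items in claims.items():
            for item, values in items.items():
                for value in values:
                    votes[item][value] += trust[source]
                count_sum[item] += trust[source] * len(values)
                weight_sum[item] += trust[source]

        truths = {}
        for item, scored in votes.items():
            # Assumed predictor: trust-weighted average number of values
            # claimed per source, rounded, and never below one.
            k = max(1, round(count_sum[item] / weight_sum[item]))
            # Rank by descending score; break ties by value for determinism.
            ranked = sorted(scored, key=lambda v: (-scored[v], str(v)))
            truths[item] = set(ranked[:k])

        # Assumed update: trust is the smoothed precision of a source's claims.
        for source, items in claims.items():
            claimed = sum(len(values) for values in items.values())
            correct = sum(
                len(values & truths.get(item, set()))
                for item, values in items.items()
            )
            trust[source] = (correct + 1) / (claimed + 2)
    return truths, trust


# Hypothetical data: "book" has two true authors, "city" has one true value.
claims = {
    "s1": {"book": {"Alice", "Bob"}, "city": {"Paris"}},
    "s2": {"book": {"Alice", "Bob"}, "city": {"Paris"}},
    "s3": {"book": {"Alice", "Bob"}, "city": {"Paris"}},
    "s4": {"book": {"Carol"}, "city": {"Lyon"}},
}
truths, trust = discover_multi_truth(claims)
print(truths)
print(trust)
```

On this hypothetical input the sketch accepts two values for `book` and one for `city`, and the dissenting source `s4` ends with lower trust than the others. A single-truth method would have to drop one of the two authors, which is the limitation the abstract describes.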

        • Published in

          CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
          October 2016
          2566 pages
          ISBN: 9781450340731
          DOI: 10.1145/2983323

          Copyright © 2016 ACM

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          CIKM '16 Paper Acceptance Rate: 160 of 701 submissions, 23%
          Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
