ABSTRACT
Existing fact-finding models assume the availability of structured data or accurate information extraction. However, as online data becomes increasingly unstructured, these assumptions no longer hold. To overcome this limitation, we propose a novel content-based trust propagation framework that relies on signals from textual content to ascertain the veracity of free-text claims and to compute the trustworthiness of their sources. We incorporate the quality of relevant content into the framework and present an iterative algorithm for propagating trust scores. We show that existing fact finders on structured data can be modeled as specific instances of this framework. Using a retrieval-based approach to find relevant articles, we instantiate the framework to compute the trustworthiness of news sources and articles. We show that the proposed framework helps ascertain the trustworthiness of sources more accurately. We also show that ranking news articles by trustworthiness learned from the content-driven framework is significantly better than baselines that ignore either content quality or the trust framework.
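To make the idea of iterative trust propagation concrete, the following is a minimal sketch, not the formulation from the paper. It assumes a three-layer graph of sources, evidence (articles), and claims, where each evidence item supports claims with a content-relevance weight. The function name `propagate_trust`, the `mix` parameter, the averaging update rules, and the use of a few claims with known veracity as anchors are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def propagate_trust(evidence_source, support, known_claims, iterations=50, mix=0.5):
    """Illustrative trust propagation over sources, evidence, and claims.

    evidence_source: dict evidence_id -> source_id
    support: dict (evidence_id, claim_id) -> relevance weight > 0
    known_claims: dict claim_id -> fixed veracity in [0, 1] (assumed anchors)
    mix: assumed share of an article's trust inherited from its source
    """
    # Start every score at a neutral 0.5; anchored claims keep their given value.
    source_trust = {s: 0.5 for s in set(evidence_source.values())}
    evidence_trust = {e: 0.5 for e in evidence_source}
    claim_veracity = {c: 0.5 for (_, c) in support}
    claim_veracity.update(known_claims)

    # Index the weighted support links in both directions.
    by_claim, by_evidence, by_source = defaultdict(list), defaultdict(list), defaultdict(list)
    for (e, c), w in support.items():
        by_claim[c].append((e, w))
        by_evidence[e].append((c, w))
    for e, s in evidence_source.items():
        by_source[s].append(e)

    for _ in range(iterations):
        # Claims: relevance-weighted average of the trust in supporting evidence.
        for c, links in by_claim.items():
            if c in known_claims:
                continue  # anchored claims are never overwritten
            total = sum(w for _, w in links)
            claim_veracity[c] = sum(evidence_trust[e] * w for e, w in links) / total

        # Evidence: blend of source trust and the veracity of supported claims.
        for e, s in evidence_source.items():
            links = by_evidence.get(e, [])
            if links:
                total = sum(w for _, w in links)
                from_claims = sum(claim_veracity[c] * w for c, w in links) / total
            else:
                from_claims = evidence_trust[e]
            evidence_trust[e] = mix * source_trust[s] + (1 - mix) * from_claims

        # Sources: plain average of the trust in their evidence.
        for s, items in by_source.items():
            source_trust[s] = sum(evidence_trust[e] for e in items) / len(items)

    return source_trust, evidence_trust, claim_veracity


# Toy data: source A backs a claim known to be true, source B one known to be false.
evidence_source = {"e1": "A", "e2": "B"}
support = {("e1", "c1"): 1.0, ("e1", "c3"): 1.0, ("e2", "c2"): 1.0, ("e2", "c4"): 1.0}
known_claims = {"c1": 1.0, "c2": 0.0}
source_trust, evidence_trust, claim_veracity = propagate_trust(
    evidence_source, support, known_claims
)
print(source_trust, claim_veracity)
```

In this toy setup, trust flows from the anchored claims through the articles to the sources and back out to the unlabeled claims, so source A and claim c3 end up scored above source B and claim c4. Anchoring some claims is one simple way to keep a pure averaging scheme from collapsing to a uniform score; other choices, such as source priors, would work in a similar sketch.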
Index Terms
- Content-driven trust propagation framework