ABSTRACT
In many real-world applications, the same object or event is described by multiple sources. Conflicts among these sources are inevitable, and they cause confusion because more than one value or outcome is reported for each object. A significant problem is to resolve these conflicts and identify the information that is trustworthy. The process of finding the truth among the conflicting values that multiple sources provide for an object is called truth discovery, or fact-finding. Its goal is to identify both trustworthy information and reliable sources. Truth discovery rests on an intuitive principle: a source that provides trustworthy information is considered more reliable, and information that comes from a reliable source is considered more trustworthy. However, previously proposed truth discovery methods either do not estimate source reliability at all (for example, the voting method) or, when they do, do not model the multiple properties of an object separately. This motivates new techniques for truth discovery in data with multiple properties. We present a method based on an optimization framework that minimizes the overall weighted deviation between the truths and the multi-source observations. Different types of distance functions can be plugged into this framework to capture the characteristics of different data types. We run extensive experiments on weather datasets collected from four different platforms, and the results verify both the efficiency and the precision of our method for truth discovery.
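The optimization idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the update rules, so everything here is an assumption: the sketch alternates between re-estimating truths and re-estimating source weights, uses a negative-log weight update, and plugs in a squared distance for continuous properties and a 0-1 distance for categorical ones. All names (`discover_truths`, `DATA_TYPES`, the source labels) and all data values are hypothetical, and the sketch omits any rescaling of distances across properties, which a real implementation would likely need.

```python
import math
from collections import defaultdict

EPS = 1e-9  # keeps every loss strictly positive so the log below is defined


def squared_distance(truth, observed):
    return (truth - observed) ** 2


def zero_one_distance(truth, observed):
    return 0.0 if truth == observed else 1.0


def weighted_mean(values, weights):
    # Minimizer of the weighted squared distance.
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_vote(values, weights):
    # Minimizer of the weighted 0-1 distance; with equal weights this is plain voting.
    score = defaultdict(float)
    for v, w in zip(values, weights):
        score[v] += w
    return max(score, key=score.get)


# Pluggable pairs: (distance function, truth update that minimizes it).
DATA_TYPES = {
    "continuous": (squared_distance, weighted_mean),
    "categorical": (zero_one_distance, weighted_vote),
}


def discover_truths(observations, kinds, n_iter=20):
    """observations[source][(obj, prop)] = value; kinds[prop] names a DATA_TYPES key.

    Assumes at least two sources, so that every weight stays positive.
    """
    sources = sorted(observations)
    entries = sorted({entry for s in sources for entry in observations[s]})
    weights = {s: 1.0 for s in sources}
    truths = {}
    for _ in range(n_iter):
        # Step 1: fix the weights, update each truth separately per property.
        for entry in entries:
            _, aggregate = DATA_TYPES[kinds[entry[1]]]
            reporting = [s for s in sources if entry in observations[s]]
            truths[entry] = aggregate(
                [observations[s][entry] for s in reporting],
                [weights[s] for s in reporting],
            )
        # Step 2: fix the truths, give low-deviation sources a high weight.
        losses = {}
        for s in sources:
            loss = EPS
            for entry, value in observations[s].items():
                distance, _ = DATA_TYPES[kinds[entry[1]]]
                loss += distance(truths[entry], value)
            losses[s] = loss
        total = sum(losses.values())
        weights = {s: -math.log(losses[s] / total) for s in sources}
    return truths, weights


# Made-up weather reports from four hypothetical sources; D disagrees with the rest.
kinds = {"temp": "continuous", "condition": "categorical"}
observations = {
    "A": {("nyc", "temp"): 70, ("nyc", "condition"): "sunny",
          ("la", "temp"): 80, ("la", "condition"): "sunny"},
    "B": {("nyc", "temp"): 71, ("nyc", "condition"): "sunny",
          ("la", "temp"): 81, ("la", "condition"): "sunny"},
    "C": {("nyc", "temp"): 70, ("nyc", "condition"): "sunny",
          ("la", "temp"): 80, ("la", "condition"): "cloudy"},
    "D": {("nyc", "temp"): 90, ("nyc", "condition"): "rain",
          ("la", "temp"): 60, ("la", "condition"): "rain"},
}
truths, weights = discover_truths(observations, kinds)
print(truths)
print(weights)
```

In this toy run, the source that deviates most from the consensus ends up with the smallest weight, so its outlying temperatures pull the estimated truths far less than they would under a plain average. Handling each property with its own distance function is what lets continuous and categorical values share one set of source weights.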
Index Terms
- Better Weather Forecasting through truth discovery Analysis