Toward computational fact-checking

Authors:
You Wu, Duke University
Pankaj K. Agarwal, Duke University
Chengkai Li, University of Texas at Arlington
Jun Yang, Duke University
Cong Yu, Google Research

Proceedings of the VLDB Endowment, Volume 7, Issue 7, pp. 589–600
https://doi.org/10.14778/2732286.2732295

Published: 01 March 2014

Abstract

Our news is saturated with claims of "facts" made from data. Database research has in the past focused on how to answer queries, but has not devoted much attention to discerning more subtle qualities of the resulting claims, e.g., is a claim "cherry-picking"? This paper proposes a framework that models claims based on structured data as parameterized queries. A key insight is that we can learn a lot about a claim by perturbing its parameters and seeing how its conclusion changes. This framework lets us formulate practical fact-checking tasks, namely reverse-engineering vague (often intentionally so) claims and countering questionable claims, as computational problems. Along with the modeling framework, we develop an algorithmic framework that enables efficient instantiations of "meta" algorithms by supplying appropriate algorithmic building blocks. We present real-world examples and experiments that demonstrate the power of our model, the efficiency of our algorithms, and the usefulness of their results.
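
The perturbation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it simply enumerates nearby parameter settings by brute force, whereas the paper targets efficient "meta" algorithms. The yearly counts are invented for illustration, and the names `window_sum`, `claim_query`, and `perturb` are hypothetical. The claim modeled here is a window comparison: "the total over the `width` years ending at `end` is some ratio of the total over the preceding `width` years."

```python
from itertools import product

# Hypothetical yearly counts, invented purely for illustration.
counts = {2001: 10, 2002: 12, 2003: 9, 2004: 11,
          2005: 20, 2006: 8, 2007: 10, 2008: 9}

def window_sum(series, end, width):
    """Sum of the `width` yearly values ending at year `end` (inclusive)."""
    return sum(series[y] for y in range(end - width + 1, end + 1))

def claim_query(series, end, width):
    """Parameterized claim: ratio of the window ending at `end`
    to the immediately preceding window of the same width."""
    later = window_sum(series, end, width)
    earlier = window_sum(series, end - width, width)
    return later / earlier

def perturb(series, end, width, max_shift=1):
    """Evaluate the claim at nearby (end, width) settings,
    skipping settings whose windows fall outside the data."""
    first, last = min(series), max(series)
    results = {}
    for d_end, d_width in product(range(-max_shift, max_shift + 1), repeat=2):
        e, w = end + d_end, width + d_width
        if w < 1 or e > last or e - 2 * w + 1 < first:
            continue
        results[(e, w)] = claim_query(series, e, w)
    return results

# Original claim: 2005-2006 compared with 2003-2004.
original = claim_query(counts, 2006, 2)
nearby = perturb(counts, 2006, 2)

# Perturbations under which the conclusion flips (ratio drops below 1).
reversed_claims = sorted(p for p, r in nearby.items() if r < 1)
# Perturbations that would have made an even stronger claim.
stronger_claims = sorted(p for p, r in nearby.items() if r > original)
```

In this toy data the original claim reports an increase, yet shifting the window by one year or shrinking it to a single year reverses the conclusion. That sensitivity to small parameter changes is the kind of signal the abstract associates with cherry-picking.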

Published in
Proceedings of the VLDB Endowment, Volume 7, Issue 7
March 2014, 108 pages
ISSN: 2150-8097
Editors: H. V. Jagadish (University of Michigan), Aoying Zhou (East China Normal University, China)
Publisher: VLDB Endowment