Debiasing Vandalism Detection Models at Wikidata

Authors:
Stefan Heindorf (Paderborn University, Germany)
Yan Scholten (Paderborn University, Germany)
Gregor Engels (University of Paderborn, Germany)
Martin Potthast (Leipzig University, Germany)

WWW '19: The World Wide Web Conference, May 2019, Pages 670–680. https://doi.org/10.1145/3308558.3313507

Published: 13 May 2019
ABSTRACT

Crowdsourced knowledge bases like Wikidata suffer from low-quality edits and vandalism, and they employ machine learning-based approaches to detect both kinds of damage. We reveal that state-of-the-art detection approaches discriminate against anonymous and new users: benign edits from these users receive much higher vandalism scores than benign edits from long-standing users, causing newcomers to abandon the project prematurely. We address this problem for the first time by analyzing and measuring the sources of bias, and by developing a new vandalism detection model that avoids them. Our model FAIR-S reduces the bias ratio of the state-of-the-art vandalism detector WDVD from 310.7 to only 11.9 while maintaining high predictive performance at 0.963 ROC-AUC and 0.316 PR-AUC.
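
To make the kind of comparison described above concrete, the following is a minimal sketch, not the paper's definition of the bias ratio. It assumes, purely for illustration, that bias is summarized as the mean vandalism score of benign edits from a protected group (here, anonymous users) divided by the mean score of benign edits from everyone else. The function name `score_ratio` and the toy data are hypothetical; only the figures 310.7 and 11.9 come from the abstract.

```python
from statistics import mean


def score_ratio(scores, is_benign, is_protected):
    """Mean score of benign edits from the protected group divided by the
    mean score of benign edits from all other users.

    This is an illustrative assumption, not the measure used in the paper.
    """
    protected = [s for s, b, p in zip(scores, is_benign, is_protected) if b and p]
    others = [s for s, b, p in zip(scores, is_benign, is_protected) if b and not p]
    if not protected or not others:
        raise ValueError("need at least one benign edit in each group")
    return mean(protected) / mean(others)


# Hypothetical toy data: five edits with classifier scores in [0, 1].
scores = [0.20, 0.30, 0.01, 0.03, 0.90]
is_benign = [True, True, True, True, False]     # the last edit is vandalism
is_anonymous = [True, True, False, False, True]

# Vandalism edits are ignored; only benign edits enter the ratio.
ratio = score_ratio(scores, is_benign, is_anonymous)

# Factor by which the reported bias ratio shrinks (figures from the abstract).
reduction = 310.7 / 11.9
print(round(reduction, 1))
```

Under this assumed measure, a value near 1 would mean that benign edits from both groups receive similar scores. The reported drop from 310.7 to 11.9 corresponds to a reduction by a factor of roughly 26.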

Published in
WWW '19: The World Wide Web Conference, May 2019, 3620 pages
ISBN: 9781450366748
DOI: 10.1145/3308558
Editors: Ling Liu (Georgia Tech, USA), Ryen White (Microsoft Research, USA)
Copyright © 2019 ACM
Publisher: Association for Computing Machinery, New York, NY, United States