ABSTRACT
Crowdsourced knowledge bases like Wikidata suffer from low-quality edits and vandalism, and they employ machine learning-based approaches to detect both kinds of damage. We reveal that state-of-the-art detection approaches discriminate against anonymous and new users: benign edits from these users receive much higher vandalism scores than benign edits from older users, causing newcomers to abandon the project prematurely. We address this problem for the first time by analyzing and measuring the sources of bias, and by developing a new vandalism detection model that avoids them. Our model FAIR-S reduces the bias ratio of the state-of-the-art vandalism detector WDVD from 310.7 to only 11.9 while maintaining high predictive performance at 0.963 ROC-AUC and 0.316 PR-AUC.
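To make the idea of a score gap between user groups concrete, the following is a minimal sketch of one plausible way to quantify it: the ratio of the mean vandalism score of benign edits by newcomers to that of benign edits by established users. This is an assumption for illustration only. The abstract does not define the bias ratio, and the function name `benign_score_ratio`, the tuple layout, and the toy data below are all hypothetical.

```python
from statistics import mean


def benign_score_ratio(edits):
    """Ratio of mean vandalism scores on benign edits: newcomers vs. established.

    `edits` is an iterable of (score, is_vandalism, is_newcomer) tuples.
    NOTE: this is an illustrative measure, not necessarily the bias ratio
    used in the paper.
    """
    # Only benign edits matter here: the question is how harshly
    # good-faith contributions are scored in each group.
    newcomer = [s for s, vandal, new in edits if not vandal and new]
    established = [s for s, vandal, new in edits if not vandal and not new]
    if not newcomer or not established:
        raise ValueError("need at least one benign edit in each group")

    baseline = mean(established)
    if baseline == 0:
        raise ValueError("mean score of established users is zero")
    return mean(newcomer) / baseline


# Hypothetical toy data: (score, is_vandalism, is_newcomer)
edits = [
    (0.40, False, True),
    (0.20, False, True),
    (0.02, False, False),
    (0.04, False, False),
    (0.90, True, True),   # vandalism, ignored by the measure
]
ratio = benign_score_ratio(edits)
print(f"benign score ratio: {ratio:.1f}")
```

A value of 1 would mean that benign edits from both groups receive the same average score; larger values mean that benign newcomer edits are scored as more suspicious.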