2013 | OriginalPaper | Book chapter
Approaches of Anonymisation of an SMS Corpus
Authors: Namrata Patel, Pierre Accorsi, Diana Inkpen, Cédric Lopez, Mathieu Roche
Published in: Computational Linguistics and Intelligent Text Processing
Publisher: Springer Berlin Heidelberg
This paper presents two anonymisation methods for processing an SMS corpus. The first is based on an unsupervised approach called Seek&Hide. The implemented system uses several dictionaries and rules to predict whether an SMS needs to be anonymised. The second method is based on a supervised approach using machine learning techniques. We evaluate the two approaches and propose a way to use them together: an SMS is checked by a human expert only when the two methods disagree on their prediction. This greatly reduces the cost of anonymising the corpus.
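
The combination scheme described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `route_messages`, the name list `FIRST_NAMES`, and the two toy predictors are hypothetical stand-ins for Seek&Hide and the supervised classifier, whose internals the abstract does not specify.

```python
from typing import Callable, List, Tuple

# A predictor returns True when it believes the SMS needs anonymisation.
Predictor = Callable[[str], bool]


def route_messages(
    messages: List[str],
    rule_based: Predictor,
    learned: Predictor,
) -> Tuple[List[Tuple[str, bool]], List[str]]:
    """Split messages into automatically decided ones and ones needing review.

    A message is decided automatically when both predictors agree;
    it is sent to a human expert only when they disagree.
    """
    auto_decided: List[Tuple[str, bool]] = []
    needs_review: List[str] = []
    for sms in messages:
        a = rule_based(sms)
        b = learned(sms)
        if a == b:
            auto_decided.append((sms, a))
        else:
            needs_review.append(sms)
    return auto_decided, needs_review


# Hypothetical toy dictionary (assumption, for illustration only).
FIRST_NAMES = {"pierre", "diana"}


def toy_dictionary_predictor(sms: str) -> bool:
    """Stand-in for a dictionary/rule system: flags known first names."""
    return any(tok.strip(".,!?").lower() in FIRST_NAMES for tok in sms.split())


def toy_learned_predictor(sms: str) -> bool:
    """Stand-in for a trained classifier: flags a capitalised non-initial token."""
    tokens = sms.split()
    return any(tok[:1].isupper() for tok in tokens[1:])


messages = [
    "see you with Pierre tomorrow",
    "ok a demain",
    "call diana later",
]
auto_decided, needs_review = route_messages(
    messages, toy_dictionary_predictor, toy_learned_predictor
)
```

In this toy run the two predictors agree on the first two messages, so only the third is queued for a human expert. The cost saving claimed in the abstract comes from that review queue being smaller than the full corpus.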