This paper presents two anonymisation methods to process an SMS corpus. The first one is based on an unsupervised approach called
. The implemented system uses several dictionaries and rules to predict whether an SMS needs to be anonymised. The second method is a supervised approach based on machine learning techniques. We evaluate both approaches and propose a way to combine them: an SMS is checked by a human expert only when the two methods disagree in their predictions. This greatly reduces the cost of anonymising the corpus.
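
The combination scheme described above can be illustrated with a minimal sketch. It is not the system from the paper: the function `route_messages`, the two toy predictors, and the small name list are hypothetical stand-ins, assumed only to show how disagreement between two binary predictions can decide which messages go to a human expert.

```python
from typing import Callable, Iterable, List, Tuple

# A predictor returns True when it believes the SMS needs anonymisation.
Predictor = Callable[[str], bool]


def route_messages(
    messages: Iterable[str],
    rule_based: Predictor,
    learned: Predictor,
) -> Tuple[List[Tuple[str, bool]], List[str]]:
    """Split messages into automatically labelled ones and ones needing review.

    A message is labelled automatically when both predictors agree;
    it is sent to a human expert only when they disagree.
    """
    auto_labelled: List[Tuple[str, bool]] = []
    needs_review: List[str] = []
    for sms in messages:
        rule_prediction = rule_based(sms)
        learned_prediction = learned(sms)
        if rule_prediction == learned_prediction:
            auto_labelled.append((sms, rule_prediction))
        else:
            needs_review.append(sms)
    return auto_labelled, needs_review


# Hypothetical toy predictors, for illustration only.
FIRST_NAMES = {"alice", "bob"}  # assumed stand-in for a real dictionary


def toy_rule_based(sms: str) -> bool:
    # Flags the SMS if any token matches an entry in the name list.
    return any(tok.strip(".,!?").lower() in FIRST_NAMES for tok in sms.split())


def toy_learned(sms: str) -> bool:
    # Stand-in for a trained classifier: flags digits or the name "alice".
    return any(ch.isdigit() for ch in sms) or "alice" in sms.lower()


if __name__ == "__main__":
    sample = ["See you at 8", "Call Alice tonight", "ok thanks", "Bob is late"]
    auto, review = route_messages(sample, toy_rule_based, toy_learned)
    print("auto-labelled:", auto)
    print("for human review:", review)
```

In this sketch only the messages on which the two predictors disagree reach the review list, so the manual workload depends on the disagreement rate rather than on the size of the corpus.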