2012 | OriginalPaper | Buchkapitel
Retrieving Candidate Plagiarised Documents Using Query Expansion
verfasst von : Rao Muhammad Adeel Nawab, Mark Stevenson, Paul Clough
Erschienen in: Advances in Information Retrieval
Verlag: Springer Berlin Heidelberg
Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
External plagiarism detection systems compare suspicious texts against a reference collection to identify the original one(s). The suspicious text may not contain a verbatim copy of the reference collection since plagiarists often try to disguise their behaviour by altering the text. For large reference collections, such as those accessible via the internet, it is not practical to compare the suspicious text with every document in the reference collection. Consequently many approaches to plagiarism detection begin by identifying a set of candidate documents from the reference collection. We report an IR-based approach to the candidate document selection problem that uses query expansion to identify candidates which have been altered. The reported system outperforms a previously reported approach and is also robust to changes in the reference collection text.