Skip to main content
Top

2013 | OriginalPaper | Chapter

106. An Improved Plagiarism Detection Method: Model and Sample

Authors : Jing Fang, Yuanyuan Zhang

Published in: Emerging Technologies for Information Systems, Computing, and Management

Publisher: Springer New York

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Cosine similarity measure is an efficient plagiarism detection algorithm for documents. However, it may be misled if the document is not properly preprocessed. Furthermore, the weight for the words in the document should depend on its occurrence frequency in the whole digital library. Otherwise, cosine similarity measure may not accurate enough. This paper aims to enhance the accuracy of similarity measure. A preprocessing method and a model to adjust word’s weight according to occurrence frequency are proposed in this paper. The paper also develops a sample to illustrate how to preprocess documents, adjust the weight for the words and calculate the similarity. The sample shows that it gets better result after applying the model in this paper.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Sven, M.E., Benno, S.: Intrinsic plagiarism detection. In: Advances in Information retrieval 28th European Conference on IR Research, ECIR 2006, London, UK, Automatic Conceptual Analysis for plagiarism detection April 10–12, 2006 Proceedings. Lecture Notes in Computer Science, vol. 3936, pp. 565–569. Springer (2006) Sven, M.E., Benno, S.: Intrinsic plagiarism detection. In: Advances in Information retrieval 28th European Conference on IR Research, ECIR 2006, London, UK, Automatic Conceptual Analysis for plagiarism detection April 10–12, 2006 Proceedings. Lecture Notes in Computer Science, vol. 3936, pp. 565–569. Springer (2006)
2.
go back to reference Zechner, M., Muhr, M., Kern, R., Granitzer, M.: External and intrinsic plagiarism detection using vector space models. PAN’s, pp. 47–55 (2009) Zechner, M., Muhr, M., Kern, R., Granitzer, M.: External and intrinsic plagiarism detection using vector space models. PAN’s, pp. 47–55 (2009)
3.
go back to reference Kang, N., Han, S.Y.: Document copy detection system based on plagiarism patterns. In: CICLing’06 Proceedings of the 7th international conference on computational linguistics and intelligent text processing, pp. 571–574 (2006) Kang, N., Han, S.Y.: Document copy detection system based on plagiarism patterns. In: CICLing’06 Proceedings of the 7th international conference on computational linguistics and intelligent text processing, pp. 571–574 (2006)
4.
go back to reference Si, A., Leong, H.V., Lau, R.W.H.: CHECK: A document plagiarism detection system. Proc. ACM Symp. Applied Comput., 70–77 (1997) Si, A., Leong, H.V., Lau, R.W.H.: CHECK: A document plagiarism detection system. Proc. ACM Symp. Applied Comput., 70–77 (1997)
5.
go back to reference Dreher, H.: Automatic conceptual analysis for plagiarism detection. J. Issues Informing Sci. Inf. Technol. 601–614 (2007) Dreher, H.: Automatic conceptual analysis for plagiarism detection. J. Issues Informing Sci. Inf. Technol. 601–614 (2007)
6.
go back to reference Kang, N., Gelbukh, A., Han, S.: PPChecker: Plagiarism pattern checker in document copy detection. Proc. TSD, 661–667 (2006) Kang, N., Gelbukh, A., Han, S.: PPChecker: Plagiarism pattern checker in document copy detection. Proc. TSD, 661–667 (2006)
7.
go back to reference Timothy, H., Justin, Z.: Methods for Identifying versioned and plagiarized documents. J. Am. Soc. Inform. Sci. Technol. 54(3), 203–215 (2003)CrossRef Timothy, H., Justin, Z.: Methods for Identifying versioned and plagiarized documents. J. Am. Soc. Inform. Sci. Technol. 54(3), 203–215 (2003)CrossRef
Metadata
Title
An Improved Plagiarism Detection Method: Model and Sample
Authors
Jing Fang
Yuanyuan Zhang
Copyright Year
2013
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4614-7010-6_106