Skip to main content

2018 | OriginalPaper | Buchkapitel

Large Scale Authorship Attribution of Online Reviews

verfasst von : Prasha Shrestha, Arjun Mukherjee, Thamar Solorio

Erschienen in: Computational Linguistics and Intelligent Text Processing

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Traditional authorship attribution methods focus on the scenario of a limited number of authors writing long pieces of text. These methods are engineered to work on a small number of authors and generally do not scale well to a corpus of online reviews where the candidate set of authors is large. However, attribution of online reviews is important as they are replete with deception and spam. We evaluate a new large scale approach for predicting authorship via the task of verification on online reviews. Our evaluation considers a large number of possible candidate authors seen to date. Our results show that multiple verification models can be successfully combined to associate reviews with their correct author in more than 78% of the time. We propose that our approach can be used to slow down or deter the number of deceptive reviews in the wild.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
The datasets can be obtained at http://​ritual.​uh.​edu/​resources/​.
 
Literatur
3.
Zurück zum Zitat Kešelj, V., Peng, F., Cercone, N., Thomas, C.: N-gram-based author profiles for authorship attribution. In: Proceedings of the Conference of Pacific Association for Computational Linguistics, PACLING, vol. 3, pp. 255–264 (2003) Kešelj, V., Peng, F., Cercone, N., Thomas, C.: N-gram-based author profiles for authorship attribution. In: Proceedings of the Conference of Pacific Association for Computational Linguistics, PACLING, vol. 3, pp. 255–264 (2003)
4.
Zurück zum Zitat Koppel, M., Schler, J.: Authorship verification as a one-class classification problem. In: Proceedings of the Twenty-First International Conference on Machine Learning, ICML 2004, p. 62. ACM, New York (2004) Koppel, M., Schler, J.: Authorship verification as a one-class classification problem. In: Proceedings of the Twenty-First International Conference on Machine Learning, ICML 2004, p. 62. ACM, New York (2004)
5.
Zurück zum Zitat Koppel, M., Winter, Y.: Determining if two documents are written by the same author. J. Assoc. Inf. Sci. Technol. 65, 178–187 (2014)CrossRef Koppel, M., Winter, Y.: Determining if two documents are written by the same author. J. Assoc. Inf. Sci. Technol. 65, 178–187 (2014)CrossRef
6.
Zurück zum Zitat Qian, T.Y., Liu, B., Li, Q., Si, J.: Review authorship attribution in a similarity space. J. Comput. Sci. Technol. 30, 200–213 (2015)CrossRef Qian, T.Y., Liu, B., Li, Q., Si, J.: Review authorship attribution in a similarity space. J. Comput. Sci. Technol. 30, 200–213 (2015)CrossRef
7.
Zurück zum Zitat Stamatatos, E.: A survey of modern authorship attribution methods. J. Am. Soc. Inf. Sci. Technol. 60, 538–556 (2009)CrossRef Stamatatos, E.: A survey of modern authorship attribution methods. J. Am. Soc. Inf. Sci. Technol. 60, 538–556 (2009)CrossRef
9.
Zurück zum Zitat Jindal, N., Liu, B.: Opinion spam and analysis. In: Proceedings of the 2008 International Conference on Web Search and Data Mining, WSDM 2008, pp. 219–230. ACM, New York (2008) Jindal, N., Liu, B.: Opinion spam and analysis. In: Proceedings of the 2008 International Conference on Web Search and Data Mining, WSDM 2008, pp. 219–230. ACM, New York (2008)
10.
Zurück zum Zitat Mukherjee, A., Liu, B., Glance, N.: Spotting fake reviewer groups in consumer reviews. In: Proceedings of the 21st International Conference on World Wide Web, WWW 2012, pp. 191–200. ACM, New York (2012) Mukherjee, A., Liu, B., Glance, N.: Spotting fake reviewer groups in consumer reviews. In: Proceedings of the 21st International Conference on World Wide Web, WWW 2012, pp. 191–200. ACM, New York (2012)
11.
Zurück zum Zitat Narayanan, A., Paskov, H., Gong, N., Bethencourt, J., Stefanov, E., Shin, E., Song, D.: On the feasibility of internet-scale author identification. In: 2012 IEEE Symposium on Security and Privacy (SP), pp. 300–314 (2012) Narayanan, A., Paskov, H., Gong, N., Bethencourt, J., Stefanov, E., Shin, E., Song, D.: On the feasibility of internet-scale author identification. In: 2012 IEEE Symposium on Security and Privacy (SP), pp. 300–314 (2012)
12.
Zurück zum Zitat Seroussi, Y., Zukerman, I., Bohnert, F.: Authorship attribution with topic models. Comput. Linguist. 40, 269–310 (2014)CrossRef Seroussi, Y., Zukerman, I., Bohnert, F.: Authorship attribution with topic models. Comput. Linguist. 40, 269–310 (2014)CrossRef
13.
Zurück zum Zitat Burrows, J.: Delta: a measure of stylistic difference and a guide to likely authorship. Literary Linguist. Comput. 17, 267–287 (2002)CrossRef Burrows, J.: Delta: a measure of stylistic difference and a guide to likely authorship. Literary Linguist. Comput. 17, 267–287 (2002)CrossRef
14.
Zurück zum Zitat Eder, M.: Does size matter? authorship attribution, small samples, big problem. Digit. Scholarsh. Humanit. 30, 167–182 (2015)CrossRef Eder, M.: Does size matter? authorship attribution, small samples, big problem. Digit. Scholarsh. Humanit. 30, 167–182 (2015)CrossRef
15.
Zurück zum Zitat Stein, B., Lipka, N., Prettenhofer, P.: Intrinsic plagiarism analysis. Lang. Resour. Eval. 45, 63–82 (2011)CrossRef Stein, B., Lipka, N., Prettenhofer, P.: Intrinsic plagiarism analysis. Lang. Resour. Eval. 45, 63–82 (2011)CrossRef
16.
Zurück zum Zitat Guthrie, D., Guthrie, L., Allison, B., Wilks, Y.: Unsupervised anomaly detection. In: Proceedings of the 20th International Joint Conference on Artifical Intelligence, IJCAI 2007, San Francisco, CA, USA, pp. 1624–1628. Morgan Kaufmann Publishers Inc. (2007) Guthrie, D., Guthrie, L., Allison, B., Wilks, Y.: Unsupervised anomaly detection. In: Proceedings of the 20th International Joint Conference on Artifical Intelligence, IJCAI 2007, San Francisco, CA, USA, pp. 1624–1628. Morgan Kaufmann Publishers Inc. (2007)
17.
Zurück zum Zitat Flesch, R.: A new readability yardstick. J. Appl. Psychol. 32, 221–223 (1948)CrossRef Flesch, R.: A new readability yardstick. J. Appl. Psychol. 32, 221–223 (1948)CrossRef
18.
Zurück zum Zitat Kincaid, J.P., Fishburne Jr., R.P., Rogers, R.L., Chissom, B.S.: Derivation of new readability formulas (automated readability index, fog count and flesch reading ease formula) for navy enlisted personnel. Technical report (1975) Kincaid, J.P., Fishburne Jr., R.P., Rogers, R.L., Chissom, B.S.: Derivation of new readability formulas (automated readability index, fog count and flesch reading ease formula) for navy enlisted personnel. Technical report (1975)
19.
Zurück zum Zitat Gunning, R.: The Technique of Clear Writing (1952) Gunning, R.: The Technique of Clear Writing (1952)
Metadaten
Titel
Large Scale Authorship Attribution of Online Reviews
verfasst von
Prasha Shrestha
Arjun Mukherjee
Thamar Solorio
Copyright-Jahr
2018
DOI
https://doi.org/10.1007/978-3-319-75487-1_17

Premium Partner