
2017 | OriginalPaper | Book chapter

Using Extended Stopwords Lists to Improve the Quality of Academic Abstracts Clustering

Authors: Svetlana Popova, Vera Danilova

Published in: On the Move to Meaningful Internet Systems: OTM 2016 Workshops

Publisher: Springer International Publishing

Abstract

Knowledge extraction from scientific documents plays an important role in the development of academic databases and services. We focus on processing the abstracts of academic papers for the purpose of structuring research data, which includes subtasks such as key phrase extraction and clustering. Abstracts are well suited to this because authors follow the formal and stylistic requirements imposed by publishers, so recurring informational and linguistic patterns can be identified. In our view, the existence of these patterns makes it possible to apply techniques developed for one abstract-processing task to another. The aim of this paper is to demonstrate this cross-task applicability.

Metadata
Title
Using Extended Stopwords Lists to Improve the Quality of Academic Abstracts Clustering
Authors
Svetlana Popova
Vera Danilova
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-55961-2_24
