2017 | OriginalPaper | Chapter

Network-Enabled Keyword Extraction for Under-Resourced Languages

Authors : Slobodan Beliga, Sanda Martinčić-Ipšić

Published in: Semantic Keyword-Based Search on Structured Data Sources

Publisher: Springer International Publishing

Abstract

In this paper we discuss the advantages of network-enabled keyword extraction from texts in under-resourced languages. Network-enabled methods are briefly introduced, while the focus of the paper is on the difficulties that methods must overcome when dealing with content in under-resourced languages (difficulties that mainly manifest as a lack of natural language processing resources: corpora and tools). Additionally, the paper discusses how to circumvent the lack of NLP tools with a network-enabled method such as the SBKE method.
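The abstract does not define SBKE, so the following is only a minimal sketch of how a network-enabled extractor can work without language-specific tools. It assumes that SBKE stands for selectivity-based keyword extraction and that a node's selectivity is its strength (sum of link weights) divided by its degree (number of links), computed here on incoming links of a directed word co-occurrence network. The tokenizer, the function names, and the ranking step are illustrative assumptions, not the authors' implementation, and any further steps of the published method are omitted.

```python
import re
from collections import defaultdict


def build_cooccurrence_network(text):
    """Build a directed, weighted co-occurrence network.

    The weight of edge (u, v) counts how often word v directly follows
    word u within a sentence. Only a regex tokenizer is used: no stemmer,
    tagger, or stop-word list (an assumption made for this sketch).
    """
    weights = defaultdict(int)
    for sentence in re.split(r"[.!?]+", text.lower()):
        tokens = re.findall(r"\w+", sentence)  # \w is Unicode-aware for str
        for u, v in zip(tokens, tokens[1:]):
            weights[(u, v)] += 1
    return dict(weights)


def in_selectivity(weights):
    """Assumed definition: in-strength divided by in-degree for each node."""
    in_strength = defaultdict(int)
    in_degree = defaultdict(int)
    for (_, v), w in weights.items():
        in_strength[v] += w   # total weight of incoming links
        in_degree[v] += 1     # number of distinct incoming links
    return {node: in_strength[node] / in_degree[node] for node in in_degree}


def rank_by_selectivity(text, top_n=5):
    """Return the top_n words by in-selectivity (ties broken alphabetically)."""
    scores = in_selectivity(build_cooccurrence_network(text))
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
```

Calling `rank_by_selectivity(document_text)` on raw text returns candidate keywords with their scores. Because the network is built from word order alone, the same code can be applied to a language for which no tagger or lemmatizer exists, which is the kind of workaround the abstract refers to.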

Title: Network-Enabled Keyword Extraction for Under-Resourced Languages
Authors: Slobodan Beliga, Sanda Martinčić-Ipšić
Publisher: Springer International Publishing
Book: Semantic Keyword-Based Search on Structured Data Sources
Print ISBN: 978-3-319-53639-2

Electronic ISBN: 978-3-319-53640-8

Copyright Year: 2017
DOI: https://doi.org/10.1007/978-3-319-53640-8_11
