
2018 | Original Paper | Book Chapter

Tuning SyntaxNet for POS Tagging Italian Sentences

Authors: Fiammetta Marulli, Marco Pota, Massimo Esposito, Alessandro Maisto, Raffaele Guarasci

Published in: Advances on P2P, Parallel, Grid, Cloud and Internet Computing

Publisher: Springer International Publishing


Abstract

Part-of-speech (POS) tagging is a Natural Language Processing (NLP) technique that is highly relevant to Question Answering systems, and it becomes more complex when these systems operate on spoken language. In the use case considered here, spoken Italian, enclitic forms are very difficult to tag, since they consist of one or more pronouns appended as suffixes to verbs. This work describes a case study investigating how to refine SyntaxNet, the NLP framework released by Google, to tag enclitic forms in Italian efficiently. First, a forward selection of different features is presented, aimed at assessing their influence on the POS tagging performance of SyntaxNet in Italian. Second, further features are added, as suggested by the morphological rules that characterize Italian enclitics, in order to improve POS tagging performance. Finally, a qualitative and quantitative evaluation on sentences drawn from real spoken dialogs is performed, showing very promising results.
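The forward selection mentioned in the abstract can be illustrated with a generic greedy loop. The sketch below is not the authors' procedure and does not use any SyntaxNet API: the feature names, the `toy_evaluate` function, and its integer scores are all hypothetical stand-ins. In a real setting, `evaluate` would presumably train a tagger with the given feature subset and return its accuracy on held-out data.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def forward_selection(
    candidates: Iterable[str],
    evaluate: Callable[[FrozenSet[str]], int],
    min_gain: int = 0,
) -> Tuple[List[str], int]:
    """Greedily add the feature that improves the score most, one at a time."""
    selected: List[str] = []
    remaining = list(candidates)
    best_score = evaluate(frozenset())  # score with no features at all
    while remaining:
        # Score every candidate when added to the current selection.
        scored = [(evaluate(frozenset(selected + [f])), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        # Stop as soon as the best candidate no longer helps enough.
        if top_score - best_score <= min_gain:
            break
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Hypothetical per-feature contributions (toy integers, not measured results).
TOY_GAINS = {"word": 800, "suffix3": 100, "prefix2": 30, "capitalization": 0}


def toy_evaluate(features: FrozenSet[str]) -> int:
    """Stand-in for 'train the tagger and measure accuracy' (an assumption)."""
    return sum(TOY_GAINS[f] for f in features)


selected, best_score = forward_selection(TOY_GAINS, toy_evaluate)
print(selected, best_score)  # ['word', 'suffix3', 'prefix2'] 930
```

In this toy run, `capitalization` is never selected because adding it yields no gain. With a real evaluator, each call is a full training run, so the number of evaluations (quadratic in the number of candidate features) is the main cost to keep in mind.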


Metadata
Title
Tuning SyntaxNet for POS Tagging Italian Sentences
Authors
Fiammetta Marulli
Marco Pota
Massimo Esposito
Alessandro Maisto
Raffaele Guarasci
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-69835-9_30
