
2018 | Original Paper | Book Chapter

Using Syntactic and Semantic Features for Classifying Modal Values in the Portuguese Language

Authors: João Sequeira, Teresa Gonçalves, Paulo Quaresma, Amália Mendes, Iris Hendrickx

Published in: Computational Linguistics and Intelligent Text Processing

Publisher: Springer International Publishing

Abstract

This paper presents a study in a field that remains poorly explored for the Portuguese language: modality and its automatic tagging. Our main goal was to find a set of attributes for building automatic taggers that outperform the bag-of-words (bow) approach. Performance was measured using precision, recall and \(F_1\). Because the field is relatively unexplored, the study covers the creation of the corpus (composed of eleven verbs), the use of a parser to extract syntactic and semantic information from the sentences, and a machine learning approach to identify modality values. Based on three different sets of attributes (from the trigger itself, from the trigger's path in the parse tree, and from the context), the system creates a tagger for each verb, achieving an improvement in \(F_1\) over the traditional bow approach for almost every verb.
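As a rough illustration of the evaluation measures named in the abstract, the following minimal Python sketch computes per-label precision, recall and \(F_1\) from two parallel lists of labels. The function name, the label names and the toy data are assumptions made for this example; they are not taken from the paper, and this is not the authors' evaluation code.

```python
def precision_recall_f1(gold, predicted, label):
    """Per-label precision, recall and F1 for two parallel label lists."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)  # true positives
    fp = sum(1 for g, p in pairs if g != label and p == label)  # false positives
    fn = sum(1 for g, p in pairs if g == label and p != label)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold and predicted modal values for one verb (illustrative only).
gold = ["epistemic", "deontic", "epistemic", "deontic", "epistemic"]
pred = ["epistemic", "epistemic", "epistemic", "deontic", "deontic"]

p, r, f = precision_recall_f1(gold, pred, "epistemic")
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

Under this sketch, comparing a feature-based tagger with a bag-of-words baseline would amount to computing these values for both sets of predictions against the same gold labels.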

Footnotes
1
The MMAX2 software is platform-independent, written in Java, and can be freely downloaded from http://mmax2.sourceforge.net/.
 
Metadata
Title
Using Syntactic and Semantic Features for Classifying Modal Values in the Portuguese Language
Authors
João Sequeira
Teresa Gonçalves
Paulo Quaresma
Amália Mendes
Iris Hendrickx
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-75487-1_28