2014 | OriginalPaper | Chapter

Informal Mathematical Discourse Parsing with Conditional Random Fields

Authors: Raúl Ernesto Gutierrez de Piñerez Reyes, Juan Francisco Díaz-Frías

Published in: Statistical Language and Speech Processing

Publisher: Springer International Publishing

Abstract

Discourse parsing of Informal Mathematical Discourse (IMD) has been a difficult task, partly because of the lack of data sets and partly because Natural Language Processing (NLP) techniques must be adapted to the informality of IMD. In this paper, we present an end-to-end discourse parser that acts as a sequential classifier of informal deductive argumentations (IDA) for Spanish. The parser uses sequence labeling based on Conditional Random Fields (CRFs), applied to lexical, syntactic and semantic features extracted from a discourse corpus (MD-TreeBank: Mathematical Discourse TreeBank). It follows the style of the Penn Discourse TreeBank (PDTB) end-to-end discourse parser and is set in the context of Controlled Natural Languages (CNLs). We approach discourse parsing from a low-level perspective, identifying the IDA connectives while avoiding complex linguistic phenomena. The parser treats parsing as a connective-level sequence labeling task and classifies several types of informal deductive argumentation within the mathematical proof.
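To make the idea of connective-level sequence labeling concrete, the following is a minimal sketch of how a linear-chain CRF assigns labels at decoding time. It is not the authors' implementation: the BIO label set, the tiny connective lexicon, the feature templates and all weights are hypothetical and hand-set for illustration, whereas a real CRF would learn its weights from an annotated corpus such as the MD-TreeBank.

```python
from typing import Dict, List, Tuple

# Assumed BIO scheme for connective spans (not taken from the paper).
LABELS = ["O", "B-CONN", "I-CONN"]

# Hypothetical toy lexicon of Spanish connective words.
CONNECTIVE_WORDS = {"entonces", "por", "tanto", "luego"}

# Hand-set weights standing in for learned CRF parameters (assumption).
EMISSION: Dict[Tuple[str, str], float] = {
    ("in_lexicon=True", "B-CONN"): 2.0,
    ("in_lexicon=True", "I-CONN"): 1.5,
    ("in_lexicon=False", "O"): 2.0,
}
TRANSITION: Dict[Tuple[str, str], float] = {
    ("O", "I-CONN"): -10.0,      # I-CONN should not follow O
    ("B-CONN", "I-CONN"): 1.0,   # favour continuing a multiword connective
    ("B-CONN", "B-CONN"): -1.0,  # discourage two span starts in a row
}
START: Dict[str, float] = {"I-CONN": -10.0}  # a sequence should not open with I-CONN


def token_features(tokens: List[str], i: int) -> List[str]:
    """Lexical features for token i; real systems add syntactic and semantic ones."""
    word = tokens[i].lower()
    prev = tokens[i - 1].lower() if i > 0 else "<s>"
    return [f"word={word}", f"prev={prev}", f"in_lexicon={word in CONNECTIVE_WORDS}"]


def emission_score(feats: List[str], label: str) -> float:
    """Sum the weights of all (feature, label) pairs that fire."""
    return sum(EMISSION.get((f, label), 0.0) for f in feats)


def viterbi(tokens: List[str]) -> List[str]:
    """Return the highest-scoring label sequence under the toy model."""
    if not tokens:
        return []
    n = len(tokens)
    score: List[Dict[str, float]] = [{} for _ in range(n)]
    back: List[Dict[str, str]] = [{} for _ in range(n)]
    feats = token_features(tokens, 0)
    for y in LABELS:
        score[0][y] = START.get(y, 0.0) + emission_score(feats, y)
    for i in range(1, n):
        feats = token_features(tokens, i)
        for y in LABELS:
            # Best previous label for reaching label y at position i.
            best_prev = max(
                LABELS,
                key=lambda p: score[i - 1][p] + TRANSITION.get((p, y), 0.0),
            )
            score[i][y] = (
                score[i - 1][best_prev]
                + TRANSITION.get((best_prev, y), 0.0)
                + emission_score(feats, y)
            )
            back[i][y] = best_prev
    # Follow back-pointers from the best final label.
    path = [max(LABELS, key=lambda y: score[n - 1][y])]
    for i in range(n - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    sentence = ["x", "es", "par", ",", "por", "tanto", "x", "+", "1", "es", "impar"]
    for token, label in zip(sentence, viterbi(sentence)):
        print(f"{token}\t{label}")
```

With these toy weights, the two-word connective "por tanto" is tagged as one B-CONN/I-CONN span and every other token as O. Classifying the type of deductive argumentation, as the abstract describes, would need a richer label set or a second classification step; this sketch covers connective identification only.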


Metadata
Title
Informal Mathematical Discourse Parsing with Conditional Random Fields
Authors
Raúl Ernesto Gutierrez de Piñerez Reyes
Juan Francisco Díaz-Frías
Copyright Year
2014
DOI
https://doi.org/10.1007/978-3-319-11397-5_20
