ABSTRACT
We propose using Lexicalized Tree Adjoining Grammar (LTAG) as a source of features for reranking the output of a statistical parser. In this paper, we extend the notion of a tree kernel over arbitrary sub-trees of a parse to the derivation trees and derived trees provided by the LTAG formalism, and in addition we extend the original definition of the tree kernel, making it more lexicalized and more compact. Using LTAG-based features for the parse reranking task, we obtain labeled recall and precision of 89.7%/90.0% on WSJ section 23 of the Penn Treebank for sentences of length ≤ 100 words. Our results show that the LTAG-based tree kernel gives a 17% relative improvement in f-score over a linear kernel without LTAG-based features.
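The tree kernels extended here build on the all-subtrees convolution kernel of Collins and Duffy, which scores two parse trees by the number of common subtree fragments they share, computed in polynomial time by a recursion over node pairs. As a rough illustration (not the paper's LTAG variant), a minimal sketch of that base kernel, with trees encoded as nested tuples `(label, child, ...)` and a `decay` factor down-weighting larger fragments, might look like:

```python
# Sketch of the Collins-Duffy convolution tree kernel over ordinary
# parse trees. Trees are nested tuples: ("S", ("NP", ...), ...);
# leaves (words) are plain strings.

def production(node):
    """The CFG production at a node: its label plus its children's labels."""
    return (node[0], tuple(c if isinstance(c, str) else c[0]
                           for c in node[1:]))

def is_preterminal(node):
    """True if every child is a word (a POS-tag node)."""
    return all(isinstance(c, str) for c in node[1:])

def common_subtrees(n1, n2, decay=1.0):
    """C(n1, n2): weighted count of common fragments rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    if is_preterminal(n1):
        return decay
    score = decay
    # Productions match, so the children line up one-to-one.
    for c1, c2 in zip(n1[1:], n2[1:]):
        score *= 1.0 + common_subtrees(c1, c2, decay)
    return score

def tree_kernel(t1, t2, decay=1.0):
    """K(T1, T2) = sum of C(n1, n2) over all node pairs."""
    def nodes(t):
        yield t
        for c in t[1:]:
            if not isinstance(c, str):
                yield from nodes(c)
    return sum(common_subtrees(a, b, decay)
               for a in nodes(t1) for b in nodes(t2))
```

For example, `tree_kernel(t, t)` for `t = ("S", ("NP", ("N", "dogs")), ("VP", ("V", "bark")))` counts every pair of identical fragments in the tree. The paper's contribution is to run this style of recursion over LTAG derivation and derived trees instead of raw Treebank subtrees, which makes the fragment space more lexicalized and more compact.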