Effective self-training for parsing

ABSTRACT
We present a simple, but surprisingly effective, method of self-training a two-phase parser-reranker system using readily available unlabeled data. We show that this type of bootstrapping is possible for parsing when the bootstrapped parses are processed by a discriminative reranker. Our improved model achieves an f-score of 92.1%, an absolute 1.1% improvement (12% error reduction) over the previous best result for Wall Street Journal parsing. Finally, we provide some analysis to better understand the phenomenon.
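The bootstrapping loop the abstract describes can be sketched in a few lines: parse unlabeled sentences with the generative parser, let the discriminative reranker pick the best candidate from the n-best list, add those reranker-selected parses to the training set as if they were gold trees, and retrain the parser. The sketch below is a minimal illustration under toy assumptions — `ToyParser`, `ToyReranker`, and the bracketed strings are hypothetical stand-ins, not the Charniak parser or MaxEnt reranker used in the paper, and only the parser is retrained (the reranker stays fixed, as in the paper's setup).

```python
def self_train(parser, reranker, labeled, unlabeled, rounds=1):
    """Bootstrap a generative parser on reranker-selected parses.

    `parser` and `reranker` are assumed stand-ins for the two-phase
    system; in the paper the parser produces a 50-best list and a
    fixed discriminative reranker selects from it.
    """
    training_data = list(labeled)
    for _ in range(rounds):
        parser.train(training_data)
        for sentence in unlabeled:
            candidates = parser.nbest(sentence)     # n-best parses
            best = reranker.rerank(candidates)      # discriminative pick
            training_data.append((sentence, best))  # treat as gold
        # Retrain the parser only; the reranker is not retrained.
        parser.train(training_data)
    return parser


class ToyParser:
    """Memorizes (sentence, parse) pairs; backs off to a flat parse."""

    def __init__(self):
        self.table = {}

    def train(self, data):
        self.table = {}
        for sentence, parse in data:
            self.table.setdefault(sentence, []).append(parse)

    def nbest(self, sentence, n=2):
        # Unseen sentences get a single flat candidate.
        return self.table.get(sentence, ["(S %s)" % sentence])[:n]


class ToyReranker:
    """Trivial reranker: takes the first candidate."""

    def rerank(self, candidates):
        return candidates[0]
```

Running `self_train` with one labeled tree and one unlabeled sentence adds a reranker-selected parse for the unseen sentence to the parser's training data, which is the essence of the self-training step.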