ABSTRACT
This paper proposes an approach to full parsing suitable for Information Extraction (IE) from texts. Sequences of rule cascades analyze the text deterministically, building unambiguous structures. First, basic chunks are analyzed; then argumental relations are recognized; finally, modifier attachment is performed and the global parse tree is built. The approach has been shown to work for three languages and several domains. It was implemented in the IE module of FACILE, an EU project for multilingual text classification and IE.
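The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the FACILE implementation: the abstract does not give the actual rules, so the toy tag set (`DET`, `ADJ`, `N`, `V`, `P`), the `Node` class, and the three functions below are all hypothetical. Each stage makes one deterministic pass and commits to a single structure, which is the property the abstract emphasizes.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                                   # "NP", "VG", "PP" or "S"
    words: list                                  # surface words covered by a chunk
    children: list = field(default_factory=list)
    role: str = ""                               # "subj", "obj" or "mod" (assumed labels)

NOMINAL = ("DET", "ADJ", "N")                    # hypothetical toy tag set

def chunk(tagged):
    """Cascade 1: group (word, tag) pairs into basic chunks."""
    chunks, i = [], 0
    while i < len(tagged):
        word, tag = tagged[i]
        if tag in NOMINAL or tag == "P":
            # An NP is a run of nominal tags; a PP is a preposition plus such a run.
            j = i + 1
            while j < len(tagged) and tagged[j][1] in NOMINAL:
                j += 1
            label = "PP" if tag == "P" else "NP"
            chunks.append(Node(label, [w for w, _ in tagged[i:j]]))
            i = j
        elif tag == "V":
            chunks.append(Node("VG", [word]))
            i += 1
        else:
            i += 1                               # skip tags the toy rules do not cover
    return chunks

def attach_arguments(chunks):
    """Cascade 2: mark the NPs adjacent to a verb group as subject and object."""
    for k, c in enumerate(chunks):
        if c.label == "VG":
            if k > 0 and chunks[k - 1].label == "NP":
                chunks[k - 1].role = "subj"
            if k + 1 < len(chunks) and chunks[k + 1].label == "NP":
                chunks[k + 1].role = "obj"
    return chunks

def attach_modifiers(chunks):
    """Cascade 3: attach each PP to the closest preceding chunk, then build the tree."""
    top = []
    for c in chunks:
        if c.label == "PP" and top:
            c.role = "mod"
            top[-1].children.append(c)
        else:
            top.append(c)
    return Node("S", [], children=top)

def parse(tagged):
    """Run the three cascades in sequence; no stage backtracks or keeps alternatives."""
    return attach_modifiers(attach_arguments(chunk(tagged)))

sentence = [("The", "DET"), ("company", "N"), ("acquired", "V"), ("a", "DET"),
            ("small", "ADJ"), ("firm", "N"), ("in", "P"), ("Italy", "N")]
tree = parse(sentence)
```

For the example sentence, the sketch yields an `S` node with three children (subject NP, verb group, object NP), with the prepositional phrase attached as a modifier of the object NP. Attaching to the nearest preceding chunk is only one possible deterministic policy, chosen here for brevity.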
- Full text parsing using cascades of rules: an information extraction perspective