ABSTRACT
Most traditional information extraction approaches are generative models that assume events appear in text in certain patterns and that these patterns can be regenerated in various ways. These assumptions limit the syntactic clues considered when finding an event and confine such approaches to a single syntactic level. This paper presents a discriminative framework based on kernel SVMs that takes different levels of syntactic information into account and automatically identifies the appropriate clues. Each kernel represents a certain level of syntactic structure, and kernels can be combined in principled ways as input to an SVM. We show that, by combining a low-level sequence kernel with a high-level kernel on a GLARF dependency graph, the new approach outperforms a good rule-based system on slot filler detection for MUC-6.
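The idea of combining kernels from different syntactic levels can be illustrated with a minimal sketch. The abstract does not define the kernels themselves, so everything below is an assumption made for illustration: `sequence_kernel` counts shared word bigrams as a stand-in for a low-level sequence kernel, `dependency_kernel` counts shared labeled edges as a stand-in for a kernel on a dependency graph, and the edge labels (`SBJ`, `OBJ`), the example sentences, and the `weight` parameter are hypothetical. The combination relies only on the standard fact that a non-negative weighted sum of valid kernels is itself a valid kernel.

```python
from collections import Counter
from math import sqrt


def ngram_counts(tokens, n=2):
    """Count contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sequence_kernel(a, b, n=2):
    """Toy low-level kernel: dot product of word n-gram count vectors."""
    ca, cb = ngram_counts(a["tokens"], n), ngram_counts(b["tokens"], n)
    return float(sum(ca[g] * cb[g] for g in ca))


def dependency_kernel(a, b):
    """Toy high-level kernel: dot product of labeled-edge count vectors."""
    ea, eb = Counter(a["edges"]), Counter(b["edges"])
    return float(sum(ea[e] * eb[e] for e in ea))


def normalized(kernel):
    """Wrap a kernel so that k(x, x) == 1 whenever k(x, x) > 0."""
    def k(a, b):
        denom = sqrt(kernel(a, a) * kernel(b, b))
        return kernel(a, b) / denom if denom else 0.0
    return k


def combined_kernel(a, b, weight=0.5):
    """Convex combination of the two normalized kernels (hypothetical weighting)."""
    seq = normalized(sequence_kernel)(a, b)
    dep = normalized(dependency_kernel)(a, b)
    return weight * seq + (1.0 - weight) * dep


def gram_matrix(examples, kernel):
    """Pairwise kernel values, the form a precomputed-kernel SVM consumes."""
    return [[kernel(x, y) for y in examples] for x in examples]


# Hypothetical examples: an active and a passive sentence that share the
# same regularized (logical) dependency edges but no surface bigrams.
active = {
    "tokens": ["Smith", "succeeds", "Jones", "as", "president"],
    "edges": [("succeed", "SBJ", "Smith"), ("succeed", "OBJ", "Jones")],
}
passive = {
    "tokens": ["Jones", "was", "succeeded", "by", "Smith"],
    "edges": [("succeed", "SBJ", "Smith"), ("succeed", "OBJ", "Jones")],
}

gram = gram_matrix([active, passive], combined_kernel)
```

In this toy case the sequence kernel sees no overlap between the two sentences, while the dependency kernel treats them as identical, so the combined kernel still registers a similarity. A Gram matrix such as `gram` could then be handed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.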
- Discriminative slot detection using kernel methods