2003 | Original Paper | Book Chapter
Smoothing Techniques for Tree-k-Grammar-Based Natural Language Modeling
Authors: Jose L. Verdú-Mas, Jorge Calera-Rubio, Rafael C. Carrasco
Published in: Pattern Recognition and Image Analysis
Publisher: Springer Berlin Heidelberg
Included in: Professional Book Archive
In previous work, we introduced a new probabilistic context-free grammar (PCFG) model for natural language parsing, derived from a treebank corpus. The model estimates rule probabilities according to a generalized k-grammar scheme for trees. It allows for faster parsing, considerably decreases the perplexity of the test samples, and tends to give more structured and refined parses. However, it suffers from incomplete coverage: some sentences receive zero probability. In this paper, we compare several smoothing techniques, such as backing-off and interpolation, that avoid assigning zero probability to any sentence.
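To illustrate the two families of techniques named in the abstract, here is a minimal sketch of linear interpolation and a simple back-off between a specific and a more general probability estimate. It is not the formulation used in the paper: the function names, the rule strings, the counts, and the values of `lam` and `discount` are all hypothetical and chosen only for illustration.

```python
def mle(counts):
    """Relative-frequency (maximum-likelihood) estimate from raw counts."""
    total = sum(counts.values())
    return {event: c / total for event, c in counts.items()}


def interpolate(p_specific, p_general, lam):
    """Linear interpolation: lam * specific + (1 - lam) * general."""
    events = set(p_specific) | set(p_general)
    return {
        e: lam * p_specific.get(e, 0.0) + (1.0 - lam) * p_general.get(e, 0.0)
        for e in events
    }


def back_off(p_specific, p_general, discount):
    """Simple back-off: seen events keep (1 - discount) of their specific
    probability; the discounted mass is shared among events unseen by the
    specific model, in proportion to the general model."""
    unseen_mass = sum(p for e, p in p_general.items() if e not in p_specific)
    if unseen_mass == 0.0:
        # Nothing to back off to: leave the specific model unchanged.
        return dict(p_specific)
    smoothed = {e: (1.0 - discount) * p for e, p in p_specific.items()}
    for e, p in p_general.items():
        if e not in p_specific:
            smoothed[e] = discount * p / unseen_mass
    return smoothed


# Hypothetical toy counts: expansions of NP observed in a specific context
# (sparse) versus in any context (better covered).
specific_counts = {"NP -> DT NN": 3, "NP -> PRP": 1}
general_counts = {"NP -> DT NN": 5, "NP -> PRP": 3, "NP -> NP PP": 2}

p_specific = mle(specific_counts)
p_general = mle(general_counts)

p_interp = interpolate(p_specific, p_general, lam=0.8)
p_backoff = back_off(p_specific, p_general, discount=0.1)

for rule in sorted(p_general):
    print(f"{rule:12s} interp={p_interp[rule]:.3f} backoff={p_backoff[rule]:.3f}")
```

In both variants the expansion `NP -> NP PP`, unseen in the specific context, receives a nonzero probability while each distribution still sums to one. Interpolation always mixes the two estimates, whereas back-off consults the general estimate only for events the specific model has not seen.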