2002 | OriginalPaper | Book chapter

Signal Boosting for Translingual Topic Tracking

Document Expansion and n-best Translation

Written by: Gina-Anne Levow, Douglas W. Oard

Published in: Topic Detection and Tracking

Publisher: Springer US

The University of Maryland participated in the TDT-1999 topic tracking task. This chapter describes the system architecture, including source-dependent normalization, and then focuses on the cross-language case in which English training stories were used to find Mandarin stories on the same topic. Processes that may introduce noise, including errorful translation and transcription, are described and five techniques for minimizing the impact of a reduced signal-to-noise ratio are identified. Three techniques focus on signal boosting: augmenting story representations with topically related terminology through “document expansion,” exploiting knowledge of alternative translations using balanced n-best term translation, and enriching the bilingual term list to improve translation coverage. The remaining two techniques focus on noise reduction: removing common “stopwords” before translation and using corpus statistics to guide translation selection. Two of the signal boosting strategies yielded substantial gains using techniques that can be ported to other languages fairly easily, while outperforming state-of-the-art general-purpose machine translation. By contrast, neither of the noise reduction strategies produced significant improvements.
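To make the idea of balanced n-best term translation concrete, here is a minimal sketch. It is not the authors' implementation; it shows one plausible reading of "balanced", in which a source term's weight is split evenly across its top n translations so that terms with many alternatives do not dominate the translated representation. The function name `balanced_nbest_translate`, the toy bilingual term list, and the choice to drop untranslatable terms are all assumptions made for illustration.

```python
from collections import Counter


def balanced_nbest_translate(term_counts, term_list, n=2):
    """Translate a bag of source terms into a weighted bag of target terms.

    term_counts: mapping of source term -> count in the story.
    term_list:   hypothetical bilingual term list, source term -> ranked
                 list of candidate translations (best first).
    n:           number of top-ranked translations to keep per term.
    """
    translated = Counter()
    for term, count in term_counts.items():
        candidates = term_list.get(term, [])[:n]
        if not candidates:
            # Assumption: terms missing from the term list are dropped.
            continue
        # "Balanced": the term's weight is shared equally among the
        # retained translations rather than counted once per translation.
        share = count / len(candidates)
        for target in candidates:
            translated[target] += share
    return translated


# Toy data, invented for this example only.
toy_term_list = {
    "yinhang": ["bank", "banking", "banker"],
    "zongtong": ["president"],
}
story = Counter({"yinhang": 2, "zongtong": 1, "unknownterm": 4})
vector = balanced_nbest_translate(story, toy_term_list, n=2)
```

With this weighting the total weight contributed by each translatable source term equals its original count, regardless of how many translations it has. Enriching the term list, the third signal boosting technique named above, would correspond here to adding entries so that fewer terms fall into the dropped case.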

Metadata
Title
Signal Boosting for Translingual Topic Tracking
Written by
Gina-Anne Levow
Douglas W. Oard
Copyright year
2002
Publisher
Springer US
DOI
https://doi.org/10.1007/978-1-4615-0933-2_9
