2002 | Original Paper | Book Chapter

A Hierarchical Model for Clustering and Categorising Documents

Authors: E. Gaussier, C. Goutte, K. Popat, F. Chen

Published in: Advances in Information Retrieval

Publisher: Springer Berlin Heidelberg

We propose a new hierarchical generative model for textual data, in which words may be generated by topic-specific distributions at any level of the hierarchy. The model is well suited to clustering documents in preset or automatically generated hierarchies, and to categorising new documents in an existing hierarchy. Training algorithms are derived for both cases and illustrated on real data by clustering news stories and categorising newsgroup messages. Finally, the generative model can be used to derive a Fisher kernel expressing similarity between documents.
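
The idea that a word can come from a distribution attached to any level of the hierarchy can be illustrated with a small sampling sketch. The code below is not the authors' parameterisation: the toy hierarchy, the node names, the word probabilities, and the uniform choice among levels are all assumptions made for illustration only.

```python
import random

# Hypothetical toy hierarchy: each node maps to its parent (the root has None).
PARENT = {"root": None, "sports": "root", "politics": "root"}

# Hypothetical per-node word distributions (each sums to 1).
WORD_DIST = {
    "root": {"the": 0.6, "of": 0.4},
    "sports": {"goal": 0.7, "team": 0.3},
    "politics": {"vote": 0.5, "law": 0.5},
}


def path_to_root(node):
    """Return the nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def generate_document(leaf, n_words, rng):
    """Sample words, each drawn from a node chosen on the leaf's path to the root."""
    path = path_to_root(leaf)
    words = []
    for _ in range(n_words):
        # Assumption: levels are chosen uniformly; a trained model would
        # learn these level probabilities instead.
        node = rng.choice(path)
        vocab = list(WORD_DIST[node])
        weights = [WORD_DIST[node][w] for w in vocab]
        words.append(rng.choices(vocab, weights=weights, k=1)[0])
    return words


rng = random.Random(0)
doc = generate_document("sports", 8, rng)
```

In this sketch a "sports" document mixes general words from the root with topic words from its own node, which is the intuition behind letting every level of the hierarchy contribute vocabulary.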

Metadata
Title
A Hierarchical Model for Clustering and Categorising Documents
Authors
E. Gaussier
C. Goutte
K. Popat
F. Chen
Copyright Year
2002
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/3-540-45886-7_16