2002 | OriginalPaper | Chapter

A Hierarchical Model for Clustering and Categorising Documents

Authors: E. Gaussier, C. Goutte, K. Popat, F. Chen

Published in: Advances in Information Retrieval

Publisher: Springer Berlin Heidelberg

We propose a new hierarchical generative model for textual data, where words may be generated by topic-specific distributions at any level in the hierarchy. This model is naturally well-suited to clustering documents in preset or automatically generated hierarchies, as well as categorising new documents in an existing hierarchy. Training algorithms are derived for both cases, and illustrated on real data by clustering news stories and categorising newsgroup messages. Finally, the generative model may be used to derive a Fisher kernel expressing similarity between documents.

Metadata
Title
A Hierarchical Model for Clustering and Categorising Documents
Authors
E. Gaussier
C. Goutte
K. Popat
F. Chen
Copyright Year
2002
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/3-540-45886-7_16
