2004 | Original Paper | Book Chapter
Non-negative Matrix Factorization for Filtering Chinese Document
Written by: Jianjiang Lu, Baowen Xu, Jixiang Jiang, Dazhou Kang
Published in: Computational Science - ICCS 2004
Publisher: Springer Berlin Heidelberg
Included in: Professional Book Archive
Filtering systems for Chinese documents face two classical problems: synonymy and polysemy. To address them, documents would ideally be represented not by words but by the semantic relations between words. We apply non-negative matrix factorization (NMF) to reduce the dimensionality of the word space. NMF differs from latent semantic indexing (LSI) in its non-negativity constraints, which lead to a parts-based representation because they allow only additive, not subtractive, combinations. In addition, NMF is computed with a simple iterative algorithm, which makes it advantageous for applications involving large sparse matrices. Experimental results show that, compared with LSI, the NMF method not only improves filtering precision markedly but also computes faster and uses less memory.
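
The abstract does not say which iterative algorithm is used. As a minimal sketch, assuming the widely used multiplicative update rules for the Frobenius-norm objective, the factorization of a term-document matrix V into non-negative factors W (terms by topics) and H (topics by documents) might look as follows. The function name `nmf_multiplicative`, its parameters, and the toy matrix are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate V (terms x docs) as W @ H with W, H >= 0.

    Assumption: multiplicative updates for the Frobenius-norm objective;
    the paper's exact algorithm is not given in the abstract.
    """
    rng = np.random.default_rng(seed)
    n_terms, n_docs = V.shape
    # Strictly positive random start, so the updates never divide by zero.
    W = rng.random((n_terms, rank)) + eps
    H = rng.random((rank, n_docs)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so the factors
        # stay non-negative (additive combinations only).
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy term-document matrix: two groups of terms that
# co-occur in two disjoint groups of documents.
V = np.array([
    [2.0, 4.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 0.0, 6.0, 2.0],
])

W, H = nmf_multiplicative(V, rank=2)
error = np.linalg.norm(V - W @ H)
print("reconstruction error:", error)
# Columns of H are the reduced (rank-2) document representations.
print(np.round(H, 3))
```

In a filtering setting, the columns of H would serve as the low-dimensional document vectors that are compared against a user profile. The updates use only matrix products, so the same loop could in principle be run on a sparse V, which is the property the abstract highlights for large sparse matrices.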