2002 | OriginalPaper | Chapter
Segmentation and Detection at IBM
Hybrid Statistical Models and Two-tiered Clustering
Authors : S. Dharanipragada, M. Franz, J. S. McCarley, T. Ward, W.-J. Zhu
Published in: Topic Detection and Tracking
Publisher: Springer US
Included in: Professional Book Archive
Activate our intelligent search to find suitable subject content or patents.
Select sections of text to find matching patents with Artificial Intelligence. powered by
Select sections of text to find additional relevant content using AI-assisted search. powered by
IBM’s story segmentation uses a combination of decision tree and maximum entropy models. They take a variety of lexical, prosodic, semantic, and structural features as their inputs. Both types of models are source-specific, and we substantially lower C seg by combining them. IBM’s topic detection system introduces a minimal hierarchy into the clustering: each cluster is comprised of one or more microclusters. We investigate the importance of merging microclusters together, and propose a merging strategy which improves our performance.