2010 | OriginalPaper | Chapter
Detecting Events in a Million New York Times Articles
Authors: Tristan Snowsill, Ilias Flaounas, Tijl De Bie, Nello Cristianini
Published in: Machine Learning and Knowledge Discovery in Databases
Publisher: Springer Berlin Heidelberg
We demonstrate a newly developed method for event detection in text streams on over a million articles from the New York Times corpus. The method is designed to operate in a predominantly on-line fashion, reporting new events within a specified timeframe. It detects events as significant changes in the statistical properties of the text, and those properties are stored and updated efficiently in a suffix tree.
This particular demonstration shows how our method is effective at discovering both short- and long-term events (which are often denoted topics), and how it automatically copes with topic drift on a corpus of 1 035 263 articles.
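The idea of flagging significant changes in text statistics can be illustrated with a minimal sketch. It is not the authors' implementation: it replaces the suffix tree with a plain dictionary of word n-gram counts, and it assumes a pooled two-proportion z-test as the significance measure, which the abstract does not specify. The function names, the bigram setting, and the threshold are all hypothetical.

```python
import math
from collections import Counter


def ngram_counts(docs, n=2):
    """Count word n-grams over a list of documents (plain strings)."""
    counts = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def surprising_ngrams(reference_docs, window_docs, n=2, threshold=3.0):
    """Return (ngram, z) pairs whose relative frequency in the current
    window is significantly higher than in the reference period.

    Assumption: a pooled two-proportion z-test stands in for the
    significance test; a Counter stands in for the suffix tree.
    """
    ref = ngram_counts(reference_docs, n)
    win = ngram_counts(window_docs, n)
    ref_total = sum(ref.values())
    win_total = sum(win.values())
    if ref_total == 0 or win_total == 0:
        return []

    results = []
    for gram, k in win.items():
        r = ref.get(gram, 0)
        # Pooled estimate of the n-gram's probability under "no change".
        p_pool = (k + r) / (win_total + ref_total)
        se = math.sqrt(p_pool * (1 - p_pool) * (1 / win_total + 1 / ref_total))
        if se == 0:
            continue
        z = (k / win_total - r / ref_total) / se
        if z >= threshold:
            results.append((gram, z))

    # Most surprising n-grams first.
    results.sort(key=lambda item: -item[1])
    return results
```

Calling `surprising_ngrams(past_articles, todays_articles)` returns the n-grams that became unusually frequent in the current window. In an on-line setting the reference counts would be updated incrementally as each window closes, rather than recomputed from scratch as they are here.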