2010 | Original Paper | Book Chapter
Extracting 5W1H Event Semantic Elements from Chinese Online News
Authors: Wei Wang, Dongyan Zhao, Lei Zou, Dong Wang, Weiguo Zheng
Published in: Web-Age Information Management
Publisher: Springer Berlin Heidelberg
This paper proposes a verb-driven approach to extracting 5W1H (Who, What, Whom, When, Where and How) event semantic information from Chinese online news. The main contributions of our work are twofold. First, given the usual structure of a news story, we propose a novel algorithm that extracts topic sentences by stressing the importance of the news headline. Second, we extract event facts (i.e. the 5W1H elements) from these topic sentences by combining a rule-based, verb-driven method with a supervised machine-learning method (SVM). This method significantly improves on the predicate-argument structure used in the Automatic Content Extraction (ACE) Event Extraction (EE) task by considering the valency of a Chinese verb, that is, its capacity to govern noun phrases. Extensive experiments on the ACE 2005 datasets confirm the method's effectiveness. The method is also highly scalable, since it considers only the topic sentences and surface text features. Based on this method, we build a prototype system named Chinese News Fact Extractor (CNFE). CNFE is evaluated on a real-world corpus containing 30,000 newspaper documents. Experimental results show that CNFE can extract event facts efficiently.
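The abstract does not spell out the topic-sentence algorithm, so the following is only a minimal sketch of the general idea of weighting the headline, not the authors' method. It assumes character bigrams as a crude stand-in for Chinese word segmentation, and the names `topic_sentences`, `score_sentence` and `headline_weight`, together with the 0.7 weight and the position bonus, are hypothetical choices made for illustration.

```python
import re
from collections import Counter


def char_bigrams(text):
    """Overlapping character bigrams, ignoring punctuation and whitespace.

    Assumption: bigrams approximate words; a real system would use a segmenter.
    """
    chars = re.findall(r"\w", text)
    return Counter(a + b for a, b in zip(chars, chars[1:]))


def split_sentences(body):
    """Split on common Chinese and Western sentence-final punctuation."""
    parts = re.split(r"[。！？!?]+", body)
    return [p.strip() for p in parts if p.strip()]


def score_sentence(sentence, headline_grams, position, headline_weight=0.7):
    """Combine headline overlap with a bonus for early sentences.

    Both the weight and the 1 / (1 + position) bonus are illustrative assumptions.
    """
    grams = char_bigrams(sentence)
    total = sum(headline_grams.values())
    # Fraction of headline bigrams that also occur in the sentence.
    overlap = sum((grams & headline_grams).values()) / total if total else 0.0
    position_score = 1.0 / (1 + position)
    return headline_weight * overlap + (1 - headline_weight) * position_score


def topic_sentences(headline, body, k=1):
    """Return the k highest-scoring sentences of the body."""
    headline_grams = char_bigrams(headline)
    sentences = split_sentences(body)
    ranked = sorted(
        enumerate(sentences),
        key=lambda pair: score_sentence(pair[1], headline_grams, pair[0]),
        reverse=True,
    )
    return [sentence for _, sentence in ranked[:k]]


headline = "北京举行马拉松比赛"
body = "今天天气晴朗。北京今日举行马拉松比赛，三万人参加。比赛于上午结束。"
print(topic_sentences(headline, body, k=1))
```

In this toy example the second sentence shares most of its bigrams with the headline, so it outranks the opening sentence despite the position bonus. The selected sentences would then be passed to the verb-driven rules and the SVM classifier described in the abstract, which this sketch does not attempt to reproduce.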