2012 | OriginalPaper | Book chapter
A New Way of News Extraction by Text Washing and Statistics
Authors: Wang Su, Du Junping, Gao Tian
Publisher: Springer Berlin Heidelberg
Most previous information extraction (IE) work relies on analyzing the DOM tree of the HTML file. When hundreds of information sources must be extracted in a specific domain such as news, this approach leads to decreased accuracy. Based on the features of news articles, this paper proposes a new way to obtain the desired news content by washing out noise information and applying text group statistics. The experiment demonstrated the effectiveness of the algorithm.
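
The abstract does not spell out the washing or grouping steps, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that "washing" means stripping scripts, styles, comments, and tags, and that "text group statistics" can be approximated by grouping consecutive long text lines and keeping the group with the most characters. The names `wash`, `extract_main_text`, and the threshold `MIN_LINE_CHARS` are hypothetical.

```python
import re
from html import unescape

# Hypothetical threshold (an assumption, not from the paper):
# minimum number of characters for a line to count as article text.
MIN_LINE_CHARS = 40


def wash(html: str) -> list[str]:
    """Strip markup noise and return the non-empty text lines."""
    # Drop script and style blocks, and HTML comments, entirely.
    html = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", " ", html)
    html = re.sub(r"(?s)<!--.*?-->", " ", html)
    # Turn common block-level tag boundaries into line breaks.
    html = re.sub(
        r"(?i)</?(p|div|br|li|h[1-6]|tr|td|table|ul|ol)\b[^>]*>", "\n", html
    )
    # Remove every remaining tag and decode character entities.
    text = unescape(re.sub(r"<[^>]+>", "", html))
    lines = [re.sub(r"\s+", " ", ln).strip() for ln in text.split("\n")]
    return [ln for ln in lines if ln]


def extract_main_text(html: str) -> str:
    """Group consecutive long lines; return the group with the most characters."""
    groups, current = [], []
    for line in wash(html):
        if len(line) >= MIN_LINE_CHARS:
            current.append(line)
        elif current:
            # A short line (menu item, footer, etc.) closes the current group.
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    if not groups:
        return ""
    best = max(groups, key=lambda g: sum(len(ln) for ln in g))
    return "\n".join(best)
```

Because this sketch never builds a DOM tree, it does not depend on any one site's page template, which is the property the abstract emphasizes for handling many news sources. A real system would need a tuned threshold and more careful noise rules.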