2015 | OriginalPaper | Chapter
On Designing an Effective Training Set for Information Extraction
Authors: Young-Min Kim, Sa-kwang Song, Sungho Shin, Choong-Nyoung Seon, Seunggyun Hong, Hanmin Jung
Published in: Computer Science and its Applications
Publisher: Springer Berlin Heidelberg
Training set design has received little attention from academia relative to its significance, yet it becomes crucial in big data environments. We propose a novel way to construct a training set for information extraction. The core of the proposed approach is an effective data collection strategy that accounts for the trade-off between system quality and annotation difficulty. Instead of collecting data at random, as systems usually do, we use well-defined key expressions as sampling queries. This work is part of an ongoing R&D project; manual annotation is currently in progress, and the resulting training set will be evaluated through the quality of the final system.
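To make the contrast between random collection and query-based collection concrete, the following is a minimal Python sketch. It is not the authors' implementation: the function names, the toy corpus, the example key expressions, and the use of case-insensitive substring matching as the "query" are all assumptions made for illustration, since the abstract does not specify how key expressions are matched.

```python
import random

# Toy corpus (hypothetical data, for illustration only).
corpus = [
    "Acme acquired Beta Corp in 2010.",
    "The weather was mild that week.",
    "Gamma Labs was founded by two engineers.",
    "Delta acquired Epsilon last year.",
]


def sample_randomly(sentences, k, seed=0):
    """Baseline: draw up to k sentences uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(sentences, min(k, len(sentences)))


def sample_by_key_expressions(sentences, key_expressions, per_query, seed=0):
    """Collect up to per_query sentences for each key expression.

    Assumption: a key expression "matches" a sentence when it occurs
    as a case-insensitive substring. A sentence is selected at most once.
    """
    rng = random.Random(seed)
    selected = set()
    for expression in key_expressions:
        needle = expression.lower()
        # Candidate sentences that match this query and are not yet selected.
        hits = [
            i for i, sentence in enumerate(sentences)
            if needle in sentence.lower() and i not in selected
        ]
        rng.shuffle(hits)
        selected.update(hits[:per_query])
    # Return the selected sentences in their original corpus order.
    return [sentences[i] for i in sorted(selected)]


candidates = sample_by_key_expressions(
    corpus, ["acquired", "founded by"], per_query=5
)
```

In this sketch, the sentence about the weather is never selected because it matches no key expression, whereas a random draw may spend annotation effort on it. Lowering `per_query` caps how many sentences each expression contributes, which is one simple way to limit annotation cost.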