2007 | OriginalPaper | Chapter
Named Entities in Czech: Annotating Data and Developing NE Tagger
Authors : Magda Ševčíková, Zdeněk Žabokrtský, Oldřich Krůza
Published in: Text, Speech and Dialogue
Publisher: Springer Berlin Heidelberg
Activate our intelligent search to find suitable subject content or patents.
Select sections of text to find matching patents with Artificial Intelligence. powered by
Select sections of text to find additional relevant content using AI-assisted search. powered by
This paper deals with the treatment of Named Entities (NEs) in Czech. We introduce a two-level NE classification. We have used this classification for manual annotation of two thousand sentences, gaining more than 11,000 NE instances. Employing the annotated data and Machine-Learning techniques (namely the top-down induction of decision trees), we have developed and evaluated a software system aimed at automatic detection and classification of NEs in Czech texts.