In this paper we present the concept of an electronic lexicon based on morphological roots. The idea of a root-based lexicon goes back to the traditional linguistic division of a word into a stem and an inflectional suffix. The only difference from a purely linguistic description is that an electronic resource must adapt to the text being analyzed. We assume that the lexicon will be used in the analysis (or synthesis) of written text; we therefore operate on graphemic objects.
We used the lexicon of the inflectional analyzer AMOR as the empirical foundation for the root-based lexicon. In the second part of the paper we describe the automatic conversion of the analyzer's data into the assumed format. The conversion covers the major inflecting parts of speech: nouns, adjectives and verbs. The result is a set of entries based on two-level morphology that carry the full morphological information about each lexeme. In this form, however, no generalization about Polish inflection or root-internal alternations is available. We therefore rebuilt the lexicon of roots. The result is a compressed lexicon that can serve not only inflectional analysis but also applications in word-formation description.