2011 | Original Paper | Book Chapter
CorpusExplorer: Supporting a Deeper Understanding of Linguistic Corpora
Authors: Andrés Esteban, Roberto Therón
Published in: Smart Graphics
Publisher: Springer Berlin Heidelberg
Word trees are a common way of representing frequency information obtained by analyzing natural language data. This article explores their usage and possibilities, and describes the development of an application that visualizes the relative frequencies of 2-grams and 3-grams in Google's "English One Million" corpus, using a two-sided word tree together with sparklines that show usage trends over time. It also discusses how the raw data was processed and trimmed to speed up access to it.
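To make the two ideas in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the n-gram data has already been reduced to hypothetical `(ngram, year, match_count)` tuples and that a per-year total count is available. The function names, the sample rows, and the totals are all invented for illustration. The first function computes the per-year relative frequencies that a sparkline could plot, and the second collects the left and right neighbours of a root word, which is the kind of data a two-sided word tree would need.

```python
from collections import defaultdict


def relative_frequencies(rows, totals_by_year):
    """Return {ngram: {year: count / total for that year}}.

    rows is an iterable of (ngram, year, match_count) tuples (assumed layout).
    Years without a known total are skipped.
    """
    series = defaultdict(dict)
    for ngram, year, count in rows:
        total = totals_by_year.get(year)
        if total:
            series[ngram][year] = count / total
    return dict(series)


def two_sided_children(rows, root):
    """Sum 2-gram counts over all years for words before and after root.

    Returns (left, right): left maps preceding words to counts,
    right maps following words to counts.
    """
    left = defaultdict(int)
    right = defaultdict(int)
    for ngram, _year, count in rows:
        words = ngram.split()
        if len(words) != 2:
            continue  # this sketch only handles 2-grams
        first, second = words
        if second == root:
            left[first] += count
        if first == root:
            right[second] += count
    return dict(left), dict(right)


# Made-up sample data, for illustration only.
rows = [
    ("smart graphics", 2000, 30),
    ("smart graphics", 2001, 50),
    ("computer graphics", 2000, 70),
    ("graphics card", 2001, 20),
]
totals = {2000: 1000, 2001: 2000}

freqs = relative_frequencies(rows, totals)
left, right = two_sided_children(rows, "graphics")
```

Dividing by a per-year total rather than using raw counts keeps years with more scanned text from dominating the trend line. Whether the application normalizes in exactly this way is not stated in the abstract.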