
2011 | OriginalPaper | Book chapter

11. Typology by Means of Language Networks: Applying Information Theoretic Measures to Morphological Derivation Networks

Written by: Olga Abramov, Tatiana Lokot

Published in: Towards an Information Theory of Complex Networks

Publisher: Birkhäuser Boston


Abstract

In this chapter we present a network-theoretic approach to linguistics. In particular, we introduce a network model of derivational morphology in languages. We focus on suffixation as a mechanism to derive new words from existing ones. We induce networks from natural language data consisting of words, derivation suffixes and parts of speech (PoS) as well as the relations between them. By measuring the entropy of these networks by means of so-called information functionals, we aim to capture the variation between typologically different languages. In doing so, we rely on the work of Dehmer (Appl Math Comput 201:82–94, 2008), who introduced a framework for measuring the entropy of graphs. In addition, we compare several entropy measures recently presented for graphs. We check whether these measures allow us to distinguish between language networks on the one hand and random networks on the other. We found that linguistic variation among languages can be captured by investigating the topology of the underlying networks. Further, information functionals based on distributions of topological properties turned out to be better discriminators than those based on properties of single vertices.
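To make the idea of measuring graph entropy through an information functional more concrete, here is a minimal sketch; it is not the chapter's implementation. It assumes that the entropy is the Shannon entropy of the probability distribution obtained by normalizing a vertex functional f over all vertices, and it uses the vertex degree as a stand-in functional. The names `graph_entropy`, `cycle` and `star` are hypothetical and chosen only for illustration.

```python
import math


def graph_entropy(adjacency, functional=None):
    """Shannon entropy (natural log) of the vertex distribution induced by
    a vertex functional.

    adjacency:  dict mapping each vertex to an iterable of its neighbours.
    functional: callable vertex -> non-negative number. Assumption: if it
                is omitted, the vertex degree is used as a simple stand-in.
    """
    if functional is None:
        functional = lambda v: len(adjacency[v])
    values = [functional(v) for v in adjacency]
    total = sum(values)
    if total <= 0:
        raise ValueError("the functional must be positive on some vertex")
    # p(v) = f(v) / sum_u f(u); zero-probability vertices contribute nothing
    probs = [x / total for x in values if x > 0]
    return -sum(p * math.log(p) for p in probs)


# Two toy graphs on four vertices (hypothetical examples).
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}  # every degree is 2
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}         # one hub, three leaves

print(graph_entropy(cycle))  # uniform distribution, so this equals ln 4
print(graph_entropy(star))   # less uniform, so the entropy is lower
```

Under this assumed degree functional, a regular graph attains the maximum entropy ln |V|, while a hub-dominated graph scores lower; other functionals can be passed in through the `functional` argument.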


Footnotes
1
In fact, only 4 nodes of 136 have a degree > 1.
 
2
Dehmer [12] uses log to calculate the entropy. We use ln for all functionals; this has no impact on the final relative entropy values (see the definition below).
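Why the base of the logarithm cancels can be sketched as follows. This is an illustration under an assumption that this excerpt does not confirm, namely that the relative entropy is the entropy divided by its maximum value for a graph with |V| vertices. The symbols p_i (vertex probabilities), b (logarithm base) and H_b are introduced here only for the sketch.

```latex
\begin{align*}
H_b(G) &= -\sum_{i=1}^{|V|} p_i \log_b p_i
        = \frac{1}{\ln b}\Bigl(-\sum_{i=1}^{|V|} p_i \ln p_i\Bigr)
        = \frac{H_e(G)}{\ln b},\\
H_b^{\max} &= \log_b |V| = \frac{\ln |V|}{\ln b},\\
\frac{H_b(G)}{H_b^{\max}} &= \frac{H_e(G)/\ln b}{\ln |V|/\ln b}
        = \frac{H_e(G)}{\ln |V|}.
\end{align*}
```

Changing the base rescales numerator and denominator by the same factor 1/ln b, so the ratio is unchanged.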
 
3
We selected these combinations since they performed best in the parameter study shown in Table 11.6.
 
4
ER graphs are connected, undirected random graphs [15] of the cardinality of the German and English networks. BA [5] and WA [39] graphs are randomly generated small-world graphs of the cardinality of the German network. We generate ten graphs of each kind of random network (i.e., ten graphs for ER, ten for BA, etc.) and compare the averaged entropy values.
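The procedure in this footnote can be sketched for the ER case as follows. This is a rough illustration, not the authors' code. It assumes a G(n, m) model with rejection of disconnected samples and a degree-based entropy as a stand-in measure. The sizes n and m below are placeholders, since the actual cardinalities of the language networks are not given here, and all function names are hypothetical.

```python
import math
import random


def er_graph(n, m, rng):
    """G(n, m) random graph: m distinct undirected edges chosen uniformly."""
    all_pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]
    adjacency = {v: set() for v in range(n)}
    for u, v in rng.sample(all_pairs, m):
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def is_connected(adjacency):
    """Depth-first search from an arbitrary vertex reaches every vertex."""
    start = next(iter(adjacency))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adjacency)


def sample_connected_er(n, m, runs=10, seed=0, max_tries=10_000):
    """Draw `runs` connected ER graphs by rejecting disconnected samples."""
    rng = random.Random(seed)
    graphs = []
    for _ in range(max_tries):
        g = er_graph(n, m, rng)
        if is_connected(g):
            graphs.append(g)
            if len(graphs) == runs:
                return graphs
    raise RuntimeError("too few connected graphs; try a larger m")


def degree_entropy(adjacency):
    """Assumed stand-in measure: entropy (ln) of the normalized degrees."""
    total = sum(len(nb) for nb in adjacency.values())
    probs = [len(nb) / total for nb in adjacency.values() if nb]
    return -sum(p * math.log(p) for p in probs)


def mean_entropy(graphs):
    """Average the entropy over a sample of graphs."""
    return sum(degree_entropy(g) for g in graphs) / len(graphs)


# Placeholder sizes; replace with the vertex and edge counts of a real network.
graphs = sample_connected_er(n=20, m=50, runs=10, seed=1)
print(mean_entropy(graphs))
```

Averaging over ten samples reduces the influence of any single random graph on the comparison; the BA and WA baselines would need their own generators, which are not sketched here.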
 
5
See [21] for details.
 
References
1.
Altmann, G., Lehfeldt, W.: Allgemeine Sprachtypologie. Wilhelm Fink, Germany (1973)
2.
Aronoff, M.: Word Formation in Generative Grammar. MIT, Cambridge (1976)
3.
Baayen, H.: Quantitative Aspects of Morphological Productivity. In: Geert Booij, J.M. (ed.) Yearbook of Morphology, pp. 109–149. Kluwer, Dordrecht, Boston, London (1991)
4.
Baayen, H.: On frequency, transparency, and productivity. Yearbook of Morphology 1992, pp. 181–208 (1992)
6.
Bauer, L.: Morphological Productivity. Cambridge University Press, Cambridge (2001)
7.
Bertinetto, P.M., Noccetti, S.: Prolegomena to ATAM acquisition. Theoretical premises and corpus labeling. Quaderni del Laboratorio di Linguistica della SNS n.6 ns. (2006)
8.
Bonchev, D., Rouvray, D.H.: Complexity in Chemistry, Biology, and Ecology. Mathematical and Computational Chemistry. Springer, New York (2005)
9.
Brandes, U.: A faster algorithm for betweenness centrality. J. Math. Sociol. 25, 163–177 (2001)
10.
Bybee, J.L.: Morphology as Lexical Organization, Chap. 7, pp. 119–141. Academic, London (1988)
11.
Clahsen, H., Sonnenstuhl, I., Blevins, J.P.: Derivational morphology in the German mental lexicon: a dual mechanism account. In: Baayen, H., Schreuder, R. (eds.), Morphological Structure in Language Processing, Mouton de Gruyter, pp. 125–155, 2006 (2003)
12.
Dehmer, M.: Information processing in complex networks: Graph entropy and information functionals. Appl. Math. Comput. 201, 82–94 (2008)
13.
Dehmer, M., Varmuza, K., Borgert, S., Emmert-Streib, F.: On entropy-based molecular descriptors: statistical analysis of real and synthetic chemical structures. J. Chem. Inform. Model. 49(7), 1655–1663 (2009)
14.
Dressler, W.U., Karpf, A.: The theoretical relevance of pre- and protomorphology in language acquisition. Yearbook of Morphology 1994, pp. 99–122 (1995)
15.
Erdős, P., Rényi, A.: On random graphs. Publicationes Mathematicae 6, 290–297 (1959)
16.
Evert, S., Lüdeling, A.: Measuring Morphological Productivity: Is Automatic Preprocessing Sufficient? In: Rayson, P., Wilson, A., McEnery, T., Hardie, A., Khoja, S. (eds.) Proceedings of the Corpus Linguistics 2001 conference, pp. 167–175. Lancaster (2001)
17.
Ferrer i Cancho, R., Mehler, A., Pustylnikov, O., Díaz-Guilera, A.: Correlations in the organization of large-scale syntactic dependency networks. In: TextGraphs-2: Graph-Based Algorithms for Natural Language Processing, pp. 65–72 (2007)
18.
Ferrer i Cancho, R., Solé, R.V., Köhler, R.: Patterns in syntactic dependency networks. Phys. Rev. E 69, 051915 (2004)
19.
Freeman, L.C.: Centrality in social networks conceptual clarification. Soc. Network. 1(3), 215–239 (1978-1979)
20.
Habermann, M.: Verbale Wortbildung um 1500. Eine historisch-synchrone Untersuchung anhand von Texten Albrecht Dürers, Heinrich Deichlers und Veit Dietrichs. de Gruyter, Berlin (1994)
21.
Hotho, A., Nürnberger, A., Paaß, G.: A Brief Survey of Text Mining. J. Lang. Technol. Comput. Ling. (JLCL) 20(1), 19–62 (2005)
22.
Köhler, R.: Zur linguistischen Synergetik: Struktur und Dynamik der Lexik. Brockmeyer, Bochum (1986)
23.
Konstantinova, E.V.: On some applications of information indices in chemical graph theory. In: General Theory of Information Transfer and Combinatorics. Springer, New York (2006)
24.
Konstantinova, E.V., Paleev, A.A.: Sensitivity of topological indices of polycyclic graphs (Russian). Vichisl. Systemy 136, 38–48 (1990)
25.
Liu, H.: The complexity of Chinese syntactic dependency networks. Phys. A 387, 3048–3058 (2008)
26.
Mehler, A.: Structural similarities of complex networks: A computational model by example of wiki graphs. Appl. Artif. Intell. 22, 619–683 (2008)
27.
Mehler, A.: A quantitative graph model of social ontologies by example of Wikipedia. In: Dehmer, M., Emmert-Streib, F., Mehler, A. (eds.) Towards an Information Theory of Complex Networks: Statistical Methods and Applications. Birkhäuser, Boston/Basel (2011)
28.
Mehler, A., Pustylnikov, O., Diewald, N.: Geography of social ontologies: Testing a variant of the Sapir-Whorf hypothesis in the context of Wikipedia. Comput. Speech Lang. 25(3), 716–740 (2011)
29.
Mehler, A., Lücking, A., Weiß, P.: A network model of interpersonal alignment. Entropy 12(6), 1440–1483 (2010)
30.
Plag, I.: Morphological Productivity. Structural Constraints in English Derivation. Mouton de Gruyter, Berlin/New York (1999)
31.
Prell, H.P.: Die Ableitung von Verben aus Substantiven in biblischen und nichtbiblischen Texten des Frühneuhochdeutschen. Lang, Frankfurt am Main (1991)
32.
Pustylnikov, O.: Modeling learning of derivation morphology in a multi-agent simulation. In: Proceedings of IEEE Africon 2009. IEEE (2009)
33.
Abramov, O., Mehler, A.: Automatic Language Classification by Means of Syntactic Dependency Networks. Journal of Quantitative Linguistics (2011)
34.
Pustylnikov, O., Schneider-Wiejowski, K.: Measuring morphological productivity. Studies in Quantitative Linguistics 5: Issues in Quantitative Linguistics, pp. 106–125 (2009)
35.
Schneider-Wiejowski, K.: Sprachwandel anhand von Produktivitätsverschiebungen in der schweizerdeutschen Derivationsmorphologie. Linguistik online 38 (2009)
36.
Schultink, H.: Produktiviteit als Morfologisch Fenomeen. Forum der Letteren 2, 110–125 (1961)
37.
Shannon, C.E., Weaver, W.: The Mathematical Theory of Communication. University of Illinois Press, Urbana, IL, USA (1997)
38.
39.
Watts, D.J., Strogatz, S.H.: Collective dynamics of ‘small-world’ networks. Nature 393, 440–442 (1998)
Metadata
Title
Typology by Means of Language Networks: Applying Information Theoretic Measures to Morphological Derivation Networks
Written by
Olga Abramov
Tatiana Lokot
Copyright year
2011
Publisher
Birkhäuser Boston
DOI
https://doi.org/10.1007/978-0-8176-4904-3_11