
2016 | Book


Towards a Theoretical Framework for Analyzing Complex Linguistic Networks

Edited by: Alexander Mehler, Andy Lücking, Sven Banisch, Philippe Blanchard, Barbara Job

Publisher: Springer Berlin Heidelberg

Book series: Understanding Complex Systems

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"


About this book

The aim of this book is to advocate and promote network models of linguistic systems that are both based on thorough mathematical models and substantiated in terms of linguistics. In this way, the book contributes first steps towards establishing a statistical network theory as a theoretical basis of linguistic network analysis at the border of the natural sciences and the humanities. This book addresses researchers who want to become familiar with theoretical developments, computational models and their empirical evaluation in the field of complex linguistic networks. It is intended for all those who are interested in statistical models of linguistic systems from the point of view of network research. This includes all relevant areas of linguistics, ranging from phonological, morphological and lexical networks on the one hand to syntactic, semantic and pragmatic networks on the other. In this sense, the volume concerns readers from many disciplines such as physics, linguistics, computer science and information science. It may also be of interest to the emerging area of systems biology, with which the chapters collected here share a view of systems from the perspective of network analysis.

Table of contents

Frontmatter

Cognition

Frontmatter

Language Networks as Models of Cognition: Understanding Cognition through Language

Abstract

Language is inherently cognitive and distinctly human. Separating the object of language from the human mind that processes and creates language fails to capture the full language system. Linguistics traditionally has focused on the study of language as a static representation, removed from the human mind. Network analysis has traditionally been focused on the properties and structure that emerge from network representations. Both disciplines could gain from looking at language as a cognitive process. In contrast, psycholinguistic research has focused on the process of language without committing to a representation. However, by considering language networks as approximations of the cognitive system we can take the strength of each of these approaches to study human performance and cognition as related to language. This paper reviews research showcasing the contributions of network science to the study of language. Specifically, we focus on the interplay of cognition and language as captured by a network representation. To this end, we review different types of language network representations before considering the influence of global level network features. We continue by considering human performance in relation to network structure and conclude with theoretical network models that offer potential and testable explanations of cognitive and linguistic phenomena.

Nicole M. Beckage, Eliana Colunga

Path-Length and the Misperception of Speech: Insights from Network Science and Psycholinguistics

Abstract

Using the analytical methods of network science, we examined what could be retrieved from the lexicon when a spoken word is misperceived. To simulate misperceptions in the laboratory, we used the phonological associate task, a variant of the semantic associates task, in which participants heard an English word and responded with the first word that came to mind that sounded like the word they heard. This allowed us to examine what people actually retrieve from the lexicon when a spoken word is misperceived. Most responses were 1 link away from the stimulus word in the lexical network. Distant neighbors (words >1 link away) were provided more often as responses when the stimulus word had low rather than high degree. Finally, even very distant neighbors tended to be connected to the stimulus word by a path in the lexical network. These findings have implications for the processing of spoken words, and highlight the valuable insights that can be obtained by combining the analytic tools of network science with the experimental tasks of psycholinguistics.

Michael S. Vitevitch, Rutherford Goldstein, Elizabeth Johnson
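
The notions of "links away" and "degree" used in this abstract can be illustrated with a minimal sketch. It assumes, as an illustration only, that two words are linked when they differ by a single substitution, insertion, or deletion, and it uses letters as a stand-in for phonemes. The word list and the names `one_edit_apart`, `build_network`, and `path_length` are hypothetical and not taken from the chapter.

```python
from collections import deque

def one_edit_apart(a, b):
    """True if a and b differ by exactly one substitution, insertion, or deletion."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    short, long_ = sorted((a, b), key=len)
    # Deleting one symbol from the longer word must yield the shorter one.
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))

def build_network(words):
    """Map each word to the set of its one-edit neighbors (assumed link rule)."""
    return {w: {v for v in words if one_edit_apart(w, v)} for w in words}

def path_length(network, source, target):
    """Number of links on a shortest path (breadth-first search), or None if unreachable."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in network[node] - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return None

# Toy lexicon (hypothetical); letters stand in for phonemes.
words = ["cat", "hat", "hot", "dot", "at", "zebra"]
net = build_network(words)
degree = {w: len(neighbors) for w, neighbors in net.items()}
```

In this toy lexicon, `path_length(net, "cat", "hat")` is one link, `path_length(net, "cat", "dot")` is three links, and `"zebra"` is unreachable from every other word.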

Structure and Organization of the Mental Lexicon: A Network Approach Derived from Syntactic Dependency Relations and Word Associations

Abstract

Semantic networks are often used to represent the meaning of a word in the mental lexicon. To construct a large-scale network for this lexicon, text corpora provide a convenient and rich resource. In this chapter the network properties of a text-based approach are evaluated and compared with a more direct way of assessing the mental content of the lexicon through word associations. This comparison indicates that both approaches highlight different properties specific to linguistic and mental representations. Both types of network are qualitatively different in terms of their global network structure and the content of the network communities. Moreover, behavioral data from relatedness judgments show that language networks do not capture these judgments as well as mental networks.

Simon De Deyne, Steven Verheyen, Gert Storms

Topology

Frontmatter

Network Motifs Are a Powerful Tool for Semantic Distinction

Abstract

Motifs are a general network analysis technique, which statistically relates network structure to epiphenomena on the network. This technique has been developed and brought to maturity in molecular biology, where it has been successfully applied to network-based chemical and biological dynamics of various types. Early on, the motif technique was also successfully applied outside biology, for example to social networks, electrical networks, and many more. Results by Milo et al. showed that the motif signature of a network varies from realm to realm to some extent but is significantly more homogeneous within a realm. This observation has been the starting point of the thread of research presented in this paper. More specifically, we do not compare networks from different realms but focus on networks from a given realm. In several case studies on particular realms, we found that motif signatures suffice to distinguish certain classes of networks from each other. In this paper, we summarize our previous work and present some new results. In particular, in Biemann et al. (2012), we found that natural and artificially generated language can be distinguished from each other through the motif signatures of the co-occurrence graphs. Based on that, we present work on co-occurrence graphs that are restricted to word classes. We found that the co-occurrence graphs of verbs (and other word classes used like predicates) exhibit strongly different motif signatures and can be distinguished on that basis. To demonstrate the general power of the approach, we present further original work on co-authorship networks, peer-to-peer streaming networks, and mailing networks.

Chris Biemann, Lachezar Krumov, Stefanie Roos, Karsten Weihe

Multidimensional Analysis of Linguistic Networks

Abstract

Network-based approaches play an increasingly important role in the analysis of data, even in systems in which a network representation is not immediately apparent. This is particularly true for linguistic networks, which are usually induced from a linguistic data set for which a network perspective is only one of several options for representation. Here we introduce a multidimensional framework for network construction and analysis with special focus on linguistic networks. This framework is used to show that the higher the abstraction level of network induction, the harder the interpretation of the topological indicators used in network analysis. Several examples are provided, allowing for the comparison of different linguistic networks with each other as well as with networks in other fields of application of network theory. The computation and the intelligibility of some statistical indicators frequently used in linguistic networks are discussed. The discussion suggests that the field of linguistic networks, by applying statistical tools inspired by network studies in other domains, may, in its current state, make only a limited contribution to the development of linguistic theory.

Tanya Araújo, Sven Banisch

Semantic Space as a Metapopulation System: Modelling the Wikipedia Information Flow Network

Abstract

The meaning of a word can be defined as an indefinite set of interpretants, which are other words that circumscribe the semantic content of the word they represent (Derrida 1982). In the same way each interpretant has a set of interpretants representing it and so on. Hence the indefinite chain of meaning assumes a rhizomatic shape that can be represented and analysed via the modern techniques of network theory (Dorogovtsev and Mendes 2013).

A. Paolo Masucci, Alkiviadis Kalampokis, Víctor M. Eguíluz, Emilio Hernández-García

Are Word-Adjacency Networks Networks?

Abstract

This article discusses the question of whether word-adjacency relationships are well-represented by a complex network. The main hypothesis of this work is that network representations are best suited to analyze indirect effects. For an indirect effect to occur in a network, a network process needs to exist that uses the network to exert an indirect effect, e.g., the spreading of a virus in a social network after a small group of persons were infected. Given any sequence of words, it can be represented by a so-called word-adjacency network by representing each word by a node and by connecting two nodes if the corresponding words are directly adjacent or at least close to each other in this sequence. It can be easily seen that the result of a speech production process gives rise to a word-adjacency network but it is unlikely that speech production uses an underlying word-adjacency network—at least not in any easily describable way. Thus, the results of clustering algorithms, centrality index values, and the results of other distance-based measures that quantify indirect effects cannot be interpreted with respect to speech production.

Katharina Anna Zweig
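
The construction described in this abstract (one node per word, a link between words that are adjacent or close in the sequence) can be sketched as follows. This is a minimal illustration, not the chapter's implementation: the function name `word_adjacency_network`, the `window` parameter, and the example sentence are assumptions, and the network is taken to be undirected and unweighted.

```python
from collections import defaultdict

def word_adjacency_network(tokens, window=1):
    """Build an undirected word-adjacency network as a dict of neighbor sets.

    Two words are linked if they occur at most `window` positions apart;
    window=1 links only directly adjacent words.
    """
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        graph[word]  # touch the key so every word appears as a node
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != word:  # ignore self-loops from repeated words
                graph[word].add(other)
                graph[other].add(word)
    return dict(graph)

# Hypothetical example sequence.
tokens = "the cat saw the dog".split()
net = word_adjacency_network(tokens)
wide = word_adjacency_network(tokens, window=2)
```

Note that the sketch only records the outcome of a word sequence. Consistent with the chapter's argument, it says nothing about whether speech production actually operates on such a network.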

Syntax

Frontmatter

Syntactic Complex Networks and Their Applications

Abstract

We present a review of the development and the state of the art of syntactic complex network analysis. Some characteristics of such networks and problems connected with their construction are mentioned. Relations between global network indicators and specific language properties are discussed. Applications of syntactic networks (language acquisition, language typology) are described.

Radek Čech, Ján Mačutek, Haitao Liu

Function Nodes in Chinese Syntactic Networks

Abstract

Based on two syntactic dependency networks derived from two Chinese treebanks of different registers, a statistical study is conducted regarding word frequency and distributions. We chose three grammatical (function) words as our research objects and analyzed their network features, including degree, out-degree, in-degree, closeness, in-closeness, out-closeness and betweenness. Then we removed these three word nodes from the networks to see what consequences follow for the number of vertices, average degree, average path length, diameter, the number of isolated vertices, domain and density. The results showed that all three function words are central nodes of the Chinese syntactic networks but have different status, since their influence on the overall structure is quite different. The research provides not only a new way of studying Chinese function words but also a method for examining the influence of node characteristics on a complex network.

Xinying Chen, Haitao Liu

Non-crossing Dependencies: Least Effort, Not Grammar

Abstract

The use of null hypotheses (in a statistical sense) is common in hard sciences but not in theoretical linguistics. Here the null hypothesis that the low frequency of syntactic dependency crossings is expected under an arbitrary ordering of words is rejected. It is shown that this would require star dependency structures, which are both unrealistic and too restrictive. The hypothesis of the limited resources of the human brain is revisited. Stronger null hypotheses taking into account actual dependency lengths for the likelihood of crossings are presented. Those hypotheses suggest that crossings are likely to decrease when dependencies are shortened. A hypothesis based on pressure to reduce dependency lengths is more parsimonious than a principle of minimization of crossings or a grammatical ban that is totally dissociated from the general and non-linguistic principle of economy.

Ramon Ferrer-i-Cancho
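
To make the notion of a dependency crossing concrete, here is a minimal sketch that counts crossings for a sentence whose dependencies are given as pairs of word positions. The function name `count_crossings` and the example structures are assumptions for illustration, not material from the chapter. The sketch also checks a property the abstract relies on: a star structure has no crossings under any word order, because all of its edges share the hub.

```python
from itertools import permutations

def count_crossings(edges):
    """Count pairs of dependencies that cross.

    `edges` is a list of (i, j) pairs of word positions. Two edges cross
    when exactly one endpoint of one lies strictly inside the span of the
    other and the edges share no endpoint.
    """
    spans = [tuple(sorted(e)) for e in edges]
    crossings = 0
    for a in range(len(spans)):
        for b in range(a + 1, len(spans)):
            (l1, r1), (l2, r2) = spans[a], spans[b]
            if l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1:
                crossings += 1
    return crossings

# Hypothetical examples: one crossing pair, and a chain with none.
crossing_pair = count_crossings([(0, 2), (1, 3)])
chain = count_crossings([(0, 1), (1, 2), (2, 3)])

# Star tree on 5 words: word 0 is the hub, linked to every other word.
# `order[k]` is the sentence position assigned to word k.
star_max = max(
    count_crossings([(order[0], order[k]) for k in range(1, 5)])
    for order in permutations(range(5))
)
```

Here `crossing_pair` is 1, while `chain` and `star_max` are both 0.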

Dynamics

Frontmatter

Simulating the Effects of Cross-Generational Cultural Transmission on Language Change

Abstract

Language evolves in a socio-cultural environment. Apart from biological evolution and individual learning, cultural transmission also exerts an important influence on many aspects of language evolution. In this paper, based on the lexicon-syntax coevolution model, we extend the acquisition framework in our previous work to examine the roles of three forms of cultural transmission spanning the offspring, parent, and grandparent generations in language change. These transmissions are: those between the parent and offspring generations (PO), those within the offspring generation (OO), and those between the grandparent and offspring generations (GO). The simulation results of the considered model and relevant analyses illustrate not only the necessity of PO and OO transmissions for language change, thus echoing our previous findings, but also the importance of GO transmission, a form of cross-generational cultural transmission, for preserving the mutual understandability of the communal language across generations of individuals.

Tao Gong, Lan Shuai

Social Networks and Beyond in Language Change

Abstract

We examine the effects of heterogeneous social interactions in a numerical model of language change based on the evolutionary utterance-based theory developed by Croft. Two or more variants of a linguistic variable compete in the population. Social interactions can be separated into a symmetric weighted network of social contact probabilities, and asymmetric weightings given by speakers to each other’s utterances, that is, social influence. Remarkably, when interactions are symmetric between speakers, the network structure has no effect on the mean time to consensus. On the other hand, large disparities in social influence, even in rather homogeneous networks, can dramatically affect the mean time to reach consensus (fixation). We explore a range of representative scenarios to give a general picture of both aspects of social interactions, in the absence of explicit selection for any particular variant.

Gareth J. Baxter

Emergence of Dominant Opinions in Presence of Rigid Individuals

Abstract

In this chapter, we study the dynamics of the so-called naming game as an opinion formation model with a focus on how the presence of a set of rigid minorities can result in the emergence of a dominant opinion in the system. These rigid minorities are “speaker-only”, i.e., they only “speak” and never “listen”, thus strongly affecting the course of a social agreement process. We show that for a moderate α (fraction of rigid minorities), the agreement dynamics results in the emergence of a dominant opinion. We extensively study the properties of such dominant opinions and observe that dominance is not a characteristic property of the “speaker-only” opinions alone; other opinions can also become dominant under certain circumstances. However, with increasing α, the chances of a “speaker-only” opinion becoming dominant increase. We also find that early-invented opinions have higher chances of becoming dominant. We embed this model on various static interaction topologies and real-world time-varying face-to-face interaction data. Importantly, for a reasonably static societal structure, the presence of rigid minorities influences the emergence of a dominant opinion to a much larger extent than in the case where the societal structure is very dynamic.

Suman Kalyan Maity, Animesh Mukherjee

Resources

Frontmatter

Considerations for a Linguistic Network Markup Language

Abstract

As the previous chapters have shown, the possible ways of representing linguistic data as a graph are as diverse as the data itself. For the process of graph modeling, the decision as to what information will be represented as nodes and what information as relations is of great importance. Equally important are what added value is expected from representing the data as a graph and what kinds of scientific questions the model should be able to answer.

Maik Stührenberg, Nils Diewald, Rüdiger Gleim

Linguistic Networks – An Online Platform for Deriving Collocation Networks from Natural Language Texts

Abstract

This section describes the Linguistic Networks System (LNS). Its primary goal is to allow users to explore texts from a network-oriented perspective. One aim is to let researchers, especially from the area of historical semantics (Jussen et al. 2007), reveal particularities of the underlying texts that are hardly accessible otherwise.

Alexander Mehler, Rüdiger Gleim

Backmatter

Title: Towards a Theoretical Framework for Analyzing Complex Linguistic Networks
Edited by: Alexander Mehler
Andy Lücking
Sven Banisch
Philippe Blanchard
Barbara Job
Publisher: Springer Berlin Heidelberg
Electronic ISBN: 978-3-662-47238-5
Print ISBN: 978-3-662-47237-8
DOI: https://doi.org/10.1007/978-3-662-47238-5