Top

2011 | Book

Read chapter Read first chapter

Towards an Information Theory of Complex Networks

Statistical Methods and Applications

Editors: Matthias Dehmer, Frank Emmert-Streib, Alexander Mehler

Publisher: Birkhäuser Boston

Part of: Springer Professional "Wirtschaft+Technik" , Springer Professional "Technik" , Springer Professional "Wirtschaft"

About this book

For over a decade, complex networks have steadily grown as an important tool across a broad array of academic disciplines, with applications ranging from physics to social media. A tightly organized collection of carefully-selected papers on the subject, Towards an Information Theory of Complex Networks: Statistical Methods and Applications presents theoretical and practical results about information-theoretic and statistical models of complex networks in the natural sciences and humanities. The book's major goal is to advocate and promote a combination of graph-theoretic, information-theoretic, and statistical methods as a way to better understand and characterize real-world networks.

This volume is the first to present a self-contained, comprehensive overview of information-theoretic models of complex networks with an emphasis on applications. As such, it marks a first step toward establishing advanced statistical information theory as a unified theoretical basis of complex networks for all scientific disciplines and can serve as a valuable resource for a diverse audience of advanced students and professional scientists. While it is primarily intended as a reference for research, the book could also be a useful supplemental graduate text in courses related to information science, graph theory, machine learning, and computational biology, among others.

Frontmatter

Chapter 1. Entropy of Digraphs and Infinite Networks

Abstract

The information content of a graph G is defined in Mowshowitz (Bull Math Biophys 30:175–204, 1968) as the entropy of a finite probability scheme associated with the vertex partition determined by the automorphism group of G. This provides a quantitative measure of the symmetry structure of a graph that has been applied to problems in such diverse fields as chemistry, biology, sociology, and computer science (Mowshowitz and Mitsou, Entropy, orbits and spectra of graphs, Wiley-VCH, 2009). The measure extends naturally to directed graphs (digraphs) and can be defined for infinite graphs as well (Mowshowitz, Bull Math Biophys 30:225–240, 1968).This chapter focuses on the information content of digraphs and infinite graphs. In particular, the information content of digraph products and recursively defined infinite graphs is examined.

A. Mowshowitz

Chapter 2. An Information-Theoretic Upper Bound on Planar Graphs Using Well-Orderly Maps

Abstract

This chapter deals with compressed coding of graphs. We focus on planar graphs, a widely studied class of graphs. A planar graph is a graph that admits an embedding in the plane without edge crossings. Planar maps (class of embeddings of a planar graph) are easier to study than planar graphs, but as a planar graph may admit an exponential number of maps, they give little information on graphs. In order to give an information-theoretic upper bound on planar graphs, we introduce a definition of a quasi-canonical embedding for planar graphs: well-orderly maps. This appears to be an useful tool to study and encode planar graphs. We present upper bounds on the number of unlabeled¹ planar graphs and on the number of edges in a random planar graph. We also present an algorithm to compute well-orderly maps and implying an efficient coding of planar graphs.

Nicolas Bonichon, Cyril Gavoille, Nicolas Hanusse

Chapter 3. Probabilistic Inference Using Function Factorization and Divergence Minimization

Abstract

This chapter addresses modeling issues in statistical inference problems. We will focus specifically on factorization model which is a generalization of Markov random fields and Bayesian networks. For any positive function (say an estimated probability distribution), we present a mechanical approach which approximates the function with one in a factorization model that is as simple as possible, subject to an upper bound on approximation error. We also rewrite a probabilistic inference problem into a divergence minimization (DM) problem where iterative algorithms are proposed to solve the DM problem. We prove that the well-known EM algorithm is a special case of our proposed iterative algorithm.

Terence H. Chan, Raymond W. Yeung

Chapter 4. Wave Localization on Complex Networks

Abstract

In this chapter we consider the role played by Anderson localization in transport through complex networks, for example, an optical network. The network is described by a tight binding Hamiltonian, which may be used to determine the properties of the Anderson transition according to the statistical properties of its eigenvalues. The Anderson transition properties of different complex networks will be studied, emphasizing the role played by clustering on the localization of waves. We shall show that new complex topologies lead to novel physics, specifically clustering may lead to localization.

Richard Berkovits, Lukas Jahnke, Jan W. Kantelhardt

Chapter 5. Information-Theoretic Methods in Chemical Graph Theory

Abstract

During recent years, information theory has been used extensively in chemistry for describing chemical structures and providing good correlations between physicochemical and structural properties. In this chapter, we present a survey on information-theoretic methods which are used in chemical graph theory.

Elena Konstantinova

Chapter 6. On the Development and Application of Net-Sign Graph Theory

Abstract

This report briefly describes the development and applications of net-sign graph theory. The current work enunciates the graph (molecule) signature of non-alternant non-benzenoid hydrocarbons with odd member of rings (non-bipartite molecular graphs) based on chemical signed graph theory. Experimental evidences and Hückel spectrum reveal that structure possessing nonbonding molecular orbital (NBMOs) is very unstable and highly reactive under the drastic conditions of low temperature. Chemical signed graph theoretical approach is applied successfully to classify the non-bipartite molecular graphs with a view to Randic’s conjugated circuit models based on their spectral characteristic. The obtained results based on net-sign approach are compared with those obtained using Hückel calculations.

Prabhat K. Sahu, Shyi-Long Lee

Chapter 7. The Central Role of Information Theory in Ecology

Abstract

Information theory (IT) is predicated upon that which largely eludes physics – the absence of something. The capacity for IT to portray both presence and absence in comparable quantitative fashion makes it indispensable to ecology. IT has been applied to ecology along two separate lines: (1) it has been used to quantify the distribution of stocks and numbers of organisms and (2) it has been used to quantify the pattern of interactions of trophic processes. By and large, the first endeavor has resulted in relatively few insights into ecosystem dynamics and has generated much ambiguity and disappointment, so that most ecologists remain highly skeptical about the advisability of applying IT to ecology. By contrast, the second (and less well-known) application has shed light on the possibility that ecosystem behavior is the most palpable example of a purely natural “infodynamics” that transcends classical dynamics, but remains well within the realm of the quantifiable.

Robert E. Ulanowicz

Chapter 8. Inferences About Coupling from Ecological Surveillance Monitoring: Approaches Based on Nonlinear Dynamics and Information Theory

Abstract

Some monitoring programs for ecological resources are developed as components of larger science or management programs and are thus guided by a priori hypotheses. More commonly, ecological monitoring programs are initiated for the purpose of surveillance with no a priori hypotheses in mind. No conceptual framework currently exists to guide the development of surveillance monitoring programs, resulting in substantial debate about program design. We view surveillance monitoring programs as providing information about system dynamics and focus on methods for extracting such information from time series of monitoring data. We briefly describe methods from the general field of nonlinear dynamics that we believe may be useful in extracting information about system dynamics. In looking at the system as a network of locations or components, we emphasize methods for assessing coupling between system components for use in understanding system dynamics and interactions and in detecting changes in system dynamics. More specifically, these methods hold promise for such ecological problems as identifying indicator species, developing informative spatial monitoring designs, detecting ecosystem change and damage, and investigating such topics as population synchrony, species interactions, and environmental drivers. We believe that these ideas and methods provide a useful conceptual framework for surveillance monitoring and can be used with model systems to draw inferences about the design of surveillance monitoring programs. In addition, some of the current methods should be useful with some actual ecological monitoring data, and methodological extensions and modifications should increase the applicability of these approaches to additional sources of actual ecological data.

L. J. Moniz, J. D. Nichols, J. M. Nichols, E. G. Cooch, L. M. Pecora

Chapter 9. Markov Entropy Centrality: Chemical, Biological, Crime, and Legislative Networks

Abstract

In this chapter, we propose the study of multiple systems using node centrality or connectedness information measures derived from a Graph or Complex Network. The information is quantified in terms of the Entropy centrality ^kC _θ(j) of the jth parts or states (nodes) of a Markov Chain associated with the system, represented by a network graph. The procedure is standard for all systems despite the complexity of the system. First, we define the phenomena to study, ranging from molecular systems composed by single molecules (drug activity, drug toxicity), multiple molecules (networks of chemical reactions), and macromolecules (DNA–drug interaction, protein function), to ecological systems (bacterial co-aggregation), or social systems (criminal causation, legislative productivity). Second, we collect several cases from literature (drugs, chemical reactions, proteins, bacterial species, or criminal cases). Next, we classify the cases in at least two different groups (active/nonactive drugs, enantioselective/non-enantioselective reactions, functional/nonfunctional proteins, co-aggregating/non-co-aggregating bacteria, or crime/noncrime cause, efficient/nonefficient law). After that, we represent the interconnectivity of the discrete parts of the system (atoms, amino acids, reactants, bacteria species, or people) as a graph or network. The Markov Chain theory is used to calculate the entropy of the system for nodes placed at different distances. Finally, we aim to both derive and validate a classification model using the entropy values as input variables and the classification of cases as the output variables. The model is used to predict the probability with which a case presents the studied property. The present work proposes the entropy of a Markov Chain associated with a network or graph to be used as a universal quantity in pattern recognition regardless the chemical, biological, social, or other nature of the systems under study.

C. R. Munteanu, J. Dorado, Alejandro Pazos-Sierra, F. Prado-Prado, L. G. Pérez-Montoto, S. Vilar, F. M. Ubeira, A. Sanchez-Gonzaléz, M. Cruz-Monteagudo, S. Arrasate, N. Sotomayor, E. Lete, A. Duardo-Sánchez, A. Díaz-López, G. Patlewicz, H. González-Díaz

Chapter 10. Social Ontologies as Generalized Nearly Acyclic Directed Graphs: A Quantitative Graph Model of Social Tagging

Abstract

In this paper, we introduce a quantitative graph model of social ontologies as exemplified by the category system of Wikipedia. This is done to contrast structure formation in distributed cognition with classification schemes (by example of the DDC and MeSH), formal ontologies (by example of OpenCyc and SUMO), and terminological ontologies (as exemplified by WordNet). Our basic findings are that social ontologies have a characteristic topology that clearly separates them from other types of ontologies. In this context, we introduce the notion of a Zipfian bipartivity to analyze the relationship of categories and categorized units in distributed cognition.

Alexander Mehler

Chapter 11. Typology by Means of Language Networks: Applying Information Theoretic Measures to Morphological Derivation Networks

Abstract

In this chapter we present a network theoretic approach to linguistics. In particular, we introduce a network model of derivational morphology in languages. We focus on suffixation as a mechanism to derive new words from existing ones. We induce networks of natural language data consisting of words, derivation suffixes and parts of speech (PoS) as well as the relations between them. Measuring the entropy of these networks by means of so called information functionals we aim at capturing the variation between typologically different languages. In this way, we rely on the work of Dehmer (Appl Math Comput 201:82–94, 2008) who has introduced a framework for measuring the entropy of graphs. In addition, we compare several entropy measures recently presented for graphs. We check whether these measures allow us to distinguish between language networks on the one hand, and random networks on the other.We found out, that linguistic variation among languages can be captured by investigating the topology of the underlying networks. Further, information functionals based on distributions of topological properties turned out to be better discriminators than those that are based on properties of single vertices.

Olga Abramov, Tatiana Lokot

Chapter 12. Information Theory-Based Measurement of Software

Abstract

Abstractions of software may take the form of complex graphs. This chapter presents methods for measurement of hypergraph abstractions of software, using information theory, rather than counting. Even though the context of this chapter is software engineering, the measurement methods are applicable to any hypergraph. The software-metrics literature often assumes that complex software has more bugs than simple software. Thus, if one measures a software system, one can identify which modules are more likely to have bugs. This chapter presents measures of size, complexity, and coupling in terms of the amount of information, building on formal definitions of these software-metric families proposed by Briand, Morasca, and Basili. Kolmogorov complexity and its information theory-based approximations are the foundation for our measure of size. Excess entropy is the foundation for our measures of complexity and coupling. These concepts make the metrics sensitive to the configuration of hypergraph connections, and thus, may be preferred to corresponding counting metrics.

Edward B. Allen

Chapter 13. Fair and Biased Random Walks on Undirected Graphs and Related Entropies

Abstract

The entropy rates of Markov chains (random walks) defined on connected undirected graphs are well studied in many surveys. We study the entropy rates related to the first-passage time probability distributions of fair random walks, their relative (Kullback–Leibler) entropies, and the entropy related to two biased random walks – with the random absorption of walkers and the shortest paths random walks. We show that uncertainty of first-passage times quantified by the entropy rates characterizes the connectedness of the graph. The relative entropy derived for the biased random walks estimates the level of uncertainty between connectivity and connectedness – the local and global properties of nodes in the graph.

Philippe Blanchard, Dimitri Volchenkov

Title: Towards an Information Theory of Complex Networks
Editors: Matthias Dehmer
Frank Emmert-Streib
Alexander Mehler
Publisher: Birkhäuser Boston
Electronic ISBN: 978-0-8176-4904-3
Print ISBN: 978-0-8176-4903-6
DOI: https://doi.org/10.1007/978-0-8176-4904-3

Springer Professional

Towards an Information Theory of Complex Networks

Statistical Methods and Applications

About this book

Table of Contents

Frontmatter

Chapter 1. Entropy of Digraphs and Infinite Networks

Chapter 2. An Information-Theoretic Upper Bound on Planar Graphs Using Well-Orderly Maps

Chapter 3. Probabilistic Inference Using Function Factorization and Divergence Minimization

Chapter 4. Wave Localization on Complex Networks

Chapter 5. Information-Theoretic Methods in Chemical Graph Theory

Chapter 6. On the Development and Application of Net-Sign Graph Theory

Chapter 7. The Central Role of Information Theory in Ecology

Chapter 8. Inferences About Coupling from Ecological Surveillance Monitoring: Approaches Based on Nonlinear Dynamics and Information Theory

Chapter 9. Markov Entropy Centrality: Chemical, Biological, Crime, and Legislative Networks

Chapter 10. Social Ontologies as Generalized Nearly Acyclic Directed Graphs: A Quantitative Graph Model of Social Tagging

Chapter 11. Typology by Means of Language Networks: Applying Information Theoretic Measures to Morphological Derivation Networks

Chapter 12. Information Theory-Based Measurement of Software

Chapter 13. Fair and Biased Random Walks on Undirected Graphs and Related Entropies

Premium Partner