
2023 | Book

Complex Networks and Their Applications XI

Proceedings of The Eleventh International Conference on Complex Networks and Their Applications: COMPLEX NETWORKS 2022 — Volume 1

Edited by: Hocine Cherifi, Rosario Nunzio Mantegna, Luis M. Rocha, Chantal Cherifi, Salvatore Miccichè

Publisher: Springer International Publishing

Book series: Studies in Computational Intelligence


About this book

This book highlights cutting-edge research in the field of network science, offering scientists, researchers, students, and practitioners a unique update on the latest advances in theory and a multitude of applications. It presents the peer-reviewed proceedings of the XI International Conference on Complex Networks and their Applications (COMPLEX NETWORKS 2022). The carefully selected papers cover a wide range of theoretical topics, such as network models and measures; community structure; network dynamics; diffusion, epidemics, and spreading processes; and resilience and control, as well as the main network applications, including social and political networks; networks in finance and economics; biological and neuroscience networks; and technological networks.

Table of Contents

Frontmatter

Information Spreading in Social Media

Frontmatter
Cognitive Cascades within Media Ecosystems: Simulating Fragmentation, Selective Exposure and Media Tactics to Investigate Polarization

This work introduces a simple extension to the recent Cognitive Cascades model of Rabb et al., adding multiple media agents to begin investigating how the media ecosystem might influence the spread of beliefs (such as beliefs about COVID-19 vaccination). We perform initial simulations to see how parameters modeling audience fragmentation, selective exposure, and the responsiveness of media agents to the beliefs of their subscribers influence polarization. We find that media ecosystem fragmentation and echo chambers may not in themselves be as polarizing as initially postulated, in the absence of polarizing, fixed outside media messages.

Nicholas Rabb, Lenore Cowen
Properties of Reddit News Topical Interactions

Most models of online information diffusion rely on the assumption that pieces of information spread independently of each other. However, several works have pointed out the necessity of investigating the role of interactions in real-world processes, and have highlighted possible difficulties in doing so: interactions are sparse and brief. In response, recent advances have developed models that account for interactions in the underlying publication dynamics. In this article, we propose to extend and apply one such model to determine whether interactions between news headlines on Reddit play a significant role in their underlying publication mechanisms. After conducting an in-depth case study on 100,000 news headlines from 2019, we retrieve state-of-the-art conclusions about interactions and conclude that they play a minor role in this dataset.

Gaël Poux-Médard, Julien Velcin, Sabine Loudcher
Will You Take the Knee? Italian Twitter Echo Chambers’ Genesis During EURO 2020

Echo chambers can be described as situations in which individuals encounter and interact only with viewpoints that confirm their own, thus moving, as a group, to more polarized and extreme positions. Recent literature mainly focuses on characterizing such entities via static observations, thus disregarding their temporal dimension. In this work, departing from that trend, we study, at multiple topological levels, the genesis of echo chambers in the social discussions that took place in Italy during the EURO 2020 Championship. Our analysis focuses on a well-defined topic (i.e., BLM/racism) discussed on Twitter during a perfectly temporally bounded (sporting) event. These characteristics allow us to track the rise and evolution of echo chambers over time, thus relating their existence to specific episodes.

Chiara Buongiovanni, Roswita Candusso, Giacomo Cerretini, Diego Febbe, Virginia Morini, Giulio Rossetti
A Simple Model of Knowledge Scaffolding

We introduce a simple model of knowledge scaffolding, simulating the process of building a corpus of knowledge through logical derivations starting from a set of “axioms”. The idea around which we developed the model is that each new contribution, not yet present in the corpus of knowledge, can be accepted only if it is based on a given number of items already belonging to the corpus. When a new item is acquired by the corpus, we impose a limit on the maximum growth of knowledge at each step, which we call the “jump” in knowledge. We analyze the growth over time of both the corpus and the maximum knowledge, and our simulations show that they both follow a power law. Another result is that the number of “holes” in the knowledge corpus always remains limited. Using an approach based on a death-birth Markov process, we derive an analytical approximation of this behavior.

Franco Bagnoli, Guido de Bonfioli Cavalcabo
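The scaffolding dynamics described above can be sketched in a few lines. Note that the acceptance rule, the uniform candidate sampling, and all parameter values below are illustrative assumptions, not the authors' exact specification:

```python
import random

def scaffold(steps=5000, n_axioms=5, deps=2, jump=3, seed=0):
    """Toy knowledge-scaffolding run: knowledge items are integer "levels";
    a candidate item is accepted only if at least `deps` lower-level items
    are already in the corpus, and it may exceed the current maximum
    knowledge by at most `jump` (the "jump" in knowledge)."""
    rng = random.Random(seed)
    corpus = set(range(n_axioms))          # the initial axioms
    max_k = n_axioms - 1                   # current maximum knowledge
    for _ in range(steps):
        candidate = rng.randint(0, max_k + jump)
        if candidate in corpus:
            continue
        support = sum(1 for i in corpus if i < candidate)
        if support >= deps:                # enough prior items to build on
            corpus.add(candidate)
            max_k = max(max_k, candidate)
    holes = sum(1 for i in range(max_k) if i not in corpus)
    return len(corpus), max_k, holes

size, max_k, holes = scaffold()
```

Tracking `size`, `max_k`, and `holes` over many runs and step counts is the kind of measurement from which growth laws and the boundedness of holes can be examined.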
Using Knowledge Graphs to Detect Partisanship in Online Political Discourse

Existing methods for detecting partisanship and polarization on social media focus on either linguistic or network aspects of online communication, and tend to study a single platform. We explore the possibility of using knowledge graph embeddings to detect and analyze partisanship in online discourse. Knowledge graphs can potentially combine linguistic and network information across multiple platforms to enable more accurate discovery of a political dimension in online space. We train embeddings on heterogeneous graphs with different combinations of text, network, and single- and multi-platform information. Building on previous work, we develop a semi-supervised approach for uncovering a political dimension in the embedding space from a handful of labelled observations, and show that this method enables more accurate differentiation between liberal and conservative Twitter accounts. These results indicate that knowledge graphs can potentially be useful tools for analyzing online discourse.

Ari Decter-Frain, Vlad Barash
The wisdom_of_crowds: An Efficient, Philosophically-Validated, Social Epistemological Network Profiling Toolkit

The epistemic position of an agent often depends on their position in a larger network of other agents who provide them with information. In general, agents are better off if they have diverse and independent sources. Sullivan et al. [19] developed a method for quantitatively characterizing the epistemic position of individuals in a network that takes into account both diversity and independence, and presented a proof-of-concept, closed-source implementation on a small graph derived from Twitter data. This paper reports on an open-source re-implementation of their algorithm in Python, optimized to be usable on much larger networks. Beyond the algorithm and package, we show that our package scales to profiling large synthetic social network graphs, and demonstrate its utility in analyzing real-world empirical evidence of ‘echo chambers’ on online social media, as well as interdisciplinary diversity in an academic communication network.

Colin Klein, Marc Cheong, Marinus Ferreira, Emily Sullivan, Mark Alfano
Opening up Echo Chambers via Optimal Content Recommendation

Online social platforms have become central in the political debate. In this context, the existence of echo chambers is a problem of primary relevance. These clusters of like-minded individuals tend to reinforce prior beliefs, elicit animosity towards others and aggravate the spread of misinformation. We study this phenomenon on a Twitter dataset related to the 2017 French presidential elections and propose a method to tackle it with content recommendations. We use a quadratic program to find optimal recommendations that maximise the diversity of content users are exposed to, while still accounting for their preferences. Our method relies on a theoretical model that can sufficiently describe how content flows through the platform. We show that the model provides good approximations of empirical measures and demonstrate the effectiveness of the optimisation algorithm at mitigating the echo chamber effect on this dataset, even with a limited budget for recommendations.

Antoine Vendeville, Anastasios Giovanidis, Effrosyni Papanastasiou, Benjamin Guedj
Change My Mind: Data Driven Estimate of Open-Mindedness from Political Discussions

One of the main dimensions characterizing the unfolding of opinion formation processes in social debates is the degree of open-mindedness of the involved population. Opinion dynamic modeling studies have tried to capture such a peculiar expression of individuals’ personalities and relate it to emerging phenomena like polarization, radicalization, and ideology fragmentation. However, one of their major limitations lies in the strong assumptions they make on the initial distribution of such characteristics, often fixed so as to satisfy a normality hypothesis. Here we propose a data-driven methodology to estimate users’ open-mindedness from online discussion data. Our analysis—focused on the political discussion taking place on Reddit during the first two years of the Trump presidency—unveils the existence of statistically diverse distributions of open-mindedness in annotated sub-populations (i.e., Republicans, Democrats, and Moderates/Neutrals). Moreover, such distributions appear to be stable across time and generated by individual users’ behaviors that remain consistent and underdispersed.

Valentina Pansanella, Virginia Morini, Tiziano Squartini, Giulio Rossetti
The Effects of Message Sorting in the Diffusion of Information in Online Social Media

In this work, we propose an agent-based model to study the effects of message sorting on the diffusion of low- and high-quality information in online social networks. We consider the case in which each piece of information has a numerical proxy representing its quality: the higher the quality, the greater the chances of being transmitted further in the network. The model allows us to study how sorting the information in an agent’s attention list according to its quality, the node’s influence, or its popularity affects the overall system’s quality, diversity, and discriminative power. We compare these three scenarios with a baseline model where the information is organized in a first-in first-out manner. Our results indicate that such sorting intensifies the exposure of high-quality information, increasing the overall system’s quality while preserving its diversity. However, it significantly decreases the system’s discriminative power.

Diego F. M. Oliveira, Kevin S. Chan, Peter J. Mucha
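A stripped-down, single-agent version of the quality-sorted versus first-in first-out comparison can be sketched as follows; the uniform quality proxy, arrival rate, and feed size are illustrative assumptions, not the paper's model:

```python
import random

def mean_shared_quality(sort_by_quality, rounds=3000, arrivals=3,
                        feed_size=10, seed=1):
    """Each round `arrivals` messages with quality ~ U(0,1) enter the
    attention list, the list is truncated to `feed_size` items, and one
    message is read and reshared with probability equal to its quality."""
    rng = random.Random(seed)
    feed, shared = [], []
    for _ in range(rounds):
        feed.extend(rng.random() for _ in range(arrivals))
        if sort_by_quality:
            feed.sort(reverse=True)      # best messages first
            del feed[feed_size:]         # attention keeps the best items
        else:
            feed = feed[-feed_size:]     # FIFO: attention keeps the newest
        msg = feed.pop(0)                # read (and consume) the top item
        if rng.random() < msg:           # higher quality -> more resharing
            shared.append(msg)
    return sum(shared) / len(shared)

fifo_q = mean_shared_quality(False)
sorted_q = mean_shared_quality(True)
```

Even in this toy setting, sorting the attention list by quality raises the mean quality of what gets reshared relative to the FIFO baseline.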
Gradual Network Sparsification and Georeferencing for Location-Aware Event Detection in Microblogging Services

Event detection in microblogging services such as Twitter has become a challenging research topic within the fields of social network analysis and natural language processing. Many works focus on the identification of general events, with event types ranging from political news and soccer games to entertainment. However, in application contexts like crisis management, traffic planning, or monitoring people’s mobility during pandemic scenarios, there is a high need for detecting localisable physical events. To address this need, this paper introduces an extension of an existing event detection framework by combining machine learning-based geo-localisation of tweets and network analysis to reveal events from Twitter distributed in time and space. Gradual network sparsification is introduced to improve the detection of events of different granularity and to derive a hierarchical event structure. Results show that the proposed method is able to detect meaningful events, including their geo-locations. This constitutes a step towards using social media data to, for example, inform traffic demand models, warn about infection risks in certain places, or identify points of interest.

Diaoulé Diallo, Tobias Hecking
Manipulation During the French Presidential Campaign: Coordinated Inauthentic Behaviors and Astroturfing Analysis on Text and Images

In April 2022, the French presidential election took place, and social media played a prominent role in it. By analyzing more than 150 million interactions on French Twitter, this study aims to provide evidence of coordinated behaviors by political parties. We find that extreme parties, on both the left and the right, exhibit a distinctive internal structure compared to moderate parties. Moreover, by examining similar patterns in community structures as well as in duplicated tweets, we unveil the online astroturfing strategies of the main parties, in particular the extreme right.

Victor Chomel, Maziyar Panahi, David Chavalarias

Modeling Human Behavior

Frontmatter
Lexical Networks Constructed to Correspond to Students’ Short Written Responses: A Quantum Semantic Approach

A simple method is introduced to construct lexical networks (lexicons) of how students use scientific terms in written texts. The method is based on a recently introduced quantum semantics generalization of word-pair co-occurrence. Quantum semantics allows entangled co-occurrence, making it possible to model the effect of subjective bias on weighting the importance of word co-occurrences. Using such a generalized word-pair co-occurrence counting, we construct students’ lexicons of the scientific (life-science) terms they use in their written responses to questions concerning food chains in life-science contexts. The method allows us to construct ensembles of lexicons that probabilistically simulate the variability of individual lexicons. The re-analyses of the written reports show that while the sets of top-ranking terms contain nearly the same terms irrespective of the details of the co-occurrence counting method, the relative rankings of some key terms may differ under quantum semantic analysis.

Ismo T. Koponen, Ilona Södervik, Maija Nousiainen
Attributed Stream-Hypernetwork Analysis: Homophilic Behaviors in Pairwise and Group Political Discussions on Reddit

Complex networks are solid models for describing human behavior. However, most analyses employing them are bound to observations made on dyadic connectivity, whereas complex human dynamics involve higher-order relations as well. In the last few years, hypergraph models have been emerging as promising tools to better understand the behavior of social groups. Yet even such higher-order representations ignore the importance of the rich attributes carried by the nodes. In this work we introduce ASH, an Attributed Stream-Hypernetwork framework to model higher-order temporal networks with attributes on nodes. We leverage ASH to study pairwise and group political discussions on the well-known Reddit platform. Our analysis unveils different patterns when looking at either a pairwise or a higher-order structure for the same phenomena. In particular, we find that Reddit users tend to surround themselves with like-minded peers with respect to their political leaning when online discussions are proxied by pairwise interactions; conversely, such a tendency significantly decreases when considering nodes embedded in higher-order contexts, which often describe heterophilic discussions.

Andrea Failla, Salvatore Citraro, Giulio Rossetti
Individual Fairness for Social Media Influencers

Nowadays, many social media platforms are centered around content creators (CC). On these platforms, the tie formation process depends on two factors: (a) the exposure of users to CCs (decided by, e.g., a recommender system), and (b) the following decision-making process of users. Recent research studies underlined the importance of content quality by showing that under exploratory recommendation strategies, the network eventually converges to a state where the higher the quality of the CC, the higher their expected number of followers. In this paper, we extend prior work by (a) looking beyond averages to assess the fairness of the process and (b) investigating the importance of exploratory recommendations for achieving fair outcomes. Using an analytical approach, we show that non-exploratory recommendations converge fast but usually lead to unfair outcomes. Moreover, even with exploration, we are only guaranteed fair outcomes for the highest (and lowest) quality CCs.

Stefania Ionescu, Nicolò Pagan, Anikó Hannák
Multidimensional Online American Politics: Mining Emergent Social Cleavages in Social Graphs

Dysfunctions in online social networks (e.g., echo chambers or filter bubbles) are studied by characterizing the opinion of users, for example, as Democrat- or Republican-leaning, or on continuous scales ranging from most liberal to most conservative. Recent studies have stressed the need for studying these phenomena in complex social networks along additional dimensions of social cleavage, including anti-elite polarization and attitudes towards changing cultural issues. The study of social networks in high-dimensional opinion spaces remains challenging in settings such as that of the US, both because of the dominance of a principal liberal-conservative cleavage, and because two-party political systems structure the preferences of users and the tools to measure them. This article builds on embeddings of social graphs in multi-dimensional ideological spaces and on NLP methods to identify additional cleavage dimensions linked to cultural, policy, social, and ideological groups and preferences. Using Twitter social graph data, I infer the political stance of nearly 2 million users connected to the political debate in the US along several issue dimensions of public debate. The proposed method shows that it is possible to identify several dimensions structuring social graphs that are not aligned with liberal-conservative divides and that relate to newly emergent social cleavages. These results also shed new light on ideological scaling methods gaining attention in many disciplines, making it possible to identify and test the nature of spatial dimensions mined from social graphs.

Pedro Ramaciotti Morales
Classical and Quantum Random Walks to Identify Leaders in Criminal Networks

Random walks simulate the random movement of objects and are key instruments in various fields such as computer science, biology, and physics. The counterparts of classical random walks in quantum mechanics are quantum walks. Quantum walk algorithms can provide an exponential speedup over classical algorithms. Classical and quantum random walks can be applied in social network analysis and used to define specific centrality metrics in terms of node occupation on single-layer and multilayer networks. In this paper, we apply these new centrality measures to three real criminal networks derived from an anti-mafia operation named Montagna, and to a multilayer network derived from them. Our aims are to (i) identify leaders in our criminal networks, (ii) study the dependence between these centralities and the degree, and (iii) compare the results obtained for the real multilayer criminal network with those of a synthetic multilayer network that replicates its structure.

Annamaria Ficara, Giacomo Fiumara, Pasquale De Meo, Salvatore Catanese
Random Walk for Generalization in Goal-Directed Human Navigation on Wikipedia

Human navigation on complex networks has been investigated in many ways. Previous findings suggest that the characteristics of human navigation change over the course of navigation, from the start to the destination. However, it is not fully clear to what extent navigation is determined by the human navigator as opposed to the graph and the environment. Our work examines the early phase of human navigation, investigating the impact of graph structure with a random walk model based on PageRank. Our results suggest that a very high portion of human navigation in the early, generalization phase can be modeled with random navigation.

Dániel Ficzere, Gergely Hollósi, Attila Frankó, András Gulyás
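A random-navigation baseline of the kind used in the two preceding papers can be computed with plain power iteration; this is a generic PageRank sketch, not either paper's implementation:

```python
def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank on an adjacency dict {node: [out-neighbors]}.
    A walker follows a random out-link with probability `damping` and
    teleports to a uniformly random node otherwise; dangling nodes
    redistribute their mass uniformly."""
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v, out in adj.items():
            if out:
                share = damping * rank[v] / len(out)
                for w in out:
                    new[w] += share
            else:  # dangling node: spread its mass uniformly
                for w in nodes:
                    new[w] += damping * rank[v] / n
        rank = new
    return rank

# tiny example: three nodes all pointing at a hub
adj = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a", "b", "c"]}
pr = pagerank(adj)
```

The stationary occupation probabilities `pr` are exactly the kind of node-occupation quantity that random-walk centralities and random-navigation baselines build on.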
Sometimes Less Is More: When Aggregating Networks Masks Effects

A large body of research aims to detect the spread of something through a social network. This research often entails measuring multiple kinds of relationships among a group of people and then aggregating them into a single social network to use for analysis. The aggregation is typically done by taking a union of the various tie types. Although this has intuitive appeal, we show that in many realistic cases this approach adds enough error to mask true network effects. We demonstrate that the problem depends on: (1) whether the effect diffuses generically or in a tie-specific way, and (2) the extent of overlap between the measured network ties. Aggregating ties when diffusion is tie-specific and overlap is low will negatively bias, and potentially mask, network effects that are in fact present.

Jennifer M. Larson, Pedro L. Rodríguez
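The core intuition — a tie-specific diffusion looks weaker when exposure is measured on the union of tie types — can be illustrated with a toy simulation (the network sizes, degrees, and one-step diffusion rule are illustrative assumptions):

```python
import random

def masking_demo(n=200, deg=4, seed=7):
    """Infection spreads only along A-ties, but the analyst measures
    exposure either on A alone or on the union of A- and B-ties."""
    rng = random.Random(seed)
    def random_ties():
        return {v: rng.sample([u for u in range(n) if u != v], deg)
                for v in range(n)}
    A, B = random_ties(), random_ties()    # two tie types with low overlap
    seeds = set(rng.sample(range(n), 10))
    # one tie-specific diffusion step: only A-ties transmit
    infected = set(seeds)
    for v in range(n):
        if any(u in seeds for u in A[v]):
            infected.add(v)
    def estimated_effect(net):
        """Share of seed-exposed (non-seed) nodes that are infected."""
        exposed = {v for v in range(n) if v not in seeds
                   and any(u in seeds for u in net[v])}
        return sum(v in infected for v in exposed) / len(exposed)
    union = {v: list(set(A[v]) | set(B[v])) for v in range(n)}
    return estimated_effect(A), estimated_effect(union)

eff_A, eff_union = masking_demo()
```

On the A-network the measured effect is perfect by construction, while on the union network it is diluted by B-only exposures that never transmitted anything.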
An Adaptive Network Model Simulating the Effects of Different Culture Types and Leader Qualities on Mistake Handling and Organisational Learning

This paper computationally investigates the following research hypotheses: (1) higher flexibility and discretion in organizational culture result in better mistake management and thus better organizational learning; (2) effective organizational learning requires a transformational leader with both high social and formal status and consistency; and (3) company culture and the leader’s behavior must align for the best learning effects. Computational simulations of the introduced adaptive network were analyzed in different contexts varying in organizational culture and leader characteristics. The statistical analysis results were significant and supported the research hypotheses. Ultimately, this paper provides insight into how organizations that foster a mistake-tolerant attitude, in alignment with their leader, can achieve significantly better organizational learning at the team and individual levels.

Natalie Samhan, Jan Treur, Wioleta Kucharska, Anna Wiewiora

Biological Networks

Frontmatter
Modeling of Hardy-Weinberg Equilibrium Using Dynamic Random Networks in an ABM Framework

Hardy-Weinberg equilibrium is the fundamental principle of population genetics. In this article, we present a new NetLogo model called “Hardy-Weinberg Basic model v 2.0”, characterized by strict adherence to the original assumptions made by Hardy and Weinberg in 1908. A particularly significant feature of this model is that the algorithm does not make use of the binomial expansion formula. Instead, we show that using a procedure based on dynamic random networks, diploid equilibrium can be achieved spontaneously by a population of agents reproducing sexually in a Mendelian fashion. The model can be used to conduct simulations with a wide range of initial population sizes and genotype distributions for a single biallelic autosomal locus. Moreover, we show that, without any mathematical formalism, the algorithm is able to confirm the prediction of Kimura’s diffusion equations on the time required to fix a new neutral allele in a population through genetic drift alone.

Riccardo Tarantino, Greta Panunzi, Valentino Romano
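The spontaneous convergence to Hardy-Weinberg proportions can be checked with a minimal, network-free random-mating sketch; this is an illustration of the principle, not the NetLogo model itself:

```python
import random
from collections import Counter

def random_mating(pop, generations=10, seed=42):
    """Mendelian reproduction at a single biallelic autosomal locus:
    each offspring draws one allele from each of two randomly chosen
    parents (genotypes are two-character strings such as "Aa")."""
    rng = random.Random(seed)
    for _ in range(generations):
        pop = [rng.choice(rng.choice(pop)) + rng.choice(rng.choice(pop))
               for _ in range(len(pop))]
    return pop

# start far from equilibrium: homozygotes only, allele frequencies p = q = 0.5
pop = random_mating(["AA"] * 500 + ["aa"] * 500)
genotypes = Counter("".join(sorted(g)) for g in pop)      # AA / Aa / aa counts
p = sum(g.count("A") for g in pop) / (2 * len(pop))       # final allele freq
```

After random mating, the genotype frequencies settle near the p², 2pq, q² proportions predicted by Hardy and Weinberg, up to sampling noise and drift.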
COMBO: A Computational Framework to Analyze RNA-seq and Methylation Data Through Heterogeneous Multi-layer Networks

Multi-layer complex networks are commonly used for modeling and analyzing biological entities. This paper presents a new computational framework called COMBO (Combining Multi Bio Omics) for generating and analyzing heterogeneous multi-layer networks. Our model uses gene expression and DNA-methylation data. The power of COMBO lies in its ability to join different omics to study the complex interplay between the various components of a disease. We tested the reliability and versatility of COMBO on colon and lung adenocarcinoma data obtained from the TCGA database.

Ilaria Cosentini, Vincenza Barresi, Daniele Filippo Condorelli, Alfredo Ferro, Alfredo Pulvirenti, Salvatore Alaimo
A Network-based Approach for Inferring Thresholds in Co-expression Networks

Gene co-expression networks (GCNs) specify binary relationships between genes and are of biological interest because significant network relationships suggest that two co-expressed genes rise and fall together across different cellular conditions. GCNs are built by (i) calculating a co-expression measure between each pair of genes and (ii) selecting a significance threshold to remove spurious relationships among genes. This paper introduces a threshold criterion based on the underlying topology of the network. More specifically, the criterion considers both the rate at which isolated nodes are added to the network and the density of its components when the threshold varies. In addition to Pearson’s correlation measure, the biweight midcorrelation, the distance correlation, and the maximal information coefficient are used to build different GCNs from the same data and showcase the advantages of the proposed approach. Finally, a case study presents a comparison of the predictive performance of the different networks when trying to predict gene functional annotations using hierarchical multi-label classification.

Nicolás López-Rozo, Miguel Romero, Jorge Finke, Camilo Rocha
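The main ingredients of such a threshold criterion — how edge counts and isolated nodes change as the cutoff varies — can be sketched on synthetic data; the two-module toy profiles and the Pearson-only measure below are illustrative assumptions, not the paper's pipeline:

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def threshold_profile(expr, thresholds):
    """For each threshold t, build the co-expression graph (|r| >= t)
    and report (t, number of edges, number of isolated genes)."""
    genes = list(expr)
    corr = {(g, h): pearson(expr[g], expr[h])
            for i, g in enumerate(genes) for h in genes[i + 1:]}
    out = []
    for t in thresholds:
        edges = [e for e, r in corr.items() if abs(r) >= t]
        linked = {v for e in edges for v in e}
        out.append((t, len(edges), len(genes) - len(linked)))
    return out

# synthetic profiles: two co-regulated modules of 5 genes plus noise
rng = random.Random(3)
base1 = [rng.gauss(0, 1) for _ in range(30)]
base2 = [rng.gauss(0, 1) for _ in range(30)]
expr = {}
for k in range(5):
    expr[f"m1_{k}"] = [v + rng.gauss(0, 0.3) for v in base1]
    expr[f"m2_{k}"] = [v + rng.gauss(0, 0.3) for v in base2]
profile = threshold_profile(expr, [0.2, 0.5, 0.8, 0.95])
```

As the threshold rises, edges can only disappear and isolated genes can only accumulate; the rates at which this happens are exactly what a topology-based cutoff criterion monitors.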
Building Differential Co-expression Networks with Variable Selection and Regularization

This work introduces a technique for the inference of differential co-expression networks. The approach takes as input a matrix of differential expression profiles, where each entry corresponds to the Log Fold Change of a gene expression between control and stress conditions for a specific sample. It outputs a matrix of coefficients, where each non-zero entry represents a pairwise connection between genes. The proposed approach builds on Lasso, and is applied to differential expression profiles of rice between control and salt-stress conditions. A total of 25 genes were identified as responding to salt stress and as differentially expressed. About half of these genes (11) were reported with a statistically significant number of different GO annotations relevant to salt stress response.

Camila Riccio, Jorge Finke, Camilo Rocha
Inferring Probabilistic Boolean Networks from Steady-State Gene Data Samples

Probabilistic Boolean Networks (PBNs) have been proposed for estimating the behaviour of dynamical systems, as they combine rule-based modelling with uncertainty principles. Inferring PBNs directly from gene data is challenging, however, especially when the data are costly to collect and/or noisy, as in the case of gene expression profile data. In this paper, we present a reproducible method for inferring PBNs directly from real gene expression data measurements taken when the system was at a steady state. The steady-state dynamics of PBNs are of special interest in the analysis of biological machinery. The proposed approach does not rely on reconstructing the state evolution of the network, which is computationally intractable for larger networks. We demonstrate the method on samples of real gene expression profiling data from a well-known study on metastatic melanoma. The pipeline is implemented in Python and we make it publicly available.

Vytenis Šliogeris, Leandros Maglaras, Sotiris Moschoyiannis
Quantifying High-Order Interactions in Complex Physiological Networks: A Frequency-Specific Approach

Recent advances in information theory have provided several tools to characterize high-order interactions (HOIs) in complex systems. Among them, the so-called O-information is emerging as particularly useful in practical analysis thanks to its ability to capture the overall balance between redundant and synergistic HOIs. While the O-information is computed for random variables, its extension to random processes studied in the frequency domain is very important to widen the applicability of this tool to networks whose nodes exhibit rich oscillatory content, such as brain and physiological networks. This work presents the O-information rate (OIR), a measure based on the vector autoregressive and state space modelling of multivariate time series devised to assess the synergistic and redundant HOIs among groups of series in specific bands of biological interest. The new measure is illustrated in two paradigmatic examples of physiological networks characterized by coupled oscillations across a wide range of temporal scales, i.e. the network of cardiovascular and cerebrovascular interactions, where redundant synchronized activity emerges around the frequencies of vasomotor and respiratory rhythms, and the network of scalp electroencephalographic signals, where synergistic HOIs are detected among the alpha and beta waves recorded over the primary sensorimotor cortex.

Laura Sparacino, Yuri Antonacci, Daniele Marinazzo, Sebastiano Stramaglia, Luca Faes
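For context, the static O-information that the OIR extends to the frequency domain is, following Rosas et al., the following entropy balance (given here as a reference formula, not taken from this paper):

```latex
% O-information of n random variables X = (X_1, ..., X_n);
% H is the (joint) entropy and X_{-j} denotes all variables except X_j.
\Omega(X) \;=\; (n-2)\, H(X) \;+\; \sum_{j=1}^{n} \bigl[ H(X_j) - H(X_{-j}) \bigr]
```

Positive values of $\Omega$ indicate predominantly redundant HOIs, negative values predominantly synergistic ones; the OIR carries this balance over to spectral measures on multivariate time series.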
A Novel Reverse Engineering Approach for Gene Regulatory Networks

Capturing the rules that govern a particular system can be useful in any field where the causes of observed effects are unknown. Indeed, discovering the causes that produced a particular effect is extremely useful in fields such as biology. In this paper, a reverse engineering method based on machine learning is proposed. This method is used to replicate real-world behaviour and to use this knowledge to generate the corresponding Gene Regulatory Network. The datasets from the DREAM4 Challenge were used to validate the method.

Francesco Zito, Vincenzo Cutello, Mario Pavone
Using the Duplication-Divergence Network Model to Predict Protein-Protein Interactions

Interactions between proteins are key to most biological processes, but thorough testing can be costly in terms of money and time. Computational approaches for predicting such interactions are an important alternative. This study presents a novel approach to this prediction using calibrated synthetic networks as input for training a decision tree ensemble model with relevant topological information. This trained model is later used for predicting interactions on the human interactome, as a case study. Results show that deterministic metrics perform better than their stochastic counterparts, although a random forest model achieves comparable precision for one feature combination.

Nicolás López-Rozo, Jorge Finke, Camilo Rocha

Machine Learning and Networks

Frontmatter
SignedS2V: Structural Embedding Method for Signed Networks

Signed networks are widely observed in, and constructed from, the real world, and are valuable for the rich information contained in the signs of their edges. Several embedding methods have been proposed for signed networks. Current methods mainly focus on proximity similarity and the fulfillment of social psychological theories; however, no signed network embedding method has focused on structural similarity. Therefore, in this research, we propose a novel notion of degree in signed networks, a distance function to measure the similarity between two such complex degrees, and a node-embedding method based on structural similarity. Experiments on five network topologies, an inverted karate club network, and three real networks demonstrate that our proposed method embeds nodes with similar structural features close together, and show the superiority of the resulting embeddings on a link sign prediction task compared with state-of-the-art methods.

Shu Liu, Fujio Toriumi, Xin Zeng, Mao Nishiguchi, Kenta Nakai
HM-LDM: A Hybrid-Membership Latent Distance Model

A central aim of modeling complex networks is to accurately embed networks in order to detect structures and predict link and node properties. The Latent Space Model (LSM) has become a prominent framework for embedding networks and includes the Latent Distance Model (LDM) and Eigenmodel (LEM) as the most widely used LSM specifications. For latent community detection, the embedding space in LDMs has been endowed with a clustering model whereas LEMs have been constrained to part-based non-negative matrix factorization (NMF) inspired representations promoting community discovery. We presently reconcile LSMs with latent community detection by constraining the LDM representation to the D-simplex forming the Hybrid-Membership Latent Distance Model (HM-LDM). We show that for sufficiently large simplex volumes this can be achieved without loss of expressive power whereas by extending the model to squared Euclidean distances, we recover the LEM formulation with constraints promoting part-based representations akin to NMF. Importantly, by systematically reducing the volume of the simplex, the model becomes unique and ultimately leads to hard assignments of nodes to simplex corners. We demonstrate experimentally how the proposed HM-LDM admits accurate node representations in regimes ensuring identifiability and valid community extraction. Importantly, HM-LDM naturally reconciles soft and hard community detection with network embeddings exploring a simple continuous optimization procedure on a volume constrained simplex that admits the systematic investigation of trade-offs between hard and mixed membership community detection.

Nikolaos Nakis, Abdulkadir Çelikkanat, Morten Mørup
The Structure of Interdisciplinary Science: Uncovering and Explaining Roles in Citation Graphs

Role discovery is the task of dividing the set of nodes on a graph into classes of structurally similar roles. Modern strategies for role discovery typically rely on graph embedding techniques, which are capable of recognising complex local structures. However, when working with large, real-world networks, it is difficult to interpret or validate a set of roles identified according to these methods. In this work, motivated by advancements in the field of explainable artificial intelligence (XAI), we propose a new framework for interpreting role assignments on large graphs using small subgraph structures known as graphlets. We demonstrate our methods on a large, multidisciplinary citation network, where we successfully identify a number of important citation patterns which reflect interdisciplinary research.

Eoghan Cunningham, Derek Greene
Inferring Parsimonious Coupling Statistics in Nonlinear Dynamics with Variational Gaussian Processes

Falsification is the basis for testing existing hypotheses, and a great danger is posed when results incorrectly reject our prior notions (false positives). Though nonparametric and nonlinear exploratory methods of uncovering coupling provide a flexible framework to study network configurations and discover causal graphs, multiple comparisons analyses make false positives more likely, exacerbating the need for their control. We aim to robustify the Gaussian Processes Convergent Cross-Mapping (GP-CCM) method through Variational Bayesian Gaussian Process modeling (VGP-CCM). We alleviate computational costs of integrating with conditional hyperparameter distributions through mean field approximations. This approximation model, in conjunction with permutation sampling of the null distribution, permits significance statistics that are more robust than permutation sampling with point hyperparameters. Simulated unidirectional Lorenz-Rossler systems as well as mechanistic models of neurovascular systems are used to evaluate the method. The results demonstrate that the proposed method yields improved specificity, showing promise to combat false positives.

Ameer Ghouse, Gaetano Valenza
Detection of Sparsity in Multidimensional Data Using Network Degree Distribution and Improved Supervised Learning with Correction of Data Weighting

Multidimensional data appear in a wide range of applications, from the latest science and technology to specific social issues, and they have long been analyzed with methods such as regression analysis and machine learning. However, they are rarely obtained as complete data and contain varying degrees of bias and deficiency. In this study, we form a network from a multidimensional dataset and use its degree distribution to detect data sparsity. Although model analysis based on the degree distribution has been conducted for many years, sparsity detection has not been a target of degree distribution analysis. Furthermore, we attempt to increase the accuracy and precision of supervised learning by applying regressive weighting according to node grouping in the degree distribution spectrum. This algorithm expands the range of utilization of incomplete data, alongside other promising advances in complex networks.

Shinya Ueno, Osamu Sakai
Network Structure Versus Chemical Information in Drug-Drug Interaction Prediction

We apply two embedding mechanisms, node2vec and mol2vec, to the problem of predicting Drug-Drug Interactions (DDIs). These mechanisms convert drugs into vectors using, respectively, the network information from the graph of drug interactions and the chemical information of the underlying chemical compound. Our goal is to compare link prediction models based on each embedding method by exploring the topological features of the drug interaction graph that make each approach more effective at making correct predictions. We base our experiments on the DrugBank data set and use computational chemistry tools such as RDKit and PubChem, along with NetworkX, to create the chemical and structural embeddings for each drug.

George Kefalas, Dimitrios Vogiatzis
Geometric Deep Learning Graph Pruning to Speed-Up the Run-Time of Maximum Clique Enumeration Algorithms

In this paper we propose a method to reduce the running time of solving the Maximum Clique Enumeration (MCE) problem. Specifically, given a network, we employ geometric deep learning to find a simpler network on which to run the MCE algorithm. Our approach is based on a strategy for removing nodes that do not contribute to the solution. In doing so, the resulting network has a reduced size and, as a result, the search time of the MCE is reduced. We show that our approach obtains a solver speed-up of up to 42 times while keeping all the maximum cliques.

A. Arciprete, V. Carchiolo, D. Chiavetta, M. Grassia, M. Malgeri, G. Mangioni
Graph Mining and Machine Learning for Shader Codes Analysis to Accelerate GPU Tuning

The graphics processing unit (GPU) has become one of the most important computing technologies. Disassembled shader codes, which are machine-level codes, are important for GPU designers (e.g., AMD, Intel, NVIDIA) when tuning the hardware, including customizing clock speeds and voltages. Because modern GPUs have many use cases, engineers generally find it difficult to manually inspect the large number of shader codes emerging from these applications. To this end, we develop a framework that converts shader codes into graphs and employs sophisticated graph mining and machine learning techniques over a number of applications to simplify shader graph analysis in an effective and explainable manner, aiming to accelerate the whole debugging process and improve overall hardware performance. We study the evolution of shader codes via temporal graph analysis and structure mining with frequent subgraphs. Using these as the underlying tools, we perform scene detection and representative-frame selection, group the scenes (applications) to identify the representative ones, and predict a new application’s inefficient shaders. We empirically demonstrate the effectiveness of our solution and discuss future directions.

Lin Zhao, Arijit Khan, Robby Luo, Chai Kiat Yeo

Networks in Finance and Economics

Frontmatter
Pattern Analysis of Money Flows in the Bitcoin Blockchain

Bitcoin is the first and highest valued cryptocurrency that stores transactions in a publicly distributed ledger called the blockchain. Understanding the activity and behavior of Bitcoin actors is a crucial research topic as they are pseudonymous in the transaction network. In this article, we propose a method based on taint analysis to extract taint flows—dynamic networks representing the sequence of Bitcoins transferred from an initial source to other actors until dissolution. Then, we apply graph embedding methods to characterize taint flows. We evaluate our embedding method with taint flows from top mining pools and show that it can classify mining pools with high accuracy. We also found that taint flows from the same period show high similarity. Our work proves that tracing the money flows can be a promising approach to classifying source actors and characterizing different money flow patterns.

Natkamon Tovanich, Rémy Cazabet
On the Empirical Association Between Trade Network Complexity and Global Gross Domestic Product

In recent decades, trade between nations has constituted an important component of global Gross Domestic Product (GDP), with official estimates showing that it likely accounted for a quarter of total global production. While evidence of association already exists in macro-economic data between trade volume and GDP growth, there is considerably less work on whether, at the level of individual granular sectors (such as vehicles or minerals), associations exist between the complexity of trading networks and global GDP. In this paper, we explore this question by using publicly available data from the Atlas of Economic Complexity project to rigorously construct global trade networks between nations across multiple sectors, and studying the correlation between network-theoretic measures computed on these networks (such as average clustering coefficient and density) and global GDP. We find that there is indeed significant association between trade networks’ complexity and global GDP across almost every sector, and that network metrics also correlate with business cycle phenomena such as the Great Recession of 2007–2008. Our results show that trade volume alone cannot explain global GDP growth, and that network science may prove to be a valuable empirical avenue for studying complexity in macro-economic phenomena such as trade.
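The network-theoretic measures named above are standard; a minimal NetworkX sketch of the per-sector pipeline follows. The per-year edge lists and GDP figures are toy stand-ins, not Atlas of Economic Complexity data, and the correlation computed is an association, not a causal claim.

```python
import networkx as nx
import numpy as np

def yearly_metrics(edges_by_year):
    """Per-year summaries of a sector's trade network:
    (density, average clustering coefficient)."""
    out = {}
    for year, edges in sorted(edges_by_year.items()):
        G = nx.Graph(edges)
        out[year] = (nx.density(G), nx.average_clustering(G))
    return out

# toy stand-in data: one sector over three years, plus a global GDP series
edges_by_year = {
    2000: [("US", "DE"), ("DE", "CN")],
    2001: [("US", "DE"), ("DE", "CN"), ("US", "CN")],
    2002: [("US", "DE"), ("DE", "CN"), ("US", "CN"), ("US", "JP")],
}
gdp = [33.8, 33.6, 34.9]  # illustrative values only

metrics = yearly_metrics(edges_by_year)
densities = [metrics[y][0] for y in sorted(metrics)]
r = np.corrcoef(densities, gdp)[0, 1]  # association with GDP, per sector
```

In the paper this correlation is computed sector by sector against global GDP; the sketch shows the shape of that computation for a single sector.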

Mayank Kejriwal, Yuesheng Luo
Measuring the Stability of Technical Cooperation Network Based on the Nested Structure Theory

Nested structure is a structural feature, formed by co-evolution, that is conducive to system stability. In our opinion, enterprises collaborating to apply for patents in a technical cooperation network behave much like biological species in a mutualistic ecosystem, shifting from one dynamic equilibrium to another. In this paper, a nestedness-based analytical framework is built to reflect the topological stability of the technical cooperation network of Zhongguancun Science Park (Z-Park). We study why this technically mutualistic ecosystem can reach a stable equilibrium over time, and we propose an index called the Nestedness Disturbance Index (NDI) to study the roles that park areas and technical fields play in the steady states.

Wenhui Liu, Guoqiang Liang, Lizhi Xing
Dynamic Transition Graph for Estimating the Predictability of Financial and Economical Processes

The problem of estimating time series predictability often appears when forecasting financial and economic processes, especially when the processes are not sustainable and involve critical transitions and/or behaviour changes in the generating complex dynamical system. In these cases it is important to notice the moment when transitions or changes start, and to distinguish their direction as soon as possible, in order to adjust the forecasting algorithm or, at least, properly evaluate the forecast accuracy. To deal with such effects, we propose a dynamic transition graph-based method for real-time incremental tracing of changes in the predictability of time series describing financial and economic processes. Our method helps to filter out some “noise” in the time series and to emphasize the significant aspects of the corresponding dynamical system's behavior. In addition, we use several graph features, such as degree centrality, the number and size of loops, connectivity, and entropy, to evaluate predictability. We also construct a graph neural network classifier and train it on specially constructed artificial time series datasets to efficiently classify real-world time series by predictability in a real-time incremental manner.

Dmitriy Prusskiy, Anton Kovantsev, Petr Chunaev
A Network Analysis of World Trade Structural Changes (1996–2019)

Global trade suffered a significant contraction in the value of trade flows as a result of the 2008 financial crisis, and again in 2020 with the pandemic crisis. These observations raise several questions: What is the structure of international trade? How has it changed due to these shocks? Inspired by the advantages of the network method, and using the most extensive coverage data in terms of time and geography, we explore the structure of international trade by presenting a comprehensive analysis of the World Trade Network (WTN) from several angles. Connectivity results suggest that countries’ efforts towards multilateral trade relations have resulted in an increasingly dense network, highly reciprocal and clustered; however, the network is still not fully connected. While trade connections are distributed homogeneously among countries, trade value is concentrated in a small set of countries, yielding a weighted core-periphery structure. We analyze the consequences of the 2008 financial crisis to infer the potential effects of the recent one. Although the shock did not affect the main overall connectivity trends, their tendencies were restrained after 2008. The financial downturn also marks a turning point in the clustering of the WTN, from two main groups, led by the United States and Germany, to three, led by the United States, China, and Germany. Revisions in preferential trade partners are visible from 2008 onward. Our study provides intuitive insight for policymakers in assessing the medium-term effects of a global shock.

Vu Phuong Hoang, Carlo Piccardi, Lucia Tajoli
Green Sector Space: The Evolution and Capabilities Spillover of Economic Green Sectors in the United States

Countries’ productive capabilities play a crucial role in effectively transitioning their economies towards becoming green. Current research does not address productive capabilities in the green sectors; in particular, it does not address (a) the effect of green production capabilities on the development of a country’s green basket or (b) whether the productive capabilities in its green sectors spill over to affect each other and its overall green growth. In this research, we combine nonparametric statistics with network science techniques to analyze the evolution of green sectors in the United States. The results provide recommendations that could benefit the United States’ green economic growth. In addition, they provide a methodology that policymakers in other countries can use to build effective strategies for accelerating their own green economic growth.

Hanin Alhaddad, Seyyedmilad Talebzadehhosseini, Ivan Garibay
Statistical Inference of Lead-Lag Between Asynchronous Time Series from P-Values of Transfer Entropy at Various Timescales

Symbolic transfer entropy is a powerful non-parametric tool to detect lead-lag between time series. Because a closed-form expression for the distribution of transfer entropy is not known for finite-size samples, statistical testing is often performed with bootstraps, whose slowness prevents the inference of large lead-lag networks between long time series. On the other hand, the asymptotic distribution of transfer entropy between two time series is known. In this work, we derive the asymptotic distribution of the test for one time series having a larger transfer entropy than another one on a target time series. We then measure the convergence speed of both tests in the small-sample-size limit via benchmarks. We then introduce transfer entropy between time-shifted time series, which makes it possible to measure the timescale at which information transfer is maximal and at which it vanishes. We finally apply these methods to tick-by-tick price changes of several hundred stocks, yielding non-trivial, statistically validated networks.
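To make the core quantity concrete, here is the textbook plug-in estimator of transfer entropy for discrete symbol sequences, not the authors' symbolic encoding or their asymptotic testing machinery. The delayed-copy demo series are illustrative.

```python
from collections import Counter
import math
import random

def transfer_entropy(x, y):
    """Plug-in estimate of TE(x -> y) for symbol sequences:
    TE = sum over (y1, y0, x0) of p(y1, y0, x0) * log2[ p(y1|y0, x0) / p(y1|y0) ]."""
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))
    pairs_yx = Counter(zip(y[:-1], x[:-1]))
    pairs_yy = Counter(zip(y[1:], y[:-1]))
    singles = Counter(y[:-1])
    n = len(y) - 1
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_joint = c / n
        p_full = c / pairs_yx[(y0, x0)]            # p(y1 | y0, x0)
        p_red = pairs_yy[(y1, y0)] / singles[y0]   # p(y1 | y0)
        te += p_joint * math.log2(p_full / p_red)
    return te

# y copies x with a one-step delay, so information flows x -> y only
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]
te_xy = transfer_entropy(x, y)  # close to 1 bit
te_yx = transfer_entropy(y, x)  # close to 0 (small positive plug-in bias)
```

Shifting one series relative to the other before calling `transfer_entropy`, as the paper does, probes the timescale at which the transfer peaks.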

Christian Bongiorno, Damien Challet

Networks and Mobility

Frontmatter
Extracting Metro Passenger Flow Predictors from Network’s Complex Characteristics

Complex network characteristics such as centralities have lately begun to be associated with passenger flows at metro stations. Centralities can be leveraged to develop fast and cost-efficient passenger flow predictive models. However, the accuracy of such models is still in question, and the most appropriate predictors are yet to be found. In this sense, this study investigates appropriate predictors and develops a predictive model for daily passenger flows at metro stations, based exclusively on spatial attributes. Using the Athens metro network as a case study, a linear regression model is developed, with the node degree, betweenness, and closeness centralities of the physical network, the node strength of the substitute network, and a dummy variable of station importance as covariates. An econometric analysis validates that a linear model is suitable for associating centralities with passenger flows, while the model’s evaluation metrics indicate satisfying accuracy. In addition, a machine learning benchmark model is used to further investigate variable significance and validate the accuracy of the linear model. Last but not least, both models are used to predict passenger flows at the new stations of the Athens metro network expansion. Findings suggest that the node strength of the substitute network is a powerful predictor and the most significant covariate of both models, and that the two models’ accuracy and predictions converge to a great extent. The model is expected to facilitate medium-term disruption management by providing information about metro passenger flows at low cost and high speed.
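Regressing station flows on centrality covariates can be sketched as a generic OLS on NetworkX centralities. The path graph, the feature set, and the synthetic flows below are illustrative assumptions; they are not the Athens network, and the substitute-network node strength used in the paper is omitted.

```python
import networkx as nx
import numpy as np

def centrality_features(G):
    """Degree, betweenness and closeness centrality as regression covariates."""
    deg = nx.degree_centrality(G)
    bet = nx.betweenness_centrality(G)
    clo = nx.closeness_centrality(G)
    nodes = sorted(G)
    X = np.array([[deg[n], bet[n], clo[n]] for n in nodes])
    return nodes, X

def fit_flows(X, y):
    """Ordinary least squares with an intercept column."""
    A = np.hstack([np.ones((len(X), 1)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A, coef

# toy 'metro line': a path graph with synthetic flows tied to degree centrality
G = nx.path_graph(7)
nodes, X = centrality_features(G)
y = 500 + 3000 * X[:, 0]   # synthetic daily passenger flows
A, coef = fit_flows(X, y)
pred = A @ coef            # in-sample predictions
```

With real ridership data, `y` would be observed station flows and the fitted coefficients would quantify each centrality's contribution.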

Athanasios Kopsidas, Aristeides Douvaras, Konstantinos Kepaptsoglou
Estimating Peak-Hour Urban Traffic Congestion

We study the emergence of congestion patterns in urban networks by modeling vehicular interaction by means of a simple traffic rule and by using a set of measures inspired by the standard Betweenness Centrality (BC). We consider a topologically heterogeneous group of cities and simulate the network loading during the morning peak hour by increasing the number of circulating vehicles. At departure, vehicles are aware of the network state and choose paths with optimal traversal time. Each added path modifies the vehicular density and the travel times for the following vehicles. Starting from an empty network and adding traffic until transportation collapses provides a framework to study the network’s transition to congestion and how connectivity is progressively disrupted as the fraction of impossible paths abruptly becomes dominant. We use standard BC to probe the instantaneous out-of-equilibrium network state for a range of traffic levels and show how this measure may be improved to build a better proxy for cumulative road usage during peak hours. We define a novel dynamical measure to estimate cumulative road usage and the associated total time spent over the edges by the population of drivers. We also study how congestion starts with dysfunctional edges scattered over the network and then organizes itself into relatively small but disruptive clusters.
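The sequential network-loading idea, where each routed vehicle raises travel times for the next, can be illustrated with a minimal routine. The BPR-style travel-time update and all parameter values here are illustrative assumptions, not the paper's traffic rule.

```python
import random
import networkx as nx

def load_network(G, trips, alpha=0.15, beta=4, t0=1.0, cap=5.0, seed=0):
    """Sequentially route random origin-destination trips on shortest
    travel-time paths; each routed vehicle increases the load (and hence
    the BPR-style travel time) of the edges it traverses."""
    rng = random.Random(seed)
    for u, v in G.edges:
        G[u][v]["load"] = 0.0
        G[u][v]["time"] = t0
    nodes = list(G)
    for _ in range(trips):
        o, d = rng.sample(nodes, 2)
        path = nx.shortest_path(G, o, d, weight="time")
        for a, b in zip(path, path[1:]):
            e = G[a][b]
            e["load"] += 1
            e["time"] = t0 * (1 + alpha * (e["load"] / cap) ** beta)
    return G

# toy city: a 4x4 grid street network loaded with 30 trips
G = load_network(nx.grid_2d_graph(4, 4), trips=30)
```

After loading, per-edge `load` plays the role of cumulative road usage, the quantity the paper's dynamical measure estimates more carefully.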

Marco Cogoni, Giovanni Busonera, Francesco Versaci
Adaptive Routing Potential in Road Networks

In biological and ecological disciplines, the concept of adaptation generally refers to ecosystems changing evolutionarily in response to system perturbations. This paper extends these concepts to road transportation networks and focuses on the potential of commuters to adapt to traffic disruptions. Specifically, we apply an Ecosystem Network Analysis (ENA) based on an information-theoretic framework to quantify the potential, calculated as the number of alternate options available throughout a given network, for adaptive routing optimization on road networks. An initial assessment of balanced metrics of resilience and efficiency, calculated for 13 Metropolitan Statistical Area (MSA) road networks, is performed. These metrics are then compared with the respective commuter delay levels, which indicates a correlation between the balance of network resilience and efficiency and the annual commuter delay of each MSA. Whereas road network topologies that are either highly efficient or highly resilient tend to show higher commuter delay levels, road networks that balance efficiency and resilience tend to show lower delay levels. The novelty of this study is that it explicitly considers road network topology as a driver of traffic delay patterns, isolated from commuter decisions.

Michael Logan, Allison Goodwell

Diffusion and Epidemics

Frontmatter
Detecting Global Community Structure in a COVID-19 Activity Correlation Network

The global pandemic of COVID-19 over the last 2.5 years has produced an enormous amount of epidemic and public health data, which may also be useful for studying the underlying structure of our globally connected world. Here we used the Johns Hopkins University COVID-19 dataset to construct a correlation network of countries/regions and studied its global community structure. Specifically, we selected countries/regions that had at least 100,000 cumulative positive cases in the dataset and generated a 7-day moving average time series of new positive cases for each. We then calculated a time series of daily change exponents by taking the day-to-day difference in the logarithm of the number of new positive cases. We constructed a correlation network by connecting countries/regions whose daily change exponent time series were positively correlated, using the Pearson correlation coefficient as the edge weight. Applying modularity maximization revealed three major communities: (1) mainly Europe + North America + Southeast Asia, which showed similar six-peak patterns during the pandemic; (2) mainly the Near/Middle East + Central/South Asia + Central/South America, which loosely followed Community 1 but had a notable increase in activity because of the Delta variant and was later impacted significantly by the Omicron variant; and (3) mainly Africa + Central/East Canada + Australia, which had little activity until a huge spike caused by the Omicron variant. These three communities were robustly detected under varied settings. Constructing a 3D “phase space” with the median curves of these three communities as x-y-z coordinates generated an effective summary trajectory of how the global pandemic progressed.
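The pipeline described here (smoothing, log-differencing, correlating, community detection) can be sketched end-to-end. The sine-wave case series below are synthetic stand-ins for the JHU data, and the greedy modularity routine stands in for whichever maximization method the paper uses.

```python
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def change_exponents(new_cases, window=7):
    """7-day moving average, then day-to-day difference of log counts."""
    kernel = np.ones(window) / window
    smooth = np.convolve(new_cases, kernel, mode="valid")
    return np.diff(np.log(smooth + 1.0))  # +1 guards against log(0)

def correlation_network(series_by_region):
    """Connect regions whose change-exponent series correlate positively,
    weighting each edge by the Pearson correlation coefficient."""
    regions = list(series_by_region)
    G = nx.Graph()
    G.add_nodes_from(regions)
    for i, a in enumerate(regions):
        for b in regions[i + 1:]:
            r = np.corrcoef(series_by_region[a], series_by_region[b])[0, 1]
            if r > 0:
                G.add_edge(a, b, weight=r)
    return G

# synthetic case curves: A and B share a wave pattern, C is out of phase
t = np.arange(120)
waves = {
    "A": 1000 + 900 * np.sin(t / 10),
    "B": 1200 + 900 * np.sin(t / 10 + 0.2),
    "C": 1000 + 900 * np.sin(t / 10 + np.pi),
}
exps = {k: change_exponents(v) for k, v in waves.items()}
G = correlation_network(exps)
communities = greedy_modularity_communities(G, weight="weight")
```

On real data, each resulting community groups the countries/regions whose epidemic waves rose and fell together.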

Hiroki Sayama
Overcoming Vaccine Hesitancy by Multiplex Social Network Targeting

Understanding the impact of social factors on disease prevention and control is one of the key questions in behavioral epidemiology. The interactions of disease spreading and human health behavior, such as vaccine uptake, give rise to rich dueling dynamics of biological and social contagions. In light of this, optimal network targeting to harness the power of social contagion for behavior and attitude change remains largely an open problem. Here we address this question explicitly in a multiplex network setting. Individuals are situated on two network layers: on the disease transmission layer they are exposed to infection risks, while their opinions and vaccine uptake behavior are driven by the social discourse of their peer influence layer. While the disease transmits through direct close contacts, vaccine views and uptake behaviors spread interpersonally within a long-range, potentially virtual network. Our comprehensive simulation results demonstrate that network-based targeting with initial seeds of pro-vaccine supporters significantly influences the ultimate vaccination adoption rate and thus the extent of the epidemic outbreak.

Marzena Fügenschuh, Feng Fu
Community-Aware Centrality Measures Under the Independent Cascade Model

Network topology, diffusion process, and node centrality are the key elements driving the diffusion dynamics in networks. Classical centrality measures do not exploit the community structure, although it is a ubiquitous property of natural and man-made real-world networks. Recent works have shown that community-aware centrality measures can be more effective. However, in their investigation, these works generally focus on popular diffusion processes such as the Susceptible-Infected-Recovered (SIR) and the Linear Threshold (LT) models on real-world networks. This work performs an extensive comparative analysis of eight popular community-aware centrality measures using the Independent Cascade (IC) model. Besides real-world networks, we also consider artificial networks with controlled properties to better understand the influence of the network topology in the diffusion process. Results show that targeting the nodes bridging the communities or highly inter-linked nodes results in a higher diffusion when a low fraction of nodes are initially involved in the diffusion process. In contrast, when the initial budget of nodes is high, it is more effective to target distant hubs as initial spreaders. Moreover, setting a uniform threshold and a weak community structure strength hinders the diffusive power of the community-aware centrality measures.
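The Independent Cascade model used for this comparison has a short Monte-Carlo formulation, sketched below. The toy bridged-triangles graph and the activation probability are illustrative choices, not the paper's benchmark networks or settings.

```python
import random

def independent_cascade(adjacency, seeds, p=0.1, rng=None):
    """One Monte-Carlo run of the Independent Cascade (IC) model: each newly
    activated node gets a single chance to activate each inactive neighbour,
    succeeding independently with probability p."""
    rng = rng or random.Random()
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        newly = []
        for u in frontier:
            for v in adjacency.get(u, ()):
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly.append(v)
        frontier = newly
    return active

def expected_spread(adjacency, seeds, p=0.1, runs=500, seed=0):
    """Average final cascade size over repeated IC runs."""
    rng = random.Random(seed)
    return sum(len(independent_cascade(adjacency, seeds, p, rng))
               for _ in range(runs)) / runs

# two triangles joined by a bridge edge (2-3): compare seeding the bridging
# node against seeding a peripheral one, as community-aware measures would
G = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
spread_bridge = expected_spread(G, [2], p=0.3)
spread_corner = expected_spread(G, [0], p=0.3)
```

A community-aware centrality would pick seeds like node 2 here, which sits on the bridge between the two communities.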

Hawraa Zein, Ali Yassin, Stephany Rajeh, Ali Jaber, Hocine Cherifi
Paths for Emergence of Superspreaders in Dengue Fever Spreading Network

The identification of superspreaders is essential to contain an epidemic, especially when there is not enough information about the disease to develop precautionary measures. Unlike infections transmitted directly between individuals of the same species, epidemics caused by vectors have distinctive peculiarities. In this direction, we study the networks obtained from the dissemination of dengue to verify, from the results of agent-based model simulations, whether the transmission of this disease follows the 20/80 rule for the proportion of spreaders and infected. We built different transmission networks considering the spread between vectors and humans up to the second generation and observed that, although the human-to-human transmission network follows the 20/80 rule, the other networks (human–mosquito, mosquito–mosquito, and mosquito–human) do not. Varying the density of agents, we show that the phenomenon of superspreading is accentuated at high mosquito densities. These characteristics of vector-borne disease networks need to be further explored, as the vectors are highly vulnerable to climate change, and a better understanding of disease spread can help better target dengue epidemic control strategies.

L. L. Lima, A. P. F. Atman

Multilayer Networks

Frontmatter
Structural Cores and Problems of Vulnerability of Partially Overlapped Multilayer Networks

The concept of the aggregate-network of a multilayer network (MLN), which in many cases significantly simplifies the study of intersystem interactions, is introduced, and the properties of its k-cores are investigated. The notion of p-cores is defined, with the help of which the components of an MLN that are directly involved in the implementation of intersystem interactions are distinguished. Methods of reducing the complexity of multilayer network models are investigated, which allow us to significantly decrease their dimensionality and better understand the processes that take place in intersystem interactions of different types. Effective scenarios of simultaneous group and system-wide targeted attacks on partially overlapped multilayer networks are proposed, focusing on the transition points of the MLN through which the intersystem interactions are actually implemented. It is shown that these scenarios can also be used to solve the inverse problem, namely, determining which elements of an MLN should be blocked first to prevent the accelerated spread of dangerous infectious diseases, etc.

Olexandr Polishchuk
Multilayer Block Models for Exploratory Analysis of Computer Event Logs

We investigate a graph-based approach to exploratory data analysis in the context of network security monitoring. Given a possibly large batch of event logs describing ongoing activity, we first represent these events as a bipartite multiplex graph. We then apply a model-based biclustering algorithm to extract relevant clusters of entities and interactions between these clusters, thereby providing a simplified situational picture. We illustrate this methodology through two case studies addressing network flow records and authentication logs, respectively. In both cases, the inferred clusters reveal the functional roles of entities as well as relevant behavioral patterns. Displaying interactions between these clusters also helps uncover malicious activity. Our code is available at https://github.com/cl-anssi/MultilayerBlockModels .

Corentin Larroche
On the Effectiveness of Using Link Weights and Link Direction for Community Detection in Multilayer Networks

Multilayer networks are useful representations of real-world complex networks, and community detection in multilayer networks has been an area of active research. There are several options for the graph representation of a multilayer network, for example, directed or undirected and weighted or unweighted. Although these options may affect the results of community detection in a multilayer network, the representations that are effective for community detection have not yet been clarified. In this paper, we experimentally investigate how the graph representation of a multilayer network affects the results of community detection. Through experiments using multilayer networks of Twitter users, we show that using a directed graph for each layer of a multilayer network improves the accuracy of estimating the communities of Twitter users. We also show that when there is a clear oppositional structure among nodes in a network, manipulating link weights to emphasize that structure improves the accuracy of community detection.

Daiki Suzuki, Sho Tsugawa
Backmatter
Metadata
Title
Complex Networks and Their Applications XI
edited by
Hocine Cherifi
Rosario Nunzio Mantegna
Luis M. Rocha
Chantal Cherifi
Salvatore Miccichè
Copyright Year
2023
Electronic ISBN
978-3-031-21127-0
Print ISBN
978-3-031-21126-3
DOI
https://doi.org/10.1007/978-3-031-21127-0
