A new multi-objective evolutionary framework for community mining in dynamic social networks

doi:10.1016/j.swevo.2016.09.001

Swarm and Evolutionary Computation

Volume 31, December 2016, Pages 90-109

https://doi.org/10.1016/j.swevo.2016.09.001

Abstract

Evolutionary clustering, i.e., clustering in the presence of dynamic shifts in the topological structure of the data, has recently drawn remarkable attention, and several algorithms have been developed for the study of complex real networks. Despite the growing interest, all of these algorithms are designed on seemingly the same principle: the problem is decomposed into two individual criteria, snapshot quality and temporal smoothness. Snapshot quality should properly cluster the individuals of a network into interconnected communities. Temporal smoothness, on the other hand, should capture the dynamic shift of the interconnected clusters from one time step to the next. Thus, in the absence of any dynamic behavior, an evolutionary clustering model should be no more than a community detection model for a static network. Unfortunately, all of the developed algorithms treat the snapshot quality as a unified model of both intra- and inter-connected community detection, and the temporal cost as a community evolution detection model. The contribution of this paper starts by noting this limitation of the existing state-of-the-art algorithms: despite their performance on dynamic complex networks, their formulations do not completely reflect a sufficient community detection model. Our framework, then, models the evolutionary clustering problem by hypothesizing that it should not depart too much from the community detection problem. To support this claim, an alternate decomposition perspective is proposed that casts the problem as a multi-objective optimization problem in terms of snapshot and temporal scores of both intra- and inter-community connections. Two snapshot qualities are proposed to individually emphasize the roles of the intra- and inter-community scores, while the temporal cost is proposed to cross-fertilize the inter-community score.
By applying one of the prominent multi-objective evolutionary algorithms (MOEAs) to solve the proposed multi-objective evolutionary clustering framework and testing it on several synthetic and real-world dynamic networks, we demonstrate the ability of the proposed model to address the problem more accurately than the existing state-of-the-art formulations.

Introduction

Due to their practical significance and ever-increasing applicability in many real-world dynamic systems, networks and their topological attributes have recently drawn growing attention and fueled the desire to solve the problems they pose. Examples include technological networks, information networks, and social-communication networks such as the Internet, the World Wide Web, and Facebook. Other interesting examples are biological and ecological networks such as protein-protein interaction networks and food webs.

Many algorithms have appeared in the literature to analyze the behavior of complex networks at a single time step and, more importantly, over multiple time steps. The study of the functional homogeneity of groups of members (commonly called modules, co-clusters, or simply clusters) in a network features prominently in social network analysis (SNA). In this context, a module or community is a set of individuals with more intra-connections among its members than inter-connections with other communities in the network. Moreover, a community can undergo several types of membership drift over time, resulting in continuous changes in interaction signatures. Thus, by identifying a network's communities (and their evolution), several functional phenomena can be depicted and predicted from the network structure. Community mining in evolutionary networks has a growing range of applications. Examples include trend analysis in social spheres and dynamic link prediction [1], [2], [3], [4], [5].

Capturing the evolution of clusters in dynamic complex networks was first introduced by Chakrabarti et al. [6] and has been adopted in all state-of-the-art approaches (examples include [7], [8], [9], [10], [11], [12]). These approaches address the fundamental issue of the evolution of temporal data on seemingly one common principle, inspired primarily by the detection of the two participants of the problem: the snapshot patterns and the evolutionary patterns of the communities. These two sub-problems are formulated as a multi-cost optimization problem consisting mainly of a snapshot cost and a temporal cost. To specify the characteristics of the evolutionary clustering problem in these approaches, three design parameters are used, namely, snapshot intra-cluster quality, snapshot inter-cluster quality, and temporal cost. These approaches showed that the interplay of these parameters plays a vital role in the capability of the adopted evolutionary clustering algorithm.

Although all of the existing state-of-the-art frameworks attempt to involve the above-mentioned parameters by maximizing the snapshot quality of the network at the current time step and minimizing the temporal cost of the network between the current time step and the previous one, this formulation allows (as will be demonstrated in our investigations) a certain degree of cross-competition between snapshot quality and temporal cost, which may become an acute problem by eliminating some promising solutions. This cross-competition, however, cannot be overlooked in any evolutionary clustering framework and thus can also act against our framework. Nevertheless, our idea is to lessen its impact by designing a proper cross-fertilization model between the temporal cost and the snapshot inter-cluster connection quality. Once that is done, the cross-competition takes place between the snapshot intra-cluster connection quality and the designed cross-fertilization model. The proposed framework is not intended to be an exact evolutionary clustering framework; rather, its purpose is to offer a more successful way to maintain the essential characteristic components of the evolutionary clustering problem and to explore their combined impact on the final performance of the model. The remaining sections of this paper present our alternate perspective for solving the evolutionary clustering problem in complex networks. The proposed framework should contribute to each of the following two problem-solving aspects:

1. How can we characterize the evolutionary clustering problem in dynamic complex networks?
2. How can we shift from the de facto definition of the evolutionary clustering problem and define an alternate and efficient framework to cast and state it?

Section 2 gives background on the evolutionary clustering problem in networks and presents the relevant graph terminology. State-of-the-art works are then reviewed in Section 3. Our framework is stated and formulated in Section 4 (Evolutionary clustering: an alternate trajectory) and Section 5 (Methodology of the proposed MOEC). Experimental results and the corresponding analysis on synthetic and real-life social networks are provided in Section 6. Finally, conclusions on the main findings of this paper and possible further ramifications are highlighted in Section 7.

Section snippets

Graph clustering and evolutionary clustering

Mathematically, a network is modeled as a graph of pairwise edges between its nodes. Assuming, for example, a friendship graph $G$ modeling a social network $N$, the pairwise friendship connections between individual entities of $N$ can be modeled by the pair $(V, E)$. The set of $n$ individuals or entities in $N$ is denoted as the set of nodes or vertices $V = \{v_{1}, v_{2}, \dots, v_{n}\}$ in $G$, while the friendship connection between any pair of individuals in $N$ is denoted as an edge $(v_{i}, v_{j})$ in $E$, i.e. $E = \{(v_{i}, v_{j}) \mid 1 \leq i, j \leq n \land i \neq j\}$.
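The graph model above, together with the intra-/inter-connection view of a community given in the Introduction, can be illustrated with a minimal Python sketch. The function names `build_graph` and `intra_inter_edges` and the toy edge list are assumptions made for illustration only; they are not taken from the paper.

```python
def build_graph(n, edges):
    """Return adjacency sets for an undirected graph G = (V, E) on nodes 0..n-1."""
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        if i == j:
            raise ValueError("self-loops are excluded (i != j)")
        adj[i].add(j)
        adj[j].add(i)
    return adj


def intra_inter_edges(adj, community):
    """Count edges inside a community and edges leaving it."""
    members = set(community)
    # Each internal edge is seen from both endpoints, hence the division by 2.
    intra = sum(1 for v in members for u in adj[v] if u in members) // 2
    # Each outgoing edge is seen once, from its endpoint inside the community.
    inter = sum(1 for v in members for u in adj[v] if u not in members)
    return intra, inter


# Hypothetical toy network: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = build_graph(6, edges)
print(intra_inter_edges(adj, [0, 1, 2]))  # (3, 1)
```

In this toy case each triangle has three internal edges and one outgoing edge, which matches the informal notion of a community as a group with more intra-connections than inter-connections.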

Literature review

By definition, capturing the evolution of patterns within social networks must exploit time-evolving friendship relations gathered from the nodes and assign the nodes accordingly to different evolving clusters. Kumar et al. [30] proposed a community mining model to analyze two large online social networks. They presented two properties that can be extracted from an online network: node density and membership. Their model accordingly discriminates the regions of the network into three

Evolutionary clustering: an alternate trajectory

With all the foregoing heuristic and meta-heuristic methodologies (which are based on one common snapshot-temporal cost framework) in our arsenal, it is quite natural to ask whether it is possible to create another principle and elaborate any of the presented methodologies to tackle the evolutionary clustering problem more accurately. The answer of this paper is yes. To support this claim, the remainder of this section introduces an alternate perspective of the evolutionary clustering framework, noted

Methodology of the proposed MOEC

The methodology of the formulated MOEC is mainly based on the decomposition-based multi-objective evolutionary algorithm (MOEA/D) of Zhang and Li [41]. A general review of MOEA/D is presented next.
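As a rough illustration of the decomposition idea behind MOEA/D, the sketch below builds evenly spread weight vectors for two minimization objectives, scalarizes an objective vector with the Tchebycheff function, and derives the neighborhood of each subproblem from the distances between weight vectors. The Tchebycheff approach is one of several decompositions described for MOEA/D; whether MOEC uses it, and all names and parameter values below, are assumptions rather than details taken from this excerpt.

```python
import math


def weight_vectors(n_sub):
    """Evenly spread weight vectors (w1, w2) with w1 + w2 = 1 for two objectives."""
    return [(k / (n_sub - 1), 1 - k / (n_sub - 1)) for k in range(n_sub)]


def tchebycheff(objs, weights, ideal):
    """Tchebycheff scalarization: max_i w_i * |f_i - z_i| (to be minimized)."""
    return max(w * abs(f - z) for f, w, z in zip(objs, weights, ideal))


def neighborhoods(weights, t):
    """For each subproblem, indices of the t closest weight vectors (itself included)."""
    result = []
    for w in weights:
        order = sorted(range(len(weights)), key=lambda j: math.dist(w, weights[j]))
        result.append(order[:t])
    return result


weights = weight_vectors(5)          # hypothetical: 5 subproblems
hoods = neighborhoods(weights, 3)    # hypothetical: neighborhood size 3
ideal = (0.0, 0.0)                   # assumed ideal point for the example
score = tchebycheff((0.4, 0.2), weights[2], ideal)
print(weights[2], hoods[2], score)
```

Each weight vector defines one scalar subproblem, and in MOEA/D a subproblem exchanges candidate solutions only with the subproblems in its neighborhood.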

Experimental results

In this section, we will test the performance of the proposed MOEC framework (based on the multi-objective minimization of an intra-snapshot cost formulated in Eq. (12) and an inter-and-intra-temporal cost formulated in Eq. (14)) against the traditional MOEC framework projected into four state-of-the-art multi-objective optimization models presented in [12], [40]. These models, noted as snapshot-temporal pairs $(\Phi_{1}, \Phi_{2})$, are: (1) $(Q, NMI)$, (2) $(CS, NMI)$, (3) $(CO, NMI)$, and (4) $(NC, NMI)$. Henceforth,
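In the four baseline pairs above, NMI appears as the temporal component. A minimal sketch of normalized mutual information between two partitions of the same node set is given below, assuming the commonly used normalization $2I(A;B)/(H(A)+H(B))$ and label lists in which position $k$ holds the community of node $k$. The function name `nmi` and the example partitions are hypothetical, and the exact NMI variant used in [12], [40] is not stated in this excerpt.

```python
import math
from collections import Counter


def nmi(labels_a, labels_b):
    """Normalized mutual information 2*I(A;B) / (H(A) + H(B)) of two partitions."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("both partitions must label the same non-empty node set")
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    # n * I(A;B), computed from the confusion counts.
    mutual = sum(c * math.log(c * n / (count_a[a] * count_b[b]))
                 for (a, b), c in joint.items())
    # n * (H(A) + H(B)).
    entropy = (-sum(c * math.log(c / n) for c in count_a.values())
               - sum(c * math.log(c / n) for c in count_b.values()))
    if entropy == 0:
        return 1.0  # both partitions consist of a single cluster
    return 2 * mutual / entropy


previous = [0, 0, 0, 1, 1, 1]   # hypothetical clustering at time step t-1
current = [1, 1, 1, 0, 0, 0]    # same grouping at time step t, labels renamed
print(nmi(previous, current))
```

Because NMI depends only on how nodes are grouped, renaming the community labels between time steps does not change the score; a value near 1 indicates that the clustering changed little from one snapshot to the next.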

Conclusions

One of the most important aspects of the proposed multi-objective evolutionary clustering framework is that it recognizes the importance of each of the three main participants of the problem, namely snapshot intra-cluster quality, snapshot inter-cluster quality, and temporal smoothness. Unlike state-of-the-art evolutionary clustering models, the detection of communities in our model is more accurate and less susceptible to becoming trapped in local optima because of the explicit manipulation of the snapshot

References (51)

C. Aggarwal et al., Evolutionary network analysis: a survey, ACM Comput. Surv. (CSUR) (2014)
J. Leskovec et al., The dynamics of viral marketing, ACM Trans. Web (TWEB) (2007)
L. Yan, J. Wang, J. Han, Y. Wang, A significance-driven framework for characterizing and finding evolving patterns of...
Acar, E., Dunlavy, D.M., & Kolda, T.G. (2009, December). Link prediction on evolving data using matrix and tensor...
R. Kumar, J. Novak, A. Tomkins, Structure and evolution of online social networks. In Link Mining: Models, Algorithms,...
D. Chakrabarti, R. Kumar, & A. Tomkins, Evolutionary clustering, in: Proceedings of the 12th ACM International...
L. Tang, H. Liu, J. Zhang, & Z. Nazeri, Community evolution in dynamic multi-mode networks, in: Proceeding of the...
Y.R. Lin et al., Analyzing communities and their evolutions in dynamic social networks, ACM Trans. Knowl. Discov. Data (TKDD) (2009)
Y. Chi et al., On evolutionary spectral clustering, ACM Trans. Knowl. Discov. Data (TKDD) (2009)
M.S. Kim et al., A particle-and-density based evolutionary clustering method for dynamic networks, Proc. VLDB Endow. (2009)
F. Folino et al., An evolutionary multiobjective approach for community discovery in dynamic networks, IEEE Trans. Knowl. Data Eng. (2014)
Folino, F., & Pizzuti, C. (2010, August). A multiobjective and evolutionary clustering method for dynamic networks. in:...
M.R. Garey et al., Computers and Intractability. A Guide to the Theory of NP-Completeness (1979)
S. Schaeffer, Graph clustering, Comput. Sci. Rev. (2007)
U. Brandes et al., On modularity clustering, IEEE Trans. Knowl. Data Eng. (2008)
A. Noack, Energy models for graph clustering, J. Graph Algorithms Appl. (2007)
F. Radicchi et al., Defining and identifying communities in networks, Proc. Natl. Acad. Sci. USA (2004)
M.E.J. Newman et al., Finding and evaluating community structure in networks, Phys. Rev. E (2004)
M. Tasgin, A. Bingol, Communities detection in complex networks using genetic algorithms. in: Proceedings of the...
A. Clauset et al., Finding community structure in very large networks, Phys. Rev. E (2004)
J. Reichardt et al., Statistical mechanics of community detection, Phys. Rev. E (2006)
M.E.J. Newman, Finding community structure in networks using the eigenvectors of matrices, Phys. Rev. E (2006)
M.E.J. Newman, Modularity and community structure in networks, Proc. Natl. Acad. Sci. (2006)
Tasgin, M., Herdagdelen, A., Bingol, A. (2007). Communities detection in complex networks using genetic algorithms,...
S. Lehmann et al., Deterministic modularity optimization, Eur. Phys. J. B (2007)

