Considering the simulation of the proposed algorithm using datasets, numerical assessments, and evaluation of the output results, the research methodology used in this paper can be classified as quantitative, analytical, conclusive, and applied research.
The clique definition
There is not a precise definition of the clique concept, because its usage is different for various applications. The clique is a structural and topological attribute of the graphs. Simply, a clique is a set of nodes, such that the number of edges inside this set is far more than the edges of that set with the other vertices of the graph. The cliques of the graph may be overlapped, which means that some nodes of the graph are simultaneously a member of two or more different cliques. Many algorithms are proposed for the detection of the cliques in weighted or unweighted, directed or undirected graphs which are briefly studied and stated in the section "
Related Work".
An abstract definition of the clique is proposed by Thai and Pardalos [
24] as follows. Considering a simple undirected graph
G =
(V, E) in which
V =
{1, 2, …, n} is the set of vertices (nodes), and
E ⊆ V ×
V is the set of edges where
||V||=
n and
||E||=
m. The graph
G is complete if
∀ u,v ∈ V where
u ≠ v, the edge
(u, v) ∈ E. The clique
C is a subset of
E if the extracted subgraph
G[C] is complete. The clique is called maximal if it is not within a clique larger than itself, and it is called maximum if there is not ant another larger clique in the graph. The size of the maximum clique inside of the graph
G which is known as the clique number is shown as
ω(G) symbol [
24].
Another formal definition for clique is proposed by Hao et al. [
13] and Hao et al. [
14]. A clique is a subset of
S ⊂
V where for any two members of S like
vi and
vj, the edge (
vi,
vj) exists in
E. A clique
c is called maximal if there is not another clique c' in the same graph G, where
c ⊂
c'.
Several studies are done by the researchers and many algorithms based on deterministic, heuristic, and meta-heuristic approaches are proposed which most of which are time-consuming, sensitive to initial conditions, need to adjust the parameters, low-efficiency in sparse graphs, and not-scalability are the most common disadvantage.
The first review survey was published in 1994. Wu and Hao [
27] made a comprehensive review and published a survey including mathematical models, heuristic, and meta-heuristic approaches developed by researchers for solving the maximum clique problem. Considering the NP-Complete nature of the problem, most of the recent studies are focused on developing meta-heuristic algorithms. The Genetic Algorithm (GA) and the Ant Colony Optimization (ACO) approaches are two of the oldest methods that can be mentioned for solving the maximum clique problem. Fenet and Solnon [
6] proposed an ACO-based algorithm called Ant-Clique which distinguishes the maximal clique by repeatedly adding new nodes to the partial clique ever found. The ACO algorithm as a basic approach is combined with the Simulated Annealing (SA) algorithm by Xu et al. [
28] and Tabu-Search (TS) by Rezvanian and Meybodi [
21] which have reported acceptable results; nevertheless, both methods are complicated and the execution times are high.
Hao et al. [
11] used the basic component analysis method for finding maximum k-clique on SNs. They first created an official context for the SN using a modified adjacency matrix and next defined new concept names
k-equiconcept on a network and claimed that the k-clique problem is equivalent to the k-equiconcept problem. Hao et al. [
8]
8] developed a method for detecting k-cliques on the Dolphin dataset. In this iterative process, each node tracks its clique—through rate by averaging its neighbors' information at each step. There is also a maximum amount for the number of cliques a graph can be a member of. This approach was evaluated using the Triadic Formal Concept Analysis (TFCA) after transforming a weightless social network. Experimental results show that this algorithm can be effective in identifying reliable bands.
Duan et al. [
4] solved the maximum clique problem based on clustering and analyzed it on two datasets, ENRON and DBLP. In this model, a link is created for both users if they participate in a discussion about one or more topics or stories. In this case, they both have similar interests. Therefore, network edges are updated using the information compatibility attitude between each pair of users. Each pair of users
i and
j may exchange some topics or identities with each other, but the implicit orientations or attitudes of the two users on different topics may not be the same. Using simple statistical methods, the attitude compatibility between both users is calculated. The value of this criterion is between 0 and 1, and the higher the value, the greater the degree of consistency of attitudes of both pairs in a subject. In the second part, a fast parallel optimization algorithm is used that performs greedy optimization to find cliques. The experimental results show that the clustering algorithm for k-clique on both datasets has less error to solve the problem.
Patrick and Östergård [
19] proposed a fast algorithm for solving the maximum clique problem. The results of the proposed method on DIMACS graphs show favorable responses in terms of clique size and convergence speed compared to other methods. Finally, their proposed method and the results are validated using convergence diagrams. This approach is based on the drop in density between each pair of parent and child nodes. The lower the density, the more likely the child is to make an independent clique. Therefore, based on the maximum flow of the minimum cut theorem, a new algorithm that can automatically find an optimal set of local cliques is proposed.
Hao et al. [
8,
9] proposed a strategy to improve the graphs for finding the maximum graph by adding a preprocessing step in which the edges are weighted according to their centrality in the network topology. In this method, the centrality of the edge indicates its cooperation in graph weighting. This strategy effectively tries to complete the information about the network topology and can be used as an additional tool to maximize the graph. The calculation of the center of the ridge is performed by performing several random walks with limited length on the network, which calculates the center of the edge possible in large-scale networks.
The purpose of [
12] research was to investigate the method of extracting k-clique information in social network analysis. Initially, using a clustering algorithm, it groups all social topics into a set of topics. Network members are then divided into clusters of social issues in which they are involved. Due to the difference in the strength of the connections between the nodes, link analysis is performed in each local cluster to find cliques. In this study, by applying the principles of social networks to the Internet of Things (IoT) and considering marketing in human social networks, they showed that the impact of the Internet of Things on Internet marketing and making it smart is very important.
In the research of [
18], a meta-heuristic algorithm has been proposed that removes low-value vertices with greater probability from the current largest clique. Thus more valuable vertices have more chance (probability) for presence. With each small move, many changes are made to the current clique, and step by step it moves toward finding the largest clique. The results of their proposed method on DIMACS graphs show good results in terms of clique size and convergence speed compared to other methods.
Traag and Bruggeman [
25] proposed a new ant colony algorithm to solve the clique problem. In recent years, the ant colony optimization algorithm has achieved successful results in solving various discrete optimization problems, but in the clique problem, the standard ant colony optimization algorithm has low convergence. Therefore, to solve the maximum clique problem, the researchers suggested changes in the way the pheromone is updated to select the appropriate alternative path. Their algorithm, while maintaining the initial successful properties, has low computational complexity and rapid convergence.
Mirghorbani and Krokhmal [
17] proposed a method for finding a k-sized clique in k-segmented graphs. This method is based on the bit pattern used to find cliques. Varun and Ravikumar [
26] went from network analysis to several associations for community recognition in telecommunications networks. They also identified central users and network isolators by identifying common members between associations.
Himmel et al. [
15] examined the society detection problem in time graphs and, by modifying the Bron–Kerbosch algorithm, proposed a method for identifying time cliques. Sun et al. [
23] studied the society detection problem in dynamic graphs and showed that their proposed method is less complex in time than the previous methods.
Identifying the largest clique can also be used as a tool in image processing. By defining a threshold for the size of the largest association between the pixels of two consecutive frames of an image, [
30] provided a way to predict the direction of motion of a video camera. Saeidi [
7] proposed an Artificial Bee Colony optimization algorithm for finding the maximal clique and compared their proposed method with ACO and PS-ACO based on some DIMACS benchmarks. Table
1 summarizes the previous methods and related pros and cons.