A data clustering approach based on universal gravity rule

doi:10.1016/j.engappai.2015.07.018

Engineering Applications of Artificial Intelligence

Volume 45, October 2015, Pages 415-428

https://doi.org/10.1016/j.engappai.2015.07.018 Get rights and content

Abstract

In this paper, a new robust data clustering algorithm inspired by Newtonian law of gravity is proposed. The proposed algorithm not only reduces the effects of noise and outliers but also, it is not sensible to the initial positions of the centroids. In the proposed method, data points and the cluster centroids are considered as fixed celestial objects and movable objects, respectively. The celestial objects apply a gravity force to the movable objects and change their positions in the feature space and therefore, the best positions of the cluster centroids are determined by employing the law of gravity. To evaluate the performance of the proposed algorithm, a comparative experimental study with some well-known clustering algorithms, using three visual datasets as well as several benchmark datasets from UCI, is performed. The experimental results confirm the effectiveness and the efficiency of the proposed clustering algorithm.

Introduction

Generally, there is no common terminology on definition of data clustering, but most researchers describe a cluster through its internal homogeneity and external separation, i.e., data points in the same cluster should be similar (or related) to each other, and different from (or unrelated to) the data points in other clusters. Both similarity and dissimilarity should be examinable in a clear and meaningful way. These algorithms are usually unsupervised, and they are mostly used in the fields of machine learning, data mining, pattern recognition, image analysis and bioinformatics (Jain et al., 1999, Hammouda, 2011).

The general part of all clustering algorithms is to find the representatives of clusters, i.e. cluster centroids for compact clusters. A clustering algorithm decides, each input data belongs to which cluster, (i.e. is the closest to which centroid). In some of the clustering techniques, the algorithm tries to partition data points into a given number of clusters, e.g. K-means (Hartigan and Wong, 1978) and fuzzy C-means (Bezdek et al., 1984). In some cases the number of clusters is not known a priori. Such an algorithm starts by finding the largest cluster first, next goes to find the second, and so on (Yager and Filev, 1994, Chiu, 1995, Wu and Yang, 2002).

The well-known K-means algorithm is one of the most used algorithms due to its efficiency and simplicity in data clustering where, it measures the distance between clusters׳ representatives (centroids) and data points to partition data into K clusters. In most cases, the Euclidean distance is used as the dissimilarity measure. To find the best position of the representatives, the K-means algorithm minimizes a cost function of data variations around the centroids. However, the initial state may cause the algorithm to trap into a local optimum, therewith affecting the quality of the final solution. Many studies have been made to overcome this drawback of the K-means algorithm, particularly by using fuzzy set theory and evolutionary algorithms (Taherdangkoo and Bagheri, 2013, Niknam and Amiri, 2010).

Fuzzy C-means (FCM) algorithm is an improvement over K-means clustering. In this algorithm, each data point belongs to a cluster by a degree of membership. Similar to the K-means algorithm, FCM relies on minimizing a cost function of the dissimilarity measure to find centroids. The FCM uses the concept of fuzzy set theory to handle the uncertainty associated with the data to be clustered (Gustafson and Kessel, 1979, Pal et al., 2005, Zarandi et al., 2009).

The Mountain clustering algorithm calculates a mountain function (density function) at every possible position in the data space, and chooses the position with the greatest density value as the center of the first cluster. Then, it removes the effects of the first cluster mountain function and finds the second cluster center (Moertini, 2002). This process is repeated until a desired number of clusters is found. The subtractive clustering is similar to the mountain clustering, except that it uses data point positions to calculate the density function, thus reduces the number of calculations significantly (Kim et al., 2005). This means that the computation depends on the problem size instead of the problem dimension.

Each clustering algorithm has its advantages and disadvantages in clustering of different types of data points. Therefore, many clustering methods have been proposed to rectify disadvantages of these algorithms. Also, some researchers tried to suggest new algorithms inspired by nature. In this paper, a nature inspired clustering algorithm is introduced by employing Newtonian law of gravity. Gravity based clustering algorithms are not new and their history returns to the 70s. In the following, first related works are reviewed and then the proposed work is described.

Recently, more and more attention has been focused on using nature based inspired algorithms to solve clustering problems (Nanda and Panda, 2014). Moreover, there are many clustering algorithms which have been made by hybridizing different types of evolutionary algorithms with K-means algorithm to overcome the disadvantage of K-means. Evolutionary algorithms such as genetic algorithm (GA) (Maulik and Bandyopadhyay, 2000), ant colony optimization (ACO) (Kashef and Nezamabadi-pour, 2014), particle swarm optimization (PSO) (Niknam and Amiri, 2010), gravitational search algorithm (GSA) (Rashedi and Nezamabadi-Pour, 2013, Dowlatshahi and Nezamabadi-pour, 2014) , are usually nature inspired. Some of these methods are reviewed in Yazdani et al. (2014).

Various applications of the gravity theory in solving problems of different areas have been considered including image edge detection (Sun et al., 2007, Deregeh and Nezamabadi-pour, 2014), data classification (Shafigh et al., 2013, Rezaei and Nezamabadi-pour, 2015), optimization (Soleimanpour-Moghadam et al., 2014), and data clustering (Sanchez et al., 2014, Wright, 1977, Yung and Lai, 1998). Gravity-based clustering algorithms simulate the process of the attraction and merging of objects by their gravity forces. To realize data clustering, these algorithms consider each data point as an object and assign a mass to it.

The first version of gravitational clustering algorithm was proposed by Wright (1977). This algorithm uses a Markovian model for the gravitational clustering. It is an incremental algorithm that updates the position of each data point in each iteration. How the objects are joined is determined by the continuous motion of all objects in the system according to gravitational forces. In this method, objects do not converge together but rather converge to “equilibrium” positions (Wright, 1977).

Yung and Lai (1998) present a Markovian model of clustering based on gravitational concepts. The model is used for color image segmentation in RGB color space. In the clustering process, each pixel is considered as an object with the unity mass. All objects apply gravitational force to each other. The Markovian model of gravitational attraction between two objects i and j is defined as follows: $F (m_{i}, m_{j}) = - G \frac{m_{i} m_{j}}{{‖ X_{i} - X_{j} ‖}^{3}} (X_{i} - X_{j})$ where $X_{i}$ and $X_{j}$ present the locating D-dimensional vectors of objects $i$ and $j$ , respectively, $m_{i}$ and $m_{j}$ are the mass of them, G is the universal gravitational constant and ||.|| is a Euclidean norm function where $‖ X_{i} - X_{j} ‖ = \sqrt{\sum_{l = 1}^{D} {(x_{i l} - x_{j l})}^{2}}$ . This force causes the objects (data points) to move in the space. Two objects that move to the same location are merged and form a new object that its mass is considered as the summation of the two merged masses.

A clustering method based on the notion of attraction force between each pair of data points has been presented by Kundu (1999). The clusters are formed by allowing each data point to move slowly under the resultant effect of all forces exerted to it, and by merging two data points when they become close to each other. When two or more data points (clusters) are merged, the sum of their masses becomes the mass of the resulting cluster; therefore, the mass of each cluster equals the number of its data instances (Kundu, 1999).

In the work done by Gomez et al. (2004), a modified version of Newton׳s law of gravity is used. This modification simplifies the calculation of gravitational force, where Eq. (2) is used for moving a data point according to the gravitational law. The value of the universal gravitational constant, G, is reduced after each iteration, which serves to eliminate the big crunch effect of all the data points. This is used as a mechanism that does not end up with only one cluster $X (t + 1) = X (t) + G \vec{d} f (\frac{\tilde{d}}{\vec{d}})$ where $\vec{d} = X_{i} - X_{j}$ is the vector direction, $f$ is a decreasing function such as $f (α) = 1 / α^{3}$ and $X$ is the position of the data point, $\vec{d}$ is the vector direction of such data point in time t, $\vec{d}$ is the rough estimate of maximum distance between the closest points such as $2 \sqrt{D} / \sqrt{3} n^{\frac{1}{D}}$ for n data points in the D-dimensional [0,1] Euclidean space and ||.|| is the norm function (Gomez et al., 2004).

Long and Jin (2006) proposed a simplified gravitational clustering (SGC) method for multi-prototype learning based on minimum classification error. It simulates the movement of objects according to the gravity force and checks for possible merging. The gravitational clustering is simplified by ignoring velocity and multi-force attraction. The pair objects are merged based on the following equations: $\begin{array}{l} {X_{i}, X_{j}} = \underset{i, j}{arg \min} ‖ X_{i} - X_{j} ‖ \frac{m_{i} m_{j}}{2} \\ m_{m e r g e d} = m_{i} + m_{j} \\ X_{m e r g e d} = \frac{X_{i} m_{i} + X_{j} m_{j}}{m_{i} + m_{j}} \end{array}$

The approach proposed by Zhang is based on a simplified version of Newtonian gravitational forces and Newtonian motion of objects, as shown in Eqs. (4), (5), respectively (Zhang and Hongshan, 2010). $V_{i} (t + Δ (t)) = V_{i} (t) + G m_{i} Δ (t) \frac{\vec{d}}{{\vec{d}}^{3}}$ $X_{i} (t + Δ (t)) = X_{i} (t) + V_{i} (t) Δ (t) + G m_{i} Δ {(t)}^{2} \frac{\vec{d}}{{\vec{d}}^{3}}$ where $Δ (t)$ is a small discrete time interval and V is the velocity of object i.

The value of the universal gravitational constant, G is reduced at each iteration (Zhang and Hongshan, 2010). In the GRIN algorithm, an incremental hierarchical clustering technique based on the gravity theory, is presented to construct clustering dendrograms. An incremental clustering algorithm refers to the abstraction of distribution of data instances generated by the previous run of the algorithm. The mass of a cluster is the number of its data instances. GRIN algorithm works in two phases: initial phase, and incremental phase. In both phases, it invokes the gravitational agglomerative hierarchical clustering algorithm (Chen et al., 2005).

Ilc and Dobnikar (2012) presented a gravitational based method for clustering, known as GSOM, in which, each data point is viewed as a mass object. In GSOM, the basic idea is used and integrated with SOM, considering the connections between neurons (Ilc and Dobnikar, 2012).

Rashedi et al. (2009) have been proposed a stochastic population-based metaheuristic, called Gravitational Search Algorithm (GSA), based on Newtonian law of gravity and the laws of motion. Originally, GSA is designed for solving continuous optimization problems. GSA, like most metaheuristics has flexible abilities in different application such as clustering (Hatamlou et al., 2012).

Rashedi and Nezamabadi-Pour (2013) have been proposed a clustering algorithm based on the theory of gravity. In the proposed gravitational clustering, two objects that are close to each other with respect to a predefined threshold are merged together to construct a larger cluster. A “travel” operator has been introduced to move each pixel to a new position in the feature space, and a “merging” operator has been used to make a decision to merge or not. This may cause a drawback for sequential clustering, i.e. the final clustering depends heavily on the samples presentation order. Finally, an “escaping” operator inspired by the concept of escape velocity, has been introduced to eliminate the unpleasant effect of the travel operator. The authors apply the proposed gravitational clustering to image segmentation problem (Rashedi and Nezamabadi-Pour, 2013).

Some of the major challenges in clustering algorithms are the ability to deal with noises and outliers, imbalanced groups as well as the sensitivity to the initial position of cluster centroids. The aim of this paper is to propose a nature inspired clustering algorithm which is able to overcome some problems like outliers (or noisy data), imbalanced clusters, overlapped clusters and sensitivity to the initial position of cluster centroids. To the best of our knowledge, the data points in the existing gravity-based algorithms are considered as movable objects and are allowed to move around the feature space in the influence of Newtonian law of gravity and merge together if they are close enough to each other. In the proposed clustering method, we consider the data points as fixed celestial objects with the unity mass to apply a gravity force to movable objects and change their positions in the feature space. The aim is to find the best position of each cluster centroids (cluster representative) where each centroid is modeled by a movable agent with the unity mass. The centroids move around the feature space in the influence of the gravity force exerted by the celestial objects to find the best position. One can expect that the centroids stop in the optimum positions.

The rest of the paper is organized as follows. In the next Section, basic concepts of clustering and Newton׳s law of universal gravitation are reviewed and their properties are discussed. Section 3 is devoted to describe the proposed gravity algorithm. A comparative Experimental study is given in Section 4. Finally, we conclude the paper in Section 5.

Section snippets

Clustering

Clustering is the task of unsupervised classification of data points into groups (clusters). Data points are represented conventionally as multidimensional vectors, where each dimension is a single feature. Clustering algorithms are mainly used to identify groups of similar items within a universe of data. The grouping can be performed in a number of ways. The output clustering can be hard (i.e. a partition of the data into groups) or fuzzy where each data point has a variable degree of

Basic idea

It is assumed that each data point, $X_{i} = (x_{i 1}, x_{i 2}, \dots, x_{i D})$ , is located in a D-dimensional space, where D is the number of features. The clusters are compact and a point representative (centroid) is used to present each cluster. The main idea in the proposed algorithm is to consider a movable gravity object (agent) as the centroid of a cluster and each data point as a fixed gravity object. In this gravity system, the fixed objects apply the gravity force to the agents and change their positions in

Datasets

The performance of the proposed algorithm is evaluated in different problems in the field of clustering and classification and the results are compared with those of well-known methods in this field. To see how the proposed algorithm works for noisy data, outliers, and imbalance dataset, we apply it to three clustering datasets in 2-and-3 dimensional feature space for visual inspection and comparison. The used datasets in the visual comparison are shown in Fig. 3 (Wu and Yang, 2002).

As it is

Conclusions

In this paper, a new nature inspired clustering algorithm based on Newtonian law of gravity was presented. In order to evaluate the performance of the gravity-based clustering algorithm, experiments were performed using 12 datasets from the UCI machine learning repository in the field of clustering and classification. Furthermore, some experiments were performed on three visual datasets to see how it performs. The proposed algorithm has been compared with a number of well-known alternative

Acknowledgments

The authors give kind respect and special thanks to the anonymous reviewers for their useful advices. Moreover, the authors would like to express their gratitude towards Prof. Hadi Sadoghi-Yazdi for providing valuable help and Prof. Malihe M. Farasngi, Miss Elham Ghazizadeh and Miss Bahareh Nikpour for proof reading the manuscript.

References (52)

A. Bahrololoum et al.
A prototype classifier based on gravitational search algorithm
Appl. Soft Comput. J.
(2012)
J.C. Bezdek et al.
FCM: the fuzzy c-means clustering algorithm
Comput. Geosci.
(1984)
C.Y. Chen et al.
A statistics-based approach to control the quality of subclusters in incremental gravitational clustering
Pattern Recognit.
(2005)
C.H. Chou et al.
A prototype classification method and its use in a hybrid solution for multiclass pattern recognition
Pattern Recognit.
(2006)
M.B. Dowlatshahi et al.
GGSA: a grouping gravitational search algorithm for data clustering
Eng. Appl. Artif. Intell.
(2014)
R. Forsati et al.
Efficient stochastic algorithms for document clustering
Inf. Sci.
(2013)
S. García et al.
Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power
Inf. Sci.
(2010)
A. Hatamlou et al.
A combined approach for clustering based on K-means and gravitational search algorithms
Swarm Evol. Comput.
(2012)
Nejc Ilc et al.
Generation of a clustering ensemble based on a gravitational self-organising map
Neurocomputing
(2012)
S. Kashef et al.
An advanced ACO algorithm for feature subset selection
Neurocomputing
(2015)

S. Yazdani et al.

A gravitational search algorithm for multimodal optimization

Swarm Evol. Comput.

(2014)

Cited by (23)

Clustering ensemble-based novelty score for outlier detection
2023, Engineering Applications of Artificial Intelligence
Recently, One-class classification algorithms have been successfully used for outlier detection problems in several industrial fields. However, in case of that the target class has complex structures, single outlier detection model with one-class classifier often poorly performs because it cannot appropriately reflect intrinsic data structures. To address this limitation, we propose a clustering ensemble-based novelty score algorithm. The proposed algorithm calculates novelty score from the mixture of multiple clustering solutions generated by both random subspace and random-K ensemble approaches. Then, final ensemble novelty score is defined by summarizing multiple novelty scores obtained from individual clustering results. Because these multiple novelty scores are computed from many possible characteristics of target class information, the proposed ensemble novelty score can appropriately reflect the inherent structures of target class. Experiments were conducted on various benchmark datasets to compared with existing methods and investigate the properties of the proposed algorithm. The experimental results confirm that the proposed algorithm outperforms existing one-class classification methods in various cases.
CDBH: A clustering and density-based hybrid approach for imbalanced data classification
2021, Expert Systems with Applications
Citation Excerpt :
The performance of the k-means algorithm is very dependent on the initial centers of the clusters. In other words, by selecting the initial centers of the clusters incorrectly, the final centers of the clusters will not be appropriate, and this is the main disadvantage of this algorithm (Bahrololoum et al., 2015; Celebi et al., 2013; Hartigan & Wong, 1979; Mirzaei et al., 2014; Singh & Kaur, 2013). The proposed CDBH method is a simple hybrid method and belongs to the data-level category.
The problem of imbalanced data set classification is prevalent in the studies of machine learning and data mining. In these kinds of data sets, the number of samples in classes is unequal so that one class has a lot more samples (the majority or negative class) than the other (the minority or positive class). The classical classifiers are ineffective in these conditions because they are biased toward the majority class and ignore the minority class, which is more important. Preprocessing the data distribution before training the classifier is one of the most effective methods to resolve this problem. These methods, balance the data distribution by decreasing the majority class size (under-sampling methods) or increasing the minority class size (over-sampling methods) or combining both of them (hybrid methods). In this paper, we propose an effective and simple hybrid approach based on the density concept and clustering, which is called Clustering and Density-Based Hybrid (CDBH). First, the minority class samples are clustered by the well-known k-means algorithm and their densities in each cluster are obtained. Then, the denser minority samples are selected with more likely to generate the new minority samples. To decrease the majority class size, the k-means algorithm is applied again on the majority class samples to cluster them and compute their densities, like the previous stage. Finally, the denser majority samples will have more chance to choose from the training set, and other samples are removed to balance the data samples distribution between classes. In the experiments, the Support Vector Machine (SVM) classifier is used as the classifier, and F-measure and AUC criteria are employed for evaluation. Also, preprocessing methods are compared in terms of the complexity of the classification model and the over-sampling rate. The results of comparing CDBH and other state of the art methods over 44 imbalanced data sets show the superiority of the proposed CDBH method based on the F-measure criterion.
Optimized gravitational-based data clustering algorithm
2018, Engineering Applications of Artificial Intelligence
Citation Excerpt :
Finally, the conclusions are presented in Section 5. The GC algorithm was proposed by Bahrololoum et al. (2015). It is designed to cluster data points in a multidimensional feature space.
Gravitational clustering is a nature-inspired and heuristic-based technique. The performance of nature-inspired algorithms relies on the balance achieved between exploitation and exploration. A modification over a data clustering algorithm based on the universal gravity rule is proposed in this paper. Although gravitational clustering algorithm has a high exploration ability, it lacks a proper exploitation mechanism because of the impulsive velocity of agents that search the solution space, which leads to the huge step size of agent positions through iterations. This study proposes the following solutions to impose a balance between exploitation and exploration: (i) the dependence of the agent on velocity history is removed to avoid high velocity caused by accumulating previous velocities, and (ii) an initialization step of centroid positions is added using the variance and median initialization method with a predefined number of clusters. The initialization step eliminates the effects of random initialization and subrogates the exploration process. Experiments are conducted using 13 benchmark datasets from the UCI machine learning repository. In addition, the proposed algorithm is tested on two case studies using the electrical hotspots and cervical cell datasets. The performance of the proposed clustering algorithm is compared qualitatively and quantitatively with several state-of-the-art clustering algorithms. The obtained results indicate that the proposed clustering algorithm outperforms conventional techniques. Furthermore, the clusters obtained using the proposed algorithm are more homogeneous than those obtained using conventional techniques. The proposed algorithm quantitatively achieves better results than the other techniques in 9 out of 15 datasets in terms of accuracy, F-score, and purity.
Density-based particle swarm optimization algorithm for data clustering
2018, Expert Systems with Applications
Particle swarm optimization (PSO) algorithm is widely used in cluster analysis. However, it is a stochastic technique that is vulnerable to premature convergence to sub-optimal clustering solutions. PSO-based clustering algorithms also require tuning of the learning coefficient values to find better solutions. The latter drawbacks can be evaded by setting a proper balance between the exploitation and exploration behaviors of particles while searching the feature space. Moreover, particles must take into account the magnitude of movement in each dimension and search for the optimal solution in the most populated regions in the feature space. This study presents a novel approach for data clustering based on particle swarms. In this proposal, the balance between exploitation and exploration processes is considered using a combination of (i) kernel density estimation technique associated with new bandwidth estimation method to address the premature convergence and (ii) estimated multidimensional gravitational learning coefficients. The proposed algorithm is compared with other state-of-the-art algorithms using 11 benchmark datasets from the UCI Machine Learning Repository in terms of classification accuracy, repeatability represented by the standard deviation of the classification accuracy over different runs, and cluster compactness represented by the average Dunn index values over different runs. The results of Friedman Aligned-Ranks test with Holm's test over the average classification accuracy and Dunn index values indicate that the proposed algorithm achieves better accuracy and compactness when compared with other algorithms. The significance of the proposed algorithm is represented in addressing the limitations of the PSO-based clustering algorithms to push forward clustering as an important technique in the field of expert systems and machine learning. Such application, in turn, enhances the classification accuracy and cluster compactness. In this context, the proposed algorithm achieves better results compared with other state-of-the-art algorithms when applied to high-dimensional datasets (e.g., Landsat and Dermatology). This finding confirms the importance of estimating multidimensional learning coefficients that consider particle movements in all the dimensions of the feature space. The proposed algorithm can likewise be applied in repeatability matters for better decision making, as in medical diagnosis, as proved by the low standard deviation obtained using the proposed algorithm in conducted experiments.
A multi-expert based framework for automatic image annotation
2017, Pattern Recognition
Citation Excerpt :
One of the major challenges in clustering algorithms are the ability to deal with noises and outliers, imbalanced groups as well as the sensitivity to the initial position of cluster centroids. Ref. [46] proposed a nature inspired clustering algorithm which is able to overcome some problems like outliers (or noisy data), imbalanced clusters, overlapped clusters and sensitivity to the initial position of cluster centroids. In this clustering algorithm, the data points to be clustered are considered as fixed celestial objects with the unity mass to apply a gravity force to movable objects (centroids) and change their positions in the feature space.
Automatic image annotation (AIA) for a wide-range collection of image data is a difficult challenging topic and has attracted the interest of many researchers in the last decade. To achieve the goal of AIA, a multi-expert based framework is presented in this paper which is based on the combination of results obtained from feature space and concept space. Considering a real-world image dataset, a large storage is required; therefore, the idea of generating prototypes in both feature and concept spaces is used. The prototypes are generated in learning phase using a clustering technique. The input unlabeled images are assigned to the nearest prototypes in both feature and concept spaces, and primary labels are obtained from the nearest prototypes. Eventually, these labels are fused and final labels for a target image are chosen. Since all feature types do not describe a concept label equally, some prototypes are more effective to represent a concept and bridge the semantic gap, so a metaheuristic algorithm is employed to search for the best subset of feature types and best criterion of fusion. To evaluate the performance of the proposed framework, an example of its implementation is presented. A comparative experimental study with several state-of-the-art methods is reported on two standard databases of about 20k images. The obtained results confirm the effectiveness of the proposed framework in the field of automatic image annotation.
A new selection strategy for selective cluster ensemble based on Diversity and Independency
2016, Engineering Applications of Artificial Intelligence
Citation Excerpt :
The major aim of data clustering is to find groups of patterns (clusters) in such a way that patterns in one cluster can be more similar to each other than to patterns of other clusters (Akbari et al., 2015). A clustering algorithm decides each input data belongs to which cluster (Bahrololoum et al., 2015). Thus, Clustering can be considered as a powerful tool to reveal and visualize structure of data (Izakian et al., 2015).
This research introduces a new strategy in cluster ensemble selection by using Independency and Diversity metrics. In recent years, Diversity and Quality, which are two metrics in evaluation procedure, have been used for selecting basic clustering results in the cluster ensemble selection. Although quality can improve the final results in cluster ensemble, it cannot control the procedures of generating basic results, which causes a gap in prediction of the generated basic results’ accuracy. Instead of quality, this paper introduces Independency as a supplementary method to be used in conjunction with Diversity. Therefore, this paper uses a heuristic metric, which is based on the procedure of converting code to graph in Software Testing, in order to calculate the Independency of two basic clustering algorithms. Moreover, a new modeling language, which we called as “Clustering Algorithms Independency Language” (CAIL), is introduced in order to generate graphs which depict Independency of algorithms. Also, Uniformity, which is a new similarity metric, has been introduced for evaluating the diversity of basic results. As a credential, our experimental results on varied different standard data sets show that the proposed framework improves the accuracy of final results dramatically in comparison with other cluster ensemble methods.

View all citing articles on Scopus

View full text

A data clustering approach based on universal gravity rule

Abstract

Introduction

Section snippets

Clustering

Basic idea

Datasets

Conclusions

Acknowledgments

Appl. Soft Comput. J.

Comput. Geosci.

Pattern Recognit.

Pattern Recognit.

Eng. Appl. Artif. Intell.

Inf. Sci.

Inf. Sci.

Swarm Evol. Comput.

Neurocomputing

Neurocomputing

Pattern Recognit. Lett.

Pattern Recognit.

Pattern Recogn.

Swarm Evol. Comput.

Appl. Soft Comput. J.

Eng. Appl. Artif. Intell.

Inf. Sci.

Neurocomputing

Inf. Sci.

Inf. Sci.

Inf. Sci.

Pattern Recognit.

Eng. Appl. Artif. Intell.

Pattern Recognit.

Pattern Recognit.

Swarm Evol. Comput.