Introduction
Related work
Single-machine clustering techniques
CLARANS
BIRCH
CURE
BKSK
Locality-preserving projection
Global projection
Multi-machine clustering techniques
Parallel clustering methods
MapReduce clustering algorithms
Preliminary
Artificial Bee Colony Algorithm
Fuzzy C-means algorithm
Apache HBase
Proposed method
Map phase
Reduce phase
Algorithm complexities
Big Data | | | | | |
---|---|---|---|---|---|
Bytes | \(10^6\) | \(10^8\) | \(10^{10}\) | \(10^{12}\) | \(>10^{12}\) |
Size | Medium | Large | Huge | Monster | Very Large |
Results
Operating system | Processor | Main memory | Hard disk | Bandwidth |
---|---|---|---|---|
Ubuntu 14.10 64-bit | Intel Core i7 4.3 GHz | 16 GB | 1 TB | 100 Mbit/s |
Datasets
Comparisons
Algorithm | class1 | class2 | class3 | Overall |
---|---|---|---|---|
PCM | 0.53 | 0.51 | 0.65 | 0.59 |
wPCM | 0.60 | 0.64 | 0.70 | 0.65 |
HOPCM-15 | 0.84 | 0.75 | 0.89 | 0.85 |
HOPCM | 0.84 | 0.79 | 0.85 | 0.84 |
Proposed method | 0.85 | 0.93 | 0.91 | 0.89 |
- Read Phase: The data are read from HDFS. This phase runs once, at the beginning of the algorithm.
- Map Phase: The Map function is applied to the data. This phase repeats \(K\) times, following the FCM procedure.
- Reduce Phase: The Reduce function is applied to the output of the Map Phase. This phase also repeats \(K\) times.
- Export Phase: The output of the Reduce Phase is written to the HBase database. This phase also repeats \(K\) times.
- Import Phase: The results of the previous iteration are read from the HBase database. This phase repeats \(K-1\) times.
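The schedule of the five phases can be illustrated with a minimal single-process sketch. It is not the proposed implementation: a Python list stands in for HDFS, a dictionary stands in for the HBase table, the data are one-dimensional, the fuzzifier is fixed at m = 2, and the Artificial Bee Colony step is omitted. The names `fcm_map`, `fcm_reduce`, and `run_fcm` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def fcm_map(point, centers, m=2.0):
    """Map step (sketch): emit (cluster, (u^m * x, u^m)) for one 1-D point."""
    dists = [abs(point - c) for c in centers]
    if 0.0 in dists:
        # A point that coincides with a center belongs fully to that center.
        hit = dists.index(0.0)
        memberships = [1.0 if j == hit else 0.0 for j in range(len(centers))]
    else:
        exponent = 2.0 / (m - 1.0)
        # Standard FCM membership: u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1)).
        memberships = [
            1.0 / sum((d_j / d_k) ** exponent for d_k in dists) for d_j in dists
        ]
    return [(j, (u ** m * point, u ** m)) for j, u in enumerate(memberships)]


def fcm_reduce(pairs, old_centers):
    """Reduce step (sketch): sum the partial terms and form the new centers."""
    numerators = [0.0] * len(old_centers)
    denominators = [0.0] * len(old_centers)
    for j, (weighted_x, weight) in pairs:
        numerators[j] += weighted_x
        denominators[j] += weight
    # Keep the old center if a cluster received no weight at all.
    return [
        numerators[j] / denominators[j] if denominators[j] > 0 else old_centers[j]
        for j in range(len(old_centers))
    ]


def run_fcm(hdfs_records, initial_centers, k_iterations):
    """Run K iterations and count how often each phase executes."""
    counts = defaultdict(int)
    store = {}  # assumption: a dict standing in for the HBase table

    data = list(hdfs_records)  # Read Phase: once
    counts["read"] += 1

    centers = list(initial_centers)
    for iteration in range(k_iterations):
        if iteration > 0:
            centers = store["centers"]  # Import Phase: K - 1 times
            counts["import"] += 1
        pairs = [pair for x in data for pair in fcm_map(x, centers)]  # Map Phase
        counts["map"] += 1
        centers = fcm_reduce(pairs, centers)  # Reduce Phase
        counts["reduce"] += 1
        store["centers"] = centers  # Export Phase
        counts["export"] += 1
    return centers, dict(counts)
```

Calling `run_fcm(records, initial_centers, K)` returns the final centers together with a phase counter, which makes the "once, \(K\) times, \(K-1\) times" pattern from the list directly observable.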
Algorithm | class1 | class2 | class3 | class4 | class5 | class6 | class7 | Overall |
---|---|---|---|---|---|---|---|---|
PCM | 0.63 | 0.61 | 0.72 | 0.73 | 0.70 | 0.62 | 0.71 | 0.71 |
wPCM | 0.67 | 0.70 | 0.75 | 0.79 | 0.77 | 0.77 | 0.77 | 0.76 |
HOPCM-15 | 0.89 | 0.85 | 0.92 | 0.89 | 0.87 | 0.91 | 0.81 | 0.89 |
HOPCM | 0.89 | 0.87 | 0.93 | 0.90 | 0.89 | 0.92 | 0.87 | 0.90 |
Proposed method | 0.94 | 0.92 | 0.94 | 0.93 | 0.91 | 0.94 | 0.93 | 0.92 |
Algorithm | class1 | class2 | class3 | class4 | class5 | class6 | class7 | Overall |
---|---|---|---|---|---|---|---|---|
PCM | 0.53 | 0.49 | 0.62 | 0.65 | 0.61 | 0.54 | 0.63 | 0.61 |
wPCM | 0.59 | 0.61 | 0.68 | 0.71 | 0.65 | 0.69 | 0.67 | 0.67 |
HOPCM-15 | 0.81 | 0.74 | 0.84 | 0.81 | 0.78 | 0.82 | 0.72 | 0.80 |
HOPCM | 0.80 | 0.77 | 0.82 | 0.84 | 0.79 | 0.83 | 0.75 | 0.80 |
Proposed method | 0.84 | 0.83 | 0.81 | 0.80 | 0.85 | 0.84 | 0.79 | 0.83 |
Size of data (GB) | Read Phase | Map Phase | Reduce Phase | Export Phase | Import Phase | Overall |
---|---|---|---|---|---|---|
200 | 0.6 | 13.49 | 8.58 | 0.2 | 0.1 | 22.97 |
400 | 1.4 | 21.17 | 13.48 | 0.69 | 0.15 | 36.89 |
600 | 2.7 | 30.05 | 18.49 | 1.1 | 0.23 | 52.57 |
800 | 4.6 | 36.74 | 25.27 | 1.8 | 0.34 | 68.75 |
1000 | 6.9 | 43.12 | 31.24 | 2.9 | 0.49 | 84.65 |
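The Overall column in the table above appears to be the sum of the five phase columns. The short check below, with the values copied from the table, confirms this row by row and also computes the share of the total taken by the Map Phase. The names `PHASE_TIMES`, `phase_sum`, and `map_share` are hypothetical helpers introduced for this illustration.

```python
# Rows copied from the table: size (GB) -> (read, map, reduce, export, import, overall)
PHASE_TIMES = {
    200: (0.6, 13.49, 8.58, 0.2, 0.1, 22.97),
    400: (1.4, 21.17, 13.48, 0.69, 0.15, 36.89),
    600: (2.7, 30.05, 18.49, 1.1, 0.23, 52.57),
    800: (4.6, 36.74, 25.27, 1.8, 0.34, 68.75),
    1000: (6.9, 43.12, 31.24, 2.9, 0.49, 84.65),
}


def phase_sum(size_gb):
    """Sum of the five phase columns for one row."""
    return sum(PHASE_TIMES[size_gb][:5])


def map_share(size_gb):
    """Fraction of the Overall value spent in the Map Phase."""
    row = PHASE_TIMES[size_gb]
    return row[1] / row[5]


if __name__ == "__main__":
    for size in PHASE_TIMES:
        overall = PHASE_TIMES[size][5]
        print(size, round(phase_sum(size), 2), overall, round(map_share(size), 3))
```

On these figures the Map Phase accounts for more than half of the Overall value at every data size, with the Reduce Phase making up most of the remainder.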