Parallel implementing improved k-means applied for image retrieval and anomaly detection

Multimedia Tools and Applications

Abstract

Anomaly detection based on data mining is one of the key technologies for intelligent detection. K-means is a classic clustering algorithm that is efficient for anomaly detection. However, traditional K-means is sensitive to the selection of the initial cluster centers: different initial values can produce different clustering results. We combine an improved DD algorithm with information entropy to improve the performance of K-means. The improved K-means optimizes the selection of the initial cluster centers, automatically determines the number of clusters, and outputs stable clustering results. With PCA preprocessing, the adaptability of the improved K-means increases markedly. To reduce the processing time for massive data, we adopt cloud computing and modify the algorithm for parallel processing. We analyze the performance of the improved K-means on two data sets: KDD Cup99 and a public mobile malware data set (MalGenome). The experimental results show that the improved K-means produces accurate results and can be applied to anomaly detection in mobile networks. The improved K-means can also be applied to image retrieval by calculating the similarity between images.
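The sensitivity of traditional K-means to its initial centers can be illustrated with a minimal sketch. The code below is plain Lloyd-style K-means, not the improved algorithm described in the abstract; the four-point data set, the function names (`sq_dist`, `assign`, `kmeans`) and the two sets of starting centers are hypothetical and chosen only so that the effect is easy to see.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def kmeans(points, centers, max_iter=100):
    """Plain K-means from the given initial centers (illustrative only)."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New center is the coordinate-wise mean of its members.
                new_centers.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members)))
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, sse


# Hypothetical data: the corners of a 4 x 1 rectangle.
POINTS = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Initial centers on the left and right: splits into left/right pairs.
good_centers, good_labels, good_sse = kmeans(POINTS, [(0, 0), (4, 0)])

# Initial centers on the bottom and top edges: stuck in a worse partition.
bad_centers, bad_labels, bad_sse = kmeans(POINTS, [(2, 0), (2, 1)])

print(good_labels, good_sse)  # [0, 0, 1, 1] 1.0
print(bad_labels, bad_sse)    # [0, 1, 0, 1] 16.0
```

Both runs converge, but to different partitions with a sixteen-fold difference in the within-cluster sum of squared errors. This is the kind of instability that a principled choice of initial centers is meant to avoid.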

Acknowledgments

This work was funded by the National Natural Science Foundation of China (No. 61373134). It was also supported by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), the Jiangsu Key Laboratory of Meteorological Observation and Information Processing (No. KDXS1105), and the Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (CICAEET).

Author information

Corresponding author

Correspondence to Chunyong Yin.

Cite this article

Yin, C., Zhang, S. Parallel implementing improved k-means applied for image retrieval and anomaly detection. Multimed Tools Appl 76, 16911–16927 (2017). https://doi.org/10.1007/s11042-016-3638-1

  • DOI: https://doi.org/10.1007/s11042-016-3638-1
