Arabian Journal for Science and Engineering

24-06-2021 | Research Article-Computer Engineering and Computer Science

Large-Scale Data Clustering Using Manifold-Regularized Ensemble of Posterior in GAN

Authors: Haleh Homayouni, Eghbal Mansoori

Published in: Arabian Journal for Science and Engineering | Issue 2/2022

Abstract

Data clustering is an unsupervised learning method and a pivotal technique in statistical data analysis. Grouping data samples is a challenging machine learning task, especially in large databases. Deep neural networks scale to large data sets and can learn data structure by modeling nonlinearity. One well-known latent generative model in this area is the generative adversarial network (GAN). Clustering with a latent generative model requires the posterior of the intended model, which is then approximated variationally, typically by maximizing mutual information or minimizing the KL divergence. In this paper, to reach a more generalized inference in clustering, we employ an ensemble approach to approximate the posterior. To implement this ensemble with deep networks, we propose a convex lower bound for the variational approximation of the posteriors. To correct the generator's behavior and reach accurate statistical inference, we inject the geometrical structure of the data into the objective function as a manifold regularization term. The efficacy of the proposed method is evaluated on four benchmark data sets. The experimental results confirm the superiority of our model over standard clustering algorithms and some recently developed deep methods.
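
As a rough illustration of two ingredients the abstract names, the KL divergence and an ensemble of approximate posteriors, the sketch below uses small discrete cluster posteriors in plain Python. It is not the paper's objective or its lower bound. The function names, the uniform averaging rule, and the toy probabilities are all assumptions made for this example. It only demonstrates a standard fact: KL divergence is convex in its second argument, so the averaged posterior is never further from a target than the average ensemble member.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as lists of probabilities.

    Assumes q[i] > 0 wherever p[i] > 0 (hypothetical helper, not from the paper).
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def ensemble_posterior(posteriors):
    """Uniformly average several approximate posteriors over the same clusters."""
    k = len(posteriors)
    return [sum(column) / k for column in zip(*posteriors)]

# Toy data (assumed): a target posterior over 3 clusters and 3 approximations.
target = [0.8, 0.1, 0.1]
members = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.6, 0.1, 0.3],
]

ensemble = ensemble_posterior(members)
kl_ensemble = kl_divergence(target, ensemble)
kl_mean_member = sum(kl_divergence(target, q) for q in members) / len(members)

# Convexity of KL in q: the ensemble is at least as close as the average member.
print(f"KL(target || ensemble)       = {kl_ensemble:.4f}")
print(f"mean KL(target || member_k)  = {kl_mean_member:.4f}")
```

Uniform averaging is the simplest possible ensemble rule; a learned or weighted combination would satisfy the same convexity inequality as long as the weights are non-negative and sum to one.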

Title: Large-Scale Data Clustering Using Manifold-Regularized Ensemble of Posterior in GAN
Authors: Haleh Homayouni, Eghbal Mansoori
Publication date: 24-06-2021
Publisher: Springer Berlin Heidelberg
Published in: Arabian Journal for Science and Engineering / Issue 2/2022
Print ISSN: 2193-567X
Electronic ISSN: 2191-4281
DOI: https://doi.org/10.1007/s13369-021-05809-y
