
2023 | OriginalPaper | Chapter

Nonparametric Bayesian Deep Visualization

Authors : Haruya Ishizuka, Daichi Mochihashi

Published in: Machine Learning and Knowledge Discovery in Databases

Publisher: Springer International Publishing

Abstract

Visualization methods such as t-SNE [1] have helped in knowledge discovery from high-dimensional data; however, their performance may degrade when the intrinsic structure of observations lies in low-dimensional space, and they cannot estimate clusters, which are often useful for understanding the internal structure of a dataset. A solution is to visualize the latent coordinates and clusters estimated using a neural clustering model. However, such models require long computational time, since they have numerous weights to train and the layer width, the number of latent dimensions, and the number of clusters must all be tuned to model the latent space appropriately. Additionally, the estimated coordinates may not be suitable for visualization, since the model and the visualization method are applied independently. We utilize neural network Gaussian processes (NNGP) [2], equivalent to a neural network whose weights are marginalized out, to eliminate the need to optimize weights and layer widths. Additionally, to determine the latent dimensions and the number of clusters without tuning, we propose a latent variable model that combines NNGP with automatic relevance determination [3], to extract the necessary dimensions of the latent space, and an infinite Gaussian mixture model [4], to infer the number of clusters. We integrate this model and a visualization method into nonparametric Bayesian deep visualization (NPDV), which learns latent and visual coordinates jointly to render the latent coordinates optimal for visualization. Experimental results on image and document datasets show that NPDV achieves superior accuracy to existing methods, and that it requires less training time than neural clustering models because of its lower tuning cost. Furthermore, NPDV can reveal plausible latent clusters without labels.
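The two nonparametric ingredients the abstract names can be sketched compactly. Below is a minimal, illustrative sketch (not the authors' implementation): the NNGP kernel recursion for a fully connected ReLU network, following the arc-cosine form of Cho and Saul [28] as used by Lee et al. [2], plus the stick-breaking construction [29] that underlies the infinite Gaussian mixture [4]. The names `nngp_kernel`, `stick_breaking_weights`, and the hyperparameters `sigma_w2`, `sigma_b2`, `alpha` are assumptions chosen for this sketch.

```python
import numpy as np

def nngp_kernel(X, depth=3, sigma_w2=1.0, sigma_b2=0.1):
    """NNGP kernel of a depth-layer fully connected ReLU network (sketch).

    Each layer maps the previous covariance through the ReLU arc-cosine
    expectation; weights are marginalized, so no weights are trained.
    """
    n, d = X.shape
    # Layer-0 covariance induced by i.i.d. Gaussian weights and biases.
    K = sigma_b2 + sigma_w2 * (X @ X.T) / d
    for _ in range(depth):
        std = np.sqrt(np.diag(K))                 # per-point std. deviations
        norm = np.outer(std, std)
        cos_theta = np.clip(K / norm, -1.0, 1.0)  # guard rounding outside [-1, 1]
        theta = np.arccos(cos_theta)
        # E[relu(u) relu(v)] for (u, v) ~ N(0, K), in closed form.
        K = sigma_b2 + (sigma_w2 / (2 * np.pi)) * norm * (
            np.sin(theta) + (np.pi - theta) * np.cos(theta))
    return K

def stick_breaking_weights(alpha, T, rng):
    """Truncated stick-breaking draw of mixture weights for a DP mixture."""
    v = rng.beta(1.0, alpha, size=T)              # beta-distributed stick fractions
    remaining = np.cumprod(np.concatenate(([1.0], 1.0 - v[:-1])))
    return v * remaining                          # weights decay; sum -> 1 as T grows
```

With a truncation level `T` large relative to `alpha`, most stick-breaking weights are near zero, which is how the number of effective clusters is inferred rather than fixed in advance; the NNGP kernel likewise removes layer widths from the tuning loop, since they are taken to infinity analytically.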


Footnotes
1. Rotatable plots are provided as an html file in the supplemental material.
2. All appendices are provided in the Supplemental Materials.
3. The network architecture is the same as that of the data generation network.
4. http://qwone.com/~jason/20Newsgroups/.
5. http://korpus.uib.no/icame/manuals/BROWN/INDEX.HTML.
Literature
1.
van der Maaten, L., Hinton, G.: Visualizing data using \(t\)-SNE. J. Mach. Learn. Res. 9(1), 2579–2605 (2008)
2.
Lee, J.H., Bahri, Y., Novak, R., Schoenholz, S., Pennington, J., Sohl-Dickstein, J.: Deep neural networks as Gaussian processes. In: International Conference on Learning Representations (2018)
3.
MacKay, D.J.: Bayesian non-linear modeling for the prediction competition. ASHRAE Trans. 100(2), 1053–1062 (1994)
4.
Rasmussen, C.E.: The infinite Gaussian mixture model. In: Proceedings of the 12th International Conference on Neural Information Processing Systems, pp. 554–560
5.
Kruskal, J.B.: Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29, 1–27 (1964)
6.
Tenenbaum, J.B., de Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)
7.
Hinton, G.E., Roweis, S.: Stochastic neighbor embedding. In: Advances in Neural Information Processing Systems 15 (2002)
8.
Tang, J., Liu, J., Zhang, M., Mei, Q.: Visualizing large-scale and high-dimensional data. In: The 25th International Conference on the World Wide Web, pp. 287–297 (2016)
9.
Artemenkov, A., Panov, M.: NCVis: noise contrastive approach for scalable visualization. In: Proceedings of The Web Conference 2020, pp. 2941–2947 (2020)
10.
McInnes, L., Healy, J., Saul, N., Großberger, L.: UMAP: uniform manifold approximation and projection. J. Open Source Softw. 3(29), 861 (2018)
11.
Hadsell, R., Chopra, S., LeCun, Y.: Dimensionality reduction by learning an invariant mapping. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 1735–1742 (2006)
12.
van der Maaten, L., Weinberger, K.: Stochastic triplet embedding. In: Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing, pp. 1–6 (2012)
13.
Wilber, M.J., Kwak, I.S., Kriegman, D.J., Belongie, S.: Learning concept embeddings with combined human-machine expertise. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 981–989 (2015)
14.
15.
Wang, Y., Huang, H.M., Rudin, C., Shaposhnik, Y.: Understanding how dimension reduction tools work: an empirical approach to deciphering t-SNE, UMAP, TriMap, and PaCMAP for data visualization. J. Mach. Learn. Res. 22(201), 1–73 (2021)
16.
Wallach, I., Lilien, R.: The protein-small-molecule database, a non-redundant structural resource for the analysis of protein-ligand binding. Bioinformatics 25(5), 615–620 (2010)
17.
Hamel, P., Eck, D.: Learning features from music audio with deep belief networks. In: Proceedings of the International Society for Music Information Retrieval Conference, pp. 339–344 (2010)
18.
Geng, X., Zhan, D.-C., Zhou, Z.-H.: Supervised nonlinear dimensionality reduction for visualization and classification. IEEE Trans. Syst. Man Cybern. 35(6), 1098–1107 (2005)
19.
Venna, J., Peltonen, J., Nybo, K., Aidos, H., Kaski, S.: Information retrieval perspective to nonlinear dimensionality reduction for data visualization. J. Mach. Learn. Res. 11(13), 451–490 (2010)
20.
Zheng, J., Zhang, H.H., Cattani, C., Wang, W.: Dimensionality reduction by supervised neighbor embedding using Laplacian search. Biomed. Sig. Process. Model. Complexity Living Syst. 2014, 594379 (2014)
21.
Xie, J., Girshick, R., Farhadi, A.: Unsupervised deep embedding for clustering analysis. In: Proceedings of the 33rd International Conference on International Conference on Machine Learning, vol. 48, pp. 478–487 (2016)
22.
Fard, M.N., Thonet, T., Gaussier, E.: Deep \(k\)-means: jointly clustering with \(k\)-means and learning representations. arXiv:1806.10069 (2018)
23.
Yang, X., Yan, Y., Huang, K., Zhang, R.: VSB-DVM: an end-to-end Bayesian nonparametric generalization of deep variational mixture model. In: 2019 IEEE International Conference on Data Mining (2019)
24.
Iwata, T., Duvenaud, D., Ghahramani, Z.: Warped mixtures for nonparametric cluster shapes. In: Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence, pp. 311–320 (2013)
25.
Lawrence, N.D.: Probabilistic non-linear principal component analysis with Gaussian process latent variable models. J. Mach. Learn. Res. 6, 1783–1816 (2005)
26.
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. The MIT Press (2006)
27.
Ferguson, T.S.: A Bayesian analysis of some nonparametric problems. Ann. Statist. 1(2), 209–230 (1973)
28.
Cho, Y., Saul, L.K.: Kernel methods for deep learning. Adv. Neural Inf. Process. Syst. 22, 342–350 (2009)
29.
Sethuraman, J.: A constructive definition of Dirichlet priors. Statist. Sinica 4(2), 639–650 (1994)
30.
Zhu, J., Chen, N., Xing, E.P.: Bayesian inference with posterior regularization and applications to infinite latent SVMs. J. Mach. Learn. Res. 15(1), 1799–1847 (2014)
31.
Zellner, A.: Optimal information processing and Bayes' theorem. Am. Stat. 42(4), 278–280 (1988)
32.
Bishop, C.M.: Pattern Recognition and Machine Learning, 1st edn. Springer (2006)
33.
Blei, D.M., Jordan, M.I.: Variational inference for Dirichlet process mixtures. Bayesian Anal. 1(1), 121–144 (2006)
34.
Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: Proceedings of the 2nd International Conference on Learning Representations (2013)
35.
Titsias, M.K., Lawrence, N.D.: Bayesian Gaussian process latent variable model. In: Proceedings of the 13th International Workshop on Artificial Intelligence and Statistics, vol. 9, pp. 844–851 (2010)
36.
Akiba, T., Sano, S., Yanase, T., Ohta, T., Koyama, M.: Optuna: a next-generation hyperparameter optimization framework. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019, pp. 2623–2631 (2019)
37.
Arora, S., Liang, Y., Ma, T.: A simple but tough-to-beat baseline for sentence embeddings. In: International Conference on Learning Representations (2017)
Metadata
Title
Nonparametric Bayesian Deep Visualization
Authors
Haruya Ishizuka
Daichi Mochihashi
Copyright Year
2023
DOI
https://doi.org/10.1007/978-3-031-26387-3_8