2021 | OriginalPaper | Chapter

Data Compression and Visualization Using PCA and T-SNE

Authors: Jyoti Pareek, Joel Jacob

Published in: Advances in Information Communication Technology and Computing

Publisher: Springer Singapore

Abstract

This paper examines two commonly used dimensionality reduction techniques, PCA and T-SNE. PCA was introduced in 1933 and T-SNE in 2008, and the two are fundamentally different: PCA rests on linear algebra, while T-SNE is a probabilistic technique. The goal is to apply both algorithms to the MNIST dataset, to see how they work in practice and what conclusions can be drawn from their application. The objective is to reduce the dimension of the data while retaining most of the information. We apply both techniques and compare them by observing the results, noting the behavior of the reduced components obtained from each by visualizing them in 2-dimensional space. Upon further research and application, it became apparent that dimensionality reduction is sensitive to parameter settings, which must be tuned carefully for the reduction to be successful.

Metadata
Title: Data Compression and Visualization Using PCA and T-SNE
Authors: Jyoti Pareek, Joel Jacob
Copyright Year: 2021
Publisher: Springer Singapore
DOI: https://doi.org/10.1007/978-981-15-5421-6_34