Published in: Journal of Visualization 6/2020

30-07-2020 | Regular Paper

SADIRE: a context-preserving sampling technique for dimensionality reduction visualizations

Authors: Wilson Estécio Marcilio-Jr, Danilo Medeiros Eler

Abstract

Sampling techniques are widely used to reduce the complexity of datasets and improve their interpretability. Given the enormous volume of available data, these techniques try to select representative data points that reflect the underlying data structure. In this work, we propose a novel sampling technique that preserves the structures imposed by dimensionality reduction techniques when the results are visualized as scatter plots. In the experiments, we demonstrate that our technique reflects class boundaries and layout structures while also decreasing the redundancy of datasets visualized as scatter plots. We also present a user experiment on the perception of sampling in scatter plot visualizations.

Metadata

Title: SADIRE: a context-preserving sampling technique for dimensionality reduction visualizations
Authors: Wilson Estécio Marcilio-Jr, Danilo Medeiros Eler
Publication date: 30-07-2020
Publisher: Springer Berlin Heidelberg
Published in: Journal of Visualization, Issue 6/2020
Print ISSN: 1343-8875
Electronic ISSN: 1875-8975
DOI: https://doi.org/10.1007/s12650-020-00685-4
