2023 | OriginalPaper | Book chapter

Revised Conditional t-SNE: Looking Beyond the Nearest Neighbors

Authors: Edith Heiter, Bo Kang, Ruth Seurinck, Jefrey Lijffijt

Published in: Advances in Intelligent Data Analysis XXI

Publisher: Springer Nature Switzerland

Abstract

Conditional t-SNE (ct-SNE) is a recent extension to t-SNE that allows removal of known cluster information from the embedding, to obtain a visualization revealing structure beyond label information. This is useful, for example, when one wants to factor out unwanted differences between a set of classes. We show that ct-SNE fails in many realistic settings, namely if the data is well clustered over the labels in the original high-dimensional space. We introduce a revised method that conditions the high-dimensional similarities instead of the low-dimensional similarities and stores within- and across-label nearest neighbors separately. This also enables the use of recently proposed speedups for t-SNE, improving the scalability. From experiments on synthetic data, we find that our proposed method resolves the considered problems and improves the embedding quality. On real data containing batch effects, the expected improvement does not always materialize. We argue that revised ct-SNE is preferable overall, given its improved scalability. The results also highlight new open questions, such as how to handle distance variations between clusters.
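
As a rough illustration of what "storing within- and across-label nearest neighbors separately" could look like, the following brute-force sketch returns, for every point, its k nearest same-label neighbors and its k nearest different-label neighbors. The function name `split_neighbors`, the parameter `k`, and the choice of Euclidean distance are assumptions made for this example; the chapter's actual implementation is not shown here and may differ.

```python
import numpy as np


def split_neighbors(X, labels, k):
    """Return (within, across): for each point, the indices of its k nearest
    same-label neighbors and its k nearest different-label neighbors.

    Hypothetical helper for illustration only; X must be a 2-D array-like.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)

    # Brute-force pairwise squared Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)

    same = labels[:, None] == labels[None, :]
    within_mask = same & ~np.eye(len(X), dtype=bool)  # same label, not self
    across_mask = ~same                                # different label

    if within_mask.sum(axis=1).min() < k or across_mask.sum(axis=1).min() < k:
        raise ValueError("every point needs at least k candidates of each kind")

    # Ineligible pairs get +inf so they sort after all eligible ones.
    within = np.argsort(np.where(within_mask, d2, np.inf), axis=1)[:, :k]
    across = np.argsort(np.where(across_mask, d2, np.inf), axis=1)[:, :k]
    return within, across
```

Keeping the two neighbor lists apart means that across-label neighbors remain available even when the data is well clustered over the labels and the overall nearest neighbors are almost all same-labeled. The quadratic-memory distance matrix is only suitable for small inputs.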

Footnotes
1
Assuming the distances between differently-labeled samples are larger than between same-labeled samples. A perplexity of \(\frac{n}{2}\) would only assign same-labeled points a high similarity.
 
2
We use \(\alpha\) and \(\beta\) instead of \(\alpha'\) and \(\beta'\) as in the original paper [3], and thus need to normalize with the number of distinct label assignments.
 
References
2.
de Bodt, C., Mulders, D., Sánchez, D.L., Verleysen, M., Lee, J.A.: Class-aware t-SNE: cat-SNE. In: ESANN (2019)
3.
Kang, B., García García, D., Lijffijt, J., Santos-Rodríguez, R., De Bie, T.: Conditional t-SNE: more informative t-SNE embeddings. Mach. Learn. 110(10), 2905–2940 (2021)
4.
Lee, J.A., Peluffo-Ordóñez, D.H., Verleysen, M.: Multi-scale similarities in stochastic neighbour embedding: reducing dimensionality while preserving both local and global structure. Neurocomputing 169, 246–261 (2015)
5.
Lee, J.A., Verleysen, M.: Quality assessment of dimensionality reduction: rank-based criteria. Neurocomputing 72(7–9), 1431–1443 (2009)
6.
Linderman, G.C., Rachh, M., Hoskins, J.G., Steinerberger, S., Kluger, Y.: Fast interpolation-based t-SNE for improved visualization of single-cell RNA-seq data. Nat. Methods 16(3), 243–245 (2019)
7.
van der Maaten, L.: Accelerating t-SNE using tree-based algorithms. JMLR 15(1), 3221–3245 (2014)
8.
Poličar, P.G., Stražar, M., Zupan, B.: Embedding to reference t-SNE space addresses batch effects in single-cell classification. Mach. Learn. 112, 721–740 (2021)
9.
Satija Lab: panc8.SeuratData: Eight Pancreas Datasets Across Five Technologies (2019). R package version 3.0.2
10.
Vu, V.M., Bibal, A., Frénay, B.: HCt-SNE: hierarchical constraints with t-SNE. In: International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2021)
11.
Yianilos, P.N.: Data structures and algorithms for nearest neighbor search in general metric spaces. In: Proceedings of ACM-SIAM Symposium on Discrete algorithms, vol. 66, pp. 311–321 (1993)
Metadata
Title
Revised Conditional t-SNE: Looking Beyond the Nearest Neighbors
Authors
Edith Heiter
Bo Kang
Ruth Seurinck
Jefrey Lijffijt
Copyright year
2023
DOI
https://doi.org/10.1007/978-3-031-30047-9_14