2023 | OriginalPaper | Book chapter

CHLPCA: Correntropy-Based Hypergraph Regularized Sparse PCA for Single-Cell Type Identification

Written by: Tai-Ge Wang, Xiang-Zhen Kong, Sheng-Jun Li, Juan Wang

Published in: Bioinformatics Research and Applications

Publisher: Springer Nature Singapore

Abstract

Over the past decade, high-throughput sequencing technologies have driven a dramatic increase in single-cell RNA sequencing (scRNA-seq) data. The study of scRNA-seq data has broadened and deepened researchers' understanding of cellular heterogeneity. A prerequisite for studying heterogeneous cell populations is accurate cell type identification. However, scRNA-seq data are noisy and high-dimensional, which makes it difficult for existing methods to further improve identification accuracy. Principal component analysis (PCA) is an important data analysis technique that is widely used to identify cell subpopulations. Building on PCA, we propose correntropy-based hypergraph regularized sparse PCA (CHLPCA) for accurate cell type identification. CHLPCA uses correntropy to reduce the effect of noise and constructs a hypergraph to capture higher-order relationships between samples, which compensates for PCA's limited ability to capture local structure. Furthermore, we introduce the L2,1/2-norm into the model to enhance the interpretability of the principal components (PCs), which further improves model performance. In clustering experiments, CHLPCA outperforms the best comparative method by 5.13% and 8.00% on the accuracy (ACC) and normalized mutual information (NMI) metrics, respectively. Clustering visualization experiments also confirm that CHLPCA performs the cell type identification task better.
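
As a rough illustration of two ingredients named in the abstract, the sketch below computes a sample correntropy with a Gaussian kernel and an L2,p-norm of a matrix. These are the standard textbook definitions, not the CHLPCA objective itself, which this page does not reproduce. The function names, the kernel width sigma, the omission of the kernel's normalizing constant, and the example values are all assumptions made for illustration.

```python
import numpy as np


def correntropy(x, y, sigma=1.0):
    """Sample correntropy of x and y with an unnormalized Gaussian kernel.

    Assumed form: mean over i of exp(-(x_i - y_i)^2 / (2 * sigma^2)).
    """
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.mean(np.exp(-diff ** 2 / (2.0 * sigma ** 2))))


def l2p_norm(A, p=0.5):
    """L2,p-norm of a matrix: (sum_i ||row_i||_2 ** p) ** (1 / p)."""
    row_norms = np.linalg.norm(np.asarray(A, dtype=float), axis=1)
    return float(np.sum(row_norms ** p) ** (1.0 / p))


# A single gross outlier lowers correntropy by at most 1/n,
# while it dominates the mean squared error.
clean = np.zeros(4)
noisy = np.array([0.0, 0.0, 0.0, 100.0])
print(correntropy(clean, noisy))      # stays within [0, 1]
print(np.mean((clean - noisy) ** 2))  # grows with the outlier's size

# Rows that are entirely zero contribute nothing to the L2,p-norm.
A = np.array([[3.0, 4.0], [0.0, 0.0]])
print(l2p_norm(A, p=0.5))
```

With p below 1, the penalty favors matrices in which many rows are exactly zero, which is the sense in which such norms are said to induce row sparsity.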

Metadata
Title
CHLPCA: Correntropy-Based Hypergraph Regularized Sparse PCA for Single-Cell Type Identification
Written by
Tai-Ge Wang
Xiang-Zhen Kong
Sheng-Jun Li
Juan Wang
Copyright year
2023
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-99-7074-2_44
