ABSTRACT
We present a new dimensionality reduction setting for a large family of real-world problems. Unlike traditional methods, the new setting aims to explicitly represent and learn an intrinsic structure from data in a high-dimensional space, which can greatly facilitate data visualization and scientific discovery in downstream analysis. We propose a new dimensionality reduction framework that learns a mapping function to project data points from the original high-dimensional space to latent points in a low-dimensional space; a graph is then constructed directly over these latent points, naturally capturing the local geometric information of the projected data. As a showcase, we develop a new method to obtain a discriminative and compact feature representation for clustering problems. In contrast to assumptions used in traditional clustering methods, we assume that cluster centers should be close to each other if they are connected in the learned graph, and distant otherwise. Extensive experiments demonstrate that the proposed method obtains discriminative feature representations that yield superior clustering performance, and correctly recovers the intrinsic structures of various real-world datasets, including curves, hierarchies, and a cancer progression path.
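To make the projection-then-graph pipeline concrete, below is a minimal sketch in Python. It assumes a plain PCA projection as the mapping function and a minimum spanning tree over the latent points as the learned graph; both are illustrative stand-ins, since the actual method optimizes the projection and the graph structure jointly rather than in two fixed steps.

```python
# A minimal sketch of the two-step idea described in the abstract:
# (1) project high-dimensional data to a low-dimensional latent space,
# (2) construct a graph directly over the latent points.
# PCA and a minimum spanning tree are assumptions made for illustration,
# not the paper's joint optimization procedure.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def project_and_build_graph(X, dim=2):
    """Project X (n_samples x n_features) to `dim` dimensions and
    connect the latent points with a minimum spanning tree."""
    # Center the data and project onto the top principal components.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:dim].T  # latent points in the low-dimensional space

    # Build a graph over the latent points: pairwise Euclidean
    # distances, keeping only the minimum spanning tree as the
    # learned structure that captures local geometry.
    D = squareform(pdist(Z))
    T = minimum_spanning_tree(D)  # sparse matrix; nonzeros are tree edges
    return Z, T

# Toy usage: a noisy curve embedded in 10-D, whose intrinsic 1-D
# structure the spanning tree over the latent points should recover.
t = np.linspace(0, 1, 200)
X = np.outer(t, np.random.randn(10)) + 0.01 * np.random.randn(200, 10)
Z, T = project_and_build_graph(X)
print(T.nnz, "edges in the learned graph")
```

Even this crude two-step variant hints at why the graph matters: for curve-like data, the spanning-tree edges trace the latent curve, so downstream analysis can read the intrinsic structure directly off the graph rather than from raw coordinates.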