Skip to main content
Erschienen in: Pattern Analysis and Applications 1/2021

28.06.2020 | Theoretical advances

High-dimensional cluster boundary detection using directed Markov tree

verfasst von: Xiaofeng Cao

Erschienen in: Pattern Analysis and Applications | Ausgabe 1/2021

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Hypersurface of an inscribed geometry decides the distribution of an embedded cluster, in which its boundary points approximately fit this surface. To detect these points, capturing the implicit features of a local space is used to distinguish whether the data is an inner or outer feature. However, this approximation on the boundary is coarse-grained and may be ineffective in a high-dimensional space due to unbalanced feature distribution. In this paper, we introduce a directed Markov tree in high-dimensional cluster boundary detection. The key idea is to project each one-dimensional subspace of a local high-dimensional feature space into a layer of a directed Markov tree, covering absorptive and reflective walls. We then derive a fine-grained detection coefficient against on the Markov process of knight’s tour over each layer of the tree. In this fine-grained view, the local feature space centered with a cluster boundary point has lower estimate on the tour cost than the internal data of the cluster. Based on this observation, we propose a knight algorithm to detect the boundary points of a high-dimensional feature space. Experiments on gene expression and video retrieval datasets demonstrate that the proposed algorithm can achieve a higher F-measure score than the other boundary detection baselines.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Aggarwal CC (2015) Outlier analysis. In: Data mining, Springer, pp 237–263 Aggarwal CC (2015) Outlier analysis. In: Data mining, Springer, pp 237–263
2.
Zurück zum Zitat Alzate C, Suykens JA (2008) Multiway spectral clustering with out-of-sample extensions through weighted kernel PCA. IEEE Trans Pattern Anal Mach Intell 32(2):335–347CrossRef Alzate C, Suykens JA (2008) Multiway spectral clustering with out-of-sample extensions through weighted kernel PCA. IEEE Trans Pattern Anal Mach Intell 32(2):335–347CrossRef
3.
Zurück zum Zitat Beil F, Ester M, Xu X (2002) Frequent term-based text clustering. In: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pp 436–442 Beil F, Ester M, Xu X (2002) Frequent term-based text clustering. In: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pp 436–442
4.
Zurück zum Zitat Brunet JP, Tamayo P, Golub TR, Mesirov JP (2004) Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci 101(12):4164–4169CrossRef Brunet JP, Tamayo P, Golub TR, Mesirov JP (2004) Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci 101(12):4164–4169CrossRef
5.
Zurück zum Zitat Cao X, Qiu B, Xu G (2018) Bordershift: toward optimal meanshift vector for cluster boundary detection in high-dimensional data. Pattern Anal Appl 6:1–13 Cao X, Qiu B, Xu G (2018) Bordershift: toward optimal meanshift vector for cluster boundary detection in high-dimensional data. Pattern Anal Appl 6:1–13
6.
Zurück zum Zitat Chatzis SP, Varvarigou TA (2008) A fuzzy clustering approach toward hidden Markov random field models for enhanced spatially constrained image segmentation. IEEE Trans Fuzzy Syst 16(5):1351–1361CrossRef Chatzis SP, Varvarigou TA (2008) A fuzzy clustering approach toward hidden Markov random field models for enhanced spatially constrained image segmentation. IEEE Trans Fuzzy Syst 16(5):1351–1361CrossRef
7.
Zurück zum Zitat Cheeseman PC, Stutz JC et al (1996) Bayesian classification (autoclass): theory and results. Adv Knowl Discov Data Min 180:153–180MATH Cheeseman PC, Stutz JC et al (1996) Bayesian classification (autoclass): theory and results. Adv Knowl Discov Data Min 180:153–180MATH
8.
Zurück zum Zitat Ding C, He X (2004) K-means clustering via principal component analysis. In: Proceedings of the twenty-first international conference on machine learning, p 29 Ding C, He X (2004) K-means clustering via principal component analysis. In: Proceedings of the twenty-first international conference on machine learning, p 29
9.
Zurück zum Zitat Ding C, He X, Zha H, Simon HD (2002) Adaptive dimension reduction for clustering high dimensional data. In: 2002 IEEE international conference on data mining, 2002. Proceedings., IEEE, pp 147–154 Ding C, He X, Zha H, Simon HD (2002) Adaptive dimension reduction for clustering high dimensional data. In: 2002 IEEE international conference on data mining, 2002. Proceedings., IEEE, pp 147–154
10.
Zurück zum Zitat Ferman AM, Tekalp AM (1998) Efficient filtering and clustering methods for temporal video segmentation and visual summarization. J Vis Commun Image Represent 9(4):336–351CrossRef Ferman AM, Tekalp AM (1998) Efficient filtering and clustering methods for temporal video segmentation and visual summarization. J Vis Commun Image Represent 9(4):336–351CrossRef
11.
Zurück zum Zitat Fukunaga K, Hostetler L (1973) Optimization of k nearest neighbor density estimates. IEEE Trans Inf Theory 19(3):320–326MathSciNetCrossRef Fukunaga K, Hostetler L (1973) Optimization of k nearest neighbor density estimates. IEEE Trans Inf Theory 19(3):320–326MathSciNetCrossRef
12.
Zurück zum Zitat Fukunaga K, Narendra PM (1975) A branch and bound algorithm for computing k-nearest neighbors. IEEE Trans Comput 100(7):750–753CrossRef Fukunaga K, Narendra PM (1975) A branch and bound algorithm for computing k-nearest neighbors. IEEE Trans Comput 100(7):750–753CrossRef
13.
Zurück zum Zitat Hansen LK, Salamon P (1990) Neural network ensembles. IEEE Trans Pattern Anal Mach Intell 10:993–1001CrossRef Hansen LK, Salamon P (1990) Neural network ensembles. IEEE Trans Pattern Anal Mach Intell 10:993–1001CrossRef
14.
Zurück zum Zitat He W, Li B, Song D (2018) Decision boundary analysis of adversarial examples. In: 6th international conference on learning representations, ICLR 2018 He W, Li B, Song D (2018) Decision boundary analysis of adversarial examples. In: 6th international conference on learning representations, ICLR 2018
15.
Zurück zum Zitat Hjelm R, Jacob A, Che T, Trischler A, Cho K, Bengio Y (2018) Boundary-seeking generative adversarial networks. In: 6th international conference on learning representations, ICLR 2018; Conference date: 30-04-2018 Through 03-05-2018 Hjelm R, Jacob A, Che T, Trischler A, Cho K, Bengio Y (2018) Boundary-seeking generative adversarial networks. In: 6th international conference on learning representations, ICLR 2018; Conference date: 30-04-2018 Through 03-05-2018
16.
Zurück zum Zitat Hodge V, Austin J (2004) A survey of outlier detection methodologies. Artif Intell Rev 22(2):85–126CrossRef Hodge V, Austin J (2004) A survey of outlier detection methodologies. Artif Intell Rev 22(2):85–126CrossRef
17.
Zurück zum Zitat Honda K, Ichihashi H (2005) Regularized linear fuzzy clustering and probabilistic PCA mixture models. IEEE Trans Fuzzy Syst 13(4):508–516CrossRef Honda K, Ichihashi H (2005) Regularized linear fuzzy clustering and probabilistic PCA mixture models. IEEE Trans Fuzzy Syst 13(4):508–516CrossRef
18.
Zurück zum Zitat Johnson SC (1967) Hierarchical clustering schemes. Psychometrika 32(3):241–254CrossRef Johnson SC (1967) Hierarchical clustering schemes. Psychometrika 32(3):241–254CrossRef
19.
Zurück zum Zitat Li SZ, Chu R, Liao S, Zhang L (2007) Illumination invariant face recognition using near-infrared images. IEEE Trans Pattern Anal Mach Intell 29(4):627–639CrossRef Li SZ, Chu R, Liao S, Zhang L (2007) Illumination invariant face recognition using near-infrared images. IEEE Trans Pattern Anal Mach Intell 29(4):627–639CrossRef
20.
Zurück zum Zitat Li TH, Chang SJ, Tong W (2004) Fuzzy target tracking control of autonomous mobile robots by using infrared sensors. IEEE Trans Fuzzy Syst 12(4):491–501CrossRef Li TH, Chang SJ, Tong W (2004) Fuzzy target tracking control of autonomous mobile robots by using infrared sensors. IEEE Trans Fuzzy Syst 12(4):491–501CrossRef
21.
Zurück zum Zitat Liao S, Jain AK, Li SZ (2013) Partial face recognition: alignment-free approach. IEEE Trans Pattern Anal Mach Intell 35(5):1193–1205CrossRef Liao S, Jain AK, Li SZ (2013) Partial face recognition: alignment-free approach. IEEE Trans Pattern Anal Mach Intell 35(5):1193–1205CrossRef
22.
Zurück zum Zitat Melnik O (2002) Decision region connectivity analysis: a method for analyzing high-dimensional classifiers. Mach Learn 48(1–3):321–351CrossRef Melnik O (2002) Decision region connectivity analysis: a method for analyzing high-dimensional classifiers. Mach Learn 48(1–3):321–351CrossRef
23.
Zurück zum Zitat Ng AY, Jordan MI, Weiss Y (2002) On spectral clustering: analysis and an algorithm. In: Advances in neural information processing systems, pp 849–856 Ng AY, Jordan MI, Weiss Y (2002) On spectral clustering: analysis and an algorithm. In: Advances in neural information processing systems, pp 849–856
24.
Zurück zum Zitat Porter R, Canagarajah N (1996) A robust automatic clustering scheme for image segmentation using wavelets. IEEE Trans Image Process 5(4):662–665CrossRef Porter R, Canagarajah N (1996) A robust automatic clustering scheme for image segmentation using wavelets. IEEE Trans Image Process 5(4):662–665CrossRef
25.
Zurück zum Zitat Qiu B, Cao X (2016) Clustering boundary detection for high dimensional space based on space inversion and Hopkins statistics. Knowl Based Syst 98:216–225CrossRef Qiu B, Cao X (2016) Clustering boundary detection for high dimensional space based on space inversion and Hopkins statistics. Knowl Based Syst 98:216–225CrossRef
26.
Zurück zum Zitat Qiu B, Feng Y, Yi SJ (2007) Brim: an efficient boundary points detecting algorithm. In: Advances in knowledge discovery and data mining Qiu B, Feng Y, Yi SJ (2007) Brim: an efficient boundary points detecting algorithm. In: Advances in knowledge discovery and data mining
27.
Zurück zum Zitat Qiu B, Yang Y, Xiaowu D (2012) Brink: an algorithm of boundary points of clusters detection based on local qualitative factors. J Zhengzhou Univ Eng Sci 33(3):117–121MathSciNet Qiu B, Yang Y, Xiaowu D (2012) Brink: an algorithm of boundary points of clusters detection based on local qualitative factors. J Zhengzhou Univ Eng Sci 33(3):117–121MathSciNet
28.
Zurück zum Zitat Ruggieri S (2002) Efficient c4. 5 [classification algorithm]. IEEE Trans Knowl Data Eng 14(2):438–444CrossRef Ruggieri S (2002) Efficient c4. 5 [classification algorithm]. IEEE Trans Knowl Data Eng 14(2):438–444CrossRef
29.
Zurück zum Zitat Smith FW (1968) Pattern classifier design by linear programming. IEEE Trans Comput 100(4):367–372CrossRef Smith FW (1968) Pattern classifier design by linear programming. IEEE Trans Comput 100(4):367–372CrossRef
30.
Zurück zum Zitat Tong S, Chang E (2001) Support vector machine active learning for image retrieval. In: Proceedings of the ninth ACM international conference on Multimedia, pp 107–118 Tong S, Chang E (2001) Support vector machine active learning for image retrieval. In: Proceedings of the ninth ACM international conference on Multimedia, pp 107–118
31.
Zurück zum Zitat Xia C, Hsu W, Lee ML, Ooi BC (2006) Border: efficient computation of boundary points. IEEE Trans Knowl Data Eng 18(3):289–303CrossRef Xia C, Hsu W, Lee ML, Ooi BC (2006) Border: efficient computation of boundary points. IEEE Trans Knowl Data Eng 18(3):289–303CrossRef
32.
Zurück zum Zitat Xue L, Qiu B (2009) Boundary points detection algorithm based on coefficient of variation. Pattern Recognit Artif Intell 22(5):799–802 Xue L, Qiu B (2009) Boundary points detection algorithm based on coefficient of variation. Pattern Recognit Artif Intell 22(5):799–802
33.
Zurück zum Zitat Zhang H, Wang S, Xu X, Chow TW, Wu QJ (2018) Tree2vector: learning a vectorial representation for tree-structured data. IEEE Trans Neural Netw Learn Syst 99:1–15MathSciNet Zhang H, Wang S, Xu X, Chow TW, Wu QJ (2018) Tree2vector: learning a vectorial representation for tree-structured data. IEEE Trans Neural Netw Learn Syst 99:1–15MathSciNet
34.
Zurück zum Zitat Zhang H, Guo H, Wang X, Ji Y, Wu QJ (2020) Clothescounter: a framework for star-oriented clothes mining from videos. Neurocomputing 377:38–48CrossRef Zhang H, Guo H, Wang X, Ji Y, Wu QJ (2020) Clothescounter: a framework for star-oriented clothes mining from videos. Neurocomputing 377:38–48CrossRef
35.
Metadaten
Titel
High-dimensional cluster boundary detection using directed Markov tree
verfasst von
Xiaofeng Cao
Publikationsdatum
28.06.2020
Verlag
Springer London
Erschienen in
Pattern Analysis and Applications / Ausgabe 1/2021
Print ISSN: 1433-7541
Elektronische ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-020-00897-2

Weitere Artikel der Ausgabe 1/2021

Pattern Analysis and Applications 1/2021 Zur Ausgabe

Premium Partner