Published in: Soft Computing 4/2021

10.10.2020 | Methodologies and Application

Convex clustering method for compositional data modeling

Authors: Xiaokang Wang, Huiwen Wang, Zhichao Wang, Jidong Yuan


Abstract

Compositional data are vectors whose parts are positive and subject to a constant-sum constraint. Real-world examples include a vector in which each entry represents the weight of a stock in an investment portfolio, or the relative concentrations of air pollutants in the environment. In this study, we developed a convex clustering approach for grouping compositional data. Convex clustering is desirable because, as a convex relaxation of hierarchical clustering, it provides a globally optimal solution. However, when it is applied directly to compositions, the clustering result offers little interpretability because the unit-sum constraint of compositional data is ignored. We therefore discuss the clustering of compositional variables in the Aitchison framework with an isometric log-ratio (ilr) transformation. The objective function is formulated as a combination of an \(L_2\)-norm loss term and an \(L_1\)-norm regularization term and is then solved efficiently using the alternating direction method of multipliers. In the numerical simulations, clustering ilr-transformed data is more accurate than directly clustering untransformed compositional data. To demonstrate its practical use, the proposed method is also tested on several real-world datasets.
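The two ingredients named in the abstract, the ilr transformation and an objective with an \(L_2\)-norm loss and an \(L_1\)-norm fusion penalty, can be illustrated with a minimal Python sketch. The details here are assumptions and not taken from the paper: the Helmert-type basis is only one of many valid ilr bases, the pairwise weights default to 1, the exact scaling of the objective is a common convention, and the names `ilr_basis`, `convex_clustering_objective` and `gamma` are hypothetical. The sketch only evaluates the objective and does not implement the ADMM solver.

```python
import numpy as np

def closure(x):
    """Rescale each row of positive parts so that it sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=1, keepdims=True)

def ilr_basis(d):
    """Orthonormal (d-1) x d contrast matrix (Helmert-type).

    Assumption: any orthonormal basis of the sum-zero subspace would do;
    this particular choice is for illustration only.
    """
    v = np.zeros((d - 1, d))
    for i in range(1, d):
        v[i - 1, :i] = 1.0 / i
        v[i - 1, i] = -1.0
        v[i - 1] *= np.sqrt(i / (i + 1.0))
    return v

def ilr(x):
    """Map n x d compositions to n x (d-1) unconstrained real coordinates."""
    x = closure(x)
    logx = np.log(x)
    # Centered log-ratio: subtract the row-wise mean of the logs.
    clr = logx - logx.mean(axis=1, keepdims=True)
    return clr @ ilr_basis(x.shape[1]).T

def convex_clustering_objective(z, u, gamma, weights=None):
    """Squared L2 loss plus gamma times an L1 fusion penalty on centroid pairs.

    z: n x p ilr coordinates, u: n x p candidate centroids (one per point).
    Assumed form: 0.5 * sum_i ||z_i - u_i||_2^2
                  + gamma * sum_{i<j} w_ij * ||u_i - u_j||_1
    """
    n = z.shape[0]
    loss = 0.5 * np.sum((z - u) ** 2)
    penalty = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            w = 1.0 if weights is None else weights[i, j]
            penalty += w * np.sum(np.abs(u[i] - u[j]))
    return loss + gamma * penalty
```

Because the basis is orthonormal, Euclidean distances between ilr coordinates equal Aitchison distances between the original compositions, which is why a standard Euclidean loss can be applied after the transformation. Points whose centroids `u[i]` coincide at the minimizer are read as belonging to the same cluster, and a larger `gamma` fuses more centroids.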


Metadata
Title
Convex clustering method for compositional data modeling
Authors
Xiaokang Wang
Huiwen Wang
Zhichao Wang
Jidong Yuan
Publication date
10.10.2020
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 4/2021
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-020-05355-z
