
2019 | OriginalPaper | Book chapter

Improving Clinical Subjects Clustering by Learning and Optimizing Feature Weights

Authors: Sergio Consoli, Monique Hendriks, Pieter Vos, Jacek Kustra, Dimitrios Mavroeidis, Ralf Hoffmann

Published in: Machine Learning, Optimization, and Data Science

Publisher: Springer International Publishing


Abstract

Data analytics methods in the clinical domain are challenging to put into practice. Unsupervised learning offers a way to provide the level of personalization in evidence-based decision-making that can otherwise only be achieved with prediction models, by helping doctors gain insights from data. In this context, grouping clinical subjects according to patients' biomedical information is an important task for patient cohort identification in comparative effectiveness studies and clinical decision-support applications. It allows the decision-making process to draw not only on data but also on doctors’ domain knowledge. However, one issue that must be addressed for a focused and realistic unsupervised clustering of clinical subjects is that most patient datasets are heterogeneous: their features belong to several different feature spaces (e.g. nominal, ordinal, interval, or ratio) with completely different variation ranges and statistical distributions, which affects clustering quality and performance. To use these measurements properly in an unsupervised manner, their corresponding weights need to be modeled. In this paper, we present a method for learning feature weights on clinical data. We show that learning feature weights is necessary to generate a meaningful separation of data in high-dimensional space. The method is based on the silhouette score and principal component analysis, and we demonstrate its performance on a clinical test dataset.
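To make the idea of feature weighting concrete, the following is a minimal sketch, not the authors' algorithm, whose details the abstract does not give. It assumes a toy two-feature dataset, a weighted Euclidean distance, a plain k-means, and a coarse grid search that picks the weight vector with the highest silhouette score. All function names and data values are hypothetical, and the principal component analysis step mentioned in the abstract is left out.

```python
import math

def min_max_scale(points):
    """Rescale every feature to [0, 1] so that raw ranges do not dominate."""
    cols = list(zip(*points))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [tuple((x - l) / (h - l) if h > l else 0.0
                  for x, l, h in zip(p, lo, hi)) for p in points]

def weighted_dist(a, b, w):
    """Weighted Euclidean distance; w holds one non-negative weight per feature."""
    return math.sqrt(sum(wi * (x - y) ** 2 for wi, x, y in zip(w, a, b)))

def kmeans(points, k, w, iters=25):
    """Plain Lloyd's k-means under the weighted distance (first k points as start)."""
    centers = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: weighted_dist(p, centers[c], w))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def silhouette(points, labels, w):
    """Mean silhouette coefficient under the weighted distance."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return -1.0  # assumption: treat a single-cluster result as worst case
    scores = []
    for i, p in enumerate(points):
        mean_dist = {}
        for c in clusters:
            d = [weighted_dist(p, q, w)
                 for j, q in enumerate(points) if labels[j] == c and j != i]
            mean_dist[c] = sum(d) / len(d) if d else None
        a = mean_dist[labels[i]]
        if a is None:  # singleton cluster: silhouette is 0 by convention
            scores.append(0.0)
            continue
        b = min(v for c, v in mean_dist.items() if c != labels[i])
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

def best_weights(points, k=2, steps=10):
    """Grid search over (w0, 1 - w0); returns (best score, best weights)."""
    best = None
    for i in range(1, steps):
        w = (i / steps, 1 - i / steps)
        score = silhouette(points, kmeans(points, k, w), w)
        if best is None or score > best[0]:
            best = (score, w)
    return best

# Hypothetical data: feature 0 separates two groups, feature 1 is wide-range noise.
POINTS = [(0.10, 80.0), (0.90, 15.0), (0.15, 20.0), (0.85, 75.0),
          (0.05, 50.0), (0.95, 45.0), (0.12, 10.0), (0.88, 90.0)]

scaled = min_max_scale(POINTS)
uniform = (0.5, 0.5)
uniform_score = silhouette(scaled, kmeans(scaled, 2, uniform), uniform)
best_score, best_w = best_weights(scaled)
print("uniform weights:", uniform, "silhouette:", round(uniform_score, 3))
print("best weights:   ", best_w, "silhouette:", round(best_score, 3))
```

One caveat on this design: maximizing the silhouette score over weights alone tends to favor putting almost all weight on a single well-separated feature, so a practical method would likely need some constraint or additional criterion. The grid here deliberately excludes zero weights for that reason.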


Metadata
Title
Improving Clinical Subjects Clustering by Learning and Optimizing Feature Weights
Authors
Sergio Consoli
Monique Hendriks
Pieter Vos
Jacek Kustra
Dimitrios Mavroeidis
Ralf Hoffmann
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-13709-0_26
