Abstract
Patient similarity assessment is an important task in patient cohort identification for comparative effectiveness studies and clinical decision support applications. The goal is to derive a clinically meaningful distance metric that measures the similarity between patients represented by their key clinical indicators. Several questions arise. How can physician feedback on the retrieval results be incorporated? How can the underlying similarity measure be updated interactively based on that feedback? Moreover, different physicians often have different understandings of patient similarity based on their own patient cohorts, so the distance metric learned for an individual physician often gives only a limited view of the true underlying distance metric. How can the individual distance metrics from each physician be integrated into a globally consistent unified metric?
We describe a suite of supervised metric learning approaches that answer the above questions. In particular, we present Locally Supervised Metric Learning (LSML) to learn a generalized Mahalanobis distance that is tailored toward physician feedback. Then we describe the interactive metric learning (iMet) method, which can incrementally update an existing metric based on physician feedback in an online fashion. To combine multiple similarity measures from multiple physicians, we present the Composite Distance Integration (Comdi) method. In this approach we first construct discriminative neighborhoods from each individual metric, then combine them into a single optimal distance metric. Finally, we present a clinical decision support prototype system powered by the proposed patient similarity methods, and evaluate the proposed methods using real EHR data against several baselines.
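To make the notion of a generalized Mahalanobis distance concrete, the following is a minimal sketch of the distance computation only, not of LSML, iMet, or Comdi themselves. It assumes the standard form d(x, y) = sqrt((x - y)^T M (x - y)) with a symmetric positive semidefinite matrix M. The function name `mahalanobis`, the matrices, and the two patient vectors are hypothetical and chosen purely for illustration; in practice M would be produced by a metric learning method.

```python
import math


def mahalanobis(x, y, M):
    """Generalized Mahalanobis distance sqrt((x - y)^T M (x - y)).

    Assumption: M is a symmetric positive semidefinite matrix given as a
    list of rows. Here it is supplied by hand; a metric learning method
    would normally learn it from supervision such as physician feedback.
    """
    diff = [a - b for a, b in zip(x, y)]
    # Matrix-vector product M @ diff
    m_diff = [sum(m_ij * d_j for m_ij, d_j in zip(row, diff)) for row in M]
    # Quadratic form diff^T M diff
    quad = sum(d_i * md_i for d_i, md_i in zip(diff, m_diff))
    # Guard against tiny negative values caused by floating-point rounding
    return math.sqrt(max(quad, 0.0))


# Hypothetical two-feature patient vectors (illustrative values only)
patient_a = [1.0, 2.0]
patient_b = [4.0, 6.0]

# With the identity matrix the distance reduces to the Euclidean distance
identity = [[1.0, 0.0], [0.0, 1.0]]
# A hand-picked matrix that weights the first feature more heavily
weighted = [[4.0, 0.0], [0.0, 1.0]]

euclidean = mahalanobis(patient_a, patient_b, identity)  # 5.0
reweighted = mahalanobis(patient_a, patient_b, weighted)
```

Changing M changes which feature differences count most toward dissimilarity, which is the degree of freedom that a supervised metric learning method adjusts.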
Index Terms
- Supervised patient similarity measure of heterogeneous patient records