research-article

Unfolding physiological state: mortality modelling in intensive care units

Authors:
Marzyeh Ghassemi (Massachusetts Institute of Technology, Cambridge, MA, USA)
Tristan Naumann (Massachusetts Institute of Technology, Cambridge, MA, USA)
Finale Doshi-Velez (Harvard, Boston, MA, USA)
Nicole Brimmer (Massachusetts Institute of Technology, Cambridge, MA, USA)
Rohit Joshi (Massachusetts Institute of Technology, Cambridge, MA, USA)
Anna Rumshisky (University of Massachusetts Lowell, Lowell, MA, USA)
Peter Szolovits (Massachusetts Institute of Technology, Cambridge, MA, USA)

KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, August 2014, Pages 75–84, https://doi.org/10.1145/2623330.2623742

Published: 24 August 2014

ABSTRACT

Accurate knowledge of a patient's disease state and trajectory is critical in a clinical setting. Modern electronic healthcare records contain an increasingly large amount of data, and the ability to automatically identify the factors that influence patient outcomes stands to greatly improve the efficiency and quality of care.

We examined the use of latent variable models (viz. Latent Dirichlet Allocation) to decompose free-text hospital notes into meaningful features, and the predictive power of these features for patient mortality. We considered three prediction regimes: (1) baseline prediction, (2) dynamic (time-varying) outcome prediction, and (3) retrospective outcome prediction. In each, our prediction task differs from the familiar time-varying situation whereby data accumulates; since fewer patients have long ICU stays, as we move forward in time fewer patients are available and the prediction task becomes increasingly difficult.
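To make the decomposition step above concrete, the following is a minimal sketch of Latent Dirichlet Allocation fitted by collapsed Gibbs sampling, returning one topic-proportion vector per note. It is an illustration only: the sampler, the hyperparameters `alpha` and `beta`, the function name `lda_gibbs`, and the toy token lists are all assumptions made here and are not taken from the paper, which does not state its inference procedure in this abstract.

```python
import random

def lda_gibbs(docs, n_topics, n_iters=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-document topic proportions.

    docs is a list of token lists. alpha and beta are symmetric Dirichlet
    priors (illustrative defaults, not values from the paper).
    """
    rng = random.Random(seed)
    vocab = {w: i for i, w in enumerate(sorted({w for doc in docs for w in doc}))}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]             # tokens in doc d assigned to topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # times word v is assigned to topic k
    topic_total = [0] * n_topics                           # total tokens assigned to topic k
    assign = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z = [rng.randrange(n_topics) for _ in doc]
        assign.append(z)
        for w, k in zip(doc, z):
            doc_topic[d][k] += 1
            topic_word[k][vocab[w]] += 1
            topic_total[k] += 1

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = vocab[w], assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed topic proportions: one feature vector per document.
    return [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha) for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]

# Hypothetical, already-tokenized notes (made up for illustration).
toy_notes = [
    ["sepsis", "fever", "antibiotics", "fever"],
    ["ventilator", "intubated", "sedation"],
    ["fever", "antibiotics", "sepsis"],
]
topic_features = lda_gibbs(toy_notes, n_topics=2, n_iters=50)
```

Each row of `topic_features` sums to one and could be passed to a downstream classifier as that note's feature vector.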

We found that latent topic-derived features were effective in determining patient mortality under three timelines: in-hospital, 30-day post-discharge, and 1-year post-discharge mortality. Our results demonstrated that the latent topic features important in predicting hospital mortality are very different from those important in predicting post-discharge mortality. In general, latent topic features were more predictive than structured features, and a combination of the two performed best.
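The three outcome timelines named above can be expressed as binary labels derived from a discharge date and an optional date of death. The sketch below is one plausible encoding, not the authors' definition: the function name `mortality_labels`, the choice to count a death on the discharge date as in-hospital, the exclusion of in-hospital deaths from the post-discharge labels, and the use of 365 days for one year are all assumptions.

```python
from datetime import date, timedelta
from typing import Optional

def mortality_labels(discharge: date, death: Optional[date]) -> dict:
    """Derive three binary mortality outcomes for one hospital stay.

    Assumption: a death on or before the discharge date is in-hospital,
    and such patients are not counted in the post-discharge outcomes.
    """
    in_hospital = death is not None and death <= discharge

    def died_within(days: int) -> bool:
        # Post-discharge death no later than `days` days after discharge.
        return (
            death is not None
            and not in_hospital
            and death <= discharge + timedelta(days=days)
        )

    return {
        "in_hospital": in_hospital,
        "post_discharge_30_day": died_within(30),
        "post_discharge_1_year": died_within(365),
    }
```

Under this encoding a death within 30 days of discharge is also positive for the 1-year label, because the 1-year window contains the 30-day window.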

The time-varying models that combined latent topic features and baseline features had AUCs that reached 0.85, 0.80, and 0.77 for in-hospital, 30-day post-discharge, and 1-year post-discharge mortality, respectively. Our results agreed with other work suggesting that the first 24 hours of patient information are often the most predictive of hospital mortality. Retrospective models that used a combination of latent topic features and structured features achieved AUCs of 0.96, 0.82, and 0.81 for in-hospital, 30-day, and 1-year mortality prediction.
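For readers unfamiliar with the metric reported above, AUC (area under the ROC curve) equals the probability that a randomly chosen patient who died receives a higher risk score than a randomly chosen survivor, with ties counted as one half. The following sketch computes it directly from that definition. The function name `auc` and the pairwise implementation are choices made here for clarity and say nothing about how the paper's figures were computed.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risk per patient; labels: 1 for death, 0 for survival.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive correctly ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))
```

A value of 0.5 corresponds to ranking patients at random, and 1.0 to ranking every death above every survivor.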

Our work focuses on the dynamic (time-varying) setting because models from this regime could facilitate an ongoing severity stratification system that helps direct care-staff resources and inform treatment strategies.
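The dynamic setting can be pictured as a series of snapshots: at each prediction time, only patients still in the ICU are scored, using only the notes written so far. Because fewer patients have long stays, the cohort shrinks as the prediction time moves forward. The sketch below illustrates this under assumed names (`Stay`, `snapshot`) and an assumed representation of notes as (hours since admission, text) pairs. None of this is specified in the abstract.

```python
from dataclasses import dataclass, field

@dataclass
class Stay:
    patient_id: str
    los_hours: float                             # ICU length of stay, in hours
    notes: list = field(default_factory=list)    # (hours since admission, text) pairs

def snapshot(stays, t_hours):
    """Return the notes visible at time t_hours for each patient still in the ICU.

    Patients whose stay ended before t_hours are dropped, so the cohort
    shrinks as t_hours grows.
    """
    return {
        s.patient_id: [text for when, text in s.notes if when <= t_hours]
        for s in stays
        if s.los_hours >= t_hours
    }

# Hypothetical stays, made up for illustration.
stays = [
    Stay("a", 20, [(2, "admit note"), (14, "nursing note")]),
    Stay("b", 50, [(5, "admit note"), (30, "radiology note")]),
    Stay("c", 100, [(1, "admit note"), (60, "progress note")]),
]
cohort_sizes = {t: len(snapshot(stays, t)) for t in (12, 24, 72)}
```

With these toy stays, three patients are available at 12 hours, two at 24 hours, and one at 72 hours, which mirrors the shrinking-cohort difficulty described in the abstract.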

Supplemental Material

p75-sidebyside.mp4 (mp4, 275.8 MB)

Published in
KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2014
2028 pages
ISBN: 9781450329569
DOI: 10.1145/2623330
General Chairs: Sofus Macskassy (Facebook), Claudia Perlich (Dstillery)
Program Chairs: Jure Leskovec (Stanford University), Wei Wang (UCLA), Rayid Ghani (University of Chicago)
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
data mining for social good
graphical and latent variable models
healthcare and medicine
support vector machines
text
topic
Acceptance Rates
KDD '14 Paper Acceptance Rate: 151 of 1,036 submissions, 15%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%