Rare diseases affect 350 million patients worldwide, but they are commonly diagnosed late or misdiagnosed. Detecting rare diseases poses two main challenges: extreme data imbalance and the difficulty of finding appropriate features. In this paper, we propose to address both problems by using semi-supervised generative adversarial networks (GANs) to handle the data imbalance and recurrent neural networks (RNNs) to model patient sequences directly. We evaluated the approach on detecting patients with one particular rare disease, exocrine pancreatic insufficiency (EPI). The dataset, drawn from a large longitudinal study covering 7 years of medical claims, includes 1.8 million patients, of whom 29,149 are positive. Our model achieved a PR-AUC of 0.56 and outperformed benchmark models in terms of precision and recall.
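To put the reported figures in context, the short sketch below computes the positive-class prevalence implied by the dataset counts and shows one common way to estimate PR-AUC (average precision). It is an illustration only, not the evaluation code used in this work: the helper `average_precision` and the toy labels and scores are hypothetical, and the total of 1.8 million is the rounded figure from the text, so the prevalence is approximate. Since the PR-AUC of a random classifier is roughly equal to the prevalence, the ratio gives a rough baseline against which 0.56 can be read.

```python
# Illustrative sketch: counts are taken from the abstract; everything else
# (function name, toy labels, toy scores) is a hypothetical example.
positives = 29_149
total = 1_800_000  # "1.8 million" as stated, so the ratio is approximate

# Share of positive patients; a random ranking scores about this in PR-AUC.
prevalence = positives / total


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision)."""
    # Rank examples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    hits = 0
    total_precision = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision at the rank where this positive is recalled.
            total_precision += hits / rank
    return total_precision / n_pos if n_pos else 0.0


# Toy data: two positives among eight examples, ranked 1st and 4th.
toy_labels = [1, 0, 0, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
toy_ap = average_precision(toy_labels, toy_scores)

print(f"prevalence = {prevalence:.4f}")
print(f"toy average precision = {toy_ap:.2f}")
```

Accuracy and ROC-AUC can look strong at this level of imbalance even for weak models, which is one reason precision-recall metrics are a natural choice for this task.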