2017 | Book

Data Management and Analytics for Medicine and Healthcare

Third International Workshop, DMAH 2017, Held at VLDB 2017, Munich, Germany, September 1, 2017, Proceedings

Edited by: Edmon Begoli, Fusheng Wang, Gang Luo

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

This book constitutes the thoroughly refereed conference proceedings of the Third International Workshop on Data Management and Analytics for Medicine and Healthcare, DMAH 2017, in Munich, Germany, in September 2017, held in conjunction with the 43rd International Conference on Very Large Data Bases, VLDB 2017.

The 9 revised full papers presented together with 2 keynote abstracts were carefully reviewed and selected from 16 initial submissions. The papers are organized in topical sections on data privacy and trustability for electronic health records; biomedical data management and integration; online mining of health-related data; and clinical data analytics.

Table of contents

Frontmatter

Data Privacy and Trustability for Electronic Health Records

Frontmatter

How Blockchain Could Empower eHealth: An Application for Radiation Oncology

(Extended Abstract)

Abstract

Electronic medical records (EMRs) contain critical, highly sensitive private healthcare information, and need to be frequently shared among peers. Blockchain provides a shared, immutable and transparent history of all the transactions to build applications with trust, accountability and transparency. This provides a unique opportunity to develop a secure and trustable EMR data management and sharing system using blockchain. In this paper, we discuss our perspectives on blockchain-based healthcare data management and present a prototype of a framework for managing and sharing EMR data for cancer patient care.

Alevtina Dubovitskaya, Zhigang Xu, Samuel Ryu, Michael Schumacher, Fusheng Wang
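
The "shared, immutable history" property mentioned in the abstract above rests on hash-linking: each block commits to the hash of its predecessor, so altering an earlier entry invalidates every later one. The following is a minimal sketch of that idea only. It is not the authors' prototype, it omits consensus, distribution, and access control, and the payload fields and function names are hypothetical.

```python
import hashlib
import json


def block_hash(index, prev_hash, payload):
    """Hash a block's contents together with the previous block's hash."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a new block that commits to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    })


def is_valid(chain):
    """Recompute every hash and check each link to its predecessor."""
    prev_hash = "0" * 64
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True
```

Editing the payload of any stored block makes `is_valid` return `False`, which is the tamper-evidence that a ledger of EMR access or sharing events would rely on.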

Biomedical Data Management and Integration

Frontmatter

On-Demand Service-Based Big Data Integration: Optimized for Research Collaboration

Abstract

Biomedical research requires distributed access, analysis, and sharing of data from various dispersed sources at Internet scale. Due to the volume and variety of big data, materialized data integration is often infeasible or too expensive, given the costs of bandwidth, storage, maintenance, and management. Óbidos (On-demand Big Data Integration, Distribution, and Orchestration System) provides a novel on-demand integration approach for heterogeneous distributed data. Instead of integrating data from the data sources to build a complete data warehouse as the initial step, Óbidos employs a hybrid approach of virtual and materialized data integration. By allocating unique identifiers as pointers to virtually integrated data sets, Óbidos supports efficient data sharing among data consumers. We design Óbidos as a generic service-based data integration system, and implement and evaluate a prototype for multimodal medical data.

Pradeeban Kathiravelu, Yiru Chen, Ashish Sharma, Helena Galhardas, Peter Van Roy, Luís Veiga

CHIPS – A Service for Collecting, Organizing, Processing, and Sharing Medical Image Data in the Cloud

Abstract

Web browsers are increasingly used as middleware platforms offering a central access point for service provision. Backend containerization, RESTful APIs, and distributed computing make it possible to build complex systems that address the needs of modern compute-intensive environments. In this paper, we present a web-based medical image data and information management software platform called CHIPS (Cloud Healthcare Image Processing Service). This cloud-based service allows for authenticated and secure retrieval of medical image data from resources typically found in hospitals, organizes and presents information in a modern feed-like interface, provides access to a growing library of plugins that process these data, allows for easy data sharing between users, and provides powerful 3D visualization and real-time collaboration. Image processing is orchestrated across additional cloud-based resources using containerization technologies.

Rudolph Pienaar, Ata Turk, Jorge Bernal-Rusiel, Nicolas Rannou, Daniel Haehn, P. Ellen Grant, Orran Krieger

High Performance Merging of Massive Data from Genome-Wide Association Studies

Abstract

Traditional data processing methods that run on a single computer scale poorly and are inefficient at the ordered full outer join needed to merge a large number of individual Genome-Wide Association Study (GWAS) data sets. Although the emergence of big data platforms such as Hadoop and Spark sheds light on this problem, the inefficiency of keeping data in total sorted order and the workload imbalance problem limit their performance. In this study, we designed and compared three new methodologies, based on MapReduce, HBase, and Spark respectively, to merge hundreds of individual VCF files by Single Nucleotide Polymorphism (SNP) location into a single TPED file. Our methodologies overcame the limitations stated above and considerably improved performance, with good scalability in input size and computing resources.

Xiaobo Sun, Fusheng Wang, Zhaohui Qin
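
To make the "ordered full outer join" in the abstract above concrete, here is a small single-machine sketch. It is not one of the three MapReduce, HBase, or Spark methodologies from the paper, only the join semantics they implement at scale. It assumes each sample is already sorted by position with unique positions, represents genotypes as plain strings, and uses `"0 0"` as the missing-genotype placeholder; these choices and the function name are illustrative assumptions.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def ordered_full_outer_join(samples, missing="0 0"):
    """Merge per-sample sorted (position, genotype) lists into ordered rows.

    `samples` is a list of lists, each sorted by position with unique positions.
    Returns a list of (position, [genotype per sample]) sorted by position;
    samples that lack a position get `missing` in their column.
    """
    # Tag each record with its sample index so it lands in the right column.
    tagged = (
        [(pos, idx, gt) for pos, gt in records]
        for idx, records in enumerate(samples)
    )
    rows = []
    # heapq.merge performs a streaming k-way merge of already-sorted inputs.
    for pos, group in groupby(heapq.merge(*tagged), key=itemgetter(0)):
        row = [missing] * len(samples)
        for _, idx, gt in group:
            row[idx] = gt
        rows.append((pos, row))
    return rows
```

Every position that appears in any input yields exactly one output row, in sorted order, which is what distinguishes a full outer join from an inner join that would keep only positions shared by all samples.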

An Emerging Role for Polystores in Precision Medicine

Abstract

Medical data is organically heterogeneous, and it usually varies significantly in both size and composition. Yet, this data is also key to the recent and promising field of precision medicine, which focuses on identifying and tailoring appropriate medical treatments to the needs of individual patients, based on their specific conditions, medical history, lifestyle, genetics, and other individual factors. As we, and the database community at large, recognize that a “one size does not fit all” solution is required to work with such data, we present observations based on our experiences and on applications in the field of precision medicine. We make the case for a polystore architecture and show how it applies to precision medicine; we discuss the reference architecture; describe some of its critical components (array database); and discuss the specific types of analysis that directly benefit from this database architecture, and the ways it serves the data.

Edmon Begoli, J. Blair Christian, Vijay Gadepally, Stavros Papadopoulos

Online Mining of Health Related Data

Frontmatter

Social Media Mining to Understand Public Mental Health

Abstract

In this paper, we apply text mining and topic modelling to understand public mental health. We focus on identifying common mental health topics across two anonymous social media platforms: Reddit and a mobile journalling/mood-tracking app. Furthermore, we analyze journals from the app to uncover relationships between topics, journal visibility (private vs. visible to other users of the app), and user-labelled sentiment. Our main findings are that (1) anxiety and depression are shared on both platforms; (2) users of the journalling app keep routine topics such as eating private, and these topics rarely appear on Reddit; and (3) sleep was a critical theme on the journalling app and had an unexpectedly negative sentiment.

Andrew Toulis, Lukasz Golab

Clinical Data Analytics

Frontmatter

Effects of Varying Sampling Frequency on the Analysis of Continuous ECG Data Streams

Abstract

A myriad of data is produced in intensive care units (ICUs), even over short periods of time. This data is frequently used for monitoring a patient’s immediate health status, but not for real-time analysis, because of the technical challenges of processing such massive data in real time. Data storage is another challenge in making ICU data useful for retrospective studies. Therefore, it is important to know the minimal sampling frequency required to develop real-time analysis on ICU data and to develop a data storage plan. In this study, we have applied the Probabilistic Symbolic Pattern Recognition (PSPR) method to the Paroxysmal Atrial Fibrillation (PAF) screening problem by analyzing electrocardiogram signals at sampling frequencies ranging from 128 Hz down to 8 Hz. Our results show that, using the PSPR method, we can obtain a classification accuracy of 82.67% in identifying PAF subjects even when the test data is sampled at 8 Hz (73.33% for 128 Hz). This classification accuracy improved drastically to 92% when other descriptive features were used along with PSPR features. The PSPR’s PAF screening ability at low sampling frequency indicates its potential for real-time analysis and wearable embedded computing applications.

Ruhi Mahajan, Rishikesan Kamaleswaran, Oguz Akbilgic
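
The storage implication of the sampling frequencies discussed in the abstract above can be illustrated with simple decimation. The abstract does not say how the lower-frequency signals were produced, so this sketch assumes the simplest option, keeping every k-th sample with no anti-aliasing filter; the function name and the synthetic input are hypothetical.

```python
def decimate(signal, source_hz, target_hz):
    """Keep every k-th sample, where k = source_hz / target_hz.

    Assumes target_hz is a positive divisor of source_hz. No low-pass
    filtering is applied, so this is naive decimation, not proper resampling.
    """
    if target_hz <= 0 or source_hz % target_hz != 0:
        raise ValueError("target_hz must be a positive divisor of source_hz")
    step = source_hz // target_hz
    return signal[::step]


# One second of placeholder samples at 128 Hz (sample indices, not real ECG data).
one_second = list(range(128))
reduced = decimate(one_second, source_hz=128, target_hz=8)
```

Going from 128 Hz to 8 Hz keeps one sample in sixteen, so the same recording needs one sixteenth of the storage and processing per unit time.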

Detection and Visualization of Variants in Typical Medical Treatment Sequences

Abstract

Electronic Medical Records (EMRs) are widely used in many large hospitals. EMRs can reduce the cost of managing medical histories, and can also improve medical processes through the secondary use of these records. Medical workers, including doctors, nurses, and technicians, generally use clinical pathways as their guidelines for typical sequences of medical treatments. The medical workers traditionally generate the clinical pathways themselves based on their experience. It is helpful for the medical workers to verify the correctness of existing clinical pathways, or to modify them, by comparing them with the frequent sequential patterns in medical orders computationally extracted from EMR logs. In our previous work, treating the EMR as a database and a typical clinical pathway as a frequent sequential pattern in that database, we proposed a method to extract typical clinical pathways as frequent sequential patterns with treatment time information from EMR logs. These patterns tend to contain variants that are influential in verification and modification. In this paper, we propose an approach for detecting the variants in frequent sequential patterns of medical orders while considering time information. Since it is important to provide visual views of these variants so that the results can be used effectively by the medical workers, we also develop an interactive graphical interface system for visualizing the variants in clinical pathways. The results of applying the approach to actual EMR logs at a university hospital are reported.

Yuichi Honda, Muneo Kushima, Tomoyoshi Yamazaki, Kenji Araki, Haruo Yokota

Umedicine: A System for Clinical Practice Support and Data Analysis

Abstract

Recording patient clinical data in a comprehensive and easy way is very important for health care providers. However, although information systems exist to facilitate the storage of and access to patient data, many records are still kept on paper. Even when data is stored electronically, systems are often complex to use and do not provide means to gather statistical information about a population of patients, thus limiting the usefulness of the data. Physicians often give up searching for relevant information to support their medical decisions because the task is too time-consuming. This paper proposes Umedicine, a web-based software application in Portuguese that addresses current limitations of clinical information systems. Umedicine is an application for physicians, patients and administrative staff that keeps clinical data (e.g., symptoms, clinical examination results, and treatments prescribed) up to date in a database in a structured way. It also provides easy and quick access to a large amount of clinical data collected over time. Furthermore, Umedicine supports the application of a particular clustering algorithm and a visualization module for analyzing patient time-series data, to identify evolution patterns. Preliminary user tests revealed promising results, showing that users were able to identify the evolution of groups of patients over time and their common characteristics.

Nuno F. Lages, Bernardo Caetano, Manuel J. Fonseca, João D. Pereira, Helena Galhardas, Rui Farinha

Association Rule Learning and Frequent Sequence Mining of Cancer Diagnoses in New York State

Abstract

Analyzing large-scale diagnosis histories of patients could help to discover comorbidity or disease progression patterns. Recently, open data initiatives have made it possible to access statewide patient data at the individual level, such as the New York State SPARCS data. The goal of this study is to explore frequent disease co-occurrence and sequence patterns of cancer patients in New York State using SPARCS data. Our collection includes 18,208,830 discharge records from 1,565,237 patients with cancer-related diagnoses during 2011–2015. We use the Apriori algorithm to discover top disease co-occurrences for common cancer categories based on support. We generate top frequent sequences of diagnoses with at least one cancer-related diagnosis from patients’ diagnosis histories using the cSPADE algorithm. Our data-driven approach provides essential knowledge to support the investigation of disease co-occurrence and progression patterns for improving the management of multiple diseases.

Yu Wang, Fusheng Wang
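
The abstract above ranks disease co-occurrences by support, that is, the fraction of records that contain a given set of diagnoses. The sketch below covers only the first two levels of an Apriori-style search (frequent single items, then pairs built from them) and is not the authors' pipeline. The function name is hypothetical, and the diagnosis labels in the usage example are made-up toy data, not SPARCS records.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return {pair: support} for pairs whose support is at least min_support.

    Support is the fraction of transactions that contain the itemset.
    """
    n = len(transactions)
    if n == 0:
        return {}
    sets = [set(t) for t in transactions]
    # Level 1: count single items and keep the frequent ones.
    item_counts = Counter(item for s in sets for item in s)
    frequent_items = {i for i, c in item_counts.items() if c / n >= min_support}
    # Level 2: only pairs of frequent items can be frequent (the Apriori property).
    pair_counts = Counter()
    for s in sets:
        kept = sorted(s & frequent_items)
        pair_counts.update(combinations(kept, 2))
    return {p: c / n for p, c in pair_counts.items() if c / n >= min_support}


# Toy diagnosis sets, one per patient (illustrative only).
toy = [
    ["cancer", "hypertension"],
    ["cancer", "hypertension", "diabetes"],
    ["cancer", "anemia"],
    ["hypertension"],
]
result = frequent_pairs(toy, min_support=0.5)
```

Pruning infrequent single items before counting pairs is what keeps the candidate space manageable; a full Apriori implementation repeats the same step for larger itemsets.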

Healthsurance – Mobile App for Standardized Electronic Health Records Database

Abstract

With the increasing popularity of Electronic Health Records (EHRs), there arises a need to understand their importance in clinical contexts for a standards-based health application. Standards for semantic interoperability propose the use of archetypes for building health applications. EHR data is usually entered for storage through graphical user interfaces. Generally, the user interface is static and tied to the underlying medical concept; such interfaces are often built manually and are prone to errors. However, evolving knowledge demands dynamically generated user interfaces to reduce time, minimize cost and enhance reliability. The current research implements a mobile app for a standardized electronic health records database, termed HEALTHSURANCE. The application maintains its dynamic behavior by creating graphical user interfaces at runtime, drawing on the artefacts (known as archetypes) available from standard clinical repositories (such as the Clinical Knowledge Manager). This provides easy and hassle-free operation without any need for a mobile developer. A standardized format and content help to improve the credibility of the data and maintain a uniform and specific set of constraints used to evaluate the user’s health. A generic centralized database is chosen for data storage to support evolution in clinical knowledge and to handle the heterogeneity of EHR data. Implementing a mobile app based on the archetype paradigm avoids reimplementing systems and migrating databases, and allows the creation of future-proof systems.

Prateek Jain, Sagar Bhargava, Naman Jain, Shelly Sachdeva, Shivani Batra, Subhash Bhalla

Backmatter

Title: Data Management and Analytics for Medicine and Healthcare
Edited by: Edmon Begoli, Fusheng Wang, Gang Luo
Publisher: Springer International Publishing
Electronic ISBN: 978-3-319-67186-4
Print ISBN: 978-3-319-67185-7
DOI: https://doi.org/10.1007/978-3-319-67186-4
