About this book

This book introduces readers to the methods, types of data, and scale of analysis used in the context of health. The challenges of working with big data are explored throughout the book, while the benefits are also emphasized through the discoveries made possible by linking large datasets. Methods include thorough case studies from statistics, as well as the newest facets of data analytics: data visualization, modeling and simulation, and machine learning. The diversity of datasets is illustrated through chapters on networked data, image processing, and text, in addition to typical structured numerical datasets. While the methods, types of data, and scale have been individually covered elsewhere, by bringing them all together under one “umbrella” the book highlights synergies, while also helping scholars fluidly switch between tools as needed. New challenges and emerging frontiers are also discussed, helping scholars grasp how methods will need to change in response to the latest challenges in health.



Data Exploration and Visualization


Dimensionality Reduction for Exploratory Data Analysis in Daily Medical Research

In contrast to traditional industrial applications such as market basket analysis, knowledge discovery in medical research is mostly performed by the medical domain experts themselves. This is largely due to the high complexity of the research domain, which requires deep domain knowledge. At the same time, these domain experts face major obstacles in handling and analyzing their high-dimensional, heterogeneous, and complex research data. In this paper, we present a generic, ontology-centered data infrastructure for scientific research which actively supports the medical domain experts in data acquisition, processing and exploration. We focus on the system’s capabilities to automatically apply dimensionality reduction algorithms to arbitrary high-dimensional data sets and to let the domain experts visually explore their high-dimensional data of interest without needing expert IT or specialized database knowledge.
Dominic Girardi, Andreas Holzinger

Navigating Complex Systems for Policymaking Using Simple Software Tools

Comprehensive maps of selected issues such as obesity have been developed to list the key factors and their interactions, thus defining a network where factors (e.g., weight bias, disordered eating) are represented as nodes while causal connections are captured as edges. While such maps contain a wealth of information, they can be seen as a maze which practitioners and policymakers struggle to explore. For instance, the Foresight Obesity Map has been depicted as an ‘almost incomprehensible web of interconnectedness’. Rather than presenting maps as static images, we posit that their value can be unlocked through interactive visualizations. Specifically, we present five required functionalities for interactive visualizations, based on experimental studies and key concepts of systems thinking in public policy. These functionalities include shifting from simple ‘policy inputs’ to loops, capturing what unfolds between an intervention and its evaluation, and accounting for the rippling effects of interventions. We reviewed ten software packages that support different policy purposes (visualization, argumentation, or modeling) and found that none supports four or all five of the functionalities listed. We thus created a new open-source software tool, ActionableSystems. The chapter details its design principles and how it implements the five functionalities. The use of the software to address policy-relevant questions is briefly illustrated, taking obesity and public health nutrition as a guiding example. We conclude with open questions for software development and public health informatics, emphasizing the need to design software that supports a more inclusive approach to policy-making and a more comprehensive exploration of complex systems.
Philippe J. Giabbanelli, Magda Baniukiewicz
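
The shift from ‘policy inputs’ to loops described in this abstract can be illustrated with a minimal sketch that enumerates feedback loops in a small causal map. This is not how ActionableSystems is implemented; the function name and the factor names in the example map are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def find_feedback_loops(edges: Dict[str, List[str]]) -> List[List[str]]:
    """Enumerate the simple directed cycles of a causal map.

    `edges` maps each factor to the factors it influences. Every cycle is
    reported once, starting from its alphabetically smallest factor.
    """
    nodes = sorted(set(edges) | {t for targets in edges.values() for t in targets})
    loops: List[List[str]] = []

    def walk(start: str, current: str, path: List[str], on_path: set) -> None:
        for nxt in edges.get(current, []):
            if nxt == start:
                loops.append(path[:])  # closed a loop back to the start
            elif nxt > start and nxt not in on_path:
                # Only visit factors "larger" than the start so that each
                # cycle is found exactly once.
                path.append(nxt)
                on_path.add(nxt)
                walk(start, nxt, path, on_path)
                on_path.remove(nxt)
                path.pop()

    for start in nodes:
        walk(start, start, [start], {start})
    return loops


# Hypothetical toy map (factor names are illustrative assumptions).
toy_map = {
    "weight bias": ["stress"],
    "stress": ["disordered eating", "sleep loss"],
    "disordered eating": ["weight gain"],
    "weight gain": ["weight bias"],
    "sleep loss": ["stress"],
}

for loop in find_feedback_loops(toy_map):
    print(" -> ".join(loop + [loop[0]]))
```

An intervention on any factor inside one of the printed loops can propagate back to that same factor, which is the kind of rippling effect a static list of policy inputs does not show.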

Modeling and Simulation


An Agent-Based Model of Healthy Eating with Applications to Hypertension

Changing descriptive social norms in health behavior (“how many people are behaving healthily”) has been shown to be effective in promoting healthy eating. We developed an agent-based model to explore the potential of changing social norms to reduce hypertension among the adult population of Los Angeles County. The model uses the 2007 California Health Interview Survey (CHIS) to create a virtual population that mimics the joint distribution of demographic characteristics and health behavior in Los Angeles County. We calibrated the outcome of hypertension as a function of individual age and fruit/vegetable consumption, based upon the observed pattern in the survey. We then simulated an intervention scenario to promote healthier eating by increasing the visibility (i.e. descriptive social norms) of those who eat at least one serving of fruit/vegetables per day. We compared the hypertension incidence under the status quo scenario and the intervention scenario. We found that an effect size of 5% in social norm enhancement yields a reduction in 5-year hypertension incidence of 10.08%. An effect size of 15% would reduce incidence by 15.50%. In conclusion, the agent-based model built and calibrated around real-world data shows that changes in descriptive social norms for healthy eating can be effective in reducing the burden of hypertension. The model can be improved in the future by also including the chronic conditions that are affected by changes in fruit/vegetable consumption.
Amin Khademi, Donglan Zhang, Philippe J. Giabbanelli, Shirley Timmons, Chengqian Luo, Lu Shi

Soft Data Analytics with Fuzzy Cognitive Maps: Modeling Health Technology Adoption by Elderly Women

Modeling how patients adopt personal health technology is a challenging problem: decision-making processes are largely unknown, occur in complex, multi-stakeholder settings, and may play out differently for different products and users. To address this problem, this chapter develops a soft analytics approach, based on Fuzzy Cognitive Maps (FCM), that leads to adoption models that are specific to a particular product and group of adopters. Its empirical grounding is provided by a case study, in which a group of women decides whether to adopt a wearable remote healthcare monitoring device. The adoption model can simulate different product configurations and levels of support and provide insight as to what scenarios will most likely lead to successful adoption. The model can be used by product developers and rollout managers to support technology planning decisions.
Noshad Rahimi, Antonie J. Jetter, Charles M. Weber, Katherine Wild
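
To make the idea of simulating scenarios with a Fuzzy Cognitive Map concrete, here is a minimal sketch of FCM inference. It uses one common update rule (each concept's new value is a sigmoid of its previous value plus the weighted sum of its incoming influences); the chapter may use a different rule. The concept names and weights are hypothetical and are not taken from the case study.

```python
import math
from typing import Dict, Optional, Tuple


def sigmoid(x: float, steepness: float = 1.0) -> float:
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * x))


def run_fcm(
    weights: Dict[Tuple[str, str], float],
    state: Dict[str, float],
    clamped: Optional[Dict[str, float]] = None,
    max_steps: int = 100,
    tol: float = 1e-5,
) -> Dict[str, float]:
    """Iterate the map until concept values stabilize (or max_steps is hit).

    `weights[(source, target)]` is the causal weight from source to target.
    `clamped` holds scenario inputs that are kept fixed at every step.
    """
    clamped = clamped or {}
    state = {**state, **clamped}
    for _ in range(max_steps):
        new_state = {}
        for concept in state:
            if concept in clamped:
                new_state[concept] = clamped[concept]
                continue
            incoming = sum(
                w * state[src] for (src, dst), w in weights.items() if dst == concept
            )
            new_state[concept] = sigmoid(state[concept] + incoming)
        if max(abs(new_state[c] - state[c]) for c in state) < tol:
            return new_state
        state = new_state
    return state


# Hypothetical map: concept names and weights are illustrative assumptions.
weights = {
    ("family support", "adoption"): 0.7,
    ("ease of use", "adoption"): 0.6,
    ("family support", "ease of use"): 0.4,
}
initial = {"family support": 0.5, "ease of use": 0.5, "adoption": 0.5}

high_support = run_fcm(weights, initial, clamped={"family support": 1.0})
low_support = run_fcm(weights, initial, clamped={"family support": 0.0})
print(high_support["adoption"], low_support["adoption"])
```

Clamping an input concept to different levels and comparing the stabilized output is one simple way to contrast scenarios such as different levels of support.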

Machine Learning


Machine Learning for the Classification of Obesity from Dietary and Physical Activity Patterns

Conventional epidemiological analyses in health-related research have been successful in identifying individual risk factors for adverse health outcomes, e.g. cigarettes’ effect on lung cancer. However, for conditions that are multifactorial or for which multiple variables interact to affect risk, these approaches have been less successful. Machine learning approaches such as classifiers can improve risk prediction due to their ability to empirically detect patterns of variables that are “diagnostic” of a particular outcome, over the conventional approach of examining isolated, statistically independent relationships that are specified a priori. This chapter presents a proof-of-concept using several classifiers (discriminant analysis, support vector machines (SVM), and neural nets) to classify obesity from 18 dietary and physical activity variables. Random subsampling cross-validation was used to measure prediction accuracy. Classifiers outperformed logistic regressions: quadratic discriminant analysis (QDA) correctly classified 59% of cases versus logistic regression’s 55% using original, unbalanced data; and radial-basis SVM classified nearly 61% of cases using balanced data, versus logistic regression’s 59% prediction accuracy. Moreover, radial SVM predicted both categories (obese and non-obese) above chance simultaneously, while some other methods achieved above-chance prediction accuracy for only one category, usually to the detriment of the other. These findings show that obesity can be more accurately classified by a combination or pattern of dietary and physical activity behaviors, than by individual variables alone. Classifiers have the potential to inform more effective nutritional guidelines and treatments for obesity. More generally, machine learning methods can improve risk prediction for health outcomes over conventional epidemiological approaches.
Arielle S. Selya, Drake Anshutz
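
Random subsampling cross-validation, the evaluation scheme named in this abstract, repeatedly draws a random train/test split and averages the test accuracy. The sketch below shows the general scheme only. The nearest-mean classifier and the one-dimensional data are hypothetical stand-ins, not the chapter's classifiers or variables.

```python
import random
from typing import Callable, Dict, List, Sequence


def random_subsampling_cv(
    samples: Sequence[float],
    labels: Sequence[int],
    fit: Callable,
    predict: Callable,
    rounds: int = 20,
    test_fraction: float = 0.3,
    seed: int = 0,
) -> float:
    """Average test accuracy over `rounds` independent random splits."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    n_test = max(1, int(len(samples) * test_fraction))
    accuracies = []
    for _ in range(rounds):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = fit([samples[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / n_test)
    return sum(accuracies) / rounds


def fit_nearest_mean(xs: List[float], ys: List[int]) -> Dict[int, float]:
    """Placeholder classifier: store the mean feature value of each class."""
    return {c: sum(x for x, y in zip(xs, ys) if y == c) / ys.count(c) for c in set(ys)}


def predict_nearest_mean(model: Dict[int, float], x: float) -> int:
    """Predict the class whose stored mean is closest to x."""
    return min(model, key=lambda c: abs(x - model[c]))


# Hypothetical, perfectly separable one-dimensional data.
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 11.0, 12.0, 13.0, 14.0, 15.0]
classes = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

accuracy = random_subsampling_cv(scores, classes, fit_nearest_mean, predict_nearest_mean)
print(accuracy)
```

Unlike k-fold cross-validation, the test sets of different rounds may overlap, so the number of rounds and the test fraction can be chosen independently.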

Classifying Mammography Images by Using Fuzzy Cognitive Maps and a New Segmentation Algorithm

Mammography is one of the best techniques for the early detection of breast cancer. In this chapter, a method based on the fuzzy cognitive map (FCM) and its evolutionary-based learning capabilities is presented for classifying mammography images. The main contribution of this work is two-fold: (a) to propose a new segmentation approach, called the threshold-based region growing (TBRG) algorithm, for segmentation of mammography images, and (b) to implement the FCM method in the context of mammography image classification by developing a new FCM learning algorithm that is efficient for tumor classification. By applying the proposed TBRG algorithm, a possible tumor is delineated against the background tissue. We extracted 36 features from the tissue, describing the texture and the boundary of the segmented region. Due to the curse of dimensionality of the feature space, the features were selected with the help of the continuous particle swarm optimization algorithm. The FCM was trained using a new evolutionary approach based on the area under the curve (AUC) of the output concept. In order to evaluate the efficacy of the presented scheme, comparisons with benchmark machine learning algorithms were conducted and known metrics such as ROC and AUC were calculated. The AUC obtained for the test data set is 87.11%, which indicates the excellent performance of the proposed FCM.
Abdollah Amirkhani, Mojtaba Kolahdoozi, Elpiniki I. Papageorgiou, Mohammad R. Mosavi
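
For readers unfamiliar with region growing, the sketch below shows the generic textbook version: starting from a seed pixel, neighbouring pixels are added while their intensity stays within a threshold of the seed intensity. This is not the chapter's TBRG algorithm, whose details are not given in the abstract. The function name, the 4-connectivity choice, and the toy image are assumptions made for illustration.

```python
from collections import deque
from typing import List, Set, Tuple

Pixel = Tuple[int, int]


def grow_region(image: List[List[int]], seed: Pixel, threshold: int) -> Set[Pixel]:
    """Return the 4-connected pixels whose intensity is within `threshold`
    of the seed pixel's intensity, found by breadth-first search."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= threshold:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


# Hypothetical 4x4 grayscale patch: a bright blob on a dark background.
patch = [
    [10, 12, 11, 10],
    [11, 200, 210, 12],
    [10, 205, 198, 11],
    [12, 10, 11, 10],
]

blob = grow_region(patch, seed=(1, 1), threshold=20)
print(sorted(blob))
```

Texture and boundary features of the kind mentioned in the abstract would then be computed on the returned pixel set.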

Text-Based Analytics for Biosurveillance

The ability to prevent, mitigate, or control a biological threat depends on how quickly the threat is identified and characterized. Ensuring the timely delivery of data and analytics is an essential aspect of providing adequate situational awareness in the face of a disease outbreak. This chapter outlines an analytic pipeline for supporting an advanced early warning system that can integrate multiple data sources and provide situational awareness of potential and occurring disease situations. The pipeline includes real-time automated data analysis founded on natural language processing, semantic concept matching, and machine learning techniques, to enrich content with metadata related to biosurveillance. Online news articles are presented as a use case for the pipeline, but the processes can be generalized to any textual data. In this chapter, the mechanics of a streaming pipeline are briefly discussed as well as the major steps required to provide targeted situational awareness. The text-based analytic pipeline includes various processing steps as well as identifying article relevance to biosurveillance (e.g., relevance algorithm) and article feature extraction (who, what, where, why, how, and when).
Lauren E. Charles, William Smith, Jeremiah Rounds, Joshua Mendoza

Case Studies


Young Adults, Health Insurance Expansions and Hospital Services Utilization

Under the dependent coverage expansion (DCE) provision of health reform, adult children up to 26 years of age whose parents have employer-sponsored or individual health insurance are eligible for insurance under their parents’ health plan. Using a difference-in-differences approach and the 2008–2014 Healthcare Cost and Utilization Project State Emergency Department Databases and State Inpatient Databases, we examined the impact of the DCE on hospital services use. In analyses of individuals aged under 26 years (compared to individuals over 26), we found a 1.5% increase in non-pregnancy-related inpatient visits in 2010 through 2013, during the initial DCE period, and a 1.6% increase in 2014, when other state expansions went into effect. We found that the impact of the DCE persisted into 2014, when many state insurance expansions occurred, although effects varied between states adopting and not adopting Medicaid expansions.
Teresa B. Gibson, Zeynal Karaca, Gary Pickens, Michael Dworsky, Eli Cutler, Brian J. Moore, Richele Benevent, Herbert Wong
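
The difference-in-differences approach used in this case study compares the before/after change in a treated group with the before/after change in a comparison group. The sketch below computes the basic two-group, two-period estimate from group means. It omits the covariates, fixed effects, and standard errors that a real analysis would include, and the numbers are made up for illustration rather than drawn from the study.

```python
from statistics import mean
from typing import Iterable, Tuple

Record = Tuple[bool, bool, float]  # (treated group?, post period?, outcome)


def did_estimate(records: Iterable[Record]) -> float:
    """Basic difference-in-differences estimate from cell means:
    (treated_post - treated_pre) - (control_post - control_pre)."""
    cells = {}
    for treated, post, outcome in records:
        cells.setdefault((treated, post), []).append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    treated_change = m[(True, True)] - m[(True, False)]
    control_change = m[(False, True)] - m[(False, False)]
    return treated_change - control_change


# Hypothetical visit rates (illustrative numbers only).
records = [
    (True, False, 10.0), (True, False, 12.0),    # treated group, before
    (True, True, 13.0), (True, True, 15.0),      # treated group, after
    (False, False, 9.0), (False, False, 11.0),   # comparison group, before
    (False, True, 10.0), (False, True, 12.0),    # comparison group, after
]

effect = did_estimate(records)
print(effect)  # 2.0
```

The estimate attributes to the policy only the part of the treated group's change that exceeds the comparison group's change, which is valid under the assumption that both groups would otherwise have followed parallel trends.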

The Impact of Patient Incentives on Comprehensive Diabetes Care Services and Medical Expenditures

A large nondurable goods manufacturing firm introduced a value-based insurance design health benefit program for comprehensive diabetes care with six diabetes-related service types subject to a copayment waiver: laboratory tests, physician office visits, diabetes supplies, diabetes medications, antihypertensive (blood pressure) medications, and cholesterol-lowering medications. We evaluated the impact of this natural experiment compared to a matched comparison group drawn from firms with similar composition and baseline trends. We examined the difference-in-differences impact of the program on diabetes-related services, utilization and all-cause spending. In the first year, adherence to oral diabetes medications was 15.0% higher relative to the matched comparison group (p < 0.01) and 14.4% higher in the second year (p < 0.01). The likelihood of adherence to a regimen of recommended diabetes care services (laboratory visits, office visits and medications) was low in the baseline year (5.8% of enrollees) and increased 92.1% in the first year (p < 0.01) and 82% in the second year (p < 0.05). The program was cost-neutral in terms of total all-cause healthcare spending (health plan plus employee out-of-pocket payments) and all-cause net health plan payments (both p > 0.10). Our analysis suggests that a comprehensive diabetes care program with patient incentives can improve care without increasing direct health plan costs.
Teresa B. Gibson, J. Ross Maclean, Ginger S. Carls, Emily D. Ehrlich, Brian J. Moore, Colin Baigel

Analyzing the Complexity of Behavioural Factors Influencing Weight in Adults

Managing obesity is a difficult and pressing problem given its detrimental health effects and associated healthcare costs. This difficulty stems from obesity being the result of a complex system. This complexity is often ignored by generic interventions, and not fully utilized for clinical decision-making. We focused on heterogeneity and feedback loops as key parts of this complexity. We measured heterogeneity and found it high, in a demographically homogeneous sample as well as in a larger, more varied sample. We also demonstrated that taking a systems approach could hold value for clinical decision-making. Specifically, we showed that feedback loops had better associations with weight categories than individual factors or relationships, in addition to clear implications for weight dynamics. Clinical implications were discussed, in part through adapting techniques such as card decks to a computerized format. Further research was suggested on heterogeneity among population groups and categories of drivers of weight.
Philippe J. Giabbanelli

Challenges and New Frontiers


The Cornerstones of Smart Home Research for Healthcare

The aging of the world population has a strong impact on worldwide health care expenditure and is especially significant for countries providing free health care services to their population. One of the consequences is the increase in semi-autonomous persons who need to be placed in specialized long-term care centers. These kinds of facilities are very costly and often not appreciated by their residents. The idea of “aging in place”, or living in one’s home independently, is a key solution to counter the impact of institutionalization. It can decrease the costs for the institutions while maximizing the quality of life of the individuals. However, these semi-autonomous persons require assistance during their daily life activities that professionals cannot hope to fully provide. Many envision the use of the smart home concept, a home equipped with distributed sensors and effectors, to add an assistance layer for these semi-autonomous populations. Still, despite years of research, there are several challenges to overcome in order to implement the smart home dream. This chapter positions itself as an easy-to-read introduction for readers unfamiliar with the challenges faced by computer science researchers regarding this difficult endeavor. It aims to walk the reader through the cornerstones of smart home research for health care.
Kevin Bouchard, Jianguo Hao, Bruno Bouchard, Sébastien Gaboury, Mohammed Tarik Moutacalli, Charles Gouin-Vallerand, Hubert Kenfack Ngankam, Hélène Pigot, Sylvain Giroux

Challenges and Cases of Genomic Data Integration Across Technologies and Biological Scales

Current technological advancements have facilitated novel experimental methods that measure a diverse assortment of biological processes, creating a data deluge in biology and medicine. This proliferation of data sources, from large repositories and data warehouses to specialist databases that store a variety of different data types and contribute to a multitude of different file formats, has necessitated minimal data standards that describe both data and annotation. In addition to integration at the data resource level, the development of integrative computational or statistical methods that explore two or more data types or biological layers to understand their joint influence can lead to a better understanding of both normal and pathological processes. Combining these different data layers, in turn, enables us to glean a more integrative understanding of complex biological systems. The development of integrative methods that bridge both biology and technology can provide insight into different scales of gene and genome regulation. Some of these integrative approaches and their applications are explored in this chapter in the context of modern genomics.
Shamith A. Samarajiwa, Ioana Olan, Dóra Bihary