
2015 | Book

Dependent Data in Social Sciences Research

Forms, Issues, and Methods of Analysis


About this book

This volume presents contributions on handling data in which the postulate of independence in the data matrix is violated. When this postulate is violated and when the methods assuming independence are still applied, the estimated parameters are likely to be biased, and statistical decisions are very likely to be incorrect. Problems associated with dependence in data have been known for a long time, and led to the development of tailored methods for the analysis of dependent data in various areas of statistical analysis. These methods include, for example, methods for the analysis of longitudinal data, corrections for dependency, and corrections for degrees of freedom. This volume contains the following five sections: growth curve modeling, directional dependence, dyadic data modeling, item response modeling (IRT), and other methods for the analysis of dependent data (e.g., approaches for modeling cross-section dependence, multidimensional scaling techniques, and mixed models). Researchers and graduate students in the social and behavioral sciences, education, econometrics, and medicine will find this up-to-date overview of modern statistical approaches for dealing with problems related to dependent data particularly useful.

Table of Contents

Frontmatter

Growth Curve Modeling

Frontmatter
The Observed Dependency of Longitudinal Data
Abstract
It is well known that longitudinal data can address different concepts than cross-sectional data (see Baltes & Nesselroade, 1979; McArdle & Nesselroade, 2014). The key is the observed dependency, which allows us to examine individual changes. Thus, all of the individual changes that can be examined are due to the longitudinal models (see McArdle, 2008) allowing dependencies among the observed scores at various time points. It is demonstrated here that the statistical power to detect changes is an explicit function of the positive dependencies and the timing of the observations. Considerable attention is given to the move to the latent curve model (LCM) from the basic regression structural model and the repeated measures model (RANOVA), because the latter seems standard in the field now. The LCM is introduced in this chapter as a principle that has the power to detect many more changes than the usual regression analysis, but it comes with several assumptions, which are discussed.
The four articles to follow in this volume are reviewed with longitudinal dependency in mind, and the highlights of each chapter are brought out. The chapter “Nonlinear Growth Curve Models” extends the LCM to handle serious forms of nonlinearity, which are clearly prevalent in psychology. The chapter “Stage-Sequential Growth Mixture Modeling” extends this work to include multistage models and Poisson relations, all in the context of a multiple mixture model. This is a fairly complex example. The chapter “General Growth Mixture Modeling: The Study of Developmental Pathways of Externalizing Behavior from Preschool Age to Adolescence” is a real-life example that includes LCMs for five mixture groups. The chapter “A Generalization of Nagin’s Finite Mixture Model” extends the mixture models further, mainly by adding a slope component.
Also important in this regard is “measurement invariance” and how it can be crucial to understanding changes. Some elaboration of the early work on scales is further developed for selected items. The data to be considered here for LCM are a subset of the full set of data collected in the Cognition in the USA (CogUSA) survey (McArdle & Fisher, 2015). These scales were chosen in a way that would be consistent with the principles of multiple factorial invariance over time (MFIT), but the result of the age-related changes over two waves was largely unknown and in need of establishment. Basically, we first try to establish MFIT over the two waves and then look for latent changes in these scales over age. Thus there are only eight scales to consider here (four cross-sectional scales by two longitudinal occasions), so there is still a lot of work to do!
John J. McArdle
Nonlinear Growth Curve Models
Abstract
In the past three decades, the growth curve model (also known as latent curve model) has become a popular statistical methodology for the analysis of longitudinal or, more generally, repeated-measures data. Developed primarily within the latent variable modeling framework, the equivalent model emerged from other fields under the names of linear mixed-effects model, random-effects model, hierarchical linear model, and linear multilevel model. This methodology estimates the so-called growth parameters that describe individuals’ change trajectories across time and are related via linear combinations to the dependent variable. While satisfying in many research settings, oftentimes a linear relation between dependent variable and growth parameters cannot allow for meaningful interpretation of the growth parameters, parsimonious descriptions of the change phenomenon, good adjustment to the data across all values of the time predictor, and realistic extrapolations outside the empirical range of the time predictor. Consequently, nonlinear alternatives have been proposed, for which the growth parameters can be related to the dependent variable via any mathematical function (not just linear combinations). We discuss the theoretical foundations as well as practical implications of estimating nonlinear growth curve models. We also illustrate the methodology with an example from the psychological literature.
Paolo Ghisletta, Eva Cantoni, Nadège Jacot
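The contrast drawn in the abstract above can be written out in a short notational sketch. The symbols and the logistic example are illustrative assumptions, not taken from the chapter: $y_{ij}$ is the outcome of individual $i$ at occasion $j$, $t_{ij}$ the time predictor, and $\boldsymbol{\beta}_i$ the individual's growth parameters.

```latex
% Linear growth curve: the outcome is a linear combination of the growth parameters
y_{ij} = \beta_{0i} + \beta_{1i}\, t_{ij} + \varepsilon_{ij}

% Nonlinear growth curve: any function f of time and the growth parameters
y_{ij} = f(t_{ij};\, \boldsymbol{\beta}_i) + \varepsilon_{ij}

% One possible choice of f (assumed here for illustration): a logistic curve with
% upper asymptote \beta_{1i}, midpoint \beta_{2i}, and scale \beta_{3i}
y_{ij} = \frac{\beta_{1i}}{1 + \exp\!\left(-(t_{ij} - \beta_{2i}) / \beta_{3i}\right)} + \varepsilon_{ij}
```

In the logistic example each parameter has a direct reading (final level, timing, and rate of change), which is the kind of interpretability a purely linear specification may not offer.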
Stage-Sequential Growth Mixture Modeling of Criminological Panel Data
Abstract
The detection of distinctive developmental trajectories is of great importance in criminological research. The methodology of growth curve and finite mixture modeling provides the opportunity to examine different developments of offending. With latent growth curve models (LGM) (Meredith and Tisak, Psychometrika 55:107–122, 1990) the structural equation methodology offers a strategy to examine intra- and interindividual developmental processes of delinquent behavior. There might, however, not be a single but a mixture of populations underlying the growth curves which refers to unobserved heterogeneity in the longitudinal data. Growth mixture models (GMM) introduced by Muthén and Shedden (Biometrics 55:463–469, 1999) can consider unobserved heterogeneity when estimating growth curves. GMM distinguish between continuous variables which represent the growth curve model and categorical variables which refer to subgroups that have a common development in the growth process. The models are usually based on single-phase data which associate any event with a specific period. Panel data, however, often contain several relevant phases. In this context, stage-sequential growth mixture models with multiphase longitudinal data become increasingly important. Kim and Kim (Structural Equation Modeling: A Multidisciplinary Journal 19:293–319, 2012) investigated and discussed three distinctive types of stage-sequential growth mixture models: traditional piecewise GMM, discontinuous piecewise GMM, and sequential process GMM. These models will be applied here to examine different stages of delinquent trajectories within the time range of adolescence and young adulthood using data from the German panel study Crime in the Modern City (CrimoC, Boers et al., Monatsschrift für Kriminologie und Strafrechtsreform 3:183–202, 2014). Methodological and substantive differences between single-phase and multi-phase models are discussed as well as recommendations for future applications.
Jost Reinecke, Maike Meyer, Klaus Boers
Developmental Pathways of Externalizing Behavior from Preschool Age to Adolescence: An Application of General Growth Mixture Modeling
Abstract
This study applies a developmental and life-course perspective on the data of the Erlangen-Nuremberg Development and Prevention Study (ENDPS; Lösel, Stemmler, Jaursch, and Beelmann, Monatsschrift für Kriminologie und Strafrechtsreform 92:289–308, 2009) to find interindividual differences in intraindividual change of externalizing problem behavior. Based on a sample of N = 541 boys and girls, general growth mixture modeling (GGMM; Nagin, Psychological Methods 4:139–177, 1999; McArdle, The handbook of research methods in developmental psychology. New York: Blackwell Publishers, 2005) was applied. In a prospective longitudinal design measurements with multiple informants were analyzed from preschool to adolescence. The results of the GGMM showed five groups representing different developmental trajectories: (1) “high-chronics” (2.4 %; n = 13), who had the highest scores of externalizing behavior at all times; (2) “low-chronics” (58.8 %; n = 317) who were low on externalizing behavior throughout the years; (3) “high-reducers” (7.9 %; n = 43) who started out high, but reduced their externalizing behavior monotonically over time; (4) “late-starters-medium” who increased externalizing problems at later age (8.7 %; n = 47); and (5) “medium-reducers” whose problems decreased from an originally medium level (22.4 %; n = 121). The results are in accordance with international studies on developmental trajectories of offending and suggest that a perspective on a broad range of behavioral problems can be fruitful. The findings are discussed with regard to other studies on latent group-based modeling, non-statistical taxonomies, and practical applications.
Mark Stemmler, Friedrich Lösel
A Generalization of Nagin’s Finite Mixture Model
Abstract
We present a generalization of Nagin’s finite mixture model that allows non-parallel trajectories for different values of covariates. We investigate some mathematical properties of this model and illustrate its use by giving typical salary curves for the employees in the private sector in Luxembourg between 1981 and 2006, as a function of their gender, as well as of Luxembourg’s gross domestic product (GDP).
Jang Schiltz

Directional Dependence in Regression Models

Frontmatter
Granger Causality: Linear Regression and Logit Models
Abstract
Granger causality models are very popular when it comes to making decisions on which of a number of series of scores is on the dependent versus the independent side. With this chapter, we pursue two goals. First, we specify Granger causality models in terms of logit models and compare these with the routinely applied linear regression models. The comparison shows that, in order to make the models parallel, either model assumptions must be changed or model terms must be removed from (or inserted into) the model specification. The second goal involves extending Granger causality modeling. We propose conditioning terms on measures within the observed series. By implication, these models require higher-order interactions. In addition, model terms can be conditioned on covariates. Issues concerning parameter interpretation are discussed. Data examples are given from the fields of aggression development in adolescents and intimate partner violence.
Alexander von Eye, Wolfgang Wiedermann, Ingrid Koller
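For readers unfamiliar with the linear-regression variant mentioned in the abstract above, the following is a minimal sketch, not the chapter's own specification. It assumes a single lag and simulated data, and the function name `granger_f` is hypothetical. It compares a restricted autoregression with a model that adds the lagged candidate cause; the logit variant discussed in the chapter is not shown.

```python
import numpy as np

def granger_f(cause, effect):
    """F statistic for adding one lag of `cause` to an AR(1) model of `effect`."""
    y = effect[1:]          # current values of the presumed effect
    lag_y = effect[:-1]     # its own first lag
    lag_x = cause[:-1]      # first lag of the presumed cause
    ones = np.ones_like(y)
    X_restricted = np.column_stack([ones, lag_y])
    X_full = np.column_stack([ones, lag_y, lag_x])

    def rss(X):
        # Ordinary least squares fit, then residual sum of squares
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return float(resid @ resid)

    rss_r, rss_f = rss(X_restricted), rss(X_full)
    df_resid = len(y) - X_full.shape[1]
    # One restriction is tested, so the numerator has one degree of freedom
    return (rss_r - rss_f) / (rss_f / df_resid)

# Simulated series (assumption): x drives y with a one-step delay, not vice versa
rng = np.random.default_rng(1)
n = 500
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.3 * y[t - 1] + 0.5 * x[t - 1] + rng.normal()

f_x_to_y = granger_f(x, y)
f_y_to_x = granger_f(y, x)
print(f"F (x -> y): {f_x_to_y:.2f}")
print(f"F (y -> x): {f_y_to_x:.2f}")
```

A large F statistic in one direction and a small one in the other is the pattern that would support placing `x` on the independent side in this simulated setup.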
Decisions Concerning the Direction of Effects in Linear Regression Models Using Fourth Central Moments
Abstract
Direction dependence analysis is attracting growing attention in the social sciences for its potential to help decide the direction of effects in linear regression models. Direction dependence analysis assumes that observed data deviate from normality. Various tests have been proposed that can be applied when observed variables are skewed. However, these tests cannot be used when data are nonnormal and symmetric. The present chapter discusses direction dependence approaches for symmetric nonnormal data based on the fourth central moment. A new direction dependence approach based on regression residuals obtained from competing linear regression models is proposed. Three significance tests are described which can be used to test hypotheses compatible with direction dependence when data are nonnormal and symmetric. Results of a Monte Carlo simulation are reported which suggest that the significance tests perform well under various data scenarios. An empirical example from research on intimate partner violence is given to illustrate the application of the direction dependence tests.
Wolfgang Wiedermann
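The intuition behind using fourth moments of residuals from competing models can be illustrated with a small simulation. This is a hedged sketch under stated assumptions, not the chapter's tests: the predictor is drawn from a uniform distribution (symmetric, nonnormal), the error is normal, and the helper names `excess_kurtosis` and `ols_residuals` are hypothetical.

```python
import numpy as np

def excess_kurtosis(values):
    """Fourth central moment divided by squared variance, minus 3."""
    centered = np.asarray(values, dtype=float)
    centered = centered - centered.mean()
    m2 = np.mean(centered ** 2)
    m4 = np.mean(centered ** 4)
    return m4 / m2 ** 2 - 3.0

def ols_residuals(response, predictor):
    """Residuals of a simple linear regression of response on predictor."""
    slope, intercept = np.polyfit(predictor, response, 1)
    return response - (intercept + slope * predictor)

# Simulated data (assumption): x is uniform, and the true model is x -> y
rng = np.random.default_rng(0)
n = 20_000
x = rng.uniform(-np.sqrt(3), np.sqrt(3), n)   # variance 1, negative excess kurtosis
e = rng.normal(0.0, np.sqrt(0.75), n)         # normal error
y = 0.5 * x + e

kurt_true = excess_kurtosis(ols_residuals(y, x))     # residuals of y on x
kurt_reverse = excess_kurtosis(ols_residuals(x, y))  # residuals of x on y

print(f"excess kurtosis, residuals of y ~ x: {kurt_true:.3f}")
print(f"excess kurtosis, residuals of x ~ y: {kurt_reverse:.3f}")
```

Under these assumptions the residuals of the correctly specified model are close to normal, while the residuals of the reversed model inherit some of the predictor's nonnormality, so their excess kurtosis lies further from zero. Formal significance tests, as described in the chapter, are needed to turn this contrast into a decision.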

Dyadic Data Modeling

Frontmatter
Analyzing Dyadic Data with IRT Models
Abstract
Dyadic data frequently occur in social sciences and numerous techniques have been developed for their analysis. The most prominent methods involve using regression, path, and structural equation models. The present contribution extends these approaches by considering Item Response Theory (IRT) Models. Two pivotal dyadic data analysis models, the Actor-Partner Interdependence Model (APIM) and the Common Fate Model (CFM), are built using the Multidimensional Random Coefficients Multinomial Logit Model (MRCMLM). This approach combines the advantages of dyadic data analysis with a model for discrete data, thus allowing for categorical items while drawing inferences based on the estimated true scores on an interval scale.
Rainer W. Alexandrowicz
Longitudinal Analysis of Dyads Using Latent Variable Models: Current Practices and Constraints
Abstract
Interdependencies between dyads have long been recognized and taken into account in the analysis of partnership and marital data. However, most of the research that has examined dyadic influences is based on cross-sectional data or basic longitudinal models. When more complex longitudinal models are examined, several limitations and barriers arise. In this chapter, some of the practical issues with dyadic analyses of multi-time point samples will be discussed. In particular, we discuss (1) applications of latent growth curve mixture modeling trajectories of intimate partner relationship adjustment and (2) latent difference score modeling associations between relationship adjustment and depressive symptoms over time. A 4-year longitudinal sample of 237 families assessed over six time points will be used to illustrate these practical issues.
Heather M. Foran, Sören Kliem
Can Psychometric Measurement Models Inform Behavior Genetic Models? A Bayesian Model Comparison Approach
Abstract
As methodologists have increasingly noted, the role of psychometrics in operationalizing a construct is often overlooked when evaluating research claims (Borsboom, 2006). In a related vein, others have noted that psychological research appears to move away from assessment and interpretation of a single a priori statistical model to a more nuanced comparison of models which assess the trade-off between a model’s parsimony and complexity in explaining behavior (e.g., Rodgers, 2010). The genetic factor model is one such statistical model often used to estimate the relative contributions of genetic and environmental components of observed behavior in genetically informative designs (Heath, Neale, Hewitt, Eaves, & Fulker, 1989; Martin, Eaves et al., 1977; Neale & Cardon, 1992). Mathematically, the genetic factor model decomposes observed phenotypic variability into additive genetic (A), common (C), and unique (E) environmental components and is, for that reason, often referred to as the ACE model.
Ting Wang, Phillip K. Wood, Andrew C. Heath
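The ACE decomposition named in the abstract above can be stated compactly. The variance decomposition follows the abstract; the twin covariances are the standard expectations for the classical twin design, which is assumed here as one example of a genetically informative design, and the notation is illustrative.

```latex
% Phenotypic variance split into additive genetic (A), common environmental (C),
% and unique environmental (E) components
\sigma^2_P = \sigma^2_A + \sigma^2_C + \sigma^2_E

% Expected twin-pair covariances under the classical twin design (assumed):
% monozygotic twins share all additive genetic effects, dizygotic twins half on average
\operatorname{Cov}(P_1, P_2)_{\mathrm{MZ}} = \sigma^2_A + \sigma^2_C
\operatorname{Cov}(P_1, P_2)_{\mathrm{DZ}} = \tfrac{1}{2}\,\sigma^2_A + \sigma^2_C
```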

Item-Response-Modeling

Frontmatter
Item Response Models for Dependent Data: Quasi-exact Tests for the Investigation of Some Preconditions for Measuring Change
Abstract
The Rasch model has several advantages for the psychometric investigation of item quality (e.g., specific objectivity). One approach to testing model fit uses quasi-exact tests which are well suited to test the validity of the Rasch model when sample sizes are rather small. Application of these tests is not restricted to Rasch modeling. In this chapter, we show that these tests can be used to test preconditions for measuring change such as measurement invariance, unidimensionality, and local independence across time points. For example, if items are unidimensional across time points (i.e., all items measure the same latent construct across time) and groups (e.g., control and training groups), it follows that there are no significant interindividual differences within groups and over time. All individuals in a group change in the same direction. On the other hand, significant results across time but not within groups suggest group differences in change, such as training effects. In this chapter, we first give an introduction to quasi-exact tests. Then, we demonstrate the applicability of three test statistics for the investigation of preconditions for measuring change using empirical power analysis and an empirical example concerning spatial ability.
Ingrid Koller, Wolfgang Wiedermann, Judith Glück
Measuring Competencies across the Lifespan - Challenges of Linking Test Scores
Abstract
The National Educational Panel Study (NEPS) aims at investigating the development of competencies across the whole lifespan. Competencies are assessed via tests, and competence scores are estimated based on models of Item Response Theory (IRT). IRT allows a comparison of test scores, and thus the investigation of change across time and differences between cohorts, even when the respective competence is measured with different items. Because retest effects are assumed for most of the competencies in NEPS, linking is done via additional link studies in which the tests for two age groups are administered to a separate sample of participants. However, in order to link the test results of two different measurement occasions, certain assumptions need to hold, for example that the measures are invariant across samples and that the tests measure the same construct. These are challenging assumptions regarding the linking of competencies across the whole lifespan. Before linking reading tests in NEPS for different age cohorts in secondary school as well as in adulthood, we therefore investigated unidimensionality of the items for different cohorts as well as measurement invariance across samples. Our results show that the tests for different age groups do measure a unidimensional construct within the same sample. However, measurement invariance of the same test across different samples does not hold for all age groups. Thus, the same test exhibits a different measurement model in different samples. Based on our results, linking may well be justified within secondary school, while linking test scores in secondary school with those in adult age is threatened by differences in the measurement model. Possible reasons for these results are discussed, and implications for the design of longitudinal studies as well as for possible analysis strategies are drawn.
Steffi Pohl, Kerstin Haberkorn, Claus H. Carstensen
Mixed Rasch Models for Analyzing the Stability of Response Styles Across Time: An Illustration with the Beck Depression Inventory (BDI-II)
Abstract
Questionnaires for clinical studies are often evaluated in cross-sectional settings and on the basis of classical test theory. Some of them, like the BDI-II, which is one of the most widely used self-report instruments for assessing depression severity, are considered to have very good psychometric properties. However, these properties are rarely evaluated in longitudinal designs, and even less often with models of item response theory (IRT). In addition, analyses of self-report questionnaires with IRT models provided evidence of two major response styles: the tendency to prefer extreme response categories, and the tendency to prefer the middle categories. Rasch models, in particular their extension to the so-called mixed Rasch model, are well suited to address these questions. They allow one to determine latent classes with different response styles and to analyze qualitative aspects of change such as the consistency of response styles across time. This chapter first gives an introduction to response styles and an overview of the mixed Rasch model, especially in the context of measuring change. Second, a practical example is elaborated using a sample of in-patients from a psychosomatic clinic who were assessed with the BDI-II at the beginning and at the end of in-patient treatment. The presence of two response styles is confirmed for the admission data, whereas for the discharge data the Rasch model seems sufficient. A combined analysis of both time points reveals three classes, one of which is a low symptom class and the other two reflect, again, the two response styles; these two classes remain quite stable over time.
Ferdinand Keller, Ingrid Koller

Other Methods for the Analyses of Dependent Data

Frontmatter
Studying Behavioral Change: Growth Analysis via Multidimensional Scaling Model
Abstract
In recent years, statistical methods for latent growth modeling have been commonly used in educational and psychological research. The purpose of this chapter is to illustrate growth modeling of change in pattern using multidimensional scaling (MDS) in the context of growth mixture modeling (GMM). We discuss how MDS growth pattern analysis may differ with respect to modeling changes in level, as commonly done with GMM, given that they have similarities in terms of model estimation, latent group identification, classification of individuals, and the interpretation of growth trajectory. We discuss the MDS growth pattern analysis in particular since it is less known. Using two simulated data sets as well as actual data from the Early Childhood Longitudinal Study of the Kindergarten Class of 1998–99 (ECLS-K) study, we demonstrate differences in growth pattern vs. level. It is our goal to provide researchers with a better idea of what MDS growth pattern analysis can accomplish, which may provide them with the knowledge to appropriately utilize this type of analysis in their own research.
Cody Ding
A Nonparametric Approach to Modeling Cross-Section Dependence in Panel Data: Smart Regions in Germany
Abstract
In addition to intuitively plausible dependence structures in the time series dimension, in many applications it is reasonable to assume that there are contagion, spill-over, and repercussion effects among cross-sectional units. Modeling those structures in the systematic part of a panel regression requires both information on the underlying sources that drive the dependence and their respective range. The range allows one to define a neighborhood for each unit, a crucial concept for common methods in spatial statistics and econometrics. Furthermore, specification of a parametric regression function requires knowledge of the specific functional form of the spatial associations. However, lacking information on the sources usually leads to accepting misspecification and to including spatial error component or factor structures. As recent research reveals, the consequences of misspecification in both strategies are troubling in many cases. This paper proposes a data-driven nonparametric method for determining neighborhood as a first step. Second step nonparametric panel regressions have several benefits: (i) they allow one to test for misclassification of cross-sectional units to a wrong neighborhood in the first step; (ii) estimation is accomplished using data beyond the respective neighborhood, thus imposing less structure than parametric methods; (iii) neighborhood/location effects can be directly estimated in analogy to spatial statistics; (iv) no assumptions on functional form are required. The proposed method is illustrated with an empirical analysis of spatio-temporal patterns of high-skilled employees across German regions.
Harry Haupt, Joachim Schnurbus
MANOVA Versus Mixed Models: Comparing Approaches to Modeling Within-Subject Dependence
Abstract
For inferential purposes such as hypothesis testing or confidence interval calculations, analysis of repeated measures data needs to account for within-subject dependence of observations. Multivariate analysis of variance (MANOVA) is a suitable traditional technique for this purpose. It assumes an unconstrained within-subject covariance matrix and balanced data. However, the so-called mixed-model approach is a viable alternative to analyzing this type of data, because its underlying statistical assumptions are equivalent to the MANOVA model. While MANOVA is the classical approach, the mixed-model methodology, although by now implemented in all major statistical software packages, still is a relatively recent statistical development. The equivalence of both approaches to analyzing repeated measures data has frequently been noted in the literature. Nevertheless, in terms of test-statistics both approaches differ. While in large samples the test-statistics are essentially equivalent, their small sample behavior is not well known. In this article, we investigate by computer simulation the performance of several test-statistics calculated either from the MANOVA or the mixed-model approach for testing the interaction hypothesis with balanced data.
Christof Schuster, Dirk Lubbe
Metadata
Title
Dependent Data in Social Sciences Research
Editors
Mark Stemmler
Alexander von Eye
Wolfgang Wiedermann
Copyright Year
2015
Electronic ISBN
978-3-319-20585-4
Print ISBN
978-3-319-20584-7
DOI
https://doi.org/10.1007/978-3-319-20585-4
