In social science research, scales of measurement are usually arbitrary, and it is rare for all variables to share the same metric. Social researchers therefore often have to consider the issue of standardization. Generally speaking, unstandardized coefficients are model parameter estimates based on the analysis of raw data. In contrast, standardized coefficients are model parameter estimates based on the analysis of standardized data, in the sense that all variables are supposed to have unit variance. Standardized data are affected less by the scales of measurement and can be used to compare the relative impact of variables that are incommensurable (i.e., measured in different units on the same or different scales). In multiple regression analysis, for example, researchers are advised to use the beta weights (i.e., standardized regression coefficients) for comparing the relative importance of different incommensurable independent variables for the outcome (Fox, 1997). There are two different contexts for comparing standardized coefficients. The first is a within-group comparison in which standardized coefficients across different variables are compared within a single sample. The second is a between-group comparison in which standardized coefficients for the same variables are compared across different samples. This article deals primarily with the first context, because different variables are more likely to be incommensurable within a single sample and the standardization issue is of particular relevance to this context.

To put the standardization issue into perspective, let us consider an example by using data from the Organization for Economic Cooperation and Development (OECD) Programme for International Student Assessment (PISA) 2006 (OECD, 2009).Footnote 1 Specifically, a regression model is proposed for examining the effects of parental educational level (V2) and child’s home possession (V3) on a child’s educational resources at home (V4). Table 1 summarizes the covariance matrix of the variables on a sample of 200 Hong Kong students.

Table 1 Sample covariance (below diagonal)/correlation (above diagonal) matrix for Hong Kong students from PISA 2006 (OECD, 2009) (N = 200)

To evaluate the relative importance of the two predictors for the dependent variable, we test the equality of the two regression coefficients by using a likelihood ratio (LR) test. A standard path model analysis is conducted to test the unstandardized hypothesis H0: \( \gamma_1 = \gamma_2 \), whereas the proposed method, which will be explained in a later section, is applied to the test of the standardized hypothesis H0: \( \gamma_1^{*} = \gamma_2^{*} \). The analysis is done by using EQS 6.1.Footnote 2 Table 2 (lower panel) shows the LR test results. In the unstandardized condition (H0: \( \gamma_1 = \gamma_2 \)), the result is significant, with ∆χ² = 6.002, ∆df = 1, p < .05, suggesting that the effects of V2 and V3 on V4 are different from each other. However, in the standardized condition (H0: \( \gamma_1^{*} = \gamma_2^{*} \)), the result is not significant, with ∆χ² = 0.011, ∆df = 1, p > .05, suggesting that the standardized effects of V2 and V3 on V4 are the same. The seemingly inconsistent findings reflect the fact that these are, indeed, two different tests assessing the equality of different model parameters. If H0: \( \gamma_1 = \gamma_2 \) is the null hypothesis we test in the unstandardized condition, the null hypothesis in the standardized condition becomes H0: \( \gamma_1^{*} = \gamma_2^{*} \), where \( \gamma_1^{*} = \frac{SD(V2)}{SD(V4)}\gamma_1 \) and \( \gamma_2^{*} = \frac{SD(V3)}{SD(V4)}\gamma_2 \). It is clear that if the standard deviations (SDs) of V2 and V3 are similar (i.e., V2 and V3 are measured by comparable metrics), the two tests will lead to similar results. However, V2 and V3 are measured in very different metrics in this example. The variance of V2 (= 9.07) is about ten times larger than the variance of V3 (= 0.84), and, therefore, the two tests lead to very different statistical conclusions.
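The gap between the two hypotheses can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it uses only the two variances quoted above, and the function name `standardized_ratio` is hypothetical. Because SD(V4) appears in both standardized coefficients, it cancels out of their ratio.

```python
import math

def standardized_ratio(var_x1, var_x2, gamma_ratio=1.0):
    """Return gamma1*/gamma2* given var(X1), var(X2) and gamma1/gamma2.

    SD(Y) appears in both standardized coefficients and cancels out.
    """
    return math.sqrt(var_x1 / var_x2) * gamma_ratio

# Variances of V2 and V3 quoted in the text.
ratio = standardized_ratio(9.07, 0.84)
print(round(ratio, 2))  # 3.29
```

Under these variances, exactly equal unstandardized coefficients would correspond to standardized coefficients that differ by a factor of about 3.3, so the two null hypotheses coincide only when the predictor SDs are equal.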

Table 2 Example of the comparison of unstandardized results and standardized results

Behavioral researchers are often interested in comparing the effects of different variables. In some situations, one can draw meaningful conclusions only by comparing different variables in the standardized metric, using the associated standardized coefficients. As in the example above, the impacts of parental educational level and child’s home possession are, in fact, similar if we consider the two effects in a standardized metric. Unfortunately, we sometimes may fail to recognize that the tests for H0: \( \gamma_1 = \gamma_2 \) and H0: \( \gamma_1^{*} = \gamma_2^{*} \) are different and, therefore, may attempt to draw statistical conclusions about the standardized coefficients on the basis of the comparison of their unstandardized counterparts. This could be problematic because, as was shown in our previous example, the two tests can lead to very different results.

Structural equation modeling (SEM) is becoming an increasingly important statistical technique among applied researchers because of its flexibility for studying a variety of different models (e.g., Hershberger, 2003; Tremblay & Gardner, 1996). Moreover, the development of user-friendly and powerful software programs has contributed significantly to the popularity of this technique (Guo, Perron, & Gillespie, 2009). Nevertheless, different SEM programs are equipped with different programming features, which may be critical for addressing a particular research question, such as the comparison of standardized coefficients. The aim of this study, therefore, is to propose a general method for comparing the standardized coefficients in SEM based on the idea of model reparameterization.

Although there currently exist other methods for comparing standardized coefficients in SEM, we argue that the proposed method is a more general and flexible one. These existing methods are known to be program specific, in the sense that their implementation depends critically on the special features, such as the specification of nonlinear constraints and the availability of an overall test for comparing three or more parameters simultaneously, which some popular programs are still lacking. Our proposed method, on the other hand, does not require any advanced programming features except the basic functions, and it is, therefore, compatible with all major SEM software programs. We believe that one issue that prevents applied researchers from comparing standardized coefficients is that the researchers are limited by the inability of their SEM software programs to provide the relevant tests. In the next section, we will briefly summarize the existing methods for comparing standardized coefficients in SEM. The proposed method is then given in the third section. Three real examples that illustrate the proposed method will be considered in the fourth section. A discussion and conclusions will be provided in the final section.

Comparing standardized coefficients in structural equation modeling

Built-in functions by different software programs

The early development of SEM software programs primarily focused on parameter estimation and statistical inference of unstandardized parameters. Although many programs nowadays have built-in functions for handling standardized parameters, these functions are still very limited. For example, the current versions of AMOS (Arbuckle, 2007), EQS (Bentler, 1995), and LISREL (Jöreskog & Sörbom, 1996) can provide only the standardized parameter estimates, without their standard errors (SEs). Mplus 5.0 or above (Muthén & Muthén, 2007) provides both the standardized parameter estimates and their SEs. The program, however, does not report the covariances among the standardized parameter estimates, which are also important for the test because the standardized coefficients being compared are generally not independent. Moreover, none of these programs has any built-in function that allows their users to compare the standardized coefficients directly.

Phantom variables approach

Cheung (2009b) has given a detailed description of how to construct the confidence intervals on the difference between two standardized coefficients with the use of phantom variables. A phantom variable is a latent variable without observed indicators and has no residual (Rindskopf, 1984). It can be used to trick model-fitting programs into imposing constraints that are not normally within their repertoire (Loehlin, 2004). Many SEM programs have functions that help to simplify the model specification involving phantom variables. For example, LISREL has an AP function (Jöreskog & Sörbom, 1996), and Mplus has a MODEL CONSTRAINT option (Muthén & Muthén, 2007) for creating additional parameters. By defining an additional parameter as the difference between two standardized coefficients, we can readily obtain the parameter estimate and its SE and can use the Wald statistic to test its statistical significance (cf. Cheung, 2009b).

Although the phantom variable approach provides a general solution for comparing standardized coefficients in SEM, it can be implemented only by using a specific class of SEM software programs. First, a nonlinear constraint is an essential feature of the phantom variables approach, because the difference in two standardized coefficients (i.e., the additional parameter) is defined as a nonlinear function of the basic model parameters. As a result, users must have an SEM program that supports model fitting with nonlinear constraints (e.g., LISREL and Mplus) in order to implement the method. Unfortunately, not many applied researchers can get access to these programs freely. In fact, most local academic departments and research institutes can afford to support and maintain only one SEM program due to various practical reasons, such as limited resources, personal preference, and faculty training. It is not feasible for them to switch from one SEM program to another. Furthermore, many reported SEM studies indicated that their analyses were based primarily on AMOS or EQS, which currently do not support the specification of nonlinear constraints. For example, a review by Guo et al. (2009) showed that around 50% of the studies used either AMOS or EQS but only around 3% of those used Mplus. Similarly, another review by Jackson, Gillaspy, and Purc-Stephenson (2009) suggested that the figures were around 40% and 7%, respectively.

Second, the phantom variable approach basically defines an additional parameter as the difference of two standardized coefficients and tests its value against zero by using the Wald statistic. Consequently, whether this approach can be extended to the comparison of three or more standardized coefficients simultaneously will depend further on the availability of an overall test for all the additional parameters concerned. As far as we know, most of the SEM programs fail to perform an overall test like this, except Mplus. This further narrows down one’s choices of programs for comparing standardized coefficients.

The proposed method

The proposed method uses model reparameterization for standardizing model parameters. The idea of model reparameterization is to transform the hypothesized model into a set of successive covariance-equivalent models that share the same implied covariance matrix as the original model. As a result, a coefficient that does not exist as a model parameter in the original model becomes a model parameter in the final transformed model. Chan’s (2007) sequential model-fitting method for comparing the indirect (mediation) effects in SEM demonstrated one of the usages of the model reparameterization technique. Our proposed method applies the model reparameterization technique to the standardization of model parameters and demonstrates another usage of the technique.

The proposed method adopts a two-stage approach for comparing standardized coefficients. At stage 1, we first transform the original model (M1) into the standardized model (M2) by reparameterization so that the path coefficients as described in the transformed model are equivalent to the standardized path coefficients of the original model. Once the standardized coefficients appear as free model parameters in M2, we can test their differences. Hence, at stage 2, we compare the standardized coefficients by imposing appropriate equality constraint(s) on the parameters of interest in M2 and perform statistical inference on the basis of the LR test. In the following section, we give a detailed description of how to transform the original model into the standardized model at stage 1.

General framework of model transformation at stage 1

Like other SEM analyses, we first define a given model that is of theoretical interest. We label this model as the original model, M1. Figure 1a shows M1 with k effects acting on Y, where \( \gamma_1, \gamma_2, \ldots, \gamma_k \) are the unstandardized path coefficients, \( \varphi_{ij} \) is the covariance between Xi and Xj, and E is the error term. Without loss of generality, all variables are assumed to have zero means. The model equation of the original model, first in raw form and then in standardized form, can be written as follows:

$$ \begin{aligned} Y &= \sum_{i=1}^{k} \gamma_i X_i + E \\ \frac{Y}{SD(Y)} &= \sum_{i=1}^{k} \left( \frac{SD(X_i)\,\gamma_i}{SD(Y)} \times \frac{X_i}{SD(X_i)} \right) + \frac{E}{SD(Y)} \end{aligned} $$
(1)

Our task is, therefore, to transform the original model so that the standardized coefficients, \( {\gamma_i}* = \frac{{SD(Xi)}}{{SD(Y)}}{\gamma_i} \), become model parameters of the standardized model, M2.
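As a check on this rescaling, the sketch below regresses Y on two predictors from a covariance matrix, rescales the coefficients by SD(Xi)/SD(Y), and compares the result with the same regression run on the correlation metric. The covariance entries are hypothetical values chosen for illustration (they are not taken from Table 1), and `regress_two_predictors` is an illustrative helper, not a function from any SEM package.

```python
import math

def regress_two_predictors(s11, s12, s22, s1y, s2y):
    """Solve the normal equations for Y on X1 and X2 from (co)variances."""
    det = s11 * s22 - s12 * s12
    g1 = (s22 * s1y - s12 * s2y) / det
    g2 = (s11 * s2y - s12 * s1y) / det
    return g1, g2

# Hypothetical covariance entries (assumed for illustration only).
s11, s12, s22 = 9.0, 1.2, 1.0   # var(X1), cov(X1, X2), var(X2)
s1y, s2y, syy = 3.0, 0.8, 4.0   # cov(X1, Y), cov(X2, Y), var(Y)

g1, g2 = regress_two_predictors(s11, s12, s22, s1y, s2y)
sd1, sd2, sdy = math.sqrt(s11), math.sqrt(s22), math.sqrt(syy)

# Rescale: gamma_i* = SD(Xi) / SD(Y) * gamma_i
g1_star, g2_star = sd1 / sdy * g1, sd2 / sdy * g2

# The same regression on the correlation metric gives the same values.
r12, r1y, r2y = s12 / (sd1 * sd2), s1y / (sd1 * sdy), s2y / (sd2 * sdy)
b1, b2 = regress_two_predictors(1.0, r12, 1.0, r1y, r2y)

print(round(g1, 3), round(g2, 3))            # unstandardized
print(round(g1_star, 3), round(g2_star, 3))  # standardized
```

With these assumed values the ordering of the two coefficients reverses after standardization, which is the kind of discrepancy the two hypotheses can produce when predictor variances differ widely.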

Fig. 1 A general regression model with k effects on Y. a Original model (M1). b Half-transformed model. c Final standardized model (M2). Observed variables (X1, . . . , Xk and Y) are omitted in panel c

Figure 1b shows a half-transformed model of M1. We first transform the model by regressing the (k + 1) observed variables on (k + 1) dummy latent variables (DLVs), F1, F2, . . . , Fk, and FY, which are manipulated to have unit variance. D is the disturbance term. Chan (2007) used the term DLV to denote the variable that was used to factorize the original mediator in the sequential model-fitting method. The function of a DLV is similar to Rindskopf’s (1984) concept of a phantom variable.Footnote 3 The model equation of the half-transformed model is as follows:

$$ FY = \sum_{i=1}^{k} \gamma_i^{*} F_i + D $$
(2)

By standardizing F1, F2, . . . , Fk, and FY (i.e., all the variables have unit variance), Eq. 2 will be equivalent to Eq. 1. In other words, \( \gamma_i^{*} \) will be equal to the standardized path coefficients of the original model. Since F1 to Fk are independent variables, we can fix their variances directly at 1.0 in SEM. However, FY is a dependent variable, and its variance does not exist as a free parameter in SEM, so we cannot fix its variance directly. Therefore, the question becomes how we can standardize FY. From Eq. 2, the variance of FY is

$$ \begin{aligned} \mathrm{var}(FY) &= \mathrm{var}\left( \sum_{i=1}^{k} \gamma_i^{*} F_i + D \right) \\ &= \sum_{i=1}^{k} \gamma_i^{*2} + 2 \sum \sum_{i \ne j} \gamma_i^{*} \gamma_j^{*} \varphi_{ij}^{*} + \mathrm{var}(D) \\ &= g(\theta) + \mathrm{var}(D) \end{aligned} $$
(3)

where \( \varphi_{ij}^{*} \) is the covariance between Fi and Fj, θ is a vector of unknown model parameters, \( g(\theta) = \sum_{i=1}^{k} \gamma_i^{*2} + 2 \sum \sum_{i \ne j} \gamma_i^{*} \gamma_j^{*} \varphi_{ij}^{*} \) is defined as the total variances and covariances due to the antecedent variables, and var(D) is the disturbance variance. If var(D) = 1 − g(θ), then var(FY) will become 1.0, since \( \mathrm{var}(FY) = g(\theta) + [1 - g(\theta)] = 1.0 \).

In programs such as LISREL and Mplus, we can fix the disturbance variance by using nonlinear constraints, but our aim is to propose a method that does not involve nonlinear constraints. Therefore, we need to further transform the model into the final standardized model (M2) by regressing the disturbance term on a phantom variable, F999,Footnote 4 with unit variance and k image latent variables (F1′, F2′, . . . , Fk′), with variance = −1.0. We label F1′, F2′, . . . , Fk′ as image latent variables and the structure formed by them as the image structure of the effects on FY. An image structure is defined by the following four properties: (1) The image structure has the same structural form as the target structure; (2) the path coefficients of the image structure are the same as the corresponding path coefficients of the target structure; (3) var(Fi′) = −var(Fi); and (4) cov(Fi′, Fj′) = −cov(Fi, Fj). As can be seen in Fig. 1c, F1′ to Fk′ have the same structure as F1 to Fk. The path leading from Fi′ to D is the same as the path leading from Fi to FY (i.e., \( \gamma_i^{*} \)). Var(Fi′) = −1, which is the image of var(Fi) = 1; and cov(Fi′, Fj′) = \( -\varphi_{ij}^{*} \), which is the image of cov(Fi, Fj) = \( \varphi_{ij}^{*} \). In theory, the variance of a random variable cannot be negative. In this case, however, we pragmatically allow a negative unit variance for Fi′ in order to generate the desired variance of D.

In Fig. 1c, the effect of Fi′ on D (\( \gamma_i^{*} \)) is equal to the effect of Fi on FY, and the path leading from F999 to D is always fixed at 1.0. By fixing var(F999) = 1.0, var(Fi′) = −1.0, and cov(Fi′, Fj′) = \( -\varphi_{ij}^{*} \), we will have

$$ \begin{aligned} D &= F999 + \sum_{i=1}^{k} \gamma_i^{*} F_i' \\ \mathrm{var}(D) &= \mathrm{var}(F999) + \sum_{i=1}^{k} \gamma_i^{*2}\, \mathrm{var}(F_i') + 2 \sum \sum_{i \ne j} \gamma_i^{*} \gamma_j^{*} \left( -\varphi_{ij}^{*} \right) \\ &= 1.0 - \left( \sum_{i=1}^{k} \gamma_i^{*2} + 2 \sum \sum_{i \ne j} \gamma_i^{*} \gamma_j^{*} \varphi_{ij}^{*} \right) \\ &= 1.0 - g(\theta) \end{aligned} $$
(4)

In short, two sources of effects act on the disturbance term, D, in M2: (1) the effect of the phantom variable, F999, with unit variance, and (2) the effects of the image structure (F1′ to Fk′), which make up a total variance of −g(θ). By substituting Eq. 4 into Eq. 3, we can see that the variance of the dependent latent variable, FY, is fixed at 1.0 nonstochastically:

$$ \begin{aligned} \mathrm{var}(FY) &= g(\theta) + \mathrm{var}(D) \\ &= g(\theta) + 1 - g(\theta) \\ &= 1.0. \end{aligned} $$
(5)

To summarize, two important criteria need to be observed when a model transformation is performed. First, the variances of the DLVs in M2 must be fixed at 1.0 nonstochastically. Second, M2 must have the same implied covariance structure as the original model. Once we successfully transform the original model into the standardized model at stage 1, comparing the standardized coefficients using the LR test at stage 2 is straightforward.
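The cancellation in Eqs. 3 to 5 can be verified numerically. The sketch below is an illustration under stated assumptions: the standardized paths and DLV correlations are hypothetical values, `quad_form` is an illustrative helper, and g(θ) is computed as the quadratic form γ*′Φ*γ*, which counts each pair of predictors once in each order.

```python
def quad_form(a, cov):
    """Return a' * cov * a for a list vector and a nested-list matrix."""
    n = len(a)
    return sum(a[i] * cov[i][j] * a[j] for i in range(n) for j in range(n))

# Hypothetical standardized paths and DLV correlations (k = 3).
gamma_star = [0.30, 0.25, 0.10]
phi_star = [[1.0, 0.4, 0.2],
            [0.4, 1.0, 0.3],
            [0.2, 0.3, 1.0]]

# Variance of FY due to F1..Fk.
g_theta = quad_form(gamma_star, phi_star)

# Image structure: same paths, negated variances and covariances.
phi_image = [[-v for v in row] for row in phi_star]

# var(D) = var(F999) + contribution of the image structure = 1 - g(theta)
var_d = 1.0 + quad_form(gamma_star, phi_image)

# var(FY) = g(theta) + var(D)
var_fy = g_theta + var_d
print(round(g_theta, 4), round(var_d, 4), round(var_fy, 4))
```

Whatever values the free parameters take, the image structure contributes exactly −g(θ), so var(FY) stays at 1.0 without any nonlinear constraint.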

Real examples

Three real examples are considered in order to illustrate the proposed method using EQS (Bentler, 1995). Readers who are interested in working on these examples can also download the complete EQS program codes (see Electronic Supplementary Material). The first two examples use a sample that consists of 200 (Hong Kong) cases randomly selected from the PISA 2006 data set (OECD, 2009) on five variables: parental occupational status (V1), parental educational level (V2), child’s home possession (V3), child’s home educational resources (V4), and child’s reading ability (V5). Table 1 summarizes the sample covariance matrix of the variables.

Example 1: A regression model with three predictors

Stage 1

Figure 2a shows the original model, M1. In this example, a regression model with three predictors is considered. Specifically, we attempt to compare the standardized effects of parental occupational status (\( \gamma_1^{*} \)), parental educational level (\( \gamma_2^{*} \)), and child’s home possession (\( \gamma_3^{*} \)) on child’s reading ability (V5). The model is fitted to the observed data by using EQS 6.1 for Windows. Since M1 is a saturated model with 0 degrees of freedom, it has a perfect fit with model chi-square, χ² = 0.

Fig. 2 Models in Example 1. a Original model (M1). b Standardized model (M2). V1 = parental occupational status, V2 = parental educational level, V3 = child’s home possession, V5 = reading scores. Labels for parameters of interest are printed

We follow the general framework to transform M1 into the standardized model, M2. Figure 2b depicts the standardized model, M2. First, each observed variable is regressed on a DLV (F1 to F4). Variances of F1 to F3 are fixed at 1.0. F5 is the disturbance term of F4.Footnote 5 F999 is the phantom variable with unit variance, and the path from F999 to F5 is fixed at 1.0. F6 = F1′, F7 = F2′, and F8 = F3′ are image latent variables with negative unit variance, and they form the image structure of F1, F2, and F3.

Second, M2 is fitted to the observed data.Footnote 6 The path coefficients of the image latent variables F6, F7, and F8 are constrained to be equal to the coefficients of the corresponding target variables F1, F2, and F3. Each covariance between image latent variables is constrained to be equal to the negative of the covariance between the corresponding target variables. Six linear constraints (three on the \( \gamma_i^{*} \)’s and three on the \( \varphi_{ij}^{*} \)’s) are imposed on the model altogether. Again, the model has a perfect fit with model chi-square, χ² = 0, df = 0. Table 3 shows the parameter estimates and their standard errors (SEs) in M2 (under the heading “reparameterization”). Comparing these parameter estimates and their SEs with the standardized parameter estimates and SEs reported by Mplus (with a built-in function) and LISREL (by using the phantom variables approachFootnote 7) shows that they are perfectly comparable with each other.Footnote 8 In other words, the original model has been successfully transformed into the standardized model, and the path coefficients that appear in M2 are equivalent to the standardized coefficients of the original model, M1.

Table 3 Summary of the standardized parameter estimates and their estimated standard errors in Example 1 by analysis using different structural equation modeling programs and approaches

Stage 2

To test the equality of the three standardized coefficients from V1, V2, and V3 to V5, we fit a constrained model under H0: \( \gamma_1^{*} = \gamma_2^{*} = \gamma_3^{*} \) by imposing two linear equality constraints: (1) F1→F4 = F3→F4 and (2) F2→F4 = F3→F4. The model chi-square is χ² = 2.070, df = 2, p = .355. Table 4 shows the model chi-squares and the parameter estimates of the constrained and unconstrained models. We compare the chi-square of the constrained model with that of the unconstrained model. The LR test gives ∆χ² = 2.070 − 0 = 2.070, ∆df = 2 − 0 = 2, p > .05, suggesting that there is no significant difference among the standardized regression coefficients. Hence, we cannot reject the hypothesis that the relative effects of parental occupational status, parental educational level, and child’s home possession on child’s reading ability are the same.
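The chi-square difference test used at stage 2 is simple enough to reproduce by hand. The sketch below is an illustration rather than the EQS computation: `chi2_sf` and `lr_test` are hypothetical helper names, and the upper-tail probability is obtained from closed forms for 1 and 2 degrees of freedom plus a standard recurrence for higher integer degrees of freedom, using only the Python standard library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        nu, tail = 2, math.exp(-half)             # closed form for df = 2
    else:
        nu, tail = 1, math.erfc(math.sqrt(half))  # closed form for df = 1
    # Recurrence: Q(x | nu + 2) = Q(x | nu)
    #             + half**(nu / 2) * exp(-half) / Gamma(nu / 2 + 1)
    while nu < df:
        tail += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return tail

def lr_test(chi2_constrained, df_constrained, chi2_free, df_free):
    """Return (delta chi-square, delta df, p-value) for nested models."""
    d_chi2 = chi2_constrained - chi2_free
    d_df = df_constrained - df_free
    return d_chi2, d_df, chi2_sf(d_chi2, d_df)

# Model chi-squares reported for Example 1 (constrained vs. unconstrained M2).
d_chi2, d_df, p = lr_test(2.070, 2, 0.0, 0)
print(round(p, 3))  # 0.355
```

The same function applied to the difference statistics reported in the introductory example (6.002 and 0.011, each with one degree of freedom) gives p-values below and above .05, respectively, in line with the conclusions stated there.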

Table 4 Hypothesis testing results based on the unstandardized model (M1) and standardized model (M2) in Example 1

Example 2: A path model with three antecedent variables, one mediator, and one outcome variable

Stage 1

We further hypothesize that the child’s educational resources at home (V4) mediates the relationships of parental occupational status (V1), parental educational level (V2), and child’s possession at home (V3) with child’s reading ability (V5). Therefore, we define the original model, M1 (Fig. 3a), as a five-variable path model with V1, V2, and V3 as the antecedent variables, V4 as the mediator, and V5 as the outcome variable. In this example, the aim is to compare the standardized effects of V1, V2, and V3 on V4. When \( \gamma_1^{*} = \gamma_2^{*} = \gamma_3^{*} \), the indirect effects on V5 are said to be equal. We fit M1 to the observed data using EQS. The chi-square goodness-of-fit statistic is χ² = 13.218, df = 3, p < .01.

Fig. 3 Models in Example 2. a Original model (M1). b Standardized model (M2). V1 = parental occupational status, V2 = parental educational level, V3 = child’s home possession, V4 = child’s home educational resources, V5 = reading scores. Labels for parameters of interest are printed

Following the proposed method, we first transform the model into the standardized model. Although the original model is more complex in this example, the logic of model transformation remains the same. Figure 3b depicts the standardized model. Again, each observed variable is regressed on a DLV (F1 to F5). F1 to F3 are independent variables, and their variances are fixed at 1.0. F4 and F5 are the dependent variables, so we use their disturbance terms, F6 and F7, to standardize F4 and F5, respectively.

For F4, let \( g_4(\theta) \) be the total variances and covariances due to F1 to F3. Our task, therefore, is to cancel out \( g_4(\theta) \) by making use of F6 so that F4 can have unit variance. We follow the general framework and regress F6 on F888, F8, F9, and F10. Here, F888 plays the role of F999 in the general framework: it is the phantom variable with unit variance. F8 = F1′, F9 = F2′, and F10 = F3′ are image latent variables with negative unit variance, and they form the image structure of F1, F2, and F3 on F4. The path leading from F888 to F6 is fixed at 1.0.

Similarly, for F5, \( g_5(\theta) \) is equal to the variance due to F4. To standardize F5, we need to cancel out \( g_5(\theta) \) by regressing F7 on F999 and F11. F999 is the phantom variable with unit variance, and F11 = F4′ is the image latent variable of F4 in this case. The path leading from F999 to F7 is fixed at 1.0. Since the variance of F4 has been fixed at 1.0 indirectly in the previous step, we can cancel the effect of F4 and impose unit variance on F5 by simply fixing the variance of F11 at −1.0 and constraining the path from F4 to F5 to be equal to the path from F11 to F7.
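Written out, the cancellation for F5 takes one line each for the disturbance and the outcome. In the following worked equations, \( \beta^{*} \) is a label introduced here for illustration (it is not used in the source) to denote the common value of the path from F4 to F5 and the path from F11 to F7:

$$ \begin{aligned} \mathrm{var}(F7) &= \mathrm{var}(F999) + \beta^{*2}\, \mathrm{var}(F11) = 1 - \beta^{*2} \\ \mathrm{var}(F5) &= \beta^{*2}\, \mathrm{var}(F4) + \mathrm{var}(F7) = \beta^{*2} + \left( 1 - \beta^{*2} \right) = 1 \end{aligned} $$

The second line relies on var(F4) = 1, which is why F4 has to be standardized before F5.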

As in Example 1, we fit M2 to the observed data. Four linear equality constraints are imposed on the path coefficients of the image latent variables (F8 to F11) and those of their corresponding target variables (F1 to F4). The three covariances among F8, F9, and F10 are constrained to be equal to the negative of the covariances among F1, F2, and F3. Seven linear constraints are specified altogether. M2 gives exactly the same chi-square value as M1, χ² = 13.218, df = 3, p < .01, indicating that the two models are covariance equivalent and have the same implied covariance matrix. Table 5 shows the parameter estimates and their SEs, which again are the same as the results given by Mplus and LISREL.Footnote 9
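The number of linear constraints follows a simple pattern across the examples. The sketch below encodes that pattern; the rule (k path constraints plus k(k − 1)/2 covariance constraints for each standardized dependent variable with k predictors) is inferred here from the counts reported in the examples rather than stated as a formula in the source, and the function name is hypothetical.

```python
def count_linear_constraints(predictors_per_outcome):
    """Count equality constraints needed for the image structures.

    Each standardized dependent variable with k predictors needs k path
    constraints plus k * (k - 1) / 2 covariance constraints.
    """
    total = 0
    for k in predictors_per_outcome:
        total += k + k * (k - 1) // 2
    return total

print(count_linear_constraints([3]))     # Example 1: 6
print(count_linear_constraints([3, 1]))  # Example 2: 7
print(count_linear_constraints([2]))     # Example 3: 3
```

These counts match the six, seven, and three linear constraints reported for Examples 1, 2, and 3.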

Table 5 Summary of the standardized parameter estimates and their estimated standard error in Example 2 by analysis using different structural equation modeling programs and approaches

Stage 2

To compare the standardized coefficients of V1, V2, and V3 on V4, we fit a constrained model by imposing two linear equality constraints: (1) F1→F4 = F3→F4 and (2) F2→F4 = F3→F4. As is shown in Table 6, the model chi-square is χ² = 39.097, df = 5. The LR test gives ∆χ² = 39.097 − 13.218 = 25.879, ∆df = 5 − 3 = 2, p < .001, suggesting that the relative effects of parental occupational status, parental educational level, and child’s home possession on child’s educational resources at home are not all equal. Hence, the indirect effects on child’s reading ability are also not all equal.

Table 6 Hypothesis testing results based on the unstandardized model (M1) and standardized model (M2) in Example 2

Example 3: A structural model with two antecedent variables and one outcome variable

This example further demonstrates how the proposed method can be applied for comparing standardized coefficients in models with latent variables. Six variables were selected from the dataset in Schoon and Parson (2002), who studied how the social structure influences teenage aspirations and subsequent occupational attainment. They are, namely, examination score (V1), highest qualifications (V2), job aspiration (V3), educational aspiration (V4), Goldthorpe (V5), and RGSC (V6). The six selected variables together measure three latent factors: educational achievement (F1), teenage aspiration (F2), and occupational attainment (F3). The sample consists of 6,407 cases from the 1970 British Cohort Study (BCS70). Table 7 summarizes the sample covariance matrix of the variables.

Table 7 Covariance matrix for the 1970 British cohort (N = 6407) from Schoon and Parson’s (2002) study

Stage 1

Figure 4a shows the original model, M1. Our aim is to compare the standardized coefficients of educational achievement (\( \gamma_1^{*} \)) and teenage aspiration (\( \gamma_2^{*} \)) on occupational attainment. The model is fitted to the observed data by using EQS. The chi-square goodness-of-fit statistic is χ² = 1.199, df = 6, p > .05.

Fig. 4 Structural model in Example 3. a Original model (M1). b Standardized model (M2). F1 = educational achievement, F2 = teenage aspiration, F3 = occupational attainment. Labels for parameters of interest are printed

We follow the general framework to transform M1 into the standardized model, M2. Figure 4b depicts the standardized model, M2. Since the latent factors F1, F2, and F3 are the target factors that we are going to standardize, each latent factor is regressed on a DLV (F4 to F6). Variances of F4 and F5 are fixed at 1.0. F7 is the disturbance term of F6. F999 is the phantom variable with unit variance, and the path from F999 to F7 is fixed at 1.0. F8 = F4′ and F9 = F5′ are image latent variables with negative unit variance, and they form the image structure of F4 and F5.

M2 is then fitted to the observed data. The path coefficients of the image latent variables, F8 and F9, are constrained to be equal to the coefficients of the corresponding target variables, F4 and F5. The covariance between F8 and F9 is constrained to be equal to the negative of the covariance between F4 and F5. Three linear constraints are imposed on the model altogether. M2 gives the same chi-square value as M1, χ² = 1.199, df = 6, p > .05, indicating that the two models are covariance equivalent and have the same implied covariance matrix. Table 8 shows the parameter estimates and their SEs. The parameter estimates and their SEs are the same as the standardized parameter estimates and their corresponding SEs reported by Mplus, suggesting that we successfully transformed the model into the standardized model.

Table 8 Summary of the parameter estimates and hypothesis test results in Example 3 by analysis using EQS and Mplus

Stage 2

To test the equality of the two standardized coefficients from F1 and F2 on F3, we fit a constrained model by imposing a linear equality constraint (F4→F6 = F5→F6) on M2. Table 8 shows the model chi-squares and the parameter estimates of the constrained and unconstrained models (under the heading “Standardized”). We compare the chi-square of the constrained model with that of the unconstrained model. The LR test gives ∆χ² = 2.981 − 1.199 = 1.782, ∆df = 7 − 6 = 1, p > .05, suggesting that the relative effects of educational achievement and teenage aspiration on occupational attainment are not significantly different from each other. The Wald test based on the analysis by Mplus leads to the same conclusion about the two coefficients.

Discussion

In this article, a method for comparing standardized coefficients in SEM is proposed. Since different variables are often measured in different units in behavioral research, comparing their standardized effects will lead to a more meaningful conclusion, because they are affected less by the units of measurement. Three real examples are given to demonstrate the implementation of the proposed method. In all the examples, our method performs accurately in standardizing the model at stage 1. It provides the same standardized parameter estimates and standard errors, as compared with those reported by Mplus and LISREL (using the phantom variables approach). At stage 2, the LR test can be employed as a routine step to compare the coefficients of interest.

We also compare the LR test with the Wald test results in the regression example at the beginning of this article and Example 3. Theoretically, these tests address the same question, and they are asymptotically equivalent under the same null hypothesis (Chou & Bentler, 1990; Satorra, 1989). From Tables 2 and 8, the p-values reported by the LR test and the Wald test are highly comparable, suggesting that the LR test at stage 2 behaves similarly to the Wald test used by the phantom variable approach for comparing standardized coefficients.

It is worth noting that comparisons of coefficients based on the unstandardized and standardized metrics lead to different statistical conclusions throughout our examples (see Tables 2, 4, 6, and 8), because the two tests test different null hypotheses, as was discussed previously. If researchers fail to recognize the difference between the two tests and make inferences about one metric on the basis of the analysis of the other, they risk drawing a misleading conclusion. Researchers should pay special attention to the difference between the two metrics, especially when the variances of the variables are very different, and should choose the appropriate metric for testing according to the questions they are going to address.

There are several distinguishing features of the proposed method. First, it gives accurate standard error estimates for the standardized parameters. To obtain the standardized estimates, it is procedurally tempting to standardize the variables first and analyze the data on the basis of the correlation matrix, because this saves considerable effort. However, as is shown in Tables 3 and 5 (under the heading “Correlation”), an analysis based on the correlation matrix in general gives correct parameter estimates but incorrect SEs (e.g., Bentler, 2007; Cheung, 2009a; Cudeck, 1989). When we analyze the correlation matrix, the variances of the dependent variables depend on other random parameters and are, therefore, subject to sampling variability (i.e., they are not fixed at 1.0 nonstochastically), which eventually affects the accuracy of the standard error estimates. We can further verify this by comparing the correlation-based SEs and our SE estimates with the bootstrap SEs. Tables 3 and 5 (under the heading “Bootstrap”) show the standardized parameter estimates and their corresponding SEs by bootstrapping.Footnote 10 As was expected, the correlation-based SEs deviate quite substantially from the bootstrap estimates for some of the parameters, suggesting that they are problematic. In contrast, the SE estimates given by our method agree with the bootstrap SEs up to two decimal places, suggesting that they can be trusted generally. Hence, as is shown in our Examples 1 and 2, it is generally inappropriate to compare the standardized coefficients based on the analysis of correlations unless the method of constrained estimation, which gives the correct SEs, is implemented when a correlation matrix is analyzed (see Browne, 1982; Browne & Mels, 1992; Mels, 1989).
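To illustrate how bootstrap SEs for standardized coefficients can be obtained as a benchmark, the following Python sketch resamples cases with replacement and recomputes the standardized regression coefficients in each resample. It is only an illustration under stated assumptions: the data are simulated (they are not the data analyzed in Examples 1 and 2), the model is a plain two-predictor regression rather than an SEM, and the function names, the seeds, and the number of replications (500) are hypothetical choices.

```python
import random
import statistics

def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5

def standardized_betas(rows):
    """Standardized coefficients of y on x1 and x2; rows are (x1, x2, y) tuples."""
    x1, x2, y = zip(*rows)
    r12, r1y, r2y = corr(x1, x2), corr(x1, y), corr(x2, y)
    denom = 1.0 - r12 ** 2
    return (r1y - r2y * r12) / denom, (r2y - r1y * r12) / denom

def bootstrap_se(rows, n_boot=500, seed=1):
    """Bootstrap SEs: SD of the coefficients over case-resampled data sets."""
    rng = random.Random(seed)
    n = len(rows)
    draws = []
    for _ in range(n_boot):
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        draws.append(standardized_betas(sample))
    b1, b2 = zip(*draws)
    return statistics.stdev(b1), statistics.stdev(b2)

def simulate(n=200, seed=0):
    """Simulated (hypothetical) data with predictors on very different scales."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        x2 = 0.5 * x1 + rng.gauss(0, 3)
        y = 0.4 * x1 + 0.1 * x2 + rng.gauss(0, 1)
        rows.append((x1, x2, y))
    return rows

rows = simulate()
beta1, beta2 = standardized_betas(rows)
se1, se2 = bootstrap_se(rows)
print(f"beta1* = {beta1:.3f} (bootstrap SE {se1:.3f})")
print(f"beta2* = {beta2:.3f} (bootstrap SE {se2:.3f})")
```

Because each resample is restandardized, the bootstrap SEs reflect the sampling variability of the variances as well as of the covariances, which is the source of variability that a naive correlation-based analysis ignores.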

Second, the method is compatible with all major SEM software programs on the market. Unlike other approaches, our method does not involve the use of nonlinear constraints, and it requires only the basic standard functions to work. Although we demonstrate the implementation of the method only by using EQS, the method can work well with other programs, such as AMOS,Footnote 11 too. Researchers can choose their favorite SEM programs for implementing the proposed method.

In relation to this, another advantage of our method is that one can keep the use of programming to a minimum. The phantom variable approach requires researchers to fully understand the functional relationshipsFootnote 12 among the model parameters before they can specify the nonlinear constraints on the additional parameters correctly. In contrast, one can use the graphical programming capabilities possessed by some SEM software programs (e.g., EQS and AMOS) to implement the proposed method and avoid the complicated syntax. Indeed, many SEM beginners welcome SEM software programs such as EQS and AMOS because of their well-designed graphical user interface (e.g., Kline, 1998). Some SEM practitioners may find our proposed method favorable because they can follow the general framework of model transformation and use the graphical interface to specify the standardized model without going into the mathematical details.

Finally, the proposed method is capable of comparing three or more standardized coefficients simultaneously. By using the LR test at stage 2, we can test the equality of k standardized coefficients by simply imposing (k − 1) linear equality constraints on the standardized coefficients and comparing the chi-square statistics between the constrained and unconstrained standardized models. As a post hoc comparison, we can apply the Lagrange multiplier (LM) tests to further evaluate the significance of each equality constraint in a pairwise fashion after an overall significant LR test result has been observed. For example, we can further test the null hypotheses H0: γ₁* = γ₃* and H0: γ₂* = γ₃* after finding an overall significant difference among the three standardized coefficients by using the LM test in Example 2 (see Table 6). The LM test results show that releasing the equality constraint, γ₁* = γ₃*, yields a significant improvement in model fit for the standardized model, with χ² = 23.00, df = 1, p < .001. This improvement means that there is a significant difference between the standardized coefficients γ₁* and γ₃*. In contrast, releasing the equality constraint, γ₂* = γ₃*, in the standardized model does not significantly improve the model fit, with χ² = 3.40, df = 1, p > .05, suggesting that there is no significant difference between the standardized coefficients γ₂* and γ₃*.

Conclusion

For many years, methodologists have studied how different parametric statistical procedures, such as canonical correlation analysis, can be incorporated into SEM (e.g., Fan, 1997; Graham, 2008). One reason is that many of these multivariate techniques do not provide the SEs for different types of coefficients and, therefore, statistical significance tests cannot be conducted. The present study shows that if these coefficients are made explicit in the model (i.e., appear as model parameters in the specified model), we can easily obtain the SEs and carry out subsequent testing involving these coefficients by using standard SEM analysis. Our proposed method demonstrates one use of the model reparameterization technique in this area of study.

In this article, only limited kinds of models are considered in order to demonstrate how the method can be applied for making relevant statistical inferences. Model transformation will become more tedious if the original model is complex. However, the general principle of model transformation for more complex models remains unchanged. Although more image latent variables and a larger image structure are involved for a more complex model, this does not influence the effectiveness of the proposed method for comparing standardized coefficients. Nevertheless, the proposed method may fail in other extreme model conditions, such as a nonrecursive model with reciprocal effects. Further investigation is required to evaluate the effectiveness of the proposed method in these conditions.

Future studies can also explore how model reparameterization can be used for testing other parameters, such as standardized indirect effects and squared multiple correlation coefficients (R²) (Kwan & Chan, 2010). Considering the general model in Fig. 2, R² can be defined as the proportion of total variance of Y that is accounted for by the predictors (X1, ..., Xk). Mathematically, it can be expressed as

$$ \begin{aligned} R^2 &= 1 - \operatorname{var}(D) \\ &= 1 - [1 - g(\theta)] \\ &= g(\theta) \end{aligned} \tag{6} $$

Therefore, R² is, in fact, equivalent to g(θ), which is defined as the total variances and covariances due to the antecedent variables in the standardized model. It will be interesting to reparameterize g(θ) as a single model parameter in the transformed model, so that we can conduct significance tests and subsequent analysis involving R².
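As a concrete sketch of what g(θ) looks like, consider the special case of two standardized predictors, X1 and X2, with standardized coefficients γ₁* and γ₂*. The symbol φ₁₂* for the correlation between the two predictors is notation assumed here for illustration. Because Y has unit variance in the standardized model, its variance decomposes into the part due to the predictors plus var(D), so that

$$ R^2 = g(\theta) = \gamma_1^{*2} + \gamma_2^{*2} + 2\,\gamma_1^{*}\gamma_2^{*}\phi_{12}^{*} $$

With the hypothetical values γ₁* = .30, γ₂* = .40, and φ₁₂* = .50, for instance, this gives R² = .09 + .16 + .12 = .37 and var(D) = .63. Testing R² would then amount to testing this single function of the standardized parameters.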

Finally, in addition to raising researchers’ awareness about the difference between the tests of coefficients based on standardized and unstandardized metrics, we hope that our proposed method can also alert methodologists to the potential usefulness of model reparameterization as a general modeling technique in SEM. Future research could explore how model reparameterization can be useful in other kinds of SEM analysis.