Published in: Advances in Data Analysis and Classification 2/2019

Open Access 26.03.2018 | Regular Article

A bivariate index vector for measuring departure from double symmetry in square contingency tables

Written by: Shuji Ando, Kouji Tahata, Sadao Tomizawa


Abstract

For square contingency tables, a double symmetry model having a matrix structure that combines both symmetry and point symmetry was proposed. Also, an index which represents the degree of departure from double symmetry was proposed. However, this index cannot simultaneously characterize the degree of departure from symmetry and the degree of departure from point symmetry. For measuring the degree of departure from double symmetry, the present paper proposes a bivariate index vector that can simultaneously characterize the degree of departure from symmetry and the degree of departure from point symmetry.

1 Introduction

Consider an \(r \times r\) square contingency table \({\varvec{P}}=(p_{ij})\) with the same row and column nominal classifications. Here \(p_{ij}\) denotes the probability that an observation will fall in the ith row and jth column of the table (\(i=1,\dots ,r;j=1,\dots ,r\)). For the analysis of square contingency tables, we are interested here in whether or not the row classification is symmetric or point-symmetric to the column classification (see Tahata and Tomizawa 2014).
Bowker (1948) proposed the symmetry model defined by
$$\begin{aligned} p_{ij}=p_{ji} \quad \mathrm{for} \ i\ne j. \end{aligned}$$
This model indicates that the probabilities are symmetric with respect to the main diagonal of the contingency table. Namely, this model indicates that the probability that an observation will fall in row category i and column category j is equal to the probability that the observation falls in row category j and column category i \((i\ne j)\). Wall and Lienert (1976) proposed the point symmetry model defined by
$$\begin{aligned} p_{ij}=p_{r+1-i,r+1-j} \quad \mathrm{for} \ i, j=1,\dots ,r. \end{aligned}$$
This model indicates that the probabilities are point symmetric with respect to the center cell (when r is odd) or center point (when r is even) of the contingency table.
Consider the data in Table 1 taken from Andersen (1997, p. 226). These data show the forecasts of production and prices for the coming three-year period given by experts in July 1956, and the actual figures for production and prices in May 1959, from a sample of about 4000 Danish factories. For Table 1, we shall denote, for example, the probability that the forecast and the actual values are “Higher” and “Lower”, respectively, by P(H, L), and the probability that those are “Higher” and “No change”, respectively, by P(H, N). For these data, we are interested in whether (1) P(H, N) \(=\) P(N, H), P(H, L) \(=\) P(L, H), and P(L, N) \(=\) P(N, L), and (2) P(H, H) \(=\) P(L, L), P(H, N) \(=\) P(L, N), P(H, L) \(=\) P(L, H), and P(N, H) \(=\) P(N, L). Note that (1) means the symmetry of cell probabilities with respect to the main diagonal in the table, and (2) means the symmetry of cell probabilities with respect to the center point (the center cell) in the table.
Tomizawa (1985) considered a matrix structure that combines both symmetry and point symmetry. The double symmetry model is defined by
$$\begin{aligned} p_{ij}=p_{ji}=p_{r+1-i,r+1-j}=p_{r+1-j,r+1-i} \quad \mathrm{for} \ i, j=1,\dots ,r. \end{aligned}$$
This model indicates that the probabilities are symmetric with respect to both the main diagonal and the secondary diagonal of the contingency table.
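The three model definitions can be checked cell by cell. The following is a minimal sketch in Python, not part of the original paper; the function names and the small example tables are hypothetical, and 0-based indices are used, so that cell \((r+1-i, r+1-j)\) becomes `p[r-1-i][r-1-j]`. Because all three conditions are invariant to scaling, tables of counts can be used in place of probabilities.

```python
def is_symmetric(p, tol=1e-12):
    """Symmetry model: p[i][j] equals p[j][i] for every pair of cells."""
    r = len(p)
    return all(abs(p[i][j] - p[j][i]) <= tol
               for i in range(r) for j in range(r))


def is_point_symmetric(p, tol=1e-12):
    """Point symmetry model: p[i][j] equals p[r-1-i][r-1-j] (0-based indices)."""
    r = len(p)
    return all(abs(p[i][j] - p[r - 1 - i][r - 1 - j]) <= tol
               for i in range(r) for j in range(r))


def is_double_symmetric(p, tol=1e-12):
    """Double symmetry model: symmetry and point symmetry hold together."""
    return is_symmetric(p, tol) and is_point_symmetric(p, tol)


# Hypothetical 3 x 3 illustrations (not data from the paper).
both = [[1, 2, 3], [2, 4, 2], [3, 2, 1]]      # symmetric and point-symmetric
only_s = [[1, 2, 3], [2, 4, 5], [3, 5, 6]]    # symmetric only
only_ps = [[1, 2, 3], [4, 5, 4], [3, 2, 1]]   # point-symmetric only

for name, table in [("both", both), ("only_s", only_s), ("only_ps", only_ps)]:
    print(name, is_symmetric(table), is_point_symmetric(table),
          is_double_symmetric(table))
```

Only the first table satisfies the double symmetry model; each of the other two satisfies exactly one of the component models.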
Table 1
Forecasts of production and prices for the coming three-year period given by experts in July 1956, and the actual figures for production and prices in May 1959, from a sample of about 4000 Danish factories; from Andersen (1997, p. 226)

(a) For prices
| Forecast 1956 \ Actual 1959 | Higher | No change | Lower | Total |
|---|---|---|---|---|
| Higher | 209 | 169 | 6 | 384 |
| No change | 190 | 3073 | 184 | 3447 |
| Lower | 3 | 62 | 81 | 146 |
| Total | 402 | 3304 | 271 | 3977 |

(b) For production
| Forecast 1956 \ Actual 1959 | Higher | No change | Lower | Total |
|---|---|---|---|---|
| Higher | 532 | 394 | 69 | 995 |
| No change | 447 | 1727 | 334 | 2508 |
| Lower | 39 | 230 | 231 | 500 |
| Total | 1018 | 2351 | 634 | 4003 |

When the probabilities \(p_{ij}\) are unknown, they are estimated from an observed \(r \times r\) contingency table \({\varvec{N}}=(x_{ij})\) obtained from random sampling. Typically, when a given model does not hold, we are interested in checking for an extended model or in analyzing the deviation from the model (from the residuals). On the other hand, we are also interested in measuring the degree of departure from the corresponding model. For evaluating goodness-of-fit of the model, test statistics (e.g., Pearson’s chi-squared statistic or the likelihood ratio statistic) are used. When the model does not hold for several tables, we may be interested in comparing the degrees of departure from the corresponding model. However, test statistics are not useful for this comparison because they depend on the dimension r and the sample size. Thus, for comparing the degrees of departure from the model in several tables, we are interested in an index that does not depend on the dimension r and the sample size.

1.1 Index \(\varPhi _{S}\) for departure from symmetry

For square contingency tables with nominal categories, Tomizawa (1994) considered the index \(\varPhi _{S}\) which represents the degree of departure from symmetry. The index \(\varPhi _{S}\) is independent of the dimension r. Assume that \(p_{ij}+p_{ji}>0\) for all \(i\ne j\). The index \(\varPhi _{S}\) is expressed as follows:
$$\begin{aligned} \varPhi _{S}=\frac{1}{\log 2}I_{S}, \end{aligned}$$
where
$$\begin{aligned} I_{S}=\underset{i\ne j}{\sum \sum }a_{ij}\log \bigg (\frac{a_{ij}}{b_{ij}}\bigg ), \end{aligned}$$
with
$$\begin{aligned} a_{ij}=\frac{p_{ij}}{\delta },\quad b_{ij}=\frac{p_{ij}+p_{ji}}{2\delta },\quad \delta =\underset{i\ne j}{\sum \sum }p_{ij}. \end{aligned}$$
Note that \(I_{S}\) is the Kullback–Leibler information between \(a_{ij}\) and \(b_{ij}\) for all \(i\ne j\). Note that (1) the index \(\varPhi _{S}\) lies between 0 and 1, (2) \(\varPhi _{S} = 0\) if and only if the symmetry model holds, and (3) \(\varPhi _{S} = 1\) if and only if the degree of departure from symmetry is maximal, in the sense that for all \(i \ne j\) either \(p_{ij} = 0\) or \(p_{ji} = 0\). We point out that the index \(\varPhi _{S}\) can be expressed as
$$\begin{aligned} \varPhi _{S}=\frac{1}{\log 2}\underset{\{e_{ij}\}}{\min }\underset{i\ne j}{\sum \sum }a_{ij}\log \bigg (\frac{a_{ij}}{e_{ij}}\bigg ), \end{aligned}$$
where minimization is constrained by
$$\begin{aligned} \underset{i\ne j}{\sum \sum }e_{ij}=1, \quad e_{ij}>0, \quad e_{ij}=e_{ji}, \quad \mathrm{for} \ i\ne j. \end{aligned}$$
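The definition of \(\varPhi _{S}\) translates directly into a few lines of code. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the function name `phi_s` is hypothetical, the table is passed as a list of rows, and a cell with \(p_{ij}=0\) is taken to contribute zero (the usual convention \(0\log 0=0\)). Since \(\delta \) cancels inside the logarithm, counts can be used directly in place of probabilities.

```python
import math


def phi_s(table):
    """Plug-in value of Phi_S for a square table of counts or probabilities."""
    r = len(table)
    # delta: total mass of the off-diagonal cells
    delta = sum(table[i][j] for i in range(r) for j in range(r) if i != j)
    total = 0.0
    for i in range(r):
        for j in range(r):
            x = table[i][j]
            if i != j and x > 0:  # a zero cell contributes 0 (0 log 0 = 0)
                # a_ij / b_ij = 2 p_ij / (p_ij + p_ji); delta cancels
                total += x * math.log(2 * x / (x + table[j][i]))
    return total / (delta * math.log(2))


# Table 1a (prices): rows are forecasts, columns are actual values.
prices = [[209, 169, 6], [190, 3073, 184], [3, 62, 81]]
print(f"{phi_s(prices):.4f}")
```

For the prices data the printed value agrees with the estimate 0.0770 reported in Table 2. A symmetric table gives 0, and a table in which one cell of every off-diagonal pair is empty gives 1.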

1.2 Index \(\varPhi _{PS}\) for departure from point symmetry

Tomizawa et al. (2007) proposed the index \(\varPhi _{PS}\) which represents the degree of departure from point symmetry. The index \(\varPhi _{PS}\) is independent of the dimension r. Let
$$\begin{aligned} D=\left\{ \begin{array}{ll} \{(i,j)| i,j=1,\dots ,r; (i,j)\ne ((r+1)/2, (r+1)/2)\} &{}\quad (r :\mathrm{odd}),\\ \{(i,j)| i,j=1,\dots ,r\} &{}\quad (r :\mathrm{even}),\\ \end{array}\right. \end{aligned}$$
be the set of cells \((i, j)\) outside of the center of the table. Assume that \(p_{ij}+p_{r+1-i,r+1-j}>0\) for all \((i, j)\in D\). The index \(\varPhi _{PS}\) is expressed as follows:
$$\begin{aligned} \varPhi _{PS}=\frac{1}{\log 2}I_{PS}, \end{aligned}$$
where
$$\begin{aligned} I_{PS}=\underset{(i, j)\in D}{\sum \sum }A_{ij}\log \bigg (\frac{A_{ij}}{B_{ij}}\bigg ), \end{aligned}$$
with
$$\begin{aligned} A_{ij}=\frac{p_{ij}}{\varDelta },\quad B_{ij}=\frac{p_{ij}+p_{r+1-i,r+1-j}}{2\varDelta },\quad \varDelta =\underset{(i, j)\in D}{\sum \sum }p_{ij}. \end{aligned}$$
Note that \(I_{PS}\) is the Kullback–Leibler information between \(A_{ij}\) and \(B_{ij}\) for all \((i, j)\in D\). Note that (1) the index \(\varPhi _{PS}\) lies between 0 and 1, (2) \(\varPhi _{PS} = 0\) if and only if the point symmetry model holds, and (3) \(\varPhi _{PS} = 1\) if and only if the degree of departure from point symmetry is maximal, in the sense that for all \((i, j)\in D\) either \(p_{ij} = 0\) or \(p_{r+1-i,r+1-j} = 0\). We point out that the index \(\varPhi _{PS}\) can be expressed as
$$\begin{aligned} \varPhi _{PS}=\frac{1}{\log 2}\underset{\{E_{ij}\}}{\min }\underset{(i, j)\in D}{\sum \sum }A_{ij}\log \bigg (\frac{A_{ij}}{E_{ij}}\bigg ), \end{aligned}$$
where minimization is constrained by
$$\begin{aligned} \underset{(i, j)\in D}{\sum \sum }E_{ij}=1, \quad E_{ij}>0, \quad E_{ij}=E_{r+1-i,r+1-j}, \quad \mathrm{for} \ (i, j)\in D. \end{aligned}$$
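The index \(\varPhi _{PS}\) can be computed in the same way, with the transposed cell replaced by the point-reflected cell and the sum taken over \(D\). This is a minimal sketch with a hypothetical function name `phi_ps`; it uses 0-based indices, so the reflected cell is `table[r-1-i][r-1-j]` and the center cell (for odd r) is `(r // 2, r // 2)`. Zero cells are again taken to contribute zero.

```python
import math


def phi_ps(table):
    """Plug-in value of Phi_PS for a square table of counts or probabilities."""
    r = len(table)
    # D: all cells except the center cell, which exists only when r is odd
    cells = [(i, j) for i in range(r) for j in range(r)
             if not (r % 2 == 1 and i == j == r // 2)]
    big_delta = sum(table[i][j] for i, j in cells)
    total = 0.0
    for i, j in cells:
        x = table[i][j]
        y = table[r - 1 - i][r - 1 - j]  # point-reflected cell
        if x > 0:  # a zero cell contributes 0 (0 log 0 = 0)
            total += x * math.log(2 * x / (x + y))
    return total / (big_delta * math.log(2))


prices = [[209, 169, 6], [190, 3073, 184], [3, 62, 81]]
production = [[532, 394, 69], [447, 1727, 334], [39, 230, 231]]
print(f"{phi_ps(prices):.4f} {phi_ps(production):.4f}")
```

For Table 1a and Table 1b this reproduces the estimates 0.0887 and 0.0604 reported in Table 2.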

1.3 Index \(\varPhi _{DS}\) for departure from double symmetry

Yamamoto et al. (2010) proposed the index \(\varPhi _{DS}\) which represents the degree of departure from double symmetry. Let
$$\begin{aligned} E_{1}=\left\{ \begin{array}{ll} \{(i,j)| i,j=1,\dots ,r;i=j \ \mathrm{or} \ i+j=r+1; &{}\quad (r:\mathrm{odd}), \\ (i,j)\ne ((r+1)/2,(r+1)/2)\} &{} \\ \{(i,j)| i,j=1,\dots ,r;i=j \ \mathrm{or} \ i+j=r+1\} &{}\quad (r:\mathrm{even}),\\ \end{array} \right. \end{aligned}$$
and, let
$$\begin{aligned} E_{2}= \{(i,j)| i,j=1,\dots ,r;i\ne j \ \mathrm{and} \ i+j\ne r+1\}. \end{aligned}$$
\(E_{1}\) is the set of cells \((i, j)\) on the diagonal or on the secondary diagonal (without center point if any), while \(E_2\) includes the remaining cells. Assume that \(p_{ij}+p_{ji}+p_{r+1-i,r+1-j}+p_{r+1-j,r+1-i}>0\) for all \((i, j)\in E_{1}\cup E_{2}\). Let for \(t=1,2,\)
$$\begin{aligned} p_{ij(t)}= & {} \frac{p_{ij}}{\delta _t },\ \ q_{ij(t)}=\frac{p_{ij}+p_{ji} +p_{r+1-i,r+1-j} +p_{r+1-j,r+1-i}}{4\delta _t} \ \ \mathrm{for} \ (i,j)\in E_{t},\\ \delta _{t}= & {} \underset{(i,j)\in E_{t}}{\sum \sum }\ p_{ij}. \end{aligned}$$
The index \(\varPhi _{DS}\) is expressed as follows:
$$\begin{aligned} \varPhi _{DS}=\frac{\delta _1\phi _1+\delta _2\phi _2}{\delta _1+\delta _2}, \end{aligned}$$
where
$$\begin{aligned} \phi _t=\frac{1}{\log (2t)}I_t \quad (t=1,2), \end{aligned}$$
with
$$\begin{aligned} I_t=\underset{(i,j)\in E_{t}}{\sum \sum }\ p_{ij(t)} \log \left( \frac{p_{ij(t)}}{q_{ij(t)}}\right) . \end{aligned}$$
For the analysis of the data, when the double symmetry model does not hold, we may measure the degree of departure from the double symmetry model by using \(\varPhi _{DS}\). However, \(\varPhi _{DS}\) cannot simultaneously characterize the degree of departure from symmetry and the degree of departure from point symmetry, although the double symmetry model has a structure that combines both symmetry and point symmetry (see Sect. 4).
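The weighted form of \(\varPhi _{DS}\) can be sketched as follows. This is an illustrative implementation under assumptions, not the code of Yamamoto et al. (2010): the name `phi_ds` is hypothetical, indices are 0-based (so the secondary diagonal is `i + j == r - 1`), and zero cells contribute zero. The code uses the identity \(\delta _t\phi _t=\sum \sum _{E_t}p_{ij}\log (p_{ij}/{\bar{p}}_{ij})/\log (2t)\), where \({\bar{p}}_{ij}\) is the mean of the four cells that the double symmetry model sets equal.

```python
import math


def phi_ds(table):
    """Plug-in value of Phi_DS for a square table of counts or probabilities."""
    r = len(table)
    num = {1: 0.0, 2: 0.0}    # sum of p_ij * log(p_ij / mean of 4 cells) over E_t
    delta = {1: 0.0, 2: 0.0}  # delta_t: total mass of E_t
    for i in range(r):
        for j in range(r):
            if r % 2 == 1 and i == j == r // 2:
                continue  # the center cell belongs to neither E_1 nor E_2
            # E_1: main or secondary diagonal; E_2: all remaining cells
            t = 1 if (i == j or i + j == r - 1) else 2
            x = table[i][j]
            mean4 = (x + table[j][i] + table[r - 1 - i][r - 1 - j]
                     + table[r - 1 - j][r - 1 - i]) / 4
            delta[t] += x
            if x > 0:  # a zero cell contributes 0 (0 log 0 = 0)
                num[t] += x * math.log(x / mean4)
    # delta_t * phi_t = num[t] / log(2t), so the weighted mean simplifies to:
    return sum(num[t] / math.log(2 * t) for t in (1, 2)) / (delta[1] + delta[2])


prices = [[209, 169, 6], [190, 3073, 184], [3, 62, 81]]
print(f"{phi_ds(prices):.4f}")
```

For Table 1a this reproduces the estimate 0.0817 reported in Table 2.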

2 Index vector and a confidence region

The purpose of the present paper is to propose a bivariate index vector which represents the degree of departure from double symmetry. For measuring the degree of departure from double symmetry, the proposed index vector can simultaneously characterize the degree of departure from symmetry and the degree of departure from point symmetry. Also, the proposed index vector would be useful for visually comparing the degrees of departure from double symmetry using confidence regions.

2.1 Definition of the index vector

Assume that \(p_{ij}+p_{ji}>0\) for all \(i\ne j\), and \(p_{ij}+p_{r+1-i,r+1-j}>0\) for all \((i,j) \in D\). In order to represent the degree of departure from double symmetry, we propose in this paper the bivariate index vector
$$\begin{aligned} {\varvec{\varPsi }}=\left( \begin{array}{c} \varPhi _{S}\\ \varPhi _{PS}\\ \end{array} \right) ; \ \mathrm{a} \ 2 \times 1 \ \mathrm{vector}. \end{aligned}$$
It has the properties: (1) \({\varvec{\varPsi }}= (0,0)^{\prime }\) if and only if both symmetry and point symmetry models hold (namely, the double symmetry model holds), (2) \({\varvec{\varPsi }}= (1,1)^{\prime }\) if and only if the degrees of departure from both symmetry and point symmetry are maximal (i.e., the degree of departure from double symmetry is maximal), in the sense that \(p_{ij}=p_{r+1-j,r+1-i}=0\) (then \(p_{ji}>0\) and \(p_{r+1-i,r+1-j}>0\)) or \(p_{ji}=p_{r+1-i,r+1-j}=0\) (then \(p_{ij}>0\) and \(p_{r+1-j,r+1-i}>0\)) for all \(i\ne j\), and either \(p_{ii}=0\) or \(p_{r+1-i,r+1-i}=0\) for \(i=1, \dots , r/2\) (when r is even) or \(i=1, \dots , (r-1)/2\) (when r is odd). Note that the definition of the maximum degree of departure from double symmetry for the proposed index vector \({\varvec{\varPsi }}\) is different from that for the index \(\varPhi _{DS}\).

2.2 A confidence region for the index vector

Let
$$\begin{aligned} {\varvec{x}}= & {} (x_{11}, x_{12}, \dots , x_{1r}, x_{21}, x_{22},\dots , x_{2r}, \dots , x_{r1}, x_{r2}, \dots , x_{rr})^{\prime }; \ \mathrm{an} \ r^{2} \times 1 \ \mathrm{vector},\\ {\varvec{p}}= & {} (p_{11}, p_{12}, \dots , p_{1r}, p_{21}, p_{22},\dots , p_{2r}, \dots , p_{r1}, p_{r2}, \dots , p_{rr})^{\prime }; \ \mathrm{an} \ r^{2} \times 1 \ \mathrm{vector}. \end{aligned}$$
We assume that \({\varvec{x}}\) has a multinomial distribution Multi\((n; {\varvec{p}})\) with sample size n and probability vector \({\varvec{p}}\). Then \(\sqrt{n}(\hat{{\varvec{p}}}-{\varvec{p}})\) has asymptotically a normal distribution with zero mean and covariance matrix \(\mathbf{Diag}({\varvec{p}})-{\varvec{pp}}^{\prime }\), where \(\hat{{\varvec{p}}}={\varvec{x}}/n\) and \(\mathbf{Diag}({\varvec{p}})\) is the diagonal matrix with the elements of \({\varvec{p}}\) on the main diagonal (see, e.g., Agresti 2013, p. 590). The estimators \(\hat{\varPhi }_{S}\) and \(\hat{\varPhi }_{PS}\) are given by \(\varPhi _{S}\) and \(\varPhi _{PS}\) with \(\{p_{ij}\}\) replaced by \(\{\hat{p}_{ij}\}\), respectively. Therefore, the sample version of \({\varvec{\varPsi }}\), i.e., \(\widehat{{\varvec{\varPsi }}}\), is given by \({\varvec{\varPsi }}\) with \(\varPhi _{S}\) and \(\varPhi _{PS}\) replaced by \(\hat{\varPhi }_{S}\) and \(\hat{\varPhi }_{PS}\), respectively. Let \((\partial {\varvec{\varPsi }}/\partial {\varvec{p}}^{\prime })\) denote the \(2\times r^{2}\) matrix for which the entry in row k and column l is \(\partial \varPsi _{k}({\varvec{p}})/\partial p_{l}\), where \(\varPsi _{1}\) and \(\varPsi _{2}\) denote \(\varPhi _{S}\) and \(\varPhi _{PS}\), respectively, and \(p_{l}\) denotes the lth element of \({\varvec{p}}\). For n approaching infinity, the estimated index vector can be approximated by
$$\begin{aligned} \widehat{{\varvec{\varPsi }}}={\varvec{\varPsi }}+\bigg (\frac{\partial {\varvec{\varPsi }}}{\partial {\varvec{p}}^{\prime }}\bigg )(\hat{{\varvec{p}}}-{\varvec{p}})+o(\parallel \hat{{\varvec{p}}}-{\varvec{p}}\parallel ), \end{aligned}$$
where \(o(\parallel \hat{{\varvec{p}}}-{\varvec{p}}\parallel )\) tends to \((0,0)^{\prime }\). Using the delta method (see Agresti 2013, Sect. 16.1), \(\sqrt{n}(\widehat{{\varvec{\varPsi }}}-{\varvec{\varPsi }})\) has asymptotically a bivariate normal distribution with zero mean and covariance matrix
$$\begin{aligned} \varvec{\Sigma }= & {} \bigg (\frac{\partial {\varvec{\varPsi }}}{\partial {\varvec{p}}^{\prime }}\bigg )(\mathbf{Diag}({\varvec{p}})-{\varvec{pp}}^{\prime })\bigg (\frac{\partial {\varvec{\varPsi }}}{\partial {\varvec{p}}^{\prime }}\bigg )^{\prime }\\= & {} \left( \begin{array}{cc} \sigma _{11} &{}\quad \sigma _{12} \\ \sigma _{21} &{}\quad \sigma _{22}\\ \end{array} \right) , \end{aligned}$$
with \(\sigma _{12}=\sigma _{21}\). The elements \(\sigma _{11}\), \(\sigma _{12}\) and \(\sigma _{22}\) are expressed as follows:
$$\begin{aligned} \sigma _{11}= & {} \bigg (\frac{\partial \varPhi _{S}}{\partial {\varvec{p}}^{\prime }}\bigg )(\mathbf{Diag}({\varvec{p}})-{\varvec{pp}}^{\prime })\bigg (\frac{\partial \varPhi _{S}}{\partial {\varvec{p}}^{\prime }}\bigg )^{\prime }\\= & {} \frac{1}{\delta ^{2}}\bigg [\underset{i\ne j}{\sum \sum }p_{ij}\big (\varOmega _{ij}\big )^{2}-\delta \big (\varPhi _{S}\big )^{2}\bigg ],\\ \sigma _{12}= & {} \bigg (\frac{\partial \varPhi _{S}}{\partial {\varvec{p}}^{\prime }}\bigg )(\mathbf{Diag}({\varvec{p}})-{\varvec{pp}}^{\prime })\bigg (\frac{\partial \varPhi _{PS}}{\partial {\varvec{p}}^{\prime }}\bigg )^{\prime }\\= & {} \frac{1}{\delta \varDelta }\bigg [\underset{i \ne j}{\sum \sum }p_{ij}\big (\varOmega _{ij}-\varPhi _{S}\big )\big ( W_{ij}-\varPhi _{PS}\big )\bigg ],\\ \sigma _{22}= & {} \bigg (\frac{\partial \varPhi _{PS}}{\partial {\varvec{p}}^{\prime }}\bigg )(\mathbf{Diag}({\varvec{p}})-{\varvec{pp}}^{\prime })\bigg (\frac{\partial \varPhi _{PS}}{\partial {\varvec{p}}^{\prime }}\bigg )^{\prime }\\= & {} \frac{1}{\varDelta ^{2}}\bigg [\underset{(i, j)\in D}{\sum \sum }p_{ij}\big (W_{ij}\big )^{2}-\varDelta \big (\varPhi _{PS}\big )^{2}\bigg ], \end{aligned}$$
where
$$\begin{aligned} \varOmega _{ij}=\frac{1}{\log 2}\log \bigg (\frac{2p_{ij}}{p_{ij}+p_{ji}}\bigg ), \quad W_{ij}=\frac{1}{\log 2}\log \bigg (\frac{2p_{ij}}{p_{ij}+p_{r+1-i,r+1-j}}\bigg ). \end{aligned}$$
Note that the previous formulas for the asymptotic variances \(\sigma _{11}\) and \(\sigma _{22}\) of \(\varPhi _{S}\) and \(\varPhi _{PS}\), respectively, have been derived by Tomizawa (1994) and Tomizawa et al. (2007).
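The closed form for \(\sigma _{11}\) can be checked with a short calculation. The following is a sketch under the assumption that \(p_{kl}>0\) for all \(k\ne l\), using \(\varPhi _{S}=\delta ^{-1}\sum \sum _{k\ne l}p_{kl}\varOmega _{kl}\); it is not reproduced from the cited papers. The first line gives the partial derivatives, the second shows that the term involving \({\varvec{pp}}^{\prime }\) vanishes, and the third expands the remaining quadratic form:
$$\begin{aligned} \frac{\partial \varPhi _{S}}{\partial p_{kl}}&=\frac{1}{\delta }\big (\varOmega _{kl}-\varPhi _{S}\big ) \quad (k\ne l), \qquad \frac{\partial \varPhi _{S}}{\partial p_{kk}}=0,\\ \underset{k\ne l}{\sum \sum }p_{kl}\frac{\partial \varPhi _{S}}{\partial p_{kl}}&=\frac{1}{\delta }\big (\delta \varPhi _{S}-\delta \varPhi _{S}\big )=0,\\ \sigma _{11}&=\frac{1}{\delta ^{2}}\underset{k\ne l}{\sum \sum }p_{kl}\big (\varOmega _{kl}-\varPhi _{S}\big )^{2}=\frac{1}{\delta ^{2}}\bigg [\underset{k\ne l}{\sum \sum }p_{kl}\big (\varOmega _{kl}\big )^{2}-\delta \big (\varPhi _{S}\big )^{2}\bigg ]. \end{aligned}$$
The same steps with \(W_{kl}\), \(\varDelta \) and \(D\) in place of \(\varOmega _{kl}\), \(\delta \) and \(\{k\ne l\}\) give \(\sigma _{22}\), and the product of the two derivative vectors gives \(\sigma _{12}\).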
Therefore, an approximate bivariate \(100(1-\alpha )\%\) confidence region for the index vector \({\varvec{\varPsi }}\) is given by
$$\begin{aligned} n\big (\widehat{{\varvec{\varPsi }}}-{\varvec{\varPsi }}\big )^{\prime }\widehat{\varvec{\Sigma }}^{-1}\big (\widehat{{\varvec{\varPsi }}}-{\varvec{\varPsi }}\big )\le \chi ^{2}_{(1-\alpha ; 2)}, \end{aligned}$$
where \(\chi ^{2}_{(1-\alpha ; 2)}\) is the \(1-\alpha \) quantile of the chi-square distribution with two degrees of freedom and \(\widehat{\varvec{\Sigma }}\) is given by \(\varvec{\Sigma }\) with \(\{p_{ij}\}\) replaced by \(\{\hat{p}_{ij}\}\).
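The estimated covariance matrix and the confidence region can be evaluated numerically as sketched below. This is an illustration under assumptions, not the authors' code: the function names `psi_and_sigma` and `in_confidence_region` are hypothetical, all cells are assumed to have positive counts, and indices are 0-based. The chi-square quantile with two degrees of freedom is computed from its closed form \(-2\log \alpha \).

```python
import math


def psi_and_sigma(counts):
    """Return (phi_s, phi_ps), (s11, s12, s22) and n for a table of positive counts."""
    r = len(counts)
    n = sum(sum(row) for row in counts)
    p = [[x / n for x in row] for row in counts]
    off_diag = [(i, j) for i in range(r) for j in range(r) if i != j]
    D = [(i, j) for i in range(r) for j in range(r)
         if not (r % 2 == 1 and i == j == r // 2)]
    # Omega_ij and W_ij; with 0-based indices, r+1-i becomes r-1-i
    omega = {(i, j): math.log2(2 * p[i][j] / (p[i][j] + p[j][i]))
             for i, j in off_diag}
    w = {(i, j): math.log2(2 * p[i][j] / (p[i][j] + p[r - 1 - i][r - 1 - j]))
         for i, j in D}
    delta = sum(p[i][j] for i, j in off_diag)
    big_delta = sum(p[i][j] for i, j in D)
    phi_s = sum(p[i][j] * omega[i, j] for i, j in off_diag) / delta
    phi_ps = sum(p[i][j] * w[i, j] for i, j in D) / big_delta
    s11 = (sum(p[i][j] * omega[i, j] ** 2 for i, j in off_diag)
           - delta * phi_s ** 2) / delta ** 2
    s12 = sum(p[i][j] * (omega[i, j] - phi_s) * (w[i, j] - phi_ps)
              for i, j in off_diag) / (delta * big_delta)
    s22 = (sum(p[i][j] * w[i, j] ** 2 for i, j in D)
           - big_delta * phi_ps ** 2) / big_delta ** 2
    return (phi_s, phi_ps), (s11, s12, s22), n


def in_confidence_region(psi0, counts, alpha=0.05):
    """Check whether the point psi0 lies in the approximate 100(1-alpha)% region."""
    (f1, f2), (s11, s12, s22), n = psi_and_sigma(counts)
    d1, d2 = f1 - psi0[0], f2 - psi0[1]
    det = s11 * s22 - s12 ** 2
    # n * d' Sigma^{-1} d, written out for the 2 x 2 case
    q = n * (s22 * d1 ** 2 - 2 * s12 * d1 * d2 + s11 * d2 ** 2) / det
    return q <= -2 * math.log(alpha)  # chi-square quantile, 2 degrees of freedom


prices = [[209, 169, 6], [190, 3073, 184], [3, 62, 81]]
psi_hat, sigma_hat, n = psi_and_sigma(prices)
print([round(v, 4) for v in sigma_hat])
print(in_confidence_region((0.0, 0.0), prices))
```

For Table 1a the three covariance entries agree with the estimate of \(\varvec{\Sigma }\) reported in Sect. 3 (1.3045, 0.3148 and 1.0286), and the point \((0,0)^{\prime }\), which corresponds to double symmetry, lies outside the 95% region.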

3 Examples

Consider the data in Table 1 taken from Andersen (1997, p. 226). For these data, we are interested in whether (1) P(H, N) \(=\) P(N, H), P(H, L) \(=\) P(L, H), and P(L, N) \(=\) P(N, L) (symmetry), and (2) P(H, H) \(=\) P(L, L), P(H, N) \(=\) P(L, N), P(H, L) \(=\) P(L, H), and P(N, H) \(=\) P(N, L) (point symmetry). Thus, for these data, we are interested in the structure of symmetry and point symmetry (namely, double symmetry).
For Table 1a, b, the estimated index vectors are
$$\begin{aligned} \hat{{\varvec{\varPsi }}}=\left( \begin{array}{c} 0.0770\\ 0.0887\\ \end{array} \right) \quad \mathrm{and} \quad \hat{{\varvec{\varPsi }}}=\left( \begin{array}{c} 0.0148\\ 0.0604\\ \end{array} \right) , \end{aligned}$$
respectively (see Table 2).
We shall compare the degrees of departure from double symmetry in Table 1a, b using the confidence region for \({\varvec{\varPsi }}\). From Table 1a, b, the estimates of \(\varvec{\Sigma }\) are obtained as
$$\begin{aligned} \widehat{{\varvec{\Sigma }}}=\left( \begin{array}{cc} 1.3045 &{}\quad 0.3148 \\ 0.3148 &{}\quad 1.0286\\ \end{array} \right) \quad \mathrm{and} \quad \widehat{\varvec{\Sigma } }=\left( \begin{array}{cc} 0.1113 &{}\quad 0.0268 \\ 0.0268 &{}\quad 0.2902\\ \end{array} \right) , \end{aligned}$$
respectively. Figure 1 shows that the confidence ellipses for \({\varvec{\varPsi }}\) do not overlap for the data from Table 1a (prices) and b (production). Therefore, we see that Table 1a, b have a different structure with respect to the degree of departure from double symmetry, in the sense that they differ in the degree of departure from symmetry or from point symmetry. From Fig. 1, we can conclude that the degree of departure from symmetry in Table 1a is greater than that in Table 1b. However, we cannot conclude that the degree of departure from point symmetry in Table 1a is greater than that in Table 1b. Thus, it may be difficult to judge whether the degree of departure from double symmetry in Table 1a is greater than that in Table 1b.
Table 2
Estimates of \(\varPhi _{S}\), \(\varPhi _{PS}\) and \(\varPhi _{DS}\), approximate standard errors for \(\hat{\varPhi }_{S}\), \(\hat{\varPhi }_{PS}\) and \(\hat{\varPhi }_{DS}\), and approximate 95% confidence intervals for \(\varPhi _{S}\), \(\varPhi _{PS}\) and \(\varPhi _{DS}\), for the data of Table 1a, b

(a) For Table 1a (prices)
| Index | Estimated index | Standard error | Confidence interval |
|---|---|---|---|
| \(\varPhi _{S}\) | 0.0770 | 0.0181 | (0.0415, 0.1125) |
| \(\varPhi _{PS}\) | 0.0887 | 0.0161 | (0.0571, 0.1202) |
| \(\varPhi _{DS}\) | 0.0817 | 0.0134 | (0.0554, 0.1080) |

(b) For Table 1b (production)
| Index | Estimated index | Standard error | Confidence interval |
|---|---|---|---|
| \(\varPhi _{S}\) | 0.0148 | 0.0053 | (0.0045, 0.0251) |
| \(\varPhi _{PS}\) | 0.0604 | 0.0085 | (0.0437, 0.0771) |
| \(\varPhi _{DS}\) | 0.0536 | 0.0075 | (0.0389, 0.0684) |

Also, we shall compare the degrees of departure from double symmetry in Table 1a, b using the confidence interval of \(\varPhi _{DS}\). From Table 2, we see that the confidence intervals overlap for Table 1a, b. Thus, we cannot judge whether the degree of departure from double symmetry in Table 1a is greater than that in Table 1b.
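The interval comparison can be reproduced from the rounded estimates and standard errors in Table 2. The sketch below assumes that the reported intervals are of the usual form estimate \(\pm \) normal quantile \(\times \) standard error, which is consistent with the tabulated values up to rounding; the helper names `wald_interval` and `overlap` are hypothetical.

```python
from statistics import NormalDist


def wald_interval(estimate, std_error, level=0.95):
    """Approximate confidence interval: estimate +/- z * standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level = 0.95
    return estimate - z * std_error, estimate + z * std_error


def overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return max(a[0], b[0]) <= min(a[1], b[1])


# Rounded estimates and standard errors of Phi_DS taken from Table 2.
ds_prices = wald_interval(0.0817, 0.0134)
ds_production = wald_interval(0.0536, 0.0075)
print(ds_prices, ds_production, overlap(ds_prices, ds_production))
```

Because the inputs are rounded to four decimals, the computed end points can differ from those in Table 2 in the last digit; the two intervals for \(\varPhi _{DS}\) overlap, as stated above.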

4 Discussion

We point out that the bivariate index vector is useful for visually comparing the degrees of departure from double symmetry. For example, we consider the artificial data in Table 3a–c. From Table 4a–c, we see that all values of \(\hat{\varPhi }_{DS}\) for Table 3a–c are equal. However, the value \(\widehat{{\varvec{\varPsi }}}\) for Table 3a is not equal to those for Table 3b, c.
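Table 3
Three artificial contingency tables with \(n = 4020\) each

(a)
| Row \ Column | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | 100 | 10 | 100 | 100 |
| 2 | 2000 | 100 | 100 | 100 |
| 3 | 100 | 100 | 100 | 800 |
| 4 | 100 | 100 | 10 | 100 |

(b)
| Row \ Column | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | 100 | 10 | 100 | 100 |
| 2 | 10 | 100 | 100 | 100 |
| 3 | 100 | 100 | 100 | 800 |
| 4 | 100 | 100 | 2000 | 100 |

(c)
| Row \ Column | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | 100 | 10 | 100 | 100 |
| 2 | 2000 | 100 | 100 | 100 |
| 3 | 100 | 100 | 100 | 10 |
| 4 | 100 | 100 | 800 | 100 |
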
Table 4
Estimates of \(\varPhi _{S}\), \(\varPhi _{PS}\) and \(\varPhi _{DS}\), approximate standard errors for \(\hat{\varPhi }_{S}\), \(\hat{\varPhi }_{PS}\) and \(\hat{\varPhi }_{DS}\), and approximate 95% confidence intervals for \(\varPhi _{S}\), \(\varPhi _{PS}\) and \(\varPhi _{DS}\), for the data of Table 3a–c

(a) For Table 3a
| Index | Estimated index | Standard error | Confidence interval |
|---|---|---|---|
| \(\varPhi _{S}\) | 0.7324 | 0.0108 | (0.7113, 0.7536) |
| \(\varPhi _{PS}\) | 0.0953 | 0.0079 | (0.0798, 0.1109) |
| \(\varPhi _{DS}\) | 0.3771 | 0.0068 | (0.3637, 0.3905) |

(b) For Table 3b
| Index | Estimated index | Standard error | Confidence interval |
|---|---|---|---|
| \(\varPhi _{S}\) | 0.1059 | 0.0088 | (0.0887, 0.1231) |
| \(\varPhi _{PS}\) | 0.6595 | 0.0103 | (0.6393, 0.6798) |
| \(\varPhi _{DS}\) | 0.3771 | 0.0068 | (0.3637, 0.3905) |

(c) For Table 3c
| Index | Estimated index | Standard error | Confidence interval |
|---|---|---|---|
| \(\varPhi _{S}\) | 0.7324 | 0.0108 | (0.7113, 0.7536) |
| \(\varPhi _{PS}\) | 0.6595 | 0.0103 | (0.6393, 0.6798) |
| \(\varPhi _{DS}\) | 0.3771 | 0.0068 | (0.3637, 0.3905) |

Also, from Fig. 2, we see the degree of departure from double symmetry, while distinguishing the degree of departure from symmetry and the degree of departure from point symmetry. Thus, we see that (1) for Table 3a, the degree of departure from symmetry is large but the degree of departure from point symmetry is small, (2) for Table 3b, the degree of departure from symmetry is small but the degree of departure from point symmetry is large, (3) for Table 3c, both the degree of departure from symmetry and the degree of departure from point symmetry are large.
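The estimated index vectors for the three artificial tables can be recomputed with one generic routine, since \(\varPhi _{S}\) and \(\varPhi _{PS}\) differ only in which partner cell is compared with cell \((i, j)\). This is a minimal sketch with hypothetical names (`departure_index`, `index_vector`); it uses 0-based indices and skips every cell that is its own partner, which removes exactly the diagonal cells for symmetry and the center cell (if any) for point symmetry.

```python
import math


def departure_index(table, partner):
    """Sum of x * log2(2x / (x + partner)) over cells with a distinct partner,
    divided by the total of those cells."""
    r = len(table)
    num = den = 0.0
    for i in range(r):
        for j in range(r):
            k, l = partner(i, j, r)
            if (k, l) == (i, j):
                continue  # diagonal cell (symmetry) or center cell (point symmetry)
            x, y = table[i][j], table[k][l]
            den += x
            if x > 0:  # a zero cell contributes 0 (0 log 0 = 0)
                num += x * math.log2(2 * x / (x + y))
    return num / den


def index_vector(table):
    """Plug-in estimate of the bivariate index vector (Phi_S, Phi_PS)."""
    phi_s = departure_index(table, lambda i, j, r: (j, i))
    phi_ps = departure_index(table, lambda i, j, r: (r - 1 - i, r - 1 - j))
    return phi_s, phi_ps


# The artificial data of Table 3, one list per table row.
table_3 = {
    "a": [[100, 10, 100, 100], [2000, 100, 100, 100],
          [100, 100, 100, 800], [100, 100, 10, 100]],
    "b": [[100, 10, 100, 100], [10, 100, 100, 100],
          [100, 100, 100, 800], [100, 100, 2000, 100]],
    "c": [[100, 10, 100, 100], [2000, 100, 100, 100],
          [100, 100, 100, 10], [100, 100, 800, 100]],
}
for label, table in table_3.items():
    s, ps = index_vector(table)
    print(f"Table 3{label}: Phi_S = {s:.4f}, Phi_PS = {ps:.4f}")
```

The printed values agree with the estimates in Table 4: the three tables share the same \(\hat{\varPhi }_{DS}\), yet their index vectors fall in three different corners of the unit square.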

5 Concluding remarks

This paper proposed a bivariate index vector that simultaneously characterizes the degree of departure from symmetry and the degree of departure from point symmetry. Since \(\hat{\varPhi }_{S}\) and \(\hat{\varPhi }_{PS}\) are not independent, as shown in Sect. 2.2, we believe that it is important to characterize the degree of departure from symmetry and the degree of departure from point symmetry simultaneously. In addition, the proposed index vector would be useful for visually comparing the degrees of departure from double symmetry using confidence regions. When the degrees of departure from double symmetry are compared across several square contingency tables, the proposed index vector makes the characteristics of the data easier to understand.

Acknowledgements

The authors would like to thank the anonymous reviewers and the editor for their comments and suggestions to improve this paper.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
Agresti A (2013) Categorical data analysis, 3rd edn. Wiley, Hoboken
Andersen EB (1997) Introduction to the statistical analysis of categorical data. Springer, Berlin
Bowker AH (1948) A test for symmetry in contingency tables. J Am Stat Assoc 43:572–574
Tahata K, Tomizawa S (2014) Symmetry and asymmetry models and decompositions of models for contingency tables. SUT J Math 50:131–165
Tomizawa S (1985) Double symmetry model and its decomposition in a square contingency table. J Jpn Stat Soc 15:17–23
Tomizawa S (1994) Two kinds of measures of departure from symmetry in square contingency tables having nominal categories. Stat Sin 4:325–334
Tomizawa S, Yamamoto K, Tahata K (2007) An entropy measure of departure from point-symmetry for two-way contingency tables. Symmetry Cult Sci 18:279–297
Wall KD, Lienert GA (1976) A test for point-symmetry in J-dimensional contingency-cubes. Biom J 18:259–264
Yamamoto K, Komatsu M, Tomizawa S (2010) Measure of departure from double-symmetry for square contingency tables. J Stati Appl 5:105–118
Metadata
Publication date: 26.03.2018
Publisher: Springer Berlin Heidelberg
Print ISSN: 1862-5347
Electronic ISSN: 1862-5355
DOI: https://doi.org/10.1007/s11634-018-0320-7
