In this section, I discuss the difference-in-differences (DiD) strategy for identifying the Average Treatment Effect on the Treated (ATET) (see, e.g., Lechner 2010), i.e., the effect on children living in Hungary or France in 2014. The potential outcome
Y (e.g., frequency of consumed sodas) depends on the time period
\(t \in \{0, 1\}\) and the potential treatment state
\(d \in \{0, 1\}\). The notation
\(Y_{t}^{d}\) indicates the potential outcome in the potential treatment state d and in time period
t. For example, the potential outcome of the treatment group (
\(d = 1\)) in the pre-treatment period (
\(t = 0\)) is represented by
\(Y^{1}_{0}\). This notation makes it easier to state the identifying assumptions of the DiD framework; see Lechner (2010).
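In this notation, the target parameter can be written out explicitly. The following is a sketch in the spirit of Lechner (2010), not a formula taken from this section: D denotes the treatment-group indicator that appears in the assumptions below, and the ATET compares the two potential outcomes of the treated in the post-treatment period (\(t = 1\)).
$$\begin{aligned} ATET = E[Y^{1}_{1} - Y^{0}_{1}|D=1]. \end{aligned}$$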
The first assumption, formulated in equation
5.1, requires the exogeneity of the covariates (
X). This assumption would be violated if the soda tax affected the characteristics of the children or the household. Time-invariant covariates, like gender, cannot be affected by the soda tax because they are constant over time. Time-varying covariates may be affected by the treatment, especially if they are measured after the implementation of the soda tax. Since I use repeated cross-sections, the covariates are measured in 2014, whereas the soda tax has been in force since 2011 or 2012, respectively. However, it is rather unlikely that the soda tax affects, for example, children’s TV consumption or whether the mother lives at the main home.
$$\begin{aligned} \begin{array}{l} X^{1}=X^{0}=X; \; \forall x \in \chi . \end{array} \end{aligned}$$
(5.1)
The main identifying assumption in the context of DiD is the common trend assumption, formally stated in Eq.
5.2. Intuitively speaking, the soda consumption and the BMI of children living in Hungary and Croatia would have followed the same time trend in the absence of the soda tax. For this reason, I need to control for child and household covariates that would lead to different time trends. For example, soda consumption increases with the age of the child (Grimm et al. 2004), such that the time trend differs between older and younger children. Another example is children from low-income families, who might have less pocket money available in times of economic crisis. To support this assumption, I provide a placebo test conditional on covariates using unaffected periods in Table 6 in Sect. 6.
$$\begin{aligned} \begin{array}{l} E[Y^{0}_{1}|X=x, D=1] - E[Y^{0}_{0}|X=x, D=1] \\ =E[Y^{0}_{1}|X=x, D=0] - E[Y^{0}_{0}|X=x, D=0]\\ =E[Y^{0}_{1}|X=x] - E[Y^{0}_{0}|X=x]; \; \forall x \in \chi . \end{array} \end{aligned}$$
(5.2)
A further assumption rules out an anticipatory effect (
\(\theta \)) of the policy in the pre-treatment period
\((t = 0)\) as formulated in Eq.
5.3. Accordingly, children in the treated countries Hungary and France must not anticipate the effect of the soda tax in 2010, i.e., they must not change their soda consumption prior to the implementation of the tax. Since the tax was discussed in the French parliament from 2005 to 2011, it might have raised awareness of unhealthy beverages among French children before the tax took effect. In that case, I may underestimate the impact of the soda tax. Conversely, knowledge of the proposed tax could have led consumers to stock up on sodas before the tax was introduced. However, I focus on school-age children, who have only limited pocket money, so such stockpiling is unlikely. Additionally, the decision to pass this law was unexpected and the implementation period of five months was rather short (Le Bodo et al. 2019). In Hungary, the law was passed one and a half months before it came into force (Ecorys 2014), which leaves an even shorter period for anticipatory behavior.
$$\begin{aligned} \begin{array}{l} \theta _{0}(x)= 0; \; \forall x \in \chi . \end{array} \end{aligned}$$
(5.3)
The last assumption is known as the common support assumption and is formulated in Eq.
5.4. It requires that for each child in Hungary in 2014, another child with the same characteristics exists in each of the following three groups: (i) Hungary in 2010, (ii) Croatia in 2010, and (iii) Croatia in 2014. Under assumptions 5.1–5.4, the ATET is identified.
$$\begin{aligned} \begin{array}{l} P[TD=1|X=x, (T,D) \in \{(t,d), (1,1)\}]< 1; \\ \; \forall (t, d) \in \{(0,1), (0,0), (1,0)\}; \;\forall x \in \chi . \end{array} \end{aligned}$$
(5.4)
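To see why these assumptions suffice, the argument can be sketched as follows, in the spirit of Lechner (2010). Here \(Y_{t}\) denotes the observed outcome in period t and \(ATET(x)\) the effect conditional on \(X=x\); both are shorthands introduced for this sketch only. The no-anticipation assumption (5.3) equates the observed pre-treatment outcome of the treated with \(Y^{0}_{0}\), and the common trend assumption (5.2) supplies the unobserved counterfactual trend from the control group:
$$\begin{aligned} E[Y^{0}_{1}|X=x, D=1]&= E[Y_{0}|X=x, D=1] + E[Y_{1}|X=x, D=0] - E[Y_{0}|X=x, D=0], \\ ATET(x)&= E[Y_{1}|X=x, D=1] - E[Y^{0}_{1}|X=x, D=1] \\&= \{E[Y_{1}|X=x, D=1] - E[Y_{0}|X=x, D=1]\} - \{E[Y_{1}|X=x, D=0] - E[Y_{0}|X=x, D=0]\}. \end{aligned}$$
Averaging \(ATET(x)\) over the covariate distribution of the treated in the post-treatment period, which the common support assumption (5.4) makes feasible, then gives the ATET.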
A standard DiD approach models a linear relationship between the policy and the outcome, which is appropriate when the outcome variable is continuous. The variable “Frequency of sodas”, however, is measured as a categorical variable in the HBSC dataset. It is therefore a limited dependent variable, implying a non-linear relationship between the policy and the outcome. Accounting for this non-linearity in a parametric model may, however, violate the identifying assumption of DiD, the common trend assumption (Lechner 2010). To deal with this issue, I use a semi-parametric approach with a modified common trend assumption, which models the relationship more flexibly than a parametric approach.
Equation
5.5 describes the identification of the semi-parametric ATET based on inverse probability weighting (Huber
2019). The outcome variable
Y is multiplied by an inverse probability weight, where
\(\Pi \) gives the share of treated observations in the post-treatment period and
\(\rho _{d,t}(X)\) is the probability of being in the treatment state d and in the time period t, conditional on covariates
X. This propensity score is estimated by probit.
$$\begin{aligned} \begin{array}{l} E\left[ \left\{ \frac{DT}{\Pi }- \frac{D(1-T)\rho _{1,1}(X)}{\rho _{1,0}(X)\Pi }- \left( \frac{(1-D)T\rho _{1,1}(X)}{\rho _{0,1}(X)\Pi }-\frac{(1-D)(1-T)\rho _{1,1}(X)}{\rho _{0,0}(X)\Pi } \right) \right\} Y \right] ,\\ \text {where}\ \Pi = Pr(D=1, T=1),\ \rho _{d,t}(X)=Pr(D=d, T=t|X). \end{array} \end{aligned}$$
(5.5)
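The weighting in Eq. 5.5 can be illustrated with a minimal Python sketch of its sample analog. This is not the estimator used in the paper: as a simplifying assumption, the propensity scores are computed as cell frequencies for a single discrete covariate instead of a probit, and all function names and the toy data are hypothetical. The toy data have a deterministic outcome with a covariate-specific time trend and a true effect of 0.5, so that the reweighted estimate can be compared with an unconditional DiD of group means.

```python
import numpy as np


def cell_propensities(d, t, x):
    """rho_{d,t}(x) = Pr(D=d, T=t | X=x), estimated by cell frequencies.

    Assumption for this sketch: x is a single discrete covariate
    (the paper estimates the propensity scores by probit instead).
    """
    rho = {}
    for dv in (0, 1):
        for tv in (0, 1):
            p = np.zeros(len(x), dtype=float)
            for xv in np.unique(x):
                in_x = x == xv
                p[in_x] = np.mean((d[in_x] == dv) & (t[in_x] == tv))
            rho[(dv, tv)] = p
    return rho


def ipw_did(y, d, t, x):
    """Sample analog of the weighting expression in Eq. 5.5."""
    rho = cell_propensities(d, t, x)
    pi = np.mean((d == 1) & (t == 1))  # share of treated, post-period obs
    r11 = rho[(1, 1)]
    w = (d * t / pi
         - d * (1 - t) * r11 / (rho[(1, 0)] * pi)
         - ((1 - d) * t * r11 / (rho[(0, 1)] * pi)
            - (1 - d) * (1 - t) * r11 / (rho[(0, 0)] * pi)))
    return float(np.mean(w * y))


def naive_did(y, d, t):
    """Unconditional DiD of the four group means, ignoring covariates."""
    def m(dv, tv):
        return y[(d == dv) & (t == tv)].mean()
    return float((m(1, 1) - m(1, 0)) - (m(0, 1) - m(0, 0)))


def build_example(effect=0.5):
    """Hypothetical toy data: the time trend depends on x, and the
    distribution of x differs across the four (d, t) groups."""
    counts = {(1, 1, 0): 30, (1, 1, 1): 10,
              (1, 0, 0): 20, (1, 0, 1): 20,
              (0, 1, 0): 10, (0, 1, 1): 30,
              (0, 0, 0): 25, (0, 0, 1): 15}
    d, t, x = [], [], []
    for (dv, tv, xv), n in counts.items():
        d += [dv] * n
        t += [tv] * n
        x += [xv] * n
    d, t, x = np.array(d), np.array(t), np.array(x)
    y = 1.0 + 2.0 * x + 0.3 * d + (0.2 + 0.6 * x) * t + effect * d * t
    return y, d, t, x


if __name__ == "__main__":
    y, d, t, x = build_example()
    print("IPW DiD:  ", ipw_did(y, d, t, x))
    print("naive DiD:", naive_did(y, d, t))
```

In this toy setting the reweighting aligns the covariate distribution of the three comparison groups with that of the treated in the post-treatment period, which is why it recovers the effect even though the unconditional comparison does not. Every (d, t, x) cell must be non-empty for the weights to be defined, which mirrors the common support assumption.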
To ensure that the common trend assumption holds, I include the following covariates (
X) in the estimation. At the individual level, I control for the age and sex of the child, because older children exhibit different soda consumption than younger children and boys differ from girls in their consumption behavior (Vereecken et al. 2005). Since TV consumption has been associated with soda consumption (see Andreyeva et al. 2011; Gebremariam et al. 2017; Grimm et al. 2004; Vereecken et al. 2006), I control for television consumption on a weekday. At the household level, I take several characteristics into account. First, I control for the household structure, in particular whether the mother or the father lives in the same household as the child. Second, I control for the wealth of the family, because it is associated with different soda consumption levels (Drewnowski et al. 2019). I use the following proxies for family wealth: ownership of a family car, the number of computers in the household, how well off the family is, and a dummy indicating whether the child has his/her own bedroom. Moreover, the direction of this association differs by region: soda consumption increases with wealth in Eastern European countries, whereas it decreases in Western European countries (Vereecken et al. 2005). Country-specific characteristics, like the growth of the Gross Domestic Product (GDP), may affect the soda consumption of a country’s inhabitants and thus bias the results. Controlling for country-specific covariates could solve this problem, yet this is not possible because of their multicollinearity with the treatment. Therefore, I inspect the GDP growth of each country pair in Sect. 6.
For the estimation, I use the didweight function of the causalweight package in R, with the default of 1999 bootstrap replications to calculate the standard errors and the default trimming rule of 0.05 to drop observations with extreme propensity scores from the sample. Since didweight is designed for one pre-treatment and one post-treatment period, I use the survey years 2010 and 2014 in the estimation. Because more than one pre-treatment survey year is available, I can also test the common trend assumption: I use the survey years 2006 and 2010 and run the estimation with a placebo (fake) treatment in 2010.
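The logic of the placebo test with bootstrapped standard errors can be sketched in Python as follows. This is a deliberately simplified illustration and not the didweight procedure: it uses an unconditional DiD of group means without covariates or trimming, a plain nonparametric bootstrap, and hypothetical toy data in which both groups share the same trend between the two pre-treatment survey years. All function names are hypothetical.

```python
import numpy as np


def did_means(y, d, t):
    """Unconditional DiD of the four group means (no covariates)."""
    def m(dv, tv):
        return y[(d == dv) & (t == tv)].mean()
    return float((m(1, 1) - m(1, 0)) - (m(0, 1) - m(0, 0)))


def bootstrap_se(y, d, t, n_boot=199, seed=0):
    """Standard error from resampling observations with replacement."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        draws[b] = did_means(y[idx], d[idx], t[idx])
    return float(draws.std(ddof=1))


def build_placebo_example():
    """Hypothetical pre-treatment data: t = 0 stands for 2006 and t = 1
    for 2010, the year of the fake treatment. Both groups follow the
    same trend, so the placebo effect is zero by construction."""
    d = np.repeat([1, 1, 0, 0], 50)    # treated vs. control country
    t = np.repeat([1, 0, 1, 0], 50)    # fake post vs. pre period
    noise = np.tile([0.1, -0.1], 100)  # averages to zero in every cell
    y = 1.0 + 0.3 * d + 0.2 * t + noise
    return y, d, t


if __name__ == "__main__":
    y, d, t = build_placebo_example()
    print("placebo estimate:", did_means(y, d, t))
    print("bootstrap SE:    ", bootstrap_se(y, d, t))
```

A placebo estimate that is small relative to its standard error is consistent with the common trend assumption in the pre-treatment periods, although it cannot prove that the assumption holds between 2010 and 2014.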