Since the joint distribution determines all marginal and conditional distributions, the conditional distribution
$$\begin{aligned} P_{\mathsf Z_0 \mid \bigotimes _{\ell =1}^m \mathsf Z_\ell } = \frac{P_{\bigotimes _{\ell =0}^m \mathsf Z_\ell }}{P_{\bigotimes _{\ell =1}^m \mathsf Z_\ell }} \end{aligned}$$
(3.6)
is explicitly given here by
$$\begin{aligned} \frac{P_{\bigotimes _{\ell =0}^m \mathsf Z_\ell }(s_{k_0}, \dots , s_{k_m})}{P_{\bigotimes _{\ell =1}^m \mathsf Z_\ell }(s_{k_1}, \dots , s_{k_m})} = \frac{P_{\bigotimes _{\ell =0}^m \mathsf Z_\ell }(s_{k_0}, \dots , s_{k_m})}{\sum _{k_0=1}^{K_0} P_{\bigotimes _{\ell =0}^m \mathsf Z_\ell }(s_{k_0}, s_{k_1}, \dots , s_{k_m})}. \end{aligned}$$
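As a concrete numerical illustration of Eq. (3.6), the following is a minimal sketch with a hypothetical joint probability table for a single predictor, i.e., \(m = 1\), \(K_0 = 2\), \(K_1 = 3\):

```python
import numpy as np

# Hypothetical joint probability table of (Z_0, Z_1): axis 0 indexes the
# value of Z_0, axis 1 the value of Z_1; entries sum to one.
joint = np.array([[0.10, 0.20, 0.05],
                  [0.15, 0.30, 0.20]])

# Marginal of Z_1: sum the joint over the values of Z_0
# (the denominator of Eq. (3.6)).
marginal = joint.sum(axis=0)

# Conditional P(Z_0 = i | Z_1 = k): divide each column by its marginal.
conditional = joint / marginal

# Each column of the conditional table sums to one.
print(conditional.sum(axis=0))  # -> [1. 1. 1.]
```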
Assuming independence, Eq. (3.6) immediately reveals
$$\begin{aligned} P_{\mathsf Z_0 \mid \bigotimes _{\ell =1}^m \mathsf Z_\ell } = P_{\mathsf Z_0}. \end{aligned}$$
Assuming conditional independence of all
\(\mathsf Z_\ell , \ell =1,\ldots ,m\), given
\(\mathsf Z_0\) and further that
\(\mathsf Z_0\) is dichotomous, then
$$\begin{aligned} P_{\mathsf Z_0 \mid \bigotimes _{\ell =1}^m \mathsf Z_\ell } (1 \mid s_{k_1}, \dots , s_{k_m}) = \frac{P_{\bigotimes _{\ell =0}^m \mathsf Z_\ell }(1, s_{k_1}, \dots , s_{k_m})}{\sum _{i =0}^1 P_{\bigotimes _{\ell =0}^m \mathsf Z_\ell }(i, s_{k_1}, \dots , s_{k_m})} \end{aligned}$$
(3.7)
with
$$\begin{aligned} P_{\bigotimes _{\ell =0}^m \mathsf Z_\ell }(1, s_{k_1}, \dots , s_{k_m}) = \exp \left( \phi _{1} + \sum _{\ell =1}^m \phi _{k_\ell } + \sum _{\ell =1}^m \phi _{1, k_\ell } \right) \end{aligned}$$
and
$$\begin{aligned} \sum _{i=0}^1 P_{\bigotimes _{\ell =0}^m \mathsf Z_\ell }(i, s_{k_1}, \dots , s_{k_m}) = \sum _{i =0}^1 \exp \left( \phi _{i} + \sum _{\ell =1}^m \phi _{k_\ell } + \sum _{\ell =1}^m \phi _{i, k_\ell } \right). \end{aligned}$$
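The passage from Eq. (3.7) to the logistic form can be made explicit (a sketch: the predictor main effects \(\sum _{\ell =1}^m \phi _{k_\ell }\) are common to numerator and denominator and cancel, and dividing both by the numerator gives)

$$\begin{aligned} P_{\mathsf Z_0 \mid \bigotimes _{\ell =1}^m \mathsf Z_\ell } (1 \mid s_{k_1}, \dots , s_{k_m}) &= \frac{1}{1 + \exp \Big ( \phi _0 - \phi _1 + \sum _{\ell =1}^m \big ( \phi _{0, k_\ell } - \phi _{1, k_\ell } \big ) \Big )} \\ &= \varLambda \Big ( \phi _1 - \phi _0 + \sum _{\ell =1}^m \big ( \phi _{1, k_\ell } - \phi _{0, k_\ell } \big ) \Big ), \end{aligned}$$

so that, for \(0/1\)-coded predictors, identifying \(\beta _0\) with \(\phi _1 - \phi _0\) (plus the interaction differences of the reference states) and \(\beta _\ell \mathsf Z_\ell \) with \(\phi _{1, k_\ell } - \phi _{0, k_\ell }\) yields the linear predictor of Eq. (3.8).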
Finally,
$$\begin{aligned} P_{\mathsf Z_0 \mid \bigotimes _{\ell =1}^m \mathsf Z_\ell } = \varLambda \Big ( \beta _0 + \sum _{\ell =1}^m \beta _\ell \mathsf Z_\ell \Big ), \end{aligned}$$
which is precisely the logistic regression model
$$\begin{aligned} \mathrm {logit} P_{\mathsf Z_0 \mid \bigotimes _{\ell =1}^m \mathsf Z_\ell } = \beta _0 + \sum _{\ell =1}^m \beta _\ell \mathsf Z_\ell . \end{aligned}$$
(3.8)
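The equivalence of the log-linear conditional, Eq. (3.7), and the logistic form, Eq. (3.8), can be checked numerically. The following is a sketch assuming \(0/1\)-coded dichotomous predictors and hypothetical parameter values \(\phi \):

```python
import numpy as np

rng = np.random.default_rng(0)

m = 3                                # number of dichotomous predictors Z_1, ..., Z_m
phi = rng.normal(size=2)             # main effects phi_i of Z_0, i = 0, 1
phi_int = rng.normal(size=(2, m))    # interaction terms phi_{i, k_ell}, active when Z_ell = 1

def conditional_via_joint(z):
    """Eq. (3.7): P(Z_0 = 1 | z) from the (unnormalized) log-linear joint."""
    # The predictor main effects cancel in the ratio, so they are omitted.
    log_joint = phi + phi_int @ z    # log P(i, z) up to a common additive constant
    joint = np.exp(log_joint)
    return joint[1] / joint.sum()

def conditional_via_logit(z):
    """Eq. (3.8): logistic function of a linear predictor in z."""
    beta0 = phi[1] - phi[0]
    beta = phi_int[1] - phi_int[0]
    return 1.0 / (1.0 + np.exp(-(beta0 + beta @ z)))

# Both expressions agree on every 0/1 configuration of the predictors.
for z in np.ndindex(2, 2, 2):
    z = np.array(z, dtype=float)
    assert np.isclose(conditional_via_joint(z), conditional_via_logit(z))
```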
It should be noted that additional product terms in the joint probability
\(P_{\bigotimes _{\ell =0}^m \mathsf Z_\ell }\) on the right-hand side of Eq. (3.7) of the form
\(\bigotimes _{\ell =1}^k \bigotimes _{\ell _i \in C_{\ell }^k} \mathsf Z_{\ell _i}\) including
\(\mathsf Z_\ell , \ell =1,\ldots ,m\), only, i.e., not including
\(\mathsf Z_0\), would not affect the form of the conditional probability, Eq. (3.8). Additional product terms of the form
\(\mathsf Z_0 \otimes \bigotimes _{\ell =1}^k \bigotimes _{\ell _i \in C_{\ell }^k} \mathsf Z_{\ell _i}\), i.e., including
\(\mathsf Z_0\), result in a logistic regression model with interaction terms, Eq. (3.2).
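The effect of an additional product term including \(\mathsf Z_0\) can likewise be checked numerically. The following sketch uses \(m = 2\) and a single hypothetical term \(\mathsf Z_0 \otimes \mathsf Z_1 \otimes \mathsf Z_2\) with coefficients \(\gamma _i\); its contribution appears in the logit as the interaction coefficient \(\gamma _1 - \gamma _0\):

```python
import numpy as np

rng = np.random.default_rng(1)

phi = rng.normal(size=2)           # main effects phi_i of Z_0
phi_int = rng.normal(size=(2, 2))  # pairwise terms phi_{i, k_ell}, ell = 1, 2
gamma = rng.normal(size=2)         # hypothetical product term gamma_i * z1 * z2 including Z_0

def conditional(z1, z2):
    """P(Z_0 = 1 | z1, z2) for a log-linear joint with a Z_0 x Z_1 x Z_2 term."""
    log_joint = phi + phi_int @ np.array([z1, z2]) + gamma * z1 * z2
    joint = np.exp(log_joint)
    return joint[1] / joint.sum()

def logit(p):
    return np.log(p / (1.0 - p))

# The conditional is logistic with an interaction term beta12 * z1 * z2.
beta0 = phi[1] - phi[0]
beta = phi_int[1] - phi_int[0]
beta12 = gamma[1] - gamma[0]
for z1 in (0.0, 1.0):
    for z2 in (0.0, 1.0):
        assert np.isclose(logit(conditional(z1, z2)),
                          beta0 + beta[0] * z1 + beta[1] * z2 + beta12 * z1 * z2)
```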
Ordinary logistic regression is optimum if the joint probability of the (dichotomous) target variable and the predictor variables is of log-linear form and all predictor variables are jointly conditionally independent given the target variable; in particular, it is optimum if the predictor variables are categorical and jointly conditionally independent given the target variable (Schaeben 2014a). Logistic regression with interaction terms is optimum if the joint probability is of log-linear form and the interaction terms correspond to the lack of conditional independence given the target variable; for categorical predictor variables, interaction terms can compensate exactly for any lack of conditional independence (Schaeben 2014a).