Testing the equivalence of two groups is a familiar problem in statistics. Typically we are interested in testing a null hypothesis that two population means are equal versus an alternative that the means are not equal (for a two-sided test) or that the mean for an experimental treatment is greater than that for a standard treatment (for a one-sided test). We compute a test statistic from the observed data, and reject the null hypothesis if the test statistic exceeds a particular critical value. The significance level of the test is the probability that we reject the null hypothesis when the null hypothesis is in fact true. A widely known test is the two-sample Student's t-test for continuous observations, which requires the assumption that the observations are normally distributed. If the normal distribution assumption is in doubt, a rank-based test called the Mann-Whitney test may be used, which gives valid test results without making parametric assumptions. With survival data, if we are willing to assume that the data follow a particular parametric distribution, we can use likelihood theory to construct a test for equivalence of the two distributions, as we shall see in Chap. 10. However, as we have discussed in the previous chapters, survival data from biomedical experiments or clinical trials generally do not lend themselves to analysis by parametric methods. Thus, we shall construct nonparametric tests of equivalence of two survival functions,
\(H_{0}: S_{1}(t) = S_{0}(t)\). Typically, \(S_{1}\) and \(S_{0}\) will represent the survival distributions for, respectively, an experimental and a control therapy. Now, a statistical hypothesis test (in the classical hypothesis testing framework) also requires us to specify an alternative hypothesis, and one might at first try to specify a one-sided alternative
\(H_{A}: S_{1}(t) > S_{0}(t)\) or two-sided alternative \(H_{A}: S_{1}(t) \neq S_{0}(t)\). Unfortunately, things aren’t so simple in survival analysis, since the alternative can take a wide range of forms. What if the survival distributions are similar for some values of
\(t\) and differ for others? What if the survival distributions cross? How do we want our test statistic to behave under these different scenarios? One solution is to consider what is called a Lehmann alternative, \(H_{A}: S_{1}(t) = \left [S_{0}(t)\right ]^{\psi }\). Equivalently, we can view Lehmann alternatives in terms of proportional hazards as \(h_{1}(t) = \psi h_{0}(t)\). Either way we would construct a one-sided test as
\(H_{0}: \psi = 1\) versus \(H_{A}: \psi < 1\), so that under the alternative hypothesis \(S_{1}(t)\) will be uniformly higher than \(S_{0}(t)\) and \(h_{1}(t)\) uniformly lower than \(h_{0}(t)\) (i.e., subjects in Group 1 will have longer survival times than subjects in Group 0). As we shall see, we can construct a test statistic using the ranks of the survival times. While these rank-based tests are similar to the Mann-Whitney test, the presence of censoring complicates the assignment of ranks. Thus, we initially take an alternative approach to developing this test, where we view the numbers of failures and the numbers at risk at each distinct failure time as a two-by-two table. That is, for each failure time
\(t_{i}\) we may construct a two-by-two table showing the numbers at risk (\(n_{0i}\) and \(n_{1i}\) for the control and treatment arms, respectively) and the number of failures (\(d_{0i}\) and \(d_{1i}\), respectively). Also shown in the table are the “marginals”, that is, the row and column sums. For example, we have \(d_{i} = d_{0i} + d_{1i}\) and \(n_{i} = n_{0i} + n_{1i}\). We first order the distinct failure times. Then for the \(i\)th failure time, we have the following table: