Elsevier

Statistics & Probability Letters

Volume 92, September 2014, Pages 114-120
A sufficient criterion for control of some generalized error rates in multiple testing

https://doi.org/10.1016/j.spl.2014.05.009

Abstract

We present a matrix-based criterion for controlling some generalized error rates for arbitrarily dependent p-values. This criterion can be used to obtain improved multiple testing procedures whose performance is evaluated in a simulation study and for some empirical data.

Introduction

Consider the problem of testing $n$ hypotheses $H_1, \dots, H_n$ simultaneously. A classical approach to deal with the multiplicity problem is to control the familywise error rate (FWER), i.e. the probability of one or more false rejections. However, when the number $n$ of hypotheses is large, the ability to reject false hypotheses is small. Therefore, alternative Type I error rates have been proposed that relax control of FWER in order to reject more false hypotheses (for a survey, see e.g. Dudoit and van der Laan, 2007 and Guo et al., 2014).

One such generalized error rate is the kFWER, the probability of $k$ or more false rejections for some integer $k \ge 1$, as considered by Hommel and Hoffmann (1987) and Lehmann and Romano (2005). For $k = 1$ the usual FWER is obtained. Alternatively, it may be desirable to control the false discovery proportion (FDP), i.e. the proportion of false rejections amongst all rejected hypotheses. If $R$ denotes the number of rejected hypotheses and $V$ the number of falsely rejected hypotheses, set $\mathrm{FDP} = V/R$ (and equal to 0 if there are no rejections). For the FDP, mainly two types of control have been considered in the literature. One aim might be to control the tail probability $P(\mathrm{FDP} > \gamma)$ for some user-specified value $\gamma \in [0,1)$. This error measure has been termed γ-FDP by Lehmann and Romano (2005) and tail probability for the proportion of false positives by Dudoit and van der Laan (2007). The false discovery rate (FDR) requires control in the mean, i.e. $\mathrm{FDR} = E(\mathrm{FDP}) \le \gamma$. As Romano and Wolf (2010) point out, probabilistic control of the FDP allows one to make useful statements about the realized FDP in applications, whereas this is not possible when controlling the FDR. Guo et al. (2014) generalize the notion of γ-FDP to the situation where a small number $k$ of false rejections is acceptable: they introduce $k\mathrm{FDP} = V/R$ if $V \ge k$ and 0 otherwise, and present results for controlling $\gamma\text{-}k\mathrm{FDP} = P(k\mathrm{FDP} > \gamma)$ in various dependence settings. Similarly to the FDR, a generalized false discovery rate is defined by $k\mathrm{FDR} = E(k\mathrm{FDP})$, see e.g. Sarkar (2007).
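As a small illustration of the definitions above, the following sketch computes the realized FDP and kFDP from the counts V and R, and flags the event whose probability is the kFWER. The function names and the toy counts are assumptions made for this example only; in practice V is unobservable, so this is only meant to make the definitions concrete.

```python
def fdp(v, r):
    """False discovery proportion V/R, set to 0 when nothing is rejected."""
    return v / r if r > 0 else 0.0


def k_fdp(v, r, k):
    """kFDP in the sense of Guo et al. (2014): V/R if V >= k, and 0 otherwise."""
    return fdp(v, r) if v >= k else 0.0


def k_fwer_event(v, k):
    """True when k or more false rejections occurred.

    The kFWER is the probability of this event; k = 1 gives the usual FWER.
    """
    return v >= k


# Toy outcome (made-up numbers): 10 rejections, 2 of them false.
v, r = 2, 10
print(fdp(v, r))              # realized FDP
print(k_fdp(v, r, k=3))       # 0.0, because fewer than 3 rejections are false
print(k_fwer_event(v, k=1))   # True, at least one false rejection
```

The γ-FDP and γ-kFDP are then the probabilities that these realized proportions exceed γ, while FDR and kFDR are their expectations.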

In this paper we focus on multiple testing procedures that are based on marginal p-values and are valid for finite sample sizes under no assumptions on the type of dependency of these p-values. For kFWER and γ-FDP, Romano and Shaikh (2006a, 2006b) and Lehmann and Romano (2005) have made fundamental contributions that have been extended by Guo et al. (2014). For FDR, Benjamini and Yekutieli (2001) have shown that a rescaled version of the original step-up procedure of Benjamini and Hochberg (1995) controls the FDR under arbitrary dependencies. Guo and Rao (2008) have extended these results and have also given corresponding upper bounds for step-down FDR procedures.
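To make the idea of rescaling concrete, the sketch below computes the Benjamini–Hochberg critical constants $c_i = i\alpha/n$ and the Benjamini–Yekutieli version, which divides them by $\sum_{j=1}^{n} 1/j$. It covers only this well-known FDR case; the rescaled γ-FDP procedures of the works cited above use different constants that are not reproduced here. The function names and the values of n and alpha are assumptions chosen for illustration.

```python
def bh_constants(n, alpha):
    """Benjamini-Hochberg critical constants c_i = i * alpha / n for i = 1..n."""
    return [i * alpha / n for i in range(1, n + 1)]


def by_constants(n, alpha):
    """Benjamini-Yekutieli constants: BH constants divided by sum_{j=1}^n 1/j."""
    harmonic = sum(1.0 / j for j in range(1, n + 1))
    return [c / harmonic for c in bh_constants(n, alpha)]


# Illustrative values only.
n, alpha = 5, 0.05
bh = bh_constants(n, alpha)
by = by_constants(n, alpha)

for i, (a, b) in enumerate(zip(bh, by), start=1):
    print(f"i={i}: BH {a:.4f}  BY {b:.4f}")
```

The rescaled constants are smaller for every i, which is the price paid for validity under arbitrary dependence.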

The aim of this paper is two-fold. First, we present a simple sufficient condition for control of kFWER and γ-kFDP based on matrices that are associated with a specific error rate and direction of stepping. This result follows directly from the work of Lehmann and Romano (2005), Romano and Shaikh (2006a, 2006b) and Guo et al. (2014) (in the sequel abbreviated as (LRSG)). Second, we show how the rescaled γ-FDP controlling procedures considered by (LRSG) can in some cases be improved. In particular, we investigate a linear programming approach which uses the above-mentioned matrices.

Definitions and assumptions

When testing hypotheses $H_1, \dots, H_n$, we assume that corresponding p-values $PV_1, \dots, PV_n$ are available. Following Lehmann and Romano (2005), the only distributional assumption we make in this paper is the following.

Assumption 1

For any true hypothesis $H_i$ we assume that the distribution of the p-value $PV_i$ is stochastically larger than a uniform random variable, i.e. $P(PV_i \le u) \le u$ for all $u \in (0,1)$.

Let $PV_{(1)} \le \dots \le PV_{(n)}$ denote the ordered p-values and $H_{(1)}, \dots, H_{(n)}$ the corresponding (null-)hypotheses. Let $C = \{c \in \mathbb{R}_+^n \mid c_1 \le \dots \le c_n\}$
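A minimal sketch of how ordered p-values and a non-decreasing vector of critical constants are usually combined, using the standard textbook definitions of step-down and step-up procedures. The function names, p-values and constants below are made up for illustration and are not taken from the paper.

```python
def step_down(pvalues, c):
    """Number of rejections of the step-down procedure with constants c.

    Starting from the smallest p-value, reject H_(i) while PV_(i) <= c_i and
    stop at the first index where the comparison fails.
    """
    rejected = 0
    for p, ci in zip(sorted(pvalues), c):
        if p > ci:
            break
        rejected += 1
    return rejected


def step_up(pvalues, c):
    """Number of rejections of the step-up procedure with constants c.

    Reject H_(1), ..., H_(r), where r is the largest index with PV_(r) <= c_r
    (r = 0 if no such index exists).
    """
    rejected = 0
    for i, (p, ci) in enumerate(zip(sorted(pvalues), c), start=1):
        if p <= ci:
            rejected = i
    return rejected


# Made-up p-values and non-decreasing critical constants.
pvals = [0.001, 0.025, 0.03, 0.5, 0.04]
c = [0.01, 0.02, 0.03, 0.04, 0.05]

print(step_down(pvals, c))  # 1: stops at the second ordered p-value
print(step_up(pvals, c))    # 4: the fourth ordered p-value still meets its constant
```

With the same constants the step-up procedure always rejects at least as many hypotheses as the step-down procedure, which is why error bounds generally depend on the direction of stepping.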

Main results

First we state the main result of this paper. As the proof in the Appendix shows, it is a direct consequence of several results of (LRSG).

Theorem 1

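For $k \in \{1, \dots, n\}$ and $\gamma \in [0,1)$ it holds under Assumption 1 for any $c \in C$

  • (a)

    for the step-up procedure $k\mathrm{FWER}(c) \le (A^{k\mathrm{FWER}\text{-}\mathrm{SU}} c)_{|I|}$,

  • (b)

    for the step-up procedure $\gamma\text{-}k\mathrm{FDP}(c) \le (A^{\gamma\text{-}k\mathrm{FDP}\text{-}\mathrm{SU}} c)_{|I|}$,

  • (c)

    for the step-down procedure $k\mathrm{FWER}(c) \le (A^{k\mathrm{FWER}\text{-}\mathrm{SD}} c)_{|I|}$,

  • (d)

    for the step-down procedure $\gamma\text{-}k\mathrm{FDP}(c) \le (A^{\gamma\text{-}k\mathrm{FDP}\text{-}\mathrm{SD}} c)_{|I|}$,

where $I \subseteq \{1, \dots, n\}$ denotes the set of true null hypotheses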

Modified FDP-controlling procedures

We now focus on improving some classical procedures that control the γ-FDP. To be more specific, our goal is to find, for a given initial procedure $c \in F(A)$, a new procedure $\xi \in F(A)$ which is at least as good as $c$, i.e. $\xi_i \ge c_i$ for $i = 1, \dots, n$, and if possible strictly better than $c$, i.e. for at least one $i$ we have $\xi_i > c_i$. In addition, we would like the new procedure $\xi$ to be unimprovable in the sense that there exists no $\xi' \in F(A)$ with $\xi' > \xi$. We now describe a linear programming approach which yields such a

Simulation study

In this section we investigate the power of the different FDP procedures in a simulation study. We consider the rescaled Benjamini–Hochberg step-up procedure, which we abbreviate by FDP-BH-SU, and its modified variant FDP-BH-SU (mod) defined by Definition 2. In the same way we consider the rescaled Romano–Shaikh step-up (FDP-RS-SU), the rescaled Benjamini–Hochberg step-down (FDP-BH-SD), the rescaled Romano–Shaikh step-down (FDP-RS-SD) procedures as well as their modifications. For comparing

Empirical applications

We now compare the performance of the median FDP and FDR approaches from the previous section for some empirical data.

  • (A)

    First, we revisit the data analyzed in Benjamini and Hochberg (1995), consisting of 15 p-values from a study on myocardial infarction.

  • (B)

    Westfall and Young (1993) used resampling methods to analyze data from a complex epidemiological survey designed to assess the mental health of urban and rural individuals living in central North Carolina. The data consists of 72 raw p-values,

Discussion

In this paper we have used matrix representations to obtain more powerful multiple testing procedures under general dependency. We have illustrated their usefulness through a simulation study and some empirical applications. Since these procedures are valid under arbitrary dependence of the p-values, they are universally applicable. However, when information on the dependency structure is available, they may also be quite conservative. Thus, it would be very useful to develop procedures for the

Acknowledgments

The author would like to thank Helmut Finner for pointing out Gordon’s work and two referees and an associate editor for helpful comments and suggestions that improved the content and presentation of the paper.

References (19)

  • A.Y. Gordon, Explicit formulas for generalized family-wise error rates and unimprovable step-down multiple testing procedures, J. Statist. Plann. Inference (2007)
  • W. Guo et al., On control of the false discovery rate under no assumption of dependency, J. Statist. Plann. Inference (2008)
  • Y. Benjamini et al., Controlling the false discovery rate: a practical and powerful approach to multiple testing, J. R. Stat. Soc. Ser. B (1995)
  • Y. Benjamini et al., The control of the false discovery rate in multiple testing under dependency, Ann. Statist. (2001)
  • G. Dantzig, Linear Programming and Extensions (1963)
  • Delattre, S., Roquain, E., 2013. On k-FWE-based critical values for controlling the false discovery proportion under...
  • S. Dudoit et al., Multiple Testing Procedures and Applications to Genomics (2007)
  • W. Guo et al., Further results on controlling the false discovery proportion, Ann. Statist. (2014)
  • G. Hommel et al., Aesthetics and power considerations in multiple testing – a contradiction?, Biom. J. (2008)
There are more references available in the full text version of this article.
