A sufficient criterion for control of some generalized error rates in multiple testing
Introduction
Consider the problem of testing hypotheses simultaneously. A classical approach to dealing with the multiplicity problem is to control the familywise error rate (FWER), i.e. the probability of one or more false rejections. However, when the number of hypotheses is large, the ability to reject false hypotheses is small. Therefore, alternative Type I error rates have been proposed that relax control of the FWER in order to reject more false hypotheses (for a survey, see, e.g., Dudoit and van der Laan, 2007 and Guo et al., 2014).
One such generalized error rate is the $k$-FWER, the probability of $k$ or more false rejections for some integer $k \ge 1$, as considered by Hommel and Hoffmann (1987) and Lehmann and Romano (2005). For $k = 1$ the usual FWER is obtained. Alternatively, it may be desirable to control the false discovery proportion (FDP), i.e. the proportion of false rejections amongst all rejected hypotheses. If $R$ denotes the number of rejected hypotheses and $V$ the number of falsely rejected hypotheses, set $\mathrm{FDP} = V/R$ (and equal to 0 if there are no rejections). For the FDP, mainly two types of control have been considered in the literature. One aim might be to control the tail probability $P(\mathrm{FDP} > \gamma)$ for some user-specified value $\gamma \in [0, 1)$. This error measure has been termed $\gamma$-FDP by Lehmann and Romano (2005) and tail probability for the proportion of false positives by Dudoit and van der Laan (2007). The false discovery rate (FDR) requires control in the mean, i.e. of $\mathrm{FDR} = E(\mathrm{FDP})$. As Romano and Wolf (2010) point out, probabilistic control of the FDP allows one to make useful statements about the realized FDP in applications, whereas this is not possible when controlling the FDR. Guo et al. (2014) generalize the notion of $\gamma$-FDP to the situation where a small number of false rejections are acceptable: they introduce $k\text{-FDP} = V/R$ if $V \ge k$ and 0 else, and present results for controlling $\gamma\text{-}k\mathrm{FDP} = P(k\text{-FDP} > \gamma)$ in various dependence settings. Similarly to the $k$-FWER, a generalized false discovery rate is defined by $k\text{-FDR} = E(k\text{-FDP})$, see e.g. Sarkar (2007).
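To make the error measures above concrete, the following is a minimal Python sketch. It writes V for the number of false rejections and R for the total number of rejections, with k an integer threshold and gamma a tail level. All function names are illustrative and not taken from the paper, and the empirical rates are plain Monte Carlo averages that assume the supplied (V, R) pairs come from independent replications.

```python
from typing import Dict, Iterable, Tuple


def fdp(v: int, r: int) -> float:
    """Realized false discovery proportion V/R, set to 0 when R = 0."""
    return v / r if r > 0 else 0.0


def k_fdp(v: int, r: int, k: int) -> float:
    """k-FDP: V/R if at least k false rejections occurred, else 0."""
    return fdp(v, r) if v >= k else 0.0


def empirical_error_rates(
    pairs: Iterable[Tuple[int, int]], k: int, gamma: float
) -> Dict[str, float]:
    """Estimate generalized error rates from replicated (V, R) pairs."""
    pairs = list(pairs)
    m = len(pairs)
    return {
        # probability of k or more false rejections
        "k-FWER": sum(v >= k for v, _ in pairs) / m,
        # tail probability of the FDP exceeding gamma
        "gamma-FDP": sum(fdp(v, r) > gamma for v, r in pairs) / m,
        # mean of the FDP
        "FDR": sum(fdp(v, r) for v, r in pairs) / m,
        # tail probability of the k-FDP exceeding gamma
        "gamma-kFDP": sum(k_fdp(v, r, k) > gamma for v, r in pairs) / m,
        # mean of the k-FDP
        "k-FDR": sum(k_fdp(v, r, k) for v, r in pairs) / m,
    }
```

For k = 1 the k-FDP coincides with the FDP, so the generalized rates reduce to the FWER, the tail probability of the FDP, and the FDR, respectively.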
In this paper we focus on multiple testing procedures that are based on marginal $p$-values and are valid for finite sample sizes under no assumptions on the type of dependency of these $p$-values. For and , Romano and Shaikh (2006a, 2006b) and Lehmann and Romano (2005) have made fundamental contributions that have been extended by Guo et al. (2014). For the FDR, Benjamini and Yekutieli (2001) have shown that a rescaled version of the original step-up procedure of Benjamini and Hochberg (1995) controls the FDR under arbitrary dependencies. Guo and Rao (2008) have extended these results and have also given corresponding upper bounds for step-down procedures.
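As an illustration of the step-up and step-down procedures mentioned above, here is a minimal Python sketch. It assumes the standard forms of the critical constants: alpha * i / n for Benjamini and Hochberg (1995), and the same constants divided by the harmonic sum 1 + 1/2 + ... + 1/n for the rescaled version of Benjamini and Yekutieli (2001). The function names are hypothetical and the code is not the paper's implementation.

```python
from typing import List, Sequence


def bh_constants(n: int, alpha: float) -> List[float]:
    """Benjamini-Hochberg critical constants alpha * i / n, i = 1..n."""
    return [alpha * i / n for i in range(1, n + 1)]


def by_constants(n: int, alpha: float) -> List[float]:
    """Rescaled constants: BH constants divided by 1 + 1/2 + ... + 1/n."""
    harmonic = sum(1.0 / j for j in range(1, n + 1))
    return [c / harmonic for c in bh_constants(n, alpha)]


def step_up(pvalues: Sequence[float], constants: Sequence[float]) -> List[int]:
    """Reject the r smallest p-values, r = largest rank i with p_(i) <= c_i."""
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    r = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= constants[rank - 1]:
            r = rank  # keep scanning: a later rank may still qualify
    return sorted(order[:r])


def step_down(pvalues: Sequence[float], constants: Sequence[float]) -> List[int]:
    """Reject while p_(i) <= c_i, stopping at the first rank that fails."""
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    r = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] > constants[rank - 1]:
            break
        r = rank
    return sorted(order[:r])
```

Both functions return the indices of the rejected hypotheses. With the same constants, the step-up procedure always rejects at least as many hypotheses as the step-down procedure, and the rescaled constants are smaller than the unscaled ones, which is the price paid for validity under arbitrary dependence.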
The aim of this paper is two-fold. First, we present a simple sufficient condition for control of and based on matrices that are associated with a specific error rate and direction of stepping. This result follows directly from the work of Lehmann and Romano (2005), Romano and Shaikh (2006a, 2006b) and Guo et al. (2014) (in the sequel abbreviated as (LRSG)). Second, we show how the rescaled controlling procedures considered by (LRSG) can in some cases be improved. In particular, we investigate a linear programming approach which uses the above-mentioned matrices.
Definitions and assumptions
When testing hypotheses $H_1, \dots, H_n$, we assume that corresponding $p$-values $p_1, \dots, p_n$ are available. Following Lehmann and Romano (2005), the only distributional assumption we make in this paper is the following.
Assumption 1 For any true hypothesis $H_i$ we assume that the distribution of the $p$-value $p_i$ is stochastically larger than a uniform random variable, i.e. $P(p_i \le u) \le u$ for all $u \in (0, 1)$.
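A small simulation can illustrate what it means for a null p-value to be stochastically larger than a uniform random variable. The Python sketch below is an illustration only: the one-sided normal test, the function names, and the simulation settings are assumptions chosen for this example and do not come from the paper. It tests the hypothesis mu <= 0 from a single N(mu, 1) observation and estimates the probability that the p-value falls below a level u.

```python
import math
import random


def one_sided_p_value(x: float) -> float:
    """p-value 1 - Phi(x) for testing mu <= 0 from one N(mu, 1) observation."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def empirical_cdf_of_p(mu: float, u: float, reps: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(p-value <= u) when the true mean is mu."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = sum(one_sided_p_value(rng.gauss(mu, 1.0)) <= u for _ in range(reps))
    return hits / reps
```

At the boundary mu = 0 the p-value is exactly uniform, so the estimate should be close to u. For a true mean strictly below 0 the estimate should fall below u, which is the conservative case that the assumption permits.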
Main results
First we state the main result of this paper. As the proof in the Appendix shows, it is actually a direct consequence of several results of (LRSG).
Theorem 1 For and it holds under Assumption 1 for any for the step-up procedure , for the step-up procedure , for the step-down procedure , for the step-down procedure ,
where denotes the set of true null hypotheses
Modified FDP-controlling procedures
We now focus on improving some classical procedures that control the . To be more specific, our goal is to find, for a given initial procedure a new procedure which is at least as good as , i.e. for and if possible strictly better than , i.e. for at least one we have . In addition, we would like the new procedure to be unimprovable in the sense that there exists no with . We now describe a linear programming approach which yields such a
Simulation study
In this section we investigate the power of the different FDP procedures in a simulation study. We consider the rescaled Benjamini–Hochberg step-up procedure, which we abbreviate by FDP-BH-SU, and its modified variant FDP-BH-SU (mod) defined by Definition 2. In the same way we consider the rescaled Romano–Shaikh step-up (FDP-RS-SU), the rescaled Benjamini–Hochberg step-down (FDP-BH-SD), the rescaled Romano–Shaikh step-down (FDP-RS-SD) procedures as well as their modifications. For comparing
Empirical applications
We now compare the performance of the median FDP and FDR approaches from the previous section for some empirical data.
- (A) First, we revisit the data analyzed in Benjamini and Hochberg (1995), consisting of 15 $p$-values from a study on myocardial infarction.
- (B) Westfall and Young (1993) used resampling methods to analyze data from a complex epidemiological survey designed to assess the mental health of urban and rural individuals living in central North Carolina. The data consists of 72 raw $p$-values,
Discussion
In this paper we have used matrix representations to obtain more powerful multiple testing procedures under general dependency. We have illustrated their usefulness through a simulation study and some empirical applications. Since these procedures are valid under arbitrary dependence of the $p$-values, they are universally applicable. However, when information on the dependency structure is available, they may also be quite conservative. Thus, it would be very useful to develop procedures for the
Acknowledgments
The author would like to thank Helmut Finner for pointing out Gordon’s work and two referees and an associate editor for helpful comments and suggestions that improved the content and presentation of the paper.
References (19)
Gordon, A.Y., 2007. Explicit formulas for generalized family-wise error rates and unimprovable step-down multiple testing procedures. J. Statist. Plann. Inference.
Guo, W., Rao, M.B., 2008. On control of the false discovery rate under no assumption of dependency. J. Statist. Plann. Inference.
Benjamini, Y., Hochberg, Y., 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B.
Benjamini, Y., Yekutieli, D., 2001. The control of the false discovery rate in multiple testing under dependency. Ann. Statist.
Dantzig, G.B., 1963. Linear Programming and Extensions.
Delattre, S., Roquain, E., 2013. On k-FWE-based critical values for controlling the false discovery proportion under...
Dudoit, S., van der Laan, M.J., 2007. Multiple Testing Procedures and Applications to Genomics.
Guo, W., et al., 2014. Further results on controlling the false discovery proportion. Ann. Statist.
Hommel, G., Bretz, F., 2008. Aesthetics and power considerations in multiple testing – a contradiction? Biom. J.