
2018 | Original Paper | Book Chapter

8. Hypothesis Testing for High-Dimensional Data

Written by: Wei Biao Wu, Zhipeng Lou, Yuefeng Han

Published in: Handbook of Big Data Analytics

Publisher: Springer International Publishing


Abstract

We present a systematic theory of tests for the means of high-dimensional data. Our testing procedure is based on an invariance principle that approximates the distributions of functionals of non-Gaussian vectors by those of Gaussian ones. Unlike the widely used Bonferroni approach, our procedure is dependence-adjusted and has asymptotically correct size and power. To obtain cutoff values for the test, we propose a half-sampling method that avoids estimating the underlying covariance matrix of the random vectors. Extensive simulations show that the half-sampling method performs very well.
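The abstract does not spell out the test statistic or the details of the half-sampling step, so the following is only a rough sketch under stated assumptions, not the chapter's procedure. It assumes a max-type statistic for the null hypothesis that the mean vector is zero, and it reads "half-sampling" as splitting the sample into two random halves of equal size and taking the difference of the half-sums, which cancels the unknown mean and needs no covariance estimate. The function names `half_sampling_cutoff` and `max_mean_test` are hypothetical.

```python
import numpy as np

def half_sampling_cutoff(X, alpha=0.05, n_draws=500, rng=None):
    """Approximate the (1 - alpha) cutoff of max_j |sqrt(n) * mean_j| under the null.

    Assumption: each draw splits the rows into two random halves and uses
    (sum over one half - sum over the other half) / sqrt(n) as a proxy for
    the centered, scaled sample mean. No covariance matrix is estimated.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    if n % 2 != 0:
        # Drop one row so that both halves have equal size.
        X = X[:-1]
        n -= 1
    signs = np.ones(n)
    signs[: n // 2] = -1.0  # balanced: half the rows get -1, half get +1
    stats = np.empty(n_draws)
    for b in range(n_draws):
        eps = rng.permutation(signs)  # random assignment of rows to halves
        stats[b] = np.max(np.abs(eps @ X)) / np.sqrt(n)
    return np.quantile(stats, 1.0 - alpha)

def max_mean_test(X, alpha=0.05, n_draws=500, rng=None):
    """Test H0: mean vector = 0 with a max-type statistic (an assumed choice)."""
    n = X.shape[0]
    stat = np.sqrt(n) * np.max(np.abs(X.mean(axis=0)))
    cutoff = half_sampling_cutoff(X, alpha, n_draws, rng)
    return stat, cutoff, stat > cutoff

# Usage: n = 40 observations of dimension p = 100, with a shifted mean.
data_rng = np.random.default_rng(0)
X = data_rng.standard_normal((40, 100)) + 0.5
stat, cutoff, reject = max_mean_test(X, rng=1)
print(f"statistic={stat:.3f}, cutoff={cutoff:.3f}, reject={reject}")
```

Because the signs are balanced, adding a constant vector to every observation leaves the half-sample differences essentially unchanged, so the cutoff in this sketch does not depend on the true mean. Because the coordinates of each observation stay together within a half, the dependence between coordinates is carried into the cutoff, which is one way to read "dependence-adjusted" in contrast with a Bonferroni bound.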

Metadata
Title
Hypothesis Testing for High-Dimensional Data
Written by
Wei Biao Wu
Zhipeng Lou
Yuefeng Han
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-18284-1_8
