A review of statistical methods for expression quantitative trait loci mapping

Mammalian Genome

Abstract

With high-throughput technologies now widely available, investigators can easily measure thousands of phenotypes for quantitative trait loci (QTL) mapping. Microarray measurements are particularly amenable to QTL mapping, as evidenced by a number of recent studies demonstrating utility across a broad range of biological endeavors. The early success stories have impelled a rapid increase in both the number and complexity of expression QTL (eQTL) experiments. Consequently, there is a need to consider the statistical principles involved in the design and analysis of these experiments and the methods currently being used. In this article we review these principles and methods and discuss the open questions most likely to yield significant progress toward increasing the amount of meaningful information obtained from eQTL mapping experiments.

Acknowledgments

The authors thank Alan Attie, Meng Chen, Michael Newton, and Brian Yandell for useful discussions and two anonymous reviewers for comments that improved the manuscript. They also thank Stephanie Ciatti for extra help at home.

Author information

Corresponding author

Correspondence to Christina Kendziorski.

About this article

Cite this article

Kendziorski, C., Wang, P. A review of statistical methods for expression quantitative trait loci mapping. Mamm Genome 17, 509–517 (2006). https://doi.org/10.1007/s00335-005-0189-6

  • DOI: https://doi.org/10.1007/s00335-005-0189-6
