2021 | OriginalPaper | Book chapter

Finding the Centre: Compositional Asymmetry in High-Throughput Sequencing Datasets

Written by: Jia R. Wu, Jean M. Macklaim, Briana L. Genge, Gregory B. Gloor

Published in: Advances in Compositional Data Analysis

Publisher: Springer International Publishing

Abstract

High-throughput sequencing datasets comprise millions of reads of genomic data and can be modelled as count compositions. These data are used to derive transcription profiles, microbial diversity, or relative cellular abundance in culture. The data are sparse and high dimensional. Moreover, they are often unbalanced: there is often systematic variation between groups due to the presence or absence of features, and this variation is important to the biological interpretation of the data. The imbalance causes samples in the comparison groups to have different centres, which contributes to false positive and false negative identifications. Here, we extend the centred log-ratio transformation method used to compare differential relative abundance between two groups in a Bayesian compositional context. We demonstrate the pathology in modelled and real unbalanced experimental designs to show how it causes both false negative and false positive inference. We examined four approaches to identifying denominator features and tested them with different proportions of modelled asymmetry; two were relatively robust and are recommended. We recommend the ‘LVHA’ transformation for asymmetric transcriptome datasets, and the ‘IQLR’ method for all other datasets, when using the ALDEx2 tool available on Bioconductor.
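The effect of asymmetry on the centre can be illustrated with a small sketch. The code below is not ALDEx2 and omits the Bayesian sampling the abstract refers to; it only applies a log-ratio transformation to toy counts, once with all features in the denominator and once with a restricted set of mid-variance features, in the spirit of an interquartile-range selection. The function names, the pseudocount of 0.5, the rank-based quartile bounds and the toy data are all assumptions made for illustration.

```python
import math

def clr(counts, denom_idx=None, pseudocount=0.5):
    """Log2-ratio of each feature to the geometric mean of the denominator features.

    With denom_idx=None every feature is used, which is the ordinary
    centred log-ratio. The pseudocount (an assumption) avoids log(0).
    """
    logs = [math.log2(c + pseudocount) for c in counts]
    idx = list(range(len(logs))) if denom_idx is None else list(denom_idx)
    centre = sum(logs[i] for i in idx) / len(idx)
    return [v - centre for v in logs]

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def mid_variance_features(samples):
    """Indices of features whose CLR variance lies between crude quartile bounds.

    The bounds are taken by rank from the sorted variances (no interpolation);
    this is a simplification chosen for the sketch.
    """
    transformed = [clr(s) for s in samples]
    n = len(samples[0])
    variances = [variance([t[j] for t in transformed]) for j in range(n)]
    ordered = sorted(variances)
    lo, hi = ordered[n // 4], ordered[(3 * n) // 4 - 1]
    return [j for j, v in enumerate(variances) if lo <= v <= hi]

def group_difference(group_a, group_b, feature, denom_idx=None):
    """Mean transformed value of one feature in group B minus that in group A."""
    mean_a = sum(clr(s, denom_idx)[feature] for s in group_a) / len(group_a)
    mean_b = sum(clr(s, denom_idx)[feature] for s in group_b) / len(group_b)
    return mean_b - mean_a

# Toy data: features 0-5 keep the same relative abundance in both groups;
# features 6-7 are absent in group A and abundant in group B (asymmetry).
group_a = [[100] * 6 + [0, 0], [200] * 6 + [0, 0]]
group_b = [[100] * 6 + [1000, 1000], [200] * 6 + [2000, 2000]]

denom = mid_variance_features(group_a + group_b)
diff_all = group_difference(group_a, group_b, 0)         # all features in the denominator
diff_sub = group_difference(group_a, group_b, 0, denom)  # restricted denominator

print("denominator features:", denom)
print("feature 0, all-feature centre:", round(diff_all, 3))
print("feature 0, restricted centre:", round(diff_sub, 3))
```

In this toy case the unchanged feature 0 appears to decrease when every feature contributes to the centre, because the two features present only in group B pull the geometric mean upward. Restricting the denominator to the mid-variance features removes that apparent change.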

Metadata
Title
Finding the Centre: Compositional Asymmetry in High-Throughput Sequencing Datasets
Written by
Jia R. Wu
Jean M. Macklaim
Briana L. Genge
Gregory B. Gloor
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-71175-7_17