2019 | Original Paper | Book chapter

MOSCA: An Automated Pipeline for Integrated Metagenomics and Metatranscriptomics Data Analysis

Written by: João Carlos Sequeira, Miguel Rocha, Maria Madalena Alves, Andreia Ferreira Salvador

Published in: Practical Applications of Computational Biology and Bioinformatics, 12th International Conference

Publisher: Springer International Publishing


Abstract

Metagenomics (MG) and metatranscriptomics (MT) approaches open new perspectives on the interpretation of biological systems composed of complex microbial communities. Handling large sequencing datasets, extracting the desired information, and interpreting the results are major challenges in meta-omics studies. Several bioinformatics pipelines exist for MG data analysis, and fewer for MT. To date, none performs a complete analysis that integrates both MG and MT data, including the assembly of reads into contigs, functional and taxonomic annotation of identified genes, differential gene expression analysis, and the comparison of multiple samples. Here, we present Meta-Omics Software for Community Analysis (MOSCA), which was designed for this purpose. It integrates RNA-Seq analysis with Whole Genome Sequencing as reference. Raw sequencing reads undergo preprocessing for quality trimming and rRNA removal and are then assembled into contigs, which are subsequently annotated against a reference database. MOSCA performs differential gene expression analysis, provides graphical visualization of the results, and supports the comparison of multiple samples. The pipeline was validated, and its reproducibility assessed, using simulated MG and MT datasets.
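The order of the steps named in the abstract can be pictured as a chain of stages, each consuming what an earlier stage produced. The following is only a minimal sketch under stated assumptions: the `Stage` class, the `plan` function, and all artifact labels (`raw_reads`, `clean_reads`, and so on) are hypothetical names chosen for illustration, not MOSCA's actual interface, and the assumption that differential expression consumes both the cleaned reads and the annotated genes is an interpretation of the abstract rather than something it states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One pipeline step: the artifacts it needs and the one it produces."""
    name: str
    needs: tuple
    makes: str


# Stage order follows the abstract; artifact labels are hypothetical.
STAGES = [
    Stage("preprocessing", ("raw_reads",), "clean_reads"),
    Stage("assembly", ("clean_reads",), "contigs"),
    Stage("annotation", ("contigs",), "annotated_genes"),
    Stage("differential_expression",
          ("clean_reads", "annotated_genes"), "expression_table"),
    Stage("visualization", ("expression_table",), "report"),
]


def plan(stages, available):
    """Return stage names in run order; fail if a required input is missing."""
    have = set(available)
    order = []
    for stage in stages:
        missing = [artifact for artifact in stage.needs if artifact not in have]
        if missing:
            raise ValueError(f"{stage.name} is missing inputs: {missing}")
        have.add(stage.makes)  # later stages may depend on this output
        order.append(stage.name)
    return order
```

Calling `plan(STAGES, ["raw_reads"])` walks the stages in sequence, while calling it with no starting artifacts raises a `ValueError` at the first stage, which is the kind of early check a pipeline of this shape might perform before launching long-running tools.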

Metadata
Title
MOSCA: An Automated Pipeline for Integrated Metagenomics and Metatranscriptomics Data Analysis
Written by
João Carlos Sequeira
Miguel Rocha
Maria Madalena Alves
Andreia Ferreira Salvador
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-319-98702-6_22