2015 | OriginalPaper | Book chapter

Integrated Analysis of Transcriptomic and Proteomic Datasets Reveals Information on Protein Expressivity and Factors Affecting Translational Efficiency

Authors: Jiangxin Wang, Gang Wu, Lei Chen, Weiwen Zhang

Publisher: Springer New York

Abstract

Integrated analysis of large-scale transcriptomic and proteomic data can provide important insights into the metabolic mechanisms underlying complex biological systems. In this chapter, we present methods that address two issues in integrated transcriptomic and proteomic analysis. First, proteomic datasets are often incomplete, and integrated analysis of partial proteomic data may introduce significant bias. To address this issue, we describe a zero-inflated Poisson (ZIP)-based model that captures the complicated relationship between protein abundance and mRNA expression level, and then apply it to predict the abundance of proteins that were not experimentally detected. The ZIP model accounts for undetected proteins by assuming a probability mass at zero that represents expressed proteins that went undetected owing to technical limitations. The validity of the model is demonstrated using biological information on operons, regulons, and pathways. Second, weak correlation between transcriptomic and proteomic datasets is often due to biological factors affecting translational processes. To quantify the effects of these factors, we describe a multiple regression-based statistical framework that examines the effects of various translational efficiency-related sequence features on mRNA–protein correlation. Using datasets from the sulfate-reducing bacterium Desulfovibrio vulgaris, the analysis shows that translation-related sequence features can account for 15.2–26.2 % of the total variation in the correlation between transcriptomic and proteomic datasets, and it also reveals the relative importance of the various features in the translation process.
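To make the "probability mass at zero" idea concrete, the following is a minimal sketch of a zero-inflated Poisson distribution in Python. It is not the chapter's implementation. The function names, the log link for the Poisson rate, the logit link for the zero-inflation probability, and the use of a single mRNA covariate with coefficients `b0`, `b1`, `g0`, `g1` are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson.

    lam: Poisson rate (assumed here to stand for the protein abundance signal).
    pi:  extra probability mass at zero (assumed here to stand for the chance
         that an expressed protein goes undetected).
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from two sources: the extra mass pi and the Poisson itself.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_mean(lam, pi):
    """Expected value of the zero-inflated Poisson."""
    return (1.0 - pi) * lam


def zip_params_from_mrna(mrna, b0, b1, g0, g1):
    """Hypothetical regression form: map an mRNA level to (lam, pi).

    Assumes log(lam) = b0 + b1 * mrna and logit(pi) = g0 + g1 * mrna.
    The coefficients are placeholders, not fitted values.
    """
    lam = math.exp(b0 + b1 * mrna)
    pi = 1.0 / (1.0 + math.exp(-(g0 + g1 * mrna)))
    return lam, pi


if __name__ == "__main__":
    lam, pi = zip_params_from_mrna(mrna=1.5, b0=0.2, b1=0.8, g0=0.5, g1=-1.0)
    print("rate:", lam, "zero-inflation:", pi)
    print("P(Y = 0):", zip_pmf(0, lam, pi))
    print("mean:", zip_mean(lam, pi))
```

In this sketch, a higher `pi` shifts probability toward zero counts without changing the underlying rate `lam`, which is how a model of this kind can treat an undetected protein as possibly expressed rather than absent. Which quantity the chapter reports as the predicted abundance is not stated in the abstract.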
Metadata
Title
Integrated Analysis of Transcriptomic and Proteomic Datasets Reveals Information on Protein Expressivity and Factors Affecting Translational Efficiency
Authors
Jiangxin Wang
Gang Wu
Lei Chen
Weiwen Zhang
Copyright year
2015
Publisher
Springer New York
DOI
https://doi.org/10.1007/7651_2015_242