2020 | Original Paper | Book Chapter

Functional Evolutionary Modeling Exposes Overlooked Protein-Coding Genes Involved in Cancer

Written by: Nadav Brandes, Nathan Linial, Michal Linial

Published in: Bioinformatics Research and Applications

Publisher: Springer International Publishing

Abstract

Numerous computational methods have been developed to screen the genome for candidate driver genes based on genomic data of somatic mutations in tumors. Compiling a catalog of cancer genes has profound implications for the understanding and treatment of the disease. Existing methods make many implicit and explicit assumptions about the distribution of random mutations. We present FABRIC, a new framework for quantifying the evolutionary selection of genes by assessing the functional effects of mutations on protein-coding genes using a pre-trained machine-learning model. The framework compares the estimated effects of observed genetic variations against those of all possible single-nucleotide mutations in the coding human genome. Compared to existing methods, FABRIC makes minimal assumptions about the distribution of random mutations. To demonstrate its wide applicability, we applied FABRIC to both naturally occurring human variants and somatic mutations in cancer. In the context of cancer, ~3 million somatic mutations were extracted from over 10,000 cancerous human samples. Of the entire human proteome, 593 protein-coding genes show a statistically significant bias towards harmful mutations. These genes, discovered without any prior knowledge, show an overwhelming overlap with contemporary cancer gene catalogs. Notably, the majority of these genes (426) are unlisted in these catalogs, but a substantial fraction of them are supported by the literature. In the context of normal human evolution, we analyzed ~5 million common and rare variants from ~60,000 individuals, discovering 6,288 significant genes. Over 98% of them are dominated by negative selection, supporting the notion of strong purifying selection during the evolution of the healthy human population. We present the FABRIC framework as an open-source project with a simple command-line interface.
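
The comparison described in the abstract, observed mutation effects versus a background of all possible single-nucleotide mutations in a gene, can be illustrated with a toy calculation. The sketch below is not FABRIC's implementation or its command-line interface. It assumes a hypothetical damage score in [0, 1] per mutation (higher meaning more harmful), treats every possible mutation as equally likely, and uses a normal approximation for the mean of the observed scores. The function names and all numbers are illustrative assumptions.

```python
import math
from statistics import mean, pvariance


def selection_z_score(observed_scores, background_scores):
    """Z-score of the observed mean damage score against a per-gene background.

    observed_scores: hypothetical damage scores of the mutations seen in a gene.
    background_scores: hypothetical damage scores of every possible
        single-nucleotide mutation in that gene (assumed equally likely here).
    """
    n = len(observed_scores)
    if n == 0:
        raise ValueError("need at least one observed mutation")
    mu = mean(background_scores)
    sigma2 = pvariance(background_scores, mu)
    if sigma2 == 0:
        raise ValueError("background scores have zero variance")
    # Under the null, the mean of n draws from the background is roughly
    # normal with mean mu and variance sigma2 / n.
    return (mean(observed_scores) - mu) / math.sqrt(sigma2 / n)


def two_sided_p_value(z):
    """Two-sided p-value of a z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Toy data: background over all possible mutations of one made-up gene.
background = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
harmful_biased = [0.9, 0.8, 0.9, 0.7]   # observed mutations skew harmful
benign_biased = [0.1, 0.2, 0.1, 0.3]    # observed mutations skew benign

z_pos = selection_z_score(harmful_biased, background)
z_neg = selection_z_score(benign_biased, background)
print(f"harmful-biased gene: z = {z_pos:.2f}, p = {two_sided_p_value(z_pos):.4f}")
print(f"benign-biased gene:  z = {z_neg:.2f}, p = {two_sided_p_value(z_neg):.4f}")
```

Under this convention, a positive z-score corresponds to the bias towards harmful mutations reported for the cancer genes, and a negative z-score corresponds to depletion of harmful variants, as expected under purifying selection. In a genome-wide screen the per-gene p-values would also need correction for multiple testing.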

Metadata
Title
Functional Evolutionary Modeling Exposes Overlooked Protein-Coding Genes Involved in Cancer
Written by
Nadav Brandes
Nathan Linial
Michal Linial
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-57821-3_11