2018 | OriginalPaper | Chapter

Estimating the Length Distributions of Genomic Micro-satellites from Next Generation Sequencing Data

Authors: Xuan Feng, Huan Hu, Zhongmeng Zhao, Xuanping Zhang, Jiayin Wang

Published in: Bioinformatics and Biomedical Engineering

Publisher: Springer International Publishing

Abstract

Genomic micro-satellites are genomic regions that consist of short, repetitive DNA motifs. In contrast to unique genomic regions, micro-satellites exhibit high intrinsic polymorphism, which derives mainly from variability in length. Length distributions are widely used to represent this polymorphism. Recent studies report that some micro-satellites have significantly different length distributions in tumor tissue samples than in normal samples, which has become a hot topic in cancer genomics. Several state-of-the-art approaches have been proposed to identify length distributions from sequencing data. However, the existing approaches can only handle micro-satellites shorter than one read length, which limits research on long micro-satellite events. In this article, we propose a probabilistic approach, implemented as ELMSI, that estimates the length distributions of micro-satellites longer than one read length. The core algorithm works on a set of mapped reads. It first clusters the reads and applies a k-mer extension algorithm to detect the repeat unit and the breakpoints. It then runs an expectation-maximization algorithm to approximate the true length distributions. According to the experiments, ELMSI can handle micro-satellites with lengths ranging from shorter than one read length up to the 10 kbp scale. In a series of comparison experiments that vary the number of micro-satellite regions, the read length, and the sequencing coverage, ELMSI outperforms MSIsensor in most cases.
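
To make the expectation-maximization step concrete, here is a minimal sketch of how EM can recover a length distribution from noisy per-read length observations. It is not ELMSI's implementation. It assumes (hypothetically) that each read yields one observed repeat length, that the candidate true lengths are known in advance, and that observation error is Gaussian with a fixed `sigma`. The function name `em_length_distribution` and all parameters are illustrative.

```python
import math


def em_length_distribution(observed, candidates, sigma=1.0, n_iter=200, tol=1e-9):
    """Estimate mixture weights over candidate lengths with EM.

    Illustrative assumptions (not taken from the paper): each observation
    is a noisy copy of one candidate length, with Gaussian error of fixed
    standard deviation `sigma`.
    """
    k = len(candidates)
    weights = [1.0 / k] * k  # start from a uniform length distribution

    # Likelihood of every observation under every candidate length.
    # The tiny floor avoids division by zero after floating-point underflow.
    lik = [
        [math.exp(-0.5 * ((x - c) / sigma) ** 2) + 1e-300 for c in candidates]
        for x in observed
    ]

    for _ in range(n_iter):
        # E-step: posterior probability that each observation came from
        # each candidate length, given the current weights.
        resp = []
        for row in lik:
            joint = [w * p for w, p in zip(weights, row)]
            total = sum(joint)
            resp.append([j / total for j in joint])

        # M-step: new weight = average responsibility per candidate.
        new_weights = [sum(r[j] for r in resp) / len(observed) for j in range(k)]

        delta = max(abs(a - b) for a, b in zip(weights, new_weights))
        weights = new_weights
        if delta < tol:
            break

    return dict(zip(candidates, weights))


if __name__ == "__main__":
    # Hypothetical sample: 30% of alleles have length 10, 70% have length 20.
    reads = [10] * 30 + [20] * 70
    print(em_length_distribution(reads, candidates=[10, 15, 20]))
```

In this toy setting the estimated weights concentrate on the two lengths that are actually present. A real pipeline would also need to handle reads that only partially span the micro-satellite, which is the case the abstract targets.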

Literature
1. Miesfeld, R., Krystal, M., Arnheim, N.: A member of a new repeated sequence family which is conserved throughout eucaryotic evolution is found between the human delta and beta globin genes. Nucleic Acids Res. 9(22), 5931–5947 (1981)
2. Ashley, C., Warren, S.: Trinucleotide repeat expansion and human disease. Annu. Rev. Genet. 16(1), 1698–1704 (1995)
3. Ellegren, H.: Microsatellites: simple sequences with complex evolution. Nat. Rev. Genet. 5(6), 435–445 (2004)
4. Niu, B., Ye, K., Zhang, Q., et al.: MSIsensor: microsatellite instability detection using paired tumor-normal sequence data. Bioinformatics 30(7), 1015 (2014)
5. Murphy, K.M., Zhang, S., Geiger, T., Hafez, M.J., Bacher, J., Berg, K.D., Eshleman, J.R.: Comparison of the microsatellite instability analysis system and the bethesda panel for the determination of micro-satellite instability in colorectal cancers. J. Mol. Diagn. 8(3), 305–311 (2006)
6. Lu, C., Xie, M., Wendl, M., et al.: Patterns and functional implications of rare germline variants across 12 cancer types. Nat. Commun. 6(10086), 1–13 (2015)
7. Markowitz, S.D., Bertagnolli, M.M.: Molecular origins of cancer: molecular basis of colorectal cancer. N. Engl. J. Med. 361(25), 2449 (2009)
8. Kim, T.M., Laird, P.W., Park, P.J.: The landscape of microsatellite instability in colorectal and endometrial cancer genomes. Cell 155(4), 858–868 (2013)
9. Woerner, S.M., Kloor, M., Mueller, A., et al.: Microsatellite instability of selective target genes in HNPCC-associated colon adenomas. Oncogene 24(15), 2523–2535 (2005)
10. Ritchard, C.C., Morrissey, C., Kumar, A., et al.: Complex MSH2 and MSH6 mutations in hypermutated microsatellite unstable advanced prostate cancer. Nat. Commun. 5, 4988 (2014)
11. Ribic, C.M., Sargent, D.J., Moore, M.J., et al.: Tumor microsatellite instability status as a predictor of benefit from fluorouracil-based adjuvant chemotherapy for colon cancer. N. Engl. J. Med. 349(3), 247–257 (2003)
12. Pawlik, T.M., Raut, C.P., Rodriguez-Bigas, M.A.: Colorectal carcinogenesis: MSI-H versus MSI-L. Dis. Markers 20(4–5), 199–206 (2004)
13. Salipante, S.J., Scroggins, S.M., Hampel, H.L., et al.: Microsatellite instability detection by next generation sequencing. Clin. Chem. 60(9), 1192–1199 (2014)
14. Mi, N.H., Mcpherson, J.R., Cutcutache, I., et al.: MSIseq: software for assessing microsatellite instability from catalogs of somatic mutations. Sci. Rep. 5, 13321 (2015)
15. Wu, C.W., Chen, G.D., Jiang, K.C., et al.: A genome-wide study of microsatellite instability in advanced gastric carcinoma. Cancer 92(1), 92–101 (2015)
Metadata
Title
Estimating the Length Distributions of Genomic Micro-satellites from Next Generation Sequencing Data
Authors
Xuan Feng
Huan Hu
Zhongmeng Zhao
Xuanping Zhang
Jiayin Wang
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-78723-7_40
