2017 | Original Paper | Book Chapter

LSLS: A Novel Scaffolding Method Based on Path Extension

Written by: Min Li, Li Tang, Zhongxiang Liao, Junwei Luo, Fangxiang Wu, Yi Pan, Jianxin Wang

Published in: Intelligent Computing Theories and Application

Publisher: Springer International Publishing

Abstract

Scaffolding, which determines the orientation and order of fragmented contigs, is an essential step in assembly pipelines and makes assembly results more complete. Most existing scaffolding tools adopt the scaffold graph approach, but constructing an accurate scaffold graph remains a challenging task. Removing potential false relationships is key to better scaffolding performance, yet most scaffolding approaches neglect the impact of uneven sequencing depth, which may cause more sequencing errors and ultimately produce many false relationships. In this paper, we present LSLS (Loose-Strict-Loose Scaffolding), a new scaffolding method based on path extension. LSLS uses different strategies to extend paths, which makes it more adaptive to different sequencing depths. To handle multiple candidate paths, we designed a score function, based on the distribution of read pairs, that evaluates the reliability of path candidates; each path is extended with the candidate that has the highest score. LSLS also includes a new gap estimation method that estimates gap sizes more precisely. Experimental results on two standard datasets show that LSLS achieves better performance.
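
The abstract does not give the LSLS score function or its gap estimator, so the following is only an illustrative sketch of the general idea of choosing a path extension from read-pair evidence. It is not the authors' method. Every name here (`Link`, `estimate_gap`, `score_candidate`, `best_candidate`) is hypothetical, and the sketch assumes a paired-end library whose insert size is roughly normally distributed with a known mean and standard deviation. The gap is estimated naively as the median of "mean insert size minus the bases the pair covers on the two contigs", and a candidate is scored by how many of its linking pairs imply an insert size within three standard deviations of the library mean.

```python
from statistics import median
from typing import Dict, List, NamedTuple


class Link(NamedTuple):
    """One read pair whose mates map to two different contigs (hypothetical record)."""
    left_tail: int   # bases from the left mate's start to the end of the left contig
    right_head: int  # bases from the start of the right contig to the right mate's end


def estimate_gap(links: List[Link], mean_insert: float) -> float:
    """Naive gap estimate: median over pairs of the insert length not covered by either contig."""
    if not links:
        raise ValueError("at least one linking read pair is required")
    return median(mean_insert - l.left_tail - l.right_head for l in links)


def score_candidate(links: List[Link], mean_insert: float, sd_insert: float) -> int:
    """Count the pairs whose implied insert size is within 3 SD of the library mean."""
    if not links:
        return 0
    gap = estimate_gap(links, mean_insert)
    return sum(
        1
        for l in links
        if abs(l.left_tail + gap + l.right_head - mean_insert) <= 3 * sd_insert
    )


def best_candidate(
    candidates: Dict[str, List[Link]], mean_insert: float, sd_insert: float
) -> str:
    """Return the name of the candidate contig with the highest score."""
    return max(
        candidates,
        key=lambda name: score_candidate(candidates[name], mean_insert, sd_insert),
    )


# Made-up example: two contigs compete to extend the current path.
candidates = {
    "contig_B": [Link(120, 150), Link(100, 180), Link(140, 130)],
    "contig_C": [Link(50, 60), Link(300, 150)],
}
chosen = best_candidate(candidates, mean_insert=500, sd_insert=20)
gap = estimate_gap(candidates[chosen], mean_insert=500)
print(chosen, gap)
```

In this toy input the pairs linking to `contig_B` agree with one another on the gap, while those linking to `contig_C` do not, so the consistent candidate wins. A real scaffolder would also need to handle contig orientation, repeats and varying sequencing depth, none of which this sketch attempts.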

Metadata
Title
LSLS: A Novel Scaffolding Method Based on Path Extension
Written by
Min Li
Li Tang
Zhongxiang Liao
Junwei Luo
Fangxiang Wu
Yi Pan
Jianxin Wang
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-63312-1_38