
2019 | OriginalPaper | Chapter

Deep Learning and Random Forest-Based Augmentation of sRNA Expression Profiles

Authors: Jelena Fiosina, Maksims Fiosins, Stefan Bonn

Published in: Bioinformatics Research and Applications

Publisher: Springer International Publishing


Abstract

The lack of well-structured annotations in a growing body of RNA expression data complicates data interoperability and reusability. Commonly used text mining methods extract annotations from existing unstructured data descriptions and often produce inaccurate output that requires manual curation. Automatic data-based augmentation (generating annotations on the basis of the expression data itself) can considerably improve annotation quality but has not been well studied. We formulate the automatic augmentation of small RNA-seq expression data as a classification problem and investigate deep learning (DL) and random forest (RF) approaches to solve it. We generate tissue and sex annotations from small RNA-seq expression data for tissues and cell lines of Homo sapiens. We validate our approach on 4243 annotated small RNA-seq samples from the Small RNA Expression Atlas (SEA) database. The average prediction accuracy with DL is 98% for tissue groups, 96.5% for tissues, and 77% for sex. The “one dataset out” average accuracy for tissue group prediction is 83% (DL) and 59% (RF). On average, DL outperforms RF and considerably improves classification performance for ‘unseen’ datasets.
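
The “one dataset out” evaluation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the dataset names (`ds1` to `ds4`), the tissue labels, and all function names are hypothetical, and a simple nearest-centroid classifier stands in for the DL and RF models so that the example needs only the Python standard library. The point is the splitting protocol: every sample of one dataset is withheld from training and used only for testing, so the score reflects performance on an ‘unseen’ dataset.

```python
import random
from collections import defaultdict


def fit_centroids(samples, labels):
    """Average the feature vectors of each class (a stand-in for DL/RF training)."""
    sums, counts = {}, defaultdict(int)
    for x, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(x)
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {label: [s / counts[label] for s in vec] for label, vec in sums.items()}


def predict(centroids, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(centroids[c], x)),
    )


def one_dataset_out(samples, labels, dataset_ids):
    """Hold out each dataset once; return {dataset_id: accuracy on that dataset}."""
    scores = {}
    for held_out in sorted(set(dataset_ids)):
        train = [i for i, d in enumerate(dataset_ids) if d != held_out]
        test = [i for i, d in enumerate(dataset_ids) if d == held_out]
        centroids = fit_centroids(
            [samples[i] for i in train], [labels[i] for i in train]
        )
        hits = sum(predict(centroids, samples[i]) == labels[i] for i in test)
        scores[held_out] = hits / len(test)
    return scores


# Synthetic expression-like data (an assumption for illustration only):
# each tissue has its own mean profile, and each dataset adds a small batch shift.
rng = random.Random(0)
tissues = ["brain", "blood", "liver"]
n_features = 10
tissue_profile = {t: [rng.gauss(0, 3) for _ in range(n_features)] for t in tissues}

samples, labels, dataset_ids = [], [], []
for dataset in ["ds1", "ds2", "ds3", "ds4"]:
    batch_shift = [rng.gauss(0, 0.5) for _ in range(n_features)]
    for _ in range(30):
        tissue = rng.choice(tissues)
        profile = tissue_profile[tissue]
        samples.append(
            [p + b + rng.gauss(0, 1) for p, b in zip(profile, batch_shift)]
        )
        labels.append(tissue)
        dataset_ids.append(dataset)

scores = one_dataset_out(samples, labels, dataset_ids)
mean_accuracy = sum(scores.values()) / len(scores)
print(scores, round(mean_accuracy, 3))
```

Averaging the per-dataset scores gives one “one dataset out” figure comparable in kind, though not in value, to those quoted in the abstract. With a machine learning library, a group-aware splitter such as scikit-learn's `LeaveOneGroupOut` produces the same kind of split.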

Metadata
Title
Deep Learning and Random Forest-Based Augmentation of sRNA Expression Profiles
Authors
Jelena Fiosina
Maksims Fiosins
Stefan Bonn
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-20242-2_14