2019 | OriginalPaper | Book chapter

Deep Learning and Random Forest-Based Augmentation of sRNA Expression Profiles

Authors: Jelena Fiosina, Maksims Fiosins, Stefan Bonn

Published in: Bioinformatics Research and Applications

Publisher: Springer International Publishing

Abstract

The lack of well-structured annotations in a growing volume of RNA expression data complicates data interoperability and reusability. Commonly used text mining methods extract annotations from existing unstructured data descriptions and often produce inaccurate output that requires manual curation. Automatic data-based augmentation (generating annotations on the basis of the expression data itself) can considerably improve annotation quality but has not been well studied. We formulate automatic augmentation of small RNA-seq expression data as a classification problem and investigate deep learning (DL) and random forest (RF) approaches to solve it. We generate tissue and sex annotations from small RNA-seq expression data for tissues and cell lines of Homo sapiens. We validate our approach on 4243 annotated small RNA-seq samples from the Small RNA Expression Atlas (SEA) database. The average prediction accuracy with DL is 98% for tissue groups, 96.5% for tissues, and 77% for sex. The “one dataset out” average accuracy for tissue group prediction is 83% (DL) and 59% (RF). On average, DL outperforms RF and considerably improves classification performance on ‘unseen’ datasets.
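The “one dataset out” evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each sample carries a dataset identifier and that accuracy is averaged per held-out dataset; the abstract does not spell out either detail. The nearest-centroid classifier is only a lightweight stand-in for the DL and RF models, and all names and toy values (`leave_one_dataset_out`, `fit_centroids`, the "brain"/"blood" labels, datasets "A" to "C") are hypothetical.

```python
from collections import defaultdict


def leave_one_dataset_out(dataset_ids):
    """Yield (held_out_id, train_idx, test_idx) once per distinct dataset."""
    for held_out in sorted(set(dataset_ids)):
        train = [i for i, d in enumerate(dataset_ids) if d != held_out]
        test = [i for i, d in enumerate(dataset_ids) if d == held_out]
        yield held_out, train, test


def fit_centroids(X, y):
    """Stand-in classifier (not DL or RF): mean expression vector per label."""
    sums, counts = {}, defaultdict(int)
    for row, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        sums[label] = [s + v for s, v in zip(sums[label], row)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in vec] for lab, vec in sums.items()}


def predict(centroids, row):
    """Return the label whose centroid is nearest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda lab: dist(centroids[lab]))


def one_dataset_out_accuracy(X, y, dataset_ids):
    """Accuracy on each dataset when the model is trained on all the others."""
    scores = {}
    for held_out, train, test in leave_one_dataset_out(dataset_ids):
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test)
        scores[held_out] = hits / len(test)
    return scores


# Hypothetical toy data: two expression features per sample, three datasets.
X = [[9.0, 1.0], [8.0, 2.0], [1.0, 9.0], [2.0, 8.0], [9.5, 0.5], [0.5, 9.5]]
y = ["brain", "brain", "blood", "blood", "brain", "blood"]
dataset_ids = ["A", "A", "B", "B", "C", "C"]

scores = one_dataset_out_accuracy(X, y, dataset_ids)
mean_accuracy = sum(scores.values()) / len(scores)
```

Splitting by dataset rather than by sample keeps every sample of a study out of training when that study is tested, which is what makes the held-out data ‘unseen’ in the sense used above.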
Metadata
Title
Deep Learning and Random Forest-Based Augmentation of sRNA Expression Profiles
Authors
Jelena Fiosina
Maksims Fiosins
Stefan Bonn
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-20242-2_14