Application of support vector machines to metabolomics experiments with limited replicates

  • Original Article
Metabolomics

Abstract

Identifying biomarkers that are indicative of a phenotypic state is difficult because of the natural variability that exists in any population. While there are many different algorithms for selecting biomarkers, previous investigation shows that the sensitivity and flexibility of support vector machines (SVM) make them an attractive candidate. Here we evaluate the ability of support vector machine recursive feature elimination (SVM-RFE) to identify potential metabolic biomarkers in untargeted liquid chromatography-mass spectrometry metabolite datasets. Two separate experiments are considered: a low-variance (low biological noise) prokaryotic stress experiment and a high-variance (high biological noise) mammalian stress experiment. For each experiment, the phenotypic response to stress is metabolically characterized. SVM-based classification and metabolite ranking are undertaken using a systematically reduced number of biological replicates to evaluate the impact of sample size on biomarker reproducibility and robustness. Our results indicate that the highest-ranked 1% of metabolites, those most predictive of the physiological state, were identified by SVM-RFE even when the number of training examples was small (≥3) and the coefficient of variation was high (>0.5). An accuracy analysis shows that filtering with recursive feature elimination measurably improves SVM classification accuracy, an effect that is pronounced when the number of training examples is small. These results indicate that SVM-RFE can be successful at biomarker identification even in challenging scenarios where the training examples are noisy and the number of biological replicates is low.
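
As a rough illustration of the SVM-RFE idea described in the abstract, the sketch below trains a linear SVM, drops the feature with the smallest squared weight, and repeats until one feature remains. It is not the authors' implementation: the solver is a plain batch subgradient descent on the hinge loss swapped in for a library SVM, and the function names (`train_linear_svm`, `svm_rfe`), the hyperparameters, and the toy data (three replicates per class, with only feature 0 carrying signal) are all hypothetical.

```python
def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Fit a linear SVM (hinge loss + L2 penalty) by batch subgradient descent.

    Assumed, illustrative solver; labels in y must be +1 or -1.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # this sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y):
    """Rank feature indices from most to least important via recursive elimination."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > 1:
        # Retrain on the surviving features only.
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        # Drop the feature with the smallest squared weight.
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    # The survivor ranks first; features eliminated later rank higher.
    return remaining + eliminated[::-1]


# Hypothetical toy data: 3 replicates per class, 4 "metabolite" features.
# Only feature 0 separates the two classes; the rest is low-amplitude noise.
X = [
    [2.0, 0.1, -0.2, 0.3],
    [2.2, -0.3, 0.1, -0.1],
    [1.8, 0.2, 0.2, -0.2],
    [-2.1, 0.2, -0.1, 0.1],
    [-1.9, -0.1, 0.3, -0.3],
    [-2.0, -0.1, -0.3, 0.2],
]
y = [1, 1, 1, -1, -1, -1]

ranking = svm_rfe(X, y)
print("Features ranked from most to least important:", ranking)
```

On this toy data, feature 0 comes out first because it is the only column that tracks the class label. With real untargeted metabolite tables, features would normally be scaled before ranking, and the ranking would be repeated over subsets of replicates to check its stability.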

References

  • Bertini, I., Calabro, A., De Carli, V., Luchinat, C., Nepi, S., Porfirio, B., et al. (2009). The metabonomic signature of celiac disease. Journal of Proteome Research, 8, 170–177.

  • Duan, K., Rajapakse, J. C., et al. (2005). Multiple SVM-RFE for gene selection in cancer classification with expression data. IEEE Transactions on Nanobioscience, 4, 228–234.

  • Guan, W., Zhou, M., Hampton, C. Y., Benigno, B. B., Walker, L. D., Gray, A., et al. (2009). Ovarian cancer detection from metabolomic liquid chromatography/mass spectrometry data by support vector machines. BMC Bioinformatics, 10, 259.

  • Guyon, I., Weston, J., Barnhill, S., et al. (2002). Gene selection for cancer classification using support vector machines. Machine Learning, 46, 389–422.

  • Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., et al. (2010). The WEKA data mining software: An update. ACM SIGKDD Explorations Newsletter, 11(1), 10–18.

  • Haug, K., Salek, R. M., Conesa, P., Hastings, J., de Matos, P., Rijnbeek, M., et al. (2013). MetaboLights: An open-access general-purpose repository for metabolomics studies and associated meta-data. Nucleic Acids Research, 41, D781–D786.

  • Heinemann, J., Hamerly, T., Maaty, W. S., Movahed, N., Steffens, J. D., Reeves, B. D., et al. (2014). Expanding the paradigm of thiol redox in the thermophilic root of life. Biochimica et Biophysica Acta, 1840, 80–85.

  • Herder, C., Karakas, M., Koenig, W., et al. (2011). Biomarkers for the prediction of type 2 diabetes and cardiovascular disease. Clinical Pharmacology and Therapeutics, 90(1), 52–66.

  • Lin, X., Wang, Q., Yin, P., Tang, L., Tan, Y., Li, H., et al. (2011). A method for handling metabonomics data from liquid chromatography/mass spectrometry: Combinational use of support vector machine recursive feature elimination, genetic algorithm and random forest for feature selection. Metabolomics, 7(4), 549–558.

  • Lusczek, E. R., Nelson, T., Lexcen, D., Witowski, N. E., Mulier, K. E., et al. (2011). Urine metabolomics in hemorrhagic shock: Normalization of urine in the face of changing intravascular fluid volume and perturbations in metabolism. Bioanalysis and Biomedicine, 3(2), 38–48.

  • Maaty, W. S., Wiedenheft, B., Tarlykov, P., Schaff, N., Heinemann, J., Robison-Cox, J., et al. (2009). Something old, something new, something borrowed; how the thermoacidophilic archaeon Sulfolobus solfataricus responds to oxidative stress. PLoS One, 4(9), e6964.

  • Mahadevan, S., Shah, S. L., Marrie, T. J., Slupsky, C. M., et al. (2008). Analysis of metabolomic data using support vector machines. Analytical Chemistry, 80(19), 7562–7570.

  • Mulier, K. E., Beilman, G. J., Conroy, M. J., Taylor, J. H., Skarda, D. E., et al. (2005). Ringer’s ethyl pyruvate in hemorrhagic shock and resuscitation does not improve early hemodynamics or tissue energetics. Shock, 23, 248–252.

  • Patti, G. J., Tautenhahn, R., Siuzdak, G., et al. (2012a). Meta-analysis of untargeted metabolomic data from multiple profiling experiments. Nature Protocols, 7(3), 508–516.

  • Patti, G. J., Yanes, O., Shriver, L. P., Courade, J., Tautenhahn, R., Manchester, M., et al. (2012b). Metabolomics implicates altered sphingolipids in chronic pain of neuropathic origin. Nature Chemical Biology, 8(3), 232–234.

  • R Development Core Team. (2012). R: A language and environment for statistical computing, reference index version 2.15.1. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, http://www.R-project.org. Retrieved 16 May 2013.

  • Scribner, D. M., Witowski, N. E., Mulier, K. E., Lusczek, E. R., Wasiluk, K. R., Beilman, G. J., et al. (2010). Liver metabolomic changes identify biochemical pathways in hemorrhagic shock. The Journal of Surgical Research, 164, e131–e139.

  • Serkova, N. J., Standiford, T. J., Stringer, K. A., et al. (2011). The emerging field of quantitative blood metabolomics for biomarker discovery in critical illnesses. American Journal of Respiratory and Critical Care Medicine, 184, 647–655.

  • Smith, C., O’Maille, G., Want, E. J., Qin, C., Trauger, S., Brandon, T. R., et al. (2005). METLIN: A metabolite mass spectral database. Therapeutic Drug Monitoring, 27(6), 747–751.

  • Tautenhahn, R., Böttcher, C., Neumann, S., et al. (2008). Highly sensitive feature detection for high resolution LC/MS. BMC Bioinformatics, 9, 504.

  • Veselkov, K. A., Vingara, L. K., Masson, P., Robinette, S. L., Want, E., Li, J. V., et al. (2011). Optimizing preprocessing of ultra-performance liquid chromatography/mass spectrometry urinary metabolic profiles for improved information recovery. Analytical Chemistry, 83, 5864–5872.

  • Yanes, O., Tautenhahn, R., Patti, G. J., Siuzdak, G., et al. (2011). Expanding coverage of the metabolome for global metabolite profiling. Analytical Chemistry, 83(6), 2152–2161.

Acknowledgments

This work was supported by the National Science Foundation (MCB0646499 and MCB102248). The mass spectrometry, proteomics and metabolomics core facility is supported by the Murdock Charitable Trust, INBRE MT Grant No. P20 RR-16455-08, and NIH Grant Nos. P20 RR-020185 and P20 RR-024237 from the COBRE Program of the National Center for Research Resources.

Author information

Corresponding author

Correspondence to Brian Bothner.

Additional information

All data is available online at http://www.ebi.ac.uk/metabolights/ (Haug et al. 2013).

About this article

Cite this article

Heinemann, J., Mazurie, A., Tokmina-Lukaszewska, M. et al. Application of support vector machines to metabolomics experiments with limited replicates. Metabolomics 10, 1121–1128 (2014). https://doi.org/10.1007/s11306-014-0651-0

  • DOI: https://doi.org/10.1007/s11306-014-0651-0
