
Advanced Drug Delivery Reviews

Volume 86, 23 June 2015, Pages 83-100

Recent progresses in the exploration of machine learning methods as in-silico ADME prediction tools

https://doi.org/10.1016/j.addr.2015.03.014

Abstract

In-silico methods have been explored as potential tools for assessing ADME and ADME regulatory properties, particularly in the early stages of drug discovery. Machine learning methods, with their ability to classify diverse structures and complex mechanisms, are well suited for predicting ADME and ADME regulatory properties. Recent efforts have been directed at broadening the application scope and improving predictive performance, with particular focus on the coverage of ADME properties and on the exploration of more diversified training data, appropriate molecular features, and consensus modeling. Moreover, several online machine learning ADME prediction servers have emerged. Here we review this progress and discuss the performance, application prospects and challenges of exploring machine learning methods as useful tools for predicting ADME and ADME regulatory properties.

Introduction

The discovery and optimization of therapeutic agents with desirable pharmacodynamic, pharmacokinetic and toxicological properties is the key focus of drug development efforts [1]. Predictive tools that accurately assess pharmacokinetic, toxicological and pharmacodynamic properties in early development stages are highly useful for increasing productivity in drug discovery [1], [2], [3]. As part of the efforts to develop these tools, computational methods have been developed and improved for the prediction of compound absorption, distribution, metabolism, and excretion (ADME) properties [4], [5]. In particular, machine learning (ML) methods have shown promising potential in predicting ADME properties by correlating these properties with molecular features and by establishing the complex structure–property relationships for diverse ranges of molecular structures and mechanisms [6], [7].

More recently, efforts have been directed at the development and refinement of ML models for improved prediction and more extensive coverage of various ADME properties, particularly excretion [8], [9], [10] and distribution [11], [12] properties, and for the prediction of regulators of drug metabolism [13], [14], [15], [16] and excretion [8], which are implicated in drug–drug interactions and multi-drug resistance respectively. Efforts have also been made to further explore consensus modeling for improved prediction of the ADME properties and ADME regulatory properties of drug candidates [8], [13]. Moreover, online machine learning servers for predicting ADME and ADME regulatory properties have emerged [15], [17]. Here we review this progress and discuss the performance, application prospects and challenges of exploring ML methods as tools for predicting ADME and ADME regulatory properties.

Section snippets

Molecular descriptors for representing compounds in ADME prediction

Molecular descriptors have been extensively used for representing structural and physicochemical properties of compounds from their molecular structures. The compounds associated with a specific ADME property are typically of high structural and mechanistic diversity. Therefore, the prediction of various ADME properties requires different sets of molecular descriptors that adequately cover the relevant molecular features. A large variety of > 3000 molecular descriptors can be computed from such

Commonly used machine learning methods for developing classification models

A number of ML methods have been used for developing ADME predictive tools. These include Linear Discriminant Analysis (LDA), k Nearest Neighbor (kNN), Artificial Neural Network (ANN), Probabilistic Neural Network (PNN), Support Vector Machine (SVM), Decision Tree (DT), Recursive Partitioning (RP), Random Forest (RF), Naïve Bayesian (NB), Multiple Linear Regression (MLR), Partial Least Squares Regression (PLSR), kNN Regression (kNNR), Support Vector Regression (SVR), Random Forest Regression

The exploration of machine learning classification methods for predicting ADME properties

ML classification methods classify compounds into one of two opposing classes, one associated with a property (e.g. an ADME property) and the other not associated with it. Because of their ability to classify compounds with diverse structures and physicochemical properties, ML classification methods have been extensively explored for predicting various ADME properties that are typically associated with compounds of diverse structures (e.g. substrates of a drug
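As a minimal sketch of the two-class setting described above, the following pure-Python example implements a k nearest neighbor (kNN) classifier, one of the methods named in this review. The descriptor vectors, labels and the function name `knn_classify` are hypothetical and purely illustrative; they are not taken from any of the reviewed studies, which typically use many more descriptors and far larger training sets.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+


def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training compounds.

    `train` is a list of (descriptor_vector, label) pairs.
    """
    # Rank training compounds by Euclidean distance to the query vector.
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical, pre-scaled two-descriptor vectors (illustrative values only).
# Label 1 = associated with the property (e.g. a substrate), 0 = not associated.
training_set = [
    ((0.9, 0.8), 1),
    ((0.8, 0.7), 1),
    ((0.7, 0.9), 1),
    ((0.2, 0.1), 0),
    ((0.1, 0.3), 0),
    ((0.3, 0.2), 0),
]

print(knn_classify(training_set, (0.75, 0.80)))  # 1
print(knn_classify(training_set, (0.15, 0.20)))  # 0
```

An odd `k` avoids tied votes in a two-class problem. In practice the descriptors would need to be scaled to comparable ranges before distances are computed.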

The exploration of machine learning classification methods for predicting ADME regulatory properties

ML classification methods have also been extensively used for predicting regulators of drug ADME properties, particularly the inhibitors of drug efflux and influx transporters for regulating multi-drug resistance (Table 3) [8], [64], [65] and the inhibitors of drug metabolism enzymes for assessing drug–drug interactions (Table 4) [13], [14], [66]. These studies have primarily focused on the extended coverage of drug transporters (9 transporters) [8] and metabolism enzymes (5 CYP enzymes CYP

The exploration of machine learning regression methods for predicting ADME and ADME regulatory properties

ML regression methods are intended for estimating the affinity/activity level in addition to the determination of whether or not a compound possesses or regulates a specific ADME property. Table 5 summarises the performance of the recently developed ML regression methods for predicting the affinity/activity level of ADME and ADME regulatory properties. Partly because of the limited availability of experimental affinity/activity levels, ML regression models have been developed for a limited
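To illustrate the difference from classification, the sketch below fits a single-descriptor least-squares line and reports the coefficient of determination. This is the simplest special case of linear regression, not the specific models summarised in Table 5, and the descriptor and activity values are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line on (xs, ys)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical descriptor values and measured activity levels.
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [2.1, 3.9, 6.2, 7.8]

slope, intercept = fit_line(descriptor, activity)
print(round(slope, 2), round(intercept, 2))  # 1.94 0.15
print(round(r_squared(descriptor, activity, slope, intercept), 3))  # 0.996
```

The fit here is evaluated on its own training data; a realistic assessment would use held-out compounds or cross-validation.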

The trends in the development of machine learning models for predicting ADME and ADME regulatory properties

There are noticeable trends in the recent efforts to develop ML models for predicting ADME and ADME regulatory properties. In developing ML classification models for predicting ADME and ADME regulatory properties, three ML methods, support vector machines (SVM, 38 models), random forest (RF, 27 models) and k nearest neighbor (kNN, 25 models), have been used more frequently than other ML regression methods (4 models). These three methods have also been used for developing all the consensus ML
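Consensus modeling combines the outputs of several base models into a single prediction. The sketch below uses simple majority voting, which is only one possible combination scheme and is an assumption here; the text does not specify how the reviewed studies combine their base models. The SVM, RF and kNN predictions are hypothetical labels, not outputs of trained models.

```python
from collections import Counter


def consensus_predict(predictions):
    """Majority vote across base-model labels (0 or 1) for one compound.

    Ties are resolved in favour of label 0, an arbitrary choice made
    for this illustration.
    """
    votes = Counter(predictions)
    return 1 if votes[1] > votes[0] else 0


# Hypothetical class labels from three base models for four compounds.
base_predictions = {
    "SVM": [1, 0, 1, 1],
    "RF": [1, 0, 0, 1],
    "kNN": [0, 0, 1, 1],
}

# Regroup the labels so that each tuple holds all votes for one compound.
per_compound = zip(*base_predictions.values())
consensus = [consensus_predict(votes) for votes in per_compound]
print(consensus)  # [1, 0, 1, 1]
```

With an odd number of base models and two classes, a tie cannot occur, so the tie rule only matters when the number of models is even.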

Application scope of the developed machine learning models

The recently and previously [79] developed ML classification models broadly cover compound metabolism (by 6 different CYP enzymes) [79], efflux (by 6 different transporters) [8] and influx (by 4 different transporters) [8] at reasonably good predictive accuracies. The sensitivities (SEs), specificities (SPs) and overall accuracies (ACs) of the majority of the ML classification models are in the ranges of 74%–92%, 66%–76% and 72%–92% respectively. The SEs are close to but the SPs are substantially lower than the SEs (~ 90%) and SPs (~ 90%) of ML virtual
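Assuming SE, SP and AC denote sensitivity, specificity and overall accuracy, the three measures follow directly from the counts of true and false positives and negatives. The sketch below computes them for a small set of hypothetical labels; the function name `classification_metrics` and the data are illustrative only.

```python
def classification_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    se = tp / (tp + fn)          # fraction of positives correctly identified
    sp = tn / (tn + fp)          # fraction of negatives correctly identified
    ac = (tp + tn) / len(pairs)  # fraction of all compounds correctly labelled
    return se, sp, ac


# Hypothetical labels: 4 compounds with the property, 6 without.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

se, sp, ac = classification_metrics(y_true, y_pred)
print(f"SE={se:.2f} SP={sp:.2f} AC={ac:.2f}")  # SE=0.75 SP=0.67 AC=0.70
```

Reporting SE and SP separately matters when the two classes are unbalanced, because a high AC can hide a low SP, as in the ranges quoted above.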

Challenges in the exploration of machine learning methods

The performance of ML methods critically depends on the diversity and representativeness of the compounds in the training datasets and on the appropriate representation of their structural and physicochemical properties. The training datasets used in most of the ML models described in Tables 2–5 are not expected to be fully representative of the compounds associated with each specific ADME property. This is particularly true for compounds not possessing a specific ADME property,

Perspectives

Both classification-based and regression-based ML methods have consistently shown promising capability in predicting a variety of ADME and ADME regulatory properties for diverse ranges of structures at accuracy levels comparable to those practically used in drug lead discovery and optimization, making the developed ADME and ADME regulatory prediction models potentially useful tools for assessing ADME properties and predicting ADME regulatory properties. In spite of the significant efforts, the

Acknowledgements

We acknowledge the support of the Major State Basic Research Development Program of China (2013CB967204) and the Singapore Academic Research Fund (R148000181112).

References (93)

  • V. Poongavanam et al. Fingerprint-based in silico models for the prediction of P-glycoprotein substrates and inhibitors. Bioorg. Med. Chem. (2012)
  • C. Zhang et al. Exploration of (S)-3-aminopyrrolidine as a potentially interesting scaffold for discovery of novel Abl and PI3K dual inhibitors. Eur. J. Med. Chem. (2011)
  • M. Posa. Heuman indices of hydrophobicity of bile acids and their comparison with a newly developed and conventional molecular descriptors. Biochimie (2014)
  • J. Drews. Drug discovery: a historical perspective. Science (2000)
  • R.E. White. High-throughput screening in drug metabolism and pharmacokinetic support of drug discovery. Annu. Rev. Pharmacol. Toxicol. (2000)
  • H. van de Waterbeemd et al. ADMET in silico modelling: towards prediction paradise? Nat. Rev. Drug Discov. (2003)
  • D. Stepensky. Prediction of drug disposition on the basis of its chemical structure. Clin. Pharmacokinet. (2013)
  • M. Trotter et al. Support vector machines for ADME property classification. QSAR Comb. Sci. (2003)
  • Y. Sakiyama. The use of machine learning and nonlinear statistical tools for ADME prediction. Expert Opin. Drug Metab. Toxicol. (2009)
  • A. Sedykh et al. Human intestinal transporter database: QSAR modeling and virtual profiling of drug uptake, efflux and interactions. Pharm. Res. (2013)
  • M.E. Gantner et al. Development of conformation independent computational models for the early recognition of breast cancer resistance protein substrates. Biomed Res. Int. (2013)
  • V.K. Gombar et al. Quantitative structure–activity relationship models of clinical pharmacokinetics: clearance and volume of distribution. J. Chem. Inf. Model. (2013)
  • B. Louis et al. Prediction of human volume of distribution values for drugs using linear and nonlinear quantitative structure pharmacokinetic relationship models. Interdiscip. Sci. (2014)
  • F. Cheng et al. Classification of cytochrome P450 inhibitors and noninhibitors using combined classifiers. J. Chem. Inf. Model. (2011)
  • H. Sun et al. Predictive models for cytochrome p450 isozymes based on quantitative high throughput screening data. J. Chem. Inf. Model. (2011)
  • M. Rostkowski et al. WhichCyp: prediction of cytochromes P450 inhibition. Bioinformatics (2013)
  • M. Lapins et al. A unified proteochemometric model for prediction of inhibition of cytochrome p450 isoforms. PLoS One (2013)
  • F. Zsila et al. Evaluation of drug–human serum albumin binding interactions with support vector machine aided online automated docking. Bioinformatics (2011)
  • R. Todeschini et al. DRAGON (2005)
  • I.V. Tetko et al. Virtual computational chemistry laboratory – design and description. J. Comput. Aided Mol. Des. (2005)
  • L.H. Hall et al. Molconn-Z, in, eduSoft, LC (2002)
  • J.K. Wegner. JOELib/JOELib2 (2005)
  • Z.R. Li et al. MODEL-molecular descriptor lab: a web-based server for computing structural and physicochemical features of compounds. Biotechnol. Bioeng. (2007)
  • C.W. Yap. PaDEL-descriptor: an open source software to calculate molecular descriptors and fingerprints. J. Comput. Chem. (2011)
  • G. Rücker et al. Counts of all walks as atomic and molecular descriptors. J. Chem. Inf. Comput. Sci. (1993)
  • J.H. Schuur et al. The coding of the three-dimensional structure of molecules by molecular transforms and its application to structure–spectra correlations and studies of biological activity. J. Chem. Inf. Comput. Sci. (1996)
  • R.S. Pearlman et al. Metric validation and the receptor-relevant subspace concept. J. Chem. Inf. Comput. Sci. (1999)
  • G. Bravi et al. MS-WHIM, new 3D theoretical descriptors derived from molecular surface properties: a comparative 3D QSAR study in a series of steroids. J. Comput. Aided Mol. Des. (1997)
  • J. Galvez et al. Charge indexes. New topological descriptors. J. Chem. Inf. Comput. Sci. (1994)
  • V. Consonni et al. Structure/response correlations and similarity/diversity analysis by GETAWAY descriptors. 1. Theory of the novel 3D molecular descriptors. J. Chem. Inf. Comput. Sci. (2002)
  • M. Randic. Molecular profiles. Novel geometry-dependent molecular descriptors. New J. Chem. (1995)
  • L.B. Kier et al. Molecular Structure Description: The Electrotopological State (1999)
  • J.A. Platts et al. Estimation of molecular free energy relation descriptors using a group contribution approach. J. Chem. Inf. Comput. Sci. (1999)
  • M.S. Meskin et al. QSAR analysis of drug excretion into human breast milk. J. Clin. Hosp. Pharm. (1985)
  • I. Guyon et al. Gene selection for cancer classification using support vector machines. Mach. Learn. (2002)
  • Y. Xue et al. Prediction of P-glycoprotein substrates by a support vector machine approach. J. Chem. Inf. Comput. Sci. (2004)

    This review is part of the Advanced Drug Delivery Reviews theme issue on “In silico ADMET predictions in pharmaceutical research”.
