DOI: 10.1145/1273496.1273524
Article

An integrated approach to feature invention and model construction for drug activity prediction

Published: 20 June 2007

ABSTRACT

We present a new machine learning approach for 3D-QSAR, the task of predicting binding affinities of molecules to target proteins based on 3D structure. Our approach predicts binding affinity by using regression on substructures discovered by relational learning. We make two contributions to the state of the art. First, we use multiple-instance (MI) regression, which represents a molecule as a set of 3D conformations, to model activity. Second, the relational learning component employs the "Score As You Use" (SAYU) method to select substructures for their ability to improve the regression model. This is the first application of SAYU to multiple-instance, real-valued prediction. We evaluate our approach on three tasks and demonstrate that (i) SAYU outperforms standard coverage measures when selecting features for regression, (ii) the MI representation improves accuracy over standard single feature-vector encodings, and (iii) combining SAYU with MI regression is more accurate for 3D-QSAR than either approach by itself.
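To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes (none of this is stated in the abstract) that each conformation is encoded as a vector of substructure-match indicators, that MI regression is fit by alternating between picking one best-fitting "primary" conformation per molecule and refitting a linear model, that a molecule's predicted activity is the maximum over its conformations, and that a SAYU-style search keeps a candidate feature only if it lowers the error on a held-out tuning set. All function names, the small ridge term, and the toy data are hypothetical.

```python
from typing import List, Tuple

Conformation = List[float]   # assumed encoding: 1 if a substructure matches, else 0
Bag = List[Conformation]     # one molecule = a set of 3D conformations


def solve_ridge(X: List[List[float]], y: List[float], ridge: float = 1e-6) -> List[float]:
    """Solve (X^T X + ridge * I) w = X^T y by Gaussian elimination."""
    d = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (ridge if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(d)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, d):
            f = A[r][col] / A[col][col]
            for c in range(col, d):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    w = [0.0] * d
    for i in reversed(range(d)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, d))) / A[i][i]
    return w


def predict(w: List[float], x: Conformation) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def fit_mi_regression(bags: List[Bag], y: List[float], n_iter: int = 10) -> List[float]:
    """Alternate: choose the best-fitting conformation per bag, then refit."""
    primary = [bag[0] for bag in bags]  # arbitrary starting choice
    w = solve_ridge(primary, y)
    for _ in range(n_iter):
        primary = [min(bag, key=lambda x: (predict(w, x) - t) ** 2)
                   for bag, t in zip(bags, y)]
        w = solve_ridge(primary, y)
    return w


def predict_bag(w: List[float], bag: Bag) -> float:
    """Assumption: the most active conformation determines the molecule's activity."""
    return max(predict(w, x) for x in bag)


def project(bags: List[Bag], idx: List[int]) -> List[Bag]:
    """Keep only the chosen feature columns and prepend a bias term."""
    return [[[1.0] + [x[i] for i in idx] for x in bag] for bag in bags]


def tuning_error(idx, train, train_y, tune, tune_y) -> float:
    w = fit_mi_regression(project(train, idx), train_y)
    preds = [predict_bag(w, bag) for bag in project(tune, idx)]
    return sum((p - t) ** 2 for p, t in zip(preds, tune_y)) / len(tune_y)


def sayu_select(train, train_y, tune, tune_y, n_candidates: int) -> Tuple[List[int], float]:
    """Keep a candidate feature only if the model that uses it scores better."""
    chosen: List[int] = []
    best = tuning_error(chosen, train, train_y, tune, tune_y)
    for f in range(n_candidates):
        err = tuning_error(chosen + [f], train, train_y, tune, tune_y)
        if err < best - 1e-9:
            chosen, best = chosen + [f], err
    return chosen, best


# Toy data: activity is 3 when some conformation has feature 0, else 1.
train = [[[1, 0], [0, 0]], [[0, 0], [0, 1]], [[0, 1], [1, 1]], [[0, 1], [0, 0]]]
train_y = [3.0, 1.0, 3.0, 1.0]
tune = [[[0, 0], [1, 0]], [[0, 1]]]
tune_y = [3.0, 1.0]

chosen, best = sayu_select(train, train_y, tune, tune_y, n_candidates=2)
print("selected features:", chosen)
```

On this toy data the loop keeps feature 0 and rejects feature 1, since adding feature 1 does not lower the tuning error. In the setting the abstract describes, the candidates would come from a relational learner proposing 3D substructures rather than from a fixed list of columns.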

      • Published in

        ICML '07: Proceedings of the 24th international conference on Machine learning
        June 2007
        1233 pages
        ISBN: 9781595937933
        DOI: 10.1145/1273496

        Copyright © 2007 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 140 of 548 submissions, 26%
