
25.04.2024 | Research

Quantitative evaluation of molecular generation performance of graph-based GANs

Written by: Jinli Zhang, Zhenbo Wang, Zongli Jiang, Man Wu, Chen Li, Yoshihiro Yamanishi

Published in: Software Quality Journal

Abstract

Deep generative models have been widely used in molecular generation tasks because they can save time and cost in drug development compared with traditional methods. Previous studies based on generative adversarial network (GAN) models typically employ reinforcement learning (RL) to constrain chemical properties, resulting in efficient and novel molecules. However, such models have poor performance in generating molecules due to instability in training. Therefore, quantitative evaluation of existing molecular generation models, especially GAN models, is necessary. This study aims to evaluate the performance of discrete GAN models using RL in molecular generation tasks and explore the impact of different factors on model performance. Through evaluation experiments on QM9 and ZINC datasets, the results show that noise sampling distributions, training epochs, and training data volumes can affect the performance of molecular generation. Finally, we provide strategies for stable training and improved performance for GAN models.

Metadata
Title
Quantitative evaluation of molecular generation performance of graph-based GANs
Written by
Jinli Zhang
Zhenbo Wang
Zongli Jiang
Man Wu
Chen Li
Yoshihiro Yamanishi
Publication date
25.04.2024
Publisher
Springer US
Published in
Software Quality Journal
Print ISSN: 0963-9314
Electronic ISSN: 1573-1367
DOI
https://doi.org/10.1007/s11219-024-09671-7
