2023 | OriginalPaper | Book chapter

Explanations for Itemset Mining by Constraint Programming: A Case Study Using ChEMBL Data

Written by: Maksim Koptelov, Albrecht Zimmermann, Patrice Boizumault, Ronan Bureau, Jean-Luc Lamotte

Published in: Advances in Intelligent Data Analysis XXI

Publisher: Springer Nature Switzerland

Abstract

In sensitive applications such as drug development, explaining to experts why a data mining operation arrives at a particular result is highly valuable. In this work, we benefit twice from modelling the task as a Constraint Satisfaction Problem (CSP): by adding multiple constraints to the mining process and by deriving explanations for why patterns fail. We illustrate experimentally how to apply our method to data originally retrieved from the ChEMBL database [14]. We also report some interesting dependencies discovered by our method that are not easy to observe when analysing the data manually.
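To make the idea concrete, the following is a minimal sketch in plain Python of treating itemset mining as a set of named constraints and reporting which constraints a rejected pattern violates. It is not the chapter's method and uses no constraint solver: it simply enumerates candidates by brute force. The toy transactions, the two constraints (minimum support and maximum size), and the helper names `cover`, `make_constraints`, `violated`, and `mine` are all assumptions made for illustration.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, not taken from ChEMBL).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
]


def cover(itemset, transactions):
    """Return the indices of the transactions that contain every item of itemset."""
    return [i for i, t in enumerate(transactions) if itemset <= t]


def make_constraints(min_support, max_size):
    """Build named constraints; each maps (itemset, transactions) to True or False."""
    return {
        f"support >= {min_support}": lambda s, db: len(cover(s, db)) >= min_support,
        f"size <= {max_size}": lambda s, db: len(s) <= max_size,
    }


def violated(itemset, transactions, constraints):
    """Names of the constraints the itemset fails: a naive 'failure explanation'."""
    return [name for name, check in constraints.items()
            if not check(itemset, transactions)]


def mine(transactions, constraints):
    """Enumerate all non-empty itemsets and keep those satisfying every constraint."""
    items = sorted(set().union(*transactions))
    solutions = []
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            if not violated(candidate, transactions, constraints):
                solutions.append(candidate)
    return solutions


if __name__ == "__main__":
    constraints = make_constraints(min_support=3, max_size=2)
    for pattern in mine(TRANSACTIONS, constraints):
        print(sorted(pattern))
    # Ask why a specific pattern was rejected.
    print(violated(frozenset("abc"), TRANSACTIONS, constraints))
```

Keeping each constraint under a readable name is what allows a rejected pattern to be traced back to the conditions it fails. A real constraint programming model would prune the search through propagation rather than enumerate every candidate.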

Footnotes
1
Centre d’Etude et de Recherche du Médicament de Normandie: https://cermn.unicaen.fr.
2
A manually curated database of bioactive molecules with drug-like properties.

References
1.
Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: VLDB, vol. 1215, pp. 487–499 (1994)
2.
Bodon, F.: A fast apriori implementation. In: FIMI, vol. 3, p. 63 (2003)
3.
Bogaerts, B., Gamba, E., Guns, T.: A framework for step-wise explaining how to solve constraint satisfaction problems. Artif. Intell. 300, 103550 (2021)
4.
Bouali, F., Guettala, A., Venturini, G.: Vizassist: an interactive user assistant for visual data mining. Vis. Comput. 32(11), 1447–1463 (2016)
5.
Cortez, P., Embrechts, M.: Using sensitivity analysis and visualization techniques to open black box data mining models. Inf. Sci. 225, 1–17 (2013)
6.
7.
De Raedt, L., Guns, T., Nijssen, S.: Constraint programming for itemset mining. In: KDD, pp. 204–212 (2008)
8.
Dror, O., et al.: Novel approach for efficient pharmacophore-based virtual screening: method and applications. J. Chem. Inf. Model. 49(10), 2333–2343 (2009)
9.
Fayyad, U., Piatetsky-Shapiro, G., Smyth, P.: From data mining to knowledge discovery in databases. AI Mag. 17(3), 37–37 (1996)
10.
Ferreira, M., Levkowitz, H.: From visual data exploration to visual data mining: a survey. IEEE Trans. Visual. Comput. Graph. 9(3), 378–394 (2003)
11.
Fournier-Viger, P., Lin, J.C.W., Vo, B., Chi, T., Zhang, J., Le, H.: A survey of itemset mining. Data Min. Knowl. Disc. 7(4), e1207 (2017)
12.
Freuder, E.: Explaining ourselves: human-aware constraint reasoning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31 (2017)
13.
Gamba, E., Bogaerts, B., Guns, T.: Efficiently explaining CSPs with unsatisfiable subset optimization. In: IJCAI, pp. 1381–1388 (2021)
14.
Gaulton, A., et al.: The ChEMBL database in 2017. Nucleic Acids Res. 45(D1), D945–D954 (2017)
15.
Guns, T., Nijssen, S., De Raedt, L.: Itemset mining: a constraint programming perspective. Artif. Intell. 175(12–13), 1951–1983 (2011)
17.
Holzinger, A., Dehmer, M., Jurisica, I.: Knowledge discovery and interactive data mining in bioinformatics-state-of-the-art, future challenges and research directions. BMC Bioinform. 15(6), 1–9 (2014)
18.
Jussien, N., Ouis, S.: User-friendly explanations for constraint programming. In: International Conference on Principles and Practice of CP (2001)
19.
Kashid, A., Kulkarni, V., Patankar, R.: Discrimination-aware data mining: a survey. Int. J. Data Sci. 2(1), 70–84 (2017)
20.
Kuo, Y.T., Lonie, A., Pearce, A.R., Sonenberg, L.: Mining surprising patterns and their explanations in clinical data. Appl. AI 28(2), 111–138 (2014)
21.
Kuo, Y.T., et al.: Domain ontology driven data mining: a medical case study. In: 2007 International Workshop on Domain Driven Data Mining, pp. 11–17 (2007)
22.
Leeuwen, M.: Interactive data exploration using pattern mining. In: Interactive Knowledge Discovery and Data Mining in Biomedical Informatics, pp. 169–182 (2014)
23.
Mackworth, A.K.: Consistency in networks of relations. AI 8(1), 99–118 (1977)
24.
Métivier, J.P., et al.: The pharmacophore network: a computational method for exploring structure-activity relationships from a large chemical data set. J. Med. Chem. 61(8), 3551–3564 (2018)
25.
Pedreschi, D., et al.: Meaningful explanations of black box AI decision systems. In: AAAI, vol. 33, pp. 9780–9784 (2019)
26.
Pedreshi, D., et al.: Discrimination-aware data mining. In: KDD, pp. 560–568 (2008)
27.
Ribeiro, M.T., Singh, S., Guestrin, C.: Why should I trust you? Explaining the predictions of any classifier. In: KDD, pp. 1135–1144 (2016)
29.
Soukup, T., Davidson, I.: Visual Data Mining: Techniques and Tools for Data Visualization and Mining. John Wiley & Sons, Hoboken (2002)
30.
Velu, C., Kashwan, K.: Visual data mining techniques for classification of diabetic patients. In: IACC, pp. 1070–1075. IEEE (2013)
31.
Wu, H., Lu, Z., Pan, L., Xu, R., Jiang, W.: An improved apriori-based algorithm for association rules mining. In: 6th FSKD, vol. 2, pp. 51–55. IEEE (2009)
32.
Zaki, M.J.: Scalable algorithms for association mining. IEEE Trans. Knowl. Data Eng. 12(3), 372–390 (2000)
Metadata
Title
Explanations for Itemset Mining by Constraint Programming: A Case Study Using ChEMBL Data
Authors
Maksim Koptelov
Albrecht Zimmermann
Patrice Boizumault
Ronan Bureau
Jean-Luc Lamotte
Copyright year
2023
DOI
https://doi.org/10.1007/978-3-031-30047-9_17
