03.09.2024 | Research

Enhancing Pre-trained Deep Learning Model with Self-Adaptive Reflection

Written by: Xinzhi Wang, Mengyue Li, Hang Yu, Chenyang Wang, Vijayan Sugumaran, Hui Zhang

Published in: Cognitive Computation

Abstract

In text mining, prevalent deep learning models focus primarily on mapping input features to predicted outputs and lack a self-dialectical thinking process. Inspired by self-reflective mechanisms in human cognition, we hypothesize that existing models can emulate decision-making processes and automatically rectify erroneous predictions. The Self-adaptive Reflection Enhanced pre-trained deep learning Model (S-REM) is introduced to validate this hypothesis and to determine the types of knowledge that warrant reproduction. Building on a pre-trained model, S-REM introduces a local explanation for the pseudo-label and a global explanation for all labels as explanation knowledge. Keyword knowledge from a TF-IDF model is also integrated to form the reflection knowledge. Based on the key explanation features, the pre-trained model reflects on its initial decision using two reflection methods and optimizes the prediction of the deep learning model. Experiments with the local and global reflection variants of S-REM were conducted on two text mining tasks across four datasets (three public and one private). The results demonstrate the efficacy of our method in improving the accuracy of state-of-the-art deep learning models. Furthermore, the method can serve as a foundational step towards developing explainable models through integration with various deep learning models.
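The abstract gives no implementation details, so the following is only a rough sketch of the general idea of combining TF-IDF keyword knowledge with an initial prediction. It is not the S-REM method itself: the function names `tfidf_keywords` and `reflect`, the `weight` parameter, the additive re-scoring rule, the toy corpus, and the initial probabilities are all assumptions made for illustration. The local and global explanation components are omitted entirely.

```python
import math
from collections import Counter


def tfidf_keywords(docs_by_label, top_k=3):
    """Return up to top_k terms with the highest positive TF-IDF per label.

    All texts of a label are merged into one pseudo-document, so IDF is
    computed across labels (an illustrative choice, not taken from the paper).
    """
    term_counts = {
        label: Counter(" ".join(texts).lower().split())
        for label, texts in docs_by_label.items()
    }
    n_labels = len(term_counts)
    # Document frequency: the number of labels in which each term occurs.
    doc_freq = Counter(term for counts in term_counts.values() for term in counts)

    keywords = {}
    for label, counts in term_counts.items():
        total = sum(counts.values())
        scores = {
            term: (freq / total) * math.log(n_labels / doc_freq[term])
            for term, freq in counts.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords[label] = [t for t in ranked[:top_k] if scores[t] > 0]
    return keywords


def reflect(text, probs, keywords, weight=0.2):
    """Hypothetical reflection step: add a bonus per keyword hit, then renormalize."""
    tokens = set(text.lower().split())
    adjusted = {
        label: p + weight * len(tokens & set(keywords.get(label, [])))
        for label, p in probs.items()
    }
    norm = sum(adjusted.values())
    return {label: score / norm for label, score in adjusted.items()}


# Toy data, invented for this example.
corpus = {
    "positive": ["great food and great service", "friendly staff great place"],
    "negative": ["terrible food and slow service", "rude staff terrible place"],
}
kw = tfidf_keywords(corpus)

# Made-up initial model output whose top class acts as the pseudo-label.
initial = {"positive": 0.55, "negative": 0.45}
revised = reflect("the service was terrible and slow", initial, kw)

print(kw)
print(max(revised, key=revised.get))
```

In this toy run the initial pseudo-label is "positive", and the keyword evidence shifts the decision to "negative". A real system would presumably learn how to weigh such evidence rather than use a fixed additive bonus.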

Metadata
Title
Enhancing Pre-trained Deep Learning Model with Self-Adaptive Reflection
Authors
Xinzhi Wang
Mengyue Li
Hang Yu
Chenyang Wang
Vijayan Sugumaran
Hui Zhang
Publication date
03.09.2024
Publisher
Springer US
Published in
Cognitive Computation
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI
https://doi.org/10.1007/s12559-024-10348-3