The widespread diffusion of black-box text classifiers across many areas of human activity calls for explainable artificial intelligence techniques specifically tailored to this challenging domain. One of the seminal eXplainable Artificial Intelligence (XAI) techniques is LIME, which stands for Local Interpretable Model-agnostic Explanations. In the text classification scenario, LIME maps the input sentence and its neighbors into a bag-of-words representation and uses a linear regressor as the interpretable surrogate model.
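The standard LIME procedure for text can be sketched as follows: neighbors are obtained by randomly dropping words from the input sentence, the black box is queried on each neighbor, and a proximity-weighted linear model is fitted on the binary word-presence features. The snippet below is a minimal illustration under these assumptions; `toy_classifier` is a hypothetical stand-in for a real black-box predictor, and the exponential proximity kernel is a simplification of LIME's actual kernel.

```python
import numpy as np


def toy_classifier(sentence):
    # Hypothetical black box: positive iff the word "good" is present.
    return 1.0 if "good" in sentence.split() else 0.0


def lime_text(sentence, predict, n_samples=200, seed=0):
    """Fit a proximity-weighted linear surrogate on word-presence features."""
    rng = np.random.default_rng(seed)
    words = sentence.split()
    d = len(words)
    # Each row of Z flags which words of the input sentence are kept.
    Z = rng.integers(0, 2, size=(n_samples, d))
    Z[0] = 1  # include the unperturbed instance
    ys = np.array([
        predict(" ".join(w for w, keep in zip(words, z) if keep))
        for z in Z
    ])
    # Proximity kernel: neighbors with fewer deletions weigh more.
    weights = np.exp(-(d - Z.sum(axis=1)) / d)
    sw = np.sqrt(weights)[:, None]
    # Weighted least squares: coefficients are the word importances.
    coef, *_ = np.linalg.lstsq(Z * sw, ys * sw[:, 0], rcond=None)
    return dict(zip(words, coef))
```

In this toy setting the surrogate assigns the word "good" a coefficient close to 1 and the remaining words coefficients close to 0, mirroring the black box's actual decision rule.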
However, this strategy has two main drawbacks. First, since neighboring sentences can be obtained only as subsets of the input one, they may fail to properly describe the decision boundary in the locality of the input sentence, and may themselves be meaningless. Second, the returned explanation is limited to stating either that the presence of a specific term is important or that its removal is relevant.
In this work, we try to overcome the above limitations by proposing \(\text {LLiMe}\), an extension of the basic LIME approach that exploits recent advances in Large Language Models (LLMs) to perform a classifier-driven generation of the neighborhood of the input instance. In our approach, neighbors can employ a vocabulary larger than that imposed by the sentence under consideration. Moreover, we provide a neighborhood generation procedure guaranteed to better capture the decision boundary in the locality of the sentence, and an explanation generation procedure returning the most relevant set of term-operation pairs, each consisting of a specific term and an edit operation whose application most influences the decision of the black-box predictor. In this respect, our approach provides the user with a richer and easier-to-interpret explanation than standard LIME.
Experiments conducted on real datasets demonstrate the effectiveness of our technique in providing suitable, relevant, and interpretable explanations.