arXiv:2510.21884v2 Announce Type: replace-cross
Abstract: Large Language Models (LLMs) increasingly produce natural language explanations alongside their predictions, yet it remains unclear whether these explanations reference predictive cues present in the input text. In this work, we present an empirical study of how LLM-generated explanations align with predictive lexical evidence from an external model in text classification tasks. To analyze this relationship, we compare explanation content against interpretable feature importance signals extracted from transparent linear classifiers. These reference models allow us to partition predictive lexical cues into supporting and contradicting evidence relative to the predicted label. Across three benchmark datasets-WIKIONTOLOGY, AG NEWS, and IMDB-we observe a consistent empirical pattern that we term support-contra asymmetry. Explanations accompanying correct predictions tend to reference more supporting lexical cues and fewer contradicting cues, whereas explanations associated with incorrect predictions reference substantially more contradicting evidence. This pattern appears consistently across datasets, across reference model families (logistic regression and linear SVM), and across multiple feature retrieval depths. These results suggest that LLM explanations often reflect lexical signals that are predictive for the task when predictions are correct, while incorrect predictions are more frequently associated with explanations that reference misleading cues present in the input. Our findings provide a simple empirical perspective on explanation-evidence alignment and illustrate how external sources of predictive evidence can be used to analyze the behavior of LLM-generated explanations.
Identifying needs in adult rehabilitation to support the clinical implementation of robotics and allied technologies: an Italian national survey
IntroductionRobotics and technological interventions are increasingly being explored as solutions to improve rehabilitation outcomes but their implementation in clinical practice remains very limited. Understanding patient


