arXiv:2603.18007v1 Announce Type: cross
Abstract: This study examines whether current Large Language Models (LLMs) exhibit Theory of Mind (ToM) capabilities, specifically the ability to infer others’ beliefs, intentions, and emotions from text. Given that LLMs are trained on language data without social embodiment or access to other manifestations of mental representations, their apparent social-cognitive reasoning raises key questions about the nature of their understanding. Are they capable of robust mental-state attribution whose output is indistinguishable from that of humans, or do their outputs merely reflect superficial pattern completion? To address this question, we tested five LLMs and compared their performance to that of human controls using an adapted version of a text-based tool widely used in human ToM research. The test involves answering questions about the beliefs, intentions, and emotions of story characters. The results revealed a performance gap between the models. Earlier and smaller models were strongly affected by the number of relevant inferential cues available and, to some extent, were also vulnerable to irrelevant or distracting information in the texts. In contrast, GPT-4o demonstrated high accuracy and strong robustness, performing comparably to humans even in the most challenging conditions. This work contributes to ongoing debates about the cognitive status of LLMs and the boundary between genuine understanding and statistical approximation.
Using an Adult-Designed Wearable for Pediatric Monitoring: Practical Tutorial and Application in School-Aged Children With Obesity
This tutorial presents a step-by-step guide on how to use an adult-oriented wearable (Fitbit) to collect and analyze activity and cardiovascular data in a pediatric




