arXiv:2604.02458v1 Announce Type: cross
Abstract: Behavioral simulation is increasingly used to anticipate responses to interventions. Large language models (LLMs) enable researchers to specify population characteristics and intervention context in natural language, but it remains unclear to what extent LLMs can use these inputs to infer intervention effects. We evaluated three LLMs on 11 climate-psychology interventions using a dataset of 59,508 participants from 62 countries, and replicated the main analysis in two additional datasets (12 and 27 countries). LLMs reproduced observed patterns in attitudinal outcomes (e.g., climate beliefs and policy support) reasonably well, and prompting refinements improved this descriptive fit. However, descriptive fit did not reliably translate into causal fidelity (i.e., accurate estimates of intervention effects), and these two dimensions of accuracy followed different error structures. This descriptive-causal divergence held across the three datasets but varied across intervention logics, with larger errors for interventions that depended on evoking internal experience rather than on directly conveying reasons or social cues. It was more pronounced for behavioral outcomes, where LLMs imposed stronger attitude-behavior coupling than appears in human data. Countries and population groups that appeared well captured descriptively were not necessarily those with lower causal errors. Relying on descriptive fit alone may therefore create unwarranted confidence in simulation results, lead to misleading conclusions about intervention effects, and mask population disparities that matter for fairness.
Learning Dexterous Grasping from Sparse Taxonomy Guidance
arXiv:2604.04138v1 Announce Type: cross
Abstract: Dexterous manipulation requires planning a grasp configuration suited to the object and task, which is then executed through coordinated multi-finger