Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games

arXiv:2506.03610v3 Announce Type: replace
Abstract: Large Language Model (LLM) agents are reshaping the game industry by enabling more intelligent and human-preferable characters. Yet current game benchmarks fall short of practical needs: they lack evaluations of diverse LLM capabilities across various game genres, studies of agentic modules crucial for complex gameplay, and fine-tuning datasets for adapting pre-trained LLMs into gaming agents. To fill these gaps, we present Orak, a benchmark for training and evaluating LLM agents across 12 popular video games spanning all major genres. Using a plug-and-play interface built on the Model Context Protocol (MCP), Orak supports systematic and reproducible studies of agentic modules in varied game scenarios. We further release a fine-tuning dataset of expert LLM gameplay trajectories covering multiple genres, which turns general-purpose LLMs into effective game agents. Orak offers a unified evaluation framework, including game leaderboards, LLM battle arenas, and ablation studies of input modality, agentic strategies, and fine-tuning effects, establishing a foundation towards versatile gaming agents. Code and datasets are available at https://github.com/krafton-ai/Orak and https://huggingface.co/datasets/KRAFTON/Orak.
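To make the "plug-and-play" idea concrete, here is a minimal sketch of how a game environment and an LLM agent with swappable agentic modules might be wired together. All names (GameEnv, LLMGameAgent, run_episode, etc.) are illustrative assumptions for this post, not the actual Orak or MCP API; see the repository above for the real interface.

```python
# Hypothetical sketch of a plug-and-play game-agent loop in the spirit of
# Orak's MCP-based design. Names are illustrative, not the repo's API.
from dataclasses import dataclass, field
from typing import Callable, Protocol


class GameEnv(Protocol):
    """Any game exposes a textual observation and accepts an action string."""
    def observe(self) -> str: ...
    def step(self, action: str) -> bool: ...  # True while the episode continues


@dataclass
class LLMGameAgent:
    """LLM agent whose agentic modules (e.g. reflection, planning) are swappable."""
    llm: Callable[[str], str]                         # prompt -> completion
    modules: list[Callable[[str], str]] = field(default_factory=list)

    def act(self, observation: str) -> str:
        prompt = observation
        for module in self.modules:                   # each module augments the prompt
            prompt = module(prompt)
        return self.llm(prompt)


def run_episode(env: GameEnv, agent: LLMGameAgent, max_turns: int = 100) -> None:
    """Run one game episode, letting the agent act on each observation."""
    for _ in range(max_turns):
        action = agent.act(env.observe())
        if not env.step(action):
            break
```

Because the environment and the agentic modules only meet through these narrow interfaces, swapping in a different game, a different LLM backend, or a different module stack (the kind of ablation the benchmark reports) requires no changes to the episode loop itself.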

