Patient and clinician perceptions, expectations, and usability of ankle exoskeletons for daily living: a mixed-methods survey study

Ankle exoskeletons offer promising support for individuals with chronic foot drop, yet user and clinician perspectives on their use in daily living remain underexplored. Related

Development of reconfigurable smart medical wards using integrated components and complex features

Treating patients in hospitals requires regular monitoring to assess their health conditions. At the same time, routine measurements are often delayed, missed, or not

Why digital health fails silently: a sociotechnical theory of health information technology–related risk

Introduction: Health information technology (HIT) is now integral to healthcare delivery, supporting clinical documentation, prescribing, diagnostics, and care coordination. Although these technologies offer substantial benefits, they

Portable automated rapid testing for auditory assessment: repeated at-home testing in older adults

Introduction: Hearing challenges are prevalent in older adults and are associated with age-related cognitive decline. However, measuring age-related changes in hearing faces critical barriers related to

Why health information technology safety problems remain invisible


Epistemic Regret Minimization: Label-Free Causal Critique Beyond Outcome Reward

May 21, 2026

arXiv:2602.11675v4 Announce Type: replace
Abstract: Large language models can answer causal questions correctly for the wrong reasons. Current RL methods reward what a model concludes but ignore why, reinforcing correlational shortcuts, a failure we call Reward Entrenchment. We introduce Epistemic Regret Minimization (ERM), a framework that critiques the causal structure of a model’s reasoning trace rather than its answer. Applying established causal principles, ERM flags unexamined confounders, correlation–intervention conflation, and unchecked back-door paths from exposed reasoning traces. The framework admits label-free operation (without the true causal graph or correct answer), and we separately distinguish favorable benchmark-derived critique, error-direction cues, and fully label-free judge-generated critique in the experiments. Within a single episode, ERM detects and repairs causal reasoning errors; across episodes, it accumulates interventional evidence into a reward signal applicable where no answer key exists. Experiments on 1,360 scenarios across six frontier LLMs show that reasoning-heavy models (GPT-4 Turbo, GPT-5.2) resist outcome-only correction (25–31% recovery) yet respond to causal critique (78–91%), gaining +53–59 pp. Standard test-time methods (self-consistency, Best-of-N, Self-Refine) underperform outcome-only reprompting on causal tasks, while ERM reduces residual Rung Collapse from 55–70% to 4%. A separation theorem proves outcome-only reward cannot close this gap; a controlled simulation confirms epistemic feedback does, outperforming outcome-only baselines 38-fold.


Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844