From pilot to policy: why AI health interventions fail to scale in developing countries


Infectious disease burden and surveillance challenges in Jordan and Palestine: a systematic review and meta-analysis

Background: Jordan and Palestine face public health challenges due to infectious diseases, with the added detrimental factors of long-term conflict, forced relocation, and lack of resources.

Detection of Antithrombotic-Related Bleeding in Older Inpatients: Multicenter Retrospective Study Using Structured and Unstructured Electronic Health Record Data

Background: Bleeding complications are a major contributor to adverse drug events (ADEs) among older inpatients, particularly in those treated with antithrombotic agents. Timely and accurate

The Relationship Between Physician Self-Disclosure and Patient Acquisition in Digital Health Markets: Cross-Sectional Study

Background: Online health communities have evolved into digital marketplaces where physicians have to compete for patients. Existing research examines physician-patient dynamics through a patient-centric lens,

Characterization of Models for Identifying Physical and Cognitive Frailty in Older Adults With Diabetes: Systematic Review and Meta-Analysis

Background: A growing number of risk prediction models have been developed to estimate the risk of frailty in individuals with diabetes. However, the methodological quality

LangForce: Bayesian Decomposition of Vision Language Action Models via Latent Action Queries

January 28, 2026

arXiv:2601.15197v4 Announce Type: replace
Abstract: Vision-Language-Action (VLA) models have shown promise in robot manipulation but often struggle to generalize to new instructions or complex multi-task scenarios. We identify a critical pathology in current training paradigms where goal-driven data collection creates a dataset bias. In such datasets, language instructions are highly predictable from visual observations alone, causing the conditional mutual information between instructions and actions to vanish, a phenomenon we term Information Collapse. Consequently, models degenerate into vision-only policies that ignore language constraints and fail in out-of-distribution (OOD) settings. To address this, we propose LangForce, a novel framework that enforces instruction following via Bayesian decomposition. By introducing learnable Latent Action Queries, we construct a dual-branch architecture to estimate both a vision-only prior $p(a \mid v)$ and a language-conditioned posterior $\pi(a \mid v, \ell)$. We then optimize the policy to maximize the conditional Pointwise Mutual Information (PMI) between actions and instructions. This objective effectively penalizes the vision shortcut and rewards actions that explicitly explain the language command. Without requiring new data, LangForce significantly improves generalization. Extensive experiments on SimplerEnv and RoboCasa demonstrate substantial gains, including an 11.3% improvement on the challenging OOD SimplerEnv benchmark, validating the ability of our approach to robustly ground language in action.
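
To make the quantity in the abstract concrete, here is a minimal sketch of conditional PMI, taken here to mean the standard log-ratio of the language-conditioned posterior to the vision-only prior for a given action. The abstract does not give LangForce's actual loss or architecture, so this is not the authors' implementation. The function name, the two-action toy setup, and all probabilities below are hypothetical, chosen only for illustration.

```python
import math


def conditional_pmi(posterior_prob: float, prior_prob: float) -> float:
    """Return log pi(a | v, l) - log p(a | v) for a single action a.

    A positive value means the instruction makes the action more likely
    than vision alone; zero means the instruction adds no information.
    """
    if posterior_prob <= 0.0 or prior_prob <= 0.0:
        raise ValueError("probabilities must be strictly positive")
    return math.log(posterior_prob) - math.log(prior_prob)


# Hypothetical vision-only prior p(a | v): the scene alone is ambiguous.
prior = {"pick_red": 0.5, "pick_blue": 0.5}

# Hypothetical posterior pi(a | v, l) for the instruction "pick the red block".
posterior = {"pick_red": 0.9, "pick_blue": 0.1}

# PMI per action: positive for the instructed action, negative for the other.
pmi = {action: conditional_pmi(posterior[action], prior[action]) for action in prior}

# Collapse case: the posterior equals the prior, so the PMI is exactly zero
# and the policy behaves as if it had never read the instruction.
collapsed_pmi = conditional_pmi(prior["pick_red"], prior["pick_red"])

for action, value in pmi.items():
    print(f"{action}: PMI = {value:+.3f}")
print(f"collapsed: PMI = {collapsed_pmi:+.3f}")
```

In this toy setting, a policy that ignores language has a PMI of zero for every action, which is the vision shortcut the abstract says the objective penalizes.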


Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK, registration number 16808844.