A Unified Framework for Locality in Scalable MARL

June 4, 2026

arXiv:2602.16966v2 Announce Type: replace-cross
Abstract: Scalable methods for networked multi-agent reinforcement learning let each agent plan using only a small neighborhood of the agent graph. This works only when the system is value-local, meaning a perturbation at one agent affects the long-run value at another agent weakly when the two are far apart. In the average-reward setting, the standard way to certify locality is the Dobrushin row-sum bound on a single matrix $C^\pi$ that captures how each agent’s next state depends on each other agent’s current state. To make this matrix easy to work with, prior work bounds it by a supremum over joint actions. The resulting bound is independent of the policy, but it is loose whenever the policy never picks the worst-case action. We split $C^\pi$ into pieces that separately track environment sensitivity and policy sensitivity, $C^\pi \preceq E^{\mathrm{s}}+E^{\mathrm{a}}\Pi(\pi)$, where $E^{\mathrm{s}}$ measures how the next state moves with the current state, $E^{\mathrm{a}}$ measures how it moves with the current action, and $\Pi(\pi)$ measures how reactive the policy is to changes in state. The spectral radius of $H^\pi := E^{\mathrm{s}}+E^{\mathrm{a}}\Pi(\pi)$ then controls the decay of the average-reward Poisson solution, and the spectral certificate $\rho(H^\pi)<1$ is strictly weaker than the row-sum condition $\|H^\pi\|_\infty<1$ on the same matrix and applies in regimes where policy-independent action-supremum bounds used in prior Dobrushin-style work cannot. For temperature-$\tau$ softmax policies we get $\Pi(\pi)\le L/(2\tau)$, so the softmax temperature directly controls locality. We use this decay result to give a deterministic oracle guarantee for a block-coordinate KL-proximal policy-improvement template whose truncation bias decays exponentially in the message-passing radius $\kappa$.
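
To make the gap between the two certificates concrete, here is a minimal numerical sketch, not the paper's code. Everything in it is an assumption made for illustration: the 3-agent sensitivity matrices `E_s` and `E_a` are invented, the policy reactivity $\Pi(\pi)$ is treated as a single scalar replaced by its softmax upper bound $L/(2\tau)$, and the helper names are hypothetical. It builds $H = E^{\mathrm{s}} + E^{\mathrm{a}}\,\Pi$ at two temperatures and compares the row-sum norm with the spectral radius.

```python
# Illustrative sketch only: matrices, constants, and helper names are
# assumptions, not taken from the paper.

def row_sum_norm(H):
    """Infinity norm: the largest absolute row sum (Dobrushin-style check)."""
    return max(sum(abs(x) for x in row) for row in H)

def spectral_radius(H, iters=500):
    """Power iteration. Assumes H is entrywise nonnegative with a single
    dominant eigenvalue, which holds for the toy matrices below."""
    n = len(H)
    v = [1.0] * n
    rho = 0.0
    for _ in range(iters):
        w = [sum(H[i][j] * v[j] for j in range(n)) for i in range(n)]
        rho = max(abs(x) for x in w)
        if rho == 0.0:
            return 0.0
        v = [x / rho for x in w]
    return rho

def softmax_reactivity_bound(L, tau):
    """Upper bound L / (2 * tau) on policy reactivity quoted in the abstract."""
    return L / (2.0 * tau)

def build_H(E_s, E_a, reactivity):
    """H = E_s + reactivity * E_a, with reactivity treated as a scalar
    (a simplifying assumption for this sketch)."""
    return [[s + reactivity * a for s, a in zip(rs, ra)]
            for rs, ra in zip(E_s, E_a)]

# Hypothetical 3-agent line graph: agent 1 sits between agents 0 and 2.
E_s = [[0.20, 0.30, 0.00],
       [0.05, 0.20, 0.05],
       [0.00, 0.30, 0.20]]
E_a = [[0.00, 0.60, 0.00],
       [0.05, 0.00, 0.05],
       [0.00, 0.60, 0.00]]

L = 1.0  # assumed Lipschitz-type constant
H_cold = build_H(E_s, E_a, softmax_reactivity_bound(L, tau=0.5))  # reactive policy
H_warm = build_H(E_s, E_a, softmax_reactivity_bound(L, tau=2.0))  # smoother policy

for name, H in [("tau=0.5", H_cold), ("tau=2.0", H_warm)]:
    print(name, "row-sum norm:", row_sum_norm(H),
          "spectral radius:", spectral_radius(H))
```

In this toy instance the low-temperature matrix has a row sum above 1, so the row-sum condition fails, while its spectral radius is still below 1. Raising the temperature shrinks the action-sensitivity term and brings both quantities down. Because the matrices are nonnegative and the spectral radius is monotone in the entries, plugging in the upper bound $L/(2\tau)$ in place of the true reactivity can only overestimate $\rho(H)$.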

