Relative Density Ratio Optimization for Stable and Statistically Consistent Model Alignment

arXiv:2604.04410v1 Announce Type: cross
Abstract: Aligning language models with human preferences is essential for ensuring their safety and reliability. Although most existing approaches assume specific human preference models such as the Bradley-Terry model, this assumption may fail to accurately capture true human preferences, and consequently, these methods lack statistical consistency, i.e., the guarantee that language […]

Boosted Distributional Reinforcement Learning: Analysis and Healthcare Applications

arXiv:2604.04334v1 Announce Type: cross
Abstract: Researchers and practitioners are increasingly considering reinforcement learning to optimize decisions in complex domains like robotics and healthcare. To date, these efforts have largely utilized expectation-based learning. However, relying on expectation-focused objectives may be insufficient for making consistent decisions in highly uncertain situations involving multiple heterogeneous groups. While distributional reinforcement […]

Partial health status observability and time horizon uncertainty in mean-field game epidemiological models

arXiv:2604.04305v2 Announce Type: cross
Abstract: We introduce Mean-Field Game (MFG) epidemiological models, in which immunity either wanes with time in a fully observable way or disappears instantaneously with no direct observation (making a previously recovered individual fully susceptible again without realizing it). Both interpretations create computational challenges for rational noninfected individuals deciding on their contact […]

Beyond Corner Patches: Semantics-Aware Backdoor Attack in Federated Learning

arXiv:2603.29328v3 Announce Type: replace-cross
Abstract: Backdoor attacks on federated learning (FL) are most often evaluated with synthetic corner patches or out-of-distribution (OOD) patterns that are unlikely to arise in practice. In this paper, we revisit the backdoor threat to standard FL (a single global model) under a more realistic setting where triggers must be semantically […]

How Alignment Routes: Localizing, Scaling, and Controlling Policy Circuits in Language Models

arXiv:2604.04385v2 Announce Type: cross
Abstract: This paper identifies a recurring sparse routing mechanism in alignment-trained language models: a gate attention head reads detected content and triggers downstream amplifier heads that boost the signal toward refusal. Using political censorship and safety refusal as natural experiments, the mechanism is traced across 9 models from 6 labs, all […]

“When to Hand Off, When to Work Together”: Expanding Human-Agent Co-Creative Collaboration through Concurrent Interaction

arXiv:2603.02050v4 Announce Type: replace-cross
Abstract: As agents move into shared workspaces and their execution becomes visible, human-agent collaboration faces a fundamental shift from sequential delegation to concurrent co-creation. This raises a new coordination problem: what interaction patterns emerge, and what agent capabilities are required to support them? Study 1 (N=10) revealed that process visibility naturally […]

The portability paradox of foundation models for clinical decision support

npj Digital Medicine, Published online: 07 April 2026; doi:10.1038/s41746-026-02615-4
Yakdan et al. demonstrate that foundation models (FMs) trained to predict cervical spondylotic myelopathy from electronic health record data outperform traditional models on internal datasets but lose their advantage during external validation. This suggests that the feature-dense patterns learned by FMs may reduce their portability across […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844