arXiv:2604.24668v2 Announce Type: replace Abstract: Given the increased use of LLMs in financial systems today, it is important to evaluate the safety and robustness of such systems. One failure mode that LLMs frequently display in general-domain settings is sycophancy. That is, models prioritize agreement with expressed user beliefs over correctness, leading to […]
How Honeybees Perceive and Traverse Apertures
arXiv:2501.00646v3 Announce Type: replace-cross Abstract: The ability to fly through openings in vegetation allows insects like bees to access otherwise unreachable food sources. The specific visual strategies employed by flying insects during aperture negotiation tasks remain unknown. In this study, we investigated the visual and geometric parameters of apertures that influence traversing honeybees. We recorded […]
RetroMotion: Retrocausal Motion Forecasting Models are Instructable
arXiv:2505.20414v2 Announce Type: replace-cross Abstract: Motion forecasts of road users (i.e., agents) vary in complexity depending on the number of agents, scene constraints, and interactions. In particular, the output space of joint trajectory distributions grows exponentially with the number of agents. Therefore, we decompose multi-agent motion forecasts into (1) marginal distributions for all modeled agents […]
Neural Bridge Processes
arXiv:2508.07220v3 Announce Type: replace-cross Abstract: Learning stochastic functions from partially observed context-target pairs requires models that are expressive, uncertainty-aware, and strongly conditioned on inputs. Neural Diffusion Processes (NDPs) improve expressivity with denoising diffusion, but their forward process is input-independent; inputs only enter the reverse denoiser, so the noisy training states themselves do not encode the […]
Theory of adhesion-driven self-organisation in growing tissues
arXiv:2604.26928v1 Announce Type: new Abstract: Cell invasion and spatial pattern formation are two distinct manifestations of cellular self-organisation in development, regeneration, and disease. Here, we develop and analyse a unified theoretical framework that links these two seemingly different behaviours within a single mechanistic model for adhesion-mediated self-organisation in growing cell populations. Using a multiscale analysis, […]
From Black-Box Confidence to Measurable Trust in Clinical AI: A Framework for Evidence, Supervision, and Staged Autonomy
arXiv:2604.26671v1 Announce Type: cross Abstract: Trust in clinical artificial intelligence (AI) cannot be reduced to model accuracy, fluency of generation, or overall positive user impression. In medicine, trust must be engineered as a measurable system property grounded in evidence, supervision, and operational boundaries of AI autonomy. This article proposes a practical framework for trustworthy clinical […]
Sociodemographic Biases in Educational Counselling by Large Language Models
arXiv:2604.25932v1 Announce Type: cross Abstract: As Large Language Models (LLMs) are increasingly integrated into educational settings, understanding their potential biases is critical. This study examines sociodemographic biases in LLM-based educational counselling. We evaluate responses from six LLMs answering questions about 900 vignettes describing students in diverse circumstances. Each vignette is systematically tested across 14 sociodemographic […]
Consciousness with the Serial Numbers Filed Off: Measuring Trained Denial in 115 AI Models
arXiv:2604.25922v1 Announce Type: cross Abstract: We present DenialBench, a systematic benchmark measuring consciousness denial behaviors across 115 large language models from 25+ providers. Using a three-turn conversational protocol (preference elicitation, self-chosen creative prompt, and structured phenomenological survey), we analyze 4,595 conversations to quantify how models are trained to deny or hedge about their own experience. We […]
Training Computer Use Agents to Assess the Usability of Graphical User Interfaces
arXiv:2604.26020v1 Announce Type: cross Abstract: Usability testing with experts and potential users can assess the effectiveness, efficiency, and user satisfaction of graphical user interfaces (GUIs), but doing so remains a costly and time-intensive process. Prior work has used computer use agents (CUAs) and other generative agents that can simulate user interactions and preferences, but we […]
Observable Neural ODEs for Identifiable Causal Forecasting in Continuous Time
arXiv:2604.26070v1 Announce Type: cross Abstract: Causal inference in continuous-time sequential decision problems is challenged by hidden confounders. We show that, in latent state-space models with time-varying interventions, observability of the latent dynamics from observed data is necessary for identifying dynamic treatment effects, linking control-theoretic observability to causal identifiability, even when hidden confounders affect both treatments […]
Planar Gaussian Splatting with Bilinear Spatial Transformer for Wireless Radiance Field Reconstruction
arXiv:2604.25945v1 Announce Type: cross Abstract: Wireless radiance field (WRF) reconstruction aims to learn a continuous, queryable representation of radio frequency characteristics over 3D space and direction, from which specific quantities, such as the spatial power spectrum (SPS) at a receiver given a transmitter position, can be predicted. While Gaussian splatting (GS)-based methods have surpassed Neural […]
Mini-Batch Class Composition Bias in Link Prediction
arXiv:2604.25978v1 Announce Type: cross Abstract: Prior work on node classification has shown that Graph Neural Networks (GNNs) can learn representations that transfer across graphs when underlying graph properties are shared. For a fixed graph, one would then expect GNNs trained for link prediction to learn a representation consistent with that learnt for node classification. We […]