WildRelight: A Real-World Benchmark and Physics-Guided Adaptation for Single-Image Relighting

arXiv:2605.11696v1 Announce Type: cross Abstract: Recent single-image relighting methods, powered by advanced generative models, have achieved impressive photorealism on synthetic benchmarks. However, their effectiveness in the complex visual landscape of the real world remains largely unverified. A critical gap exists, as current datasets are typically designed for multi-view reconstruction and fail to address the unique […]

When Does Non-Uniform Replay Matter in Reinforcement Learning?

arXiv:2605.10236v2 Announce Type: replace-cross Abstract: Modern off-policy reinforcement learning algorithms often rely on simple uniform replay sampling, and it remains unclear when and why non-uniform replay improves over this strong baseline. Across diverse RL settings, we show that the effectiveness of non-uniform replay is governed by three factors: replay volume, the number of replayed transitions […]

Debiased Model-based Representations for Sample-efficient Continuous Control

arXiv:2605.11711v1 Announce Type: cross Abstract: Model-based representations have recently stood out as a promising framework that embeds latent dynamics information into the representations for downstream off-policy actor-critic learning. It implicitly combines the advantages of both model-free and model-based approaches while avoiding the training costs associated with model-based methods. Nevertheless, existing model-based representation methods can fail to […]

StereoTales: A Multilingual Framework for Open-Ended Stereotype Discovery in LLMs

arXiv:2605.10442v2 Announce Type: replace-cross Abstract: Multilingual studies of social bias in open-ended LLM generation remain limited: most existing benchmarks are English-centric, template-based, or restricted to recognizing pre-specified stereotypes. We introduce StereoTales, a multilingual dataset and evaluation pipeline for systematically studying the emergence of social bias in open-ended LLM generation. The dataset covers 10 languages and […]

Beyond the Last Layer: Multi-Layer Representation Fusion for Visual Tokenization

arXiv:2605.10780v2 Announce Type: replace-cross Abstract: Representation autoencoders that reuse frozen pretrained vision encoders as visual tokenizers have achieved strong reconstruction and generation quality. However, existing methods universally extract features from only the last encoder layer, discarding the rich hierarchical information distributed across intermediate layers. We show that low-level visual details survive in the last layer […]

A Research Agenda on Agents and Software Engineering: Outcomes from the Rio A2SE Seminar

arXiv:2605.11720v1 Announce Type: cross Abstract: The rise of agentic AI is reshaping software engineering in two intertwined directions: agents are increasingly applied to support software engineering tasks, and agentic AI systems themselves are complex systems that require re-thinking currently established software engineering practices. To chart a coherent research agenda covering the two directions, we organized […]

Clin-JEPA: A Multi-Phase Co-Training Framework for Joint-Embedding Predictive Pretraining on EHR Patient Trajectories

arXiv:2605.10840v2 Announce Type: replace-cross Abstract: We present Clin-JEPA, a multi-phase co-training framework for joint-embedding predictive (JEPA) pretraining on EHR patient trajectories. JEPA architectures have enabled latent-space planning in robotics and high-quality representation learning in vision, but extending the paradigm to EHR data — to obtain a single backbone that simultaneously forecasts patient trajectories and serves […]

Classifier Context Rot: Monitor Performance Degrades with Context Length

arXiv:2605.12366v1 Announce Type: new Abstract: Monitoring coding agents for dangerous behavior using language models requires classifying transcripts that often exceed 500K tokens, but prior agent monitoring benchmarks rarely contain transcripts longer than 100K tokens. We show that when used as classifiers, current frontier models fail to notice dangerous actions more often in longer transcripts. In […]

Emergent Communication between Heterogeneous Visual Agents through Decentralized Learning

arXiv:2605.11695v1 Announce Type: cross Abstract: Symbols are shared, but perception is private. We study emergent communication between heterogeneous visual agents through decentralized learning, asking what visual information can become shareable when agents have different visual representations. Instead of optimizing messages through a shared external communicative objective, our agents exchange only discrete token sequences and update […]

ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents

arXiv:2605.12481v1 Announce Type: new Abstract: Computer Use Agents (CUAs) can act through both atomic GUI actions, such as click and type, and high-level tool calls, such as API-based file operations, but this hybrid action space often leaves them uncertain about when to continue with GUI actions or switch to tools, leading to suboptimal execution paths. […]

Retrieve-then-Steer: Online Success Memory for Test-Time Adaptation of Generative VLAs

arXiv:2605.10094v2 Announce Type: replace-cross Abstract: Vision-Language-Action (VLA) models show strong potential for general-purpose robotic manipulation, yet their closed-loop reliability often degrades under local deployment conditions. Existing evaluations typically treat test episodes as independent zero-shot trials. However, real robots often operate repeatedly in the same or slowly changing environments, where successful executions provide environment-verified evidence of […]

Rotation-Preserving Supervised Fine-Tuning

arXiv:2605.10973v1 Announce Type: cross Abstract: Supervised fine-tuning (SFT) improves in-domain performance but can degrade out-of-domain (OOD) generalization. Prior work suggests that this degradation is related to changes in dominant singular subspaces of pretrained weight matrices. However, directly identifying loss-sensitive directions with Hessian or Fisher information is computationally expensive at LLM scale. In this work, we […]

