arXiv:2604.09566v1 Announce Type: cross Abstract: The application of games as a therapeutic tool for cognitive training is beneficial for patients with cognitive impairments. However, effective game design for individual patients is resource-intensive. To this end, we propose an LLM-powered method, LETGAMES, for automated and personalized therapeutic game design. Inspired by Dungeons & Dragons, LETGAMES […]
Detecting Safety Violations Across Many Agent Traces
arXiv:2604.11806v1 Announce Type: new Abstract: To identify safety violations, auditors often search over large sets of agent traces. This search is difficult because failures are often rare, complex, and sometimes even adversarially hidden and only detectable when multiple traces are analyzed together. These challenges arise in diverse settings such as misuse campaigns, covert sabotage, reward […]
LitPivot: Developing Well-Situated Research Ideas Through Dynamic Contextualization and Critique within the Literature Landscape
arXiv:2604.02600v2 Announce Type: replace-cross Abstract: Developing a novel research idea is hard. It must be distinct enough from prior work to claim a contribution while also building on it. This requires iteratively reviewing literature and refining an idea based on what a researcher reads; yet when an idea changes, the literature that matters often changes […]
CLAY: Conditional Visual Similarity Modulation in Vision-Language Embedding Space
arXiv:2604.11539v1 Announce Type: cross Abstract: Human perception of visual similarity is inherently adaptive and subjective, depending on the users’ interests and focus. However, most image retrieval systems fail to reflect this flexibility, relying on a fixed, monolithic metric that cannot incorporate multiple conditions simultaneously. To address this, we propose CLAY, an adaptive similarity computation method […]
Federated Single-Agent Robotics: Multi-Robot Coordination Without Intra-Robot Multi-Agent Fragmentation
arXiv:2604.11028v1 Announce Type: cross Abstract: As embodied robots move toward fleet-scale operation, multi-robot coordination is becoming a central systems challenge. Existing approaches often treat this as motivation for increasing internal multi-agent decomposition within each robot. We argue for a different principle: multi-robot coordination does not require intra-robot multi-agent fragmentation. Each robot should remain a single […]
The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping
arXiv:2604.11297v1 Announce Type: cross Abstract: Despite the success of reinforcement learning for large language models, a common failure mode is reduced sampling diversity, where the policy repeatedly generates similar erroneous behaviors. Classical entropy regularization encourages randomness under the current policy, but does not explicitly discourage recurrent failure patterns across rollouts. We propose MEDS, a Memory-Enhanced […]
Speaking to No One: Ontological Dissonance and the Double Bind of Conversational AI
arXiv:2604.10833v1 Announce Type: cross Abstract: Recent reports indicate that sustained interaction with conversational artificial intelligence (AI) systems can, in a small subset of users, contribute to the emergence or stabilisation of delusional experience. Existing accounts typically attribute such cases either to individual vulnerability or to failures of safety engineering. These explanations are incomplete. Drawing on […]
SCITUNE: Aligning Large Language Models with Human-Curated Scientific Multimodal Instructions
arXiv:2307.01139v2 Announce Type: replace-cross Abstract: Instruction finetuning is a popular paradigm to align large language models (LLMs) with human intent. Despite its popularity, this idea is less explored in improving LLMs to align existing foundation models with scientific disciplines, concepts and goals. In this work, we present SciTune as a tuning framework to improve the […]
Detecting Invariant Manifolds in ReLU-Based RNNs
arXiv:2510.03814v4 Announce Type: replace-cross Abstract: Recurrent Neural Networks (RNNs) have found widespread applications in machine learning for time series prediction and dynamical systems reconstruction, and experienced a recent renaissance with improved training algorithms and architectural designs. Understanding why and how trained RNNs produce their behavior is important for scientific and medical applications, and explainable AI […]
C-ReD: A Comprehensive Chinese Benchmark for AI-Generated Text Detection Derived from Real-World Prompts
arXiv:2604.11796v1 Announce Type: cross Abstract: Large language models (LLMs) have recently become capable of generating highly fluent textual content. While they offer significant convenience to humans, they also introduce various risks, such as phishing and academic dishonesty. Numerous research efforts have been dedicated to developing algorithms for detecting AI-generated text and constructing relevant datasets. However, in the […]
FlexMS is a flexible framework for benchmarking deep learning-based mass spectrum prediction tools in metabolomics
arXiv:2602.22822v2 Announce Type: replace Abstract: The identification and property prediction of chemical molecules is of central importance in the advancement of drug discovery and material science, where the tandem mass spectrometry technology gives valuable fragmentation cues in the form of mass-to-charge ratio peaks. However, the lack of experimental spectra hinders the attachment of each molecular […]
Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping
arXiv:2505.13777v2 Announce Type: replace-cross Abstract: We present Sat2Sound, a unified multimodal framework for geospatial soundscape understanding, designed to predict and map the distribution of sounds across the Earth’s surface. Existing methods for this task rely on paired satellite images and geotagged audio samples, which often fail to capture the full diversity of sound at a […]