arXiv:2511.05913v2 Announce Type: replace-cross Abstract: New intent discovery (NID) seeks to recognize both new and known intents from unlabeled user utterances, and is widely used in practical dialogue systems. Existing work on NID mainly adopts a cascaded architecture, in which the first stage encodes the utterances into informative text embeddings beforehand, while the latter […]
GIRL-DETR: Gradient-Isolated Reinforcement Learning for Video Moment Retrieval
arXiv:2606.00775v1 Announce Type: cross Abstract: The Video Moment Retrieval (VMR) task requires accurately localizing temporal boundaries aligned with natural language queries, but many models suffer from a misalignment between continuous surrogate losses and non-differentiable metrics, leading to optimization stagnation during the late stages of training and trapping boundary predictions in suboptimal solutions. Although Reinforcement Learning (RL) […]
Coupling Language Models with Physics-based Simulation for Synthesis of Inorganic Materials
arXiv:2606.00315v1 Announce Type: new Abstract: Modern generative machine learning (ML) models can propose novel inorganic crystalline materials with targeted properties; however, synthesis planning of these materials remains difficult due to the complexity of the associated physical processes and limited availability of computational tools. We introduce a novel hybrid framework to evaluate Large Language Models (LLMs) […]
ELF: A Family of Encoder-Free ECG-Language Models
arXiv:2601.18798v2 Announce Type: replace-cross Abstract: ECG-Language Models (ELMs) extend recent advances in Multimodal Large Language Models (MLLMs) to automated ECG interpretation. However, most existing ELMs inherit Vision-Language Model (VLM) design choices and rely on pretrained ECG encoders, introducing substantial architectural and training complexity. Inspired by encoder-free VLMs, we introduce ELF, a family of three encoder-free […]
Dive into Waves: Morlet Spectral Transformer for Cross-Subject Emotion Decoding from EEG
arXiv:2606.00884v1 Announce Type: cross Abstract: We study cross-subject emotion recognition from EEG, a practically important yet challenging problem in brain-computer interfaces. Unlike tasks with clear waveform signatures, emotion-related EEG signals are primarily encoded in spectral power and are weak, noisy, and highly variable across subjects. Existing approaches rely either on large pretrained EEG foundation models, […]
On the synaptic matrix eigenvalues of sparsely connected neural networks
arXiv:2606.00326v1 Announce Type: new Abstract: The spectral behaviour of the synaptic matrix, representing the neuronal connection strengths, is an important tool for analyzing the stability and transient dynamics of a typical brain as well as its learning process and memory capacity. The complexity of the brain due to the large number of neurons as well as […]
Silent Failures in Federated Personalization of Foundation Models
arXiv:2606.00947v1 Announce Type: cross Abstract: Foundation models are increasingly personalized on decentralized private data through federated learning and are now deployed at scale under growing regulatory requirements for post-market monitoring. We argue that this convergence creates a distinct and under-recognized class of trustworthiness failures, which we term “Silent Failures.” These include amplified bias, fairness collapse, […]
When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning
arXiv:2602.08236v2 Announce Type: replace-cross Abstract: Despite rapid progress in MLLMs, visual spatial reasoning remains unreliable when correct answers depend on how a scene would appear under unseen or alternative viewpoints. Recent work addresses this by augmenting reasoning with world models for visual imagination, but questions such as when imagination is actually necessary, how much of […]
Move the Query, Not the Cache: Characterizing Cross-Instance Latent Attention Redistribution Across GPU Fabrics
arXiv:2606.01502v1 Announce Type: cross Abstract: Frontier LLMs increasingly decide what a query attends to with a sparse-attention indexer that picks a few KV-cache blocks per query: attention’s unit is now a small, reusable chunk. Agentic workloads hammer it: many sub-agents query one large codebase, reusing the same blocks. When that corpus outgrows one GPU it […]
Topology-Preserving Neural Operator Learning via Hodge Decomposition
arXiv:2605.13834v2 Announce Type: replace-cross Abstract: In this paper, we study solution operators of physical field equations on geometric meshes from a function-space perspective. We reveal that Hodge orthogonality fundamentally resolves spectral interference by isolating unlearnable topological degrees of freedom from learnable geometric dynamics, enabling an additive approximation confined to structure-preserving subspaces. Building on Hodge theory […]
Knowledge Graphs as the Missing Data Layer for LLM-Based Industrial Asset Operations
arXiv:2605.26874v2 Announce Type: replace-cross Abstract: LLM-based agents for industrial asset operations show limited accuracy when reasoning over flat document stores. AssetOpsBench (KDD 2026) establishes that GPT-4 agents achieve 65% on 139 industrial maintenance scenarios, and compares LLM orchestration paradigms (Agent-As-Tool vs. Plan-Execute) on a fixed data layer. We ask the orthogonal question: how much does […]
Silent Failures in Physical AI: A Literature Review of Runtime Action Authorization for Autonomous Systems
arXiv:2606.00090v1 Announce Type: cross Abstract: Physical AI systems increasingly map multimodal observations, language instructions, and learned world representations into physically consequential actions. Robotics foundation models, vision-language-action models, and world-model-based autonomous systems can condition decisions that move vehicles, robots, drones, and industrial machines. This transition exposes a safety problem that is not fully captured by conventional […]