arXiv:2509.05890v3 Announce Type: replace-cross Abstract: Quantum reinforcement learning has emerged as a framework combining quantum computation with sequential decision-making, and applications to the multi-armed bandit (MAB) problem have been reported. The graph bandit problem extends the MAB setting by introducing spatial constraints, where the accessibility of arms is restricted by graph connectivity, yet quantum approaches […]
FaithLens: Detecting and Explaining Faithfulness Hallucination
arXiv:2512.20182v4 Announce Type: replace-cross Abstract: Recognizing whether outputs from large language models (LLMs) contain faithfulness hallucination is crucial for real-world applications, e.g., retrieval-augmented generation and summarization. In this paper, we introduce FaithLens, a cost-efficient and effective faithfulness hallucination detection model that can jointly provide binary predictions and corresponding explanations to improve trustworthiness. To achieve this, […]
Who Benefits from AI? Self-Selection, Skill Gap, and the Hidden Costs of AI Feedback
arXiv:2409.18660v2 Announce Type: replace-cross Abstract: Feedback from artificial intelligence (AI) is increasingly easy to access, and research has already established that people learn from it. But individuals choose when and how to seek such feedback, and more engaged and motivated individuals may seek it more, creating an illusion of effectiveness that masks self-selection. We investigate […]
End-to-End Large Portfolio Optimization for Variance Minimization with Neural Networks through Covariance Cleaning
arXiv:2507.01918v3 Announce Type: replace-cross Abstract: We develop a rotation-invariant neural network that provides the global minimum-variance portfolio by jointly learning how to lag-transform historical returns and marginal volatilities and how to regularise the eigenvalues of large equity covariance matrices. This explicit mathematical mapping offers clear interpretability of each module’s role, so the model cannot be […]
Multiclass Local Calibration with the Jensen-Shannon Distance
arXiv:2510.26566v2 Announce Type: replace-cross Abstract: Developing trustworthy Machine Learning (ML) models requires their predicted probabilities to be well-calibrated, meaning they should reflect true-class frequencies. Among calibration notions in multiclass classification, strong calibration is the most stringent, as it requires all predicted probabilities to be simultaneously calibrated across all classes. However, existing approaches to multiclass calibration […]
Temp-R1: A Unified Autonomous Agent for Complex Temporal KGQA via Reverse Curriculum Reinforcement Learning
arXiv:2601.18296v2 Announce Type: replace-cross Abstract: Temporal Knowledge Graph Question Answering (TKGQA) is inherently challenging, as it requires sophisticated reasoning over dynamic facts with multi-hop dependencies and complex temporal constraints. Existing methods rely on fixed workflows and expensive closed-source APIs, limiting flexibility and scalability. We propose Temp-R1, the first autonomous end-to-end agent for TKGQA trained through […]
Adapting Dijkstra for Buffers and Unlimited Transfers
arXiv:2603.11729v2 Announce Type: replace-cross Abstract: In recent years, RAPTOR-based algorithms have been considered the state of the art for path-finding with unlimited transfers without preprocessing. However, this status largely stems from the evolution of routing research, where Dijkstra-based solutions were superseded by timetable-based algorithms without a systematic comparison. In this work, we revisit classical Dijkstra-based approaches for […]
On the Emergence of Syntax by Means of Local Interaction
arXiv:2604.17857v2 Announce Type: replace-cross Abstract: Can syntactic processing emerge spontaneously from purely local interaction? We present a concrete instance on a minimal system: an 18,658-parameter two-dimensional neural cellular automaton (NCA), supervised by nothing more than a 1-bit boundary signal, is trained on the membership problem of an arithmetic-expression grammar. After training, its internal $L \times […]
Detecting Hallucinations in SpeechLLMs at Inference Time Using Attention Maps
arXiv:2604.19565v1 Announce Type: cross Abstract: Hallucinations in Speech Large Language Models (SpeechLLMs) pose significant risks, yet existing detection methods typically rely on gold-standard outputs that are costly or impractical to obtain. Moreover, hallucination detection methods developed for text-based LLMs do not directly capture audio-specific signals. We investigate four attention-derived metrics: AUDIORATIO, AUDIOCONSISTENCY, AUDIOENTROPY, and TEXTENTROPY, […]
Environmental Sound Deepfake Detection Using Deep-Learning Framework
arXiv:2604.19652v1 Announce Type: cross Abstract: In this paper, we propose a deep-learning framework for environmental sound deepfake detection (ESDD), the task of identifying whether the sound scene and sound event in an input audio recording are fake. To this end, we conducted extensive experiments to explore how individual spectrograms, a wide range […]
FASTER: Value-Guided Sampling for Fast RL
arXiv:2604.19730v1 Announce Type: cross Abstract: Some of the most performant reinforcement learning algorithms today can be prohibitively expensive as they use test-time scaling methods such as sampling multiple action candidates and selecting the best one. In this work, we propose FASTER, a method for getting the benefits of sampling-based test-time scaling of diffusion-based policies without […]
MRS: Multi-Resolution Skills for HRL Agents
arXiv:2505.21410v2 Announce Type: replace Abstract: Hierarchical reinforcement learning (HRL) decomposes the policy into a manager and a worker, enabling long-horizon planning but introducing a performance gap on tasks requiring agility. We identify a root cause: in subgoal-based HRL, the manager’s goal representation is typically learned without constraints on reachability or temporal distance from the current […]