Ranking-aware Reinforcement Learning for Ordinal Ranking

arXiv:2601.20585v1 Announce Type: cross Abstract: Ordinal regression and ranking are challenging due to inherent ordinal dependencies that conventional methods struggle to model. We propose Ranking-Aware Reinforcement Learning (RARL), a novel RL framework that explicitly learns these relationships. At its core, RARL features a unified objective that synergistically integrates regression and Learning-to-Rank (L2R), enabling mutual improvement […]

Person Re-ID in 2025: Supervised, Self-Supervised, and Language-Aligned. What Works?

arXiv:2601.20598v1 Announce Type: cross Abstract: Person Re-Identification (ReID) remains a challenging problem in computer vision. This work reviews various training paradigm and evaluates the robustness of state-of-the-art ReID models in cross-domain applications and examines the role of foundation models in improving generalization through richer, more transferable visual representations. We compare three training paradigms, supervised, self-supervised, […]

FaithSCAN: Model-Driven Single-Pass Hallucination Detection for Faithful Visual Question Answering

arXiv:2601.00269v2 Announce Type: replace-cross Abstract: Faithfulness hallucinations in VQA occur when vision-language models produce fluent yet visually ungrounded answers, severely undermining their reliability in safety-critical applications. Existing detection methods mainly fall into two categories: external verification approaches relying on auxiliary models or knowledge bases, and uncertainty-driven approaches using repeated sampling or uncertainty estimates. The former […]

Regularized Gradient Temporal-Difference Learning

arXiv:2601.20599v1 Announce Type: cross Abstract: Gradient temporal-difference (GTD) learning algorithms are widely used for off-policy policy evaluation with function approximation. However, existing convergence analyses rely on the restrictive assumption that the so-called feature interaction matrix (FIM) is nonsingular. In practice, the FIM can become singular and leads to instability or degraded performance. In this paper, […]

GRACE: A Language Model Framework for Explainable Inverse Reinforcement Learning

arXiv:2510.02180v2 Announce Type: replace-cross Abstract: Inverse Reinforcement Learning aims to recover reward models from expert demonstrations, but traditional methods yield black-box models that are difficult to interpret and debug. In this work, we introduce GRACE (Generating Rewards As CodE), a method for using Large Language Models within an evolutionary search to reverse-engineer an interpretable, code-based […]

Inequality in Congestion Games with Learning Agents

arXiv:2601.20578v1 Announce Type: cross Abstract: Who benefits from expanding transport networks? While designed to improve mobility, such interventions can also create inequality. In this paper, we show that disparities arise not only from the structure of the network itself but also from differences in how commuters adapt to it. We model commuters as reinforcement learning […]

SiNGER: A Clearer Voice Distills Vision Transformers Further

arXiv:2509.20986v3 Announce Type: replace-cross Abstract: Vision Transformers are widely adopted as the backbone of vision foundation models, but they are known to produce high-norm artifacts that degrade representation quality. When knowledge distillation transfers these features to students, high-norm artifacts dominate the objective, so students overfit to artifacts and underweight informative signals, diminishing the gains from […]

Fail Fast, Win Big: Rethinking the Drafting Strategy in Speculative Decoding via Diffusion LLMs

arXiv:2512.20573v3 Announce Type: replace-cross Abstract: Diffusion Large Language Models (dLLMs) offer fast, parallel token generation, but their standalone use is plagued by an inherent efficiency-quality tradeoff. We show that, if carefully applied, the attributes of dLLMs can actually be a strength for drafters in speculative decoding with autoregressive (AR) verifiers. Our core insight is that […]

Diffusion Timbre Transfer Via Mutual Information Guided Inpainting

arXiv:2601.01294v2 Announce Type: replace-cross Abstract: We study timbre transfer as an inference-time editing problem for music audio. Starting from a strong pre-trained latent diffusion model, we introduce a lightweight procedure that requires no additional training: (i) a dimension-wise noise injection that targets latent channels most informative of instrument identity, and (ii) an early-step clamping mechanism […]

Robust Distributed Learning under Resource Constraints: Decentralized Quantile Estimation via (Asynchronous) ADMM

arXiv:2601.20571v1 Announce Type: cross Abstract: Specifications for decentralized learning on resource-constrained edge devices require algorithms that are communication-efficient, robust to data corruption, and lightweight in memory usage. While state-of-the-art gossip-based methods satisfy the first requirement, achieving robustness remains challenging. Asynchronous decentralized ADMM-based methods have been explored for estimating the median, a statistical centrality measure that […]

Independence of Approximate Clones

arXiv:2601.20779v1 Announce Type: cross Abstract: In an ordinal election, two candidates are said to be perfect clones if every voter ranks them adjacently. The independence of clones axiom then states that removing one of the two clones should not change the election outcome. This axiom has been extensively studied in social choice theory, and several […]

ShareChat: A Dataset of Chatbot Conversations in the Wild

arXiv:2512.17843v3 Announce Type: replace-cross Abstract: While academic research typically treats Large Language Models (LLM) as generic text generators, they are distinct commercial products with unique interfaces and capabilities that fundamentally shape user behavior. Current datasets obscure this reality by collecting text-only data through uniform interfaces that fail to capture authentic chatbot usage. To address this […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registeration number 16808844