Towards Fine-Grained and Verifiable Concept Bottleneck Models

arXiv:2605.14210v1 Announce Type: cross Abstract: Concept Bottleneck Models (CBMs) offer interpretable alternatives to black-box predictors by introducing human-relatable concepts before the final output. However, existing CBMs struggle to verify whether predicted concepts correspond to the correct visual evidence, limiting their reliability. We propose a fine-grained CBM framework that grounds each concept in localized visual evidence, […]

AudioMosaic: Contrastive Masked Audio Representation Learning

arXiv:2605.14231v1 Announce Type: cross Abstract: Audio self-supervised learning (SSL) aims to learn general-purpose representations from large-scale unlabeled audio data. While recent advances have been driven mainly by generative reconstruction objectives, contrastive approaches remain less explored, partly due to the difficulty of designing effective audio augmentations and the large batch sizes required for contrastive pre-training. We […]

LoVeC: Reinforcement Learning for Better Verbalized Confidence in Long-Form Generations

arXiv:2505.23912v2 Announce Type: replace-cross Abstract: Hallucination remains a major challenge for the safe and trustworthy deployment of large language models (LLMs) in factual content generation. Prior work has explored confidence estimation as an effective approach to hallucination detection, but often relies on post-hoc self-consistency methods that require computationally expensive sampling. Verbalized confidence offers a more […]

Not All Timesteps Matter Equally: Selective Alignment Knowledge Distillation for Spiking Neural Networks

arXiv:2605.14252v1 Announce Type: cross Abstract: Spiking neural networks (SNNs), which are brain-inspired and spike-driven, achieve high energy efficiency. However, a performance gap between SNNs and artificial neural networks (ANNs) still remains. Knowledge distillation (KD) is commonly adopted to improve SNN performance, but existing methods typically enforce uniform alignment across all timesteps, either from a teacher […]

SPIN: Structural LLM Planning via Iterative Navigation for Industrial Tasks

arXiv:2605.14051v1 Announce Type: new Abstract: Industrial LLM agent systems often separate planning from execution, yet LLM planners frequently produce structurally invalid or unnecessarily long workflows, leading to brittle failures and avoidable tool and API cost. We propose SPIN, a planning wrapper that combines validated Directed Acyclic Graph (DAG) planning with prefix-based execution control. SPIN […]

Watermarking Game-Playing Agents in Perfect-Information Extensive-Form Games

arXiv:2605.14283v1 Announce Type: cross Abstract: Watermarking techniques for large language models (LLMs), which encode hidden information in the output so its source can be verified, have gained significant attention in recent days, thanks to their potential capability to detect accidental or deliberate misuse. Similar challenges involving model misuse also exist in the context of game-playing, […]

Descriptor: Distance-Annotated Traffic Perception Question Answering (DTPQA)

arXiv:2511.13397v2 Announce Type: replace-cross Abstract: The remarkable progress of Vision-Language Models (VLMs) on a variety of tasks has raised interest in their application to automated driving. However, for these models to be trusted in such a safety-critical domain, they must first possess robust perception capabilities, i.e., they must be capable of understanding a traffic scene, […]

Policy Optimization in Hybrid Discrete-Continuous Action Spaces via Mixed Gradients

arXiv:2605.14297v1 Announce Type: cross Abstract: We study reinforcement learning in hybrid discrete-continuous action spaces, such as settings where the discrete component selects a regime (or index) and the continuous component optimizes within it — a structure common in robotics, control, and operations problems. Standard model-free policy gradient methods rely on score-function (SF) estimators and suffer […]

Bad Seeing or Bad Thinking? Rewarding Perception for Vision-Language Reasoning

arXiv:2605.14054v1 Announce Type: new Abstract: Achieving robust perception-reasoning synergy is a central goal for advanced Vision-Language Models (VLMs). Recent advancements have pursued this goal via architectural designs or agentic workflows. However, these approaches are often limited by static textual reasoning or complicated by the significant compute and engineering burden of external agentic complexity. Worse, this […]

Dynamic Latent Routing

arXiv:2605.14323v1 Announce Type: cross Abstract: We investigate the temporal concatenation of sub-policies in Markov Decision Processes (MDP) with time-varying reward functions. We introduce General Dijkstra Search (GDS), and prove that globally optimal goal-reaching policies can be recovered through temporal composition of intermediate optimal sub-policies. Motivated by the “search, select, update” principle underlying GDS, we propose […]

Boosting LLM Reasoning via Human-Inspired Reward Shaping

arXiv:2602.04265v3 Announce Type: replace-cross Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a promising paradigm for enhancing reasoning in Large Language Models (LLMs). However, existing reward formulations typically treat exploration and consolidation as a monolithic process, resulting in entangled stage-wise learning dynamics. This contradicts the natural learning behavior of human learners. In human […]

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844