Towards Automated Semantic Interpretability in Reinforcement Learning via Vision-Language Models

November 3, 2025

arXiv:2503.16724v3 Announce Type: replace
Abstract: Semantic interpretability in Reinforcement Learning (RL) enables transparency and verifiability of decision-making. Achieving it requires (1) a feature space composed of human-understandable concepts and (2) a policy that is interpretable and verifiable. However, constructing such a feature space has traditionally relied on manual human specification, which often fails to generalize to unseen environments. Moreover, even when interpretable features are available, most RL algorithms employ black-box models as policies, thereby hindering transparency. We introduce interpretable Tree-based Reinforcement learning via Automated Concept Extraction (iTRACE), an automated framework that leverages pre-trained vision-language models (VLMs) for semantic feature extraction and trains an interpretable tree-based model via RL. To address the impracticality of running VLMs in RL loops, we distill their outputs into a lightweight model. By using VLMs to automate tree-based reinforcement learning, iTRACE reduces the reliance on human annotation that interpretable models traditionally require. In addition, it addresses key limitations of VLMs alone, such as their lack of grounding in action spaces and their inability to directly optimize policies. We evaluate iTRACE across three domains: Atari games, grid-world navigation, and driving. The results show that iTRACE outperforms other interpretable policy baselines and matches the performance of black-box policies on the same interpretable feature space.
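
To make the two requirements in the abstract concrete, here is a minimal sketch of a tree-based policy that acts on named, human-readable concepts and returns the decision path it followed. It is not the iTRACE implementation. The concept names (`distance_to_car_ahead`, `lane_offset`), the thresholds, the actions, and the hand-written tree are all hypothetical. In the framework described above, the concept values would come from the lightweight model distilled from a VLM, and the tree would be trained via RL rather than written by hand.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    """Terminal node: the action the policy takes."""
    action: str


@dataclass
class Split:
    """Internal node: compares one named concept against a threshold."""
    concept: str
    threshold: float
    low: "Node"   # followed when value <= threshold
    high: "Node"  # followed when value > threshold


Node = Union[Leaf, Split]


def act(tree: Node, concepts: Dict[str, float]) -> Tuple[str, List[str]]:
    """Return the chosen action and the human-readable tests that led to it."""
    path: List[str] = []
    node = tree
    while isinstance(node, Split):
        value = concepts[node.concept]
        go_high = value > node.threshold
        op = ">" if go_high else "<="
        path.append(f"{node.concept}={value:.2f} {op} {node.threshold:.2f}")
        node = node.high if go_high else node.low
    return node.action, path


# Hypothetical driving policy; every name and number here is illustrative.
policy = Split(
    concept="distance_to_car_ahead",
    threshold=10.0,
    low=Leaf("brake"),
    high=Split(
        concept="lane_offset",
        threshold=0.5,
        low=Leaf("accelerate"),
        high=Leaf("steer_left"),
    ),
)

action, why = act(policy, {"distance_to_car_ahead": 4.0, "lane_offset": 0.1})
print(action, why)
```

Because every decision reduces to a short list of threshold tests on named concepts, a reviewer can check the returned path directly, which is the kind of verifiability a black-box policy on the same features would not offer.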


Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK, registration number 16808844.