arXiv:2603.06680v1 Announce Type: cross Abstract: We present VB, a benchmark that tests whether vision-language models can determine what is and is not visible in a photograph, and abstain when a human viewer cannot reliably answer. Each item pairs a single photo with a short yes/no visibility claim; the model must output VISIBLY_TRUE, VISIBLY_FALSE, or ABSTAIN, […]
InterReal: A Unified Physics-Based Imitation Framework for Learning Human-Object Interaction Skills
arXiv:2603.07516v1 Announce Type: cross Abstract: Interaction is one of the core abilities of humanoid robots. However, most existing frameworks focus on non-interactive whole-body control, which limits their practical applicability. In this work, we develop InterReal, a unified physics-based imitation learning framework for Real-world human-object Interaction (HOI) control. InterReal enables humanoid robots to track HOI reference […]
AI Steerability 360: A Toolkit for Steering Large Language Models
arXiv:2603.07837v1 Announce Type: cross Abstract: The AI Steerability 360 toolkit is an extensible, open-source Python library for steering LLMs. Steering abstractions are designed around four model control surfaces: input (modification of the prompt), structural (modification of the model’s weights or architecture), state (modification of the model’s activations and attentions), and output (modification of the decoding […]
Bi Directional Feedback Fusion for Activity Aware Forecasting of Indoor CO2 and PM2.5
arXiv:2603.06724v1 Announce Type: cross Abstract: Indoor air quality (IAQ) forecasting plays a critical role in safeguarding occupant health, ensuring thermal comfort, and supporting intelligent building control. However, predicting future concentrations of key pollutants such as carbon dioxide (CO2) and fine particulate matter (PM2.5) remains challenging due to the complex interplay between environmental factors and highly […]
Learning Unbiased Cluster Descriptors for Interpretable Imbalanced Concept Drift Detection
arXiv:2603.06757v1 Announce Type: cross Abstract: Unlabeled streaming data are usually collected to describe dynamic systems, where concept drift detection is a vital prerequisite to understanding the evolution of systems. However, the drifting concepts are usually imbalanced in most real cases, which brings great challenges to drift detection. That is, the dominant statistics of large clusters […]
Contextual Counterfactual Credit Assignment for Multi-Agent Reinforcement Learning in LLM Collaboration
arXiv:2603.06859v1 Announce Type: cross Abstract: Cooperative multi-agent reinforcement learning (MARL) systems powered by large language models (LLMs) are frequently optimized via sparse terminal-only feedback. This shared signal entangles upstream decisions, obstructing accurate decision-level credit assignment. To address this trajectory-level diffusion, we introduce Contextual Counterfactual Credit Assignment (C3). Instead of distributing rewards across an entire episode, […]
RESCHED: Rethinking Flexible Job Shop Scheduling from a Transformer-based Architecture with Simplified States
arXiv:2603.07020v1 Announce Type: cross Abstract: Neural approaches to the Flexible Job Shop Scheduling Problem (FJSP), particularly those based on deep reinforcement learning (DRL), have gained growing attention in recent years. However, existing methods rely on complex feature-engineered state representations (i.e., often requiring more than 20 handcrafted features) and graph-biased neural architectures. To reduce modeling complexity […]
A Hybrid LTR-based System via Social Context Embedding for Recommending Solutions of Software Bugs in Developer Communities
arXiv:2603.07229v1 Announce Type: cross Abstract: Question-and-answer forums such as Stack Overflow play an important role in helping software developers find answers to queries about issues such as software errors and bugs. However, searching through a large set of candidate answers can be time-consuming and may not lead to the best […]
Domain-Specific Quality Estimation for Machine Translation in Low-Resource Scenarios
arXiv:2603.07372v1 Announce Type: cross Abstract: Quality Estimation (QE) is essential for assessing machine translation quality in reference-less settings, particularly for domain-specific and low-resource language scenarios. In this paper, we investigate sentence-level QE for English to Indic machine translation across four domains (Healthcare, Legal, Tourism, and General) and five language pairs. We systematically compare zero-shot, few-shot, […]
Contact-Guided 3D Genome Structure Generation of E. coli via Diffusion Transformers
arXiv:2603.07472v1 Announce Type: cross Abstract: In this study, we present a conditional diffusion-transformer framework for generating ensembles of three-dimensional Escherichia coli genome conformations guided by Hi-C contact maps. Instead of producing a single deterministic structure, we formulate genome reconstruction as a conditional generative modeling problem that samples heterogeneous conformations whose ensemble-averaged contacts are consistent with […]
GRD-Net: Generative-Reconstructive-Discriminative Anomaly Detection with Region of Interest Attention Module
arXiv:2603.07566v1 Announce Type: cross Abstract: Anomaly detection is increasingly used in industrial applications and processes. One of its main fields of application is visual inspection for surface anomaly detection, which aims to spot regions that deviate from regularity and consequently identify abnormal products. Defect localization is a key task that usually is […]
ProgAgent: A Continual RL Agent with Progress-Aware Rewards
arXiv:2603.07784v1 Announce Type: cross Abstract: We present ProgAgent, a continual reinforcement learning (CRL) agent that unifies progress-aware reward learning with a high-throughput, JAX-native system architecture. Lifelong robotic learning grapples with catastrophic forgetting and the high cost of reward specification. ProgAgent tackles these by deriving dense, shaped rewards from unlabeled expert videos through a perceptual model […]