arXiv:2605.26543v1 Announce Type: new Abstract: Polymer discovery is central to fields ranging from energy storage to biomedicine, but it is hindered by an astronomically large chemical design space and fragmented representations of structure, properties, and prior knowledge. This fragmentation leaves many AI models disconnected from physical and experimental reality, restricting their ability to support directly […]
ChainCaps: Composition-Safe Tool-Using Agents via Monotonic Capability Attenuation
arXiv:2605.26542v1 Announce Type: cross Abstract: Tool-using agents increasingly operate in open-ended deployment environments, where they compose file systems, web APIs, code interpreters, and enterprise services at runtime. This creates a safety gap in tool composition: an agent can satisfy every per-tool permission check and still produce an unsafe end-to-end effect, such as reading a confidential […]
Athena: Enhancing Multimodal Reasoning with Data-efficient Process Reward Models
arXiv:2506.09532v5 Announce Type: replace-cross Abstract: We present Athena-PRM, a multimodal process reward model (PRM) designed to evaluate the reward score for each step in solving complex reasoning problems. Developing high-performance PRMs typically demands significant time and financial investment, primarily because step-level annotations of the reasoning are required. Conventional automated labeling methods, such as […]
Reliable Extraction of Clinical Follow-Up Instructions: A Hybrid Neural-Symbolic Pipeline
arXiv:2605.26560v1 Announce Type: cross Abstract: Objective. Outpatient notes carry follow-up instructions pairing actions with future times (“MRI brain in two weeks”). Extracting (action, date) pairs supports scheduling and audit, but generative extractors miss the date because linking and arithmetic are implicit in decoding. We test a hybrid neural-symbolic pipeline against direct generation. Methods. We define […]
MobileExplorer: Accelerating On-Device Inference for Mobile GUI Agents via Online Exploration
arXiv:2605.26546v1 Announce Type: new Abstract: Mobile graphical user interface (GUI) agents enable AI models to autonomously operate smartphones on behalf of users. However, most existing systems focus primarily on optimizing task accuracy and rely on cloud-hosted models for inference, which introduces privacy concerns and network-dependent latency. As a result, fully on-device deployment of mobile GUI […]
Examining the Challenges of Intellectual Property in AI-Generated Productions
arXiv:2605.26590v1 Announce Type: cross Abstract: With the advancement of artificial intelligence systems capable of autonomously generating artistic, literary, and musical works, and even inventions, without direct human intervention, the intellectual property (IP) regime faces unprecedented questions and challenges. The most critical issue concerns the ownership of moral and economic rights in the absence of a human […]
Inference-Time Search Using Side Information for Diffusion-Based Image Reconstruction
arXiv:2510.03352v3 Announce Type: replace-cross Abstract: Diffusion models have been used as priors for solving inverse problems. However, existing approaches typically overlook side information that could significantly improve reconstruction quality, especially in severely ill-posed settings. In this work, we propose a novel framework that incorporates side information into existing diffusion-based inverse problem solvers via inference-time search, […]
MedVol-R1: Reward-Driven Evidence Grounding for Volumetric Reasoning Segmentation
arXiv:2605.26621v1 Announce Type: cross Abstract: Volumetric Reasoning Segmentation (VRS) aims to segment a target region in a 3D medical scan from a free-form clinical query, where the referent is often implicit and requires both medical knowledge and volume-grounded reasoning. Existing methods typically rely on specialized segmentation tokens to connect language with mask decoding, but this […]
Random neural networks match observed dimensionality of neural population recordings and motivate stronger experimental tests
arXiv:2605.26551v1 Announce Type: new Abstract: Randomly connected neural networks have long served as a theoretical tool for studying collective dynamics in neural populations, yet quantitative comparisons to experiments remain limited. Recent technological advances have made it possible to resolve population-wide correlations across neurons, and minimal models such as random neural networks predict their generic structure. […]
Respecting Modality Gap in Post-hoc Out-of-distribution Detection with Pre-trained Vision-Language Models
arXiv:2605.26661v1 Announce Type: cross Abstract: Out-of-distribution (OOD) detection has emerged as a popular technique to enhance the reliability of machine learning models by identifying unexpected inputs from unknown classes. Recent progress in pre-trained vision-language models (VLMs) has enabled zero-shot OOD detection without access to in-distribution (ID) training data; in this setting, existing methods commonly treat […]
How to Square Tensor Networks and Circuits Without Squaring Them
arXiv:2512.17090v2 Announce Type: replace-cross Abstract: Squared tensor networks (TNs) and their extension as computational graphs, squared circuits, have been used as expressive distribution estimators that still support closed-form marginalization. However, the squaring operation introduces additional complexity when computing the partition function or marginalizing variables, which hinders their applicability in ML. To solve this issue, canonical forms of TNs […]
DynFrame: Adaptive Reasoning-Driven Multimodal Framework with Dynamic Frame Augmentation for Complex Video Understanding
arXiv:2605.26680v1 Announce Type: cross Abstract: Recent video multimodal large language models (MLLMs) increasingly couple step-by-step reasoning with on-demand visual evidence retrieval, allowing models to revisit relevant video segments during inference. However, two structural gaps remain in existing thinking-with-video systems. (i) Sampling density is not a learnable decision: existing methods may let the model decide where […]