arXiv:2603.29693v1 Announce Type: new Abstract: A robust decision-making process must take into account uncertainty, especially when the choice involves inherent risks. Because artificial intelligence (AI) systems are increasingly integrated into decision-making workflows, managing uncertainty relies more and more on the metacognitive capabilities of these systems; i.e., their ability to assess the reliability of and regulate […]
Identifiability of SDEs for reaction networks
arXiv:2505.07638v3 Announce Type: replace-cross Abstract: Biochemical reaction networks are widely applied across scientific disciplines to model complex dynamic systems. We investigate the diffusion approximation of reaction networks with mass-action kinetics, focusing on the identifiability of the stochastic differential equations associated with the reaction network. We derive conditions under which the law of the diffusion approximation […]
Reinforced Reasoning for End-to-End Retrosynthetic Planning
arXiv:2603.29723v1 Announce Type: new Abstract: Retrosynthetic planning is a fundamental task in organic chemistry, yet remains challenging due to its combinatorial complexity. To address this, conventional approaches typically rely on hybrid frameworks that combine single-step predictions with external search heuristics, inevitably fracturing the logical coherence between local molecular transformations and global planning objectives. To bridge […]
PAR$^2$-RAG: Planned Active Retrieval and Reasoning for Multi-Hop Question Answering
arXiv:2603.29085v1 Announce Type: new Abstract: Large language models (LLMs) remain brittle on multi-hop question answering (MHQA), where answering requires combining evidence across documents through retrieval and reasoning. Iterative retrieval systems can fail by locking onto an early low-recall trajectory and amplifying downstream errors, while planning-only approaches may produce static query sets that cannot adapt when […]
Spontaneous Functional Differentiation in Large Language Models: A Brain-Like Intelligence Economy
arXiv:2603.29735v1 Announce Type: new Abstract: The evolution of intelligence in artificial systems provides a unique opportunity to identify universal computational principles. Here we show that large language models spontaneously develop synergistic cores where information integration exceeds individual parts, remarkably similar to the human brain. Using Integrated Information Decomposition across multiple architectures, we find that middle […]
Incorporating LLM Embeddings for Variation Across the Human Genome
arXiv:2509.20702v2 Announce Type: replace-cross Abstract: Recent advances in large language model (LLM) embeddings have enabled powerful representations for biological data, but most applications to date focus on gene-level information. We present one of the first systematic frameworks to generate genetic variant-level embeddings across the entire human genome. Using curated annotations from FAVOR, ClinVar, and the […]
Tracking vs. Deciding: The Dual-Capability Bottleneck in Searchless Chess Transformers
arXiv:2603.29761v1 Announce Type: new Abstract: A human-like chess engine should mimic the style, errors, and consistency of a strong human player rather than maximize playing strength. We show that training from move sequences alone forces a model to learn two capabilities: state tracking, which reconstructs the board from move history, and decision quality, which selects […]
GISTBench: Evaluating LLM User Understanding via Evidence-Based Interest Verification
arXiv:2603.29112v1 Announce Type: new Abstract: We introduce GISTBench, a benchmark for evaluating Large Language Models’ (LLMs) ability to understand users from their interaction histories in recommendation systems. Unlike traditional RecSys benchmarks that focus on item prediction accuracy, our benchmark evaluates how well LLMs can extract and verify user interests from engagement data. We propose two […]
CREST: Constraint-Release Execution for Multi-Robot Warehouse Shelf Rearrangement
arXiv:2603.28803v1 Announce Type: cross Abstract: Double-Deck Multi-Agent Pickup and Delivery (DD-MAPD) models the multi-robot shelf rearrangement problem in automated warehouses. MAPF-DECOMP is a recent framework that first computes collision-free shelf trajectories with a MAPF solver and then assigns agents to execute them. While efficient, it enforces strict trajectory dependencies, often leading to poor execution quality […]
Semantic Labeling for Third-Party Cybersecurity Risk Assessment: A Semi-Supervised Approach to Intent-Aware Question Retrieval
arXiv:2602.10149v3 Announce Type: replace-cross Abstract: Third-Party Risk Assessment (TPRA) relies on large repositories of cybersecurity compliance questions used to assess external suppliers against standards such as ISO/IEC 27001 and NIST. In practice, not all questions are relevant for a specific supplier, and selecting questions for a given assessment context remains a manual and time-consuming task. […]
ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering
arXiv:2511.22715v2 Announce Type: replace-cross Abstract: Multimodal Large Language Models (MLLMs) have shown impressive capabilities in jointly understanding text, images, and videos, often evaluated via Visual Question Answering (VQA). However, even state-of-the-art MLLMs struggle with domain-specific or knowledge-intensive queries, where relevant information is underrepresented in pre-training data. Knowledge-based VQA (KB-VQA) addresses this by retrieving external documents […]
Merging Triggers, Breaking Backdoors: Defensive Poisoning for Instruction-Tuned Language Models
arXiv:2601.04448v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have greatly advanced Natural Language Processing (NLP), particularly through instruction tuning, which enables broad task generalization without additional fine-tuning. However, their reliance on large-scale datasets, often collected from human or web sources, makes them vulnerable to backdoor attacks, where adversaries poison a small subset of data to implant […]