arXiv:2605.21725v1 Announce Type: new Abstract: Phylogenetic networks and, more generally, directed acyclic graphs (DAGs) represent hierarchical structure beyond trees, for instance in the presence of reticulate evolutionary events such as hybridization or horizontal gene transfer. A central question is which parts of such graphs are essential with respect to leaf-observable information, and which parts can […]
TextTeacher: What Can Language Teach About Images?
arXiv:2605.22098v1 Announce Type: cross Abstract: The platonic representation hypothesis suggests that sufficiently large models converge to a shared representation geometry, even across modalities. Motivated by this, we ask: Can the semantic knowledge of a language model efficiently improve a vision model? As an answer, we introduce TextTeacher, a simple auxiliary objective that injects text embeddings […]
RE-VLM: Event-Augmented Vision-Language Model for Scene Understanding
arXiv:2605.19329v2 Announce Type: replace-cross Abstract: Conventional vision-language models (VLMs) struggle to interpret scenes captured under adverse conditions (e.g., low light, high dynamic range, or fast motion) because standard RGB images degrade in such environments. Event cameras provide a complementary modality: they asynchronously record per-pixel brightness changes with high temporal resolution and wide dynamic range, preserving […]
Can Transformers Learn to Verify During Backtracking Search?
arXiv:2605.22221v1 Announce Type: cross Abstract: Backtracking search underlies classical constraint solvers, planners, and theorem provers. Recent transformer-based reasoning systems explore search trees over their own intermediate steps. A common training recipe fits an autoregressive next-token loss on offline solver traces. The model’s input at each step is a cumulative trace of all prior decisions. The […]
AttuneBench: A Conversation-Based Benchmark for LLM Emotional Intelligence
arXiv:2605.21739v1 Announce Type: new Abstract: Emotional intelligence (EI), the ability to perceive, understand, and respond appropriately to others’ emotional states, is central to human communication, and increasingly important to assess as LLMs assume conversational roles in everyday life. Existing EI benchmarks rely on synthetic prompts, single-turn cases, or third-party annotation. These approaches do not directly […]
Implicit Regularization of Mini-Batch Training in Graph Neural Networks
arXiv:2605.22480v1 Announce Type: cross Abstract: Mini-batch training of Graph Neural Networks (GNNs) is fundamentally different from training on i.i.d. data: sampling a subgraph alters the topology and introduces boundary effects, leading prior work to develop structure-aware samplers that preserve local connectivity and reduce embedding variance. Surprisingly, we demonstrate that the simplest possible scheme, Random Node […]
VEELA: A Clinically-Constrained Benchmark for Liver Vessel Segmentation in Computed Tomography Angiography
arXiv:2605.22357v1 Announce Type: cross Abstract: Accurate segmentation of hepatic and portal vessels in contrast-enhanced computed tomography angiography (CTA) remains challenging due to complex vascular topology, peripheral visibility limitations, and acquisition-induced ambiguities. While existing public datasets offer valuable benchmarks, few include clinically realistic annotation constraints. We introduce VEELA (Vessel Extraction and Extrication for Liver Analysis), a […]
SMDD-Bench: Can LLMs Solve Real-World Small Molecule Drug Design Tasks?
arXiv:2605.21740v1 Announce Type: new Abstract: LLM agents have incredible potential for scientific discovery applications. However, the performance of LLM agents on real-world, small molecule drug design (SMDD) tasks across diverse chemistries and targets is unclear. Current evaluation methods are either ad hoc, too simple for real-world discovery, limited in scale, or restricted to single-turn question […]
Who Uses AI? Platforms, Workforce, and AI Exposure
arXiv:2605.21743v1 Announce Type: new Abstract: A growing literature uses artificial intelligence platform conversation logs to measure occupation exposure. We show that these scores partly measure platform user base rather than the workforce. Holding outcome, sample, controls, and estimator fixed while varying only the platform input changes the post-ChatGPT employment coefficient by a factor of 1.9, […]
MambaGaze: Bidirectional Mamba with Explicit Missing Data Modeling for Cognitive Load Assessment from Eye-Gaze Tracking Data
arXiv:2605.22775v1 Announce Type: cross Abstract: Real-time cognitive load assessment from eye-tracking signals could enable adaptive human-centered AI in safety-critical applications such as driver vigilance monitoring or automated flight deck assistance, yet two challenges persist: handling frequent data missingness from blinks and tracking failures, and efficiently modeling long-range temporal dependencies. We propose MambaGaze, a framework […]
A Causal Argumentation Method for Explainability of Machine Learning Models
arXiv:2605.21758v1 Announce Type: new Abstract: Explainable AI (XAI) methods identify which features are relevant to a model’s predictions but often fail to clarify why certain decisions are made. In this work, we present a novel method that integrates causality with argument-based reasoning to explain why models may be making predictions. Our approach first identifies causal […]
The Shape of Reasoning: Topological Analysis of Reasoning Traces in Large Language Models
arXiv:2510.20665v2 Announce Type: replace Abstract: Evaluating the quality of reasoning traces from large language models remains understudied, labor-intensive, and unreliable: current practice relies on expert rubrics, manual annotation, and slow pairwise judgments. Automated efforts are dominated by graph-based proxies that quantify structural connectivity but do not clarify what constitutes high-quality reasoning; such abstractions can be […]