arXiv:2604.22067v2 Announce Type: replace-cross Abstract: Psychiatric intake is a sequential, high-stakes information-gathering process in which clinicians must decide what to ask, in what order, and how to interpret incomplete or ambiguous responses under limited time. Despite growing interest in conversational AI for healthcare, there is still limited infrastructure for conversational AI in this application. Accordingly, […]
Relational In-Context Learning via Synthetic Pre-training with Structural Prior
arXiv:2603.03805v2 Announce Type: replace-cross Abstract: Relational Databases (RDBs) are the backbone of modern business, yet they lack foundation models comparable to those in text or vision. A key obstacle is that high-quality RDBs are private, scarce and structurally heterogeneous, making internet-scale pre-training infeasible. To overcome this data scarcity, we introduce $\textbf{RDB-PFN}$, the first relational foundation […]
Benchmarking bandgap prediction in semiconductors under experimental and realistic evaluation settings
arXiv:2604.25568v1 Announce Type: cross Abstract: Accurate bandgap prediction is crucial for semiconductor applications, yet machine learning models trained on computational data often struggle to generalize to experimental bandgap measurements. Challenges related to data fidelity, domain generalization, and model interpretability remain insufficiently addressed in existing evaluation frameworks. To bridge this gap, we introduce RealMat-BaG, a benchmark […]
Assistants, Not Architects: The Role of LLMs in Networked Systems Design
arXiv:2604.25506v1 Announce Type: cross Abstract: Designing the architecture of modern networked systems requires navigating a large, combinatorial space of hardware, systems, and configuration choices with complex cross-layer interactions. Architects must balance competing objectives such as performance, cost, and deployability while satisfying compatibility and resource constraints, often relying on scattered rules-of-thumb drawn from benchmarks, papers, documentation, […]
RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation
arXiv:2603.09723v2 Announce Type: replace-cross Abstract: Large language models (LLMs) are increasingly used across the scientific workflow, including to draft peer-review reports. However, many AI-generated reviews are superficial and insufficiently actionable, leaving authors without concrete, implementable guidance; this is the gap this work addresses. We propose RbtAct, which targets actionable review feedback generation and places existing […]
SnapGuard: Lightweight Prompt Injection Detection for Screenshot-Based Web Agents
arXiv:2604.25562v1 Announce Type: cross Abstract: Web agents have emerged as an effective paradigm for automating interactions with complex web environments, yet remain vulnerable to prompt injection attacks that embed malicious instructions into webpage content to induce unintended actions. This threat is further amplified for screenshot-based web agents, which operate on rendered visual webpages rather than […]
Regime-Conditional Retrieval: Theory and a Transferable Router for Two-Hop QA
arXiv:2604.09019v2 Announce Type: replace-cross Abstract: Two-hop QA retrieval splits queries into two regimes determined by whether the hop-2 entity is explicitly named in the question (Q-dominant) or only in the bridge passage (B-dominant). We formalize this split with three theorems: (T1) per-query AUC is a monotone function of the cosine separation margin, with R^2 >= […]
From Scene to Object: Text-Guided Dual-Gaze Prediction
arXiv:2604.20191v2 Announce Type: replace-cross Abstract: Interpretable driver attention prediction is crucial for human-like autonomous driving. However, existing datasets provide only scene-level global gaze rather than fine-grained object-level annotations, inherently failing to support text-grounded cognitive modeling. Consequently, while Vision-Language Models (VLMs) hold great potential for semantic reasoning, this critical data limitation leads to severe text-vision decoupling […]
PermaFrost-Attack: Stealth Pretraining Seeding (SPS) for planting Logic Landmines During LLM Training
arXiv:2604.22117v2 Announce Type: replace-cross Abstract: Aligned large language models (LLMs) remain vulnerable to adversarial manipulation, and their reliance on web-scale pretraining creates a subtle but consequential attack surface. We study Stealth Pretraining Seeding (SPS), a threat model in which adversaries distribute small amounts of poisoned content across stealth websites, increasing the likelihood that such material […]
From CRUD to Autonomous Agents: Formal Validation and Zero-Trust Security for Semantic Gateways in AI-Native Enterprise Systems
arXiv:2604.25555v1 Announce Type: cross Abstract: Enterprise software engineering is shifting away from deterministic CRUD/REST architectures toward AI-native systems where large language models act as cognitive orchestrators. This transition introduces a critical security tension: probabilistic LLMs weaken classical mechanisms for validation, access control, and formal testing. This paper proposes the design, formal validation, and empirical evaluation […]
RAS: a Reliability Oriented Metric for Automatic Speech Recognition
arXiv:2604.24278v2 Announce Type: replace-cross Abstract: Automatic speech recognition systems often produce confident yet incorrect transcriptions under noisy or ambiguous conditions, which can be misleading for both users and downstream applications. Standard evaluation based on Word Error Rate focuses solely on accuracy and fails to capture transcription reliability. We introduce an abstention-aware transcription framework that enables […]
The Topological Trouble With Transformers
arXiv:2604.17121v2 Announce Type: replace-cross Abstract: Transformers encode structure in sequences via an expanding contextual history. However, their purely feedforward architecture fundamentally limits dynamic state tracking. State tracking — the iterative updating of latent variables reflecting an evolving environment — involves inherently sequential dependencies that feedforward networks struggle to maintain. Consequently, feedforward models push evolving state […]