arXiv:2604.03631v1 Announce Type: new Abstract: On-screen learning behavior provides valuable insights into how students seek, use, and create information during learning. Analyzing on-screen behavioral engagement is essential for capturing students’ cognitive and collaborative processes. The recent development of Vision Language Models (VLMs) offers new opportunities to automate the labor-intensive manual coding often required for multimodal […]
Automated Attention Pattern Discovery at Scale in Large Language Models
arXiv:2604.03764v1 Announce Type: cross Abstract: Large language models have found success by scaling up capabilities to work in general settings. The same can unfortunately not be said for interpretability methods. The current trend in mechanistic interpretability is to provide precise explanations of specific behaviors in controlled settings. These often do not generalize, or are too […]
Making Prompts First-Class Citizens for Adaptive LLM Pipelines
arXiv:2508.05012v2 Announce Type: replace-cross Abstract: Modern LLM pipelines increasingly resemble complex data-centric applications: they retrieve data, correct errors, call external tools, and coordinate interactions between agents. Yet, the central element controlling this entire process — the prompt — remains a brittle, opaque string that is entirely disconnected from the surrounding program logic. This disconnect fundamentally […]
Truth as a Compression Artifact in Language Model Training
arXiv:2603.11749v3 Announce Type: replace-cross Abstract: Why do language models trained on contradictory data prefer correct answers? In controlled experiments with small transformers (3.5M–86M parameters), we show that this preference tracks the compressibility structure of errors rather than truth per se. We train GPT-2 style models on corpora where each mathematical problem appears with both correct […]
RAGnaroX: A Secure, Local-Hosted ChatOps Assistant Using Small Language Models
arXiv:2604.03291v1 Announce Type: cross Abstract: This paper introduces RAGnaroX, a resource-efficient ChatOps assistant that operates entirely on commodity hardware. Unlike existing solutions that often rely on external providers such as Azure or OpenAI, RAGnaroX offers a fully auditable, on-premise stack implemented in Rust. Its architecture integrates modular data ingestion, hybrid retrieval, and function calling, enabling […]
Vision-as-Inverse-Graphics Agent via Interleaved Multimodal Reasoning
arXiv:2601.11109v3 Announce Type: replace-cross Abstract: Vision-as-inverse-graphics, the concept of reconstructing images into editable programs, remains challenging for Vision-Language Models (VLMs), which inherently lack fine-grained spatial grounding in one-shot settings. To address this, we introduce VIGA (Vision-as-Inverse-Graphics Agent), an interleaved multimodal reasoning framework where symbolic logic and visual perception actively cross-verify each other. VIGA operates through […]
Brittlebench: Quantifying LLM robustness via prompt sensitivity
arXiv:2603.13285v2 Announce Type: replace-cross Abstract: Existing evaluation methods largely rely on clean, static benchmarks, which can overestimate true model performance by failing to capture the noise and variability inherent in real-world user inputs. This is especially true for language models, which can face human-generated text queries containing mistakes, typos, or alternative ways of phrasing the […]
Agents for Agents: An Interrogator-Based Secure Framework for Autonomous Internet of Underwater Things
arXiv:2604.04262v1 Announce Type: cross Abstract: Autonomous underwater vehicles (AUVs) and sensor nodes increasingly support decentralized sensing and coordination in the Internet of Underwater Things (IoUT), yet most deployments rely on static trust once authentication is established, leaving long-duration missions vulnerable to compromised or behaviorally deviating agents. In this paper, an interrogator based structure is presented […]
Spectral Compact Training: Pre-Training Large Language Models via Permanent Truncated SVD and Stiefel QR Retraction
arXiv:2604.00733v2 Announce Type: replace-cross Abstract: The memory wall remains the primary bottleneck for training large language models on consumer hardware. We introduce Spectral Compact Training (SCT), a method that replaces dense weight matrices with permanent truncated SVD factors W = U diag(s) V^T, where the full dense matrix is never materialized during training or inference. […]
Adaptive Stopping for Multi-Turn LLM Reasoning
arXiv:2604.01413v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) increasingly rely on multi-turn reasoning and interaction, such as adaptive retrieval-augmented generation (RAG) and ReAct-style agents, to answer difficult questions. These methods improve accuracy by iteratively retrieving information, reasoning, or acting, but introduce a key challenge: textbfWhen should the model stop? Existing approaches rely on heuristic […]
Artificial Intelligence and Systemic Risk: A Unified Model of Performative Prediction, Algorithmic Herding, and Cognitive Dependency in Financial Markets
arXiv:2604.03272v1 Announce Type: cross Abstract: We develop a unified model in which AI adoption in financial markets generates systemic risk through three mutually reinforcing channels: performative prediction, algorithmic herding, and cognitive dependency. Within an extended rational expectations framework with endogenous adoption, we derive an equilibrium systemic risk coupling $r(phi) = phirhobeta/lambda'(phi)$, where $phi$ is the […]
AI-Driven Predictive Maintenance with Environmental Context Integration for Connected Vehicles: Simulation, Benchmarking, and Field Validation
arXiv:2603.13343v2 Announce Type: replace-cross Abstract: Predictive maintenance for connected vehicles offers the potential to reduce unexpected breakdowns and improve fleet reliability, but most existing systems rely exclusively on internal diagnostic signals and are validated on simulated or industrial benchmark data. This paper presents a contextual data fusion framework integrating vehicle-internal sensor streams with external environmental […]