arXiv:2604.11784v1 Announce Type: cross Abstract: GUI agents drive applications through their visual interfaces instead of programmatic APIs, interacting with arbitrary software via taps, swipes, and keystrokes, reaching a long tail of applications that CLI-based agents cannot. Yet progress in this area is bottlenecked less by modeling capacity than by the absence of a coherent full-stack […]
X-SYS: A Reference Architecture for Interactive Explanation Systems
arXiv:2602.12748v3 Announce Type: replace Abstract: The explainable AI (XAI) research community has proposed numerous technical methods, yet deploying explainability as systems remains challenging: Interactive explanation systems require both suitable algorithms and system capabilities that maintain explanation usability across repeated queries, evolving models and data, and governance constraints. We argue that operationalizing XAI requires treating explainability […]
Strategic Algorithmic Monoculture: Experimental Evidence from Coordination Games
arXiv:2604.09502v2 Announce Type: replace Abstract: AI agents increasingly operate in multi-agent environments where outcomes depend on coordination. We distinguish primary algorithmic monoculture — baseline action similarity — from strategic algorithmic monoculture, whereby agents adjust similarity in response to incentives. We implement a simple experimental design that cleanly separates these forces, and deploy it on human […]
Auto-regressive transformation for image alignment
arXiv:2505.04864v2 Announce Type: replace-cross Abstract: Existing methods for image alignment struggle in cases involving feature-sparse regions, extreme scale and field-of-view differences, and large deformations, often resulting in suboptimal accuracy. Robustness to these challenges can be improved through iterative refinement of the transform field while focusing on critical regions in multi-scale image representations. We thus propose […]
PnP-CM: Consistency Models as Plug-and-Play Priors for Inverse Problems
arXiv:2509.22736v2 Announce Type: replace-cross Abstract: Diffusion models have found extensive use in solving inverse problems, by sampling from an approximate posterior distribution of data given the measurements. Recently, consistency models (CMs) have been proposed to directly predict the final output from any point on the diffusion ODE trajectory, enabling high-quality sampling in just a few […]
A Unified Theory of Sparse Dictionary Learning in Mechanistic Interpretability: Piecewise Biconvexity and Spurious Minima
arXiv:2512.05534v4 Announce Type: replace-cross Abstract: As AI models achieve remarkable capabilities across diverse domains, understanding what representations they learn and how they encode concepts has become increasingly important for both scientific progress and trustworthy deployment. Recent works in mechanistic interpretability have widely reported that neural networks represent meaningful concepts as linear directions in their representation […]
Fake-HR1: Rethinking Reasoning of Vision Language Model for Synthetic Image Detection
arXiv:2602.10042v3 Announce Type: replace-cross Abstract: Recent studies have demonstrated that incorporating Chain-of-Thought (CoT) reasoning into the detection process can enhance a model’s ability to detect synthetic images. However, excessively lengthy reasoning incurs substantial resource overhead, including token consumption and latency, which is particularly redundant when handling obviously generated forgeries. To address this issue, we propose […]
EduIllustrate: Towards Scalable Automated Generation Of Multimodal Educational Content
arXiv:2604.05005v2 Announce Type: replace-cross Abstract: Large language models are increasingly used as educational assistants, yet evaluation of their educational capabilities remains concentrated on question-answering and tutoring tasks. A critical gap exists for multimedia instructional content generation — the ability to produce coherent, diagram-rich explanations that combine geometrically accurate visuals with step-by-step reasoning. We present EduIllustrate, […]
Task2Vec Readiness: Diagnostics for Federated Learning from Pre-Training Embeddings
arXiv:2604.10849v1 Announce Type: cross Abstract: Federated learning (FL) performance is highly sensitive to heterogeneity across clients, yet practitioners lack reliable methods to anticipate how a federation will behave before training. We propose readiness indices, derived from Task2Vec embeddings, that quantify the alignment of a federation prior to training and correlate with its eventual performance. Our […]
Rethinking Token-Level Credit Assignment in RLVR: A Polarity-Entropy Analysis
arXiv:2604.11056v1 Announce Type: cross Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has substantially improved the reasoning ability of Large Language Models (LLMs). However, its sparse outcome-based rewards pose a fundamental credit assignment problem. We analyze this problem through the joint lens of reward polarity and token entropy. Our diagnostic tool, the Four Quadrant Decomposition, isolates […]
MathAgent: Adversarial Evolution of Constraint Graphs for Mathematical Reasoning Data Synthesis
arXiv:2604.11188v1 Announce Type: cross Abstract: Synthesizing high-quality mathematical reasoning data without human priors remains a significant challenge. Current approaches typically rely on seed data mutation or simple prompt engineering, often suffering from mode collapse and limited logical complexity. This paper proposes a hierarchical synthesis framework that formulates data synthesis as an unsupervised optimization problem over […]
OpenFlo: Automated UX Evaluation via Simulated Human Web Interaction with GUI Grounding
arXiv:2604.09581v1 Announce Type: new Abstract: Evaluating web usability typically requires time-consuming user studies and expert reviews, which often limits iteration speed during product development, especially for small teams and agile workflows. We present OpenFlo, a user-experience evaluation agent that simulates user behavior on websites and produces standardized usability. Unlike traditional tools that rely on DOM […]