PnP-CM: Consistency Models as Plug-and-Play Priors for Inverse Problems

arXiv:2509.22736v2 Announce Type: replace-cross Abstract: Diffusion models have found extensive use in solving inverse problems, by sampling from an approximate posterior distribution of data given the measurements. Recently, consistency models (CMs) have been proposed to directly predict the final output from any point on the diffusion ODE trajectory, enabling high-quality sampling in just a few […]

A Unified Theory of Sparse Dictionary Learning in Mechanistic Interpretability: Piecewise Biconvexity and Spurious Minima

arXiv:2512.05534v4 Announce Type: replace-cross Abstract: As AI models achieve remarkable capabilities across diverse domains, understanding what representations they learn and how they encode concepts has become increasingly important for both scientific progress and trustworthy deployment. Recent works in mechanistic interpretability have widely reported that neural networks represent meaningful concepts as linear directions in their representation […]

Fake-HR1: Rethinking Reasoning of Vision Language Model for Synthetic Image Detection

arXiv:2602.10042v3 Announce Type: replace-cross Abstract: Recent studies have demonstrated that incorporating Chain-of-Thought (CoT) reasoning into the detection process can enhance a model’s ability to detect synthetic images. However, excessively lengthy reasoning incurs substantial resource overhead, including token consumption and latency, which is particularly redundant when handling obviously generated forgeries. To address this issue, we propose […]

EduIllustrate: Towards Scalable Automated Generation Of Multimodal Educational Content

arXiv:2604.05005v2 Announce Type: replace-cross Abstract: Large language models are increasingly used as educational assistants, yet evaluation of their educational capabilities remains concentrated on question-answering and tutoring tasks. A critical gap exists for multimedia instructional content generation — the ability to produce coherent, diagram-rich explanations that combine geometrically accurate visuals with step-by-step reasoning. We present EduIllustrate, […]

Task2vec Readiness: Diagnostics for Federated Learning from Pre-Training Embeddings

arXiv:2604.10849v1 Announce Type: cross Abstract: Federated learning (FL) performance is highly sensitive to heterogeneity across clients, yet practitioners lack reliable methods to anticipate how a federation will behave before training. We propose readiness indices, derived from Task2Vec embeddings, that quantify the alignment of a federation prior to training and correlate with its eventual performance. Our […]

Rethinking Token-Level Credit Assignment in RLVR: A Polarity-Entropy Analysis

arXiv:2604.11056v1 Announce Type: cross Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has substantially improved the reasoning ability of Large Language Models (LLMs). However, its sparse outcome-based rewards pose a fundamental credit assignment problem. We analyze this problem through the joint lens of reward polarity and token entropy. Our diagnostic tool, the Four Quadrant Decomposition, isolates […]

MathAgent: Adversarial Evolution of Constraint Graphs for Mathematical Reasoning Data Synthesis

arXiv:2604.11188v1 Announce Type: cross Abstract: Synthesizing high-quality mathematical reasoning data without human priors remains a significant challenge. Current approaches typically rely on seed data mutation or simple prompt engineering, often suffering from mode collapse and limited logical complexity. This paper proposes a hierarchical synthesis framework that formulates data synthesis as an unsupervised optimization problem over […]

OpenFlo: Automated UX Evaluation via Simulated Human Web Interaction with GUI Grounding

arXiv:2604.09581v1 Announce Type: new Abstract: Evaluating web usability typically requires time-consuming user studies and expert reviews, which often limits iteration speed during product development, especially for small teams and agile workflows. We present OpenFlo, a user-experience evaluation agent that simulates user behavior on websites and produces standardized usability. Unlike traditional tools that rely on DOM […]

Explainable Planning for Hybrid Systems

arXiv:2604.09578v1 Announce Type: new Abstract: The recent advancement in artificial intelligence (AI) technologies facilitates a paradigm shift toward automation. Autonomous systems are fully or partially replacing manually crafted ones. At the core of these systems is automated planning. With the advent of powerful planners, automated planning is now applied to many complex and safety-critical domains, […]

SLALOM: Simulation Lifecycle Analysis via Longitudinal Observation Metrics for Social Simulation

arXiv:2604.11466v1 Announce Type: cross Abstract: Large Language Model (LLM) agents offer a potentially transformative path forward for generative social science but face a critical crisis of validity. Current simulation evaluation methodologies suffer from the “stopped clock” problem: they confirm that a simulation reached the correct final outcome while ignoring whether the trajectory leading to it was […]

Help Without Being Asked: A Deployed Proactive Agent System for On-Call Support with Continuous Self-Improvement

arXiv:2604.09579v1 Announce Type: new Abstract: In large-scale cloud service platforms, thousands of customer tickets are generated daily and are typically handled through on-call dialogues. This high volume of on-call interactions imposes a substantial workload on human support analysts. Recent studies have explored reactive agents that leverage large language models as a first line of support […]

Synthius-Mem: Brain-Inspired Hallucination-Resistant Persona Memory Achieving 94.4% Memory Accuracy and 99.6% Adversarial Robustness on LoCoMo

arXiv:2604.11563v1 Announce Type: cross Abstract: Providing AI agents with reliable long-term memory that does not hallucinate remains an open problem. Current approaches to memory for LLM agents — sliding windows, summarization, embedding-based RAG, and flat fact extraction — each reduce token cost but introduce catastrophic information loss, semantic drift, or uncontrolled hallucination about the user. […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK. Registration number 16808844.