MathAgent: Adversarial Evolution of Constraint Graphs for Mathematical Reasoning Data Synthesis

arXiv:2604.11188v1 Announce Type: cross Abstract: Synthesizing high-quality mathematical reasoning data without human priors remains a significant challenge. Current approaches typically rely on seed data mutation or simple prompt engineering, often suffering from mode collapse and limited logical complexity. This paper proposes a hierarchical synthesis framework that formulates data synthesis as an unsupervised optimization problem over […]

OpenFlo: Automated UX Evaluation via Simulated Human Web Interaction with GUI Grounding

arXiv:2604.09581v1 Announce Type: new Abstract: Evaluating web usability typically requires time-consuming user studies and expert reviews, which often limits iteration speed during product development, especially for small teams and agile workflows. We present OpenFlo, a user-experience evaluation agent that simulates user behavior on websites and produces standardized usability. Unlike traditional tools that rely on DOM […]

Explainable Planning for Hybrid Systems

arXiv:2604.09578v1 Announce Type: new Abstract: The recent advancement in artificial intelligence (AI) technologies facilitates a paradigm shift toward automation. Autonomous systems are fully or partially replacing manually crafted ones. At the core of these systems is automated planning. With the advent of powerful planners, automated planning is now applied to many complex and safety-critical domains, […]

SLALOM: Simulation Lifecycle Analysis via Longitudinal Observation Metrics for Social Simulation

arXiv:2604.11466v1 Announce Type: cross Abstract: Large Language Model (LLM) agents offer a potentially transformative path forward for generative social science but face a critical crisis of validity. Current simulation evaluation methodologies suffer from the “stopped clock” problem: they confirm that a simulation reached the correct final outcome while ignoring whether the trajectory leading to it was […]

Help Without Being Asked: A Deployed Proactive Agent System for On-Call Support with Continuous Self-Improvement

arXiv:2604.09579v1 Announce Type: new Abstract: In large-scale cloud service platforms, thousands of customer tickets are generated daily and are typically handled through on-call dialogues. This high volume of on-call interactions imposes a substantial workload on human support analysts. Recent studies have explored reactive agents that leverage large language models as a first line of support […]

Synthius-Mem: Brain-Inspired Hallucination-Resistant Persona Memory Achieving 94.4% Memory Accuracy and 99.6% Adversarial Robustness on LoCoMo

arXiv:2604.11563v1 Announce Type: cross Abstract: Providing AI agents with reliable long-term memory that does not hallucinate remains an open problem. Current approaches to memory for LLM agents — sliding windows, summarization, embedding-based RAG, and flat fact extraction — each reduce token cost but introduce catastrophic information loss, semantic drift, or uncontrolled hallucination about the user. […]

AHC: Meta-Learned Adaptive Compression for Continual Object Detection on Memory-Constrained Microcontrollers

arXiv:2604.09576v1 Announce Type: new Abstract: Deploying continual object detection on microcontrollers (MCUs) with under 100KB memory requires efficient feature compression that can adapt to evolving task distributions. Existing approaches rely on fixed compression strategies (e.g., FiLM conditioning) that cannot adapt to heterogeneous task characteristics, leading to suboptimal memory utilization and catastrophic forgetting. We introduce Adaptive […]

Why Do Large Language Models Generate Harmful Content?

arXiv:2604.11663v1 Announce Type: new Abstract: Large Language Models (LLMs) have been shown to generate harmful content. However, the underlying causes of such behavior remain underexplored. We propose a causal mediation analysis-based approach to identify the causal factors responsible for harmful generation. Our method performs a multi-granular analysis across model layers, modules (MLP and attention […]

Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music

arXiv:2604.10905v1 Announce Type: cross Abstract: We present Audio Flamingo Next (AF-Next), the next-generation and most capable large audio-language model in the Audio Flamingo series, designed to advance understanding and reasoning over speech, environmental sounds and music. Compared to Audio Flamingo 3, AF-Next introduces: (i) a stronger foundational audio-language model that significantly improves accuracy across diverse […]

Use of AI Tools: Guidelines to Maintain Academic Integrity in Computing Colleges

arXiv:2604.11111v1 Announce Type: cross Abstract: The rapid adoption of AI tools such as ChatGPT has significantly transformed academic practices, offering considerable benefits for both students and faculty in computing disciplines. These tools have been shown to enhance learning efficiency, academic self-efficacy, and confidence. However, their increasing use also raises pressing concerns regarding the preservation of […]

RL-Driven Sustainable Land-Use Allocation for the Lake Malawi Basin

arXiv:2604.03768v2 Announce Type: replace Abstract: Unsustainable land-use practices in ecologically sensitive regions threaten biodiversity, water resources, and the livelihoods of millions. This paper presents a deep reinforcement learning (RL) framework for optimizing land-use allocation in the Lake Malawi Basin to maximize total ecosystem service value (ESV). Drawing on the benefit transfer methodology of Costanza et […]

Reliable Evaluation Protocol for Low-Precision Retrieval

arXiv:2508.03306v4 Announce Type: replace-cross Abstract: Lowering the numerical precision of model parameters and computations is widely adopted to improve the efficiency of retrieval systems. However, when computing relevance scores between the query and documents in low-precision, we observe spurious ties due to the reduced granularity. This introduces high variability in the results based on tie […]
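
The spurious-tie effect described in this abstract can be illustrated with a minimal sketch. It is not the paper's protocol: the document names and score values below are hypothetical, and the rounding uses only Python's standard `struct` module with its IEEE 754 half-precision (`"e"`) format.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


# Hypothetical full-precision relevance scores for three documents.
scores = {"doc_a": 0.9001, "doc_b": 0.9000, "doc_c": 0.8999}

# The same scores after rounding to half precision.
fp16_scores = {doc: to_fp16(s) for doc, s in scores.items()}

# Count how many distinct score values survive in each representation.
distinct_full = len(set(scores.values()))
distinct_fp16 = len(set(fp16_scores.values()))

print("distinct scores at full precision:", distinct_full)
print("distinct scores at half precision:", distinct_fp16)
```

In this sketch the three scores are distinct at full precision but collapse to a single half-precision value, because representable half-precision numbers between 0.5 and 1 are spaced 2^-11 apart. Once scores tie, the ranking depends on how the ties are broken rather than on the scores themselves.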

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844