METRO: Towards Strategy Induction from Expert Dialogue Transcripts for Non-collaborative Dialogues

arXiv:2604.11427v2 Announce Type: replace-cross Abstract: Developing non-collaborative dialogue agents traditionally requires the manual, unscalable codification of expert strategies. We propose ours, a method that leverages large language models to autonomously induce both strategy actions and planning logic directly from raw transcripts. METRO formalizes expert knowledge into a Strategy Forest, a hierarchical structure that captures both […]

Predicting success of cooperators across arbitrary heterogeneous environmental landscapes

arXiv:2604.12546v1 Announce Type: new Abstract: Cooperation is central to the organization of complex biological and social systems. Most theoretical models assume homogeneous environments; in reality, populations inhabit spatially varying landscapes in which the payoffs of cooperation differ across space. Here, we introduce a general framework for the evolution of cooperation in complex, heterogeneous environments where […]

Calibration-Aware Policy Optimization for Reasoning LLMs

arXiv:2604.12632v1 Announce Type: cross Abstract: Group Relative Policy Optimization (GRPO) enhances LLM reasoning but often induces overconfidence, where incorrect responses yield lower perplexity than correct ones, degrading relative calibration as described by the Area Under the Curve (AUC). Existing approaches either yield limited improvements in calibration or sacrifice gains in reasoning accuracy. We first prove […]

KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance

arXiv:2604.12627v1 Announce Type: new Abstract: RLVR improves reasoning in large language models, but its effectiveness is often limited by severe reward sparsity on hard problems. Recent hint-based RL methods mitigate sparsity by injecting partial solutions or abstract templates, yet they typically scale guidance by adding more tokens, which introduce redundancy, inconsistency, and extra training overhead. […]

Wolkowicz-Styan Upper Bound on the Hessian Eigenspectrum for Cross-Entropy Loss in Nonlinear Smooth Neural Networks

arXiv:2604.10202v2 Announce Type: replace-cross Abstract: Neural networks (NNs) are central to modern machine learning and achieve state-of-the-art results in many applications. However, the relationship between loss geometry and generalization is still not well understood. The local geometry of the loss function near a critical point is well-approximated by its quadratic form, obtained through a second-order […]

Transferable Expertise for Autonomous Agents via Real-World Case-Based Learning

arXiv:2604.12717v1 Announce Type: new Abstract: LLM-based autonomous agents perform well on general reasoning tasks but still struggle to reliably use task structure, key constraints, and prior experience in complex real-world settings. We propose a case-based learning framework that converts experience from past tasks into reusable knowledge assets, allowing agents to transfer prior case experience to […]

Neural Dynamic GI: Random-Access Neural Compression for Temporal Lightmaps in Dynamic Lighting Environments

arXiv:2604.12625v1 Announce Type: cross Abstract: High-quality global illumination (GI) in real-time rendering is commonly achieved using precomputed lighting techniques, with lightmap as the standard choice. To support GI for static objects in dynamic lighting environments, multiple lightmaps at different lighting conditions need to be precomputed, which incurs substantial storage and memory overhead. To overcome this […]

LIFE — an energy efficient advanced continual learning agentic AI framework for frontier systems

arXiv:2604.12874v1 Announce Type: new Abstract: The rapid advancement of AI has changed the character of HPC usage such as dimensioning, provisioning, and execution. Not only has energy demand been amplified, but existing rudimentary continual learning capabilities limit ability of AI to effectively manage HPCs. This paper reviews emerging directions beyond monolithic transformers, emphasizing agentic AI […]

Face Density as a Proxy for Data Complexity: Quantifying the Hardness of Instance Count

arXiv:2604.09689v2 Announce Type: replace-cross Abstract: Machine learning progress has historically prioritized model-centric innovations, yet achievable performance is frequently capped by the intrinsic complexity of the data itself. In this work, we isolate and quantify the impact of instance density (measured by face count) as a primary driver of data complexity. Rather than simply observing that […]

HintMR: Eliciting Stronger Mathematical Reasoning in Small Language Models

arXiv:2604.12229v1 Announce Type: new Abstract: Small language models (SLMs) often struggle with complex mathematical reasoning due to limited capacity to maintain long chains of intermediate steps and to recover from early errors. We address this challenge by introducing a hint-assisted reasoning framework that incrementally guides SLMs through multi-step mathematical problem solving. Our approach decomposes solutions […]

Efficient Semantic Image Communication for Traffic Monitoring at the Edge

arXiv:2604.12622v1 Announce Type: cross Abstract: Many visual monitoring systems operate under strict communication constraints, where transmitting full-resolution images is impractical and often unnecessary. In such settings, visual data is often used for object presence, spatial relationships, and scene context rather than exact pixel fidelity. This paper presents two semantic image communication pipelines for traffic monitoring, […]

Frontier-Eng: Benchmarking Self-Evolving Agents on Real-World Engineering Tasks with Generative Optimization

arXiv:2604.12290v1 Announce Type: new Abstract: Current LLM agent benchmarks, which predominantly focus on binary pass/fail tasks such as code generation or search-based question answering, often neglect the value of real-world engineering that is often captured through the iterative optimization of feasible designs. To this end, we introduce Frontier-Eng, a human-verified benchmark for generative optimization — […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844