arXiv:2605.17000v1 Announce Type: cross Abstract: Optimization of LLM training and inference configurations, such as hyperparameters, data mixtures, and prompts, is critical to performance, but it is often approached heuristically in practice, leading to potentially suboptimal outcomes. By framing them as noisy, expensive, and derivative-free optimization problems, Bayesian optimization (BO) and other black-box optimization (BBO) methods […]
PopuLoRA: Co-Evolving LLM Populations for Reasoning Self-Play
arXiv:2605.16727v1 Announce Type: new Abstract: We introduce PopuLoRA, a population-based asymmetric self-play framework for reinforcement learning with verifiable rewards (RLVR) post-training of LLMs. Teachers and students are specialised LoRA adapters on a shared frozen base: teachers propose problems, matched students solve them under a programmatic verifier, and cross-evaluation between sub-populations replaces the self-calibration that limits […]
D$^2$Evo: Dual Difficulty-Aware Self-Evolution for Data-Efficient Reinforcement Learning
arXiv:2605.17037v1 Announce Type: cross Abstract: Reinforcement learning (RL) has demonstrated potential for enhancing reasoning in large language models (LLMs). However, effective RL training, which requires medium-difficulty training samples, faces two fundamental challenges: Effective Data Scarcity and Dynamic Difficulty Shifts, where medium-difficulty samples are scarce and become trivial as models improve. Existing methods mitigate this scarcity […]
UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities
arXiv:2504.20734v4 Announce Type: replace-cross Abstract: Retrieval-Augmented Generation (RAG) has shown substantial promise in improving factual accuracy by grounding model responses with external knowledge relevant to queries. However, most existing approaches are limited to a text-only corpus, and while recent efforts have extended RAG to other modalities such as images and videos, they typically operate over […]
SEMA-RAG: A Self-Evolving Multi-Agent Retrieval-Augmented Generation Framework for Medical Reasoning
arXiv:2605.17101v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) is widely employed to mitigate risks such as hallucinations and knowledge obsolescence in medical question answering, yet its predominantly single-round, static retrieval paradigm misaligns with the multi-stage process of clinical reasoning. This compressed workflow induces two structural deficiencies: question-to-query translation often lacks clinically grounded semantic interpretation, and […]
Body-Grounded Perspective Formation and Conative Attunement in Artificial Agents
arXiv:2605.16728v1 Announce Type: new Abstract: This paper proposes a minimal architecture for body-grounded perspective formation in artificial agents. Extending prior work, the model introduces an interoceptive viability signal, a Fisher-style metric over fused exteroceptive-interoceptive states, and a conative alignment mechanism linking bodily tendency to action readiness. In a reward-free gridworld, conation converts learned bodily tendency […]
Evolutionary Extreme Learning Machine of ab-initio Energy Landscapes for Crystal Structure Prediction using Manta Ray Optimization with Levy Flight
arXiv:2605.17148v1 Announce Type: cross Abstract: The Manta Ray Foraging Optimization algorithm (MRFO) has proven to be a powerful heuristic strategy in the optimal solution of a large number of engineering problems. In this paper, an improvement of MRFO with Levy Flight is suggested for the training of extreme learning machines (ELMs) whose basic model is […]
Individual utilities of life satisfaction reveal inequality aversion unrelated to political alignment
arXiv:2509.07793v4 Announce Type: replace-cross Abstract: How should well-being be prioritised in society, and what trade-offs are people willing to make between fairness and personal well-being? We investigate these questions using a stated preference experiment with a nationally representative UK sample (n = 300), in which participants evaluated life satisfaction outcomes for both themselves and others […]
Solving linear-rate ODE hierarchies (like master equations) using closures and operator splitting
arXiv:2605.17186v1 Announce Type: cross Abstract: Countably infinite systems of linear ODEs arise as forward equations for many continuous-time Markov processes. The standard recipe (truncate to a finite cap $N$ and exponentiate) pays cubic cost in $N$ and a time-growing boundary-feedback bias. We identify a structural condition on the rates, $L_{n+r,n} = \alpha_r n$ […]
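The "truncate and exponentiate" baseline mentioned in the abstract above can be sketched on a toy case. This is a minimal illustration only, not the paper's method: the pure-birth (Yule) process with rate alpha * n, the helper names, and the hand-rolled scaling-and-squaring exponential are all assumptions chosen for the example, since the abstract is truncated and gives no concrete system. The dense matrix products are where the cubic cost in the cap comes from.

```python
import math


def truncated_generator(alpha, cap):
    """Forward-equation matrix for n -> n+1 at rate alpha*n, states 1..cap.

    Hypothetical toy system. Column j holds the flows out of state j+1.
    Outflow from the cap state is kept, so probability leaks past the cap.
    """
    q = [[0.0] * cap for _ in range(cap)]
    for col in range(cap):
        n = col + 1
        q[col][col] = -alpha * n          # probability leaving state n
        if col + 1 < cap:
            q[col + 1][col] = alpha * n   # probability arriving in state n+1
    return q


def matmul(a, b):
    """Dense square matrix product, O(size^3)."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def expm(matrix, squarings=10, terms=20):
    """Matrix exponential via Taylor series with scaling and squaring."""
    size = len(matrix)
    scale = 2.0 ** squarings
    scaled = [[x / scale for x in row] for row in matrix]
    identity = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms + 1):
        # term becomes scaled^k / k!
        term = [[x / k for x in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)   # undo the scaling by repeated squaring
    return result


def solve_truncated(alpha, cap, t):
    """Distribution over states 1..cap at time t, starting in state 1."""
    q = truncated_generator(alpha, cap)
    propagator = expm([[x * t for x in row] for row in q])
    return [propagator[i][0] for i in range(cap)]  # first column: start in state 1


probs = solve_truncated(alpha=1.0, cap=20, t=1.0)
lost_mass = 1.0 - sum(probs)  # probability that escaped past the cap
```

In this toy case the retained states happen to be exact, because a pure-birth process never feeds probability back from above the cap; `lost_mass` only measures the tail that was cut off. Processes with downward transitions would not have this property, which is presumably where the boundary-feedback bias named in the abstract enters.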
State Contamination in Memory-Augmented LLM Agents
arXiv:2605.16746v1 Announce Type: new Abstract: LLM agents increasingly rely on persistent state, including transcripts, summaries, retrieved context, and memory buffers, to support long-horizon interaction. This makes safety depend not only on individual model outputs, but also on what an agent stores and later reuses. We study a failure mode we call memory laundering: toxic or […]
State-of-the-Art Claims Require State-of-the-Art Evidence
arXiv:2605.17273v1 Announce Type: cross Abstract: State-of-the-Art (SOTA) claims pervade Artificial Intelligence (AI) and Machine Learning (ML) research. These claims rest on benchmark evaluations, where models are ranked by aggregate scores across tasks. Public benchmarks or leaderboards are the most visible instance, but the same structure appears in paper tables throughout the literature. However, such minimal […]
GraphMind: Theorem Selection and Conclusion Generation Framework with Dynamic GNN for LLM Reasoning
arXiv:2511.19078v2 Announce Type: replace-cross Abstract: Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation, including multi-step reasoning such as mathematical proving. However, existing approaches often lack an explicit and dynamic mechanism to structurally represent and evolve intermediate reasoning states, which limits their ability to perform context-aware theorem selection and iterative […]