Adaptive Stopping for Multi-Turn LLM Reasoning

arXiv:2604.01413v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) increasingly rely on multi-turn reasoning and interaction, such as adaptive retrieval-augmented generation (RAG) and ReAct-style agents, to answer difficult questions. These methods improve accuracy by iteratively retrieving information, reasoning, or acting, but introduce a key challenge: when should the model stop? Existing approaches rely on heuristic […]
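The abstract is truncated, but the stopping question it poses can be illustrated with a minimal sketch: an iterative answer loop that halts once a model-reported confidence clears a fixed threshold or a turn budget runs out. The `step` callable, the threshold, and the toy confidence schedule below are assumptions for illustration, not the paper's method.

```python
from typing import Callable, List, Tuple

def adaptive_answer(
    question: str,
    step: Callable[[str, List[str]], Tuple[str, float]],
    confidence_threshold: float = 0.9,
    max_turns: int = 5,
) -> Tuple[str, int]:
    """Run retrieve/reason steps; stop once the model's self-reported
    confidence clears the threshold or the turn budget is exhausted."""
    evidence: List[str] = []
    answer = ""
    for turn in range(1, max_turns + 1):
        answer, confidence = step(question, evidence)
        evidence.append(answer)  # carry intermediate output into the next turn
        if confidence >= confidence_threshold:
            return answer, turn  # early stop: confident enough to answer
    return answer, max_turns     # budget exhausted: return best effort

# Toy step: confidence grows as evidence accumulates.
def toy_step(question: str, evidence: List[str]) -> Tuple[str, float]:
    return f"draft-{len(evidence)}", 0.5 + 0.2 * len(evidence)

answer, turns = adaptive_answer("Who wrote Hamlet?", toy_step)
```

In practice the confidence signal might come from token log-probabilities or a verifier model; the fixed-threshold rule here is exactly the kind of heuristic the abstract says existing approaches rely on.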

Effects of Generative AI Errors on User Reliance Across Task Difficulty

arXiv:2604.04319v1 Announce Type: cross Abstract: The capabilities of artificial intelligence (AI) lie along a jagged frontier, where AI systems surprisingly fail on tasks that humans find easy and succeed on tasks that humans find hard. To investigate user reactions to this phenomenon, we developed an incentive-compatible experimental methodology based on diagram generation tasks, in which […]

Compressible Softmax-Attended Language under Incompressible Attention

arXiv:2604.04384v1 Announce Type: cross Abstract: Across every attention head in five transformer language models (124M–7B parameters, four architecture families), the logit energy field $\tilde{E}$ reaches 90% of its variance in 2–11 singular components. The learned interaction matrix $W_Q^{\mathrm{T}} W_K$ needs 38–75 components for the same threshold out of $d_h \in \{64, 128\}$. The spectral gap […]
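The "components to reach 90% of variance" counts quoted here correspond to a standard spectral computation: the smallest $k$ whose top singular values carry 90% of the squared singular-value mass. A hedged sketch on synthetic matrices (the low-rank construction and the $d_h = 64$ setting are illustrative assumptions, not the paper's measurements):

```python
import numpy as np

def components_for_variance(M: np.ndarray, threshold: float = 0.90) -> int:
    """Smallest k such that the top-k singular components of M capture
    `threshold` of the total squared singular-value mass."""
    s = np.linalg.svd(M, compute_uv=False)   # singular values, descending
    energy = s ** 2
    cum = np.cumsum(energy) / energy.sum()   # cumulative variance fraction
    return int(np.searchsorted(cum, threshold) + 1)

rng = np.random.default_rng(0)
d_h = 64
# Synthetic stand-in for a matrix dominated by a few directions.
low_rank = rng.standard_normal((d_h, 3)) @ rng.standard_normal((3, d_h))
k_low = components_for_variance(low_rank)                         # at most 3
k_full = components_for_variance(rng.standard_normal((d_h, d_h)))  # many more
```

A rank-3 matrix needs at most 3 components for any threshold, while an unstructured Gaussian matrix spreads its mass across most of its spectrum, which is the compressible-vs-incompressible contrast the title draws.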

DP-OPD: Differentially Private On-Policy Distillation for Language Models

arXiv:2604.04461v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly adapted to proprietary and domain-specific corpora that contain sensitive information, creating a tension between formal privacy guarantees and efficient deployment through model compression. Differential privacy (DP), typically enforced via DP-SGD, provides record-level protection but often incurs substantial utility loss in autoregressive generation, where optimization […]
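DP-SGD, mentioned here as the typical enforcement mechanism for record-level privacy, combines per-example gradient clipping with calibrated Gaussian noise. A minimal sketch of one update step; all hyperparameters (clip norm, noise multiplier, learning rate) are arbitrary illustrative choices, not values from the paper.

```python
import numpy as np

def dp_sgd_update(params, per_example_grads, lr=0.1, clip_norm=1.0,
                  noise_multiplier=1.1, rng=None):
    """One DP-SGD step: clip each per-example gradient to L2 norm
    `clip_norm`, average, then add Gaussian noise scaled to the bound."""
    rng = rng or np.random.default_rng()
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose norm exceeds clip_norm.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    sigma = noise_multiplier * clip_norm / len(per_example_grads)
    noise = rng.normal(0.0, sigma, size=mean_grad.shape)
    return params - lr * (mean_grad + noise)

params = np.zeros(4)
grads = [np.full(4, 10.0), np.full(4, 6.0)]  # both exceed the clip bound
new_params = dp_sgd_update(params, grads, noise_multiplier=0.0,
                           rng=np.random.default_rng(0))
```

The clipping bounds each record's influence on the update, which is what makes the added noise yield a formal privacy guarantee; the utility loss the abstract mentions comes from both the clipping bias and the injected noise.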

One Model for All: Multi-Objective Controllable Language Models

arXiv:2604.04497v1 Announce Type: cross Abstract: Aligning large language models (LLMs) with human preferences is critical for enhancing LLMs’ safety, helpfulness, humor, faithfulness, etc. Current reinforcement learning from human feedback (RLHF) mainly focuses on a fixed reward learned from average human ratings, which may weaken adaptability and controllability across varying preferences. However, creating personalized LLMs […]

Grokking as Dimensional Phase Transition in Neural Networks

arXiv:2604.04655v1 Announce Type: cross Abstract: Neural network grokking — the abrupt memorization-to-generalization transition — challenges our understanding of learning dynamics. Through finite-size scaling of gradient avalanche dynamics across eight model scales, we find that grokking is a dimensional phase transition: effective dimensionality $D$ crosses from sub-diffusive (subcritical, $D < 1$) to super-diffusive (supercritical, $D > 1$) […]

What Makes Good Multilingual Reasoning? Disentangling Reasoning Traces with Measurable Features

arXiv:2604.04720v1 Announce Type: cross Abstract: Large Reasoning Models (LRMs) still exhibit large performance gaps between English and other languages, yet much current work assumes these gaps can be closed simply by making reasoning in every language resemble English reasoning. This work challenges this assumption by asking instead: what actually characterizes effective reasoning in multilingual settings, […]

Plausibility as Commonsense Reasoning: Humans Succeed, Large Language Models Do Not

arXiv:2604.04825v1 Announce Type: cross Abstract: Large language models achieve strong performance on many language tasks, yet it remains unclear whether they integrate world knowledge with syntactic structure in a human-like, structure-sensitive way during ambiguity resolution. We test this question in Turkish prenominal relative-clause attachment ambiguities, where the same surface string permits high attachment (HA) or […]

Agentic Federated Learning: The Future of Distributed Training Orchestration

arXiv:2604.04895v1 Announce Type: cross Abstract: Although Federated Learning (FL) promises privacy and distributed collaboration, its effectiveness in real-world scenarios is often hampered by the stochastic heterogeneity of clients and unpredictable system dynamics. Existing static optimization approaches fail to adapt to these fluctuations, resulting in resource underutilization and systemic bias. In this work, we propose a […]

A Multi-Agent Reinforcement Learning Framework for Public Health Decision Analysis

arXiv:2311.00855v3 Announce Type: replace Abstract: Human immunodeficiency virus (HIV) is a major public health concern in the United States (U.S.), with about 1.2 million people living with it and about 35,000 newly infected each year. There are considerable geographical disparities in HIV burden and care access across the U.S. The ‘Ending the HIV Epidemic (EHE)’ […]

Environment heterogeneity creates fast amplifiers of natural selection in graph-structured populations

arXiv:2507.23769v3 Announce Type: replace Abstract: Complex spatial structure, with partially isolated subpopulations, and environment heterogeneity, such as gradients in nutrients, oxygen, and drugs, both shape the evolution of natural populations. We investigate the impact of environment heterogeneity on mutant fixation in spatially structured populations with demes on the nodes of a graph. When migrations between […]
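The abstract's setting, with demes on graph nodes and migration between them, is richer than anything sketchable here, but the underlying question of mutant fixation on a graph can be illustrated with a short birth-death Moran simulation. The graph, fitness value, and trial count below are illustrative assumptions, not the paper's model.

```python
import random

def moran_fixation(adj, fitness_mutant=2.0, trials=2000, seed=0):
    """Estimate mutant fixation probability under a birth-death Moran
    process: a reproducer is picked proportional to fitness, and its
    offspring replaces a uniformly chosen graph neighbour."""
    rng = random.Random(seed)
    n = len(adj)
    fixed = 0
    for _ in range(trials):
        state = [0] * n
        state[rng.randrange(n)] = 1  # single mutant at a random node
        while 0 < sum(state) < n:    # run until extinction or fixation
            weights = [fitness_mutant if s else 1.0 for s in state]
            parent = rng.choices(range(n), weights=weights)[0]
            child = rng.choice(adj[parent])
            state[child] = state[parent]
        fixed += state[0]            # 1 iff the mutant fixed
    return fixed / trials

# Complete graph on 4 nodes: the well-mixed baseline.
complete4 = [[j for j in range(4) if j != i] for i in range(4)]
p = moran_fixation(complete4)
```

For a well-mixed population of size $N$ with mutant fitness $r$, the classical fixation probability is $\rho = (1 - 1/r)/(1 - 1/r^N)$, about 0.533 for $r = 2$, $N = 4$; structures that push the estimate above this baseline are amplifiers of selection, the objects the title refers to.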

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK; registration number 16808844.