arXiv:2601.20352v1 Announce Type: new Abstract: The rapid evolution of Large Language Model (LLM) agents has necessitated robust memory systems to support cohesive long-term interaction and complex reasoning. Benefiting from the strong capabilities of LLMs, recent research focus has shifted from simple context extension to the development of dedicated agentic memory systems. However, existing approaches typically […]
Robust Deep Monte Carlo Counterfactual Regret Minimization: Addressing Theoretical Risks in Neural Fictitious Self-Play
arXiv:2509.00923v2 Announce Type: replace Abstract: Monte Carlo Counterfactual Regret Minimization (MCCFR) has emerged as a cornerstone algorithm for solving extensive-form games, but its integration with deep neural networks introduces scale-dependent challenges that manifest differently across game complexities. This paper presents a comprehensive analysis of how neural MCCFR component effectiveness varies with game scale and proposes […]
Policy of Thoughts: Scaling LLM Reasoning via Test-time Policy Evolution
arXiv:2601.20379v1 Announce Type: new Abstract: Large language models (LLMs) struggle with complex, long-horizon reasoning due to instability caused by their frozen policy assumption. Current test-time scaling methods treat execution feedback merely as an external signal for filtering or rewriting trajectories, without internalizing it to improve the underlying reasoning strategy. Inspired by Popper’s epistemology of “conjectures […]
Epistemic Constitutionalism Or: how to avoid coherence bias
arXiv:2601.14295v2 Announce Type: replace Abstract: Large language models increasingly function as artificial reasoners: they evaluate arguments, assign credibility, and express confidence. Yet their belief-forming behavior is governed by implicit, uninspected epistemic policies. This paper argues for an epistemic constitution for AI: explicit, contestable meta-norms that regulate how systems form and express beliefs. Source attribution bias […]
MemCtrl: Using MLLMs as Active Memory Controllers on Embodied Agents
arXiv:2601.20831v1 Announce Type: new Abstract: Foundation models rely on in-context learning for personalized decision making. The limited size of this context window necessitates memory compression and retrieval systems like RAG. These systems however often treat memory as large offline storage spaces, which is unfavorable for embodied agents that are expected to operate under strict memory […]
FAIR-MATCH: A Multi-Objective Framework for Bias Mitigation in Reciprocal Dating Recommendations
arXiv:2507.01063v2 Announce Type: replace-cross Abstract: Online dating platforms have fundamentally transformed the formation of romantic relationships, with millions of users worldwide relying on algorithmic matching systems to find compatible partners. However, current recommendation systems in dating applications suffer from significant algorithmic deficiencies, including but not limited to popularity bias, filter bubble effects, and inadequate reciprocity […]
AuditoryBench++: Can Language Models Understand Auditory Knowledge without Hearing?
arXiv:2509.17641v2 Announce Type: replace-cross Abstract: Even without directly hearing sounds, humans can effortlessly reason about auditory properties, such as pitch, loudness, or sound-source associations, drawing on auditory commonsense. In contrast, language models often lack this capability, limiting their effectiveness in multimodal interactions. As an initial step to address this gap, we present AuditoryBench++, a comprehensive […]
mR3: Multilingual Rubric-Agnostic Reward Reasoning Models
arXiv:2510.01146v2 Announce Type: replace-cross Abstract: Evaluation using Large Language Model (LLM) judges has been widely adopted in English and shown to be effective for automatic evaluation. However, their performance does not generalize well to non-English settings, and it remains unclear what constitutes effective multilingual training for such judges. In this paper, we introduce mR3, a […]
Normative Equivalence in Human-AI Cooperation: Behaviour, Not Identity, Drives Cooperation in Mixed-Agent Groups
arXiv:2601.20487v2 Announce Type: new Abstract: The introduction of artificial intelligence (AI) agents into human group settings raises essential questions about how these novel participants influence cooperative social norms. While previous studies on human-AI cooperation have primarily focused on dyadic interactions, little is known about how integrating AI agents affects the emergence and maintenance of cooperative […]
PathWise: Planning through World Model for Automated Heuristic Design via Self-Evolving LLMs
arXiv:2601.20539v2 Announce Type: new Abstract: Large Language Models (LLMs) have enabled automated heuristic design (AHD) for combinatorial optimization problems (COPs), but existing frameworks’ reliance on fixed evolutionary rules and static prompt templates often leads to myopic heuristic generation, redundant evaluations, and limited reasoning about how new heuristics should be derived. We propose a novel multi-agent […]
SimpleMem: Efficient Lifelong Memory for LLM Agents
arXiv:2601.02553v3 Announce Type: replace Abstract: To support long-term interaction in complex environments, LLM agents require memory systems that manage historical experiences. Existing approaches either retain full interaction histories via passive context extension, leading to substantial redundancy, or rely on iterative reasoning to filter noise, incurring high token costs. To address this challenge, we introduce SimpleMem, […]
LLMs versus the Halting Problem: Revisiting Program Termination Prediction
arXiv:2601.18987v3 Announce Type: replace-cross Abstract: Determining whether a program terminates is a central problem in computer science. Turing’s foundational result established the Halting Problem as undecidable, showing that no algorithm can universally determine termination for all programs and inputs. Consequently, automatic verification tools approximate termination, sometimes failing to prove or disprove; these tools rely on […]