Huxley-Gödel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine

October 29, 2025

arXiv:2510.21614v2 Announce Type: replace
Abstract: Recent studies operationalize self-improvement through coding agents that edit their own codebases. They grow a tree of self-modifications through expansion strategies that favor higher software engineering benchmark performance, assuming that this implies more promising subsequent self-modifications. However, we identify a mismatch between the agent’s self-improvement potential (metaproductivity) and its coding benchmark performance, namely the Metaproductivity-Performance Mismatch. Inspired by Huxley’s concept of clade, we propose a metric ($\mathrm{CMP}$) that aggregates the benchmark performances of the descendants of an agent as an indicator of its potential for self-improvement. We show that, in our self-improving coding agent development setting, access to the true $\mathrm{CMP}$ is sufficient to simulate how the Gödel Machine would behave under certain assumptions. We introduce the Huxley-Gödel Machine (HGM), which, by estimating $\mathrm{CMP}$ and using it as guidance, searches the tree of self-modifications. On SWE-bench Verified and Polyglot, HGM outperforms prior self-improving coding agent development methods while using less wall-clock time. Last but not least, HGM demonstrates strong transfer to other coding datasets and large language models. The agent optimized by HGM on SWE-bench Verified with GPT-5-mini and evaluated on SWE-bench Lite with GPT-5 achieves human-level performance, matching the best officially checked results of human-engineered coding agents. Our code is available at https://github.com/metauto-ai/HGM.
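
To make the idea of a clade-level metric concrete, here is a minimal Python sketch. It is not the HGM implementation: the abstract does not say how descendant performances are aggregated or how $\mathrm{CMP}$ is estimated, so the plain mean used here, the fallback for agents without descendants, and every name (`AgentNode`, `descendants`, `clade_score`) are assumptions made for illustration. The toy scores are invented and only show how ranking by an agent's own benchmark score can differ from ranking by its descendants' scores.

```python
from dataclasses import dataclass, field


@dataclass
class AgentNode:
    """One agent in a tree of self-modifications (hypothetical structure)."""
    name: str
    score: float  # this agent's own benchmark score, e.g. fraction of tasks solved
    children: list["AgentNode"] = field(default_factory=list)


def descendants(node):
    """Yield every strict descendant of `node`, depth first."""
    for child in node.children:
        yield child
        yield from descendants(child)


def clade_score(node):
    """Mean benchmark score over the descendants of `node`.

    Assumption: a plain mean stands in for the paper's aggregate, and an
    agent with no descendants falls back to its own score.
    """
    scores = [d.score for d in descendants(node)]
    return sum(scores) / len(scores) if scores else node.score


# Toy tree with invented scores: `a` scores higher itself, but the
# agents derived from `b` score higher.
a = AgentNode("a", 0.50, [AgentNode("a1", 0.30), AgentNode("a2", 0.30)])
b = AgentNode("b", 0.45, [AgentNode("b1", 0.60), AgentNode("b2", 0.70)])

best_by_own_score = max([a, b], key=lambda n: n.score)
best_by_clade = max([a, b], key=clade_score)
```

In this toy tree, selecting by own score picks `a`, while selecting by the clade aggregate picks `b`, which is the kind of mismatch the abstract describes between benchmark performance and self-improvement potential.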
