arXiv:2511.16997v1 Announce Type: new Abstract: The emergence of AI Scientists has demonstrated remarkable potential in automating scientific research. However, current approaches largely conceptualize scientific discovery as a solitary optimization or search process, overlooking that knowledge production is inherently a social and historical endeavor. Human scientific insight stems from two distinct yet interconnected sources. First is […]
Budget-Aware Tool-Use Enables Effective Agent Scaling
arXiv:2511.17006v1 Announce Type: new Abstract: Scaling test-time computation improves performance across different tasks on large language models (LLMs), which has also been extended to tool-augmented agents. For these agents, scaling involves not only “thinking” in tokens but also “acting” via tool calls. The number of tool calls directly bounds the agent’s interaction with the external […]
Hybrid Differential Reward: Combining Temporal Difference and Action Gradients for Efficient Multi-Agent Reinforcement Learning in Cooperative Driving
arXiv:2511.16916v1 Announce Type: new Abstract: In multi-vehicle cooperative driving tasks involving high-frequency continuous control, traditional state-based reward functions suffer from the issue of vanishing reward differences. This phenomenon results in a low signal-to-noise ratio (SNR) for policy gradients, significantly hindering algorithm convergence and performance improvement. To address this challenge, this paper proposes a novel Hybrid […]
Comparing verbal, visual and combined explanations for Bayesian Network inferences
arXiv:2511.16961v1 Announce Type: new Abstract: Bayesian Networks (BNs) are an important tool for assisting probabilistic reasoning, but despite being considered transparent models, people have trouble understanding them. Further, current User Interfaces (UIs) still do not clarify the reasoning of BNs. To address this problem, we have designed verbal and visual extensions to the standard BN […]
Fantastic Bugs and Where to Find Them in AI Benchmarks
arXiv:2511.16842v1 Announce Type: new Abstract: Benchmarks are pivotal in driving AI progress, and invalid benchmark questions frequently undermine their reliability. Manually identifying and correcting errors among thousands of benchmark questions is not only infeasible but also a critical bottleneck for reliable evaluation. In this work, we introduce a framework for systematic benchmark revision that leverages […]
Cognitive BASIC: An In-Model Interpreted Reasoning Language for LLMs
arXiv:2511.16837v1 Announce Type: new Abstract: Cognitive BASIC is a minimal, BASIC-style prompting language and in-model interpreter that structures large language model (LLM) reasoning into explicit, stepwise execution traces. Inspired by the simplicity of retro BASIC, we repurpose numbered lines and simple commands as an interpretable cognitive control layer. Modern LLMs can reliably simulate such short […]
Stable diffusion models reveal a persisting human and AI gap in visual creativity
arXiv:2511.16814v1 Announce Type: new Abstract: While recent research suggests Large Language Models match human creative performance in divergent thinking tasks, visual creativity remains underexplored. This study compared image generation in human participants (Visual Artists and Non Artists) and using an image generation AI model (two prompting conditions with varying human input: high for Human Inspired, […]
Interactive Query Answering on Knowledge Graphs with Soft Entity Constraints
arXiv:2508.13663v2 Announce Type: replace Abstract: Methods for query answering over incomplete knowledge graphs retrieve entities that are likely to be answers, which is particularly useful when such answers cannot be reached by direct graph traversal due to missing edges. However, existing approaches have focused on queries formalized using first-order logic. In practice, many real-world queries involve […]
DAPS++: Rethinking Diffusion Inverse Problems with Decoupled Posterior Annealing
arXiv:2511.17038v1 Announce Type: new Abstract: From a Bayesian perspective, score-based diffusion solves inverse problems through joint inference, embedding the likelihood with the prior to guide the sampling process. However, this formulation fails to explain its practical behavior: the prior offers limited guidance, while reconstruction is largely driven by the measurement-consistency term, leading to an inference […]
Emergence of psychopathological computations in large language models
arXiv:2504.08016v2 Announce Type: replace Abstract: Can large language models (LLMs) instantiate computations of psychopathology? An effective approach to the question hinges on addressing two factors. First, for conceptual validity, we require a general and computational account of psychopathology that is applicable to computational entities without biological embodiment or subjective experience. Second, psychopathological computations, derived from […]
Spanning Tree Autoregressive Visual Generation
arXiv:2511.17089v1 Announce Type: cross Abstract: We present Spanning Tree Autoregressive (STAR) modeling, which can incorporate prior knowledge of images, such as center bias and locality, to maintain sampling performance while also providing sufficiently flexible sequence orders to accommodate image editing at inference. Approaches that expose randomly permuted sequence orders to conventional autoregressive (AR) models in […]
PepEVOLVE: Position-Aware Dynamic Peptide Optimization via Group-Relative Advantage
arXiv:2511.16912v1 Announce Type: cross Abstract: Macrocyclic peptides are an emerging modality that combines biologics-like affinity with small-molecule-like developability, but their vast combinatorial space and multi-parameter objectives make lead optimization slow and challenging. Prior generative approaches such as PepINVENT require chemists to pre-specify mutable positions for optimization, choices that are not always known a priori, and […]