arXiv:2604.25191v1 Announce Type: cross Abstract: Chip placement is a critical step in physical design. While reinforcement learning (RL)-based methods have recently emerged, their training primarily focuses on wirelength optimization, and they therefore often fail to achieve expert-quality layouts. We identify the reward design as the primary cause of the performance gap with experts, and instead of […]
Below-Chance Blindness: Prompted Underperformance in Small LLMs Produces Positional Bias Rather than Answer Avoidance
arXiv:2604.25249v1 Announce Type: cross Abstract: Detecting sandbagging, the deliberate underperformance on capability evaluations, is an open problem in AI safety. We tested whether symptom validity testing (SVT) logic from clinical malingering detection could identify sandbagging through below-chance performance (BCB) on forced-choice items. In a pre-registered pilot at the 7-9 billion parameter instruction-tuned scale (3 models, 4 MMLU-Pro […]
The Thinking Pixel: Recursive Sparse Reasoning in Multimodal Diffusion Latents
arXiv:2604.25299v1 Announce Type: cross Abstract: Diffusion models have achieved success in high-fidelity data synthesis, yet their capacity for more complex, structured reasoning, such as text-following tasks, remains constrained. While advances in language models have leveraged strategies such as latent reasoning and recursion to enhance text understanding capabilities, extending these to multimodal text-to-image generation tasks is […]
BifDet: A 3D Bifurcation Detection Dataset for Airway-Tree Modeling
arXiv:2604.24999v1 Announce Type: cross Abstract: Thoracic Computed Tomography (CT) scans offer detailed insights into the intricate branching network of the airway tree, which is essential for understanding various respiratory diseases. Airway bifurcations, where airway branches split, are crucial landmarks for understanding lung physiology, disease mechanisms and lesion localization. Despite the significance of bifurcation analysis, a […]
When Errors Can Be Beneficial: A Categorization of Imperfect Rewards for Policy Gradient
arXiv:2604.25872v1 Announce Type: cross Abstract: Training language models via reinforcement learning often relies on imperfect proxy rewards, since ground truth rewards that precisely define the intended behavior are rarely available. Standard metrics for assessing the quality of proxy rewards, such as ranking accuracy, treat incorrect rewards as strictly harmful. In this work, however, we highlight […]
Ternary Memristive Logic: Hardware for Reasoning Realized via Domain Algebra
arXiv:2604.20891v2 Announce Type: replace-cross Abstract: Memristive crossbars store numerical weights needing aggregation and decoding; a single junction means nothing alone. This paper presents a fundamentally different use: each junction stores a complete, domain-scoped logical assertion (holds/negated/undefined). Ternary resistance states encode these values directly. We establish a structure-preserving mapping from a domain algebra to crossbar topology: […]
Contrast-Enhanced Gating in GRUs for Robust Low-Data Sequence Learning
arXiv:2402.09034v3 Announce Type: replace-cross Abstract: Activation functions govern how recurrent networks regulate and transmit information across temporal dependencies. Despite advances in sequence modelling, gated recurrent units (GRUs) still depend on the standard sigmoid and tanh nonlinearities, which can produce weak gate separation and unstable learning, particularly when training data are limited. We introduce squared sigmoid-tanh […]
Maximum-Entropy Model of Colored Noise in Superdiffusive Axonal Growth
arXiv:2506.11272v2 Announce Type: replace-cross Abstract: We develop a coarse-grained stochastic theory for axonal growth on micropatterned substrates using the Shannon–Jaynes maximum entropy principle. Starting from a Langevin description of growth cone motion, we infer the effective distribution of traction force relaxation rates from experimentally motivated constraints rather than postulating the colored noise directly. The resulting […]
Three Models of RLHF Annotation: Extension, Evidence, and Authority
arXiv:2604.25895v1 Announce Type: cross Abstract: Preference-based alignment methods, most prominently Reinforcement Learning with Human Feedback (RLHF), use the judgments of human annotators to shape large language model behaviour. However, the normative role of these judgments is rarely made explicit. I distinguish three conceptual models of that role. The first is extension: annotators extend the system […]
Why Do LLM-based Web Agents Fail? A Hierarchical Planning Perspective
arXiv:2603.14248v2 Announce Type: replace Abstract: Large language model (LLM) web agents are increasingly used for web navigation but remain far from human reliability on realistic, long-horizon tasks. Existing evaluations focus primarily on end-to-end success, offering limited insight into where failures arise. We propose a hierarchical planning framework to analyze web agents across three layers (i.e., […]
Gelina: Unified Speech and Gesture Synthesis via Interleaved Token Prediction
arXiv:2510.12834v4 Announce Type: replace-cross Abstract: Human communication is multimodal, with speech and gestures tightly coupled, yet most computational methods for generating speech and gestures synthesize them sequentially, weakening synchrony and prosody alignment. We introduce Gelina, a unified framework that jointly synthesizes speech and co-speech gestures from text using interleaved token sequences in a discrete autoregressive […]
Bayesian Rate Inference for Sequence Motif Dynamics in Systems of Reactive Nucleic Acids
arXiv:2604.25701v1 Announce Type: cross Abstract: The RNA world hypothesis suggests a pathway by which life emerged on early Earth. It assumes that life started with RNA-based systems capable of storing, transmitting and replicating information, envisioning that monomers and short RNA oligomers interact to form longer strands, eventually becoming catalytically active ribozymes. Key reactions in […]