arXiv:2510.27091v1 Announce Type: cross Abstract: Quantum theory provides non-classical principles, such as superposition and entanglement, that inspire promising paradigms in machine learning. However, most existing quantum-inspired fusion models rely solely on unitary or unitary-like transformations to generate quantum entanglement. While theoretically expressive, such approaches often suffer from training instability and limited generalizability. In this work, […]
CATArena: Evaluation of LLM Agents through Iterative Tournament Competitions
arXiv:2510.26852v1 Announce Type: new Abstract: Large Language Model (LLM) agents have evolved from basic text generation to autonomously completing complex tasks through interaction with external tools. However, current benchmarks mainly assess end-to-end performance in fixed scenarios, restricting evaluation to specific skills and suffering from score saturation and growing dependence on expert annotation as agent capabilities […]
H2-Cache: A Novel Hierarchical Dual-Stage Cache for High-Performance Acceleration of Generative Diffusion Models
arXiv:2510.27171v1 Announce Type: cross Abstract: Diffusion models have emerged as state-of-the-art in image generation, but their practical deployment is hindered by the significant computational cost of their iterative denoising process. While existing caching techniques can accelerate inference, they often create a challenging trade-off between speed and fidelity, suffering from quality degradation and high computational overhead. […]
The End of Manual Decoding: Towards Truly End-to-End Language Models
arXiv:2510.26697v2 Announce Type: replace-cross Abstract: The “end-to-end” label for LLMs is a misnomer. In practice, they depend on a non-differentiable decoding process that requires laborious hand-tuning of hyperparameters like temperature and top-p. This paper introduces AutoDeco, a novel architecture that enables truly “end-to-end” generation by learning to control its own decoding strategy. We augment the […]
Feature-Function Curvature Analysis: A Geometric Framework for Explaining Differentiable Models
arXiv:2510.27207v1 Announce Type: cross Abstract: Explainable AI (XAI) is critical for building trust in complex machine learning models, yet mainstream attribution methods often provide an incomplete, static picture of a model’s final state. By collapsing a feature’s role into a single score, they are confounded by non-linearity and interactions. To address this, we introduce Feature-Function […]
Causal Masking on Spatial Data: An Information-Theoretic Case for Learning Spatial Datasets with Unimodal Language Models
arXiv:2510.27009v1 Announce Type: new Abstract: Language models are traditionally designed around causal masking. In domains with spatial or relational structure, causal masking is often viewed as inappropriate, and sequential linearizations are instead used. Yet the question of whether it is viable to accept the information loss introduced by causal masking on nonsequential data has received […]
Reconstructing Unseen Sentences from Speech-related Biosignals for Open-vocabulary Neural Communication
arXiv:2510.27247v1 Announce Type: cross Abstract: Brain-to-speech (BTS) systems represent a groundbreaking approach to human communication by enabling the direct transformation of neural activity into linguistic expressions. While recent non-invasive BTS studies have largely focused on decoding predefined words or sentences, achieving open-vocabulary neural communication comparable to natural human interaction requires decoding unconstrained speech. Additionally, effectively […]
InertialAR: Autoregressive 3D Molecule Generation with Inertial Frames
arXiv:2510.27497v1 Announce Type: cross Abstract: Transformer-based autoregressive models have emerged as a unifying paradigm across modalities such as text and images, but their extension to 3D molecule generation remains underexplored. The gap stems from two fundamental challenges: (1) tokenizing molecules into a canonical 1D sequence of tokens that is invariant to both SE(3) transformations and […]
HiF-DTA: Hierarchical Feature Learning Network for Drug-Target Affinity Prediction
arXiv:2510.27281v1 Announce Type: cross Abstract: Accurate prediction of Drug-Target Affinity (DTA) is crucial for reducing experimental costs and accelerating early screening in computational drug discovery. While sequence-based deep learning methods avoid reliance on costly 3D structures, they still overlook simultaneous modeling of global sequence semantic features and local topological structural features within drugs and proteins, […]
Generalizing matrix representations to fully heterochronous ranked tree shapes
arXiv:2510.27030v1 Announce Type: new Abstract: Phylogenetic tree shapes capture fundamental signatures of evolution. We consider “ranked” tree shapes, which are equipped with a total order on the internal nodes compatible with the tree graph. Recent work has established an elegant bijection of ranked tree shapes and a class of integer matrices, called F-matrices, defined by simple […]