arXiv:2511.20836v3 Announce Type: replace-cross Abstract: As language models (LMs) are increasingly adopted across domains, high-quality benchmarking frameworks are essential for guiding deployment decisions. In practice, however, frameworks such as Holistic Evaluation of Language Models (HELM) typically evaluate models under a single static prompt configuration, even though model behavior depends strongly on prompt choice. As a […]
HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention
arXiv:2603.28458v2 Announce Type: replace-cross Abstract: Token-level sparse attention mechanisms, exemplified by DeepSeek Sparse Attention (DSA), achieve fine-grained key selection by scoring every historical key for each query through a lightweight indexer, then computing attention only on the selected subset. While the downstream sparse attention itself scales favorably, the indexer must still scan the entire prefix […]
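The indexer-then-attend pattern the abstract describes can be sketched in a few lines. This is a generic illustration of fine-grained sparse attention with a lightweight indexer, not DSA's or HISA's actual implementation; the low-dimensional indexer projections `q_idx`/`k_idx` and the `top_k` budget are assumptions for the example.

```python
import numpy as np

def indexer_sparse_attention(q, keys, values, q_idx, k_idx, top_k=8):
    """Score every historical key with a cheap low-dim indexer, then run
    full attention only on the selected top-k subset (illustrative sketch)."""
    # Indexer pass: one cheap dot product per prefix key (this is the
    # full-prefix scan that HISA's hierarchical index aims to avoid).
    scores = k_idx @ q_idx                        # (T,)
    sel = np.argsort(scores)[-top_k:]             # indices of top-k keys
    # Sparse attention pass: softmax over the selected keys only.
    logits = keys[sel] @ q / np.sqrt(q.shape[0])
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ values[sel]

rng = np.random.default_rng(0)
T, d, d_idx = 64, 32, 4
keys, values = rng.normal(size=(T, d)), rng.normal(size=(T, d))
q = rng.normal(size=d)
# Hypothetical low-dimensional indexer projections of the query and keys.
q_idx, k_idx = rng.normal(size=d_idx), rng.normal(size=(T, d_idx))
out = indexer_sparse_attention(q, keys, values, q_idx, k_idx, top_k=8)
print(out.shape)  # (32,)
```

The point of the sketch is the cost structure: the attention itself touches only `top_k` keys, but the indexer still scans all `T` prefix keys per query, which is the bottleneck the abstract identifies.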
Neural Harmonic Textures for High-Quality Primitive Based Neural Reconstruction
arXiv:2604.01204v1 Announce Type: cross Abstract: Primitive-based methods such as 3D Gaussian Splatting have recently become the state-of-the-art for novel-view synthesis and related reconstruction tasks. Compared to neural fields, these representations are more flexible, adaptive, and scale better to large scenes. However, the limited expressivity of individual primitives makes modeling high-frequency detail challenging. We introduce Neural […]
PReD: An LLM-based Foundation Multimodal Model for Electromagnetic Perception, Recognition, and Decision
arXiv:2603.28183v2 Announce Type: replace Abstract: Multimodal Large Language Models have demonstrated powerful cross-modal understanding and reasoning capabilities in general domains. However, in the electromagnetic (EM) domain, they still face challenges such as data scarcity and insufficient integration of domain knowledge. This paper proposes PReD, the first foundation model for the EM domain that covers the […]
WARP: Guaranteed Inner-Layer Repair of NLP Transformers
arXiv:2604.00938v1 Announce Type: cross Abstract: Transformer-based NLP models remain vulnerable to adversarial perturbations, yet existing repair methods face a fundamental trade-off: gradient-based approaches offer flexibility but lack verifiability and often overfit; methods that do provide repair guarantees are restricted to the final layer or small networks, significantly limiting the parameter search space available for repair. […]
IWP: Token Pruning as Implicit Weight Pruning in Large Vision Language Models
arXiv:2604.00757v1 Announce Type: cross Abstract: Large Vision Language Models show impressive performance across image and video understanding tasks, yet their computational cost grows rapidly with the number of visual tokens. Existing token pruning methods mitigate this issue through empirical approaches while overlooking the internal mechanism of attention. In this paper, we propose a novel training […]
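As a baseline for the kind of attention-driven token pruning the abstract contrasts itself with, here is a minimal sketch: keep the visual tokens that receive the most attention mass and drop the rest. This is an illustrative empirical heuristic, not the paper's IWP method; the attention scores and keep ratio are assumptions.

```python
import numpy as np

def prune_tokens_by_attention(tokens, attn_to_tokens, keep_ratio=0.5):
    """Keep the most-attended visual tokens, preserving original order
    (a generic pruning heuristic for illustration, not IWP)."""
    n_keep = max(1, int(len(tokens) * keep_ratio))
    order = np.argsort(attn_to_tokens)[::-1]   # most-attended first
    kept = np.sort(order[:n_keep])             # restore token order
    return tokens[kept], kept

rng = np.random.default_rng(1)
tokens = rng.normal(size=(100, 16))   # 100 visual tokens, dim 16
attn = rng.random(100)                # attention mass received per token
pruned, kept_idx = prune_tokens_by_attention(tokens, attn, keep_ratio=0.25)
print(pruned.shape)  # (25, 16)
```

The paper's framing, as far as the excerpt states, is that dropping a token is equivalent to zeroing the weights that would have acted on it, which motivates analyzing pruning through the attention mechanism rather than heuristics like this one.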
Non-ignorable fuzziness in granular counts: the case of RNA-seq data
arXiv:2604.00763v1 Announce Type: cross Abstract: RNA-seq count data are often affected by read-to-gene alignment ambiguity, especially in high-dimensional transcriptomics. This type of ambiguity can be conveniently expressed through granular counts, namely fuzzy-valued observations of latent discrete quantities. We study a class of fuzzy-reporting mechanisms and show that, when reporting exploits graded membership, ignorability fails generically, […]
From Density Matrices to Phase Transitions in Deep Learning: Spectral Early Warnings and Interpretability
arXiv:2603.29805v2 Announce Type: replace-cross Abstract: A key problem in the modern study of AI is predicting and understanding emergent capabilities in models during training. Inspired by methods for studying reactions in quantum chemistry, we present the “2-datapoint reduced density matrix”. We show that this object provides a computationally efficient, unified observable of phase transitions during […]
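One plausible reading of a "2-datapoint reduced density matrix" is the trace-normalized 2x2 Gram matrix of two datapoints' hidden representations, with its spectrum (e.g. von Neumann entropy) serving as a scalar observable tracked over training. This construction is an assumption for illustration; the paper's actual definition is not given in this excerpt.

```python
import numpy as np

def two_point_density_matrix(h1, h2):
    """Trace-normalized 2x2 Gram matrix of two representations
    (an assumed reading of the '2-datapoint reduced density matrix')."""
    H = np.stack([h1, h2])        # (2, d)
    rho = H @ H.T                 # 2x2 Gram matrix
    rho /= np.trace(rho)          # normalize so trace(rho) = 1
    return rho

def spectral_entropy(rho):
    """Von Neumann entropy of rho: a candidate scalar observable whose
    sharp changes could act as a spectral early-warning signal."""
    eig = np.clip(np.linalg.eigvalsh(rho), 1e-12, None)
    return float(-(eig * np.log(eig)).sum())

rng = np.random.default_rng(2)
h1, h2 = rng.normal(size=16), rng.normal(size=16)
rho = two_point_density_matrix(h1, h2)
print(round(float(np.trace(rho)), 6))  # 1.0
```

The appeal of such an observable is cost: it is a 2x2 matrix regardless of model size, so its eigenvalues can be logged at every checkpoint.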
How Emotion Shapes the Behavior of LLMs and Agents: A Mechanistic Study
arXiv:2604.00005v1 Announce Type: new Abstract: Emotion plays an important role in human cognition and performance. Motivated by this, we investigate whether analogous emotional signals can shape the behavior of large language models (LLMs) and agents. Existing emotion-aware studies mainly treat emotion as a surface-level style factor or a perception target, overlooking its mechanistic role in […]
$\texttt{YC-Bench}$: Benchmarking AI Agents for Long-Term Planning and Consistent Execution
arXiv:2604.01212v1 Announce Type: cross Abstract: As LLM agents tackle increasingly complex tasks, a critical question is whether they can maintain strategic coherence over long horizons: planning under uncertainty, learning from delayed feedback, and adapting when early mistakes compound. We introduce $\texttt{YC-Bench}$, a benchmark that evaluates these capabilities by tasking an agent with running a simulated […]
Stochastic ordering tools for continuous-time Markov chains and applications to reaction network models
arXiv:2604.00756v1 Announce Type: cross Abstract: Stochastic reaction networks are mathematical models with a wide range of applications in biochemistry, ecology, and epidemiology, and are often complex to analyze. Except for some special cases, it is generally difficult to predict how the abundances of all considered species evolve over time. A possible approach to address this […]
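The continuous-time Markov chains in question can be simulated exactly with Gillespie's stochastic simulation algorithm. The sketch below runs a minimal birth-death reaction network (0 → X at rate `birth`, X → 0 at rate `death * x`); the specific network and rates are illustrative, not from the paper.

```python
import random

def gillespie_birth_death(birth, death, x0, t_max, seed=0):
    """Exact Gillespie (SSA) trajectory of a birth-death CTMC,
    the kind of stochastic reaction network the abstract refers to."""
    rng = random.Random(seed)
    t, x, path = 0.0, x0, [(0.0, x0)]
    while t < t_max:
        rates = [birth, death * x]   # birth reaction; death reaction
        total = sum(rates)
        if total == 0:
            break
        t += rng.expovariate(total)  # exponential holding time
        if t >= t_max:
            break
        # Pick the next reaction proportionally to its rate.
        x += 1 if rng.random() < rates[0] / total else -1
        path.append((t, x))
    return path

path = gillespie_birth_death(birth=2.0, death=0.1, x0=0, t_max=50.0)
print(all(x >= 0 for _, x in path))  # True
```

Stochastic ordering enters naturally here: under a monotone coupling, raising the birth rate yields a trajectory that is stochastically larger at every time, which is the kind of comparison result such tools make rigorous without solving the chain.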
CarbonEdge: Carbon-Aware Deep Learning Inference Framework for Sustainable Edge Computing
arXiv:2603.27420v2 Announce Type: replace-cross Abstract: Deep learning applications at the network edge lead to significant growth in AI-related carbon emissions, presenting a critical sustainability challenge. Existing edge computing frameworks optimize for latency and throughput but largely ignore the environmental impact of inference workloads. This paper introduces CarbonEdge, a carbon-aware deep learning inference […]
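A carbon-aware placement policy of the sort the abstract motivates can be sketched as a simple constrained choice: among edge sites that meet the latency deadline, route the inference to the one with the lowest grid carbon intensity. This toy rule is illustrative only; CarbonEdge's actual policy is not described in this excerpt, and the site fields are assumptions.

```python
def pick_edge_site(sites, deadline_ms):
    """Toy carbon-aware placement: lowest carbon intensity among sites
    meeting the latency deadline; fall back to the fastest site otherwise."""
    feasible = [s for s in sites if s["latency_ms"] <= deadline_ms]
    if not feasible:
        return min(sites, key=lambda s: s["latency_ms"])
    return min(feasible, key=lambda s: s["carbon_g_per_kwh"])

sites = [
    {"name": "edge-A", "latency_ms": 12, "carbon_g_per_kwh": 450},
    {"name": "edge-B", "latency_ms": 30, "carbon_g_per_kwh": 90},
    {"name": "edge-C", "latency_ms": 80, "carbon_g_per_kwh": 20},
]
print(pick_edge_site(sites, deadline_ms=40)["name"])  # edge-B
```

The design point: a pure latency optimizer would pick edge-A; a pure carbon optimizer would pick edge-C and miss the deadline; the constrained rule trades them off explicitly.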