arXiv:2512.08226v1 Announce Type: new Abstract: Personalized neoantigen vaccines represent a promising immunotherapy approach that harnesses tumor-specific antigens to stimulate anti-tumor immune responses. However, the design of these vaccines requires sophisticated computational workflows to predict and prioritize neoantigen candidates from patient sequencing data, coupled with rigorous review to ensure candidate quality. While numerous computational tools exist […]
Value-State Gated Attention for Mitigating Extreme-Token Phenomena in Transformers
arXiv:2510.09017v2 Announce Type: replace-cross Abstract: Large models based on the Transformer architecture are susceptible to extreme-token phenomena, such as attention sinks and value-state drains. These issues, which degrade model performance, quantization fidelity, and interpretability, arise from a problematic mutual reinforcement mechanism where the model learns an inefficient ‘no-op’ behavior by focusing attention on tokens with […]
Empowerment Gain and Causal Model Construction: Children and adults are sensitive to controllability and variability in their causal interventions
arXiv:2512.08230v1 Announce Type: new Abstract: Learning about the causal structure of the world is a fundamental problem for human cognition. Causal models and especially causal learning have proved to be difficult for large pretrained models using standard techniques of deep learning. In contrast, cognitive scientists have applied advances in our formal understanding of causation in […]
Training-Time Action Conditioning for Efficient Real-Time Chunking
arXiv:2512.05964v2 Announce Type: replace-cross Abstract: Real-time chunking (RTC) enables vision-language-action models (VLAs) to generate smooth, reactive robot trajectories by asynchronously predicting action chunks and conditioning on previously committed actions via inference-time inpainting. However, this inpainting method introduces computational overhead that increases inference latency. In this work, we propose a simple alternative: simulating inference delay at […]
Beyond Traditional Diagnostics: Transforming Patient-Side Information into Predictive Insights with Knowledge Graphs and Prototypes
arXiv:2512.08261v1 Announce Type: new Abstract: Predicting diseases solely from patient-side information, such as demographics and self-reported symptoms, has attracted significant research attention due to its potential to enhance patient awareness, facilitate early healthcare engagement, and improve healthcare system efficiency. However, existing approaches encounter critical challenges, including imbalanced disease distributions and a lack of interpretability, resulting […]
A Systematic Evaluation of Preference Aggregation in Federated RLHF for Pluralistic Alignment of LLMs
arXiv:2512.08786v1 Announce Type: cross Abstract: This paper addresses the challenge of aligning large language models (LLMs) with diverse human preferences within federated learning (FL) environments, where standard methods often fail to adequately represent diverse viewpoints. We introduce a comprehensive evaluation framework that systematically assesses the trade-off between alignment quality and fairness when using different aggregation […]
Reasoning Models Ace the CFA Exams
arXiv:2512.08270v1 Announce Type: new Abstract: Previous research has reported that large language models (LLMs) demonstrate poor performance on the Chartered Financial Analyst (CFA) exams. However, recent reasoning models have achieved strong results on graduate-level academic and professional examinations across various disciplines. In this paper, we evaluate state-of-the-art reasoning models on a set of mock CFA […]
GSPN-2: Efficient Parallel Sequence Modeling
arXiv:2512.07884v1 Announce Type: cross Abstract: Efficient vision transformer remains a bottleneck for high-resolution images and long-video related real-world applications. Generalized Spatial Propagation Network (GSPN) addresses this by replacing quadratic self-attention with a line-scan propagation scheme, bringing the cost close to linear in the number of rows or columns, while retaining accuracy. Despite this advancement, the […]
Training-Free Dual Hyperbolic Adapters for Better Cross-Modal Reasoning
arXiv:2512.08820v1 Announce Type: cross Abstract: Recent research in Vision-Language Models (VLMs) has significantly advanced our capabilities in cross-modal reasoning. However, existing methods suffer from performance degradation with domain changes or require substantial computational resources for fine-tuning in new domains. To address this issue, we develop a new adaptation method for large vision-language models, called textitTraining-free […]
Functional Random Forest with Adaptive Cost-Sensitive Splitting for Imbalanced Functional Data Classification
arXiv:2512.07888v1 Announce Type: cross Abstract: Classification of functional data where observations are curves or trajectories poses unique challenges, particularly under severe class imbalance. Traditional Random Forest algorithms, while robust for tabular data, often fail to capture the intrinsic structure of functional observations and struggle with minority class detection. This paper introduces Functional Random Forest with […]