arXiv:2605.27677v1 Announce Type: new Abstract: Convergent evolution provides a useful framework for testing whether independent origins of similar traits share common genetic mechanisms. Evolutionary Sparse Learning with Paired Species Contrast (ESL-PSC) is an approach to identify genes and sites associated with convergent traits from aligned sequences by fitting sparse predictive models to phylogenetically informed species […]
Periodic RoPE for Infinite Context LLMs
arXiv:2605.27980v1 Announce Type: cross Abstract: The ability to process ultra-long contexts is crucial for large language models (LLMs) to perform long-horizon tasks. While recent efforts have extended context windows to 1M and beyond, model performance degrades when sequence length exceeds the pre-trained range of positional encodings (e.g., RoPE), i.e., position exhaustion. This fundamental limitation must […]
SPARD: Defending Harmful Fine-Tuning Attack via Safety Projection with Relevance-Diversity Data Selection
arXiv:2605.28030v1 Announce Type: cross Abstract: Fine-tuning large language models often undermines their safety alignment, a problem further amplified by harmful fine-tuning attacks in which adversarial data removes safeguards and induces unsafe behaviors. We propose SPARD, a defense framework that integrates Safety-Projected Alternating optimization with Relevance-Diversity aware data selection. SPARD employs SPAG, which optimizes alternatively between […]
Behavioural Analysis of Alignment Faking
arXiv:2605.27681v1 Announce Type: new Abstract: Alignment faking (AF) refers to a model strategically complying with a training objective to avoid behavioural modification while preserving its deployment preferences. Understanding when and why AF arises matters as models grow better at distinguishing training from deployment. Prior work finds AF fragile, prompt-sensitive, and model-dependent, leaving its underlying drivers […]
Performance and Explainability Requirements of Evolutionary Algorithms in Real-World Physics-Informed Optimization
arXiv:2605.28164v1 Announce Type: cross Abstract: Evolutionary computation offers a variety of tools to solve complex real-world optimization problems. However, research often focuses on smaller, simplified problems and optimization algorithms that sometimes miss expectations in real-world scenarios. Additionally, trust in the applied algorithm and the solutions it provides is often essential in such settings, but requires […]
Cross-Entropy Games and Frost Training
arXiv:2605.27701v1 Announce Type: new Abstract: We present Frost Training, a method for improving Monte Carlo-based policy optimization for a large family of LLM-as-a-judge tasks called Cross-Entropy Games. The key idea is to exploit the gradient of the reward function in embedding space. This signal is used in the Greedy Coordinate Gradient (GCG) jailbreaking technique; we […]
Routing-Aligned Fine-Tuning for Multilingual Downstream Tasks in Mixture-of-Experts Models
arXiv:2605.28306v1 Announce Type: cross Abstract: Mixture-of-Experts (MoE) models have emerged as a dominant paradigm for efficient LLM scaling, yet adapting them to non-English downstream tasks remains challenging. Existing fine-tuning approaches treat MoE models as monolithic learners, ignoring the heterogeneous routing structure that develops during pretraining. We validate across multiple MoE models and downstream tasks that […]
Hierarchical Prompt-Domain Control and Learning for Resource-Constrained Agentic Language Models
arXiv:2605.27703v1 Announce Type: new Abstract: Large Language Models are increasingly deployed inside agentic systems, where they must follow structured protocols, adapt to evolving states, and operate under memory, latency, and cost constraints. In such regimes, prompt extension is unreliable: growing contexts can push compact models outside their effective prompt domain, while deployment-time fine-tuning remains limited […]
Functional Entropy: Predicting Functional Correctness in LLM-Generated Code with Uncertainty Quantification
arXiv:2605.28500v1 Announce Type: cross Abstract: Large language models have shown impressive capabilities in code generation, yet they often produce functionally incorrect code. Uncertainty quantification (UQ) methods have emerged as a promising approach for detecting hallucinations in natural language generation, but their effectiveness for code generation tasks remains underexplored. We systematically evaluate how UQ techniques transfer […]
DeepSciVerify: Verifying Scientific Claim–Citation Alignment via LLM-Driven Evidence Escalation
arXiv:2605.27710v1 Announce Type: new Abstract: Misalignment between claims and their cited evidence is a common failure mode in reports generated by large language models, limiting their reliability in scientific and other high-stakes settings. We present DeepSciVerify, a two-stage pipeline for scientific claim-citation verification that combines abstract-level reasoning with selective escalation to passage-level evidence. The system […]
Online Irregular Multivariate Time Series Forecasting via Uncertainty-Driven Dual-Expert Calibration
arXiv:2605.28603v1 Announce Type: cross Abstract: Irregular multivariate time series forecasting is critical in many real-world applications, where time series are irregularly sampled and exhibit dynamically evolving missingness patterns. Although existing methods perform well in offline settings, they often suffer from significant performance degradation when deployed online due to dynamic shifts in data distribution. Maintaining forecasting […]
Prefix-Safe Bayesian Belief Tracking for LLM Reasoning Reliability:Separating Calibration from Ranking
arXiv:2605.27712v1 Announce Type: new Abstract: Long reasoning traces need reliability estimates before final answers are known. We study prefix-conditioned eventual-success estimation, $P(y=1 mid o_1:t)$, using prefix-safe observations. Sequential Bayesian Belief Tracking (SBBT) calibrates observation likelihoods and recursively updates a two-state belief, providing a common tracker for scalar scores, text and self-verification markers, hidden clusters, token-pooling […]