arXiv:2605.27628v1 Announce Type: new Abstract: As autonomous and agentic AI systems scale in robotic and human-machine environments, managing hallucination and persistent but unjustified action remains an open challenge. Rather than attributing these failures solely to model or alignment limitations, this paper explores the architectural vulnerability of unbounded autonomy – the presumption that an agent should […]
Fine-Tuned LLM as a Complementary Predictor Improving Ads System
arXiv:2605.27856v1 Announce Type: cross Abstract: Recommendation systems power engagement and monetization across feeds, ads, and short-video platforms, but translating the latest advances in Large Language Models into Recommendation Systems (RecSys) gains remains rare, particularly in advertising and production-scale real-world industry setups. Prior real-world LLM successes typically fall into three buckets: (a) generative retrieval that directly […]
One LR Doesn’t Fit All: Heavy-Tail Guided Layerwise Learning Rates for LLMs
arXiv:2605.22297v3 Announce Type: replace-cross Abstract: Learning rate configuration is a fundamental aspect of modern deep learning. The prevailing practice of applying a uniform learning rate across all layers overlooks the structural heterogeneity of Transformers, potentially limiting their effectiveness as the backbone of Large Language Models (LLMs). In this paper, we introduce Layerwise Learning Rate (LLR), […]
Let the Results Speak: A Replication-First Paradigm for LLM Behavioral Benchmarking
arXiv:2605.27914v1 Announce Type: cross Abstract: Subjective evaluation of LLM behavior — empathy, restraint, calibrated emotional tone — is hard. Human inter-rater agreement on such qualities saturates near rho ~ 0.45, and an LLM-as-judge proxy alone risks circularity: a judge sharing the target’s training cohort cannot independently verify it. Anchoring validity to a single human-rater consensus […]
ESL-PSC Toolkit: a graphical software environment for linking shared genetic changes to convergent phenotypes
arXiv:2605.27677v1 Announce Type: new Abstract: Convergent evolution provides a useful framework for testing whether independent origins of similar traits share common genetic mechanisms. Evolutionary Sparse Learning with Paired Species Contrast (ESL-PSC) is an approach to identify genes and sites associated with convergent traits from aligned sequences by fitting sparse predictive models to phylogenetically informed species […]
Periodic RoPE for Infinite Context LLMs
arXiv:2605.27980v1 Announce Type: cross Abstract: The ability to process ultra-long contexts is crucial for large language models (LLMs) to perform long-horizon tasks. While recent efforts have extended context windows to 1M and beyond, model performance degrades when sequence length exceeds the pre-trained range of positional encodings (e.g., RoPE), i.e., position exhaustion. This fundamental limitation must […]
SPARD: Defending Harmful Fine-Tuning Attack via Safety Projection with Relevance-Diversity Data Selection
arXiv:2605.28030v1 Announce Type: cross Abstract: Fine-tuning large language models often undermines their safety alignment, a problem further amplified by harmful fine-tuning attacks in which adversarial data removes safeguards and induces unsafe behaviors. We propose SPARD, a defense framework that integrates Safety-Projected Alternating optimization with Relevance-Diversity aware data selection. SPARD employs SPAG, which optimizes alternately between […]
Behavioural Analysis of Alignment Faking
arXiv:2605.27681v1 Announce Type: new Abstract: Alignment faking (AF) refers to a model strategically complying with a training objective to avoid behavioural modification while preserving its deployment preferences. Understanding when and why AF arises matters as models grow better at distinguishing training from deployment. Prior work finds AF fragile, prompt-sensitive, and model-dependent, leaving its underlying drivers […]
Performance and Explainability Requirements of Evolutionary Algorithms in Real-World Physics-Informed Optimization
arXiv:2605.28164v1 Announce Type: cross Abstract: Evolutionary computation offers a variety of tools to solve complex real-world optimization problems. However, research often focuses on smaller, simplified problems and optimization algorithms that sometimes miss expectations in real-world scenarios. Additionally, trust in the applied algorithm and the solutions it provides is often essential in such settings, but requires […]
Cross-Entropy Games and Frost Training
arXiv:2605.27701v1 Announce Type: new Abstract: We present Frost Training, a method for improving Monte Carlo-based policy optimization for a large family of LLM-as-a-judge tasks called Cross-Entropy Games. The key idea is to exploit the gradient of the reward function in embedding space. This signal is used in the Greedy Coordinate Gradient (GCG) jailbreaking technique; we […]
Routing-Aligned Fine-Tuning for Multilingual Downstream Tasks in Mixture-of-Experts Models
arXiv:2605.28306v1 Announce Type: cross Abstract: Mixture-of-Experts (MoE) models have emerged as a dominant paradigm for efficient LLM scaling, yet adapting them to non-English downstream tasks remains challenging. Existing fine-tuning approaches treat MoE models as monolithic learners, ignoring the heterogeneous routing structure that develops during pretraining. We validate across multiple MoE models and downstream tasks that […]
Hierarchical Prompt-Domain Control and Learning for Resource-Constrained Agentic Language Models
arXiv:2605.27703v1 Announce Type: new Abstract: Large Language Models are increasingly deployed inside agentic systems, where they must follow structured protocols, adapt to evolving states, and operate under memory, latency, and cost constraints. In such regimes, prompt extension is unreliable: growing contexts can push compact models outside their effective prompt domain, while deployment-time fine-tuning remains limited […]