arXiv:2601.21235v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly deployed in high-stakes domains, where rare but severe failures can result in irreversible harm. However, prevailing evaluation benchmarks often reduce complex social risk to mean-centered scalar scores, thereby obscuring distributional structure, cross-dimensional interactions, and worst-case behavior. This paper introduces Social Harm Analysis via Risk […]
The Surprising Difficulty of Search in Model-Based Reinforcement Learning
arXiv:2601.21306v1 Announce Type: cross Abstract: This paper investigates search in model-based reinforcement learning (RL). Conventional wisdom holds that long-term predictions and compounding errors are the primary obstacles for model-based RL. We challenge this view, showing that search is not a plug-and-play replacement for a learned policy. Surprisingly, we find that search can harm performance even […]
QUARK: Robust Retrieval under Non-Faithful Queries via Query-Anchored Aggregation
arXiv:2601.21049v1 Announce Type: new Abstract: User queries in real-world retrieval are often non-faithful (noisy, incomplete, or distorted), causing retrievers to fail when key semantics are missing. We formalize this as retrieval under recall noise, where the observed query is drawn from a noisy recall process of a latent target item. To address this, we propose […]
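The recall-noise setting above can be illustrated with a toy sketch (this is an illustration of the failure mode and of score aggregation over noisy query recalls, not the paper's QUARK method; the corpus, the drop rate, and the aggregation rule are all assumptions):

```python
import random
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(c * b[t] for t, c in a.items())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Toy corpus with bag-of-words vectors.
corpus = {
    "doc_jazz": "live jazz trio playing at the downtown club tonight",
    "doc_rock": "rock band concert with loud guitars downtown",
    "doc_cafe": "quiet cafe with coffee and pastries near the park",
}
vecs = {d: Counter(text.split()) for d, text in corpus.items()}

def retrieve(query_terms):
    q = Counter(query_terms)
    return max(corpus, key=lambda d: cosine(q, vecs[d]))

# Latent target intent; the observed query is a noisy recall that drops terms.
latent = "live jazz trio downtown club".split()

def noisy_recall(terms, rng, drop=0.5):
    kept = [t for t in terms if rng.random() > drop]
    return kept or [rng.choice(terms)]

def retrieve_aggregated(n_recalls=20, seed=0):
    # Aggregate similarity over several noisy recalls of the same latent
    # query instead of trusting a single degraded observation.
    rng = random.Random(seed)
    scores = {d: 0.0 for d in corpus}
    for _ in range(n_recalls):
        q = Counter(noisy_recall(latent, rng))
        for d in scores:
            scores[d] += cosine(q, vecs[d])
    return max(scores, key=scores.get)
```

A single degraded query that kept only "downtown" retrieves doc_rock (the shorter document wins on cosine), while aggregating over repeated noisy recalls recovers doc_jazz.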
Optimization and Mobile Deployment for Anthropocene Neural Style Transfer
arXiv:2601.21141v1 Announce Type: cross Abstract: This paper presents AnthropoCam, a mobile-based neural style transfer (NST) system optimized for the visual synthesis of Anthropocene environments. Unlike conventional artistic NST, which prioritizes painterly abstraction, stylizing human-altered landscapes demands a careful balance between amplifying material textures and preserving semantic legibility. Industrial infrastructures, waste accumulations, and modified ecosystems contain […]
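The texture-versus-legibility balance the abstract describes maps onto the weighting of the two standard NST loss terms: a content loss on raw features (semantic legibility) and a style loss on Gram-matrix statistics (material texture). A minimal sketch, assuming toy flattened feature maps rather than real network activations; `nst_loss` and its default weights are illustrative, not AnthropoCam's implementation:

```python
def gram(features):
    """Gram matrix of a feature map given as C lists of N flattened
    spatial activations; entry (i, j) is the channel correlation."""
    C, N = len(features), len(features[0])
    return [[sum(features[i][n] * features[j][n] for n in range(N)) / N
             for j in range(C)] for i in range(C)]

def mse(a, b):
    flat_a = [x for row in a for x in row]
    flat_b = [x for row in b for x in row]
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)

def nst_loss(gen, content, style, content_w=1.0, style_w=1e-2):
    # Content term preserves semantic legibility (raw feature agreement);
    # style term amplifies texture (Gram statistics agreement). Tuning the
    # style_w/content_w ratio is the balance the abstract refers to.
    return content_w * mse(gen, content) + style_w * mse(gram(gen), gram(style))
```

Raising `style_w` pushes the output toward the style image's texture statistics at the cost of structural fidelity to the content image.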
Reputation as a Solution to Cooperation Collapse in LLM-based MASs
arXiv:2505.05029v3 Announce Type: replace Abstract: Cooperation has long been a fundamental topic in both human society and AI systems. However, recent studies indicate that the collapse of cooperation may emerge in multi-agent systems (MASs) driven by large language models (LLMs). To address this challenge, we explore reputation systems as a remedy. We propose RepuNet, a […]
No Verifiable Reward for Prosody: Toward Preference-Guided Prosody Learning in TTS
arXiv:2509.18531v2 Announce Type: replace-cross Abstract: Recent work reports gains in neural text-to-speech (TTS) with Group Relative Policy Optimization (GRPO). However, in the absence of a verifiable reward for prosody, GRPO trained on transcription-oriented signals (CER/NLL) lowers error rates yet collapses prosody into monotone, unnatural speech; adding speaker-similarity further destabilizes training and degrades CER. We address […]

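GRPO's core step is the group-relative advantage: rewards for a group of sampled responses to the same prompt are z-scored against the group's own mean and standard deviation. A minimal sketch of that normalization (the example rewards are hypothetical; with a transcription-only reward such as negative CER, a flat but clearly-articulated sample gets the largest advantage, which is the collapse mechanism the abstract describes):

```python
from statistics import mean, pstdev

def grpo_advantages(group_rewards, eps=1e-8):
    """Group-relative advantages: z-score each sampled response's reward
    against the group mean and (population) std, as in GRPO."""
    mu = mean(group_rewards)
    sigma = pstdev(group_rewards)
    return [(r - mu) / (sigma + eps) for r in group_rewards]

# Hypothetical rewards (-CER) for 4 TTS samples of one sentence: the
# monotone sample (index 0) transcribes best, so it gets the top advantage.
rewards = [-0.02, -0.05, -0.08, -0.10]
advs = grpo_advantages(rewards)
```

Because advantages sum to zero within the group, prosodically natural but slightly harder-to-transcribe samples are actively pushed down, not merely ignored.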
Diagnosing and Mitigating Modality Interference in Multimodal Large Language Models
arXiv:2505.19616v4 Announce Type: replace-cross Abstract: Multimodal Large Language Models demonstrate strong performance on multimodal benchmarks, yet often exhibit poor robustness when exposed to spurious modality interference, such as irrelevant text in vision understanding, or irrelevant visual content in question answering. At its core, modality interference refers to cases where spurious signals from non-essential modalities distort […]
$\mathbb{R}^{2k}$ is Theoretically Large Enough for Embedding-based Top-$k$ Retrieval
arXiv:2601.20844v2 Announce Type: replace-cross Abstract: This paper studies the minimal dimension required to embed subset memberships ($m$ elements and $\binom{m}{k}$ subsets of at most $k$ elements) into vector spaces, denoted as Minimal Embeddable Dimension (MED). The tight bounds of MED are derived theoretically and supported empirically for various notions of “distances” or “similarities,” including […]
Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report
arXiv:2601.21051v1 Announce Type: new Abstract: We present Foundation-Sec-8B-Reasoning, the first open-source native reasoning model for cybersecurity. Built upon our previously released Foundation-Sec-8B base model (derived from Llama-3.1-8B-Base), the model is trained through a two-stage process combining supervised fine-tuning (SFT) and reinforcement learning from verifiable rewards (RLVR). Our training leverages proprietary reasoning data spanning cybersecurity analysis, […]
The quenched structured coalescent for diploid population models on finite graphs with large migrations and uneven offspring distributions
arXiv:2601.21079v1 Announce Type: cross Abstract: In this work we describe a new model for the evolution of a diploid structured population backwards in time that allows for large migrations and uneven offspring distributions. The model generalizes both the mean-field model of Birkner et al. [Electron. J. Probab. 23: 1-44 (2018)] and the haploid structured model […]
Bayesian-LoRA: Probabilistic Low-Rank Adaptation of Large Language Models
arXiv:2601.21003v1 Announce Type: new Abstract: Large Language Models tend to prioritize accuracy and will therefore guess even when uncertain about a prediction; this miscalibration is especially severe when fine-tuning on small datasets. In this work, we introduce Bayesian-LoRA, which reformulates the deterministic LoRA update as a […]
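The general idea of a probabilistic low-rank update can be sketched as follows: keep the frozen weight, place an elementwise Gaussian posterior on one low-rank factor, and sample it by reparameterization so repeated forward passes yield an uncertainty signal. This is a generic illustration of a Bayesian LoRA layer, not the paper's formulation; the class, rank, and initialization are all assumptions:

```python
import random
from math import exp

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

class BayesianLoRALinear:
    """Frozen weight W plus a low-rank delta B @ A, where B carries an
    elementwise Gaussian posterior (mu, log_sigma) sampled at forward time."""
    def __init__(self, W, rank, rng):
        d_out, d_in = len(W), len(W[0])
        self.W, self.rng = W, rng
        self.A = [[rng.gauss(0.0, 0.1) for _ in range(d_in)] for _ in range(rank)]
        self.mu = [[0.0] * rank for _ in range(d_out)]          # posterior mean of B
        self.log_sigma = [[-2.0] * rank for _ in range(d_out)]  # posterior log-std of B

    def forward(self, x, sample=True):
        # Reparameterization: B = mu + sigma * noise (noise off => mean update).
        B = [[self.mu[i][j] + (exp(self.log_sigma[i][j]) * self.rng.gauss(0.0, 1.0)
                               if sample else 0.0)
              for j in range(len(self.mu[0]))] for i in range(len(self.mu))]
        delta = matmul(B, self.A)
        return [sum((self.W[i][k] + delta[i][k]) * x[k] for k in range(len(x)))
                for i in range(len(self.W))]

layer = BayesianLoRALinear([[1.0, 0.0], [0.0, 1.0]], rank=2, rng=random.Random(0))
```

With `sample=False` the layer reduces to the deterministic mean update; the spread across several sampled forward passes gives a calibration-aware alternative to a single overconfident point prediction.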
From Linear Input to Hierarchical Structure: Function Words as Statistical Cues for Language Learning
arXiv:2601.21191v1 Announce Type: cross Abstract: What statistical conditions support learning hierarchical structure from linear input? In this paper, we address this question by focusing on the statistical distribution of function words. Function words have long been argued to play a crucial role in language acquisition due to their distinctive distributional properties, including high frequency, reliable […]