arXiv:2605.18879v1 Announce Type: cross Abstract: Large language models inevitably retain sensitive information, defined as inputs that may induce harmful generations, due to training on massive web corpora, raising concerns for privacy and safety. Existing machine unlearning methods primarily rely on retraining or aggressive fine-tuning, which are either computationally expensive or prone to degrading related knowledge […]
Data-Free Client Contribution Estimation via Logit Maximization for Federated Learning
arXiv:2605.18892v1 Announce Type: cross Abstract: Federated learning (FL) enables collaborative learning of computer vision models, where privacy and regulatory constraints prevent centralizing data across devices or organizations. However, practical FL deployments often exhibit severe class imbalance and label skew, causing standard aggregation protocols to overfit dominant clients and degrade minority-class performance. We propose a data-free, […]
Does Your Wildfire Prediction Model Actually Work, or Just Score Well?
arXiv:2605.18911v1 Announce Type: cross Abstract: Wildfire prediction is important for early warning and resource allocation, yet existing Earth foundation models (Earth FMs) are pretrained for general atmospheric and geophysical objectives rather than wildfire forecasting. To address this gap, we introduce WILDFIRE-FM, the first foundation model pretrained specifically for wildfire prediction using weather, active-fire observations, topography, […]
Network-Aware Bilinear Tokenization for Brain Functional Connectivity Representation Learning
arXiv:2605.14048v3 Announce Type: replace Abstract: Masked autoencoders (MAEs) have recently shown promise for self-supervised representation learning of resting-state brain functional connectivity (FC). However, a fundamental question remains unresolved: how should FC matrices be tokenized to align with the intrinsic modular organization of large-scale brain networks? Existing approaches typically adopt region-centric or graph-based schemes that treat […]
OEP: Poisoning Self-Evolving LLM Agents via Locally Correct but Non-Transferable Experiences
arXiv:2605.18930v1 Announce Type: cross Abstract: Memory-augmented large language model (LLM) agents use iterative reflection and self-evolution to solve complex tasks, but these mechanisms introduce security risks. Existing agentic memory attacks require privileged access or explicit malicious content, making them detectable by advanced safety filters. This leaves a subtler attack surface underexplored: whether adversaries can induce […]
Informational blueprints reveal condition-dependent gene regulatory architectures
arXiv:2605.19071v1 Announce Type: new Abstract: While coding regions in the genome have a direct interpretation in terms of protein products, significant fractions are non-coding and yet control essential biological functions. Unlike the genetic code, there is no “lookup table” that identifies where regulatory proteins, known as transcription factors (TFs), bind. Here, we extract these binding […]
RLFTSim: Realistic and Controllable Multi-Agent Traffic Simulation via Reinforcement Learning Fine-Tuning
arXiv:2605.19033v1 Announce Type: cross Abstract: Supervised open-loop training has been widely adopted for training traffic simulation models; however, it fails to capture the inherently dynamic, multi-agent interactions common in complex driving scenarios. We introduce RLFTSim, a reinforcement-learning-based fine-tuning framework that enhances scenario realism by aligning simulator rollouts with real-world data distributions and provides a method […]
Beyond Majority Voting: LLM Aggregation by Leveraging Higher-Order Information
arXiv:2510.01499v2 Announce Type: replace-cross Abstract: With the rapid progress of multi-agent large language model (LLM) reasoning, how to effectively aggregate answers from multiple LLMs has emerged as a fundamental challenge. Standard majority voting treats all answers equally, failing to consider latent heterogeneity and correlation across models. In this work, we design two new aggregation algorithms […]
Learning Long-Term Temporal Dependencies in Photovoltaic Power Output Prediction Through Multi-Horizon Forecasting
arXiv:2605.19074v1 Announce Type: cross Abstract: The rapid global expansion of solar photovoltaic (PV) capacity, reaching a record 597 GW in 2024, highlights the urgent need for robust forecasting models to mitigate the grid instability caused by the intermittent nature of solar irradiance. While deep learning-based direct forecasting using ground-based sky images (GSI) has emerged as a dominant […]
Embedding by Elicitation: Dynamic Representations for Bayesian Optimization of System Prompts
arXiv:2605.19093v1 Announce Type: new Abstract: System prompts are a central control mechanism in modern AI systems, shaping behavior across conversations, tasks, and user populations. Yet they are difficult to tune when feedback is available only as aggregate metrics rather than per-example labels, failures, or critiques. We study this aggregate feedback setting as sample-constrained black-box optimization […]
GOAL: Graph-based Objective-Aligned Diffusion Solvers for Dynamic Multi-Objective Optimization
arXiv:2605.19119v1 Announce Type: cross Abstract: Existing neural combinatorial optimization solvers frame solution search as imitation of optimal decisions, inherently limiting their utility to single-objective minimization and static constraints. We propose GOAL, a conditioned diffusion solver over relational graph representations that enables controllable decision generations by conditioning on human-specified objectives. We introduce a heterogeneous graph encoding […]
A Geometric Analysis of Small-sized Language Model Hallucinations
arXiv:2602.14778v3 Announce Type: replace-cross Abstract: Hallucinations — plausible but factually incorrect responses — pose a major challenge to the reliability of Large Language Models (LLMs), especially in multi-step or agentic settings. Existing work largely frames hallucinations as a consequence of missing knowledge; we show instead that, even when the relevant factual knowledge is present, models […]