arXiv:2603.03099v5 Announce Type: replace-cross Abstract: Despite Adam demonstrating faster empirical convergence than SGD in many applications, much of the existing theory yields guarantees essentially comparable to those of SGD, leaving the empirical performance gap insufficiently explained. In this paper, we uncover a key second-moment normalization in Adam and develop a stopping-time/martingale analysis that provably distinguishes […]
TensorHub: Scalable and Elastic Weight Transfer for LLM RL Training
arXiv:2604.09107v1 Announce Type: cross Abstract: Modern LLM reinforcement learning (RL) workloads require a highly efficient weight transfer system to scale training across heterogeneous computational resources. However, existing weight transfer approaches either fail to provide flexibility for dynamically scaling clusters or incur fundamental data movement overhead, resulting in poor performance. We introduce Reference-Oriented Storage (ROS), a […]
Descriptor: Parasitoid Wasps and Associated Hymenoptera Dataset (DAPWH)
arXiv:2602.20028v2 Announce Type: replace-cross Abstract: Accurate taxonomic identification is the cornerstone of biodiversity monitoring and agricultural management, particularly for the hyper-diverse superfamily Ichneumonoidea. Comprising the families Ichneumonidae and Braconidae, these parasitoid wasps are ecologically critical for regulating insect populations, yet they remain one of the most taxonomically challenging groups due to their cryptic morphology and […]
Scheming in the wild: detecting real-world AI scheming incidents with open-source intelligence
arXiv:2604.09104v1 Announce Type: cross Abstract: Scheming, the covert pursuit of misaligned goals by AI systems, represents a potentially catastrophic risk, yet scheming research suffers from significant limitations. In particular, scheming evaluations demonstrate behaviours that may not occur in real-world settings, limiting scientific understanding, hindering policy development, and not enabling real-time detection of loss of control […]
An Adaptive Model Selection Framework for Demand Forecasting under Horizon-Induced Degradation to Support Business Strategy and Operations
arXiv:2602.13939v3 Announce Type: replace-cross Abstract: Business environments characterized by intermittent demand, high variability, and multi-step planning horizons require forecasting policies that support consistent operational decisions across heterogeneous SKU portfolios. Because no forecasting model is universally dominant, and model rankings vary across error metrics, demand regimes, and forecast horizons, forecasting model assignment is a nontrivial decision […]
CLIP-Inspector: Model-Level Backdoor Detection for Prompt-Tuned CLIP via OOD Trigger Inversion
arXiv:2604.09101v1 Announce Type: cross Abstract: Organisations with limited data and computational resources increasingly outsource model training to Machine Learning as a Service (MLaaS) providers, who adapt vision-language models (VLMs) such as CLIP to downstream tasks via prompt tuning rather than training from scratch. This semi-honest setting creates a security risk where a malicious provider can […]
Overstating Attitudes, Ignoring Networks: LLM Biases in Simulating Misinformation Susceptibility
arXiv:2602.04674v2 Announce Type: replace-cross Abstract: Large language models (LLMs) are increasingly used as proxies for human judgment in computational social science, yet their ability to reproduce patterns of susceptibility to misinformation remains unclear. We test whether LLM-simulated survey respondents, prompted with participant profiles drawn from social survey data measuring network, demographic, attitudinal and behavioral features, […]
DeepGuard: Secure Code Generation via Multi-Layer Semantic Aggregation
arXiv:2604.09089v1 Announce Type: cross Abstract: Large Language Models (LLMs) for code generation can replicate insecure patterns from their training data. To mitigate this, a common strategy for security hardening is to fine-tune models using supervision derived from the final transformer layer. However, this design may suffer from a final-layer bottleneck: vulnerability-discriminative cues can be distributed […]
Tiled Prompts: Overcoming Prompt Misguidance in Image and Video Super-Resolution
arXiv:2602.03342v2 Announce Type: replace-cross Abstract: Text-conditioned diffusion models have advanced image and video super-resolution by using prompts as semantic priors, and modern super-resolution pipelines typically rely on latent tiling to scale to high resolutions. In practice, a single global caption is used with the latent tiling, often causing prompt misguidance. Specifically, a coarse global prompt […]
GAN-Enhanced Deep Reinforcement Learning for Semantic-Aware Resource Allocation in 6G Network Slicing
arXiv:2604.08576v1 Announce Type: cross Abstract: Sixth-generation (6G) wireless networks must support heterogeneous services: enhanced Mobile Broadband (eMBB) requiring 1 Tbps data rates, massive Machine-Type Communications (mMTC) supporting 10 million devices per km², and Ultra-Reliable Low-Latency Communications (URLLC) with 0.1-1 ms latency. Current resource allocation suffers from three limitations: (1) semantic blindness wasting 35% bandwidth on […]
Multivariate Time Series Anomaly Detection via Dual-Branch Reconstruction and Autoregressive Flow-based Residual Density Estimation
arXiv:2604.08582v1 Announce Type: cross Abstract: Multivariate Time Series Anomaly Detection (MTSAD) is critical for real-world monitoring scenarios such as industrial control and aerospace systems. Mainstream reconstruction-based anomaly detection methods suffer from two key limitations: first, overfitting to spurious correlations induced by an overemphasis on cross-variable modeling; second, the generation of misleading anomaly scores by simply […]
Temperature-Dependent Performance of Prompting Strategies in Extended Reasoning Large Language Models
arXiv:2604.08563v1 Announce Type: cross Abstract: Extended reasoning models represent a transformative shift in Large Language Model (LLM) capabilities by enabling explicit test-time computation for complex problem solving. However, the optimal configuration of sampling temperature and prompting strategy for these systems remains largely underexplored. We systematically evaluate chain-of-thought and zero-shot prompting across four temperature settings (0.0, […]