arXiv:2603.09789v2 Announce Type: replace-cross Abstract: Accurate financial volatility forecasting is crucial but challenged by the non-linear, highly correlated nature of market data. Recently, quantum computing has emerged as a promising paradigm for solving complex high-dimensional sampling problems. To harness this, we propose a novel hybrid framework combining the temporal representation power of classical neural networks […]
Less Is More: Engineering Challenges of On-Device Small Language Model Integration in a Mobile Application
arXiv:2604.24636v2 Announce Type: replace-cross Abstract: On-device Small Language Models (SLMs) promise fully offline, private AI experiences for mobile users (no cloud dependency, no data leaving the device). But is this promise achievable in practice? This paper presents a longitudinal practitioner case study documenting the engineering challenges of integrating SLMs (Gemma 4 E2B, 2.6B parameters; Qwen3 […]
RLDX-1 Technical Report
arXiv:2605.03269v2 Announce Type: replace-cross Abstract: While Vision-Language-Action models (VLAs) have shown remarkable progress toward human-like generalist robotic policies through the versatile intelligence (i.e. broad scene understanding and language-conditioned generalization) inherited from pre-trained Vision-Language Models, they still struggle with complex real-world tasks requiring broader functional capabilities (e.g. motion awareness, long-term memory, and physical sensing). To address […]
Misaligned by Reward: Socially Undesirable Preferences in LLMs
arXiv:2605.05003v1 Announce Type: cross Abstract: Reward models are a key component of large language model alignment, serving as proxies for human preferences during training. However, existing evaluations focus primarily on broad instruction-following benchmarks, providing limited insight into whether these models capture socially desirable preferences. As a result, important failures in social alignment can remain hidden. […]
Look Once, Beam Twice: Camera-Primed Real-Time Double-Directional mmWave Beam Management for Vehicular Connectivity
arXiv:2605.05071v1 Announce Type: cross Abstract: Millimeter-wave (mmWave) frequencies promise multi-gigabit connectivity for vehicle-to-everything (V2X) networks, but face challenges in terms of severe path loss and mobility-related beam misalignment. Reliable V2X connectivity requires fast, double-directional beam alignment. However, existing methods suffer from high training overhead and limited generalization to unseen scenarios. This paper presents VIsion-based BEamforming (VIBE), […]
Adaptive Policy Selection and Fine-Tuning under Interaction Budgets for Offline-to-Online Reinforcement Learning
arXiv:2605.05123v1 Announce Type: cross Abstract: In offline-to-online reinforcement learning (O2O-RL), policies are first safely trained offline using previously collected datasets and then further fine-tuned for tasks via limited online interactions. In a typical O2O-RL pipeline, candidate policies trained with offline RL are evaluated via either off-policy evaluation (OPE) or online evaluation (OE). The policy with […]
Design Conductor 2.0: An agent builds a TurboQuant inference accelerator in 80 hours
arXiv:2605.05170v1 Announce Type: cross Abstract: Driven by a rapid co-evolution of both harness and underlying models, LLM agents are improving at a dizzying pace. In our prior work (performed in Dec. 2025), we introduced “Design Conductor” (or just “Conductor”), a system capable of building a 5-stage Linux-capable RISC-V CPU in 12 hours. In this work, […]
Provable Distributional Value Iteration under Partial Observability
arXiv:2505.06518v3 Announce Type: replace Abstract: In many real-world planning tasks, agents must tackle uncertainty about the environment’s state and variability in the outcomes induced by stochastic dynamics and rewards. Motivated by recent progress in world model approaches, where latent models approximate beliefs and support planning, we extend Distributional Reinforcement Learning (DistRL), which models the entire […]
Towards Generative Location Awareness for Disaster Response: A Probabilistic Cross-view Geolocalization Approach
arXiv:2512.20056v2 Announce Type: replace Abstract: As Earth’s climate changes, it is impacting disasters and extreme weather events across the planet. Record-breaking heat waves, drenching rainfalls, extreme wildfires, and widespread flooding during hurricanes are all becoming more frequent and more intense. Rapid and efficient response to disaster events is essential for climate resilience and sustainability. A […]
Beyond Content Safety: Real-Time Monitoring for Reasoning Vulnerabilities in Large Language Models
arXiv:2603.25412v2 Announce Type: replace Abstract: Large language models increasingly rely on explicit chain-of-thought reasoning to solve complex tasks, yet the safety of the reasoning process itself remains largely unaddressed. Existing work focuses predominantly on content safety (i.e., detecting harmful, biased, or factually incorrect outputs), while treating the underlying reasoning chain as an opaque intermediate artifact. […]
Shadow-Loom: Causal Reasoning over Graphical World Models of Narratives
arXiv:2605.02475v2 Announce Type: replace Abstract: Stories hold a reader’s attention because they have causes, secrets, and consequences. Shadow-Loom is an experimental open-source framework that turns a narrative into a versioned graphical world model and lets two engines act on it: a causal physics grounded in Pearl’s ladder of causation and a recently proposed counterfactual calculus […]
Beyond Public Access in LLM Pre-Training Data
arXiv:2505.00020v2 Announce Type: replace-cross Abstract: Using a legally obtained dataset of 34 copyrighted O’Reilly Media books, we apply the DE-COP membership inference attack method to investigate whether OpenAI’s large language models show recognition of copyrighted content. Our results based on this small sample suggest that GPT-4o, OpenAI’s more recent and capable model, exhibits patterns consistent […]