arXiv:2604.04516v1 Announce Type: cross Abstract: Adapting LLMs to new domains causes forgetting because standard methods (full fine-tuning, LoRA) inject new directions into the weight space. We propose GAIN, which re-emphasizes existing features through multiplicative modulation W_new = S * W. The learned diagonal matrix S is applied to the attention output projection and optionally the […]
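A minimal sketch of the multiplicative modulation W_new = S * W named in the abstract above, assuming that `*` denotes a matrix product with a diagonal S on the left. The helper names (`modulate`, `matvec`) and the toy numbers are hypothetical and are not taken from the paper. Under that assumption, the update rescales each row of W, so every output feature of the layer is scaled by its own learned factor.

```python
# Illustrative only: a toy stand-in for a projection matrix, not the GAIN code.

def matvec(W, x):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(w * x_j for w, x_j in zip(row, x)) for row in W]


def modulate(W, s):
    """Return diag(s) @ W, i.e. row i of W scaled by s[i] (assumed reading of S * W)."""
    if len(s) != len(W):
        raise ValueError("need one scale factor per row of W")
    return [[s_i * w for w in row] for s_i, row in zip(s, W)]


W = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]  # hypothetical 3x2 weight matrix
s = [1.5, 1.0, 0.5]                        # hypothetical diagonal of S
x = [1.0, 1.0]                             # hypothetical input vector

W_new = modulate(W, s)
y_old = matvec(W, x)
y_new = matvec(W_new, x)

# Each output coordinate is the old one times its scale factor.
for s_i, old, new in zip(s, y_old, y_new):
    print(f"{old:+.2f} * {s_i:.2f} = {new:+.2f}")
```

Setting every entry of `s` to 1 leaves W unchanged, which is one reason a purely diagonal modulation is a conservative update in this toy setting.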
Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw
arXiv:2604.04759v1 Announce Type: cross Abstract: OpenClaw, the most widely deployed personal AI agent in early 2026, operates with full local system access and integrates with sensitive services such as Gmail, Stripe, and the filesystem. While these broad privileges enable high levels of automation and powerful personalization, they also expose a substantial attack surface that existing […]
Seemingly Simple Planning Problems are Computationally Challenging: The Countdown Game
arXiv:2508.02900v2 Announce Type: replace Abstract: There is a broad consensus that the inability to form long-term plans is one of the key limitations of current foundational models and agents. However, the existing planning benchmarks remain woefully inadequate to truly measure their planning capabilities. Most existing benchmarks either focus on loosely defined tasks like travel planning […]
ClawSafety: “Safe” LLMs, Unsafe Agents
arXiv:2604.01438v2 Announce Type: replace Abstract: Personal AI agents like OpenClaw run with elevated privileges on users’ local machines, where a single successful prompt injection can leak credentials, redirect financial transactions, or destroy files. This threat goes well beyond conventional text-level jailbreaks, yet existing safety evaluations fall short: most test models in isolated chat settings, rely […]
Vision Transformer-Based Time-Series Image Reconstruction for Cloud-Filling Applications
arXiv:2506.19591v2 Announce Type: replace-cross Abstract: Cloud cover in multispectral imagery (MSI) poses significant challenges for early-season crop mapping, as it leads to missing or corrupted spectral information. Synthetic aperture radar (SAR) data, which is not affected by cloud interference, offers a complementary solution but lacks sufficient spectral detail for precise crop mapping. To address […]
MPCFormer: A physics-informed data-driven approach for explainable socially-aware autonomous driving
arXiv:2512.03795v2 Announce Type: replace-cross Abstract: Autonomous Driving (AD) vehicles still struggle to exhibit human-like behavior in highly dynamic and interactive traffic scenarios. The key challenge lies in AD’s limited ability to interact with surrounding vehicles, largely due to a lack of understanding of the underlying mechanisms of social interaction. To address this issue, we introduce MPCFormer, […]
WIMLE: Uncertainty-Aware World Models with IMLE for Sample-Efficient Continuous Control
arXiv:2602.14351v2 Announce Type: replace-cross Abstract: Model-based reinforcement learning promises strong sample efficiency but often underperforms in practice due to compounding model error, unimodal world models that average over multi-modal dynamics, and overconfident predictions that bias learning. We introduce WIMLE, a model-based method that extends Implicit Maximum Likelihood Estimation (IMLE) to the model-based RL framework to […]
HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention
arXiv:2603.28458v3 Announce Type: replace-cross Abstract: Token-level sparse attention mechanisms, exemplified by DeepSeek Sparse Attention (DSA), achieve fine-grained key selection by scoring every historical key for each query through a lightweight indexer, then computing attention only on the selected subset. While the downstream sparse attention itself scales favorably, the indexer must still scan the entire prefix […]
GA-GS: Generation-Assisted Gaussian Splatting for Static Scene Reconstruction
arXiv:2604.04331v1 Announce Type: cross Abstract: Reconstructing a static 3D scene from monocular video with dynamic objects is important for numerous applications such as virtual reality and autonomous driving. Current approaches typically rely on the background for static scene reconstruction, limiting the ability to recover regions occluded by dynamic objects. In this paper, we propose GA-GS, a Generation-Assisted […]
Same Geometry, Opposite Noise: Transformer Magnitude Representations Lack Scalar Variability
arXiv:2604.04469v1 Announce Type: cross Abstract: Scalar variability — the finding that representational noise scales proportionally with magnitude, producing a constant coefficient of variation — is a hallmark of biological magnitude systems. We tested whether transformer language models exhibit this property by analysing the dispersion of hidden-state representations across carrier sentences for 26 numerical magnitudes in […]
PassiveQA: A Three-Action Framework for Epistemically Calibrated Question Answering via Supervised Finetuning
arXiv:2604.04565v1 Announce Type: cross Abstract: Large Language Models (LLMs) have achieved strong performance in question answering and retrieval-augmented generation (RAG), yet they implicitly assume that user queries are fully specified and answerable. In real-world settings, queries are often incomplete, ambiguous, or missing critical variables, leading models to produce overconfident or hallucinated responses. In this work, […]
Individual and Combined Effects of English as a Second Language and Typos on LLM Performance
arXiv:2604.04723v1 Announce Type: cross Abstract: Large language models (LLMs) are used globally, and because much of their training data is in English, they typically perform best on English inputs. As a result, many non-native English speakers interact with them in English as a second language (ESL), and these inputs often contain typographical errors. Prior work […]