arXiv:2603.18647v1 Announce Type: cross Abstract: Test Vector Leakage Assessment (TVLA) based on Welch’s $t$-test has become a standard tool for detecting side-channel leakage. However, its mean-based nature can limit sensitivity when leakage manifests primarily through higher-order distributional differences. As our experiments show, this property becomes especially crucial when it comes to evaluating neural network implementations. […]
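The abstract's point about mean-based tests can be illustrated with a minimal sketch. It is not the paper's method or experimental setup. It assumes two synthetic Gaussian trace sets with equal means and different variances, and it uses the conventional TVLA threshold of |t| > 4.5. The names `welch_t`, `centered_squares`, `fixed_set`, and `random_set` are hypothetical. The first-order statistic should stay small, while the same test applied to centered, squared samples (a common second-order preprocessing step) should flag the difference.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two samples with possibly unequal variances."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))


def centered_squares(x):
    """Second-order preprocessing: subtract the sample mean, then square."""
    m = statistics.fmean(x)
    return [(v - m) ** 2 for v in x]


# Synthetic "traces" (an assumption for illustration): same mean, different spread.
rng = random.Random(0)
n = 5000
fixed_set = [rng.gauss(0.0, 1.0) for _ in range(n)]
random_set = [rng.gauss(0.0, 2.0) for _ in range(n)]

TVLA_THRESHOLD = 4.5  # conventional pass/fail bound on |t|

t_first = welch_t(fixed_set, random_set)
t_second = welch_t(centered_squares(fixed_set), centered_squares(random_set))

print(f"first-order  |t| = {abs(t_first):.2f}")
print(f"second-order |t| = {abs(t_second):.2f}")
```

In this toy setting the leakage lives entirely in the variance, so only the second-order statistic is expected to cross the threshold.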
Retrieval-Augmented LLM Agents: Learning to Learn from Experience
arXiv:2603.18272v1 Announce Type: new Abstract: While large language models (LLMs) have advanced the development of general-purpose agents, achieving robust generalization to unseen tasks remains a significant challenge. Current approaches typically rely on either fine-tuning or training-free memory-augmented generation using retrieved experience; yet both have limitations: fine-tuning often fails to extrapolate to new tasks, while experience […]
KD-EKF: Knowledge-Distilled Adaptive Covariance EKF for Robust UWB/PDR Indoor Localization
arXiv:2603.18027v1 Announce Type: cross Abstract: Ultra-wideband (UWB) indoor localization provides centimeter-level accuracy and low latency, but its measurement reliability degrades severely under Non-Line-of-Sight (NLOS) conditions, leading to meter-scale ranging errors and inconsistent uncertainty characteristics. Inertial Measurement Unit (IMU)-based Pedestrian Dead Reckoning (PDR) complements UWB by providing infrastructure-free motion estimation; however, its error accumulates nonlinearly over […]
SEM: Sparse Embedding Modulation for Post-Hoc Debiasing of Vision-Language Models
arXiv:2603.19028v1 Announce Type: cross Abstract: Models that bridge vision and language, such as CLIP, are key components of multimodal AI, yet their large-scale, uncurated training data introduce severe social and spurious biases. Existing post-hoc debiasing methods often operate directly in the dense CLIP embedding space, where bias and task-relevant information are highly entangled. This entanglement […]
InfoMamba: An Attention-Free Hybrid Mamba-Transformer Model
arXiv:2603.18031v1 Announce Type: cross Abstract: Balancing fine-grained local modeling with long-range dependency capture under computational constraints remains a central challenge in sequence modeling. While Transformers provide strong token mixing, they suffer from quadratic complexity, whereas Mamba-style selective state-space models (SSMs) scale linearly but often struggle to capture high-rank and synchronous global interactions. We present a […]
EDM-ARS: A Domain-Specific Multi-Agent System for Automated Educational Data Mining Research
arXiv:2603.18273v1 Announce Type: new Abstract: In this technical report, we present the Educational Data Mining Automated Research System (EDM-ARS), a domain-specific multi-agent pipeline that automates end-to-end educational data mining (EDM) research. We conceptualize EDM-ARS as a general framework for domain-aware automated research pipelines, where educational expertise is embedded into each stage of the research lifecycle. […]
NANOZK: Layerwise Zero-Knowledge Proofs for Verifiable Large Language Model Inference
arXiv:2603.18046v1 Announce Type: cross Abstract: When users query proprietary LLM APIs, they receive outputs with no cryptographic assurance that the claimed model was actually used. Service providers could substitute cheaper models, apply aggressive quantization, or return cached responses – all undetectable by users paying premium prices for frontier capabilities. We present NANOZK, a zero-knowledge proof […]
DreamPartGen: Semantically Grounded Part-Level 3D Generation via Collaborative Latent Denoising
arXiv:2603.19216v1 Announce Type: cross Abstract: Understanding and generating 3D objects as compositions of meaningful parts is fundamental to human perception and reasoning. However, most text-to-3D methods overlook the semantic and functional structure of parts. While recent part-aware approaches introduce decomposition, they remain largely geometry-focused, lacking semantic grounding and failing to model how parts align with […]
Lightweight Adaptation for LLM-based Technical Service Agent: Latent Logic Augmentation and Robust Noise Reduction
arXiv:2603.18074v1 Announce Type: cross Abstract: Adapting Large Language Models in complex technical service domains is constrained by the absence of explicit cognitive chains in human demonstrations and the inherent ambiguity arising from the diversity of valid responses. These limitations severely hinder agents from internalizing latent decision dynamics and generalizing effectively. Moreover, practical adaptation is often […]
CORE: Robust Out-of-Distribution Detection via Confidence and Orthogonal Residual Scoring
arXiv:2603.18290v1 Announce Type: new Abstract: Out-of-distribution (OOD) detection is essential for deploying deep learning models reliably, yet no single method performs consistently across architectures and datasets — a scorer that leads on one benchmark often falters on another. We attribute this inconsistency to a shared structural limitation: logit-based methods see only the classifier’s confidence signal, […]
Enhancing Reinforcement Learning Fine-Tuning with an Online Refiner
arXiv:2603.18088v1 Announce Type: cross Abstract: Constraints are essential for stabilizing reinforcement learning fine-tuning (RFT) and preventing degenerate outputs, yet they inherently conflict with the optimization objective because stronger constraints limit the ability of a fine-tuned model to discover better solutions. We propose dynamic constraints that resolve this tension by adapting to the evolving capabilities of […]
MMSearch-Plus: Benchmarking Provenance-Aware Search for Multimodal Browsing Agents
arXiv:2508.21475v3 Announce Type: replace Abstract: Existing multimodal browsing benchmarks often fail to require genuine multimodal reasoning, as many tasks can be solved with text-only heuristics without vision-in-the-loop verification. We introduce MMSearch-Plus, a 311-task benchmark that enforces multimodal understanding by requiring extraction and propagation of fine-grained visual cues through iterative image-text retrieval and cross-validation under retrieval […]