arXiv:2605.27155v1 Announce Type: cross
Abstract: Testing object detectors in safety-critical domains requires semantically meaningful probes beyond pixel-level corruptions. We present SemProbe, a tool for semantic robustness probing: users upload deployment images, create masks manually or automatically, select operational design domain-derived factors (or custom prompts), and run diffusion-based controlled inpainting. The system supports batch jobs, parallel seed/workflow variations, and configurable generation parameters. After each output, model inference runs automatically and displays annotated before/after comparisons with performance deltas. All probes are logged as structured artifacts, enabling traceable robustness evidence aligned with safety evaluation workflows. We demonstrate textscSemProbe on hand detection for dimension saws, targeting factors from insurance-oriented test criteria.
Negligible in Size, Significant in Effect: On Scale Vectors in Large Language Models
arXiv:2605.26895v1 Announce Type: cross Abstract: Normalization layers in modern large language models (LLMs) consist of a deterministic normalization operation and a learnable scale vector. While


