Language Models Can Explain Visual Features via Steering

Improving Fine-Grained Rice Leaf Disease Detection via Angular-Compactness Dual Loss Learning

arXiv:2603.25006v1 Announce Type: cross Abstract: Early detection of rice leaf diseases is critical, as rice is a staple crop supporting a substantial share of the

Pixelis: Reasoning in Pixels, from Seeing to Acting

arXiv:2603.25091v1 Announce Type: cross Abstract: Most vision-language systems are static observers: they describe pixels, do not act, and cannot safely improve under shift. This passivity

AI Security in the Foundation Model Era: A Comprehensive Survey from a Unified Perspective

arXiv:2603.24857v1 Announce Type: cross Abstract: As machine learning (ML) systems expand in both scale and functionality, the security landscape has become increasingly complex, with a

TIGFlow-GRPO: Trajectory Forecasting via Interaction-Aware Flow Matching and Reward-Driven Optimization

arXiv:2603.24936v1 Announce Type: cross Abstract: Human trajectory forecasting is important for intelligent multimedia systems operating in visually complex environments, such as autonomous driving and crowd

Grokking as a Falsifiable Finite-Size Transition

arXiv:2603.24746v1 Announce Type: cross Abstract: Grokking — the delayed onset of generalization after early memorization — is often described with phase-transition language, but that claim