arXiv:2603.21565v1 Announce Type: cross Abstract: Synthetic aperture radar automatic target recognition (SAR ATR) is of considerable importance in marine navigation and disaster monitoring. However, the coherent speckle noise inherent in SAR imagery often obscures salient target features, leading to degraded recognition accuracy and limited model generalization. To address this issue, this paper proposes a target-aware […]
Grounded Chess Reasoning in Language Models via Master Distillation
arXiv:2603.20510v1 Announce Type: new Abstract: Language models often lack grounded reasoning capabilities in specialized domains where training data is scarce but bespoke systems excel. We introduce a general framework for distilling expert system reasoning into natural language chain-of-thought explanations, enabling compact models to acquire domain expertise and the ability to generate faithful, grounded explanations. Rather […]
When Exploration Comes for Free with Mixture-Greedy: Do we need UCB in Diversity-Aware Multi-Armed Bandits?
arXiv:2603.21716v1 Announce Type: cross Abstract: Efficient selection among multiple generative models is increasingly important in modern generative AI, where sampling from suboptimal models is costly. This problem can be formulated as a multi-armed bandit task. Under diversity-aware evaluation metrics, a non-degenerate mixture of generators can outperform any individual model, distinguishing this setting from classical best-arm […]
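The title asks whether UCB-style exploration is still needed in this setting. As background only (this is the classical baseline the paper questions, not its proposed mixture-greedy method), a minimal sketch of UCB1 arm selection for a multi-armed bandit:

```python
import math

def ucb1_select(counts, means, t, c=2.0):
    """Classical UCB1 selection: pick the arm maximizing
    empirical mean + sqrt(c * ln(t) / n_pulls), which adds an
    exploration bonus for rarely pulled arms.

    counts[i] -- number of times arm i has been pulled
    means[i]  -- empirical mean reward of arm i
    t         -- current round (1-indexed)
    """
    # Play every arm once before applying the confidence bound.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    return max(
        range(len(counts)),
        key=lambda i: means[i] + math.sqrt(c * math.log(t) / counts[i]),
    )
```

A rarely pulled arm with a slightly lower mean can still be selected because its confidence radius is large; the abstract's point is that under diversity-aware metrics, a mixture over arms may already provide this kind of coverage without an explicit bonus.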
LLM-Driven Heuristic Synthesis for Industrial Process Control: Lessons from Hot Steel Rolling
arXiv:2603.20537v1 Announce Type: new Abstract: Industrial process control demands policies that are interpretable and auditable, requirements that black-box neural policies struggle to meet. We study an LLM-driven heuristic synthesis framework for hot steel rolling, in which a language model iteratively proposes and refines human-readable Python controllers using rich behavioral feedback from a physics-based simulator. The […]
Adaptive Video Distillation: Mitigating Oversaturation and Temporal Collapse in Few-Step Generation
arXiv:2603.21864v1 Announce Type: cross Abstract: Video generation has recently emerged as a central task in the field of generative AI. However, the substantial computational cost inherent in video synthesis makes model distillation a critical technique for efficient deployment. Despite its significance, there is a scarcity of methods specifically designed for video diffusion models. Prevailing approaches […]
Leum-VL Technical Report
arXiv:2603.20354v1 Announce Type: cross Abstract: A short video succeeds not simply because of what it shows, but because of how it schedules attention — yet current multimodal models lack the structural grammar to parse or produce this organization. Existing models can describe scenes, answer event-centric questions, and read on-screen text, but they are far less […]
Context Cartography: Toward Structured Governance of Contextual Space in Large Language Model Systems
arXiv:2603.20578v1 Announce Type: new Abstract: The prevailing approach to improving large language model (LLM) reasoning has centered on expanding context windows, implicitly assuming that more tokens yield better performance. However, empirical evidence – including the “lost in the middle” effect and long-distance relational degradation – demonstrates that contextual space exhibits structural gradients, salience asymmetries, and […]
ALARA for Agents: Least-Privilege Context Engineering Through Portable Composable Multi-Agent Teams
arXiv:2603.20380v1 Announce Type: cross Abstract: Industry practitioners and academic researchers regularly use multi-agent systems to accelerate their work, yet the frameworks through which these systems operate lack a simple, unified mechanism for scalably managing critical aspects of the agent harness, which impacts both the quality of individual human-agent interactions and the capacity for […]
lambda-GELU: Learning Gating Hardness for Controlled ReLU-ization in Deep Networks
arXiv:2603.21991v1 Announce Type: cross Abstract: Gaussian Error Linear Unit (GELU) is a widely used smooth alternative to the Rectified Linear Unit (ReLU), yet many deployment, compression, and analysis toolchains are most naturally expressed for piecewise-linear (ReLU-type) networks. We study a hardness-parameterized formulation of GELU, f(x;lambda)=xPhi(lambda x), where Phi is the Gaussian CDF and lambda in [1, […]
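The stated formulation f(x; lambda) = x * Phi(lambda * x) is directly computable. A minimal sketch (an illustration of the formula as given in the abstract, not the paper's training procedure for learning lambda):

```python
import math

def gauss_cdf(z):
    """Standard Gaussian CDF Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lambda_gelu(x, lam=1.0):
    """Hardness-parameterized GELU: f(x; lambda) = x * Phi(lambda * x).

    lam = 1 recovers the standard GELU. As lam grows, Phi(lam * x)
    approaches a unit step at 0, so f approaches ReLU -- the
    "controlled ReLU-ization" the title refers to.
    """
    return x * gauss_cdf(lam * x)
```

For example, lambda_gelu(1.0, lam=1.0) equals the standard GELU value x * Phi(x) at x = 1, while at large lam the output of a negative input is driven toward 0 and a positive input toward x, matching ReLU.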
KV Cache Optimization Strategies for Scalable and Efficient LLM Inference
arXiv:2603.20397v1 Announce Type: cross Abstract: The key-value (KV) cache is a foundational optimization in Transformer-based large language models (LLMs), eliminating redundant recomputation of past token representations during autoregressive generation. However, its memory footprint scales linearly with context length, imposing critical bottlenecks on GPU memory capacity, memory bandwidth, and inference throughput as production LLMs push context […]
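The mechanism the abstract describes, caching past keys and values so each decoding step computes K and V only for the new token, can be sketched in a toy single-head form (a hypothetical illustration of the general KV-cache idea, not any specific optimization from the paper):

```python
import math

class KVCache:
    """Toy single-head attention with a KV cache.

    Keys and values for past tokens are stored rather than recomputed,
    so each step does O(1) new K/V work but the cache memory grows
    linearly with context length -- the bottleneck the abstract notes.
    """

    def __init__(self):
        self.keys = []    # one key vector per cached token
        self.values = []  # one value vector per cached token

    def step(self, q, k, v):
        """Append the new token's (k, v), then attend q over the full cache."""
        self.keys.append(k)
        self.values.append(v)
        d = len(q)
        # Scaled dot-product scores against every cached key.
        scores = [sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d)
                  for key in self.keys]
        # Numerically stable softmax over the scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        # Attention output: weighted sum of cached values.
        return [sum(w * val[i] for w, val in zip(weights, self.values))
                for i in range(len(v))]
```

Without the cache, every step would recompute keys and values for all past tokens; with it, the per-step cost shifts from recomputation to memory, which is exactly the trade-off the surveyed optimization strategies target.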
Position: Multi-Agent Algorithmic Care Systems Demand Contestability for Trustworthy AI
arXiv:2603.20595v1 Announce Type: new Abstract: Multi-agent systems (MAS) are increasingly used in healthcare to support complex decision-making through collaboration among specialized agents. Because these systems act as collective decision-makers, they raise challenges for trust, accountability, and human oversight. Existing approaches to trustworthy AI largely rely on explainability, but explainability alone is insufficient in multi-agent settings, […]
Coding Agents are Effective Long-Context Processors
arXiv:2603.20432v1 Announce Type: cross Abstract: Large Language Models (LLMs) have demonstrated remarkable progress in scaling to access massive contexts. However, this access occurs through latent, uninterpretable attention mechanisms, and LLMs often fail to effectively process long contexts, exhibiting significant performance degradation as context length increases. In this work, we study whether long-context processing can […]