arXiv:2511.12779v2 Announce Type: replace-cross Abstract: We study the problem of efficiently estimating policies that simultaneously optimize multiple objectives in reinforcement learning (RL). Given $n$ objectives (or tasks), we seek the optimal partition of these objectives into $k ll n$ groups, where each group comprises related objectives that can be trained together. This problem arises in […]
Revisiting the Scaling Properties of Downstream Metrics in Large Language Model Training
arXiv:2512.08894v1 Announce Type: cross Abstract: While scaling laws for Large Language Models (LLMs) traditionally focus on proxy metrics like pretraining loss, predicting downstream task performance has been considered unreliable. This paper challenges that view by proposing a direct framework to model the scaling of benchmark performance from the training budget. We find that for a […]
TS-PEFT: Unveiling Token-Level Redundancy in Parameter-Efficient Fine-Tuning
arXiv:2511.16147v2 Announce Type: replace-cross Abstract: Current Parameter-Efficient Fine-Tuning (PEFT) methods typically operate under an implicit assumption: once a target module is selected, every token passing through it contributes equally to the downstream task and requires a parameter update. In this paper, we challenge this convention and unveil a pervasive token-level redundancy in the fine-tuning of […]
Escaping the Verifier: Learning to Reason via Demonstrations
arXiv:2511.21667v3 Announce Type: replace-cross Abstract: Training Large Language Models (LLMs) to reason often relies on Reinforcement Learning (RL) with task-specific verifiers. However, many real-world reasoning-intensive tasks lack verifiers, despite offering abundant expert demonstrations that remain under-utilized for reasoning-focused training. We introduce RARO (Relativistic Adversarial Reasoning Optimization) that learns strong reasoning capabilities from only expert demonstrations […]
Decoupling Template Bias in CLIP: Harnessing Empty Prompts for Enhanced Few-Shot Learning
arXiv:2512.08606v1 Announce Type: cross Abstract: The Contrastive Language-Image Pre-Training (CLIP) model excels in few-shot learning by aligning visual and textual representations. Our study shows that template-sample similarity (TSS), defined as the resemblance between a text template and an image sample, introduces bias. This bias leads the model to rely on template proximity rather than true […]
On the Temporality for Sketch Representation Learning
arXiv:2512.04007v2 Announce Type: replace-cross Abstract: Sketches are simple human hand-drawn abstractions of complex scenes and real-world objects. Although the field of sketch representation learning has advanced significantly, there is still a gap in understanding the true relevance of the temporal aspect to the quality of these representations. This work investigates whether it is indeed justifiable […]
Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning
arXiv:2512.08639v1 Announce Type: cross Abstract: Aerial Vision-and-Language Navigation (VLN) aims to enable unmanned aerial vehicles (UAVs) to interpret natural language instructions and navigate complex urban environments using onboard visual observation. This task holds promise for real-world applications such as low-altitude inspection, search-and-rescue, and autonomous aerial delivery. Existing methods often rely on panoramic images, depth inputs, […]
China Regional 3km Downscaling Based on Residual Corrective Diffusion Model
arXiv:2512.05377v2 Announce Type: replace-cross Abstract: A fundamental challenge in numerical weather prediction is to efficiently produce high-resolution forecasts. A common solution is applying downscaling methods, which include dynamical downscaling and statistical downscaling, to the outputs of global models. This work focuses on statistical downscaling, which establishes statistical relationships between low-resolution and high-resolution historical data using […]
Reusability in MLOps: Leveraging Ports and Adapters to Build a Microservices Architecture for the Maritime Domain
arXiv:2512.08657v1 Announce Type: cross Abstract: ML-Enabled Systems (MLES) are inherently complex since they require multiple components to achieve their business goal. This experience report showcases the software architecture reusability techniques applied while building Ocean Guard, an MLES for anomaly detection in the maritime domain. In particular, it highlights the challenges and lessons learned to reuse […]
TrajMoE: Scene-Adaptive Trajectory Planning with Mixture of Experts and Reinforcement Learning
arXiv:2512.07135v2 Announce Type: replace-cross Abstract: Current autonomous driving systems often favor end-to-end frameworks, which take sensor inputs like images and learn to map them into trajectory space via neural networks. Previous work has demonstrated that models can achieve better planning performance when provided with a prior distribution of possible trajectories. However, these approaches often overlook […]