Revisiting the Scaling Properties of Downstream Metrics in Large Language Model Training

Scalable Multi-Objective and Meta Reinforcement Learning via Gradient Estimation

arXiv:2511.12779v2 Announce Type: replace-cross Abstract: We study the problem of efficiently estimating policies that simultaneously optimize multiple objectives in reinforcement learning (RL). Given $n$ objectives

TS-PEFT: Unveiling Token-Level Redundancy in Parameter-Efficient Fine-Tuning

arXiv:2511.16147v2 Announce Type: replace-cross Abstract: Current Parameter-Efficient Fine-Tuning (PEFT) methods typically operate under an implicit assumption: once a target module is selected, every token passing

Escaping the Verifier: Learning to Reason via Demonstrations

arXiv:2511.21667v3 Announce Type: replace-cross Abstract: Training Large Language Models (LLMs) to reason often relies on Reinforcement Learning (RL) with task-specific verifiers. However, many real-world reasoning-intensive

Decoupling Template Bias in CLIP: Harnessing Empty Prompts for Enhanced Few-Shot Learning

arXiv:2512.08606v1 Announce Type: cross Abstract: The Contrastive Language-Image Pre-Training (CLIP) model excels in few-shot learning by aligning visual and textual representations. Our study shows that

On the Temporality for Sketch Representation Learning

arXiv:2512.04007v2 Announce Type: replace-cross Abstract: Sketches are simple human hand-drawn abstractions of complex scenes and real-world objects. Although the field of sketch representation learning has