June 9, 2026 – Page 17 – dijee Pharma Intelligence

Learning to Evaluate: Cost-Effective Model Evaluation on Unlabeled Data with Meta-Learning

arXiv:2605.23595v3 Announce Type: replace-cross Abstract: The rapid advancement of machine learning has led to an unprecedented expansion of model ecosystems, making it increasingly difficult to assess the reliability of newly released models on unseen and unlabeled data. Existing evaluation pipelines typically rely on costly annotation, repeated fine-tuning, or assumptions that do not generalize well to […]

June 9, 2026

ThinkBooster: A Unified Framework for Seamless Test-Time Scaling of LLM Reasoning

arXiv:2606.06915v2 Announce Type: replace-cross Abstract: Test-time compute (TTC) scaling has emerged as a powerful paradigm for improving large language model (LLM) reasoning by allocating additional compute during inference, e.g., via multi-sample generation and verifier-based reranking. Existing TTC scaling strategies and reasoning scorers remain fragmented, evaluated under inconsistent protocols, and are rarely analyzed through the lens […]

June 9, 2026
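
The "multi-sample generation and verifier-based reranking" mentioned in the ThinkBooster abstract is often called best-of-N sampling. The sketch below is a generic illustration, not the ThinkBooster API: `best_of_n`, `make_toy_generator`, and `toy_score` are hypothetical names, and the toy generator and scorer stand in for a real LLM and a real verifier.

```python
import random
from typing import Callable


def best_of_n(
    prompt: str,
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    n: int,
) -> str:
    """Sample n candidates and return the one the scorer rates highest."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))


def make_toy_generator(seed: int) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: emits a random integer as text."""
    rng = random.Random(seed)

    def generate(prompt: str) -> str:
        return str(rng.randint(0, 20))

    return generate


def toy_score(prompt: str, candidate: str) -> float:
    """Hypothetical stand-in for a verifier: closer to 12 scores higher."""
    return -abs(int(candidate) - 12)


answer = best_of_n("What is 5 + 7?", make_toy_generator(0), toy_score, n=16)
print(answer)
```

Spending more compute here simply means raising `n`: more samples give the scorer more chances to find a good candidate, at a linear cost in generation calls.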

Understanding Quantization-Aware Training: Gradients at Quantized Weights Bias to the Low-Loss Basin

arXiv:2606.09012v1 Announce Type: cross Abstract: Post-training quantization (PTQ) converts a trained full-precision model into low-bit weights without task-level retraining, while quantization-aware training (QAT) incorporates quantization into the training loop. Although PTQ is efficient and often accurate at moderate bitwidths, it can fail sharply at aggressive bitwidths; QAT is more expensive but can often recover the […]

June 9, 2026
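
To make the PTQ half of the quantization abstract concrete, here is a minimal sketch of round-to-nearest weight quantization at different bitwidths. It assumes a simple symmetric, per-tensor scheme chosen for illustration; the paper's own quantizer is not specified in the excerpt, and `quantize_symmetric` is a hypothetical helper name.

```python
import numpy as np


def quantize_symmetric(weights: np.ndarray, bits: int) -> np.ndarray:
    """Round weights to a symmetric uniform grid and map them back to floats.

    Assumption: one scale per tensor, integer levels in [-qmax, qmax].
    """
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(weights).max() / qmax
    if scale == 0:
        # All-zero tensor: nothing to quantize.
        return weights.copy()
    q = np.clip(np.round(weights / scale), -qmax, qmax)
    return q * scale


rng = np.random.default_rng(0)
w = rng.normal(size=1000)

# Mean squared reconstruction error at a moderate and an aggressive bitwidth.
err8 = np.mean((w - quantize_symmetric(w, 8)) ** 2)
err2 = np.mean((w - quantize_symmetric(w, 2)) ** 2)
print(f"8-bit MSE: {err8:.6f}, 2-bit MSE: {err2:.6f}")
```

At 2 bits this scheme leaves only three levels (negative, zero, positive), so the reconstruction error is much larger than at 8 bits, which is the regime where the abstract says PTQ can fail sharply.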

Self-Paced Curriculum Reinforcement Learning for Autonomous Superbike Racing in Simulation

arXiv:2606.09236v1 Announce Type: cross Abstract: Autonomous Racing has seen remarkable progress through deep Reinforcement Learning (RL), primarily for four-wheeled vehicles. However, motorbikes introduce substantially greater complexity due to the need to manage balance and lean angle, in addition to more reactive steering and throttle control, and a smaller weight. In this work, we present a […]

June 9, 2026

Closure-Validated Circuit Discovery in Attention Heads: Co-activation Proposes, Ablation Disposes

arXiv:2606.09607v1 Announce Type: cross Abstract: Interpretability increasingly treats groups of components, not individual units, as the basic object, and proposes to find them by clustering co-activation statistics. We ask whether such a cheap signal actually identifies an attention-head circuit. Adapting a sparse-autoencoder clustering recipe to attention heads — but validating by causal ablation rather than […]

June 9, 2026

HA-VLN 2.0: An Open Benchmark and Leaderboard for Human-Aware Navigation in Discrete and Continuous Environments with Dynamic Multi-Human Interactions

arXiv:2503.14229v4 Announce Type: replace Abstract: Vision-and-Language Navigation (VLN) has been studied mainly in either discrete or continuous spaces, with little attention to dynamic, crowded environments. We present HA-VLN 2.0, a unified benchmark introducing explicit social-awareness constraints. Our contributions are: (i) a standardized task and metrics capturing both goal accuracy and personal-space adherence; (ii) HAPS 2.0 […]

June 9, 2026

NoRD: A Data-Efficient Vision-Language-Action Model that Drives without Reasoning

arXiv:2602.21172v3 Announce Type: replace Abstract: Vision-Language-Action (VLA) models are advancing autonomous driving by replacing modular pipelines with unified end-to-end architectures. However, current VLAs face two expensive requirements: (1) massive dataset collection, and (2) dense reasoning annotations. In this work, we address both challenges with NoRD (No Reasoning for Driving). Compared to existing VLAs, NoRD achieves […]

June 9, 2026

Neural Scalable Symbolic Search Framework for Complex Logical Queries with Multiple Free Variables

arXiv:2605.25985v2 Announce Type: replace Abstract: Complex Query Answering (CQA) is a fundamental knowledge representation and reasoning task over incomplete knowledge graphs (KGs). Answering existential first-order queries with $k$ free variables (i.e., $\text{EFO}_k$ queries) is a crucial yet challenging problem, as it requires ranking answer tuples in $\mathcal{E}^k$, where $\mathcal{E}$ denotes the entity set of a […]

June 9, 2026

Audio-FLAN: An Instruction-Following Dataset for Unified Audio Understanding and Generation of Speech, Music, and Sound

arXiv:2502.16584v2 Announce Type: replace-cross Abstract: Recent advancements in audio tokenization have significantly enhanced the integration of audio capabilities into large language models (LLMs). However, audio understanding and generation are often treated as distinct tasks, hindering the development of truly unified audio-language models. While instruction tuning has demonstrated remarkable success in improving generalization and zero-shot learning […]

June 9, 2026

Correcting Mean Bias in Text Embeddings: A Refined Renormalization with Training-Free Improvements on MMTEB

arXiv:2511.11041v2 Announce Type: replace-cross Abstract: We find that current sentence-embedding models produce outputs with a consistent bias: every embedding $e$ decomposes as $\tilde{e} + \mu$, where the mean $\mu$ is near-identical across all sentences. We study two training-free corrections — subtracting $\mu$ directly (R1), or projecting each embedding off the mean direction (R2) — […]

June 9, 2026
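
The two training-free corrections named in the mean-bias abstract, R1 (subtract the mean) and R2 (project off the mean direction), can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the authors' code: the synthetic embeddings, the function names, and the final L2 re-normalization step are all hypothetical choices, and the mean is estimated from the sample itself.

```python
import numpy as np


def subtract_mean(embeddings: np.ndarray, mu: np.ndarray) -> np.ndarray:
    """R1: remove the shared mean vector from every embedding."""
    return embeddings - mu


def project_off_mean(embeddings: np.ndarray, mu: np.ndarray) -> np.ndarray:
    """R2: remove each embedding's component along the mean direction."""
    u = mu / np.linalg.norm(mu)
    return embeddings - np.outer(embeddings @ u, u)


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Rescale each row to unit length (assumed post-processing step)."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


# Synthetic embeddings with a strong shared offset (assumption for the demo).
rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 8)) + np.full(8, 2.0)
mu = emb.mean(axis=0)

r1 = l2_normalize(subtract_mean(emb, mu))
r2 = l2_normalize(project_off_mean(emb, mu))

# Average cosine similarity to the mean direction before and after.
u = mu / np.linalg.norm(mu)
print("raw:", np.mean(l2_normalize(emb) @ u))
print("R1: ", np.mean(r1 @ u))
print("R2: ", np.mean(r2 @ u))
```

The difference between the two is that R1 shifts every embedding by the same vector, while R2 removes a per-embedding amount so that each result is exactly orthogonal to the mean direction.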

MEnvAgent: Scalable Polyglot Environment Construction for Verifiable Software Engineering

arXiv:2601.22859v3 Announce Type: replace-cross Abstract: The evolution of Large Language Model (LLM) agents for software engineering (SWE) is constrained by the scarcity of verifiable datasets, a bottleneck stemming from the complexity of constructing executable environments across diverse languages. To address this, we introduce MEnvAgent, a Multi-language framework for automated Environment construction that facilitates scalable generation […]

June 9, 2026

Train at Moving Edge: Online-Verified Prompt Selection for Efficient RL Training of Large Reasoning Model

arXiv:2603.25184v2 Announce Type: replace-cross Abstract: Reinforcement learning (RL) has become essential for post-training large language models (LLMs) in reasoning tasks. While scaling rollouts can stabilize training and enhance performance, the computational overhead is a critical issue. In algorithms like GRPO, multiple rollouts per prompt incur prohibitive costs, as a large portion of prompts provide negligible […]

June 9, 2026
