June 2, 2026 – Page 22 – dijee Pharma Intelligence

Revisiting Reinforcement Learning with Verifiable Rewards from a Contrastive Perspective

arXiv:2605.12969v3 Announce Type: replace-cross Abstract: Group Relative Policy Optimization (GRPO) is one of the most widely adopted RLVR algorithms for post-training large language models on reasoning tasks. We first show that GRPO admits an equivalent discriminative reformulation, in which policy optimization maximizes the expected score gap between verified positive and negative rollouts. This reformulation reveals […]

June 2, 2026

DOT-MoE: Differentiable Optimal Transport for MoEfication

arXiv:2606.01666v1 Announce Type: cross Abstract: The scaling of Large Language Models (LLMs) has driven significant performance gains but created substantial challenges in inference efficiency. While Mixture-of-Experts (MoE) architectures address this by decoupling model size from inference cost, training MoEs from scratch is often unstable and compute-intensive. Conversion of pre-trained dense models into […]

June 2, 2026

FLARE: Diffusion for Hybrid Language Model

arXiv:2606.01774v1 Announce Type: cross Abstract: Autoregressive (AR) large language models (LLMs) have achieved broad practical success, but sequential decoding remains a key bottleneck for low-latency deployment. Recent efficient-inference work has progressed along two axes: reducing the cost of each model invocation through efficient architectures, and reducing serial decoding steps through parallel generation. Hybrid attention backbones […]

June 2, 2026

Co-Fusion4D: Spatio-temporal Collaborative Fusion for Robust 3D Object Detection

arXiv:2605.20301v2 Announce Type: replace-cross Abstract: In autonomous driving, 3D object detection is essential for accurate perception and reliable decision-making. However, object motion and ego-motion often induce cross-frame spatiotemporal inconsistencies in BEV-based detectors, leading to temporal BEV feature misalignment and degraded spatiotemporal consistency. To address these challenges, we propose Co-Fusion4D, a unified framework that explicitly preserves […]

June 2, 2026

Agentic-J: An AI Agent for Biological Microscopy Image Analysis

arXiv:2606.02080v1 Announce Type: cross Abstract: Biological image analysis increasingly demands integration across heterogeneous tools, programming environments, and domain knowledge that few researchers can command simultaneously. We present Agentic-J, a containerised, multi-agent AI assistant, primarily for ImageJ/Fiji, that enables biologists to specify analysis tasks in natural language, from nuclei segmentation and cell tracking to multi-condition quantification. […]

June 2, 2026

Consistency Training while Mitigating Obfuscation via Rate Matching

arXiv:2606.02211v1 Announce Type: cross Abstract: Large language models are often influenced by extraneous input features, such as cues revealing a user’s preferred answer. Consistency training reduces this influence by training models to behave similarly across inputs with and without the extraneous feature. However, existing methods train for consistency over entire responses or internal activations, which […]

June 2, 2026

Topology as Logic: Structural Role Geometry Across Formal, Software, Biological, and Prebiotic Systems

arXiv:2606.02392v1 Announce Type: cross Abstract: We ask whether dependency topology correlates with functional load-bearing organization as recoverable geometry — not as a metaphor, but as a measurable structural property detectable by multilayer network analysis. Across seven independent substrates, we show that hub persistence and rank divergence under the Functional Proximity Law recover operational organization that […]

June 2, 2026

Modeling Depth Ambiguity: A Mixture-Density Representation for Flying-Point-Free Depth Estimation

arXiv:2606.02552v1 Announce Type: cross Abstract: Despite advances in depth estimation, flying points remain a persistent failure mode: near object boundaries, depth estimators often predict spurious 3D points in the empty space between foreground and background surfaces. We trace this artifact to a standard modeling choice: assigning each pixel a single depth hypothesis. At boundaries, a […]

June 2, 2026

EMoE: Training-Free Expert Disagreement for Uncertainty-Aware Text-to-Image Diffusion

arXiv:2505.13273v2 Announce Type: replace Abstract: Large text-to-image diffusion models rarely expose reliable signals of when a prompt is likely to produce a poorly aligned generation, especially when training data is undisclosed. We study whether expert disagreement inside pre-trained mixture-of-experts (MoE) diffusion models can serve as a reliable estimate for epistemic uncertainty. We introduce EMoE, a […]

June 2, 2026

On the Collapse of Generative Paths: A Criterion and Correction for Diffusion Steering

arXiv:2512.10339v2 Announce Type: replace Abstract: Inference-time steering adapts pretrained diffusion and flow models to new tasks without retraining, often utilizing ratio-of-densities constructions that reweight time-indexed marginals with fixed exponents. We identify Marginal Path Collapse, a failure mode in which the intermediate density defined by such compositions becomes non-normalizable despite valid endpoints. This collapse can arise […]

June 2, 2026

REAL: Resolving Knowledge Conflicts in Knowledge-Intensive Visual Question Answering via Reasoning-Pivot Alignment

arXiv:2602.14065v2 Announce Type: replace Abstract: Knowledge-intensive Visual Question Answering (KI-VQA) frequently suffers from severe knowledge conflicts caused by the inherent limitations of open-domain retrieval. However, existing paradigms face critical limitations due to the lack of generalizable conflict detection and intra-model constraint mechanisms to handle conflicting evidence. To address these challenges, we propose the REAL (Reasoning-Pivot […]

June 2, 2026

Process Reward Agents for Steering Knowledge-Intensive Reasoning

arXiv:2604.09482v2 Announce Type: replace Abstract: Reasoning in knowledge-intensive domains remains challenging as intermediate steps are often not locally verifiable: unlike math or code, evaluating step correctness may require synthesizing clues across large external knowledge sources. As a result, subtle errors can propagate through reasoning traces, potentially never to be detected. Prior work has proposed process […]

June 2, 2026
