May 8, 2026 – Page 13 – dijee Pharma Intelligence

Steering Visual Generation in Unified Multimodal Models with Understanding Supervision

arXiv:2605.05781v1 Announce Type: cross Abstract: Unified multimodal models are envisioned to bridge the gap between understanding and generation. Yet, to achieve competitive performance, state-of-the-art models adopt largely decoupled understanding and generation components. This design, while effective for individual tasks, weakens the connection required for mutual enhancement, leaving the potential synergy empirically uncertain. We propose to […]

May 8, 2026

AceGRPO: Adaptive Curriculum Enhanced Group Relative Policy Optimization for Autonomous Machine Learning Engineering

arXiv:2602.07906v5 Announce Type: replace-cross Abstract: Autonomous Machine Learning Engineering (MLE) requires agents to perform sustained, iterative optimization over long horizons. While recent LLM-based agents show promise, current prompt-based agents for MLE suffer from behavioral stagnation due to frozen parameters. Although Reinforcement Learning (RL) offers a remedy, applying it to MLE is hindered by prohibitive execution […]

May 8, 2026

CITE: Anytime-Valid Statistical Inference in LLM Self-Consistency

arXiv:2605.05873v1 Announce Type: cross Abstract: Large language models often improve reasoning by sampling multiple outputs and aggregating their final answers, but precise and efficient control of error levels remains a challenging task. In particular, deciding when to stop sampling remains difficult when the stopping rule is data-dependent and the set of possible answers is not […]

May 8, 2026

Intentionality is a Design Decision: Measuring Functional Intentionality for Accountable AI Systems

arXiv:2605.05475v1 Announce Type: new Abstract: As AI systems increasingly exhibit autonomous, goal-directed, and long-horizon behavior, users lack a standardized way to detect the degree to which a system functions like an intentional actor for governance and accountability purposes. This position paper defines intentionality not as consciousness, but as a behavioral profile characterized by purpose, foresight, […]

May 8, 2026

Quantum-enhanced Large Language Models on Quantum Hardware via Cayley Unitary Adapters

arXiv:2605.05914v1 Announce Type: cross Abstract: Large language models (LLMs) have transformed artificial intelligence, yet classical architectures impose a fundamental constraint: every trainable parameter demands classical memory that scales unfavourably with model size. Quantum computing offers a qualitatively different pathway, but practical demonstrations on real hardware have remained elusive for models of practical relevance. Here we […]

May 8, 2026

Bringing Up a Bilingual BabyLM: Investigating Multilingual Language Acquisition Using Small-Scale Models

arXiv:2603.29552v2 Announce Type: replace-cross Abstract: Multilingualism is incredibly common around the world, leading to many important theoretical and practical questions about how children learn multiple languages at once. For example, does multilingual acquisition lead to delays in learning? Are there better and worse ways to structure multilingual input? Many correlational studies address these questions, but […]

May 8, 2026

A Fine-Grained Understanding of Uniform Convergence for Halfspaces

arXiv:2605.06004v1 Announce Type: cross Abstract: We study the fine-grained uniform convergence behavior of halfspaces beyond worst-case VC bounds. For inhomogeneous halfspaces in $\mathbb{R}^d$ with $d \ge 2$, we show that standard first-order VC bounds are essentially tight: even consistent hypotheses can incur population error $\Theta(d\ln(n/d)/n)$, and in the agnostic setting the deviation scales as $\sqrt{\tau\ln(1/\tau)}$ at […]

May 8, 2026

LANTERN: LLM-Augmented Neurosymbolic Transfer with Experience-Gated Reasoning Networks

arXiv:2605.05478v1 Announce Type: new Abstract: Transfer learning in reinforcement learning (RL) seeks to accelerate learning in new tasks by leveraging knowledge from related sources. Existing neurosymbolic transfer methods, however, typically rely on manually specified task automata, assume a single source task, and use fixed knowledge-integration mechanisms that cannot adapt to varying source relevance. We propose […]

May 8, 2026

Optimal Transport for LLM Reward Modeling from Noisy Preference

arXiv:2605.06036v1 Announce Type: cross Abstract: Reward models are fundamental to Reinforcement Learning from Human Feedback (RLHF), yet real-world datasets are inevitably corrupted by noisy preference. Conventional training objectives tend to overfit these errors, while existing denoising approaches often rely on homogeneous noise assumptions that fail to capture the complexity of linguistic preferences. To handle these […]

May 8, 2026

Predictive and Prescriptive AI toward Optimizing Wildfire Suppression

arXiv:2605.04510v2 Announce Type: replace-cross Abstract: Intense wildfire seasons require critical prioritization decisions to allocate scarce suppression resources over a dispersed geographical area. This paper develops a predictive and prescriptive approach to jointly optimize crew assignments and wildfire suppression. The problem features a discrete resource-allocation structure with endogenous wildfire demand and non-linear wildfire dynamics. We formulate […]

May 8, 2026

Schedule-and-Calibrate: Utility-Guided Multi-Task Reinforcement Learning for Code LLMs

arXiv:2605.06111v1 Announce Type: cross Abstract: Reinforcement learning (RL) with verifiable rewards has proven effective at post-training LLMs for coding, yet deploying separate task-specific specialists incurs costs that scale with the number of tasks, motivating a unified multi-task RL (MTRL) approach. However, existing MTRL methods treat all coding tasks uniformly, relying on fixed data curricula under […]

May 8, 2026

FinRAG-12B: A Production-Validated Recipe for Grounded Question Answering in Banking

arXiv:2605.05482v1 Announce Type: new Abstract: Large language models (LLMs) are rapidly being adopted across various domains. However, their adoption in the banking industry faces resistance due to demands for high accuracy, regulatory compliance, and the need for verifiable and grounded responses. We present a unified, data-efficient framework for training grounded domain-specific LLMs that optimizes answer quality, […]

May 8, 2026
