arXiv:2604.17693v1 Announce Type: cross Abstract: In cooperative teams where agents act in a fixed order and share a single team reward, it is hard to know how much each agent contributed, and harder still when agents are updated one at a time because data collected earlier no longer reflects the new policies. We introduce the […]
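The abstract is truncated before the paper's method is named, but the credit-assignment problem it describes is often illustrated with difference (counterfactual) rewards: score each agent by how much the team reward drops when its action is replaced by a baseline. A minimal sketch of that standard idea (the toy `reward_fn` and baseline action are illustrative assumptions, not from the paper):

```python
def credit_by_counterfactual(reward_fn, joint_action, baseline):
    """Difference-reward credit: an agent's contribution is the team
    reward minus the reward obtained when that agent's action is
    replaced by a baseline. A generic illustration of counterfactual
    credit assignment, not this paper's method."""
    g = reward_fn(joint_action)
    credits = []
    for i in range(len(joint_action)):
        counterfactual = list(joint_action)
        counterfactual[i] = baseline
        credits.append(g - reward_fn(counterfactual))
    return credits

# Toy team reward: sum of per-agent action values.
print(credit_by_counterfactual(lambda a: sum(a), [3, 0, 5], 0))  # [3, 0, 5]
```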
SpidR-Adapt: A Universal Speech Representation Model for Few-Shot Adaptation
arXiv:2512.21204v2 Announce Type: replace-cross Abstract: Human infants, with only a few hundred hours of speech exposure, acquire basic units of new languages, highlighting a striking efficiency gap compared to the data-hungry self-supervised speech models. To address this gap, this paper introduces SpidR-Adapt for rapid adaptation of speech units to new languages using minimal unlabeled data. […]
Asymmetric-Loss-Guided Hybrid CNN-BiLSTM-Attention Model for Industrial RUL Prediction with Interpretable Failure Heatmaps
arXiv:2604.13459v2 Announce Type: replace-cross Abstract: Turbofan engine degradation under sustained operational stress necessitates robust prognostic systems capable of accurately estimating the Remaining Useful Life (RUL) of critical components. Existing deep learning approaches frequently fail to simultaneously capture multi-sensor spatial correlations and long-range temporal dependencies, while standard symmetric loss functions inadequately penalize the safety-critical error of […]
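The truncation cuts off which error the asymmetric loss targets; in RUL prognostics the safety-critical error is typically overestimation, since predicting more remaining life than exists delays maintenance past the true failure point. A minimal sketch of an asymmetric squared loss under that assumption (the weights are illustrative, not the paper's):

```python
def asymmetric_rul_loss(y_true, y_pred, w_over=2.0, w_under=1.0):
    """Squared error weighted more heavily when RUL is overestimated.
    The weights w_over/w_under are illustrative assumptions; the
    paper's exact loss is not shown in the truncated abstract."""
    err = y_pred - y_true  # positive => overestimated remaining life
    w = w_over if err > 0 else w_under
    return w * err ** 2

# Overestimating by 10 cycles costs twice as much as underestimating by 10.
print(asymmetric_rul_loss(100, 110))  # 200.0
print(asymmetric_rul_loss(100, 90))   # 100.0
```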
Countdown-Code: A Testbed for Studying The Emergence and Generalization of Reward Hacking in RLVR
arXiv:2603.07084v2 Announce Type: replace-cross Abstract: Reward hacking is a form of misalignment in which models overoptimize proxy rewards without genuinely solving the underlying task. Precisely measuring the occurrence of reward hacking remains challenging because true task rewards are often expensive or impossible to compute. We introduce Countdown-Code, a minimal environment where models can both solve a mathematical […]
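A testbed of this kind needs a verifiable proxy reward. As an illustration of what a Countdown-style verifier might look like (a generic sketch, not the paper's actual reward): score 1 only if the submitted expression reaches the target using the given numbers, each at most once, with basic arithmetic.

```python
import ast

def countdown_reward(expr: str, numbers: list[int], target: int) -> float:
    """Illustrative verifiable reward for a Countdown-style task:
    1.0 iff expr evaluates to target using each given number at most
    once with +, -, *, / only; else 0.0. Not the paper's exact reward."""
    allowed_ops = (ast.Add, ast.Sub, ast.Mult, ast.Div)
    try:
        tree = ast.parse(expr, mode="eval")
    except SyntaxError:
        return 0.0
    used = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Constant):
            if not isinstance(node.value, int):
                return 0.0
            used.append(node.value)
        elif isinstance(node, ast.BinOp):
            if not isinstance(node.op, allowed_ops):
                return 0.0
        elif not isinstance(node, (ast.Expression, *allowed_ops)):
            return 0.0  # reject names, calls, unary ops, etc.
    pool = list(numbers)
    for n in used:  # each provided number usable at most once
        if n in pool:
            pool.remove(n)
        else:
            return 0.0
    try:
        value = eval(compile(tree, "<expr>", "eval"), {"__builtins__": {}})
    except ZeroDivisionError:
        return 0.0
    return 1.0 if abs(value - target) < 1e-9 else 0.0

print(countdown_reward("(100 - 4) * 1", [100, 4, 1, 7], 96))  # 1.0
print(countdown_reward("96", [100, 4, 1, 7], 96))             # 0.0 (96 not given)
```

The second call shows the kind of shortcut a verifier must reject: writing the target as a literal instead of deriving it from the given numbers.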
SafeAnchor: Preventing Cumulative Safety Erosion in Continual Domain Adaptation of Large Language Models
arXiv:2604.17691v1 Announce Type: cross Abstract: Safety alignment in large language models is remarkably shallow: it is concentrated in the first few output tokens and reversible by fine-tuning on as few as 100 adversarial examples. This fragility becomes critical in real-world deployment, where models undergo sequential adaptation across domains such as medicine, law, and code, causing […]
Beyond Facts: Benchmarking Distributional Reading Comprehension in Large Language Models
arXiv:2604.06201v2 Announce Type: replace-cross Abstract: While most reading comprehension benchmarks for LLMs focus on factual information that can be answered by localizing specific textual evidence, many real-world tasks require understanding distributional information, such as population-level trends and preferences expressed across collections of text. We introduce Text2DistBench, a reading comprehension benchmark for evaluating LLMs’ ability to […]
On the Creativity of AI Agents
arXiv:2604.13242v2 Announce Type: replace-cross Abstract: Large language models (LLMs), particularly when integrated into agentic systems, have demonstrated human- and even superhuman-level performance across multiple domains. Whether these systems can truly be considered creative, however, remains a matter of debate, as conclusions heavily depend on the definitions, evaluation methods, and specific use cases employed. In this […]
Bridging the Reasoning Gap in Vietnamese with Small Language Models via Test-Time Scaling
arXiv:2604.17794v1 Announce Type: cross Abstract: The democratization of ubiquitous AI hinges on deploying sophisticated reasoning capabilities on resource-constrained devices. However, Small Language Models (SLMs) often face a “reasoning gap”, particularly in non-English languages like Vietnamese, where they struggle to maintain coherent chains of thought. This paper investigates Test-Time Scaling strategies for the Qwen3-1.7B architecture within […]
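One widely used test-time scaling strategy is self-consistency: sample several reasoning chains and majority-vote over the final answers. A minimal sketch with a deterministic toy sampler standing in for the SLM (the truncated abstract does not name the paper's exact strategies):

```python
from collections import Counter

def self_consistency(sample_fn, prompt: str, n: int = 8) -> str:
    """Best-of-N test-time scaling by majority vote: draw n candidate
    answers from the model and return the most frequent one. sample_fn
    stands in for a decoding call to the SLM (an assumption)."""
    answers = [sample_fn(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

# Deterministic toy sampler: 3 of 5 draws agree on "42".
draws = iter(["42", "7", "42", "42", "3"])
print(self_consistency(lambda _: next(draws), "21 * 2 = ?", n=5))  # 42
```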
Towards Intelligent Legal Document Analysis: CNN-Driven Classification of Case Law Texts
arXiv:2604.17674v1 Announce Type: cross Abstract: Legal practitioners and judicial institutions face an ever-growing volume of case-law documents characterised by formalised language, lengthy sentence structures, and highly specialised terminology, making manual triage both time-consuming and error-prone. This work presents a lightweight yet high-accuracy framework for citation-treatment classification that pairs lemmatisation-based preprocessing with subword-aware FastText embeddings and […]
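The subword-aware property of FastText embeddings comes from representing each word as a bag of character n-grams with boundary markers, so rare legal terms share subwords with common vocabulary. A small sketch of the n-gram extraction step (FastText's defaults use n from 3 to 6; the full model also keeps a vector for the whole word):

```python
def char_ngrams(word: str, n_min: int = 3, n_max: int = 6) -> list[str]:
    """FastText-style subword extraction: pad the word with boundary
    markers '<' and '>' and list all character n-grams with lengths
    in [n_min, n_max]."""
    w = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(w) - n + 1):
            grams.append(w[i:i + n])
    return grams

print(char_ngrams("overrule", 3, 4)[:4])  # ['<ov', 'ove', 'ver', 'err']
```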
Mix and Match: Context Pairing for Scalable Topic-Controlled Educational Summarisation
arXiv:2604.18087v1 Announce Type: cross Abstract: Topic-controlled summarisation enables users to generate summaries focused on specific aspects of source documents. This paper investigates a data augmentation strategy for training small language models (sLMs) to perform topic-controlled summarisation. We propose a pairwise data augmentation method that combines contexts from different documents to create contrastive training examples, enabling […]
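The pairing idea can be sketched concretely: for each document, splice in a passage from a document on a different topic as a distractor, while the target summary stays tied to the requested topic, so the model learns to ignore off-topic context. The field names below are illustrative assumptions, not the paper's schema:

```python
import random

def mix_and_match(docs: list[dict], seed: int = 0) -> list[dict]:
    """Pairwise context augmentation: append a distractor passage from
    a differently-topiced document to each document's text. Keys
    ('topic', 'text', 'summary') are illustrative, not the paper's."""
    rng = random.Random(seed)
    pairs = []
    for i, doc in enumerate(docs):
        others = [d for j, d in enumerate(docs)
                  if j != i and d["topic"] != doc["topic"]]
        if not others:
            continue
        distractor = rng.choice(others)
        pairs.append({
            "input": doc["text"] + "\n" + distractor["text"],
            "topic": doc["topic"],        # topic the summary must follow
            "target": doc["summary"],     # unchanged on-topic summary
        })
    return pairs

docs = [
    {"topic": "photosynthesis", "text": "Plants convert light...", "summary": "Light to sugar."},
    {"topic": "mitosis", "text": "Cells divide...", "summary": "One cell becomes two."},
]
for p in mix_and_match(docs):
    print(p["topic"], "->", p["target"])
```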
From Kinematics to Dynamics: Learning to Refine Hybrid Plans for Physically Feasible Execution
arXiv:2604.12474v2 Announce Type: replace-cross Abstract: In many robotic tasks, agents must traverse a sequence of spatial regions to complete a mission. Such problems are inherently mixed discrete-continuous, coupling a high-level action sequence with a physically feasible continuous trajectory. The resulting trajectory and action sequence must also satisfy problem constraints such as deadlines, time windows, and velocity […]
Dissecting AI Trading: Behavioral Finance and Market Bubbles
arXiv:2604.18373v1 Announce Type: cross Abstract: We study how AI agents form expectations and trade in experimental asset markets. Using a simulated open-call auction populated by autonomous Large Language Model (LLM) agents, we document three main findings. First, AI agents exhibit classic behavioral patterns: a pronounced disposition effect and recency-weighted extrapolative beliefs. Second, these individual-level patterns […]
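The disposition effect the abstract documents is conventionally measured Odean-style, by comparing the proportion of gains realized (PGR) against the proportion of losses realized (PLR): PGR > PLR means the trader sells winners too eagerly and holds losers. A minimal sketch over a toy trade log (the data schema is an illustrative assumption, not the paper's):

```python
def disposition_effect(trades: list[tuple[bool, bool]]) -> tuple[float, float]:
    """Odean-style (PGR, PLR) from (is_gain, was_sold) observations,
    one per position at each sell opportunity. The schema is an
    illustrative assumption about the experiment's data."""
    gains_real = sum(1 for g, s in trades if g and s)
    gains_paper = sum(1 for g, s in trades if g and not s)
    losses_real = sum(1 for g, s in trades if not g and s)
    losses_paper = sum(1 for g, s in trades if not g and not s)
    pgr = gains_real / max(gains_real + gains_paper, 1)
    plr = losses_real / max(losses_real + losses_paper, 1)
    return pgr, plr

# Toy agent sells 3 of 4 winners but only 1 of 4 losers.
trades = [(True, True)] * 3 + [(True, False)] + [(False, True)] + [(False, False)] * 3
print(disposition_effect(trades))  # (0.75, 0.25)
```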