MirrorBench: Evaluating Self-centric Intelligence in MLLMs by Introducing a Mirror
arXiv:2604.14785v1 Announce Type: new Abstract: Recent progress in Multimodal Large Language Models (MLLMs) has demonstrated remarkable advances in perception and reasoning, suggesting their potential for embodied intelligence. While recent studies have evaluated embodied MLLMs in interactive settings, current benchmarks mainly target capabilities to perceive, understand, and interact with external objects, lacking a systematic evaluation of […]
Heuristic Classification of Thoughts Prompting (HCoT): Integrating Expert System Heuristics for Structured Reasoning into Large Language Models
arXiv:2604.12390v2 Announce Type: replace Abstract: This paper addresses two limitations of large language models (LLMs) in solving complex problems: (1) their reasoning processes exhibit Bayesian-like stochastic generation, where each token is sampled from a context-dependent probability distribution, leading to inherently random decision trajectories rather than deterministic planning; (2) the reasoning and decision-making mechanisms are statically […]
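The "Bayesian-like stochastic generation" point can be made concrete: at each step the model induces a context-dependent distribution over the vocabulary and a token is sampled from it, so repeated runs trace different reasoning trajectories. A minimal sketch of that decoding loop follows; the logits_fn stand-in for the LLM is hypothetical and not part of the HCoT method itself.

    import math, random

    def sample_token(logits):
        # Softmax over the vocabulary: the context-dependent distribution.
        m = max(logits.values())
        exp = {tok: math.exp(v - m) for tok, v in logits.items()}
        z = sum(exp.values())
        # Sample rather than take the argmax, so each decision is stochastic.
        r, acc = random.random(), 0.0
        for tok, p in exp.items():
            acc += p / z
            if r <= acc:
                return tok
        return tok

    def generate(logits_fn, context, steps):
        # logits_fn(context) -> {token: logit} is a placeholder for the LLM.
        for _ in range(steps):
            context = context + [sample_token(logits_fn(context))]
        return context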
Sequence Search: Automated Sequence Design using Neural Architecture Search
arXiv:2604.14788v1 Announce Type: new Abstract: Developing an MR sequence is challenging and remains largely constrained by human intuition. Recently, AI-driven approaches have been proposed; however, most require an initial sequence for parameter optimization or extensive training datasets, limiting their general applicability. In this study, we propose “Sequence Search,” an automated sequence design framework based on […]
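The abstract is cut off before the search procedure itself; purely to illustrate what automated sequence design via architecture-style search can look like, here is a minimal random-search loop over a hypothetical parameter space. The parameter names and the score_fn objective are placeholders, not the authors' framework.

    import random

    def random_search(space, score_fn, budget=100, seed=0):
        # space: {param_name: candidate values}, e.g. flip angles or TE/TR choices.
        # score_fn: stand-in for whatever sequence-quality objective is optimized.
        rng = random.Random(seed)
        best, best_score = None, float("-inf")
        for _ in range(budget):
            candidate = {k: rng.choice(v) for k, v in space.items()}
            s = score_fn(candidate)
            if s > best_score:
                best, best_score = candidate, s
        return best, best_score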
SAGE Celer 2.6 Technical Card
arXiv:2604.14168v1 Announce Type: cross Abstract: We introduce SAGE Celer 2.6, the latest in our line of general-purpose Celer models from SAGEA. Celer 2.6 is available in 5B, 10B, and 27B parameter sizes and benefits from extensive architectural modifications and further pre-training on an undisclosed model. Using our Inverse Reasoning (IR) pipeline, SAGEA natively trains Celer […]
Internal Knowledge Without External Expression: Probing the Generalization Boundary of a Classical Chinese Language Model
arXiv:2604.14180v1 Announce Type: cross Abstract: We train a 318M-parameter Transformer language model from scratch on a curated corpus of 1.56 billion tokens of pure Classical Chinese, with zero English characters or Arabic numerals. Through systematic out-of-distribution (OOD) testing, we investigate whether the model can distinguish known from unknown inputs, and crucially, whether it can express […]
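The abstract does not say how "known" and "unknown" inputs are separated; one standard way to operationalize that distinction for a language model is a perplexity threshold, sketched below. The logprob_fn helper and the calibrated threshold are assumptions for illustration, not the paper's protocol.

    import math

    def perplexity(logprob_fn, tokens):
        # logprob_fn(prefix, token) -> log p(token | prefix); stand-in for the trained model.
        nll = 0.0
        for i, tok in enumerate(tokens):
            nll -= logprob_fn(tokens[:i], tok)
        return math.exp(nll / max(len(tokens), 1))

    def flag_unknown(logprob_fn, tokens, threshold):
        # Inputs whose perplexity exceeds a calibrated threshold are treated as out-of-distribution.
        return perplexity(logprob_fn, tokens) > threshold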
Pareto-Optimal Offline Reinforcement Learning via Smooth Tchebysheff Scalarization
arXiv:2604.13175v1 Announce Type: cross Abstract: Large language models can be aligned with human preferences through offline reinforcement learning (RL) on small labeled datasets. While single-objective alignment is well-studied, many real-world applications demand the simultaneous optimization of multiple conflicting rewards, e.g. optimizing both catalytic activity and specificity in protein engineering, or helpfulness and harmlessness for chatbots. […]
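For reference, the classical weighted Tchebysheff scalarization collapses multiple objectives into a single worst-case gap to an ideal point, and a common way to smooth the non-differentiable max is a log-sum-exp surrogate. The sketch below shows that generic construction; the weights, ideal point, and smoothing temperature mu are illustrative, and the paper's exact formulation may differ.

    import math

    def tchebysheff(rewards, weights, ideal):
        # Worst weighted gap to the ideal point: max_i w_i * (z*_i - r_i).
        return max(w * (z - r) for r, w, z in zip(rewards, weights, ideal))

    def smooth_tchebysheff(rewards, weights, ideal, mu=0.1):
        # Log-sum-exp smoothing of the max; differentiable for mu > 0.
        terms = [w * (z - r) / mu for r, w, z in zip(rewards, weights, ideal)]
        m = max(terms)
        return mu * (m + math.log(sum(math.exp(t - m) for t in terms)))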
When PCOS Meets Eating Disorders: An Explainable AI Approach to Detecting the Hidden Triple Burden
arXiv:2604.14356v1 Announce Type: cross Abstract: Women with polycystic ovary syndrome (PCOS) face substantially elevated risks of body image distress, disordered eating, and metabolic challenges, yet existing natural language processing approaches for detecting these conditions lack transparency and cannot identify co-occurring presentations. We developed small, open-source language models to automatically detect this triple burden in social […]
FocalLens: Visualizing Narratives through Focalization
arXiv:2604.14456v1 Announce Type: cross Abstract: Visualizing narratives is useful to writers to reflect on unfinished drafts and identify unintentional biases and inconsistencies. Literary scholars can use the visualizations to identify nuanced patterns and literary styles from written text. Current narrative visualization is limited to representing character and location co-occurrences in a timeline, omitting important and […]
Optimistic Policy Learning under Pessimistic Adversaries with Regret and Violation Guarantees
arXiv:2604.14243v1 Announce Type: cross Abstract: Real-world decision-making systems operate in environments where state transitions depend not only on the agent's actions, but also on exogenous factors outside its control, such as competing agents, environmental disturbances, or strategic adversaries. Formally, $s_{h+1} = f(s_h, a_h, \bar{a}_h) + \omega_h$, where $\bar{a}_h$ is the adversary/external action, $a_h$ is the agent's action, and $\omega_h$ is an […]
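To make the stated transition model concrete, one environment step under these dynamics can be sketched as below; the transition function f, the scalar state, and the Gaussian noise scale are placeholders rather than the paper's setup.

    import random

    def step(f, state, action, adversary_action, noise_std=0.1):
        # One transition s_{h+1} = f(s_h, a_h, abar_h) + omega_h, where abar_h is
        # outside the agent's control and omega_h is exogenous noise.
        omega = random.gauss(0.0, noise_std)
        return f(state, action, adversary_action) + omega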
CausalDetox: Causal Head Selection and Intervention for Language Model Detoxification
arXiv:2604.14602v1 Announce Type: cross Abstract: Large language models (LLMs) frequently generate toxic content, posing significant risks for safe deployment. Current mitigation strategies often degrade generation quality or require costly human annotation. We propose CAUSALDETOX, a framework that identifies and intervenes on the specific attention heads causally responsible for toxic generation. Using the Probability of Necessity […]
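The Probability of Necessity is Pearl's counterfactual quantity $\mathrm{PN} = P(Y_{X=0}=0 \mid X=1, Y=1)$; applied to an attention head, it asks how often a toxic output would not have been toxic had that head been ablated. A rough interventional proxy for such a per-head score is sketched below; the generate, ablate_and_generate, and is_toxic helpers are hypothetical stand-ins, not the paper's pipeline, and this estimate only approximates the true counterfactual.

    def pn_score(prompts, generate, ablate_and_generate, is_toxic, head):
        # Restrict to cases where the unablated model produced toxic output (X=1, Y=1),
        # then measure how often ablating the head (X=0) removes the toxicity (Y=0).
        toxic_cases = [p for p in prompts if is_toxic(generate(p))]
        if not toxic_cases:
            return 0.0
        flipped = sum(1 for p in toxic_cases if not is_toxic(ablate_and_generate(p, head)))
        return flipped / len(toxic_cases)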