MirrorBench: Evaluating Self-centric Intelligence in MLLMs by Introducing a Mirror
arXiv:2604.14785v1 Announce Type: new Abstract: Recent progress in Multimodal Large Language Models (MLLMs) has demonstrated remarkable advances in perception and reasoning, suggesting their potential for embodied intelligence. While recent studies have evaluated embodied MLLMs in interactive settings, current benchmarks mainly target capabilities to perceive, understand, and interact with external objects, lacking a systematic evaluation of […]
Heuristic Classification of Thoughts Prompting (HCoT): Integrating Expert System Heuristics for Structured Reasoning into Large Language Models
arXiv:2604.12390v2 Announce Type: replace Abstract: This paper addresses two limitations of large language models (LLMs) in solving complex problems: (1) their reasoning processes exhibit Bayesian-like stochastic generation, where each token is sampled from a context-dependent probability distribution, leading to inherently random decision trajectories rather than deterministic planning; (2) the reasoning and decision-making mechanisms are statically […]
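The "Bayesian-like stochastic generation" point can be made concrete: at each step the model induces a context-dependent distribution over the vocabulary and a token is sampled from it, so repeated runs trace different reasoning trajectories. A minimal sketch of that decoding loop follows; the logits_fn stand-in for the LLM is hypothetical and not part of the HCoT method itself.

    import math, random

    def sample_token(logits):
        # Softmax over the vocabulary: the context-dependent distribution.
        m = max(logits.values())
        exp = {tok: math.exp(v - m) for tok, v in logits.items()}
        z = sum(exp.values())
        # Sample rather than take the argmax, so each decision is stochastic.
        r, acc = random.random(), 0.0
        for tok, p in exp.items():
            acc += p / z
            if r <= acc:
                return tok
        return tok

    def generate(logits_fn, context, steps):
        # logits_fn(context) -> {token: logit} is a placeholder for the LLM.
        for _ in range(steps):
            context = context + [sample_token(logits_fn(context))]
        return context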
Sequence Search: Automated Sequence Design using Neural Architecture Search
arXiv:2604.14788v1 Announce Type: new Abstract: Developing an MR sequence is challenging and remains largely constrained by human intuition. Recently, AI-driven approaches have been proposed; however, most require an initial sequence for parameter optimization or extensive training datasets, limiting their general applicability. In this study, we propose “Sequence Search,” an automated sequence design framework based on […]
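The abstract is cut off before the search procedure itself; purely to illustrate what automated sequence design via architecture-style search can look like, here is a minimal random-search loop over a hypothetical parameter space. The parameter names and the score_fn objective are placeholders, not the authors' framework.

    import random

    def random_search(space, score_fn, budget=100, seed=0):
        # space: {param_name: candidate values}, e.g. flip angles or TE/TR choices.
        # score_fn: stand-in for whatever sequence-quality objective is optimized.
        rng = random.Random(seed)
        best, best_score = None, float("-inf")
        for _ in range(budget):
            candidate = {k: rng.choice(v) for k, v in space.items()}
            s = score_fn(candidate)
            if s > best_score:
                best, best_score = candidate, s
        return best, best_score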
SAGE Celer 2.6 Technical Card
arXiv:2604.14168v1 Announce Type: cross Abstract: We introduce SAGE Celer 2.6, the latest in our line of general-purpose Celer models from SAGEA. Celer 2.6 is available in 5B, 10B, and 27B parameter sizes and benefits from extensive architectural modifications and further pre-training on an undisclosed model. Using our Inverse Reasoning (IR) pipeline, SAGEA natively trains Celer […]
Internal Knowledge Without External Expression: Probing the Generalization Boundary of a Classical Chinese Language Model
arXiv:2604.14180v1 Announce Type: cross Abstract: We train a 318M-parameter Transformer language model from scratch on a curated corpus of 1.56 billion tokens of pure Classical Chinese, with zero English characters or Arabic numerals. Through systematic out-of-distribution (OOD) testing, we investigate whether the model can distinguish known from unknown inputs, and crucially, whether it can express […]
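The abstract does not say how "known" and "unknown" inputs are separated; one standard way to operationalize that distinction for a language model is a perplexity threshold, sketched below. The logprob_fn helper and the calibrated threshold are assumptions for illustration, not the paper's protocol.

    import math

    def perplexity(logprob_fn, tokens):
        # logprob_fn(prefix, token) -> log p(token | prefix); stand-in for the trained model.
        nll = 0.0
        for i, tok in enumerate(tokens):
            nll -= logprob_fn(tokens[:i], tok)
        return math.exp(nll / max(len(tokens), 1))

    def flag_unknown(logprob_fn, tokens, threshold):
        # Inputs whose perplexity exceeds a calibrated threshold are treated as out-of-distribution.
        return perplexity(logprob_fn, tokens) > threshold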
Pareto-Optimal Offline Reinforcement Learning via Smooth Tchebysheff Scalarization
arXiv:2604.13175v1 Announce Type: cross Abstract: Large language models can be aligned with human preferences through offline reinforcement learning (RL) on small labeled datasets. While single-objective alignment is well-studied, many real-world applications demand the simultaneous optimization of multiple conflicting rewards, e.g. optimizing both catalytic activity and specificity in protein engineering, or helpfulness and harmlessness for chatbots. […]
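For reference, the classical weighted Tchebysheff scalarization collapses multiple objectives into a single worst-case gap to an ideal point, and a common way to smooth the non-differentiable max is a log-sum-exp surrogate. The sketch below shows that generic construction; the weights, ideal point, and smoothing temperature mu are illustrative, and the paper's exact formulation may differ.

    import math

    def tchebysheff(rewards, weights, ideal):
        # Worst weighted gap to the ideal point: max_i w_i * (z*_i - r_i).
        return max(w * (z - r) for r, w, z in zip(rewards, weights, ideal))

    def smooth_tchebysheff(rewards, weights, ideal, mu=0.1):
        # Log-sum-exp smoothing of the max; differentiable for mu > 0.
        terms = [w * (z - r) / mu for r, w, z in zip(rewards, weights, ideal)]
        m = max(terms)
        return mu * (m + math.log(sum(math.exp(t - m) for t in terms)))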
When PCOS Meets Eating Disorders: An Explainable AI Approach to Detecting the Hidden Triple Burden
arXiv:2604.14356v1 Announce Type: cross Abstract: Women with polycystic ovary syndrome (PCOS) face substantially elevated risks of body image distress, disordered eating, and metabolic challenges, yet existing natural language processing approaches for detecting these conditions lack transparency and cannot identify co-occurring presentations. We developed small, open-source language models to automatically detect this triple burden in social […]
FocalLens: Visualizing Narratives through Focalization
arXiv:2604.14456v1 Announce Type: cross Abstract: Visualizing narratives is useful to writers to reflect on unfinished drafts and identify unintentional biases and inconsistencies. Literary scholars can use the visualizations to identify nuanced patterns and literary styles from written text. Current narrative visualization is limited to representing character and location co-occurrences in a timeline, omitting important and […]
Optimistic Policy Learning under Pessimistic Adversaries with Regret and Violation Guarantees
arXiv:2604.14243v1 Announce Type: cross Abstract: Real-world decision-making systems operate in environments where state transitions depend not only on the agent's actions, but also on exogenous factors outside its control, such as competing agents, environmental disturbances, or strategic adversaries. Formally, $s_{h+1} = f(s_h, a_h, \bar{a}_h) + \omega_h$, where $\bar{a}_h$ is the adversary/external action, $a_h$ is the agent's action, and $\omega_h$ is an […]
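To make the stated transition model concrete, one environment step under these dynamics can be sketched as below; the transition function f, the scalar state, and the Gaussian noise scale are placeholders rather than the paper's setup.

    import random

    def step(f, state, action, adversary_action, noise_std=0.1):
        # One transition s_{h+1} = f(s_h, a_h, abar_h) + omega_h, where abar_h is
        # outside the agent's control and omega_h is exogenous noise.
        omega = random.gauss(0.0, noise_std)
        return f(state, action, adversary_action) + omega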
CausalDetox: Causal Head Selection and Intervention for Language Model Detoxification
arXiv:2604.14602v1 Announce Type: cross Abstract: Large language models (LLMs) frequently generate toxic content, posing significant risks for safe deployment. Current mitigation strategies often degrade generation quality or require costly human annotation. We propose CAUSALDETOX, a framework that identifies and intervenes on the specific attention heads causally responsible for toxic generation. Using the Probability of Necessity […]
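The Probability of Necessity is Pearl's counterfactual quantity $\mathrm{PN} = P(Y_{X=0}=0 \mid X=1, Y=1)$; applied to an attention head, it asks how often a toxic output would not have been toxic had that head been ablated. A rough interventional proxy for such a per-head score is sketched below; the generate, ablate_and_generate, and is_toxic helpers are hypothetical stand-ins, not the paper's pipeline, and this estimate only approximates the true counterfactual.

    def pn_score(prompts, generate, ablate_and_generate, is_toxic, head):
        # Restrict to cases where the unablated model produced toxic output (X=1, Y=1),
        # then measure how often ablating the head (X=0) removes the toxicity (Y=0).
        toxic_cases = [p for p in prompts if is_toxic(generate(p))]
        if not toxic_cases:
            return 0.0
        flipped = sum(1 for p in toxic_cases if not is_toxic(ablate_and_generate(p, head)))
        return flipped / len(toxic_cases)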