April 23, 2026 – Page 11 – dijee Pharma Intelligence

Environmental Understanding Vision-Language Model for Embodied Agent

arXiv:2604.19839v1 Announce Type: cross Abstract: Vision-language models (VLMs) have shown strong perception and reasoning abilities for instruction-following embodied agents. However, despite these abilities and their generalization performance, they still face limitations in environmental understanding, often failing on interactions or relying on environment metadata during execution. To address this challenge, we propose a novel framework named […]

April 23, 2026

Efficient Test-Time Scaling of Multi-Step Reasoning by Probing Internal States of Large Language Models

arXiv:2511.06209v4 Announce Type: replace Abstract: LLMs can solve complex tasks by generating long, multi-step reasoning chains. Test-time scaling (TTS) can further improve LLM performance by sampling multiple variants of intermediate reasoning steps, verifying their correctness, and strategically choosing the best steps for continuation. However, existing verification approaches, such as Process Reward Models (PRMs), are computationally […]

April 23, 2026

DialToM: A Theory of Mind Benchmark for Forecasting State-Driven Dialogue Trajectories

arXiv:2604.20443v1 Announce Type: cross Abstract: Large Language Models (LLMs) have been shown to possess Theory of Mind (ToM) abilities. However, it remains unclear whether this stems from robust reasoning or spurious correlations. We introduce DialToM, a human-verified benchmark built from natural human dialogue using a multiple-choice framework. We evaluate not only mental state prediction (Literal […]

April 23, 2026

Transparent Screening for LLM Inference and Training Impacts

arXiv:2604.19757v1 Announce Type: cross Abstract: This paper presents a transparent screening framework for estimating inference and training impacts of current large language models under limited observability. The framework converts natural-language application descriptions into bounded environmental estimates and supports a comparative online observatory of current market models. Rather than claiming direct measurement for opaque proprietary services, […]

April 23, 2026

PromptEcho: Annotation-Free Reward from Vision-Language Models for Text-to-Image Reinforcement Learning

arXiv:2604.12652v2 Announce Type: replace-cross Abstract: Reinforcement learning (RL) can improve the prompt following capability of text-to-image (T2I) models, yet obtaining high-quality reward signals remains challenging: CLIP Score is too coarse-grained, while VLM-based reward models (e.g., RewardDance) require costly human-annotated preference data and additional fine-tuning. We propose PromptEcho, a reward construction method that requires no annotation […]

April 23, 2026

HiPO: Hierarchical Preference Optimization for Adaptive Reasoning in LLMs

arXiv:2604.20140v1 Announce Type: new Abstract: Direct Preference Optimization (DPO) is an effective framework for aligning large language models with human preferences, but it struggles with complex reasoning tasks. DPO optimizes for the likelihood of generating preferred over dispreferred responses in their entirety and lacks the granularity to provide feedback on subsections of many-step solutions typical […]

April 23, 2026

CHASM: Unveiling Covert Advertisements on Chinese Social Media

arXiv:2604.20511v1 Announce Type: cross Abstract: Current benchmarks for evaluating large language models (LLMs) in social media moderation completely overlook a serious threat: covert advertisements, which disguise themselves as regular posts to deceive and mislead consumers into making purchases, leading to significant ethical and legal concerns. In this paper, we present CHASM, a first-of-its-kind dataset […]

April 23, 2026

Improving Molecular Force Fields with Minimal Temporal Information

arXiv:2604.19806v1 Announce Type: cross Abstract: Accurate prediction of energy and forces for 3D molecular systems is one of the fundamental challenges at the core of AI for Science applications. Many powerful and data-efficient neural networks predict molecular energies and forces from single atomic configurations. However, one crucial aspect of the data generation process is rarely considered […]

April 23, 2026

EgoSelf: From Memory to Personalized Egocentric Assistant

arXiv:2604.19564v2 Announce Type: replace-cross Abstract: Egocentric assistants often rely on first-person view data to capture user behavior and context for personalized services. Since different users exhibit distinct habits, preferences, and routines, such personalization is essential for truly effective assistance. However, effectively integrating long-term user data for personalization remains a key challenge. To address this, we […]

April 23, 2026

Measuring Creativity in the Age of Generative AI: Distinguishing Human and AI-Generated Creative Performance in Hiring and Talent Systems

arXiv:2604.19799v1 Announce Type: cross Abstract: Generative AI is rapidly transforming how organizations create value and evaluate talent. While large language models enhance baseline output quality, they simultaneously introduce ambiguity in assessing human creativity, as observable artifacts may be partially or fully AI-generated. This paper reconceptualizes creativity as a distributional and process-based property that emerges under […]

April 23, 2026

The Tool-Overuse Illusion: Why Does LLM Prefer External Tools over Internal Knowledge?

arXiv:2604.19749v1 Announce Type: new Abstract: Equipping LLMs with external tools effectively addresses internal reasoning limitations. However, it introduces a critical yet under-explored phenomenon: tool overuse, the unnecessary use of tools during reasoning. In this paper, we first reveal that this phenomenon is pervasive across diverse LLMs. We then experimentally elucidate its underlying mechanisms through two key lenses: (1) […]

April 23, 2026

Assessing the Robustness of Climate Foundation Models under No-Analog Distribution Shifts

arXiv:2603.23043v2 Announce Type: replace-cross Abstract: The accelerating pace of climate change introduces profound non-stationarities that challenge the ability of Machine Learning based climate emulators to generalize beyond their training distributions. While these emulators offer computationally efficient alternatives to traditional Earth System Models, their reliability remains a potential bottleneck under “no-analog” future climate states, which we […]

April 23, 2026
