arXiv:2604.10517v1 Announce Type: new Abstract: Modern vision-language models achieve strong performance in static perception, but remain limited in the complex spatiotemporal reasoning required for embodied, egocentric tasks. A major source of failure is their reliance on temporal priors learned from passive video data, which often leads to spatiotemporal hallucinations and poor generalization in dynamic environments. […]
Latent Structure of Affective Representations in Large Language Models
arXiv:2604.07382v2 Announce Type: replace-cross Abstract: The geometric structure of latent representations in large language models (LLMs) is an active area of research, driven in part by its implications for model transparency and AI safety. Existing literature has focused mainly on general geometric and topological properties of the learnt representations, but due to a lack of […]
Preference-Agile Multi-Objective Optimization for Real-time Vehicle Dispatching
arXiv:2604.10664v1 Announce Type: new Abstract: Multi-objective optimization (MOO) has been widely studied in the literature because of its versatility for human-centered decision making in real-life applications. Recently, demand for dynamic MOO has been growing fast, as volatile market dynamics require real-time re-adjustment of the priorities assigned to different objectives. However, most existing studies focus either on deterministic MOO […]
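The preference re-adjustment this abstract describes can be illustrated with the simplest MOO device, weighted-sum scalarization: shifting the preference weights at runtime changes which candidate wins. This is a minimal sketch with hypothetical dispatch plans and objective values, not the paper's actual method.

```python
# Hypothetical example: scoring vehicle-dispatch plans on two objectives
# (wait time, fuel cost) under preference weights that change at runtime.
def scalarize(objectives, weights):
    """Collapse a vector of objective costs into one score via a weighted sum."""
    assert len(objectives) == len(weights)
    return sum(o * w for o, w in zip(objectives, weights))

# Two candidate plans with (wait_time, fuel_cost) costs; lower is better.
plans = {"A": (4.0, 2.0), "B": (2.0, 5.0)}

# Priorities are re-weighted in real time as conditions shift:
# equal weighting picks A, a wait-time-dominant weighting picks B.
for weights in [(0.5, 0.5), (0.9, 0.1)]:
    best = min(plans, key=lambda p: scalarize(plans[p], weights))
    print(weights, "->", best)
```

The point of the sketch is only that the optimum is preference-dependent, which is what makes real-time preference agility a nontrivial requirement.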
Generating Multiple-Choice Knowledge Questions with Interpretable Difficulty Estimation using Knowledge Graphs and Large Language Models
arXiv:2604.10748v1 Announce Type: cross Abstract: Generating multiple-choice questions (MCQs) with difficulty estimation remains challenging in automated MCQ-generation systems used in adaptive, AI-assisted education. This study proposes a novel methodology for generating MCQs with difficulty estimation from the input documents by utilizing knowledge graphs (KGs) and large language models (LLMs). Our approach uses an LLM to […]
When More Thinking Hurts: Overthinking in LLM Test-Time Compute Scaling
arXiv:2604.10739v1 Announce Type: new Abstract: Scaling test-time compute through extended chains of thought has become a dominant paradigm for improving large language model reasoning. However, existing research implicitly assumes that longer thinking always yields better results. This assumption remains largely unexamined. We systematically investigate how the marginal utility of additional reasoning tokens changes as compute […]
MoBiE: Efficient Inference of Mixture of Binary Experts under Post-Training Quantization
arXiv:2604.06798v2 Announce Type: replace-cross Abstract: Mixture-of-Experts (MoE) based large language models (LLMs) offer strong performance but suffer from high memory and computation costs. Weight binarization provides extreme efficiency, yet existing binary methods designed for dense LLMs struggle with MoE-specific issues, including cross-expert redundancy, task-agnostic importance estimation, and quantization-induced routing shifts. To this end, we propose […]
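For context on what "weight binarization" means here, the generic scheme behind binary LLM methods replaces each weight matrix with sign codes plus a per-row scale. This is a hedged sketch of that standard construction (as in classic binary-network work), not MoBiE's actual algorithm.

```python
import numpy as np

def binarize(W):
    """Approximate W ~= alpha * sign(W), with one scale alpha per output row.

    The mean absolute value is the L2-optimal scale for +/-1 codes,
    so storage drops to 1 bit per weight plus one scale per row.
    """
    alpha = np.abs(W).mean(axis=1, keepdims=True)
    B = np.sign(W)
    return alpha, B

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))
alpha, B = binarize(W)
W_hat = alpha * B  # 1-bit reconstruction of the dense matrix
print("relative reconstruction error:",
      np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```

The MoE-specific issues the abstract lists (cross-expert redundancy, routing shifts) arise precisely because this per-matrix approximation error interacts with expert routing.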
ZoomR: Memory Efficient Reasoning through Multi-Granularity Key Value Retrieval
arXiv:2604.10898v1 Announce Type: new Abstract: Large language models (LLMs) have shown great performance on complex reasoning tasks but often require generating long intermediate thoughts before reaching a final answer. During generation, LLMs rely on a key-value (KV) cache for autoregressive decoding. However, the memory footprint of the KV cache grows with output length. Prior work […]
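The linear growth of the KV cache with output length can be made concrete with a back-of-the-envelope calculation, assuming standard multi-head attention with fp16 keys and values. The model dimensions below are illustrative (a generic 7B-class configuration), not figures from the paper.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Memory for the KV cache: 2 tensors (K and V) per layer,
    each of shape [kv_heads, seq_len, head_dim]."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# Hypothetical 7B-class model: 32 layers, 32 KV heads, head_dim 128.
# At a 32k-token reasoning trace the cache alone is 16 GiB, and it
# doubles every time the chain of thought doubles in length.
gib = kv_cache_bytes(32, 32, 128, 32_768) / 2**30
print(f"{gib:.1f} GiB")
```

This is why long intermediate "thoughts" stress memory even when the final answer is short, motivating retrieval-style cache compression such as the multi-granularity KV retrieval described here.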
Deep-Reporter: Deep Research for Grounded Multimodal Long-Form Generation
arXiv:2604.10741v1 Announce Type: cross Abstract: Recent agentic search frameworks enable deep research via iterative planning and retrieval, reducing hallucinations and enhancing factual grounding. However, they remain text-centric, overlooking the multimodal evidence that characterizes real-world expert reports. We introduce a pressing task: multimodal long-form generation. Accordingly, we propose Deep-Reporter, a unified agentic framework for grounded multimodal […]
CFMS: A Coarse-to-Fine Multimodal Synthesis Framework for Enhanced Tabular Reasoning
arXiv:2604.10973v1 Announce Type: new Abstract: Reasoning over tabular data is a crucial capability for tasks like question answering and fact verification, as it requires models to comprehend both free-form questions and semi-structured tables. However, while methods like Chain-of-Thought (CoT) introduce reasoning chains, purely symbolic methods are inherently limited by their blindness to holistic visual patterns. […]
VideoStir: Understanding Long Videos via Spatio-Temporally Structured and Intent-Aware RAG
arXiv:2604.05418v2 Announce Type: replace-cross Abstract: Scaling multimodal large language models (MLLMs) to long videos is constrained by limited context windows. While retrieval-augmented generation (RAG) is a promising remedy by organizing query-relevant visual evidence into a compact context, most existing methods (i) flatten videos into independent segments, breaking their inherent spatio-temporal structure, and (ii) depend on […]
Min-$k$ Sampling: Decoupling Truncation from Temperature Scaling via Relative Logit Dynamics
arXiv:2604.11012v1 Announce Type: new Abstract: The quality of text generated by large language models depends critically on the decoding sampling strategy. While mainstream methods such as Top-$k$, Top-$p$, and Min-$p$ achieve a balance between diversity and accuracy through probability-space truncation, they share an inherent limitation: extreme sensitivity to the temperature parameter. Recent logit-space approaches like […]
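The probability-space truncation the abstract contrasts against can be sketched in a few lines. Below is a minimal Min-$p$-style filter, shown only to illustrate how truncation and temperature are entangled in probability space; it is not the paper's proposed Min-$k$ method.

```python
import numpy as np

def min_p_filter(logits, p=0.1, temperature=1.0):
    """Keep tokens whose probability is at least p times the top token's
    probability, then renormalize over the surviving tokens."""
    z = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(z - z.max())  # numerically stable softmax
    probs /= probs.sum()
    mask = probs >= p * probs.max()  # threshold scales with the top token
    kept = np.where(mask, probs, 0.0)
    return kept / kept.sum()

logits = [4.0, 3.0, 1.0, -2.0]
print(min_p_filter(logits, p=0.2))  # the low-probability tail is zeroed out
```

Because the softmax probabilities themselves depend on the temperature, the set of tokens that survives this truncation shifts as temperature changes, which is the coupling that logit-space approaches aim to decouple.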
Too Nice to Tell the Truth: Quantifying Agreeableness-Driven Sycophancy in Role-Playing Language Models
arXiv:2604.10733v1 Announce Type: cross Abstract: Large language models increasingly serve as conversational agents that adopt personas and role-play characters at user request. This capability, while valuable, raises concerns about sycophancy: the tendency to provide responses that validate users rather than prioritize factual accuracy. While prior work has established that sycophancy poses risks to AI safety […]