April 8, 2026 – Page 11 – dijee Pharma Intelligence

Watch Before You Answer: Learning from Visually Grounded Post-Training

arXiv:2604.05117v1 Announce Type: cross Abstract: It is critical for vision-language models (VLMs) to comprehensively understand visual, temporal, and textual cues. However, despite rapid progress in multimodal modeling, video understanding performance still lags behind text-based reasoning. In this work, we find that progress is even worse than previously assumed: commonly reported long video understanding benchmarks contain […]

April 8, 2026

From Use to Oversight: How Mental Models Influence User Behavior and Output in AI Writing Assistants

arXiv:2604.05166v1 Announce Type: cross Abstract: AI-based writing assistants are ubiquitous, yet little is known about how users’ mental models shape their use. We examine two types of mental models — functional or related to what the system does, and structural or related to how the system works — and how they affect control behavior — […]

April 8, 2026

XMark: Reliable Multi-Bit Watermarking for LLM-Generated Texts

arXiv:2604.05242v1 Announce Type: cross Abstract: Multi-bit watermarking has emerged as a promising solution for embedding imperceptible binary messages into Large Language Model (LLM)-generated text, enabling reliable attribution and tracing of malicious usage of LLMs. Despite recent progress, existing methods still face key limitations: some become computationally infeasible for large messages, while others suffer from a […]

April 8, 2026

DQA: Diagnostic Question Answering for IT Support

arXiv:2604.05350v1 Announce Type: cross Abstract: Enterprise IT support interactions are fundamentally diagnostic: effective resolution requires iterative evidence gathering from ambiguous user reports to identify an underlying root cause. While retrieval-augmented generation (RAG) provides grounding through historical cases, standard multi-turn RAG systems lack explicit diagnostic state and therefore struggle to accumulate evidence and resolve competing hypotheses […]

April 8, 2026

Your LLM Agent Can Leak Your Data: Data Exfiltration via Backdoored Tool Use

arXiv:2604.05432v1 Announce Type: cross Abstract: Tool-use large language model (LLM) agents are increasingly deployed to support sensitive workflows, relying on tool calls for retrieval, external API access, and session memory management. While prior research has examined various threats, the risk of systematic data exfiltration by backdoored agents remains underexplored. In this work, we present Back-Reveal, […]

April 8, 2026

Learned Elevation Models as a Lightweight Alternative to LiDAR for Radio Environment Map Estimation

arXiv:2604.05520v1 Announce Type: cross Abstract: Next-generation wireless systems such as 6G operate at higher frequency bands, making signal propagation highly sensitive to environmental factors such as buildings and vegetation. Accurate Radio Environment Map (REM) estimation is therefore increasingly important for effective network planning and operation. Existing methods, from ray-tracing simulators to deep learning generative […]

April 8, 2026

Analogical Reasoning as a Doctor: A Foundation Model for Gastrointestinal Endoscopy Diagnosis

arXiv:2604.05649v1 Announce Type: cross Abstract: Gastrointestinal diseases impose a growing global health burden, and endoscopy is a primary tool for early diagnosis. However, routine endoscopic image interpretation still suffers from missed lesions and limited efficiency. Although AI-assisted diagnosis has shown promise, existing models often lack generalizability, adaptability, robustness, and scalability because of limited medical data, […]

April 8, 2026

The Illusion of Latent Generalization: Bi-directionality and the Reversal Curse

arXiv:2604.04943v1 Announce Type: cross Abstract: The reversal curse describes a failure of autoregressive language models to retrieve a fact in reverse order (e.g., training on “$A > B$” but failing on “$B < A$”). Recent work shows that objectives with bidirectional supervision (e.g., bidirectional attention or masking-based reconstruction for decoder-only models) can mitigate the reversal […]

April 8, 2026

A mathematical theory of evolution for self-designing AIs

arXiv:2604.05142v1 Announce Type: new Abstract: As artificial intelligence systems (AIs) become increasingly produced by recursive self-improvement, a form of evolution may emerge, in which the traits of AI systems are shaped by the success of earlier AIs in designing and propagating their descendants. There is a rich mathematical theory modeling how behavioral traits are shaped […]

April 8, 2026

Learning to Retrieve from Agent Trajectories

arXiv:2604.04949v1 Announce Type: cross Abstract: Information retrieval (IR) systems have traditionally been designed and trained for human users, with learning-to-rank methods relying heavily on large-scale human interaction logs such as clicks and dwell time. With the rapid emergence of large language model (LLM) powered search agents, however, retrieval is increasingly consumed by agents rather than […]

April 8, 2026

Modeling Patient Care Trajectories with Transformer Hawkes Processes

arXiv:2604.05844v1 Announce Type: cross Abstract: Patient healthcare utilization consists of irregularly time-stamped events, such as outpatient visits, inpatient admissions, and emergency encounters, forming individualized care trajectories. Modeling these trajectories is crucial for understanding utilization patterns and predicting future care needs, but is challenging due to temporal irregularity and severe class imbalance. In this work, we […]

April 8, 2026

MG$^2$-RAG: Multi-Granularity Graph for Multimodal Retrieval-Augmented Generation

arXiv:2604.04969v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) mitigates hallucinations in Multimodal Large Language Models (MLLMs), yet existing systems struggle with complex cross-modal reasoning. Flat vector retrieval often ignores structural dependencies, while current graph-based methods rely on costly “translation-to-text” pipelines that discard fine-grained visual information. To address these limitations, we propose MG$^2$-RAG, a lightweight Multi-Granularity […]

April 8, 2026
