
WorldMAP: Bootstrapping Vision-Language Navigation Trajectory Prediction with Generative World Models

arXiv:2604.07957v1 Announce Type: new Abstract: Vision-language models (VLMs) and generative world models are opening new opportunities for embodied navigation. VLMs are increasingly used as direct planners or trajectory predictors, while world models support look-ahead reasoning by imagining future views. Yet predicting a reliable trajectory from a single egocentric observation remains challenging. Current VLMs often generate […]

April 10, 2026

How Far Are Large Multimodal Models from Human-Level Spatial Action? A Benchmark for Goal-Oriented Embodied Navigation in Urban Airspace

arXiv:2604.07973v1 Announce Type: new Abstract: Large multimodal models (LMMs) show strong visual-linguistic reasoning, but their capacity for spatial decision-making and action remains unclear. In this work, we investigate whether LMMs can achieve embodied spatial action like humans through a challenging scenario: goal-oriented navigation in urban 3D spaces. We first spend over 500 hours constructing a […]

April 10, 2026

CLEAR: Context Augmentation from Contrastive Learning of Experience via Agentic Reflection

arXiv:2604.07487v1 Announce Type: new Abstract: Large language model agents rely on effective model context to obtain task-relevant information for decision-making. Many existing context engineering approaches rely primarily on context generated from past experience and on retrieval mechanisms that reuse this context. However, retrieved context from past tasks must be adapted by the execution agent […]

April 10, 2026

Evaluating Counterfactual Explanation Methods on Incomplete Inputs

arXiv:2604.08004v1 Announce Type: new Abstract: Existing algorithms for generating Counterfactual Explanations (CXs) for Machine Learning (ML) typically assume fully specified inputs. However, real-world data often contains missing values, and the impact of these incomplete inputs on the performance of existing CX methods remains unexplored. To address this gap, we systematically evaluate recent CX generation methods […]

April 10, 2026

“Why This Avoidance Maneuver?” Contrastive Explanations in Human-Supervised Maritime Autonomous Navigation

arXiv:2604.08032v1 Announce Type: new Abstract: Automated maritime collision avoidance will rely on human supervision for the foreseeable future. This necessitates transparency into how the system perceives a scenario and plans a maneuver. However, the causal logic behind avoidance maneuvers is often complex and difficult to convey to a navigator. This paper explores how to explain […]

April 10, 2026

ConsistRM: Improving Generative Reward Models via Consistency-Aware Self-Training

arXiv:2604.07484v1 Announce Type: new Abstract: Generative reward models (GRMs) have emerged as a promising approach for aligning Large Language Models (LLMs) with human preferences by offering greater representational capacity and flexibility than traditional scalar reward models. However, GRMs face two major challenges: reliance on costly human-annotated data restricts scalability, and self-training approaches often suffer from […]

April 10, 2026

ImplicitMemBench: Measuring Unconscious Behavioral Adaptation in Large Language Models

arXiv:2604.08064v1 Announce Type: new Abstract: Existing memory benchmarks for LLM agents evaluate explicit recall of facts, yet overlook implicit memory where experience becomes automated behavior without conscious retrieval. This gap is critical: effective assistants must automatically apply learned procedures or avoid failed actions without explicit reminders. We introduce ImplicitMemBench, the first systematic benchmark evaluating implicit […]

April 10, 2026

Comparative Evaluation of Embedding Representations for Financial News Sentiment Analysis

arXiv:2512.13749v2 Announce Type: replace-cross Abstract: Financial sentiment analysis enhances market understanding. However, standard Natural Language Processing (NLP) approaches encounter significant challenges when applied to small datasets. This study presents a comparative evaluation of embedding-based techniques for financial news sentiment classification in resource-constrained environments. Word2Vec, GloVe, and sentence transformer representations are evaluated in combination with gradient […]

April 10, 2026

Let the Agent Steer: Closed-Loop Ranking Optimization via Influence Exchange

arXiv:2603.27765v3 Announce Type: replace Abstract: Recommendation ranking is fundamentally an influence allocation problem: a sorting formula distributes ranking influence among competing factors, and the business outcome depends on finding the optimal “exchange rates” among them. However, offline proxy metrics systematically misjudge how influence reallocation translates to online impact, with asymmetric bias across metrics that a […]

April 10, 2026

One Shot Dominance: Knowledge Poisoning Attack on Retrieval-Augmented Generation Systems

arXiv:2505.11548v4 Announce Type: replace-cross Abstract: Large Language Models (LLMs) enhanced with Retrieval-Augmented Generation (RAG) have shown improved performance in generating accurate responses. However, the dependence on external knowledge bases introduces potential security vulnerabilities, particularly when these knowledge bases are publicly accessible and modifiable. While previous studies have exposed knowledge poisoning risks in RAG systems, existing […]

April 10, 2026

A systematic framework for generating novel experimental hypotheses from language models

arXiv:2408.05086v3 Announce Type: replace-cross Abstract: Neural language models (LMs) have been shown to capture complex linguistic patterns, yet their utility in understanding human language and, more broadly, human cognition remains debated. While existing work in this area often evaluates human-machine alignment, few studies attempt to translate findings from this enterprise into novel insights about humans. […]

April 10, 2026

WebArbiter: A Principle-Guided Reasoning Process Reward Model for Web Agents

arXiv:2601.21872v2 Announce Type: replace Abstract: Web agents hold great potential for automating complex computer tasks, yet their interactions involve long-horizon, sequential decision-making with irreversible actions. In such settings, outcome-based supervision is sparse and delayed, often rewarding incorrect trajectories and failing to support inference-time scaling. This motivates the use of Process Reward Models (WebPRMs) for web […]

April 10, 2026
