January 29, 2026 – DIJEE Pharma Intelligence

In-context Language Learning for Endangered Languages in Speech Recognition

arXiv:2505.20445v5 Announce Type: replace-cross Abstract: With approximately 7,000 languages spoken worldwide, current large language models (LLMs) support only a small subset. Prior research indicates LLMs can learn new languages for certain tasks without supervised data. We extend this investigation to speech recognition, asking whether LLMs can learn unseen, low-resource languages through in-context learning (ICL). With […]

January 29, 2026

Understanding Post-Training Structural Changes in Large Language Models

arXiv:2509.17866v3 Announce Type: replace-cross Abstract: Post-training fundamentally alters the behavior of large language models (LLMs), yet its impact on the internal parameter space remains poorly understood. In this work, we conduct a systematic singular value decomposition (SVD) analysis of principal linear layers in pretrained LLMs, focusing on two widely adopted post-training methods: instruction tuning and […]

January 29, 2026

DGRAG: Distributed Graph-based Retrieval-Augmented Generation in Edge-Cloud Systems

arXiv:2505.19847v2 Announce Type: replace Abstract: Retrieval-Augmented Generation (RAG) improves factuality by grounding LLMs in external knowledge, yet conventional centralized RAG requires aggregating distributed data, raising privacy risks and incurring high retrieval latency and cost. We present DGRAG, a distributed graph-driven RAG framework for edge-cloud collaborative systems. Each edge device organizes local documents into a knowledge […]

January 29, 2026

OPERA: A Reinforcement Learning–Enhanced Orchestrated Planner-Executor Architecture for Reasoning-Oriented Multi-Hop Retrieval

arXiv:2508.16438v3 Announce Type: replace-cross Abstract: Recent advances in large language models (LLMs) and dense retrievers have driven significant progress in retrieval-augmented generation (RAG). However, existing approaches face significant challenges in complex reasoning-oriented multi-hop retrieval tasks: 1) Ineffective reasoning-oriented planning: Prior methods struggle to generate robust multi-step plans for complex queries, as rule-based decomposers perform poorly […]

January 29, 2026

Quantifying Fidelity: A Decisive Feature Approach to Comparing Synthetic and Real Imagery

arXiv:2512.16468v3 Announce Type: replace Abstract: Virtual testing using synthetic data has become a cornerstone of autonomous vehicle (AV) safety assurance. Despite progress in improving visual realism through advanced simulators and generative AI, recent studies reveal that pixel-level fidelity alone does not ensure reliable transfer from simulation to the real world. What truly matters is whether […]

January 29, 2026

Deep SPI: Safe Policy Improvement via World Models

arXiv:2510.12312v2 Announce Type: replace-cross Abstract: Safe policy improvement (SPI) offers theoretical control over policy updates, yet existing guarantees largely concern offline, tabular reinforcement learning (RL). We study SPI in general online settings, when combined with world model and representation learning. We develop a theoretical framework showing that restricting policy updates to a well-defined neighborhood of […]

January 29, 2026

Endogenous Reprompting: Self-Evolving Cognitive Alignment for Unified Multimodal Models

arXiv:2601.20305v1 Announce Type: new Abstract: Unified Multimodal Models (UMMs) exhibit strong understanding, yet this capability often fails to effectively guide generation. We identify this as a Cognitive Gap: the model lacks the understanding of how to enhance its own generation process. To bridge this gap, we propose Endogenous Reprompting, a mechanism that transforms the model’s […]

January 29, 2026

Mitigating LLM Hallucination via Behaviorally Calibrated Reinforcement Learning

arXiv:2512.19920v3 Announce Type: replace-cross Abstract: LLM deployment in critical domains is currently impeded by persistent hallucinations, that is, the generation of plausible but factually incorrect assertions. While scaling laws drove significant improvements in general capabilities, theoretical frameworks suggest hallucination is not merely stochastic error but a predictable statistical consequence of training objectives that prioritize mimicking the data distribution over epistemic honesty. Standard […]

January 29, 2026

QueerGen: How LLMs Reflect Societal Norms on Gender and Sexuality in Sentence Completion Tasks

arXiv:2601.20731v1 Announce Type: cross Abstract: This paper examines how Large Language Models (LLMs) reproduce societal norms, particularly heterocisnormativity, and how these norms translate into measurable biases in their text generations. We investigate whether explicit information about a subject’s gender or sexuality influences LLM responses across three subject categories: queer-marked, non-queer-marked, and the normalized “unmarked” category. […]

January 29, 2026

Lost in Simulation: LLM-Simulated Users are Unreliable Proxies for Human Users in Agentic Evaluations

arXiv:2601.17087v2 Announce Type: replace-cross Abstract: Agentic benchmarks increasingly rely on LLM-simulated users to scalably evaluate agent performance, yet the robustness, validity, and fairness of this approach remain unexamined. Through a user study with participants across the United States, India, Kenya, and Nigeria, we investigate whether LLM-simulated users serve as reliable proxies for real human users […]

January 29, 2026

Towards Intelligent Urban Park Development Monitoring: LLM Agents for Multi-Modal Information Fusion and Analysis

arXiv:2601.20206v1 Announce Type: new Abstract: As an important part of urbanization, the development monitoring of newly constructed parks is of great significance for evaluating the effect of urban planning and optimizing resource allocation. However, traditional change detection methods based on remote sensing imagery have obvious limitations in high-level and intelligent analysis, and thus are difficult […]

January 29, 2026

Adapting the Behavior of Reinforcement Learning Agents to Changing Action Spaces and Reward Functions

arXiv:2601.20714v1 Announce Type: cross Abstract: Reinforcement Learning (RL) agents often struggle in real-world applications where environmental conditions are non-stationary, particularly when reward functions shift or the available action space expands. This paper introduces MORPHIN, a self-adaptive Q-learning framework that enables on-the-fly adaptation without full retraining. By integrating concept drift detection with dynamic adjustments to learning […]

January 29, 2026
