January 27, 2026 – Page 5 – DIJEE Pharma Intelligence

TriPlay-RL: Tri-Role Self-Play Reinforcement Learning for LLM Safety Alignment

arXiv:2601.18292v1 Announce Type: cross Abstract: In recent years, safety risks associated with large language models have become increasingly prominent, highlighting the urgent need to mitigate the generation of toxic and harmful content. The mainstream paradigm for LLM safety alignment typically adopts a collaborative framework involving three roles: an attacker for adversarial prompt generation, a defender […]

January 27, 2026

Temp-R1: A Unified Autonomous Agent for Complex Temporal KGQA via Reverse Curriculum Reinforcement Learning

arXiv:2601.18296v1 Announce Type: cross Abstract: Temporal Knowledge Graph Question Answering (TKGQA) is inherently challenging, as it requires sophisticated reasoning over dynamic facts with multi-hop dependencies and complex temporal constraints. Existing methods rely on fixed workflows and expensive closed-source APIs, limiting flexibility and scalability. We propose Temp-R1, the first autonomous end-to-end agent for TKGQA trained through […]

January 27, 2026

Online parameter estimation for the Crazyflie quadcopter through an EM algorithm

arXiv:2601.17009v1 Announce Type: new Abstract: Drones are becoming more and more popular nowadays. They are small in size, low in cost, and reliable in operation. They contain a variety of sensors and can perform a variety of flight tasks, reaching places that are difficult or inaccessible for humans. Earthquakes damage a lot of infrastructure, making […]

January 27, 2026

Co-PLNet: A Collaborative Point-Line Network for Prompt-Guided Wireframe Parsing

arXiv:2601.18252v1 Announce Type: cross Abstract: Wireframe parsing aims to recover line segments and their junctions to form a structured geometric representation useful for downstream tasks such as Simultaneous Localization and Mapping (SLAM). Existing methods predict lines and junctions separately and reconcile them post-hoc, causing mismatches and reduced robustness. We present Co-PLNet, a point-line collaborative framework […]

January 27, 2026

The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

arXiv:2601.15165v2 Announce Type: replace-cross Abstract: Diffusion Large Language Models (dLLMs) break the rigid left-to-right constraint of traditional LLMs, enabling token generation in arbitrary orders. Intuitively, this flexibility implies a solution space that strictly supersets the fixed autoregressive trajectory, theoretically unlocking superior reasoning potential for general tasks like mathematics and coding. Consequently, numerous works have leveraged […]

January 27, 2026

Augmenting Question Answering with A Hybrid RAG Approach

arXiv:2601.12658v2 Announce Type: replace-cross Abstract: Retrieval-Augmented Generation (RAG) has emerged as a powerful technique for enhancing the quality of responses in Question-Answering (QA) tasks. However, existing approaches often struggle with retrieving contextually relevant information, leading to incomplete or suboptimal answers. In this paper, we introduce Structured-Semantic RAG (SSRAG), a hybrid architecture that enhances QA quality […]

January 27, 2026

Neural Network Approximation: A View from Polytope Decomposition

arXiv:2601.18264v1 Announce Type: cross Abstract: Universal approximation theory offers a foundational framework to verify neural network expressiveness, enabling principled utilization in real-world applications. However, most existing theoretical constructions are established by uniformly dividing the input space into tiny hypercubes without considering the local regularity of the target function. In this work, we investigate the universal […]

January 27, 2026

MemoryRewardBench: Benchmarking Reward Models for Long-Term Memory Management in Large Language Models

arXiv:2601.11969v2 Announce Type: replace-cross Abstract: Existing works increasingly adopt memory-centric mechanisms to process long contexts in a segment manner, and effective memory management is one of the key capabilities that enables large language models to effectively propagate information across the entire sequence. Therefore, leveraging reward models (RMs) to automatically and reliably evaluate memory quality is […]

January 27, 2026

HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding

arXiv:2601.14724v2 Announce Type: replace-cross Abstract: Recent advancements in Multimodal Large Language Models (MLLMs) have demonstrated significant improvement in offline video understanding. However, extending these capabilities to streaming video inputs remains challenging, as existing models struggle to simultaneously maintain stable understanding performance, real-time responses, and low GPU memory overhead. To address this challenge, we propose HERMES, […]

January 27, 2026

PRISM-CAFO: Prior-conditioned Remote-sensing Infrastructure Segmentation and Mapping for CAFOs

arXiv:2601.11451v2 Announce Type: replace-cross Abstract: Large-scale livestock operations pose significant risks to human health and the environment, while also being vulnerable to threats such as infectious diseases and extreme weather events. As the number of such operations continues to grow, accurate and scalable mapping has become increasingly important. In this work, we present an infrastructure-first, […]

January 27, 2026

Beyond Single-Granularity Prompts: A Multi-Scale Chain-of-Thought Prompt Learning for Graph

arXiv:2510.09394v4 Announce Type: replace-cross Abstract: The “pre-train, prompt” paradigm, designed to bridge the gap between pre-training tasks and downstream objectives, has been extended from the NLP domain to the graph domain and has achieved remarkable progress. Current mainstream graph prompt-tuning methods modify input or output features using learnable prompt vectors. However, existing approaches are confined […]

January 27, 2026

Explaining Synergistic Effects in Social Recommendations

arXiv:2601.18151v1 Announce Type: cross Abstract: In social recommenders, the inherent nonlinearity and opacity of synergistic effects across multiple social networks hinder users from understanding how diverse information is leveraged for recommendations, consequently diminishing explainability. However, existing explainers can only identify the topological information in social networks that significantly influences recommendations, failing to further explain the […]

January 27, 2026
