Meta-RL Induces Exploration in Language Agents

arXiv:2512.16848v2 Announce Type: replace-cross Abstract: Reinforcement learning (RL) has enabled the training of large language model (LLM) agents to interact with the environment and to solve multi-turn long-horizon tasks. However, the RL-trained agents often struggle in tasks that require active exploration and fail to efficiently adapt from trial-and-error experiences. In this paper, we present LaMer, […]

Condition-Gated Reasoning for Context-Dependent Biomedical Question Answering

arXiv:2602.17911v2 Announce Type: replace-cross Abstract: Current biomedical question answering (QA) systems often assume that medical knowledge applies uniformly, yet real-world clinical reasoning is inherently conditional: nearly every decision depends on patient-specific factors such as comorbidities and contraindications. Existing benchmarks do not evaluate such conditional reasoning, and retrieval-augmented or graph-based methods lack explicit mechanisms to ensure […]

Gradually Excavating External Knowledge for Implicit Complex Question Answering

arXiv:2603.08148v1 Announce Type: cross Abstract: Recently, large language models (LLMs) have gained much attention for the emergence of human-comparable capabilities and huge potential. However, for open-domain implicit question-answering problems, LLMs may not be the ultimate solution, for two reasons: 1) uncovered or out-of-date domain knowledge, and 2) one-shot generation and hence restricted comprehensiveness. To […]

How Much Do LLMs Hallucinate in Document Q&A Scenarios? A 172-Billion-Token Study Across Temperatures, Context Lengths, and Hardware Platforms

arXiv:2603.08274v1 Announce Type: cross Abstract: How much do large language models actually hallucinate when answering questions grounded in provided documents? Despite the critical importance of this question for enterprise AI deployments, reliable measurement has been hampered by benchmarks that rely on static datasets vulnerable to contamination, LLM-based judges with documented biases, or evaluation scales too […]

Towards plausibility in time series counterfactual explanations

arXiv:2603.08349v1 Announce Type: cross Abstract: We present a new method for generating plausible counterfactual explanations for time series classification problems. The approach performs gradient-based optimization directly in the input space. To enforce plausibility, we integrate soft-DTW (dynamic time warping) alignment with $k$-nearest neighbors from the target class, which effectively encourages the generated counterfactuals to adopt […]

First-Order Geometry, Spectral Compression, and Structural Compatibility under Bounded Computation

arXiv:2603.08494v1 Announce Type: cross Abstract: Optimization under structural constraints is typically analyzed through projection or penalty methods, obscuring the geometric mechanism by which constraints shape admissible dynamics. We propose an operator-theoretic formulation in which computational or feasibility limitations are encoded by self-adjoint operators defining locally reachable subspaces. In this setting, the optimal first-order improvement direction […]

Mathematical modeling of glioma invasion and therapy approaches via kinetic theory of active particles

arXiv:2203.11578v3 Announce Type: replace Abstract: We propose here a multiscale model to study the effect of combined therapies on glioma spread in the brain under the influence of vascularization. The model accounts for the interplay between the different components of the neoplasm and the healthy tissue, and it investigates and compares various therapy approaches. Precisely, […]

CITS: Nonparametric Statistical Causal Modeling for High-Resolution Neural Time Series

arXiv:2508.01920v2 Announce Type: replace Abstract: Identifying causal interactions in complex dynamical systems is a fundamental challenge across the computational sciences. Existing functional connectivity methods capture correlations but not causation. While addressing directionality, popular causal inference tools such as Granger causality and the Peter-Clark algorithm rely on restrictive assumptions that limit their applicability to high-resolution time-series […]

Real-Time Aligned Reward Model beyond Semantics

arXiv:2601.22664v3 Announce Type: replace Abstract: Reinforcement Learning from Human Feedback (RLHF) is a pivotal technique for aligning large language models (LLMs) with human preferences, yet it is susceptible to reward overoptimization, in which policy models overfit to the reward model and exploit spurious reward patterns instead of faithfully capturing human intent. Prior mitigations primarily rely on […]

A Mathematical Theory of Agency and Intelligence

arXiv:2602.22519v2 Announce Type: replace Abstract: To operate reliably under changing conditions, complex systems require feedback on how effectively they use resources, not just whether objectives are met. Current AI systems process vast information to produce sophisticated predictions, yet predictions can appear successful while the underlying interaction with the environment degrades. What is missing is a […]

ViroGym: Realistic Large-Scale Benchmarks for Evaluating Viral Proteins

arXiv:2603.06740v1 Announce Type: new Abstract: Protein language models (pLMs) have shown strong potential in predicting the functional effects of missense variants in zero-shot settings. Despite this progress, benchmarking pLMs for viral proteins remains limited, and systematic strategies for integrating in silico metrics with in vitro validation to guide antigen and target selection are underdeveloped. […]

Online Neural Networks for Change-Point Detection

arXiv:2010.01388v2 Announce Type: replace-cross Abstract: Moments when a time series changes its behavior are called change points. The occurrence of a change point implies that the state of the system has changed, and its timely detection might help to prevent unwanted consequences. In this paper, we present two change-point detection approaches based on neural networks and online […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844