May 19, 2026 – Page 15 – dijee Pharma Intelligence

Microdroplets Fail to Retain Exhaled Volatile Biomarkers within a Single Breath

arXiv:2605.16356v1 Announce Type: new Abstract: Exhaled breath condensate (EBC) contains volatile metabolites and is promising for non-invasive disease diagnosis, but after decades of research spanning over 100 biomarkers and 10 diseases, no EBC-based test has reached clinical use. The measurement variability that can span orders of magnitude, far exceeding the clinically required 10%, has long […]

May 19, 2026

ReTAMamba: Reliability-Aware Temporal Aggregation with Mamba for Irregular Clinical Time Series Prediction

arXiv:2605.16380v1 Announce Type: cross Abstract: Clinical time-series data are difficult to model with methods designed for regular sequences because they exhibit irregular sampling, frequent missing values, and heterogeneous observation patterns across variables. Existing approaches commonly use observation masks and time-gap information, but they do not continuously capture the decaying reliability of past observations or consistently […]

May 19, 2026

When Does Non-Uniform Replay Matter in Reinforcement Learning?

arXiv:2605.10236v3 Announce Type: replace-cross Abstract: Modern off-policy reinforcement learning algorithms often rely on simple uniform replay sampling and it remains unclear when and why non-uniform replay improves over this strong baseline. Across diverse RL settings, we show that the effectiveness of non-uniform replay is governed by three factors: replay volume, the number of replayed transitions […]

May 19, 2026

ClawGym: A Scalable Framework for Building Effective Claw Agents

arXiv:2604.26904v3 Announce Type: replace-cross Abstract: Claw-style environments support multi-step workflows over local files, tools, and persistent workspace states. However, scalable development around these environments remains constrained by the absence of a systematic framework, especially one for synthesizing verifiable training data and integrating it with agent training and diagnostic evaluation. To address this challenge, we present […]

May 19, 2026

A More Word-like Image Tokenization for MLLMs

arXiv:2605.17954v1 Announce Type: cross Abstract: Modern multimodal large language models (MLLMs) typically keep the language model fixed and train a visual projector that maps the pixels into a sequence of tokens in its embedding space, so that images can be presented in essentially the same form as text. However, the language model has been optimized […]

May 19, 2026

Self-Improving Tabular Language Models via Iterative Reward-Guided Post-Training

arXiv:2604.18966v2 Announce Type: replace-cross Abstract: Tabular language models can generate synthetic tables by modeling rows as token sequences, but they are typically trained once with supervised fine-tuning and then used as static synthesizers. This is limiting because next-token likelihood does not directly optimize the distributional, utility, and indistinguishability properties used to evaluate synthetic data. We […]

May 19, 2026

PAREDA: A Multi-Accent Speech Dataset of Natural Language Processing Research Discussions

arXiv:2605.17860v1 Announce Type: cross Abstract: While modern Automatic Speech Recognition (ASR) systems achieve high accuracy on benchmark corpora, their performance often degrades when there is real-world variability. This work focuses on variability arising due to accented, spontaneous, and domain-specific speech. In particular, we introduce PAper REading DAtaset (PAREDA), a first-of-its-kind multi-accent speech dataset consisting of […]

May 19, 2026

StreamPro: From Reactive Perception to Proactive Decision-Making in Streaming Video

arXiv:2605.16381v1 Announce Type: cross Abstract: Proactive streaming video understanding requires models to continuously process video streams and decide when to respond, rather than merely what to respond. This naturally introduces a decision-making problem under partial observations, where models must balance early prediction against sufficient evidence. However, existing benchmarks largely follow a “see-then-answer” paradigm, where responses […]

May 19, 2026

Fre-Res: Frequency-Residual Video Token Compression for Efficient Video MLLMs

arXiv:2605.16366v1 Announce Type: cross Abstract: Video MLLMs face a persistent tension between spatial fidelity and temporal coverage: preserving fine-grained visual details requires many spatial tokens, while capturing short-lived events requires dense temporal sampling. We propose Fre-Res, a budget-adaptive dual-track video-token compression framework that separates these two forms of evidence. Fre-Res preserves sparse high-fidelity spatial anchors […]

May 19, 2026

LAST-RAG: Literature-Anchored Stochastic Trajectory Retrieval-Augmented Generation for Knowledge-Conditioned Degradation Model Selection

arXiv:2605.17902v1 Announce Type: new Abstract: Stochastic-process-based degradation modeling is a core approach for estimating the distribution of remaining useful life (RUL); however, the selection of an appropriate stochastic process has not been sufficiently addressed. Existing model selection methods mainly rely on the statistical fit of the observed health indicator (HI) trajectory, but this approach may […]

May 19, 2026

ANVIL: Analogies and Videos for Lecturers

arXiv:2605.16295v1 Announce Type: cross Abstract: We present ANVIL, a multimodal generative system that automates the production of analogy-based instructional animations for computer science topics. Given a concept definition, ANVIL generates a textual analogy, compiles it into a structured visual screenplay, and produces executable manim code to render an animation, with an automated repair mechanism to […]

May 19, 2026

LoopQ: Quantization for Recursive Transformers

arXiv:2605.16343v1 Announce Type: cross Abstract: Looped language models (LoopLMs) improve parameter efficiency by recursively reusing Transformer blocks, enabling deeper computation under a fixed model size. However, this reuse makes LoopLMs more fragile under post-training quantization (PTQ). We present the first systematic study of quantization in LoopLMs and identify three challenges: distribution shift across roles, state […]

May 19, 2026
