P^2O: Joint Policy and Prompt Optimization

arXiv:2603.21877v3. Reinforcement Learning with Verifiable Rewards (RLVR) enhances Large Language Model (LLM) reasoning but suffers from advantage collapse on “hard samples” where all rollouts fail. This lack of variance eliminates crucial learning signals. For these intractable samples, simply scaling up rollout budgets offers limited gains. We introduce Joint Policy and Prompt […]

Mochi: Aligning Pre-training and Inference for Efficient Graph Foundation Models via Meta-Learning

arXiv:2604.22031v2. We propose Mochi, a Graph Foundation Model that addresses task unification and training efficiency by adopting a meta-learning based training framework. Prior models pre-train with reconstruction-based objectives such as link prediction, and assume that the resulting representations can be aligned with downstream tasks through a separate unification step such as […]

HeadQ: Model-Visible Distortion and Score-Space Correction for KV-Cache Quantization

arXiv:2605.03562v2. KV-cache quantizers usually optimize storage-space reconstruction, even though attention reads keys through logits and values through attention-weighted readout. We argue that persistent cache error should be measured in model-visible coordinates. For keys, the visible object is score error modulo constant shifts; this yields HeadQ, a key-side method that stores a […]

What Happens Inside Agent Memory? Circuit Analysis from Emergence to Diagnosis

arXiv:2605.03354v2. Agent memory failures are silent: an LLM-based agent can produce a fluent response even when it fails to extract, retain, or retrieve the information needed across sessions. The write-manage-read loop describes the external pipeline of these systems but leaves open which internal computations implement each stage. Tracing feature circuits across […]

Open-Rosalind: Tool-First Biomedical LLM Agents with Process-Aware Benchmarking

Large language models are increasingly used as scientific agents, yet the flexibility that benefits general-purpose agents can conflict with the accountability required in biomedical research. We study whether biomedical agents can be organized around auditable constraints rather than unconstrained autonomy. We present Open-Rosalind, a tool-first bio-agent system designed around four operational principles: evidence-grounded outputs, trace […]

Variable transmission efficiency of mammalian origin HPAI D1.1 H5N1 strains in ferrets

Highly pathogenic avian influenza H5N1 2.3.4.4b genotype D1.1 lineage continues to predominate in the United States wild bird population and has spilled over into dairy cattle three independent times. To assess the transmission risk of this sublineage, we performed direct-contact transmission experiments for three distinct D1.1 strains in ferrets. Two of these strains were isolated […]

PromptBio-Bench: Benchmarking LLM-based Bioinformatics Agents for End-to-End Data Analysis

Large language model (LLM)-based agents hold transformative potential for automating bioinformatics workflows; however, systematic evaluations of their capabilities remain limited, hindering a clear assessment of their readiness for real-world application. We introduce PromptBio-Bench, a comprehensive evaluation suite of 194 expert-curated tasks spanning bioinformatics and data science at varied difficulty levels, and an evaluation framework for […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK. Registration number: 16808844.