Measuring and reducing surgical staff stress in a realistic operating room setting using EDA monitoring and smart hearing protection

BackgroundStress is a critical factor in the operating room (OR) and affects both the performance and well-being of surgical staff. Measuring and mitigating this stress

Bioethical considerations in deploying mobile mental health apps in LMIC settings: insights from the MITHRA pilot study in rural India

IntroductionIn India, untreated depression among women contributes significantly to morbidity and mortality, underscoring an urgent need for accessible and ethically grounded mental health interventions. Mobile

Assessing nurses’ attitudes toward artificial intelligence in Kazakhstan: psychometric validation of a nine-item scale

BackgroundArtificial intelligence (AI) is increasingly integrated into healthcare, yet the attitudes and knowledge of nurses, who are the key mediators of AI implementation, remain underexplored.

Retrieval Augmented Classification for Confidential Documents

arXiv:2604.08628v1 Announce Type: cross Abstract: Unauthorized disclosure of confidential documents demands robust, low-leakage classification. In real work environments, there is a lot of inflow and

SafeAdapt: Provably Safe Policy Updates in Deep Reinforcement Learning

arXiv:2604.09452v1 Announce Type: cross Abstract: Safety guarantees are a prerequisite to the deployment of reinforcement learning (RL) agents in safety-critical tasks. Often, deployment environments exhibit

APEX-EM: Non-Parametric Online Learning for Autonomous Agents via Structured Procedural-Episodic Experience Replay

April 6, 2026

arXiv:2603.29093v2 Announce Type: replace-cross
Abstract: LLM-based autonomous agents lack persistent procedural memory: they re-derive solutions from scratch even when structurally identical tasks have been solved before. We present APEX-EM, a non-parametric online learning framework that accumulates, retrieves, and reuses structured procedural plans without modifying model weights. APEX-EM introduces: (1) a structured experience representation encoding the full procedural-episodic trace of each execution — planning steps, artifacts, iteration history with error analysis, and quality scores; (2) a Plan-Retrieve-Generate-Iterate-Ingest (PRGII) workflow with Task Verifiers providing multi-dimensional reward signals; and (3) a dual-outcome Experience Memory with hybrid retrieval combining semantic search, structural signature matching, and plan DAG traversal — enabling cross-domain transfer between tasks sharing no lexical overlap but analogous operational structure. Successful experiences serve as positive in-context examples; failures as negative examples with structured error annotations.
We evaluate on BigCodeBench, KGQAGen-10k, and Humanity’s Last Exam using Claude Sonnet 4.5 and Opus 4.5. On KGQAGen-10k, APEX-EM achieves 89.6% accuracy versus 41.3% without memory (+48.3pp), surpassing the oracle-retrieval upper bound (84.9%). On BigCodeBench, it reaches 83.3% SR from a 53.9% baseline (+29.4pp), exceeding MemRL’s +11.0pp gain under comparable frozen-backbone conditions (noting backbone differences controlled for in our analysis). On HLE, entity graph retrieval reaches 48.0% from 25.2% (+22.8pp). Ablations show component value is task-dependent: rich judge feedback is negligible for code generation but critical for structured queries (+10.3pp), while binary-signal iteration partially compensates for weaker feedback.

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844