Autonomous Algorithm Discovery for Ptychography via Evolutionary LLM Reasoning

arXiv:2603.05696v1 Announce Type: cross Abstract: Ptychography is a computational imaging technique widely used for high-resolution materials characterization, but high-quality reconstructions often require the use of regularization functions that largely remain manually designed. We introduce Ptychi-Evolve, an autonomous framework that uses large language models (LLMs) to discover and evolve novel regularization algorithms. The framework combines LLM-driven […]

Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization

arXiv:2602.23008v2 Announce Type: replace-cross Abstract: Exploration remains the key bottleneck for large language model agents trained with reinforcement learning. While prior methods exploit pretrained knowledge, they fail in environments requiring the discovery of novel states. We propose Exploratory Memory-Augmented On- and Off-Policy Optimization (EMPO$^2$), a hybrid RL framework that leverages memory for exploration and combines […]

A-3PO: Accelerating Asynchronous LLM Training with Staleness-aware Proximal Policy Approximation

arXiv:2512.06547v3 Announce Type: replace-cross Abstract: Decoupled PPO has been a successful reinforcement learning (RL) algorithm for dealing with high data staleness in the asynchronous RL setting. The decoupled loss used in decoupled PPO improves the learning stability of coupled-loss algorithms (e.g., standard PPO, GRPO) by introducing a proximal policy to decouple the off-policy correction (importance […]

Proof-of-Guardrail in AI Agents and What (Not) to Trust from It

arXiv:2603.05786v1 Announce Type: cross Abstract: As AI agents become widely deployed as online services, users often rely on an agent developer’s claim about how safety is enforced, which introduces a threat where safety measures are falsely advertised. To address the threat, we propose proof-of-guardrail, a system that enables developers to provide cryptographic proof that a […]

Revisiting the (Sub)Optimality of Best-of-N for Inference-Time Alignment

arXiv:2603.05739v1 Announce Type: cross Abstract: Best-of-N (BoN) sampling is a widely used inference-time alignment method for language models, whereby N candidate responses are sampled from a reference model and the one with the highest predicted reward according to a learned reward model is selected. Despite its widespread practical use, recent theoretical work has suggested that […]
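
The Best-of-N selection step described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's code: `sample_response` and `reward_model` are hypothetical callables standing in for a reference-model sampler and a learned reward model, and the toy stand-ins at the bottom exist only to make the sketch runnable.

```python
import random
from typing import Callable, List


def best_of_n(
    prompt: str,
    sample_response: Callable[[str], str],
    reward_model: Callable[[str, str], float],
    n: int,
) -> str:
    """Sample n candidates from the reference model and keep the top-scoring one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Draw n independent candidates from the (hypothetical) reference sampler.
    candidates: List[str] = [sample_response(prompt) for _ in range(n)]
    # Select the candidate with the highest predicted reward.
    return max(candidates, key=lambda response: reward_model(prompt, response))


# Toy stand-ins, assumptions for illustration only.
_rng = random.Random(0)


def toy_sampler(prompt: str) -> str:
    # Pretend "reference model": appends a randomly chosen word.
    return prompt + " " + _rng.choice(["ok", "good", "excellent"])


def toy_reward(prompt: str, response: str) -> float:
    # Pretend "reward model": longer responses score higher.
    return float(len(response))


if __name__ == "__main__":
    print(best_of_n("Hello", toy_sampler, toy_reward, n=8))
```

Note that the selected response is only as good as the reward model's ranking: the procedure maximizes predicted reward, not true quality.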

Phys4D: Fine-Grained Physics-Consistent 4D Modeling from Video Diffusion

arXiv:2603.03485v2 Announce Type: replace-cross Abstract: Recent video diffusion models have achieved impressive capabilities as large-scale generative world models. However, these models often struggle with fine-grained physical consistency, exhibiting physically implausible dynamics over time. In this work, we present Phys4D, a pipeline for learning physics-consistent 4D world representations from video diffusion models. Phys4D adopts a three-stage […]

Margin and Consistency Supervision for Calibrated and Robust Vision Models

arXiv:2603.05812v1 Announce Type: cross Abstract: Deep vision classifiers often achieve high accuracy while remaining poorly calibrated and fragile under small distribution shifts. We present Margin and Consistency Supervision (MaCS), a simple, architecture-agnostic regularization framework that jointly enforces logit-space separation and local prediction stability. MaCS augments cross-entropy with (i) a hinge-squared margin penalty that enforces a […]

Accelerating Scientific Research with Gemini: Case Studies and Common Techniques

arXiv:2602.03837v3 Announce Type: replace-cross Abstract: Recent advances in large language models (LLMs) have opened new avenues for accelerating scientific research. While models are increasingly capable of assisting with routine tasks, their ability to contribute to novel, expert-level mathematical discovery is less understood. We present a collection of case studies demonstrating how researchers have successfully collaborated […]

Visual Words Meet BM25: Sparse Auto-Encoder Visual Word Scoring for Image Retrieval

arXiv:2603.05781v1 Announce Type: cross Abstract: Dense image retrieval is accurate but offers limited interpretability and attribution, and it can be compute-intensive at scale. We present BM25-V, which applies Okapi BM25 scoring to sparse visual-word activations from a Sparse Auto-Encoder (SAE) on Vision Transformer patch features. Across a large gallery, visual-word document frequencies are highly imbalanced […]
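
For readers unfamiliar with Okapi BM25, the scoring rule named in this abstract can be sketched as below. This shows standard BM25 (with the non-negative IDF variant popularized by Lucene) over discrete visual-word IDs; it is not the BM25-V implementation. Treating each image as a bag of integer visual-word IDs is an assumption made for illustration, since the truncated abstract does not say how SAE activations are turned into term frequencies. The function name and the `k1` and `b` defaults are likewise illustrative.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(
    query: List[int],
    gallery: List[List[int]],
    k1: float = 1.2,
    b: float = 0.75,
) -> List[float]:
    """Score every gallery image against a query with standard Okapi BM25.

    Each image is assumed to be a list of visual-word IDs (repeats allowed).
    """
    n_docs = len(gallery)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in gallery) / n_docs
    if avg_len == 0:
        return [0.0] * n_docs

    # Term frequencies per image and document frequencies per visual word.
    doc_tfs = [Counter(doc) for doc in gallery]
    df: Counter = Counter()
    for tf in doc_tfs:
        df.update(tf.keys())

    scores = []
    for doc, tf in zip(gallery, doc_tfs):
        # Length normalization: long documents are penalized via b.
        norm = k1 * (1.0 - b + b * len(doc) / avg_len)
        score = 0.0
        for word in set(query):
            f = tf.get(word, 0)
            if f == 0:
                continue
            # Rare visual words receive a larger IDF weight.
            idf = math.log((n_docs - df[word] + 0.5) / (df[word] + 0.5) + 1.0)
            score += idf * f * (k1 + 1.0) / (f + norm)
        scores.append(score)
    return scores


if __name__ == "__main__":
    gallery = [[1, 1, 2], [2, 3], [3, 3, 3]]
    print(bm25_scores([1, 2], gallery))
```

Because each score is a sum of per-word contributions, a match can be attributed to the specific visual words that produced it, which is the kind of interpretability the abstract contrasts with dense retrieval.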

StreamWise: Serving Multi-Modal Generation in Real-Time at Scale

arXiv:2603.05800v1 Announce Type: cross Abstract: Advances in multi-modal generative models are enabling new applications, from storytelling to automated media synthesis. Most current workloads generate simple outputs (e.g., image generation from a prompt) in batch mode, often requiring several seconds even for basic results. Serving real-time multi-modal workflows at scale is costly and complex, requiring efficient […]

Window-based Membership Inference Attacks Against Fine-tuned Large Language Models

arXiv:2601.02751v2 Announce Type: replace-cross Abstract: Most membership inference attacks (MIAs) against Large Language Models (LLMs) rely on global signals, like average loss, to identify training data. This approach, however, dilutes the subtle, localized signals of memorization, reducing attack effectiveness. We challenge this global-averaging paradigm, positing that membership signals are more pronounced within localized contexts. We […]

Just-In-Time Objectives: A General Approach for Specialized AI Interactions

arXiv:2510.14591v2 Announce Type: replace-cross Abstract: Large language models promise a broad set of functions, but when not given a specific objective, they default to generic results. We demonstrate that inferring the user’s in-the-moment objective, then rapidly optimizing for that singular objective, enables LLMs to produce specialized tools, interfaces, and responses. Our work introduces just-in-time objectives, […]

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844