Ablate and Rescue: A Causal Analysis of Residual Stream Hyper-Connections

arXiv:2603.14833v1 Announce Type: cross Abstract: Multi-stream transformer architectures have recently been proposed as a promising direction for managing representation collapse and the vanishing gradient problem for residual connections, yet their internal mechanisms remain unexplored. In particular, the recently introduced Manifold-Constrained Hyper-Connections (mHC) architecture posits multiple residual streams with constrained interaction, but lacks in-depth mechanistic analysis. […]

A Loss Landscape Visualization Framework for Interpreting Reinforcement Learning: An ADHDP Case Study

arXiv:2603.14600v1 Announce Type: cross Abstract: Reinforcement learning algorithms have been widely used in dynamic and control systems. However, interpreting their internal learning behavior remains a challenge. In the authors’ previous work, a critic match loss landscape visualization method was proposed to study critic training. This study extends that method into a framework which provides a […]

DOVA: Deliberation-First Multi-Agent Orchestration for Autonomous Research Automation

arXiv:2603.13327v1 Announce Type: new Abstract: Large language model (LLM) agents have demonstrated remarkable capabilities in tool use, reasoning, and code generation, yet single-agent systems exhibit fundamental limitations when confronted with complex research tasks demanding multi-source synthesis, adversarial verification, and personalized delivery. We present DOVA (Deep Orchestrated Versatile Agent), a multi-agent platform introducing three key innovations: […]

VisualLeakBench: Auditing the Fragility of Large Vision-Language Models against PII Leakage and Social Engineering

arXiv:2603.13385v1 Announce Type: cross Abstract: As Large Vision-Language Models (LVLMs) are increasingly deployed in agent-integrated workflows and other deployment-relevant settings, their robustness against semantic visual attacks remains under-evaluated — alignment is typically tested on explicit harmful content rather than privacy-critical multimodal scenarios. We introduce VisualLeakBench, an evaluation suite to audit LVLMs against OCR Injection and […]

GPrune-LLM: Generalization-Aware Structured Pruning for Large Language Models

arXiv:2603.13418v1 Announce Type: cross Abstract: Structured pruning is widely used to compress large language models (LLMs), yet its effectiveness depends heavily on neuron importance estimation. Most existing methods estimate neuron importance from activation statistics on a single calibration dataset, which introduces calibration bias and degrades downstream cross-task generalization. We observe that neurons exhibit heterogeneous distribution […]

CLAG: Adaptive Memory Organization via Agent-Driven Clustering for Small Language Model Agents

arXiv:2603.15421v1 Announce Type: cross Abstract: Large language model agents heavily rely on external memory to support knowledge reuse and complex reasoning tasks. Yet most memory systems store experiences in a single global retrieval pool which can gradually dilute or corrupt stored knowledge. This problem is especially pronounced for small language models (SLMs), which are highly […]

CHIMERA-Bench: A Benchmark Dataset for Epitope-Specific Antibody Design

arXiv:2603.13431v1 Announce Type: cross Abstract: Computational antibody design has seen rapid methodological progress, with dozens of deep generative methods proposed in the past three years, yet the field lacks a standardized benchmark for fair comparison and model development. These methods are evaluated on different SAbDab snapshots, non-overlapping test sets, and incompatible metrics, and the literature […]

DyACE: Dynamic Algorithm Co-evolution for Online Automated Heuristic Design with Large Language Model

arXiv:2603.13344v1 Announce Type: new Abstract: The prevailing paradigm in Automated Heuristic Design (AHD) typically relies on the assumption that a single, fixed algorithm can effectively navigate the shifting dynamics of a combinatorial search. This static approach often proves inadequate for Perturbative Heuristics, where the optimal algorithm for escaping local optima depends heavily on the specific […]

MGMAR: Metal-Guided Metal Artifact Reduction for X-ray Computed Tomography

arXiv:2603.13447v1 Announce Type: cross Abstract: In X-ray computed tomography (CT), metal artifact reduction (MAR) remains a major challenge because metallic implants violate standard CT forward-model assumptions, producing severe streaking and shadowing artifacts that degrade diagnostic quality. We propose MGMAR, a metal-guided MAR method that explicitly leverages metal-related information throughout the reconstruction pipeline. MGMAR first generates […]

Benchmarking LLM-based agents for single-cell omics analysis

arXiv:2508.13201v3 Announce Type: replace Abstract: Background: The surge in single-cell omics data exposes limitations in traditional, manually defined analysis workflows. AI agents offer a paradigm shift, enabling adaptive planning, executable code generation, traceable decisions, and real-time knowledge fusion. However, the lack of a comprehensive benchmark critically hinders progress. Results: We introduce a novel benchmarking evaluation […]

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844