Adaptive Relative Pose Estimation Framework with Dual Noise Tuning for Safe Approaching Maneuvers

arXiv:2507.16214v3 Announce Type: replace-cross Abstract: Accurate and robust relative pose estimation is crucial for enabling challenging Active Debris Removal (ADR) missions targeting tumbling derelict satellites such as ESA’s ENVISAT. This work presents a complete pipeline integrating advanced computer vision techniques with adaptive nonlinear filtering to address this challenge. A Convolutional Neural Network (CNN), enhanced with […]

Hyperagents

arXiv:2603.19461v1 Announce Type: new Abstract: Self-improving AI systems aim to reduce reliance on human engineering by learning to improve their own learning and problem-solving processes. Existing approaches to self-improvement rely on fixed, handcrafted meta-level mechanisms, fundamentally limiting how fast such systems can improve. The Darwin Gödel Machine (DGM) demonstrates open-ended self-improvement in coding by repeatedly […]

CARES: Context-Aware Resolution Selector for VLMs

arXiv:2510.19496v2 Announce Type: replace-cross Abstract: Large vision-language models (VLMs) commonly process images at native or high resolution to remain effective across tasks. This inflates visual tokens, often to 97-99% of total tokens, resulting in high compute and latency, even when low-resolution images would suffice. We introduce CARES (a Context-Aware Resolution Selector), a lightweight preprocessing module that, […]

Reinforcement-guided generative protein language models enable de novo design of highly diverse AAV capsids

arXiv:2603.19473v1 Announce Type: new Abstract: Adeno-associated viral (AAV) vectors are widely used delivery platforms in gene therapy, and the design of improved capsids is key to expanding their therapeutic potential. A central challenge in AAV bioengineering, as in protein design more broadly, is the vast sequence design space relative to the scale of feasible experimental […]

Understanding and Optimizing Multi-Stage AI Inference Pipelines

arXiv:2504.09775v5 Announce Type: replace-cross Abstract: The rapid evolution of Large Language Models (LLMs) has driven the need for increasingly sophisticated inference pipelines and hardware platforms. Modern LLM serving extends beyond traditional prefill-decode workflows, incorporating multi-stage processes such as Retrieval Augmented Generation (RAG), key-value (KV) cache retrieval, dynamic model routing, and multi step reasoning. These stages […]

When both Grounding and not Grounding are Bad — A Partially Grounded Encoding of Planning into SAT (Extended Version)

arXiv:2603.19429v1 Announce Type: new Abstract: Classical planning problems are typically defined using lifted first-order representations, which offer compactness and generality. While most planners ground these representations to simplify reasoning, this can cause an exponential blowup in size. Recent approaches instead operate directly on the lifted level to avoid full grounding. We explore a middle ground […]

World4RL: Diffusion World Models for Policy Refinement with Reinforcement Learning for Robotic Manipulation

arXiv:2509.19080v2 Announce Type: replace-cross Abstract: Robotic manipulation policies are commonly initialized through imitation learning, but their performance is limited by the scarcity and narrow coverage of expert data. Reinforcement learning can refine policies to alleviate this limitation, yet real-robot training is costly and unsafe, while training in simulators suffers from the sim-to-real gap. Recent advances […]

PFM-VEPAR: Prompting Foundation Models for RGB-Event Camera based Pedestrian Attribute Recognition

arXiv:2603.19565v1 Announce Type: cross Abstract: Event-based pedestrian attribute recognition (PAR) leverages motion cues to enhance RGB cameras in low-light and motion-blur scenarios, enabling more accurate inference of attributes like age and emotion. However, existing two-stream multimodal fusion methods introduce significant computational overhead and neglect the valuable guidance from contextual samples. To address these limitations, this […]

FB-CLIP: Fine-Grained Zero-Shot Anomaly Detection with Foreground-Background Disentanglement

arXiv:2603.19608v1 Announce Type: cross Abstract: Fine-grained anomaly detection is crucial in industrial and medical applications, but labeled anomalies are often scarce, making zero-shot detection challenging. While vision-language models like CLIP offer promising solutions, they struggle with foreground-background feature entanglement and coarse textual semantics. We propose FB-CLIP, a framework that enhances anomaly localization via multi-strategy textual […]

Global Convergence of Multiplicative Updates for the Matrix Mechanism: A Collaborative Proof with Gemini 3

arXiv:2603.19465v1 Announce Type: cross Abstract: We analyze a fixed-point iteration $v \leftarrow \phi(v)$ arising in the optimization of a regularized nuclear norm objective involving the Hadamard product structure, posed in~\cite{denisov} in the context of an optimization problem over the space of algorithms in private machine learning. We prove that the iteration $v^{(k+1)} = \text{diag}((D_{v^{(k)}}^{1/2} M […]

Inducing Sustained Creativity and Diversity in Large Language Models

arXiv:2603.19519v1 Announce Type: cross Abstract: We address a not-widely-recognized subset of exploratory search, where a user sets out on a typically long “search quest” for the perfect wedding dress, overlooked research topic, killer company idea, etc. The first few outputs of current large language models (LLMs) may be helpful but only as a start, since […]


Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844