AQuA: Toward Strategic Response Generation for Ambiguous Visual Questions

arXiv:2603.07394v1 Announce Type: cross Abstract: Visual Question Answering (VQA) is a core task for evaluating the capabilities of Vision-Language Models (VLMs). Existing VQA benchmarks primarily feature clear and unambiguous image-question pairs, whereas real-world scenarios often involve varying degrees of ambiguity that require nuanced reasoning and context-appropriate response strategies. Although recent studies have begun to address […]

MedSteer: Counterfactual Endoscopic Synthesis via Training-Free Activation Steering

arXiv:2603.07066v1 Announce Type: cross Abstract: Generative diffusion models are increasingly used for medical imaging data augmentation, but text prompting cannot produce causal training data. Re-prompting rerolls the entire generation trajectory, altering anatomy, texture, and background. Inversion-based editing methods introduce reconstruction error that causes structural drift. We propose MedSteer, a training-free activation-steering framework for endoscopic synthesis. […]

Kinematics-Aware Latent World Models for Data-Efficient Autonomous Driving

arXiv:2603.07264v1 Announce Type: cross Abstract: Data-efficient learning remains a central challenge in autonomous driving due to the high cost and safety risks of large-scale real-world interaction. Although world-model-based reinforcement learning enables policy optimization through latent imagination, existing approaches often lack explicit mechanisms to encode spatial and kinematic structure essential for driving tasks. In this work, […]

Diversity-Aware Adaptive Collocation for Physics-Informed Neural Networks via Sparse QUBO Optimization and Hybrid Coresets

arXiv:2603.06761v1 Announce Type: cross Abstract: Physics-Informed Neural Networks (PINNs) enforce governing equations by penalizing PDE residuals at interior collocation points, but standard collocation strategies – uniform sampling and residual-based adaptive refinement – can oversample smooth regions, produce highly correlated point sets, and incur unnecessary training cost. We reinterpret collocation selection as a coreset construction problem: […]

Physics-informed AI Accelerated Retention Analysis of Ferroelectric Vertical NAND: From Day-Scale TCAD to Second-Scale Surrogate Model

arXiv:2603.06881v1 Announce Type: cross Abstract: Ferroelectric field-effect transistor (FeFET)-based vertical NAND (Fe-VNAND) has emerged as a promising candidate to overcome z-scaling limitations with lower programming voltages. However, the data retention of 3D Fe-VNAND is hindered by the complex interaction between charge detrapping and ferroelectric depolarization. Developing optimized device designs requires exploring an extensive parameter space, […]

HybridStitch: Pixel and Timestep Level Model Stitching for Diffusion Acceleration

arXiv:2603.07815v1 Announce Type: cross Abstract: Diffusion models have demonstrated a remarkable ability in Text-to-Image (T2I) generation applications. Despite the advanced generation output, they suffer from heavy computation overhead, especially for large models that contain tens of billions of parameters. Prior work has illustrated that replacing part of the denoising steps with a smaller model still […]

Interpretable-by-Design Transformers via Architectural Stream Independence

arXiv:2603.07482v1 Announce Type: cross Abstract: While transformers achieve strong performance, their internal decision-making processes remain opaque. We investigate whether architectural constraints can enforce interpretability by design through architectural stream independence: maintaining a token stream (carrying symbolic structure) and contextual semantics in separated streams that remain independently observable throughout processing, with integration delayed until output. We […]

SMAT: Staged Multi-Agent Training for Co-Adaptive Exoskeleton Control

arXiv:2603.07618v1 Announce Type: cross Abstract: Effective exoskeleton assistance requires co-adaptation: as the device alters joint dynamics, the user reorganizes neuromuscular coordination, creating a non-stationary learning problem. Most learning-based approaches do not explicitly account for the sequential nature of human motor adaptation, leading to training instability and poorly timed assistance. We propose Staged Multi-Agent Training (SMAT), […]

PSTNet: Physically-Structured Turbulence Network

arXiv:2603.07957v1 Announce Type: cross Abstract: Reliable real-time estimation of atmospheric turbulence intensity remains an open challenge for aircraft operating across diverse altitude bands, particularly over oceanic, polar, and data-sparse regions that lack operational nowcasting infrastructure. Classical spectral models encode climatological averages rather than the instantaneous atmospheric state, and generic ML regressors offer adaptivity but provide […]

Enhancing Instruction Following of LLMs via Activation Steering with Dynamic Rejection

arXiv:2603.06745v1 Announce Type: cross Abstract: Large Language Models (LLMs), despite advances in instruction tuning, often fail to follow complex user instructions. Activation steering techniques aim to mitigate this by manipulating model internals, but they risk oversteering, where excessive emphasis on the instruction degrades task accuracy and overall text quality. To address this, […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844.