Global Convergence of Multiplicative Updates for the Matrix Mechanism: A Collaborative Proof with Gemini 3

arXiv:2603.19465v2 Announce Type: replace-cross Abstract: We analyze a fixed-point iteration $v leftarrow phi(v)$ arising in the optimization of a regularized nuclear norm objective involving the Hadamard product structure, posed in DMR+22 in the context of an optimization problem over the space of algorithms in private machine learning. We prove that the iteration $v^(k+1) = textdiag((D_v^(k)^1/2 […]

Focus, Don’t Prune: Identifying Instruction-Relevant Regions for Information-Rich Image Understanding

arXiv:2603.22815v1 Announce Type: cross Abstract: Large Vision-Language Models (LVLMs) have shown strong performance across various multimodal tasks by leveraging the reasoning capabilities of Large Language Models (LLMs). However, processing visually complex and information-rich images, such as infographics or document layouts, requires these models to generate a large number of visual tokens, leading to significant computational […]

Avoiding Over-smoothing in Social Media Rumor Detection with Pre-trained Propagation Tree Transformer

arXiv:2603.22854v1 Announce Type: cross Abstract: Deep learning techniques for rumor detection typically utilize Graph Neural Networks (GNNs) to analyze post relations. These methods, however, falter due to over-smoothing issues when processing rumor propagation structures, leading to declining performance. Our investigation into this issue reveals that over-smoothing is intrinsically tied to the structural characteristics of rumor […]

An epidemiological model with waning immunity and reinfection

arXiv:2511.05688v2 Announce Type: replace Abstract: Waning immunity and reinfection are critical features of many infectious diseases, but epidemiological models often fail to capture the intricate interaction between an individual’s history of immunity and their current infection status; when they do, the approach is usually overly simplistic. We develop a novel dual-age structured model that simultaneously […]

Reliable OOD Virtual Screening with Extrapolatory Pseudo-Label Matching

arXiv:2406.01825v5 Announce Type: replace-cross Abstract: Machine learning (ML) models are increasingly deployed for virtual screening in drug discovery, where the goal is to identify novel, chemically diverse scaffolds while minimizing experimental costs. This creates a fundamental challenge: the most valuable discoveries lie in out-of-distribution (OOD) regions beyond the training data, yet ML models often degrade […]

Agent Audit: A Security Analysis System for LLM Agent Applications

arXiv:2603.22853v1 Announce Type: cross Abstract: What should a developer inspect before deploying an LLM agent: the model, the tool code, the deployment configuration, or all three? In practice, many security failures in agent systems arise not from model weights alone, but from the surrounding software stack: tool functions that pass untrusted inputs to dangerous operations, […]

Image Generation from Contextually-Contradictory Prompts

arXiv:2506.01929v2 Announce Type: replace-cross Abstract: Text-to-image diffusion models excel at generating high-quality, diverse images from natural language prompts. However, they often fail to produce semantically accurate results when the prompt contains concept combinations that contradict their learned priors. We define this failure mode as contextual contradiction, where one concept implicitly negates another due to entangled […]

myMNIST: Benchmark of PETNN, KAN, and Classical Deep Learning Models for Burmese Handwritten Digit Recognition

arXiv:2603.18597v2 Announce Type: replace-cross Abstract: We present the first systematic benchmark on a standardized iteration of the publicly available Burmese Handwritten Digit Dataset (BHDD), which we have designated as myMNIST Benchmarking. While BHDD serves as a foundational resource for Myanmar NLP/AI, it lacks a comprehensive, reproducible performance baseline across modern architectures. We evaluate eleven architectures […]

Generating Findings for Jaw Cysts in Dental Panoramic Radiographs Using a GPT-Based VLM: A Preliminary Study on Building a Two-Stage Self-Correction Loop with Structured Output (SLSO) Framework

arXiv:2510.02001v4 Announce Type: replace-cross Abstract: Vision-language models (VLMs) such as GPT (Generative Pre-Trained Transformer) have shown potential for medical image interpretation; however, challenges remain in generating reliable radiological findings in clinical practice, as exemplified by dental pathologies. This study proposes a Self-correction Loop with Structured Output (SLSO) framework as an integrated processing methodology to enhance […]

UniQueR: Unified Query-based Feedforward 3D Reconstruction

arXiv:2603.22851v1 Announce Type: cross Abstract: We present UniQueR, a unified query-based feedforward framework for efficient and accurate 3D reconstruction from unposed images. Existing feedforward models such as DUSt3R, VGGT, and AnySplat typically predict per-pixel point maps or pixel-aligned Gaussians, which remain fundamentally 2.5D and limited to visible surfaces. In contrast, UniQueR formulates reconstruction as a […]

Arc Gradient Descent: A Geometrically Motivated Gradient Descent-based Optimiser with Phase-Aware, User-Controlled Step Dynamics (proof-of-concept)

arXiv:2512.06737v3 Announce Type: replace-cross Abstract: The paper presents the formulation, implementation, and evaluation of the ArcGD optimiser. The evaluation is conducted initially on a non-convex benchmark function and subsequently on a real-world ML dataset. The initial comparative study using the Adam optimiser is conducted on a stochastic variant of the highly non-convex and notoriously challenging […]

EVA: Aligning Video World Models with Executable Robot Actions via Inverse Dynamics Rewards

arXiv:2603.17808v2 Announce Type: replace-cross Abstract: Video generative models are increasingly used as world models for robotics, where a model generates a future visual rollout conditioned on the current observation and task instruction, and an inverse dynamics model (IDM) converts the generated frames into executable robot actions. However, current video world models lack explicit executability constraints. […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844