Beyond Convolution: A Taxonomy of Structured Operators for Learning-Based Image Processing

arXiv:2603.12067v2 Announce Type: replace-cross Abstract: The convolution operator is the fundamental building block of modern convolutional neural networks (CNNs), owing to its simplicity, translational equivariance, and efficient implementation. However, its structure as a fixed, linear, locally-averaging operator limits its ability to capture structured signal properties such as low-rank decompositions, adaptive basis representations, and non-uniform spatial […]

Predictive Analytics for Foot Ulcers Using Time-Series Temperature and Pressure Data

arXiv:2603.12278v1 Announce Type: new Abstract: Diabetic foot ulcers (DFUs) are a severe complication of diabetes, often resulting in significant morbidity. This paper presents a predictive analytics framework utilizing time-series data captured by wearable foot sensors — specifically NTC thin-film thermocouples for temperature measurement and FlexiForce pressure sensors for plantar load monitoring. Data was collected from […]

Shattering the Shortcut: A Topology-Regularized Benchmark for Multi-hop Medical Reasoning in LLMs

arXiv:2603.12458v1 Announce Type: cross Abstract: While Large Language Models (LLMs) achieve expert-level performance on standard medical benchmarks through single-hop factual recall, they severely struggle with the complex, multi-hop diagnostic reasoning required in real-world clinical settings. A primary obstacle is “shortcut learning”, where models exploit highly connected, generic hub nodes (e.g., “inflammation”) in knowledge graphs to […]

Surprised by Attention: Predictable Query Dynamics for Time Series Anomaly Detection

arXiv:2603.12916v1 Announce Type: cross Abstract: Multivariate time series anomalies often manifest as shifts in cross-channel dependencies rather than simple amplitude excursions. In autonomous driving, for instance, a steering command might be internally consistent but decouple from the resulting lateral acceleration. Residual-based detectors can miss such anomalies when flexible sequence models still reconstruct signals plausibly despite […]

SvfEye: A Semantic-Visual Fusion Framework with Multi-Scale Visual Context for Multimodal Reasoning

arXiv:2603.00171v2 Announce Type: replace-cross Abstract: Multimodal Large Language Models (MLLMs) often struggle to accurately perceive fine-grained visual details, especially when targets are tiny or visually subtle. This challenge can be addressed through semantic-visual information fusion, which integrates global image context with fine-grained local evidence for multi-scale visual understanding. Recently, a paradigm termed “Thinking with Images” […]

Fair Lung Disease Diagnosis from Chest CT via Gender-Adversarial Attention Multiple Instance Learning

arXiv:2603.12988v1 Announce Type: cross Abstract: We present a fairness-aware framework for multi-class lung disease diagnosis from chest CT volumes, developed for the Fair Disease Diagnosis Challenge at the PHAROS-AIF-MIH Workshop (CVPR 2026). The challenge requires classifying CT scans into four categories — Healthy, COVID-19, Adenocarcinoma, and Squamous Cell Carcinoma — with performance measured as the […]

Depth Charge: Jailbreak Large Language Models from Deep Safety Attention Heads

arXiv:2603.05772v2 Announce Type: replace-cross Abstract: Currently, open-sourced large language models (OSLLMs) have demonstrated remarkable generative performance. However, as their structure and weights are made public, they are exposed to jailbreak attacks even after alignment. Existing attacks operate primarily at shallow levels, such as the prompt or embedding level, and often fail to expose vulnerabilities rooted […]

ARL-Tangram: Unleash the Resource Efficiency in Agentic Reinforcement Learning

arXiv:2603.13019v1 Announce Type: cross Abstract: Agentic reinforcement learning (RL) has emerged as a transformative workload in cloud clusters, enabling large language models (LLMs) to solve complex problems through interactions with the real world. However, unlike traditional RL, agentic RL demands substantial external cloud resources, e.g., CPUs for code execution and GPUs for reward models, that exist […]

Variation-aware Flexible 3D Gaussian Editing

arXiv:2602.11638v3 Announce Type: replace-cross Abstract: Indirect editing methods for 3D Gaussian Splatting (3DGS) have recently witnessed significant advancements. These approaches operate by first applying edits in the rendered 2D space and subsequently projecting the modifications back into 3D. However, this paradigm inevitably introduces cross-view inconsistencies and constrains both the flexibility and efficiency of the editing […]

Automatic In-Domain Exemplar Construction and LLM-Based Refinement of Multi-LLM Expansions for Query Expansion

arXiv:2602.08917v2 Announce Type: replace-cross Abstract: Query expansion with large language models is promising but often relies on hand-crafted prompts, manually chosen exemplars, or a single LLM, making it non-scalable and sensitive to domain shift. We present an automated, domain-adaptive QE framework that builds in-domain exemplar pools by harvesting pseudo-relevant passages using a BM25-MonoT5 pipeline. A […]

Ref-DGS: Reflective Dual Gaussian Splatting

arXiv:2603.07664v2 Announce Type: replace-cross Abstract: Reflective appearance, especially strong and typically near-field specular reflections, poses a fundamental challenge for accurate surface reconstruction and novel view synthesis. Existing Gaussian splatting methods either fail to model near-field specular reflections or rely on explicit ray tracing at substantial computational cost. We present Ref-DGS, a reflective dual Gaussian splatting […]

BitDance: Scaling Autoregressive Generative Models with Binary Tokens

arXiv:2602.14041v2 Announce Type: replace-cross Abstract: We present BitDance, a scalable autoregressive (AR) image generator that predicts binary visual tokens instead of codebook indices. With high-entropy binary latents, BitDance lets each token represent up to $2^{256}$ states, yielding a compact yet highly expressive discrete representation. Sampling from such a huge token space is difficult with standard […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK, registration number 16808844.