HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning

arXiv:2601.21626v1 Announce Type: cross Abstract: Post Training Quantization (PTQ), a mainstream model compression technique, often leads to the paradoxical ‘low error, high loss’ phenomenon because it focuses solely on minimizing quantization error. The root cause lies in the Hessian matrix of the LLM loss landscape: a few high curvature directions are extremely sensitive to perturbations. […]

Token-Guard: Towards Token-Level Hallucination Control via Self-Checking Decoding

arXiv:2601.21969v1 Announce Type: cross Abstract: Large Language Models (LLMs) often hallucinate, generating content inconsistent with the input. Retrieval-Augmented Generation (RAG) and Reinforcement Learning with Human Feedback (RLHF) can mitigate hallucinations but require resource-intensive retrieval or large-scale fine-tuning. Decoding-based methods are lighter yet lack explicit hallucination control. To address this, we present Token-Guard, a token-level hallucination […]

Multi-modal Imputation for Alzheimer’s Disease Classification

arXiv:2601.21076v1 Announce Type: new Abstract: Deep learning has been successful in predicting neurodegenerative disorders, such as Alzheimer’s disease, from magnetic resonance imaging (MRI). Combining multiple imaging modalities, such as T1-weighted (T1) and diffusion-weighted imaging (DWI) scans, can increase diagnostic performance. However, complete multimodal datasets are not always available. We use a conditional denoising diffusion probabilistic […]

Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

arXiv:2601.22060v1 Announce Type: cross Abstract: Multimodal large language models (MLLMs) have achieved remarkable success across a broad range of vision tasks. However, constrained by the capacity of their internal world knowledge, prior work has proposed augmenting MLLMs by “reasoning-then-tool-call” for visual and textual search engines to obtain substantial gains on tasks requiring extensive factual information. […]

Log2Motion: Biomechanical Motion Synthesis from Touch Logs

arXiv:2601.21043v1 Announce Type: cross Abstract: Touch data from mobile devices are collected at scale but reveal little about the interactions that produce them. While biomechanical simulations can illuminate motor control processes, they have not yet been developed for touch interactions. To close this gap, we propose a novel computational problem: synthesizing plausible motion directly from […]

SteerEval: A Framework for Evaluating Steerability with Natural Language Profiles for Recommendation

arXiv:2601.21105v1 Announce Type: cross Abstract: Natural-language user profiles have recently attracted attention not only for improved interpretability, but also for their potential to make recommender systems more steerable. By enabling direct editing, natural-language profiles allow users to explicitly articulate preferences that may be difficult to infer from past behavior. However, it remains unclear whether current […]

Output-Space Search: Targeting LLM Generations in a Frozen Encoder-Defined Output Space

arXiv:2601.21169v1 Announce Type: cross Abstract: We introduce Output-Space Search (OS-Search), which turns LLM generation into endpoint search. An outer loop selects a target z* in a frozen encoder-defined 3D output space Z, and a retrieval-grounded policy trained with sequence-level RL generates outputs whose coordinates land near z* under standard autoregressive decoding. This enables parallel sweeps […]

A Sheaf-Theoretic and Topological Perspective on Complex Network Modeling and Attention Mechanisms in Graph Neural Models

arXiv:2601.21207v1 Announce Type: cross Abstract: Combinatorial and topological structures, such as graphs, simplicial complexes, and cell complexes, form the foundation of geometric and topological deep learning (GDL and TDL) architectures. These models aggregate signals over such domains, integrate local features, and generate representations for diverse real-world applications. However, the distribution and diffusion behavior of GDL […]

Conditional Generative Framework with Peak-Aware Attention for Robust Chemical Detection under Interferences

arXiv:2601.21246v1 Announce Type: cross Abstract: Gas chromatography-mass spectrometry (GC-MS) is a widely used analytical method for chemical substance detection, but measurement reliability tends to deteriorate in the presence of interfering substances. In particular, interfering substances cause nonspecific peaks, residence time shifts, and increased background noise, resulting in reduced sensitivity and false alarms. To overcome these […]

PILD: Physics-Informed Learning via Diffusion

arXiv:2601.21284v1 Announce Type: cross Abstract: Diffusion models have emerged as powerful generative tools for modeling complex data distributions, yet their purely data-driven nature limits applicability in practical engineering and scientific problems where physical laws need to be followed. This paper proposes Physics-Informed Learning via Diffusion (PILD), a framework that unifies diffusion modeling and first-principles physical […]

OrthoInsight: Rib Fracture Diagnosis and Report Generation Based on Multi-Modal Large Models

arXiv:2507.13993v4 Announce Type: replace-cross Abstract: The growing volume of medical imaging data has increased the need for automated diagnostic tools, especially for musculoskeletal injuries like rib fractures, commonly detected via CT scans. Manual interpretation is time-consuming and error-prone. We propose OrthoInsight, a multi-modal deep learning framework for rib fracture diagnosis and report generation. It integrates […]

Self-Improving Pretraining: using post-trained models to pretrain better models

arXiv:2601.21343v1 Announce Type: cross Abstract: Ensuring safety, factuality and overall quality in the generations of large language models is a critical challenge, especially as these models are increasingly deployed in real-world applications. The prevailing approach to addressing these issues involves collecting expensive, carefully curated datasets and applying multiple stages of fine-tuning and alignment. However, even […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK, registration number 16808844.