April 2, 2026 – Page 20 – dijee Pharma Intelligence

Learning Quantised Structure-Preserving Motion Representations for Dance Fingerprinting

arXiv:2604.00927v1 Announce Type: cross Abstract: We present DANCEMATCH, an end-to-end framework for motion-based dance retrieval, the task of identifying semantically similar choreographies directly from raw video, defined as DANCE FINGERPRINTING. While existing motion analysis and retrieval methods can compare pose sequences, they rely on continuous embeddings that are difficult to index, interpret, or scale. In […]

April 2, 2026

Incoherence in Goal-Conditioned Autoregressive Models

arXiv:2510.06545v2 Announce Type: replace-cross Abstract: We investigate mathematically the notion of incoherence: a structural issue with reinforcement learning policies derived by naive goal-conditioning of autoregressive models. We focus on the process of re-training models on their own actions, that is, fine-tuning offline-learned policies with online RL. We prove that it decreases incoherence and leads to […]

April 2, 2026

Two-stage Vision Transformers and Hard Masking offer Robust Object Representations

arXiv:2506.08915v4 Announce Type: replace-cross Abstract: Context can strongly affect object representations, sometimes leading to undesired biases, particularly when objects appear in out-of-distribution backgrounds at inference. At the same time, many object-centric tasks require to leverage the context for identifying the relevant image regions. We posit that this conundrum, in which context is simultaneously needed and […]

April 2, 2026

DuoTok: Source-Aware Dual-Track Tokenization for Multi-Track Music Language Modeling

arXiv:2511.20224v2 Announce Type: replace-cross Abstract: Audio tokenization bridges continuous waveforms and multi-track music language models. In dual-track modeling, tokens should preserve three properties at once: high-fidelity reconstruction, strong predictability under a language model, and cross-track correspondence. We introduce DuoTok, a source-aware dual-track tokenizer that addresses this trade-off through staged disentanglement. DuoTok first pretrains a semantic […]

April 2, 2026

CoCoDiff: Correspondence-Consistent Diffusion Model for Fine-grained Style Transfer

arXiv:2602.14464v2 Announce Type: replace-cross Abstract: Transferring visual style between images while preserving semantic correspondence between similar objects remains a central challenge in computer vision. While existing methods have made great strides, most of them operate at global level but overlook region-wise and even pixel-wise semantic correspondence. To address this, we propose CoCoDiff, a novel training-free […]

April 2, 2026

Routing-Free Mixture-of-Experts

arXiv:2604.00801v1 Announce Type: cross Abstract: Standard Mixture-of-Experts (MoE) models rely on centralized routing mechanisms that introduce rigid inductive biases. We propose Routing-Free MoE which eliminates any hard-coded centralized designs including external routers, Softmax, Top-K and load balancing, instead encapsulating all activation functionalities within individual experts and directly optimized through continuous gradient flow, enabling each expert […]

April 2, 2026

Query-Conditioned Evidential Keyframe Sampling for MLLM-Based Long-Form Video Understanding

arXiv:2604.01002v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) have shown strong performance on video question answering, but their application to long-form videos is constrained by limited context length and computational cost, making keyframe sampling essential. Existing approaches typically rely on semantic relevance or reinforcement learning, which either fail to capture evidential clues or […]

April 2, 2026

Looking into a Pixel by Nonlinear Unmixing — A Generative Approach

arXiv:2604.01141v1 Announce Type: cross Abstract: Due to the large footprint of pixels in remote sensing imagery, hyperspectral unmixing (HU) has become an important and necessary procedure in hyperspectral image analysis. Traditional HU methods rely on a prior spectral mixing model, especially for nonlinear mixtures, which has largely limited the performance and generalization capacity of the […]

April 2, 2026

Teaching AI to Handle Exceptions: Supervised Fine-Tuning with Human-Aligned Judgment

arXiv:2503.02976v3 Announce Type: replace Abstract: Large language models (LLMs), initially developed for generative AI, are now evolving into agentic AI systems, which make decisions in complex, real-world contexts. Unfortunately, while their generative capabilities are well-documented, their decision-making processes remain poorly understood. This is particularly evident when testing targeted decision-making: for instance, how models handle exceptions, […]

April 2, 2026

Distilling the Thought, Watermarking the Answer: A Principle Semantic Guided Watermark for Large Reasoning Models

arXiv:2601.05144v2 Announce Type: replace Abstract: Reasoning Large Language Models (RLLMs) excelling in complex tasks present unique challenges for digital watermarking, as existing methods often disrupt logical coherence or incur high computational costs. Token-based watermarking techniques can corrupt the reasoning flow by applying pseudo-random biases, while semantic-aware approaches improve quality but introduce significant latency or require […]

April 2, 2026

Enhancing Team Diversity with Generative AI: A Novel Project Management Framework

arXiv:2502.05181v2 Announce Type: replace-cross Abstract: This research-in-progress paper presents a new project management framework that utilises GenAI technology. The framework is designed to address the common challenge of uniform team compositions in academic and research project teams, particularly in universities and research institutions. It does so by integrating sociologically identified patterns of successful team member […]

April 2, 2026

Polychromic Objectives for Reinforcement Learning

arXiv:2509.25424v4 Announce Type: replace-cross Abstract: Reinforcement learning fine-tuning (RLFT) is a dominant paradigm for improving pretrained policies for downstream tasks. These pretrained policies, trained on large datasets, produce generations with a broad range of promising but unrefined behaviors. Often, a critical failure mode of RLFT arises when policies lose this diversity and collapse into a […]

April 2, 2026

Subscribe for Updates