Where does output diversity collapse in post-training?

arXiv:2604.16027v1 Announce Type: cross Abstract: Post-trained language models produce less varied outputs than their base counterparts. This output diversity collapse undermines inference-time scaling methods that rely on varied samples, and risks homogenizing model outputs on creative and value-laden tasks. Prior work attributes collapse to specific post-training methods, without separating the role of training data composition […]

Safe Deep Reinforcement Learning for Building Heating Control and Demand-side Flexibility

arXiv:2604.16033v1 Announce Type: cross Abstract: Buildings account for approximately 40% of global energy consumption, and with the growing share of intermittent renewable energy sources, enabling demand-side flexibility, particularly in heating, ventilation and air conditioning systems, is essential for grid stability and energy efficiency. This paper presents a safe deep reinforcement learning-based control framework to optimize […]

Optimistic Policy Learning under Pessimistic Adversaries with Regret and Violation Guarantees

arXiv:2604.14243v2 Announce Type: replace-cross Abstract: Real-world decision-making systems operate in environments where state transitions depend not only on the agent’s actions, but also on exogenous factors outside its control (competing agents, environmental disturbances, or strategic adversaries). Formally, $s_{h+1} = f(s_h, a_h, \bar{a}_h) + \omega_h$, where $\bar{a}_h$ is the adversary/external action, $a_h$ is the agent’s action, and $\omega_h$ is an […]

Early Detection of Acute Myeloid Leukemia (AML) Using YOLOv12 Deep Learning Model

arXiv:2604.16082v1 Announce Type: cross Abstract: Acute Myeloid Leukemia (AML) is one of the most life-threatening types of blood cancer, and its accurate classification remains a challenging task due to the visual similarity between various cell types. This study addresses the multiclass classification of AML cells using the YOLOv12 deep learning model. […]

SatBLIP: Context Understanding and Feature Identification from Satellite Imagery with Vision-Language Learning

arXiv:2604.14373v2 Announce Type: replace-cross Abstract: Rural environmental risks are shaped by place-based conditions (e.g., housing quality, road access, land-surface patterns), yet standard vulnerability indices are coarse and provide limited insight into risk contexts. We propose SatBLIP, a satellite-specific vision-language framework for rural context understanding and feature identification that predicts county-level Social Vulnerability Index (SVI). SatBLIP […]

Unveiling Stochasticity: Universal Multi-modal Probabilistic Modeling for Traffic Forecasting

arXiv:2604.16084v1 Announce Type: cross Abstract: Traffic forecasting is a challenging spatio-temporal modeling task and a critical component of urban transportation management. Current studies mainly focus on deterministic predictions, with limited consideration of the uncertainty and stochasticity in traffic dynamics. Therefore, this paper proposes an elegant yet universal approach that transforms existing models into probabilistic predictors […]

Stylistic-STORM (ST-STORM): Perceiving the Semantic Nature of Appearance

arXiv:2604.16086v1 Announce Type: cross Abstract: One of the dominant paradigms in self-supervised learning (SSL), illustrated by MoCo or DINO, aims to produce robust representations by capturing features that are insensitive to certain image transformations, such as illumination or geometric changes. This strategy is appropriate when the objective is to recognize objects independently of their appearance. […]

UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards

arXiv:2604.14967v2 Announce Type: replace-cross Abstract: Retrieval-Augmented Generation (RAG) extends Large Vision-Language Models (LVLMs) with external visual knowledge. However, existing visual RAG systems typically rely on generic retrieval signals that overlook the fine-grained visual semantics essential for complex reasoning. To address this limitation, we propose UniDoc-RL, a unified reinforcement learning framework in which an LVLM agent […]

Machine learning approaches to uncover the neural mechanisms of motivated behaviour: from ADHD to individual differences in effort and reward sensitivity

arXiv:2604.15363v1 Announce Type: new Abstract: Motivated behaviour relies on the brain’s capacity to evaluate effort and reward. Dysregulation within these processes contributes to a spectrum of conditions, from hyperactivity in attention-deficit/hyperactivity disorder (ADHD) to diminished goal-directed behaviour in apathy. This thesis investigates the neural mechanisms underlying ADHD using electroencephalography (EEG) and examines individual differences in […]

UniEditBench: A Unified and Cost-Effective Benchmark for Image and Video Editing via Distilled MLLMs

arXiv:2604.15871v1 Announce Type: cross Abstract: The evaluation of visual editing models remains fragmented across methods and modalities. Existing benchmarks are often tailored to specific paradigms, making fair cross-paradigm comparisons difficult, while video editing lacks reliable evaluation benchmarks. Furthermore, common automatic metrics often misalign with human preference, yet directly deploying large multimodal models (MLLMs) as evaluators […]

Selectivity and Shape in the Design of Forward-Forward Goodness Functions

arXiv:2604.13081v2 Announce Type: replace-cross Abstract: The Forward-Forward (FF) algorithm trains networks layer-by-layer using a local “goodness function,” yet sum-of-squares (SoS) has remained the only choice studied. We systematically explore the goodness-function design space and identify a unifying principle: the goodness function must be sensitive to the shape of neural activity, not its total energy. This […]

MR-Coupler: Automated Metamorphic Test Generation via Functional Coupling Analysis

arXiv:2604.10126v2 Announce Type: replace-cross Abstract: Metamorphic testing (MT) is a widely recognized technique for alleviating the oracle problem in software testing. However, its adoption is hindered by the difficulty of constructing effective metamorphic relations (MRs), which often require domain-specific or hard-to-obtain knowledge. In this work, we propose a novel approach that leverages the functional coupling […]

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844