ReLMXEL: Adaptive RL-Based Memory Controller with Explainable Energy and Latency Optimization

arXiv:2603.17309v1 Announce Type: cross Abstract: Reducing latency and energy consumption is critical to improving the efficiency of memory systems in modern computing. This work introduces ReLMXEL (Reinforcement Learning for Memory Controller with Explainable Energy and Latency Optimization), an explainable multi-agent online reinforcement learning framework that dynamically optimizes memory controller parameters using reward decomposition. ReLMXEL operates […]

Public Profile Matters: A Scalable Integrated Approach to Recommend Citations in the Wild

arXiv:2603.17361v1 Announce Type: cross Abstract: Proper citation of relevant literature is essential for contextualising and validating scientific contributions. While current citation recommendation systems leverage local and global textual information, they often overlook the nuances of the human citation behaviour. Recent methods that incorporate such patterns improve performance but incur high computational costs and introduce systematic […]

CRE-T1 Preview Technical Report: Beyond Contrastive Learning for Reasoning-Intensive Retrieval

arXiv:2603.17387v1 Announce Type: cross Abstract: The central challenge of reasoning-intensive retrieval lies in identifying implicit reasoning relationships between queries and documents, rather than superficial semantic or lexical similarity. The contrastive learning paradigm is fundamentally a static representation consolidation technique: during training, it encodes hierarchical relevance concepts into fixed geometric structures in the vector space, and at inference time […]

DeepStage: Learning Autonomous Defense Policies Against Multi-Stage APT Campaigns

arXiv:2603.16969v1 Announce Type: cross Abstract: This paper presents DeepStage, a deep reinforcement learning (DRL) framework for adaptive, stage-aware defense against Advanced Persistent Threats (APTs). The enterprise environment is modeled as a partially observable Markov decision process (POMDP), where host provenance and network telemetry are fused into unified provenance graphs. Building on our prior work, StageFinder, […]

A Comedy of Estimators: On KL Regularization in RL Training of LLMs

arXiv:2512.21852v3 Announce Type: replace-cross Abstract: The reasoning performance of large language models (LLMs) can be substantially improved by training them with reinforcement learning (RL). The RL objective for LLM training involves a regularization term, which is the reverse Kullback-Leibler (KL) divergence between the trained policy and the reference policy. Since computing the KL divergence exactly […]

Parameterizing Dataset Distillation via Gaussian Splatting

arXiv:2509.26219v3 Announce Type: replace-cross Abstract: Dataset distillation aims to compress training data while preserving training-aware knowledge, alleviating the reliance on large-scale datasets in modern model training. Dataset parameterization provides a more efficient storage structure for dataset distillation, reducing redundancy and accommodating richer information. However, existing methods either rely on complex auxiliary modules or fail to […]

Generative Hints

arXiv:2511.02933v2 Announce Type: replace-cross Abstract: Data augmentation is widely used in vision to introduce variation and mitigate overfitting, by enabling models to learn invariant properties. However, augmentation only indirectly captures these properties and does not explicitly constrain the learned function to satisfy them beyond the empirical training set. We propose generative hints, a training methodology […]

Volumetric Ergodic Control

arXiv:2511.11533v2 Announce Type: replace-cross Abstract: Ergodic control synthesizes optimal coverage behaviors over spatial distributions for nonlinear systems. However, existing formulations model the robot as a non-volumetric point, whereas in practice a robot interacts with the environment through its body and sensors with physical volume. In this work, we introduce a new ergodic control formulation that […]

Harm or Humor: A Multimodal, Multilingual Benchmark for Overt and Covert Harmful Humor

arXiv:2603.17759v2 Announce Type: cross Abstract: Dark humor often relies on subtle cultural nuances and implicit cues that require contextual reasoning to interpret, posing safety challenges that current static benchmarks fail to capture. To address this, we introduce a novel multimodal, multilingual benchmark for detecting and understanding harmful and offensive humor. Our manually curated dataset comprises […]

Telehealth Approaches for Pediatric Otitis Media and Clinical Outcomes: Scoping Review

Background: Otitis media (OM) is a common pediatric infection worldwide. Conventionally, accurate diagnosis depends on in-person pneumatic otoscopy, which is not always accessible, contributing to delayed care and inappropriate prescribing, especially in underserved settings. Rapid advances in telemedicine and digital tools have accelerated the development of remote approaches for assessing pediatric ear diseases, while diagnostic […]

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844