ImplicitRM: Unbiased Reward Modeling from Implicit Preference Data for LLM alignment

arXiv:2603.23184v1 Announce Type: cross Abstract: Reward modeling represents a long-standing challenge in reinforcement learning from human feedback (RLHF) for aligning language models. Current reward modeling is heavily contingent upon experimental feedback data with high collection costs. In this work, we study implicit reward modeling — learning reward models from implicit human feedback (e.g., clicks and […]

From the AI Act to a European AI Agency: Completing the Union’s Regulatory Architecture

arXiv:2603.22912v1 Announce Type: cross Abstract: As artificial intelligence (AI) technologies continue to advance, effective risk assessment, regulation, and oversight are necessary to ensure that AI development and deployment align with ethical principles while preserving innovation and economic competitiveness. The adoption of the EU AI Act marks an important step in this direction, establishing a harmonised […]

Contrastive Metric Learning for Point Cloud Segmentation in Highly Granular Detectors

arXiv:2603.23356v1 Announce Type: cross Abstract: We propose a novel clustering approach for point-cloud segmentation based on supervised contrastive metric learning (CML). Rather than predicting cluster assignments or object-centric variables, the method learns a latent representation in which points belonging to the same object are embedded nearby while unrelated points are separated. Clusters are then reconstructed […]
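The core idea the abstract describes — embedding same-object points close together and unrelated points far apart — can be illustrated with a generic pairwise contrastive metric loss. This is a minimal sketch of supervised contrastive metric learning in general, not the paper's exact loss; the squared-distance attractive term and margin-hinge repulsive term are assumptions.

```python
import numpy as np

def contrastive_metric_loss(embeddings, labels, margin=1.0):
    """Pairwise contrastive loss: pull same-object points together, push
    points from different objects at least `margin` apart.
    A generic sketch; the paper's exact formulation is not reproduced."""
    n = len(labels)
    pos_terms, neg_terms = [], []
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(embeddings[i] - embeddings[j])
            if labels[i] == labels[j]:
                pos_terms.append(d ** 2)                      # attract positives
            else:
                neg_terms.append(max(0.0, margin - d) ** 2)   # repel negatives
    pos = np.mean(pos_terms) if pos_terms else 0.0
    neg = np.mean(neg_terms) if neg_terms else 0.0
    return pos + neg

# Perfectly separated embeddings incur zero loss: same-object points coincide,
# different-object points sit farther apart than the margin.
emb = np.array([[0.0, 0.0], [0.0, 0.0], [5.0, 0.0], [5.0, 0.0]])
lab = [0, 0, 1, 1]
print(contrastive_metric_loss(emb, lab))  # 0.0
```

Once such an embedding is learned, clusters can be recovered with any off-the-shelf distance-based grouping, which is what the truncated sentence ("Clusters are then reconstructed […]") alludes to.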

Obscure but Effective: Classical Chinese Jailbreak Prompt Optimization via Bio-Inspired Search

arXiv:2602.22983v3 Announce Type: replace Abstract: As Large Language Models (LLMs) are increasingly used, their security risks have drawn increasing attention. Existing research reveals that LLMs are highly susceptible to jailbreak attacks, with effectiveness varying across language contexts. This paper investigates the role of classical Chinese in jailbreak attacks. Owing to its conciseness and obscurity, classical […]

ForestPrune: High-ratio Visual Token Compression for Video Multimodal Large Language Models via Spatial-Temporal Forest Modeling

arXiv:2603.22911v1 Announce Type: cross Abstract: Owing to its substantial savings in computation and memory overhead, token compression has become a research hotspot for MLLMs and has achieved remarkable progress in image-language tasks. However, for video, existing methods still fall short of high-ratio token compression. We attribute this shortcoming to the insufficient modeling of temporal and […]

MS-DGCNN++: Multi-Scale Dynamic Graph Convolution with Scale-Dependent Normalization for Robust LiDAR Tree Species Classification

arXiv:2507.12602v2 Announce Type: replace-cross Abstract: Graph-based deep learning on LiDAR point clouds encodes geometry through edge features, yet standard implementations use the same encoding at every scale. In tree species classification, where point density varies by orders of magnitude between trunk and canopy, this is particularly limiting. We prove it is suboptimal: normalized directional features […]
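The "edge features" the abstract refers to are the standard DGCNN construction: for each point and each of its k nearest neighbors, concatenate the point with the offset to that neighbor. The sketch below illustrates this, with a unit-normalization flag standing in for the kind of scale-dependent encoding the paper varies across scales; the paper's actual normalization scheme is an assumption here.

```python
import numpy as np

def knn_edge_features(points, k=3, normalize_direction=False):
    """DGCNN-style edge features: for each point i and each of its k nearest
    neighbors j, emit concat(x_i, x_j - x_i).
    `normalize_direction` unit-normalizes the offset, as one illustrative
    scale-dependent variant (the paper's exact scheme may differ)."""
    n, dim = points.shape
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)                 # exclude self from kNN
    nbrs = np.argsort(dists, axis=1)[:, :k]         # (n, k) neighbor indices
    feats = np.empty((n, k, 2 * dim))
    for i in range(n):
        for a, j in enumerate(nbrs[i]):
            offset = points[j] - points[i]
            if normalize_direction:
                offset = offset / (np.linalg.norm(offset) + 1e-8)
            feats[i, a] = np.concatenate([points[i], offset])
    return feats

cloud = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0], [0.0, 0.0, 4.0]])
print(knn_edge_features(cloud, k=2).shape)  # (4, 2, 6)
```

With raw offsets, feature magnitudes track local point density (large offsets in sparse canopy, small in dense trunk regions); normalizing removes that scale signal, which is why a single encoding at every scale can be limiting.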

Agentic AI-based Coverage Closure for Formal Verification

arXiv:2603.03147v2 Announce Type: replace Abstract: Coverage closure is a critical requirement in the Integrated Chip (IC) development process and a key metric for verification sign-off. However, traditional exhaustive approaches often fail to achieve full coverage within project timelines. This study presents an agentic AI-driven workflow that utilizes Large Language Model (LLM)-enabled Generative AI (GenAI) to automate coverage […]

Off-Policy Evaluation and Learning for Survival Outcomes under Censoring

arXiv:2603.22900v1 Announce Type: cross Abstract: Optimizing survival outcomes, such as patient survival or customer retention, is a critical objective in data-driven decision-making. Off-Policy Evaluation (OPE) provides a powerful framework for assessing such decision-making policies using logged data alone, without the need for costly or risky online experiments in high-stakes applications. However, typical estimators are not designed […]
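The "typical estimators" the abstract builds on include inverse propensity scoring (IPS), the standard OPE workhorse: reweight each logged reward by how much more (or less) likely the target policy is to take the logged action. The sketch below shows plain IPS only; the paper's censoring correction for survival outcomes is not reproduced.

```python
import numpy as np

def ips_estimate(rewards, logging_probs, target_probs):
    """Inverse propensity scoring: estimate the target policy's value from
    logged data by importance-weighting each observed reward.
    `logging_probs[i]` / `target_probs[i]` are the probabilities that the
    logging / target policy assigns to the action actually logged at step i."""
    w = target_probs / logging_probs        # importance weight per log entry
    return float(np.mean(w * rewards))

# Logged data: the logging policy picked each action with prob 0.5; the target
# policy deterministically picks action 1, so only action-1 rows contribute.
rews  = np.array([1.0, 2.0, 0.0, 4.0])      # rewards for logged actions 0,1,0,1
log_p = np.array([0.5, 0.5, 0.5, 0.5])
tgt_p = np.array([0.0, 1.0, 0.0, 1.0])      # target prob of each logged action
print(ips_estimate(rews, log_p, tgt_p))     # 3.0
```

Under censoring, the observed survival time is a lower bound rather than the true outcome, so the naive weights above become biased — which is the gap the paper addresses.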

MKA: Memory-Keyed Attention for Efficient Long-Context Reasoning

arXiv:2603.20586v2 Announce Type: replace-cross Abstract: As long-context language modeling becomes increasingly important, the cost of maintaining and attending to large Key/Value (KV) caches grows rapidly, becoming a major bottleneck in both training and inference. While prior works such as Multi-Query Attention (MQA) and Multi-Latent Attention (MLA) reduce memory by sharing or compressing KV features, they […]
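Of the baselines the abstract cites, Multi-Query Attention (MQA) is the simplest to state: every query head keeps its own projection, but all heads share a single K and V projection, so the KV cache is one head wide instead of one per head. The sketch below illustrates that sharing with plain NumPy; it is a generic MQA sketch, not the paper's memory-keyed mechanism.

```python
import numpy as np

def multi_query_attention(x, Wq_heads, Wk, Wv):
    """Multi-Query Attention: per-head query projections, one shared K/V.
    Only k and v would need caching at inference, regardless of head count."""
    k = x @ Wk                               # (seq, d_head), shared by all heads
    v = x @ Wv                               # (seq, d_head), shared by all heads
    outs = []
    for Wq in Wq_heads:                      # one projection per query head
        q = x @ Wq                           # (seq, d_head)
        scores = q @ k.T / np.sqrt(k.shape[-1])
        scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
        attn = np.exp(scores)
        attn /= attn.sum(axis=-1, keepdims=True)       # row-wise softmax
        outs.append(attn @ v)
    return np.concatenate(outs, axis=-1)     # (seq, n_heads * d_head)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))                  # 5 tokens, model dim 8
Wq_heads = [rng.normal(size=(8, 4)) for _ in range(2)]
Wk, Wv = rng.normal(size=(8, 4)), rng.normal(size=(8, 4))
print(multi_query_attention(x, Wq_heads, Wk, Wv).shape)  # (5, 8)
```

With H heads, the KV cache shrinks by a factor of H relative to full multi-head attention; MLA instead compresses K/V into a shared latent, trading a projection step for further memory savings.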

Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

arXiv:2601.22060v3 Announce Type: replace-cross Abstract: Multimodal large language models (MLLMs) have achieved remarkable success across a broad range of vision tasks. However, constrained by the capacity of their internal world knowledge, prior work has proposed augmenting MLLMs by “reasoning-then-tool-call” for visual and textual search engines to obtain substantial gains on tasks requiring extensive factual information. […]

Confidence Calibration under Ambiguous Ground Truth

arXiv:2603.22879v1 Announce Type: cross Abstract: Confidence calibration assumes a unique ground-truth label per input, yet this assumption fails wherever annotators genuinely disagree. Post-hoc calibrators fitted on majority-voted labels, the standard single-label targets used in practice, can appear well-calibrated under conventional evaluation yet remain substantially miscalibrated against the underlying annotator distribution. We show that this failure […]
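The failure mode the abstract describes can be made concrete with expected calibration error (ECE) computed two ways: against hard majority-voted labels, and against the annotator distribution (soft agreement rates). This is a standard ECE sketch to illustrate the contrast, not the paper's evaluation protocol; the bin count and toy numbers are assumptions.

```python
import numpy as np

def ece(confidences, correct, n_bins=10):
    """Expected calibration error: bin predictions by confidence and average
    the |confidence - accuracy| gap, weighted by bin size. `correct` may be
    hard (0/1 agreement with a majority-voted label) or soft (the fraction
    of annotators agreeing with the prediction)."""
    bins = np.clip((confidences * n_bins).astype(int), 0, n_bins - 1)
    err = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            err += mask.mean() * gap
    return err

# A model predicting with confidence 0.9 on items where annotators split
# 60/40: it looks nearly calibrated against majority labels, but is badly
# miscalibrated against the annotator distribution.
conf = np.full(100, 0.9)
hard = np.ones(100)            # majority label always agrees with prediction
soft = np.full(100, 0.6)       # only 60% of annotators agree
print(round(ece(conf, hard), 2), round(ece(conf, soft), 2))  # 0.1 0.3
```

The gap between the two numbers is exactly the kind of hidden miscalibration the abstract says conventional single-label evaluation fails to surface.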

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK; registration number 16808844.