From Parameter Dynamics to Risk Scoring: Quantifying Sample-Level Safety Degradation in LLM Fine-tuning

arXiv:2605.04572v1 Announce Type: new Abstract: Safety alignment of Large Language Models (LLMs) is extremely fragile, as fine-tuning on a small number of benign samples can erase safety behaviors learned from millions of preference examples. Existing studies attempt to explain this phenomenon by comparing parameters and hidden states before and after fine-tuning, but overlook their dynamic […]

SWAN: Semantic Watermarking with Abstract Meaning Representation

arXiv:2605.04305v1 Announce Type: cross Abstract: We introduce SWAN (Semantic Watermarking with Abstract Meaning Representation), a novel framework that embeds watermark signatures into the semantic structure of a sentence using Abstract Meaning Representation (AMR). In contrast to existing watermarking methods, which typically encode signatures by adjusting token selection preferences during text generation, SWAN embeds the signature […]

On the Wasserstein Gradient Flow Interpretation of Drifting Models

arXiv:2605.05118v1 Announce Type: cross Abstract: Recently, Deng et al. (2026) proposed Generative Modeling via Drifting (GMD), a novel framework for generative tasks. This note presents an analysis of GMD through the lens of Wasserstein Gradient Flows (WGF), i.e., the path of steepest descent for a functional in the space of probability measures, equipped with the […]

NoisyCausal: A Benchmark for Evaluating Causal Reasoning Under Structured Noise

arXiv:2605.04313v1 Announce Type: cross Abstract: Causal reasoning in natural language requires identifying relevant variables, understanding their interactions, and reasoning about effects and interventions, often under noisy or ambiguous conditions. While large language models (LLMs) exhibit strong general reasoning abilities, they struggle to disentangle correlation from causation, particularly when observations are partially incorrect or irrelevant information […]

SensingAgents: A Multi-Agent Collaborative Framework for Robust IMU Activity Recognition

arXiv:2605.04608v1 Announce Type: new Abstract: Human Activity Recognition (HAR) using Inertial Measurement Unit (IMU) sensors is a cornerstone of mobile health, smart environments, and human-computer interaction. However, current deep learning-based HAR models often struggle with heavy reliance on labeled data, position-specific ambiguity, and a lack of transparent reasoning. Inspired by the advanced agents framework, which […]

The First Token Knows: Single-Decode Confidence for Hallucination Detection

arXiv:2605.05166v1 Announce Type: cross Abstract: Self-consistency detects hallucinations by generating multiple sampled answers to a question and measuring agreement, but this requires repeated decoding and can be sensitive to lexical variation. Semantic self-consistency improves this by clustering sampled answers by meaning using natural language inference, but it adds both sampling cost and external inference overhead. […]
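
As a rough illustration of the plain self-consistency baseline this abstract describes (not the paper's single-decode method), agreement among sampled answers can be scored by majority vote. The helper names `normalize` and `self_consistency` and the sample answers are hypothetical, and the surface-level normalization is an assumption chosen for brevity.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase, trim, drop a trailing period, and collapse whitespace."""
    return " ".join(answer.lower().strip().rstrip(".").split())


def self_consistency(answers):
    """Return the majority answer and the fraction of samples that agree with it."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(normalize(a) for a in answers)
    top_answer, top_count = counts.most_common(1)[0]
    return top_answer, top_count / len(answers)


# Hypothetical sampled answers to one question.
samples = ["Paris", "paris.", "Lyon", " Paris "]
majority, agreement = self_consistency(samples)
```

Because matching here is purely lexical, a paraphrase such as "The capital is Paris" would count as a separate answer, which is the sensitivity to lexical variation the abstract mentions.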

Coral: Cost-Efficient Multi-LLM Serving over Heterogeneous Cloud GPUs

arXiv:2605.04357v1 Announce Type: cross Abstract: The usage of large language models (LLMs) has grown increasingly fragmented, with no single model dominating. Meanwhile, cloud providers offer a wide range of mid-tier and older-generation GPUs that enjoy better availability and deliver comparable performance per dollar to top-tier hardware. To efficiently harness these heterogeneous resources for serving multiple […]

AuditRepairBench: A Paired-Execution Trace Corpus for Evaluator-Channel Ranking Instability in Agent Repair

arXiv:2605.04624v1 Announce Type: new Abstract: Agent-repair leaderboards reorder under evaluator reconfiguration, and a measurable share of the reordering is produced by methods that consult evaluator-derived signal during internal selection of candidate repairs. We document this failure mode on a public leaderboard and release AuditRepairBench, a paired-execution trace corpus of 576,000 registered cells (96,000 executed) that […]

Extending Differential Temporal Difference Methods for Episodic Problems

arXiv:2605.04368v1 Announce Type: cross Abstract: Differential temporal difference (TD) methods are value-based reinforcement learning algorithms that have been proposed for infinite-horizon problems. They rely on reward centering, where each reward is centered by the average reward. This keeps the return bounded and removes a value function’s state-independent offset. However, reward centering can alter the optimal […]
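
A minimal sketch of reward centering in the standard tabular differential TD(0) update for continuing tasks, as commonly stated in the average-reward literature; it does not reproduce the episodic extension this paper proposes. The function name, step sizes, and the two-state toy cycle are assumptions made for illustration.

```python
def differential_td0(transitions, num_states, alpha=0.1, eta=1.0):
    """Tabular differential TD(0) over (state, reward, next_state) triples.

    Each reward is centered by a running estimate of the average reward,
    and that estimate is itself updated from the TD error.
    """
    values = [0.0] * num_states
    avg_reward = 0.0
    for state, reward, next_state in transitions:
        # Centered TD error: the reward is offset by the average-reward estimate.
        td_error = reward - avg_reward + values[next_state] - values[state]
        values[state] += alpha * td_error
        avg_reward += eta * alpha * td_error
    return values, avg_reward


# Toy continuing task (an assumption): a deterministic two-state cycle
# with rewards 1 and 3, so the true average reward is 2.
cycle = [(0, 1.0, 1), (1, 3.0, 0)] * 1000
values, avg_reward = differential_td0(cycle, num_states=2)
```

On this cycle the average-reward estimate settles at 2 and the learned values encode only the difference between the two states, which shows how centering removes the state-independent offset.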

Combining Abstract Argumentation and Machine Learning for Efficiently Analyzing Low-Level Process Event Streams

arXiv:2505.05880v2 Announce Type: replace Abstract: Monitoring and analyzing process traces is a critical task for modern companies and organizations. In scenarios where there is a gap between trace events and reference business activities, this entails an interpretation problem, amounting to translating each event of any ongoing trace into the corresponding step of the activity instance. […]

Worst-Case Discovery and Runtime Protection for RL-Based Network Controllers

arXiv:2605.04373v1 Announce Type: cross Abstract: RL-based controllers achieve strong average-case performance in networking tasks such as congestion control and adaptive bitrate streaming. Yet their performance can degrade severely under network conditions where strong performance is still achievable. Identifying such conditions and quantifying the resulting performance gap is intractable by enumeration, while the sequential and closed-loop […]

A Generalized Framework of Antisymmetric Polyspectral Indices for Identifying High-Order Neural Interactions

arXiv:2605.04636v1 Announce Type: new Abstract: Cross-frequency interactions are fundamental brain mechanisms for integrating information across temporal scales. However, accurate identification of these couplings is hindered by complex multi-frequency nonlinearities and by spurious, zero-lag artifacts caused by volume conduction. To our knowledge, conventional metrics lack a robust framework to characterize genuine interactions among multiple time series […]
