ReasoningGuard: Safeguarding Large Reasoning Models with Inference-time Safety Aha Moments

arXiv:2508.04204v2 Announce Type: replace-cross Abstract: Large Reasoning Models (LRMs) have demonstrated impressive performance in reasoning-intensive tasks, but they remain vulnerable to harmful content generation, particularly in the mid-to-late steps of their reasoning processes. Current defense methods, however, depend on costly fine-tuning and additional expert knowledge, which limits their scalability. In this work, we propose ReasoningGuard, […]

VVS: Accelerating Speculative Decoding for Visual Autoregressive Generation via Partial Verification Skipping

arXiv:2511.13587v3 Announce Type: replace-cross Abstract: Visual autoregressive (AR) generation models have demonstrated strong potential for image generation, yet their next-token-prediction paradigm introduces considerable inference latency. Although speculative decoding (SD) has been proven effective for accelerating visual AR models, its “draft one step, then verify one step” paradigm prevents a direct reduction in the number of […]

SafeRedir: Prompt Embedding Redirection for Robust Unlearning in Image Generation Models

arXiv:2601.08623v2 Announce Type: replace-cross Abstract: Image generation models (IGMs), while capable of producing impressive and creative content, often memorize a wide range of undesirable concepts from their training data, leading to the reproduction of unsafe content such as NSFW imagery and copyrighted artistic styles. Such behaviors pose persistent safety and compliance risks in real-world deployments […]

Beyond Variance: Prompt-Efficient RLVR via Rare-Event Amplification and Bidirectional Pairing

arXiv:2602.03452v2 Announce Type: replace-cross Abstract: Reinforcement learning with verifiable rewards (RLVR) is effective for training large language models on deterministic outcome reasoning tasks. Prior work shows RLVR works with few prompts, but prompt selection is often based only on training-accuracy variance, leading to unstable optimization directions and weaker transfer. We revisit prompt selection from a […]

Denoising Particle Filters: Learning State Estimation with Single-Step Objectives

arXiv:2602.19651v2 Announce Type: replace-cross Abstract: Learning-based methods commonly treat state estimation in robotics as a sequence modeling problem. While this paradigm can be effective at maximizing end-to-end performance, models are often difficult to interpret and expensive to train, since training requires unrolling sequences of predictions in time. As an alternative to end-to-end trained state estimation, […]

Centrality-Based Pruning for Efficient Echo State Networks

arXiv:2603.20684v2 Announce Type: replace-cross Abstract: Echo State Networks (ESNs) are a reservoir computing framework widely used for nonlinear time-series prediction. However, despite their effectiveness, randomly initialized reservoirs often contain redundant nodes, leading to unnecessary computational overhead and reduced efficiency. In this work, we propose a graph centrality-based pruning approach that interprets the reservoir as a […]

AutoOR: Scalably Post-training LLMs to Autoformalize Operations Research Problems

arXiv:2604.16804v2 Announce Type: replace-cross Abstract: Optimization problems are central to decision-making in manufacturing, logistics, scheduling, and other industrial settings. Translating complicated descriptions of these problems into solver-ready formulations requires specialized operations research (OR) expertise, making it hard to scale. We present AutoOR, a scalable synthetic data generation and reinforcement learning pipeline that trains LLMs to […]

RamanBench: A Large-Scale Benchmark for Machine Learning on Raman Spectroscopy

arXiv:2605.02003v2 Announce Type: replace-cross Abstract: Machine Learning (ML) has transformed many scientific fields, yet key applications still lack standardized benchmarks. Raman spectroscopy, a widely used technique for non-invasive molecular analysis, is one such field where progress is limited by fragmented datasets, inconsistent evaluation, and models that fail to capture the structure of spectral data. We […]

Architectural Constraints Alignment in AI-assisted, Platform-based Service Development

arXiv:2605.04973v1 Announce Type: cross Abstract: AI-assisted development tools enable rapid prototyping of services but often lack awareness of architectural constraints, infrastructure dependencies, and organizational standards required in production environments. Consequently, generated artifacts may exhibit brittle behavior and limited deployability. We propose a retrieval-augmented scaffolding approach that combines platform-based code generation with agentic clarification loops to […]

Preference-Based Self-Distillation: Beyond KL Matching via Reward Regularization

arXiv:2605.05040v1 Announce Type: cross Abstract: On-policy distillation is an efficient alternative to reinforcement learning, offering dense token-level training signals. However, its reliance on a stronger external teacher has driven recent work on on-policy self-distillation, where the same model serves as both teacher and student under different prompt contexts. Yet, existing self-distillation methods largely reduce learning […]

SoK: Robustness in Large Language Models against Jailbreak Attacks

arXiv:2605.05058v1 Announce Type: cross Abstract: Large Language Models (LLMs) have achieved remarkable success but remain highly susceptible to jailbreak attacks, in which adversarial prompts coerce models into generating harmful, unethical, or policy-violating outputs. Such attacks pose real-world risks, eroding safety, trust, and regulatory compliance in high-stakes applications. Although a variety of attack and defense methods […]

Driver-WM: A Driver-Centric Traffic-Conditioned Latent World Model for In-Cabin Dynamics Rollout

arXiv:2605.05092v1 Announce Type: cross Abstract: Safe L2/L3 driving automation requires anticipating human-in-the-loop reactions during shared-control transitions. While most driving world models forecast the external environment, in-cabin intelligence remains strictly recognition-oriented and lacks multi-step rollout capabilities for driver dynamics. We introduce Driver-WM, a driver-centric latent world model that rolls out in-cabin dynamics causally conditioned on out-cabin […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK; registration number 16808844.