LAsset: An LLM-assisted Security Asset Identification Framework for System-on-Chip (SoC) Verification

arXiv:2601.02624v2 Announce Type: replace-cross Abstract: The growing complexity of modern system-on-chip (SoC) and IP designs is making security assurance difficult day by day. One of the fundamental steps in the pre-silicon security verification of a hardware design is the identification of security assets, as it substantially influences downstream security verification tasks, such as threat modeling, […]

Countering the Over-Reliance Trap: Mitigating Object Hallucination for LVLMs via a Self-Validation Framework

arXiv:2601.22451v2 Announce Type: replace-cross Abstract: Despite progress in Large Vision Language Models (LVLMs), object hallucination remains a critical issue in image captioning task, where models generate descriptions of non-existent objects, compromising their reliability. Previous work attributes this to LVLMs’ over-reliance on language priors and attempts to mitigate it through logits calibration. However, they still lack […]

MedDialBench: Benchmarking LLM Diagnostic Robustness under Parametric Adversarial Patient Behaviors

arXiv:2604.06846v1 Announce Type: cross Abstract: Interactive medical dialogue benchmarks have shown that LLM diagnostic accuracy degrades significantly when interacting with non-cooperative patients, yet existing approaches either apply adversarial behaviors without graded severity or case-specific grounding, or reduce patient non-cooperation to a single ungraded axis, and none analyze cross-dimension interactions. We introduce MedDialBench, a benchmark enabling […]

ATBench: A Diverse and Realistic Agent Trajectory Benchmark for Safety Evaluation and Diagnosis

arXiv:2604.02022v2 Announce Type: replace Abstract: Evaluating the safety of LLM-based agents is increasingly important because risks in realistic deployments often emerge over multi-step interactions rather than isolated prompts or final responses. Existing trajectory-level benchmarks remain limited by insufficient interaction diversity, coarse observability of safety failures, and weak long-horizon realism. We introduce ATBench, a trajectory-level benchmark […]

Draw-In-Mind: Rebalancing Designer-Painter Roles in Unified Multimodal Models Benefits Image Editing

arXiv:2509.01986v4 Announce Type: replace-cross Abstract: In recent years, integrating multimodal understanding and generation into a single unified model has emerged as a promising paradigm. While this approach achieves strong results in text-to-image (T2I) generation, it still struggles with precise image editing. We attribute this limitation to an imbalanced division of responsibilities. The understanding module primarily […]

Chatbot-Based Assessment of Code Understanding in Automated Programming Assessment Systems

arXiv:2604.07304v1 Announce Type: cross Abstract: Large Language Models (LLMs) challenge conventional automated programming assessment because students can now produce functionally correct code without demonstrating corresponding understanding. This paper makes two contributions. First, it reports a saturation-based scoping review of conversational assessment approaches in programming education. The review identifies three dominant architectural families: rule-based or template-driven […]

Region-R1: Reinforcing Query-Side Region Cropping for Multi-Modal Re-Ranking

arXiv:2604.05268v2 Announce Type: replace-cross Abstract: Multi-modal retrieval-augmented generation (MM-RAG) relies heavily on re-rankers to surface the most relevant evidence for image-question queries. However, standard re-rankers typically process the full query image as a global embedding, making them susceptible to visual distractors (e.g., background clutter) that skew similarity scores. We propose Region-R1, a query-side region cropping […]

CADENCE: Context-Adaptive Depth Estimation for Navigation and Computational Efficiency

arXiv:2604.07286v1 Announce Type: cross Abstract: Autonomous vehicles deployed in remote environments typically rely on embedded processors, compact batteries, and lightweight sensors. These hardware limitations conflict with the need to derive robust representations of the environment, which often requires executing computationally intensive deep neural networks for perception. To address this challenge, we present CADENCE, an adaptive […]

Bridging Natural Language and Microgrid Dynamics: A Context-Aware Simulator and Dataset

arXiv:2604.05429v2 Announce Type: replace-cross Abstract: Addressing the critical need for intelligent, context-aware energy management in renewable systems, we introduce the OpenCEM Simulator and Dataset: the first open-source digital twin explicitly designed to integrate rich, unstructured contextual information with quantitative renewable energy dynamics. Traditional energy management relies heavily on numerical time series, thereby neglecting the significant […]

Towards Privacy-Preserving Large Language Model: Text-free Inference Through Alignment and Adaptation

arXiv:2604.06831v1 Announce Type: cross Abstract: Current LLM-based services typically require users to submit raw text regardless of its sensitivity. While intuitive, such practice introduces substantial privacy risks, as unauthorized access may expose personal, medical, or legal information. Although prior defenses strived to mitigate these risks, they often incur substantial computational overhead and degrade model performance. […]

On the Step Length Confounding in LLM Reasoning Data Selection

arXiv:2604.06834v1 Announce Type: cross Abstract: Large reasoning models have recently demonstrated strong performance on complex tasks that require long chain-of-thought reasoning, through supervised fine-tuning on large-scale and high-quality datasets. To construct such datasets, existing pipelines generate long reasoning data from more capable Large Language Models (LLMs) and apply manually heuristic or naturalness-based selection methods to […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844