Graph-of-Mark: Promote Spatial Reasoning in Multimodal Language Models with Graph-Based Visual Prompting

arXiv:2603.06663v1 Announce Type: cross Abstract: Recent advances in training-free visual prompting, such as Set-of-Mark, have emerged as a promising direction for enhancing the grounding capabilities of multimodal language models (MLMs). These techniques operate by partitioning the input image into object regions and annotating them with marks, predominantly boxes with numeric identifiers, before feeding the augmented […]

CDRRM: Contrast-Driven Rubric Generation for Reliable and Interpretable Reward Modeling

arXiv:2603.08035v1 Announce Type: new Abstract: Reward modeling is essential for aligning Large Language Models (LLMs) with human preferences, yet conventional reward models suffer from poor interpretability and heavy reliance on costly expert annotations. While recent rubric-based approaches enhance evaluation transparency, they lack systematic quality control, yielding noisy and redundant criteria, failing to mitigate persistent biases (e.g., […]

Evidence-Driven Reasoning for Industrial Maintenance Using Heterogeneous Data

arXiv:2603.08171v1 Announce Type: new Abstract: Industrial maintenance platforms contain rich but fragmented evidence, including free-text work orders, heterogeneous operational sensors or indicators, and structured failure knowledge. These sources are often analyzed in isolation, producing alerts or forecasts that do not support conditional decision-making: given this asset history and behavior, what is happening and what action […]

Deconstructing Multimodal Mathematical Reasoning: Towards a Unified Perception-Alignment-Reasoning Paradigm

arXiv:2603.08291v1 Announce Type: new Abstract: Multimodal Mathematical Reasoning (MMR) has recently attracted increasing attention for its capability to solve mathematical problems that involve both textual and visual modalities. However, current models still face significant challenges in real-world visual math tasks. They often misinterpret diagrams, fail to align mathematical symbols with visual evidence, and produce inconsistent […]

Don’t Freeze, Don’t Crash: Extending the Safe Operating Range of Neural Navigation in Dense Crowds

arXiv:2603.06729v1 Announce Type: cross Abstract: Navigating safely through dense crowds requires collision avoidance that generalizes beyond the densities seen during training. Learning-based crowd navigation can break under out-of-distribution crowd sizes due to density-sensitive observation normalization and social-cost scaling, while analytical solvers often remain safe but freeze in tight interactions. We propose a reinforcement learning approach […]

GALACTIC: Global and Local Agnostic Counterfactuals for Time-series Clustering

arXiv:2603.05318v2 Announce Type: replace-cross Abstract: Time-series clustering is a fundamental tool for pattern discovery, yet existing explainability methods, primarily based on feature attribution or metadata, fail to identify the transitions that move an instance across cluster boundaries. While Counterfactual Explanations (CEs) identify the minimal temporal perturbations required to alter the prediction of a model, they […]

UWPD: A General Paradigm for Invisible Watermark Detection Agnostic to Embedding Algorithms

arXiv:2603.06723v1 Announce Type: cross Abstract: Invisible watermarks, as an essential technology for image copyright protection, have been widely deployed with the rapid development of social media and AIGC. However, existing invisible watermark detection heavily relies on prior knowledge of specific algorithms, leading to limited detection capabilities for “unknown watermarks” in open environments. To this end, […]

Conformal Prediction for Risk-Controlled Medical Entity Extraction Across Clinical Domains

arXiv:2603.00924v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) are increasingly used for medical entity extraction, yet their confidence scores are often miscalibrated, limiting safe deployment in clinical settings. We present a conformal prediction framework that provides finite-sample coverage guarantees for LLM-based extraction across two clinical domains. First, we extract structured entities from 1,000 FDA […]

Extracting Recurring Vulnerabilities from Black-Box LLM-Generated Software

arXiv:2602.04894v3 Announce Type: replace-cross Abstract: LLMs are increasingly used for code generation, but their outputs often follow recurring templates that can induce predictable vulnerabilities. We study vulnerability persistence in LLM-generated software and introduce Feature–Security Table (FSTab) with two components. First, FSTab enables a black-box attack that predicts likely backend vulnerabilities from observable frontend features and […]

AdaCultureSafe: Adaptive Cultural Safety Grounded by Cultural Knowledge in Large Language Models

arXiv:2603.08275v1 Announce Type: cross Abstract: With the widespread adoption of Large Language Models (LLMs), respecting indigenous cultures becomes essential for models’ cultural safety and responsible global applications. Existing studies separately consider cultural safety and cultural knowledge and neglect that the former should be grounded by the latter. This severely prevents LLMs from yielding culture-specific respectful […]

An explainable hybrid deep learning-enabled intelligent fault detection and diagnosis approach for automotive software systems validation

arXiv:2603.08165v1 Announce Type: cross Abstract: Advancements in data-driven machine learning have emerged as a pivotal element in supporting automotive software systems (ASSs) engineering across various levels of the V-development process. During system verification and validation, the integration of an intelligent fault detection and diagnosis (FDD) model with test recordings analysis process serves as a powerful tool for efficiently ensuring functional safety. However, the lack of […]

The Future of Software Testing: AI-Powered Test Case Generation and Validation

arXiv:2409.05808v3 Announce Type: replace-cross Abstract: Software testing is a crucial phase in the software development lifecycle (SDLC), ensuring that products meet necessary functional, performance, and quality benchmarks before release. Despite advancements in automation, traditional methods of generating and validating test cases still face significant challenges, including prolonged timelines, human error, incomplete test coverage, and high […]

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844