Semantic Robustness Probing via Inpainting: An Interactive Tool for Safety-Critical Object Detection

arXiv:2605.27155v1 Announce Type: cross Abstract: Testing object detectors in safety-critical domains requires semantically meaningful probes beyond pixel-level corruptions. We present SemProbe, a tool for semantic

Negligible in Size, Significant in Effect: On Scale Vectors in Large Language Models

arXiv:2605.26895v1 Announce Type: cross Abstract: Normalization layers in modern large language models (LLMs) consist of a deterministic normalization operation and a learnable scale vector. While

Evaluating the Relevance of Uncertainty Estimators for LLM Hallucination

arXiv:2605.27016v1 Announce Type: cross Abstract: Large language models (LLMs) are prone to hallucinations, i.e., statements unsupported by the input or training data, hindering reliable deployment.

The Labyrinth and the Thread: Rethinking Regularizations in Sequential Knowledge Editing for Large Language Models

arXiv:2605.26670v1 Announce Type: cross Abstract: Sequential editing of structured knowledge in large language models allows targeted factual updates without retraining, yet existing methods often rely

Generative artificial intelligence and the marginalization of minoritized knowledges in higher education: the case of disability

arXiv:2605.26769v1 Announce Type: cross Abstract: Generative artificial intelligence redefines higher education by restructuring the processes through which scientific knowledge is produced and validated. These systems

MatFormBench: A Benchmarking Evaluation Framework for Target-Driven Materials Formulation

May 27, 2026

arXiv:2605.26741v1 Announce Type: cross
Abstract: Inverse design of materials has significantly advanced target-driven formulation optimization, yet existing materials machine learning benchmarks remain limited to forward property prediction, failing to systematically evaluate inverse optimization and generation algorithms, a critical gap that hinders the progress of target-driven materials design. To address this limitation, we propose MatFormBench, a novel benchmarking ecosystem tailored to evaluate and guide generative strategies for target-driven formulation. MatFormBench integrates a physics-driven formulation generation scheme to generate synthetic samples that faithfully emulate realistic materials structure-property response relationships, complemented by five escalating difficulty levels to quantify the complexity of these relationships. To rigorously assess algorithm performance, we further propose MatFormScore, a multi-dimensional metric that comprehensively quantifies performance across five critical axes: target success, search efficiency, exploratory capacity, robustness, and stability. We validate MatFormBench by evaluating 39 diverse inverse design algorithms, covering classical surrogate-assisted black-box search, state-of-the-art deep generative models, and increasingly popular Large Language Model (LLM)-based recommendation strategies. Across 1170 standardized algorithm-task evaluations, diffusion-based models demonstrate the strongest overall performance, while Variational Autoencoder (VAE)-based and Genetic Algorithm (GA)-based methods exhibit distinct advantages in specific scenarios. By establishing a unified evaluation standard for target-driven materials formulation, MatFormBench enables reproducible benchmarking, principled algorithm comparison, and diagnostic analysis of inverse design strategies, providing a foundational tool for advancing materials inverse design.

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844