arXiv:2505.18675v3 Announce Type: replace-cross
Abstract: Multimodal large language models (MLLMs) have demonstrated significant progress in semantic scene understanding and text-image alignment, with reasoning variants further enhancing performance on complex tasks involving mathematics and logic. However, their ability to perform fine-grained visual reasoning over structured visual inputs such as transit maps remains largely unassessed. To bridge this gap, we introduce ReasonMap, a novel benchmark specifically designed to evaluate these capabilities. ReasonMap encompasses high-resolution transit maps from 30 cities and includes 1,008 question-answer pairs spanning two question types and three templates. Furthermore, we design a two-level evaluation pipeline that properly assesses both answer correctness and quality. Our comprehensive evaluation of 16 popular MLLMs reveals a counterintuitive pattern: among open-source models, base variants outperform their reasoning-tuned counterparts, whereas the opposite trend holds for closed-source models. Further analysis under a visual-masking setting confirms that strong performance requires direct visual grounding rather than reliance on language priors alone. We also establish a training baseline with reinforcement fine-tuning, providing a reference for future exploration. We hope this benchmark study offers new insights into visual reasoning and helps explain the gap between open- and closed-source models.
Toward terminological clarity in digital biomarker research
Digital biomarker research has generated thousands of publications demonstrating associations between sensor-derived measures and clinical conditions, yet clinical adoption remains negligible. We identify a foundational