Graph-to-Vision: Multi-graph Understanding and Reasoning using Vision-Language Models

arXiv:2503.21435v3 Announce Type: replace Abstract: Recent advances in Vision-Language Models (VLMs) have shown promising capabilities in interpreting visualized graph data, offering a new perspective for graph-structured reasoning beyond traditional Graph Neural Networks (GNNs). However, existing studies focus primarily on single-graph reasoning, leaving the critical challenge of multi-graph joint reasoning underexplored. In this work, we introduce […]

EuropeMedQA Study Protocol: A Multilingual, Multimodal Medical Examination Dataset for Language Model Evaluation

arXiv:2604.14306v2 Announce Type: replace-cross Abstract: While Large Language Models (LLMs) have demonstrated high proficiency on English-centric medical examinations, their performance often declines when faced with non-English languages and multimodal diagnostic tasks. This study protocol describes the development of EuropeMedQA, the first comprehensive, multilingual, and multimodal medical examination dataset sourced from official regulatory exams in Italy, […]

AdaptEvolve: Improving Efficiency of Evolutionary AI Agents through Adaptive Model Selection

arXiv:2602.11931v2 Announce Type: replace-cross Abstract: Evolutionary agentic systems intensify the trade-off between computational efficiency and reasoning capability by repeatedly invoking large language models (LLMs) during inference. This setting raises a central question: how can an agent dynamically select an LLM that is sufficiently capable for the current generation step while remaining computationally efficient? While model […]

Atlas-Alignment: Making Interpretability Transferable Across Language Models

arXiv:2510.27413v2 Announce Type: replace-cross Abstract: Interpretability is crucial for building safe, reliable, and controllable language models, yet existing interpretability pipelines remain costly and difficult to scale. Interpreting a new model typically requires training model-specific components (e.g., sparse autoencoders), followed by manual or semi-automated labeling and validation, imposing a growing “transparency tax” that does not scale […]

The Shape of Adversarial Influence: Characterizing LLM Latent Spaces with Persistent Homology

arXiv:2505.20435v3 Announce Type: replace-cross Abstract: Existing interpretability methods for Large Language Models (LLMs) predominantly capture linear directions or isolated features. This overlooks the high-dimensional, relational, and nonlinear geometry of model representations. We apply persistent homology (PH) to characterize how adversarial inputs reshape the geometry and topology of internal representation spaces of LLMs. This phenomenon, especially […]

HFX: Joint Design of Algorithms and Systems for Multi-SLO Serving and Fast Scaling

arXiv:2508.15919v3 Announce Type: replace-cross Abstract: Large language model (LLM) serving faces the dual challenge of meeting strict user-specific service-level objectives (SLOs) while minimizing computational cost under dynamic, multi-task workloads. Existing approaches either rely on static scheduling policies or focus on single-task settings, limiting their applicability in real-world deployments with heterogeneous requests, variable prompt lengths, and […]

Handling Missing Modalities in Multimodal Survival Prediction for Non-Small Cell Lung Cancer

arXiv:2601.10386v2 Announce Type: replace-cross Abstract: Accurate survival prediction in Non-Small Cell Lung Cancer (NSCLC) requires integrating clinical, radiological, and histopathological data. Multimodal Deep Learning (MDL) can improve precision prognosis, but small cohorts and missing modalities limit its clinical applicability, as conventional approaches enforce complete case filtering or imputation. We present a missing-aware multimodal survival framework […]

Categorical Perception in Large Language Model Hidden States: Structural Warping at Digit-Count Boundaries

arXiv:2603.28258v2 Announce Type: replace-cross Abstract: Categorical perception (CP) — enhanced discriminability at category boundaries — is among the most studied phenomena in perceptual psychology. This paper reports that analogous geometric warping occurs in the hidden-state representations of large language models (LLMs) processing Arabic numerals. Using representational similarity analysis across six models from five architecture families, […]

Consequentialist Objectives and Catastrophe

arXiv:2603.15017v3 Announce Type: replace Abstract: Because human preferences are too complex to codify, AIs operate with misspecified objectives. Optimizing such objectives often produces undesirable outcomes; this phenomenon is known as reward hacking. Such outcomes are not necessarily catastrophic. Indeed, most examples of reward hacking in previous literature are benign. And typically, objectives can be modified […]

PoLO: Proof-of-Learning and Proof-of-Ownership at Once with Chained Watermarking

arXiv:2505.12296v2 Announce Type: replace-cross Abstract: Our evaluation shows that PoLO achieves 99% watermark detection accuracy for ownership verification, while preserving data privacy and cutting verification costs to just 1.5–10% of traditional methods. Forging PoLO demands 1.1–4× more resources than honest proof generation, with the original proof retaining over 90% detection accuracy even after attacks.

KuaiLive: A Real-time Interactive Dataset for Live Streaming Recommendation

arXiv:2508.05633v2 Announce Type: replace-cross Abstract: Live streaming platforms have become a dominant form of online content consumption, offering dynamically evolving content, real-time interactions, and highly engaging user experiences. These unique characteristics introduce new challenges that differentiate live streaming recommendation from traditional recommendation settings and have garnered increasing attention from industry in recent years. However, research […]

Multimodal Neural Operators for Real-Time Biomechanical Modelling of Traumatic Brain Injury

arXiv:2510.03248v3 Announce Type: replace-cross Abstract: Background: Traumatic brain injury modeling requires integrating volumetric neuroimaging, demographic parameters, and acquisition metadata. Finite element solvers are too computationally expensive for clinical settings. Neural operators offer much faster inference. Their ability to integrate volumetric imaging with scalar metadata remains underexplored for biomechanical predictions. Objective: This study evaluates multimodal neural […]
