arXiv:2406.05984v2 Announce Type: replace-cross Abstract: Mental health constitutes a complex and pervasive global challenge, affecting millions of lives and often leading to severe consequences. In this paper, we conduct a thorough survey to explore the intersection of data science, artificial intelligence, and mental healthcare, focusing on the recent developments of mental disorder detection through online […]
Harness as an Asset: Enforcing Determinism via the Convergent AI Agent Framework (CAAF)
arXiv:2604.17025v2 Announce Type: replace Abstract: Large Language Models produce a controllability gap in safety-critical engineering: even low rates of undetected constraint violations render a system undeployable. Current orchestration paradigms suffer from sycophantic compliance, context attention decay, and stochastic oscillation during self-correction. We introduce the Convergent AI Agent Framework (CAAF), which transitions agentic workflows from open-loop […]
The Shape of Attraction in UMAP: Exploring the Embedding Forces in Dimensionality Reduction
arXiv:2503.09101v4 Announce Type: replace-cross Abstract: Uniform manifold approximation and projection (UMAP) is among the most popular neighbor embedding methods. The method samples pairs of point indices according to similarities in the high-dimensional space, and applies attractive and repulsive forces to their coordinates in the low-dimensional embedding. In this paper, we analyze the forces to reveal […]
SycoPhantasy: Quantifying Sycophancy and Hallucination in Small Open Weight VLMs for Vision-Language Scoring of Fantasy Characters
arXiv:2604.24346v1 Announce Type: cross Abstract: Vision-language models (VLMs) are increasingly deployed as evaluators in tasks requiring nuanced image understanding, yet their reliability in scoring alignment between images and text descriptions remains underexplored. We investigate whether small, open-weight VLMs exhibit sycophantic behavior when evaluating image-text alignment: assigning high scores without grounding their judgments in visual evidence. […]
SIV-Bench: A Video Benchmark for Social Interaction Understanding and Reasoning
arXiv:2506.05425v2 Announce Type: replace-cross Abstract: Understanding social interaction, which encompasses perceiving numerous and subtle multimodal cues, inferring unobservable mental states and relations, and dynamically predicting others’ behavior, is the foundation for achieving human-machine interaction. Despite rapid advances in Multimodal Large Language Models (MLLMs), the rich and multifaceted nature of social interaction has hindered the development […]
DepthKV: Layer-Dependent KV Cache Pruning for Long-Context LLM Inference
arXiv:2604.24647v1 Announce Type: cross Abstract: Long-context reasoning is a critical capability of large language models (LLMs), enabling applications such as long-document understanding, summarization, and code generation. However, efficient autoregressive inference relies on the key-value (KV) cache, whose memory footprint grows linearly with sequence length, leading to a major memory bottleneck. To mitigate this overhead, KV […]
CFDLLMBench: A Benchmark Suite for Evaluating Large Language Models in Computational Fluid Dynamics
arXiv:2509.20374v3 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have demonstrated strong performance across general NLP tasks, but their utility in automating numerical experiments of complex physical systems, a critical and labor-intensive component, remains underexplored. As the major workhorse of computational science over the past decades, Computational Fluid Dynamics (CFD) offers a uniquely […]
Agentic Hives: Equilibrium, Indeterminacy, and Endogenous Cycles in Self-Organizing Multi-Agent Systems
arXiv:2603.00130v2 Announce Type: replace-cross Abstract: Current multi-agent AI systems operate with a fixed number of agents whose roles are specified at design time. No formal theory governs when agents should be created, destroyed, or re-specialized at runtime, let alone how the population structure responds to changes in resources or objectives. We introduce the Agentic Hive, a […]
On the Reasoning Abilities of Masked Diffusion Language Models
arXiv:2510.13117v3 Announce Type: replace-cross Abstract: Masked diffusion models (MDMs) for text offer a compelling alternative to traditional autoregressive language models. Parallel generation makes them efficient, but their computational capabilities and the limitations inherent in their parallelism remain largely unexplored. To this end, we characterize what types of reasoning problems MDMs can provably solve and how […]
The Pragmatic Persona: Discovering LLM Persona through Bridging Inference
arXiv:2604.24079v1 Announce Type: cross Abstract: Large Language Models (LLMs) reveal inherent and distinctive personas through dialogue. However, most existing persona discovery approaches rely on surface-level lexical or stylistic cues, treating dialogue as a flat sequence of tokens and failing to capture the deeper discourse-level structures that sustain persona consistency. To address this limitation, we propose […]
A Lightweight Explainable Guardrail for Prompt Safety
arXiv:2602.15853v2 Announce Type: replace-cross Abstract: We propose a lightweight explainable guardrail (LEG) method to detect unsafe prompts. LEG uses a multi-task learning architecture to jointly learn a prompt classifier and an explanation classifier, where the latter labels prompt words that explain the safe/unsafe overall decision. LEG is trained on synthetic explanation data, which is generated […]
Federated Cross-Modal Retrieval with Missing Modalities via Semantic Routing and Adapter Personalization
arXiv:2604.22885v1 Announce Type: cross Abstract: Federated cross-modal retrieval faces severe challenges from heterogeneous client data, particularly non-IID semantic distributions and missing modalities. Under such heterogeneity, a single global model is often insufficient to capture both shared cross-modal knowledge and client-specific characteristics. We propose RCSR, a personalization-friendly federated framework that integrates prototype anchoring, retrieval-centric semantic routing, […]