arXiv:2603.16171v1 Announce Type: cross Abstract: We present MemX, a local-first long-term memory system for AI assistants with a stability-oriented retrieval design. MemX is implemented in Rust on top of libSQL and an OpenAI-compatible embedding API, providing persistent, searchable, and explainable memory for conversational agents. Its retrieval pipeline applies vector recall, keyword recall, Reciprocal Rank Fusion (RRF), […]
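The abstract names Reciprocal Rank Fusion as the stage that merges the vector and keyword recall lists. RRF itself is a standard technique; a minimal sketch follows, assuming the common formulation (the function name, the example memory IDs, and the smoothing constant k=60 are illustrative, not MemX's actual API):

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: merge several ranked lists into one.

    Each ranking is a list of document IDs ordered best-first.
    A document's fused score is the sum of 1/(k + rank) over every
    list it appears in; k dampens the influence of top ranks.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical recall lists from the two retrieval paths:
vector_hits = ["m3", "m1", "m7"]
keyword_hits = ["m1", "m4", "m3"]
fused = rrf_fuse([vector_hits, keyword_hits])
```

Because RRF operates only on ranks, not raw scores, it needs no calibration between the embedding-similarity and keyword-match scales, which is presumably why hybrid pipelines like the one described favor it.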
AsgardBench – Evaluating Visually Grounded Interactive Planning Under Minimal Feedback
arXiv:2603.15888v1 Announce Type: new Abstract: With AsgardBench we aim to evaluate visually grounded, high-level action sequence generation and interactive planning, focusing specifically on plan adaptation during execution based on visual observations rather than navigation or low-level manipulation. In the landscape of embodied AI benchmarks, AsgardBench targets the capability category of interactive planning, which is more […]
A Scoping Review of AI-Driven Digital Interventions in Mental Health Care: Mapping Applications Across Screening, Support, Monitoring, Prevention, and Clinical Education
arXiv:2603.16204v1 Announce Type: cross Abstract: Artificial intelligence (AI)-enabled digital interventions, including Generative AI (GenAI) and Human-Centered AI (HCAI), are increasingly used to expand access to digital psychiatry and mental health care. This PRISMA-ScR scoping review maps the landscape of AI-driven mental health (mHealth) technologies across five critical phases: pre-treatment (screening/triage), treatment (therapeutic support), post-treatment (remote […]
Prompt Sensitivity and Answer Consistency of Small Open-Source Language Models for Clinical Question Answering in Low-Resource Healthcare
arXiv:2603.00917v3 Announce Type: replace-cross Abstract: Small open-source language models are gaining attention for healthcare applications in low-resource settings where cloud infrastructure and GPU hardware may be unavailable. However, the reliability of these models under different phrasings of the same clinical question remains poorly understood. We evaluate five open-source models (Gemma 2 2B, Phi-3 Mini 3.8B, […]
Visual Prompt Discovery via Semantic Exploration
arXiv:2603.16250v1 Announce Type: cross Abstract: LVLMs encounter significant challenges in image understanding and visual reasoning, leading to critical perception failures. Visual prompts, which incorporate image manipulation code, have shown promising potential in mitigating these issues. While visual prompting has emerged as a promising direction, previous methods for visual prompt generation have focused on tool selection rather than diagnosing […]
Prompt Engineering for Scale Development in Generative Psychometrics
arXiv:2603.15909v1 Announce Type: new Abstract: This Monte Carlo simulation examines how prompt engineering strategies shape the quality of large language model (LLM)–generated personality assessment items within the AI-GENIE framework for generative psychometrics. Item pools targeting the Big Five traits were generated using multiple prompting designs (zero-shot, few-shot, persona-based, and adaptive), model temperatures, and LLMs, then […]
Artificial intelligence-enabled single-lead ECG for non-invasive hyperkalemia detection: development, multicenter validation, and proof-of-concept deployment
arXiv:2603.14177v2 Announce Type: replace-cross Abstract: Hyperkalemia is a life-threatening electrolyte disorder that is common in patients with chronic kidney disease and heart failure, yet frequent monitoring remains difficult outside hospital settings. We developed and validated Pocket-K, a single-lead AI-ECG system initialized from the ECGFounder foundation model for non-invasive hyperkalemia screening and handheld deployment. In this […]
PlotTwist: A Creative Plot Generation Framework with Small Language Models
arXiv:2603.16410v1 Announce Type: cross Abstract: Creative plot generation presents a fundamental challenge for language models: transforming a concise premise into a coherent narrative that sustains global structure, character development, and emotional resonance. Although recent Large Language Models (LLMs) demonstrate strong fluency across general-purpose tasks, they typically require preference alignment to perform well on specialized domains […]
Characterizing Delusional Spirals through Human-LLM Chat Logs
arXiv:2603.16567v1 Announce Type: cross Abstract: As large language models (LLMs) have proliferated, disturbing anecdotal reports of negative psychological effects, such as delusions, self-harm, and “AI psychosis,” have emerged in global media and legal discourse. However, it remains unclear how users and chatbots interact over the course of lengthy delusional “spirals,” limiting our ability to understand […]
One Operator to Rule Them All? On Boundary-Indexed Operator Families in Neural PDE Solvers
arXiv:2603.01406v1 Announce Type: cross Abstract: Neural PDE solvers are often described as learning solution operators that map problem data to PDE solutions. In this work, we argue that this interpretation is generally incorrect when boundary conditions vary. We show that standard neural operator training implicitly learns a boundary-indexed family of operators, rather than a single […]
A Novel Evolutionary Method for Automated Skull-Face Overlay in Computer-Aided Craniofacial Superimposition
arXiv:2603.00170v3 Announce Type: replace-cross Abstract: Craniofacial Superimposition is a forensic technique for identifying skeletal remains by comparing a post-mortem skull with ante-mortem facial photographs. A critical step in this process is Skull-Face Overlay (SFO). This stage involves aligning a 3D skull model with a 2D facial image, typically guided by correspondences between cranial and facial landmarks. […]
AdaSwitch: Balancing Exploration and Guidance in Knowledge Distillation via Adaptive Switching
arXiv:2510.07842v2 Announce Type: replace-cross Abstract: Small language models (SLMs) are crucial for applications with strict latency and computational constraints, yet achieving high performance remains challenging. Knowledge distillation (KD) can transfer capabilities from large teacher models, but existing methods face a dilemma: off-policy distillation provides high-quality supervision but suffers from exposure bias (training-inference mismatch), while […]