Effectiveness of AI-Assisted Patient Health Education Using Voice Cloning and ChatGPT: Prospective Randomized Controlled Trial

Background: Traditional patient education often lacks personalization and engagement, potentially limiting knowledge acquisition and treatment adherence. Advances in artificial intelligence (AI), including voice cloning technology

Guide on Selection of Optimal Motivational Themes for Use in a Clinical Trial Recruiting Black US Adults: Survey Study

Background: Black adults in the United States face significant cardiovascular health disparities, which are likely exacerbated by the underrepresentation of Black adults in cardiovascular clinical

The Right to Understand in Health Care AI

Translating Telehealth Communication Research Into Patient-Centered, Implementable Practice

Understanding both patient and clinician perspectives on communication challenges in virtual primary care consultations is important to ensure safe and effective care. This commentary reviews

Telemedicine Adoption for Managing Chronic and Rare Diseases in Indonesia During and Beyond the COVID-19 Era: Qualitative Study

Background: Telemedicine has emerged as a valuable tool for improving health care delivery, especially in low-resource and geographically isolated regions. In Indonesia, the COVID-19 pandemic

Knowing without Acting: The Disentangled Geometry of Safety Mechanisms in Large Language Models

March 16, 2026

arXiv:2603.05773v2 Announce Type: replace-cross
Abstract: Safety alignment is often conceptualized as a monolithic process wherein harmfulness detection automatically triggers refusal. However, the persistence of jailbreak attacks suggests a fundamental mechanistic decoupling. We propose the Disentangled Safety Hypothesis (DSH), positing that safety computation operates on two distinct subspaces: a Recognition Axis ($\mathbf{v}_H$, “Knowing”) and an Execution Axis ($\mathbf{v}_R$, “Acting”). Our geometric analysis reveals a universal “Reflex-to-Dissociation” evolution, where these signals transition from antagonistic entanglement in early layers to structural independence in deep layers. To validate this, we introduce Double-Difference Extraction and Adaptive Causal Steering. Using our curated AmbiguityBench, we demonstrate a causal double dissociation, effectively creating a state of “Knowing without Acting.” Crucially, we leverage this disentanglement to propose the Refusal Erasure Attack (REA), which achieves state-of-the-art attack success rates by surgically lobotomizing the refusal mechanism. Furthermore, we uncover a critical architectural divergence, contrasting the Explicit Semantic Control of Llama3.1 with the Latent Distributed Control of Qwen2.5. The code and dataset are available at https://anonymous.4open.science/r/DSH.
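
The abstract does not spell out how Double-Difference Extraction or Adaptive Causal Steering work, so the sketch below does not attempt to reproduce them. It only illustrates the generic idea of treating a behavior as a single direction in activation space: a difference-of-means estimate of that direction, followed by projecting it out of the hidden states. The synthetic data and all names (`difference_of_means`, `ablate_direction`, `v_r`) are assumptions made for illustration.

```python
import numpy as np

def difference_of_means(pos, neg):
    """Unit vector pointing from the mean of `neg` to the mean of `pos`."""
    d = pos.mean(axis=0) - neg.mean(axis=0)
    return d / np.linalg.norm(d)

def ablate_direction(h, v):
    """Remove the component of every row of `h` along the unit vector `v`."""
    return h - np.outer(h @ v, v)

# Synthetic stand-ins for hidden states (assumption: not real model activations).
rng = np.random.default_rng(0)
dim = 8
planted = np.zeros(dim)
planted[0] = 1.0
refuse = rng.normal(size=(50, dim)) + 3.0 * planted  # states with the planted signal
comply = rng.normal(size=(50, dim))                  # states without it

v_r = difference_of_means(refuse, comply)  # estimated direction
ablated = ablate_direction(refuse, v_r)    # same states with that direction projected out
```

After the projection, every row of `ablated` is orthogonal to `v_r`, while components along other directions are left unchanged.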

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844