Analysis of intellectual property strategies across different categories of digital therapeutics

Advances in digital technology and the coronavirus disease (COVID-19) pandemic have accelerated the digital transformation of healthcare. Digital therapeutics (DTx), which deliver evidence-based interventions through

Correction: Artificial intelligence assessment of valvular disease and ventricular function by a single echocardiography view

Comparative performance of ChatGPT-5 and DeepSeek on the Chinese ultrasound medicine senior professional title examination

Background: Large language models (LLMs) have shown growing potential for medical education and assessment, but evidence on their performance in specialty certification exams in China—particularly in

Depression detection using deep learning and large language models from multimodalities

Depression is a complex psychiatric disorder that affects neural functioning, cognition, emotion, and behavior, making objective assessment a persistent clinical challenge. Traditional diagnostic methods depend

Editorial: Ethical considerations of large language models: challenges and best practices

Mind the Discriminability Trap in Source-Free Cross-domain Few-shot Learning

March 17, 2026

arXiv:2603.13341v1 Announce Type: cross
Abstract: Source-Free Cross-Domain Few-Shot Learning (SF-CDFSL) focuses on fine-tuning with limited training data from target domains (e.g., medical or satellite images), where Vision-Language Models (VLMs) such as CLIP and SigLIP have shown promising results. Existing work on traditional visual models suggests that improving visual discriminability enhances performance. However, in VLM-based SF-CDFSL tasks, we find that strengthening visual-modal discriminability actually suppresses VLMs' performance. In this paper, we investigate this phenomenon to provide an interpretation and a solution. Through both theoretical analysis and experiments, our study reveals that fine-tuning with the typical cross-entropy loss ($\mathcal{L}_{\mathrm{vlm}}$) inherently includes a visual learning part and a cross-modal learning part, where the cross-modal part is crucial for rectifying the severe modality misalignment in SF-CDFSL. However, we find that the visual learning part essentially acts as a shortcut that encourages the model to reduce $\mathcal{L}_{\mathrm{vlm}}$ without considering the cross-modal part, thereby hindering cross-modal alignment and harming performance. Based on this interpretation, we further propose an approach to address this problem: first, we perturb the visual learning to guide the model to focus on cross-modal alignment. Then, we use visual-text semantic relationships to gradually align the visual and textual modalities during fine-tuning. Extensive experiments on various settings, backbones (CLIP, SigLIP, PE-Core), and tasks (4 CDFSL datasets and 11 FSL datasets) show that we consistently set new state-of-the-art results. Code is available at https://github.com/zhenyuZ-HUST/CVPR26-Mind-the-Discriminability-Trap.
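
For readers unfamiliar with the cross-entropy objective the abstract refers to, the following is a minimal sketch of a CLIP-style classification loss: image features are compared with one text feature per class by cosine similarity, and cross-entropy is taken over the resulting logits. The function names, the temperature value, and the toy features are assumptions made for illustration; they are not taken from the paper or its repository, and the sketch does not implement the paper's decomposition or its proposed method.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def vlm_cross_entropy(image_feats, text_feats, labels, temperature=0.01):
    """Mean cross-entropy over image-to-text cosine-similarity logits.

    image_feats: one feature vector per image.
    text_feats:  one feature vector per class (e.g. from a class-name prompt).
    labels:      ground-truth class index for each image.
    temperature: assumed softmax temperature (hypothetical value).
    """
    text_unit = [l2_normalize(t) for t in text_feats]
    total = 0.0
    for feat, label in zip(image_feats, labels):
        img_unit = l2_normalize(feat)
        # Cosine similarity to every class text feature, scaled by temperature.
        logits = [
            sum(a * b for a, b in zip(img_unit, t)) / temperature
            for t in text_unit
        ]
        # Numerically stable log-sum-exp for the softmax normaliser.
        peak = max(logits)
        log_z = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_z - logits[label]  # equals -log softmax(logits)[label]
    return total / len(labels)


# Toy two-class example: each image feature matches its class text feature.
toy_text = [[1.0, 0.0], [0.0, 1.0]]
toy_images = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = vlm_cross_entropy(toy_images, toy_text, labels=[0, 1])
swapped_loss = vlm_cross_entropy(toy_images, toy_text, labels=[1, 0])
```

In this toy setup the loss is close to zero when image and text features agree and large when the labels are swapped, which is the sense in which the loss depends on how well the two modalities are aligned.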

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844