arXiv:2503.03485v2 Announce Type: replace-cross
Abstract: Understanding the biological mechanisms of disease is crucial for medicine and, in particular, for drug discovery. AI-powered analysis of genome-scale biological data holds great potential in this regard. The increasing availability of single-cell RNA sequencing data has enabled the development of large foundation models for disease biology. However, existing foundation models improve only modestly over task-specific models in downstream applications. Here, we explored two avenues for improving single-cell foundation models. First, we scaled the pre-training data to a diverse collection of 116 million cells, which is larger than the datasets used for previous models. Second, we leveraged the availability of large-scale biological annotations as a form of supervision during pre-training. We trained a family of models comprising six transformer-based state-of-the-art single-cell foundation models with 70 million, 160 million, and 400 million parameters. We vetted our models on several downstream evaluation tasks, including identifying the underlying disease state of held-out donors not seen during training, distinguishing between diseased and healthy cells for disease conditions and donors not seen during training, and probing the learned representations for known biology. Our models showed substantial improvement over existing work, and scaling experiments showed that performance improved predictably with both data volume and parameter count.
Identifying needs in adult rehabilitation to support the clinical implementation of robotics and allied technologies: an Italian national survey
Introduction: Robotics and technological interventions are increasingly being explored as solutions to improve rehabilitation outcomes but their implementation in clinical practice remains very limited. Understanding patient

