arXiv:2503.06286v3 Announce Type: replace
Abstract: Large-scale visual neural datasets such as the Natural Scenes Dataset (NSD) are boosting computational neuroscience research by enabling models of the brain with performance beyond what was possible just a decade ago. However, because the stimuli of these datasets typically live within a common naturalistic visual distribution, they do not allow for strict out-of-distribution (OOD) generalization tests, which are crucial for the development of more robust models. Here, we address this limitation by releasing NSD-synthetic, a dataset consisting of 7T fMRI responses from the same eight NSD participants for 284 synthetic images. We show that NSD-synthetic’s fMRI responses reliably encode stimulus-related information and are OOD with respect to NSD. Furthermore, we provide a proof of principle that OOD generalization tests on NSD-synthetic reveal differences between models of the brain that are not detected with the original NSD data; we demonstrate that the degree of OOD (quantified as the distance between a set of responses and the training data used for modeling) is predictive of the magnitude of model failures; and we show that less strict OOD generalization tests can be usefully applied even within the domain of naturalistic stimuli. These results showcase how NSD-synthetic enables OOD generalization tests that facilitate the development of more robust models of visual processing and the formulation of more accurate theories of human vision.
Uncovering Code Insights: Leveraging GitHub Artifacts for Deeper Code Understanding
arXiv:2511.03549v1 Announce Type: cross
Abstract: Understanding the purpose of source code is a critical task in software maintenance, onboarding, and modernization. While large language models



