Mammalian genomes contain a remarkable proportion of non-coding sequence relative to protein-coding sequence, including megabase-scale gene deserts. The origins and functions of gene deserts have long been mysteries, and most studies probing these questions have focused on specific deserts linked to transcription factors such as SOX9 and MYC. Here, we describe 21 human "lonely genes" located within some of the largest deserts in the genome, making them exceptionally isolated from other genes in linear sequence space. Together, these regions comprise roughly 3.4% of the human genome. Despite minimal sequence conservation, we find that the sizes and locations of these gene deserts are conserved and that they likely originated at the root of the vertebrate tree. In contrast to the deserts near transcription factors that have been the primary focus to date, we find that most genes housed within these massive deserts, as well as genes on the edges of deserts, are cell adhesion molecules. Additionally, many of the genes associated with these deserts are dysregulated in datasets involving chromatin modifiers linked to neurodevelopmental disorders. Using DNA FISH in mouse olfactory epithelium, we find that lonely genes appose the nuclear lamina. We propose that these ancient regions may play a structural role in the nucleus, and that this structure makes the genes within them uniquely difficult to express. Although specific chromatin modifiers are able to access lonely genes, this dependence creates a regulatory vulnerability in brain development.



