De novo antibody design models lack sufficient training data to reliably generalize. We demonstrate scalable generation of structural training data for machine learning-driven antibody design by linking in silico designs of antibody-antigen complexes to high-throughput experimental binding validation. Using AlphaSeq, a yeast-based platform for measuring protein binding affinities, we measure the affinity and specificity of thousands of de novo "synthetic epitope proteins" (SEPs) designed to bind to VHHs. The resulting Synthetic Epitope Atlas (SEPIA) pairs over 26 million on- and off-target affinity measurements with computationally designed VHH-SEP "pseudo-structures." We validate strong, specific binding for 1,161 pseudo-structures and >75,000 VHH and SEP mutational variants. We show that these pseudo-structures complement existing structural databases and enable ML models to outperform confidence metrics commonly used to rank de novo antibody designs. Taken together, SEPIA establishes a scalable framework for improving de novo antibody design by augmenting sparse structural data with large-scale experimental binding data.
Adaptation to free-living drives loss of beneficial endosymbiosis through metabolic trade-offs
Symbioses are widespread (1) and underpin the function of diverse ecosystems (2-6), but their evolutionary stability is challenging to explain (7,8). Fitness trade-offs between contrasting


