ProteomeLM is a proteome-scale language model trained on proteomes spanning the tree of life to reconstruct masked protein embeddings from the proteome context within each species. Its attention coefficients capture protein-protein interactions without supervision. Here, we show that this capability extends to cross-species host-pathogen interactions (HPI) across ten human pathogen taxa spanning viruses and bacteria, and that it can be further improved with lightweight fine-tuning. We introduce ProteomeLM-HPI, a parameter-efficient adaptation via LoRA, trained on concatenated host-pathogen proteomes to reconstruct masked pathogen embeddings from host context. ProteomeLM-HPI relies on two key design choices: asymmetric (pathogen-heavy) masking and blocked self-attention. Systematic ablations show that both choices contribute to performance. To assess generalization, we introduce a strict cross-species benchmark that enforces pathogen-level holdout and 40% sequence-identity filtering. On this benchmark, ProteomeLM-HPI improves AUC on 9 out of 10 unseen pathogens.
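The two design choices can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the masking rates (`p_host`, `p_pathogen`), and the specific blocking pattern (no attention between pathogen positions) are assumptions chosen only to make the idea concrete. The source states only that masking is pathogen-heavy and that self-attention is blocked.

```python
import numpy as np


def asymmetric_mask(n_host, n_pathogen, p_host=0.05, p_pathogen=0.5, seed=0):
    """Sample a boolean mask over a concatenated [host | pathogen] sequence.

    True marks a position whose embedding would be masked and reconstructed.
    The rates are hypothetical; the only property taken from the text is
    that pathogen positions are masked more often than host positions.
    """
    rng = np.random.default_rng(seed)
    host_mask = rng.random(n_host) < p_host
    pathogen_mask = rng.random(n_pathogen) < p_pathogen
    return np.concatenate([host_mask, pathogen_mask])


def blocked_attention_pattern(n_host, n_pathogen):
    """Return allowed[i, j] = True if query i may attend to key j.

    Assumed reading of "blocked self-attention": pathogen positions cannot
    attend to other pathogen positions, so a masked pathogen embedding has
    to be reconstructed from host context. Host queries are unrestricted.
    """
    n = n_host + n_pathogen
    is_pathogen = np.arange(n) >= n_host
    # Block only the pathogen-to-pathogen quadrant of the attention matrix.
    return ~(is_pathogen[:, None] & is_pathogen[None, :])


mask = asymmetric_mask(n_host=1000, n_pathogen=1000)
allowed = blocked_attention_pattern(n_host=3, n_pathogen=2)
```

In an actual model, a pattern like `allowed` would be converted into an additive attention bias (zero where attention is allowed, a large negative value elsewhere) before the softmax.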

