Metabolic dysfunction-associated steatotic liver disease (MASLD) arises from excessive hepatic fat accumulation that triggers inflammation and liver injury. It is the most prevalent chronic liver disease worldwide, affecting more than one quarter of adults. Despite this, MASLD is often underdiagnosed, which leaves case/control labels limited or imprecise and makes genome-wide association studies (GWAS) more difficult to perform. Here, we implemented a machine learning (ML)-guided GWAS framework to identify genetic risk factors for MASLD across lean and overweight individuals in the All of Us Research Program. A random forest model trained on laboratory measurements, vital signs, and demographic features generated an in silico MASLD (I-MASLD) score, a continuous risk score that we validated as an accurate proxy for clinical MASLD diagnosis. This score was then used as the phenotype in a GWAS of whole-exome sequencing variants. The resulting GWAS identified a novel variant in ANGPTL4 that was significantly associated with MASLD risk and recapitulated known variants in genes involved in lipid metabolism and insulin signaling. Our results also suggest a potential role for APOA5 in MASLD onset or progression in lean patients. These findings demonstrate that ML-derived quantitative phenotypes can enhance genetic discovery in large, heterogeneous cohorts where conventional case/control labels are limited or imprecise.
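To illustrate the idea of using a continuous score as a GWAS phenotype, the following is a minimal sketch of a single-variant linear association test, not the pipeline used in the study. Everything in it is an assumption made for illustration: the genotypes and the score are simulated, the score merely stands in for a model-derived quantity such as the I-MASLD score, no covariates are adjusted for, and the p-value uses a normal approximation that is reasonable only for large samples. The function name `linear_association` is hypothetical.

```python
import math
import random


def linear_association(dosages, phenotype):
    """Regress a continuous phenotype on genotype dosage (0/1/2) by ordinary least squares.

    Returns (beta, standard_error, z, p_value). The two-sided p-value uses a
    normal approximation to the t statistic, which assumes a large sample.
    """
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_y = sum(phenotype) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("monomorphic variant: no genotype variation")
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(dosages, phenotype))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_g
    rss = sum((y - intercept - beta * g) ** 2 for g, y in zip(dosages, phenotype))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:
        raise ValueError("degenerate fit: phenotype has no residual variation")
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p_value


# Simulated data (assumption): 2000 individuals, allele frequency 0.3.
rng = random.Random(0)
n_individuals = 2000
causal_variant = [sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n_individuals)]
null_variant = [sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n_individuals)]

# Hypothetical continuous risk score: depends on the causal variant plus noise.
score = [0.3 * g + rng.gauss(0, 1) for g in causal_variant]

causal_result = linear_association(causal_variant, score)
null_result = linear_association(null_variant, score)
print("causal variant: beta=%.3f se=%.3f p=%.2e" % (causal_result[0], causal_result[1], causal_result[3]))
print("null variant:   beta=%.3f se=%.3f p=%.2e" % (null_result[0], null_result[1], null_result[3]))
```

In a real analysis the regression would typically include covariates such as age, sex, and ancestry principal components, and the resulting p-values would be compared against a multiple-testing threshold across all variants.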
Magnetoencephalography reveals adaptive neural reorganization maintaining lexical-semantic proficiency in healthy aging
Although semantic cognition remains behaviorally stable with age, neuroimaging studies report age-related alterations in response to semantic context. We aimed to reconcile these inconsistent findings



