
ML-Guided GWAS Reveals Genetic Architectures for MASLD for Overweight and Lean Individuals in the All of Us Cohort

Metabolic dysfunction-associated steatotic liver disease (MASLD) arises from excessive hepatic fat accumulation that triggers inflammation and liver injury. It is the most prevalent chronic liver disease worldwide, affecting more than one quarter of adults. Despite this prevalence, MASLD is often underdiagnosed, which leaves case/control labels incomplete and makes genome-wide association studies (GWAS) harder to perform. In this paper, we implement a machine learning (ML)-guided GWAS framework to identify genetic risk factors for MASLD in lean and overweight individuals in the All of Us Research Program. A random forest model trained on laboratory measurements, vital signs, and demographic features generated an in silico MASLD (I-MASLD) score, a continuous measure of MASLD risk, which was validated as an accurate proxy for clinical MASLD diagnosis. This score was then used as the phenotype in a GWAS of whole-exome sequencing variants. The GWAS identified a novel variant in the ANGPTL4 gene that is significantly associated with MASLD risk and recapitulated known variants in genes involved in lipid metabolism and insulin signaling. Our results also suggest a potential role for APOA5 in MASLD onset or progression in lean patients. These findings demonstrate that ML-derived quantitative phenotypes can enhance genetic discovery in large, heterogeneous cohorts where conventional case/control labels are limited or imprecise.
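To make the second stage of the framework concrete, the sketch below shows how a continuous risk score can serve as a GWAS phenotype: each variant's allele dosage (0, 1, or 2) is tested against the score with a simple linear regression. This is an illustration under stated assumptions, not the pipeline used in the paper. The data are synthetic, the score is simulated rather than produced by a random forest, the function names `association_test` and `simulate` are hypothetical, no covariates are adjusted for, and the p-value uses a large-sample normal approximation. The threshold of 5e-8 is the conventional genome-wide significance level and is assumed here, since the abstract does not state the one used.

```python
import math
import random


def association_test(dosages, scores):
    """Regress a continuous score on one variant's allele dosages (0/1/2).

    Returns (beta, p_value). The two-sided p-value uses a normal
    approximation to the test statistic, reasonable for large samples.
    """
    n = len(scores)
    mean_g = sum(dosages) / n
    mean_y = sum(scores) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    if sxx == 0:
        return 0.0, 1.0  # monomorphic variant: nothing to test
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(dosages, scores))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_g
    rss = sum((y - intercept - beta * g) ** 2 for g, y in zip(dosages, scores))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:
        return beta, 0.0  # perfect fit (not expected with noisy data)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return beta, p_value


def simulate(n_people=2000, n_variants=5, causal=2, effect=0.5, maf=0.3, seed=0):
    """Synthetic cohort: one variant shifts the score, the rest are null."""
    rng = random.Random(seed)
    # Each dosage is the sum of two independent allele draws.
    genotypes = [
        [(rng.random() < maf) + (rng.random() < maf) for _ in range(n_people)]
        for _ in range(n_variants)
    ]
    # Stand-in for an ML-derived risk score (assumption: additive effect + noise).
    scores = [effect * genotypes[causal][i] + rng.gauss(0, 1) for i in range(n_people)]
    return genotypes, scores


GENOME_WIDE_ALPHA = 5e-8  # conventional threshold, assumed for illustration

genotypes, scores = simulate()
results = [association_test(g, scores) for g in genotypes]
hits = [j for j, (_, p) in enumerate(results) if p < GENOME_WIDE_ALPHA]

for j, (beta, p) in enumerate(results):
    print(f"variant {j}: beta={beta:+.3f}  p={p:.2e}")
print("genome-wide significant variants:", hits)
```

A real analysis would typically adjust for covariates such as age, sex, and ancestry principal components and would rely on dedicated association software rather than a hand-written regression. The point of the sketch is only that a quantitative phenotype lets every scored individual contribute to the test, which is the advantage the abstract attributes to ML-derived scores over sparse case/control labels.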


Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK, registration number 16808844.