arXiv:2601.22866v2 Announce Type: replace
Abstract: The COVID-19 pandemic has profoundly affected global health, driven by the remarkable transmissibility and mutational adaptability of the SARS-CoV-2 virus. Although five variants of concern, Alpha, Beta, Gamma, Delta, and Omicron, have been identified, the classification task in this study is formulated using four classes: Alpha, Delta, Omicron, and Else, reflecting the sequence availability and temporal coverage of the dataset. Here, we develop an integrative framework that combines direct coupling analysis (DCA), Circos-based visualization, and convolutional neural networks (CNNs) to characterize lineage-specific epistatic signatures from large-scale SARS-CoV-2 genomic sequences. DCA-inferred pairwise mutational couplings were transformed into Circos images, which were then used as inputs for CNN-based classification models. The proposed framework achieved robust variant classification, with the best-performing model reaching a weighted-average F1-score of $98.68 \pm 0.75\%$ and an AUC close to 1.
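For readers unfamiliar with the reported metric, the following is a minimal sketch of how a weighted-average F1-score is conventionally computed for a four-class problem such as Alpha, Delta, Omicron, and Else. It is not the authors' evaluation code: the function name `weighted_f1` and the toy labels are assumptions made purely for illustration, and the numbers have no relation to the results in the abstract.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's share of y_true."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight by class support
    return score


# Hypothetical toy labels, not taken from the paper's dataset.
y_true = ["Alpha", "Alpha", "Delta", "Omicron", "Else", "Else"]
y_pred = ["Alpha", "Delta", "Delta", "Omicron", "Else", "Else"]
print(f"weighted F1 = {weighted_f1(y_true, y_pred):.4f}")
```

Weighting by support means that frequent classes dominate the average, which is worth keeping in mind when class sizes differ as much as the abstract's remark on sequence availability suggests they may.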
