Fragment end motif analysis to distinguish pathogens from contaminants in enriched plasma microbial DNA

Introduction: Despite its promise, the accuracy of microbial cell-free DNA (mDNA) in plasma as a diagnostic tool is hindered by its low abundance and by process contaminants. We have previously shown that combining size selection with single-stranded DNA (ssDNA) library preparation increased mDNA yield 200-fold but also decreased sensitivity for pathogen detection because of higher background noise. A recent study showed that pathogen-derived DNA was enriched for the CC dinucleotide at 5′ ends compared to contaminants. Since ssDNA libraries preserve sequence motifs at both ends (5′ and 3′), we hypothesized that analysis of nucleotide motifs at microbial fragment ends in size-selected ssDNA libraries could help differentiate pathogen DNA from background noise.

Methods: We performed deep sequencing on size-selected ssDNA libraries (<110 bp) generated from longitudinal plasma samples of 11 critically ill patients (5 with culture-proven infections, 20 samples; 6 without infections, 18 samples) and 6 no-template controls (NTCs). For each 2-mer and 1-mer motif, we calculated the ratio between its observed frequency at 5′ and 3′ fragment ends in the sequencing data and its expected frequency in the corresponding reference genome (O/E ratio). We compared the enrichment of motifs between pathogen DNA and contaminant DNA fragments.

Results: Pathogen-derived mDNA fragments were more biased in O/E end motif ratios than contaminants across all 3 groups (NTCs, no infections and culture-proven infections), at both 5′ and 3′ fragment ends. Notably, the GG dinucleotide was enriched at the 3′ end in pathogens compared to contaminants (P < 0.0001). Combining O/E ratios for C and G nucleotides at the 3′ end achieved areas under the receiver operating characteristic curve of >0.98 for distinguishing common contaminants from culture-proven pathogens.

Conclusions: Pathogen-derived mDNA in size-selected ssDNA libraries is biased at 5′ and 3′ fragment ends compared to contaminants. Incorporating microbial fragment end motif analysis can enhance the signal-to-noise ratio and improve pathogen detection and identification in plasma metagenomic sequencing.
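
The O/E ratio described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `kmer_frequencies` and `end_motif_oe` are hypothetical, fragments are assumed to be plain strings already oriented 5′ to 3′, and the expected frequency is assumed to come from overlapping k-mers on a single strand of the reference. A real analysis would likely take fragment ends from alignments, consider both reference strands, and handle ambiguous bases.

```python
from collections import Counter


def kmer_frequencies(seq, k):
    """Relative frequency of each overlapping k-mer in a sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {motif: count / total for motif, count in counts.items()}


def end_motif_oe(fragments, reference, k=2, end="5"):
    """Observed/expected ratio of each k-mer at one fragment end.

    Assumption (illustrative only): `end` is "5" or "3", and motifs
    absent from the reference are skipped because their expected
    frequency is zero.
    """
    usable = [f.upper() for f in fragments if len(f) >= k]
    if end == "5":
        ends = [f[:k] for f in usable]
    else:
        ends = [f[-k:] for f in usable]
    observed = Counter(ends)
    n = sum(observed.values())
    expected = kmer_frequencies(reference, k)
    return {
        motif: (count / n) / expected[motif]
        for motif, count in observed.items()
        if motif in expected
    }


# Toy example: a 4-base "genome" and four fragments (made-up data).
reference = "AACC"
fragments = ["CCAT", "CCGG", "AAGT", "ACTT"]
ratios_5p = end_motif_oe(fragments, reference, k=2, end="5")
```

In this toy input, CC makes up half of the 5′ ends but only one third of the reference 2-mers, so its O/E ratio is above 1, which is the kind of end bias the abstract reports for pathogen-derived fragments.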

