Background: Long-read RNA sequencing (lrRNA-seq) enables transcript-resolved variant detection, but systematic and neutral evaluations of small-variant calling pipelines remain limited. How existing tools perform across sequencing technologies, alignment strategies, variant callers, genomic contexts, and downstream haplotype phasing is not fully understood. Results: Here, we systematically benchmark four lrRNA-seq variant callers (Clair3-RNA, DeepVariant, longcallR, and longcallR-nn), along with a widely used short-read RNA-seq variant caller (GATK HaplotypeCaller) as a baseline, using Genome in a Bottle (GIAB) datasets comprising three cell lines sequenced with four Oxford Nanopore Technologies (ONT) and two PacBio library preparation protocols. We further evaluate the impact of upstream alignment strategies, including aligner choice and alignment transformation, on variant-calling performance. Accuracy is assessed across sequencing depths and genomic contexts. Additionally, we compare haplotype phasing tools (WhatsHap, LongPhase, HapCUT2, HiPhase, and longcallR) using variant calls generated by different callers to identify optimal pipeline combinations. Finally, we extend our evaluation of variant-calling performance to the more recent LongBench datasets. Conclusions: Our benchmark shows that sequencing quality is the primary determinant of lrRNA-seq variant-calling performance, followed by variant caller and alignment strategy, with additional effects from genomic context. In the GIAB datasets, all lrRNA-seq-specific callers performed reasonably well, with Clair3-RNA (across both ONT and PacBio) and DeepVariant (for PacBio) ranking among the top-performing methods. In the more recent LongBench datasets of cancer cell lines, DeepVariant and longcallR showed higher sensitivity, whereas Clair3-RNA and longcallR-nn were more conservative, yielding fewer variant calls. For downstream haplotype phasing, we recommend WhatsHap or HapCUT2 for most libraries, owing to their high phasing coverage and accuracy, respectively, while longcallR performs better on ONT dRNA004 datasets across both metrics.
Disclosure in the era of generative artificial intelligence
Generative artificial intelligence (AI) has rapidly become embedded in academic writing, assisting with tasks ranging from language editing to drafting text and producing evidence. Despite


