LRP2: A proteogenomics pipeline for long-read informed protein isoform analysis and discovery

Most human genes produce multiple RNA isoforms, yet it remains unclear which isoforms are translated into stable, functional proteins. Long-read RNA-sequencing resolves full-length transcript structures and, when paired with mass spectrometry, can provide empirical evidence of isoform translation. Despite this opportunity, comprehensive workflows that integrate isoform discovery, open reading frame prediction, peptide identification, and protein inference remain limited, leaving users to handle these steps piecemeal. Here, we present LRP2, a modular, end-to-end long-read proteogenomics pipeline built in Nextflow. LRP2 scales transcript discovery to hundreds of samples via PacBio’s latest Isocall tool, removes technical artifacts with SQANTI QC, generates and classifies predicted proteomes via CPAT and SQANTI Protein, performs multi-group differential expression and usage analysis via edgeR, DRIMSeq, and a long-read adaptation of LeafCutter, and integrates protein-level evidence from DDA and DIA MS data through FragPipe. To allow novel isoforms to be compared across datasets, LRP2 assigns deterministic isoform identifiers based on splice-junction coordinates.
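To illustrate why a deterministic, splice-junction-based identifier helps cross-dataset comparison, the following is a minimal sketch and not LRP2's actual scheme, which the abstract does not specify. It assumes an isoform is described by its chromosome, strand, and exon coordinates, and that the identifier is a truncated SHA-256 hash of the ordered junction coordinates. The function name `isoform_id`, the `ISO` prefix, the 12-character digest length, and the fallback for mono-exonic transcripts are all hypothetical choices made for this example.

```python
import hashlib


def isoform_id(chrom, strand, exons, prefix="ISO"):
    """Return a deterministic ID from splice-junction coordinates.

    Illustrative only: the key layout and hash truncation are assumptions,
    not the identifier format used by LRP2.

    exons: iterable of (start, end) tuples for one transcript.
    """
    exons = sorted(exons)
    if not exons:
        raise ValueError("at least one exon is required")

    # A splice junction runs from the end of one exon to the start of the next.
    junctions = [
        (exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)
    ]

    if junctions:
        body = ",".join(f"{donor}-{acceptor}" for donor, acceptor in junctions)
    else:
        # Mono-exonic transcripts have no junctions, so this sketch falls
        # back to the exon coordinates themselves.
        body = f"mono:{exons[0][0]}-{exons[0][1]}"

    key = f"{chrom}|{strand}|{body}"
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]
    return f"{prefix}_{digest}"


# Two reads of the same splice chain with different transcript ends
# receive the same identifier; changing one junction changes it.
same_a = isoform_id("chr1", "+", [(100, 200), (300, 400), (500, 600)])
same_b = isoform_id("chr1", "+", [(90, 200), (300, 400), (500, 650)])
other = isoform_id("chr1", "+", [(100, 200), (310, 400), (500, 600)])
print(same_a == same_b, same_a == other)
```

Because the identifier in this sketch depends only on the chromosome, strand, and junction chain, two independently processed datasets would assign the same ID to the same novel splice structure without sharing a reference catalogue. Whether transcript start and end positions should also contribute to the identifier is a design decision this sketch leaves out.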


Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844