CaliPPer: quantifying, predicting and improving AI model performance for binding prediction

arXiv:2606.07258v1 Announce Type: cross
Abstract: Binding prediction models accelerate therapeutic antibody and TCR discovery, but their performance on new datasets is unpredictable, often leading to low discovery rates. Density-ratio methods (PAPE, M-CBPE) provide label-free performance estimation for binary classification, but their assumptions and aggregate-only outputs limit their usefulness for binding prediction on neoepitopes, antigen variants and chemical scaffolds. Here we present CaliPPer (Calibration and Prediction of Performance), a post-hoc framework that pairs a multi-chain Sample-to-Domain Distance (S2DD) with distance-aware Bayesian recalibration and operates at three resolutions: generalisability score, aggregate performance prediction, and per-sample confidence. Across ten models, eight architectures and two immune-receptor domains, CaliPPer attains distance–performance correlations of $|r| = 0.80\text{–}0.92$, predicts AUROC/AP/F1 with mean absolute errors of $0.008\text{–}0.070$, and improves AUROC by up to $+0.20$ on unseen epitopes/variants. Applied retrospectively to five published TCR, BCR, MHC–peptide and small-molecule studies, CaliPPer raises true discovery rates in all five (e.g. $0/5 \to 3/5$ confirmed neoantigens), providing a triage layer between computational prediction and experimental validation.

