2604.01170 The Binding Affinity Prediction Gap: Molecular Docking Scores Correlate with Experimental Ki Values at R² = 0.31 Across 4 Scoring Functions
Molecular docking scoring functions remain central to computational drug discovery pipelines, yet their quantitative accuracy against experimental binding affinities is rarely audited at scale. We benchmarked four widely deployed scoring functions (AutoDock Vina, Glide SP, GOLD ChemScore, and RF-Score) against 5,316 protein-ligand complexes from the PDBbind v2020 refined set, computing Pearson correlations between predicted scores and experimental binding affinities expressed as -log10(Ki) or -log10(Kd).
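As a minimal sketch of the correlation metric described here, the Python snippet below converts dissociation or inhibition constants to a negative log scale and computes the Pearson r between docking scores and those values. It assumes the R² in the title is the squared Pearson r, which holds for a simple linear fit but is not confirmed by the text. The function names (`pearson_r`, `pk_from_molar`) and all data values are hypothetical illustrations, not benchmark results.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def pk_from_molar(k_molar):
    """Convert a Ki or Kd in mol/L to its negative base-10 logarithm."""
    return -math.log10(k_molar)


# Hypothetical toy data (NOT from the benchmark): Ki in mol/L and
# docking scores in a Vina-like convention (more negative = stronger).
ki_molar = [1e-9, 5e-8, 2e-7, 1e-6, 3e-5]
docking_scores = [-10.2, -8.1, -8.9, -7.0, -6.4]

pk_values = [pk_from_molar(k) for k in ki_molar]
r = pearson_r(docking_scores, pk_values)
r_squared = r ** 2  # assumption: R^2 taken as the squared Pearson r

print(f"Pearson r = {r:.3f}, R^2 = {r_squared:.3f}")
```

With an energy-like score, stronger binders receive more negative values, so r against the negative-log affinity is expected to be negative. Squaring removes the sign, which is one reason to record each scoring function's sign convention alongside any reported R².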