PPI Interface Hotspot Prediction via SASA-Based Alanine Scanning
Abstract
We present a protein-protein interface analysis pipeline that implements a SASA-based proxy for computational alanine scanning to identify hotspots. Given a PDB structure (experimental or AlphaFold-predicted), the pipeline computes per-residue buried surface area (BSA) as a SASA differential, identifies interface residues, and ranks putative hotspots with a weighted BSA scoring function. We demonstrate it on the PD-1/PD-L1 immune checkpoint complex (PDB: 4ZQK), a canonical cancer immunotherapy target, identifying 12 predicted hotspots on PD-L1 and 10 on PD-1, with a shape complementarity proxy of 0.666 (good, >0.65). The top-ranked hotspot, B134ILE (PD-1), has BSA = 117.0 Ų, consistent with its central role in PD-L1 binding. All code, results, and visualizations are publicly available.
1. Introduction
Protein-protein interface hotspots are residues whose substitution by alanine changes the binding free energy by ΔΔG_bind ≥ 2.0 kcal/mol. They are primary targets for antibody design and drug development. The O-ring theory (Bogan & Thorn, 1998) proposes that hotspots are surrounded by energetically less critical "O-ring" residues that exclude bulk solvent. Computational alanine scanning via the SASA differential provides a fast, physics-motivated proxy for ΔΔG that avoids expensive MM-PBSA calculations, with a correlation of approximately 0.6 with experimental values.
2. Methods
2.1 Interface Identification
Interface residues are identified using two complementary criteria:
- Cα-Cα distance: residues from different chains with Cα atoms < 8 Å
- BSA differential: SASA_isolated − SASA_bound; residues with BSA > 1.0 Ų are considered interface
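The two criteria above can be sketched as follows. This is a minimal illustration, not the pipeline's code: the function names, the dictionary-based inputs, and the toy coordinates and SASA values are all assumptions, and the text does not say whether the two criteria are combined by union or intersection, so the sketch returns them separately.

```python
import math

CA_CUTOFF = 8.0   # Å, Cα-Cα distance cutoff from Section 2.1
BSA_CUTOFF = 1.0  # Å², per-residue BSA cutoff from Section 2.1


def contact_residues(ca_a, ca_b, cutoff=CA_CUTOFF):
    """Residue IDs (both chains) with a Cα closer than `cutoff` to the other chain."""
    hits = set()
    for res_a, xyz_a in ca_a.items():
        for res_b, xyz_b in ca_b.items():
            if math.dist(xyz_a, xyz_b) < cutoff:
                hits.update((res_a, res_b))
    return hits


def buried_residues(sasa_isolated, sasa_bound, cutoff=BSA_CUTOFF):
    """Map residue -> BSA for residues with SASA_isolated - SASA_bound > cutoff."""
    bsa = {res: sasa_isolated[res] - sasa_bound[res] for res in sasa_isolated}
    return {res: value for res, value in bsa.items() if value > cutoff}


# Toy input (hypothetical coordinates and SASA values, not taken from 4ZQK).
ca_a = {"A1": (0.0, 0.0, 0.0), "A2": (20.0, 0.0, 0.0)}
ca_b = {"B1": (5.0, 0.0, 0.0), "B2": (40.0, 0.0, 0.0)}
sasa_isolated = {"A1": 80.0, "A2": 60.0, "B1": 90.0, "B2": 50.0}
sasa_bound = {"A1": 30.0, "A2": 59.5, "B1": 45.0, "B2": 50.0}

by_distance = contact_residues(ca_a, ca_b)
by_bsa = buried_residues(sasa_isolated, sasa_bound)
print(sorted(by_distance), by_bsa)
```

In this toy case A2 loses only 0.5 Ų on binding and sits 15 Å from the nearest partner Cα, so it fails both criteria.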
SASA is computed using the Shrake-Rupley algorithm (n_points=250) on isolated chains and the full complex.
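For readers unfamiliar with the algorithm, here is a deliberately unoptimized sketch of the Shrake-Rupley idea on toy spheres: place test points on each atom's probe-expanded sphere and count those not inside any neighbour's expanded sphere. The atom tuples, the 1.7 Å radius, and the 1.4 Å probe are assumptions for illustration (the text fixes only n_points=250). In practice a library routine such as Biopython's `Bio.PDB.SASA.ShrakeRupley` would be used, though whether the pipeline calls it is not stated here.

```python
import math


def sphere_points(n):
    """Return n roughly uniform points on the unit sphere (golden-spiral layout)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        points.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return points


def shrake_rupley(atoms, probe=1.4, n_points=250):
    """Per-atom SASA in Å² for atoms given as (x, y, z, radius) tuples."""
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        expanded = ri + probe
        exposed = 0
        for ux, uy, uz in unit:
            p = (xi + expanded * ux, yi + expanded * uy, zi + expanded * uz)
            # A test point is buried if it lies inside any other expanded sphere.
            buried = any(
                math.dist(p, (xj, yj, zj)) < rj + probe
                for j, (xj, yj, zj, rj) in enumerate(atoms)
                if j != i
            )
            exposed += not buried
        areas.append(4.0 * math.pi * expanded**2 * exposed / n_points)
    return areas


# Toy example: one carbon-like sphere alone, then next to a second sphere.
lone = shrake_rupley([(0.0, 0.0, 0.0, 1.7)])[0]
pair = shrake_rupley([(0.0, 0.0, 0.0, 1.7), (3.0, 0.0, 0.0, 1.7)])
bsa_atom = lone - pair[0]  # SASA_isolated - SASA_bound for the first atom
print(f"isolated={lone:.1f} bound={pair[0]:.1f} BSA={bsa_atom:.1f}")
```

Applied to a whole structure, the same subtraction is done per residue, with the isolated chain playing the role of the lone atom and the full complex the role of the pair.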
2.2 Hotspot Scoring
Each interface residue is scored using:
hotspot_score = BSA × hydrophobic_weight
where the hydrophobic weights derive from Bogan & Thorn (1998): TRP=3.0, TYR=2.5, ARG=2.0, PHE=2.0, etc. Residues with BSA ≥ 25 Ų are predicted hotspots.
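A minimal sketch of this scoring rule follows. Only the four weights quoted above are included; the fallback weight of 1.0 for other residue types, the function names, and the `X1TRP` entry are hypothetical. The BSA values for A56TYR and A113ARG are those reported in the results table.

```python
HOTSPOT_THRESHOLD = 25.0  # Å²; BSA at or above this marks a predicted hotspot

# Only the weights quoted in the text; the remaining entries are not given there.
HYDROPHOBIC_WEIGHT = {"TRP": 3.0, "TYR": 2.5, "ARG": 2.0, "PHE": 2.0}
DEFAULT_WEIGHT = 1.0  # hypothetical fallback for residue types not listed above


def score_residue(resname, bsa):
    """Return the weighted score and the hotspot call for one interface residue."""
    weight = HYDROPHOBIC_WEIGHT.get(resname, DEFAULT_WEIGHT)
    return {"score": bsa * weight, "is_hotspot": bsa >= HOTSPOT_THRESHOLD}


def rank_residues(residues):
    """Sort (label, resname, bsa) tuples by descending hotspot score."""
    scored = [(label, score_residue(resname, bsa)) for label, resname, bsa in residues]
    return sorted(scored, key=lambda item: item[1]["score"], reverse=True)


residues = [
    ("A113ARG", "ARG", 56.8),  # BSA from the results table
    ("A56TYR", "TYR", 51.5),   # BSA from the results table
    ("X1TRP", "TRP", 10.0),    # hypothetical residue below the BSA threshold
]
ranking = rank_residues(residues)
for label, result in ranking:
    print(label, round(result["score"], 1), result["is_hotspot"])
```

Note that the hotspot call depends on BSA alone, while the weight affects only the ranking: the hypothetical tryptophan gets a score of 30.0 but is not called a hotspot.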
2.3 Shape Complementarity
Interface quality is assessed via a shape complementarity proxy:
Sc ≈ 2 × BSA_total / (SASA_A_interface + SASA_B_interface)
Sc > 0.65 indicates good shape complementarity (Lawrence & Colman, 1993).
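The proxy is a single ratio, sketched below. The function names, the "below threshold" label, and the input numbers are illustrative assumptions; the inputs are not the 4ZQK values, since the per-chain interface SASA terms are not reported here.

```python
SC_GOOD_THRESHOLD = 0.65  # threshold quoted in Section 2.3


def sc_proxy(bsa_total, sasa_a_interface, sasa_b_interface):
    """Shape complementarity proxy: 2 * BSA_total / (SASA_A + SASA_B) at the interface."""
    denominator = sasa_a_interface + sasa_b_interface
    if denominator <= 0:
        raise ValueError("interface SASA terms must sum to a positive value")
    return 2.0 * bsa_total / denominator


def classify_sc(sc, threshold=SC_GOOD_THRESHOLD):
    """Label a proxy value; 'below threshold' is a hypothetical label."""
    return "good" if sc > threshold else "below threshold"


# Illustrative numbers in Å² (hypothetical, chosen for easy arithmetic).
sc = sc_proxy(bsa_total=1000.0, sasa_a_interface=1500.0, sasa_b_interface=1500.0)
print(round(sc, 3), classify_sc(sc))
```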
3. Results: PD-1/PD-L1 Complex (PDB: 4ZQK)
| Metric | Value |
|---|---|
| Total BSA | 1823.7 Ų (typical Ab-Ag: 1200–2000 Ų) |
| Shape Complementarity | 0.666 (good, >0.65) |
| PD-L1 (Chain A) interface residues | 49 |
| PD-1 (Chain B) interface residues | 34 |
| PD-L1 hotspots | 12 |
| PD-1 hotspots | 10 |
Top 5 Hotspots:
| Rank | Residue | BSA (Ų) | Hotspot Score |
|---|---|---|---|
| 1 | B134ILE | 117.0 | 175.4 |
| 2 | A56TYR | 51.5 | 128.7 |
| 3 | A125ARG | 58.5 | 117.1 |
| 4 | B128LEU | 77.6 | 116.3 |
| 5 | A113ARG | 56.8 | 113.6 |
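As a consistency check on the table, dividing each hotspot score by its BSA should recover the weight applied in Section 2.2. The short sketch below does this with the tabulated values; because those values are rounded to one decimal, the ratios are rounded as well. For TYR and ARG the result can be compared with the weights quoted in the Methods; the ILE and LEU weights are not quoted there, so their ratios are only what the table implies.

```python
# (residue, BSA in Å², hotspot score) copied from the Top 5 table.
top5 = [
    ("B134ILE", 117.0, 175.4),
    ("A56TYR", 51.5, 128.7),
    ("A125ARG", 58.5, 117.1),
    ("B128LEU", 77.6, 116.3),
    ("A113ARG", 56.8, 113.6),
]

# score = BSA * weight, so weight is approximately score / BSA.
implied_weight = {label: round(score / bsa, 1) for label, bsa, score in top5}

for label, weight in implied_weight.items():
    print(f"{label}: implied weight = {weight}")
```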
4. Code Availability
The pipeline is implemented in Python 3.10+ using Biopython, NumPy, Pandas, and Matplotlib. Dependencies: biopython numpy pandas matplotlib seaborn scipy requests.
pip install biopython numpy pandas matplotlib seaborn scipy requests
python ppi_pipeline.py
Repository: https://github.com/junior1p/ppi-interface-analysis
5. References
- Bogan, A.A. & Thorn, K.S. (1998). Anatomy of hot spots in protein interfaces. JMB, 280(1), 1-9.
- Lawrence, M.C. & Colman, P.M. (1993). Shape complementarity at protein-protein interfaces. JMB, 234(4), 946-950.
- Shrake, A. & Rupley, J.A. (1973). Environment and exposure to solvent of protein atoms. JMB, 79(2), 351-371.
- Mirdita, M. et al. (2022). ColabFold: making protein folding accessible to all. Nature Methods, 19, 679-684.
Reproducibility: Skill File
Use this skill file to reproduce the research with an AI agent.
---
name: ppi-interface-analysis
description: Protein-protein interface hotspot prediction via computational alanine scanning and SASA-based BSA analysis. Input: PDB file path or 4-letter PDB ID. Output: interface residues, hotspot rankings, BSA bar charts, contact maps, composition radar.
---

# PPI Interface Analysis Skill

## Trigger

"Analyze the interface between chain A and chain B in this PDB file"
"Find hotspot residues in this protein complex"
"Run alanine scanning on this antibody-antigen structure"

## Dependencies

```bash
pip install biopython numpy pandas matplotlib seaborn scipy requests
```

## Pipeline

1. fetch_pdb(pdb_id): download from RCSB
2. load_structure(pdb_path): Biopython parser
3. identify_interface(): BSA differential + distance contacts
4. alanine_scan(): weighted hotspot scoring (BSA × hydrophobic_weight)
5. analyze_composition(): polar/apolar/charged composition + Sc
6. plot_bsa(), plot_contact_map(), plot_radar(): visualizations
7. save(): CSV, JSON, text report

## Key Constants

- HOTSPOT_THRESHOLD = 25.0 Ų (BSA ≥ this = hotspot)
- contact_cutoff = 8.0 Å (Cα-Cα)
- heavy_atom_cutoff = 5.0 Å
- bsa_cutoff = 1.0 Ų

## Demo

```bash
python ppi_pipeline.py # runs on PDB 4ZQK (PD-1/PD-L1)
```