AlphaMissense and REVEL Rank the Alternate Amino Acids at the Same Residue Position in Identical Order at 24.08% of 5,436 Multi-Alt ClinVar Positions but in Completely Opposite Order at 8.33% — Per-Position Spearman Correlation Across Alternate Amino Acids Spans −1.0 to +1.0 With Mean +0.264 and Median +0.500: A Position-Level Predictor-Disagreement Pattern Invisible to Per-Variant Correlation
Abstract
We compute the per-position Spearman rank correlation between AlphaMissense (AM; Cheng et al. 2023) and REVEL (Ioannidis et al. 2016) scores across the alternate amino acids (alts) reported at the same residue position among ClinVar (Landrum et al. 2018) missense single-nucleotide variants. For each (gene, residue-position) pair with ≥ 3 distinct alts that have both AM and REVEL scores in dbNSFP v4 (Liu et al. 2020), retrieved via MyVariant.info (Wu et al. 2021) with stop-gain (alt = X) excluded, we compute the Spearman r between the AM-rank vector and the REVEL-rank vector across the position's alts. This yields 5,436 analyzed positions. Result: a wide distribution of per-position rank correlations spanning −1.0 to +1.0, with mean +0.264 and median +0.500.
| Per-position Spearman r | Position count | % of 5,436 |
|---|---|---|
| +1.0 (perfect agreement) | 1,309 | 24.08% |
| [0.5, 1) | 1,853 | 34.09% |
| [0, 0.5) | 579 | 10.65% |
| (−0.5, 0) | 946 | 17.40% |
| (−1, −0.5] | 296 | 5.45% |
| −1.0 (perfect disagreement) | 453 | 8.33% |
At 24.08% of multi-alt positions AM and REVEL rank the alts in identical order; at 8.33% they rank them in completely opposite order; at 31.18% the correlation is negative. This wide distribution of per-position Spearman r is invisible to per-variant correlation, which aggregates over all positions and is +0.55 globally: the per-variant agreement masks substantial position-level disagreement on which alt is most disruptive at each specific residue.
Proposed mechanism (post-hoc; see §4.7): AM uses structural and protein-language-model features that integrate over the protein's evolutionary context, together with AlphaFold-derived structural-impact features; REVEL uses an ensemble of conservation features (PhyloP, GERP, SiPhy, SIFT, PolyPhen-2 components, etc.) that emphasize per-position evolutionary-rate signals. The two predictors capture different signals at per-position resolution: at positions where the chemistry class of the substitution dominates (well-folded core), AM and REVEL agree on alt ranking; at positions where the conservation and structural signals diverge, AM and REVEL rank alts in opposite order.
For variant prioritization: the 453 perfect-disagreement positions (or the 1,695 with any negative r) are per-position uncertainty hotspots where an ensemble combining AM and REVEL may provide information beyond either predictor alone.
1. Background
The standard per-variant evaluation of variant-effect predictors aggregates over all variants in a dataset and reports a single correlation or accuracy metric. Such an aggregate metric masks per-position predictor disagreement: even when two predictors are positively correlated globally (Pearson r ≈ +0.55 between AM and REVEL on the full ClinVar P + B subset, §3.6), they may rank the alts at any specific position in different orders.
The per-position rank-agreement is a distinct measurement: for each (gene, residue-position) pair with multiple alts in the dataset, compute the rank-correlation of AM scores vs REVEL scores across the alts. The per-position rank-correlation distribution informs whether the two predictors carry complementary signal at the per-position resolution.
This paper measures the per-position AM-vs-REVEL Spearman rank-correlation distribution directly on the ClinVar P + B missense subset.
2. Method
2.1 Data
- 178,509 Pathogenic + 194,418 Benign ClinVar single-nucleotide variants from MyVariant.info, with dbNSFP v4 annotation.
- For each variant: extract dbnsfp.aa.ref, dbnsfp.aa.alt, dbnsfp.aa.pos, dbnsfp.genename, dbnsfp.alphamissense.score, dbnsfp.revel.score.
- Exclude stop-gain (alt = X) and same-AA records.
- Restrict to records with both AM and REVEL scores.
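The extraction and filtering steps above could be sketched as follows. This is a minimal sketch, assuming each MyVariant.info hit exposes the listed dbNSFP fields as scalar values; the function name `toRecord`, the flattened output shape, and the toy hits are hypothetical and are not taken from analyze.js.

```javascript
// Hypothetical helper: turn one MyVariant.info hit into a flat record,
// or return null when the hit is filtered out.
// Assumption: the dbNSFP fields are scalars; real hits may carry per-isoform arrays.
function toRecord(hit) {
  const d = hit.dbnsfp;
  if (!d || !d.aa) return null;
  const { ref, alt, pos } = d.aa;
  const am = d.alphamissense ? d.alphamissense.score : undefined;
  const revel = d.revel ? d.revel.score : undefined;
  if (alt === "X" || alt === ref) return null; // stop-gain or same-AA record
  if (typeof am !== "number" || typeof revel !== "number") return null; // need both scores
  return { gene: d.genename, pos, ref, alt, am, revel };
}

// Toy hits with made-up values, for illustration only.
const hits = [
  { dbnsfp: { genename: "GENE_A", aa: { ref: "R", alt: "H", pos: 10 },
              alphamissense: { score: 0.91 }, revel: { score: 0.80 } } },
  { dbnsfp: { genename: "GENE_A", aa: { ref: "R", alt: "X", pos: 10 },
              alphamissense: { score: 0.99 }, revel: { score: 0.95 } } },
  { dbnsfp: { genename: "GENE_A", aa: { ref: "R", alt: "C", pos: 10 },
              alphamissense: { score: 0.75 } } },
];
const records = hits.map(toRecord).filter((r) => r !== null);
console.log(records.length); // 1 (the stop-gain and the REVEL-less hit are dropped)
```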
2.2 Per-position alt aggregation
For each (gene, residue-position) pair, build the list of distinct alts and their (AM, REVEL) score pairs.
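One way to build the per-position alt lists is a two-level map keyed by gene and residue position. The sketch below assumes a hypothetical flat record shape `{ gene, pos, alt, am, revel }` and keeps the first score pair seen for a repeated alt; both choices are illustrative assumptions, and the toy records are made up.

```javascript
// Group flat records into "gene:pos" -> Map(alt -> { am, revel }).
function groupByPosition(records) {
  const positions = new Map();
  for (const { gene, pos, alt, am, revel } of records) {
    const key = `${gene}:${pos}`;
    if (!positions.has(key)) positions.set(key, new Map());
    const alts = positions.get(key);
    // Assumption: a repeated alt keeps the first score pair seen.
    if (!alts.has(alt)) alts.set(alt, { am, revel });
  }
  return positions;
}

// Toy records with made-up scores.
const toyRecords = [
  { gene: "GENE_A", pos: 10, alt: "H", am: 0.91, revel: 0.80 },
  { gene: "GENE_A", pos: 10, alt: "C", am: 0.75, revel: 0.85 },
  { gene: "GENE_A", pos: 10, alt: "L", am: 0.60, revel: 0.40 },
  { gene: "GENE_A", pos: 10, alt: "H", am: 0.91, revel: 0.80 }, // duplicate alt, ignored
  { gene: "GENE_B", pos: 42, alt: "P", am: 0.30, revel: 0.20 },
];
const grouped = groupByPosition(toyRecords);

// Positions with at least 3 distinct alts qualify for the rank-correlation step.
const multiAlt = [...grouped].filter(([, alts]) => alts.size >= 3).map(([key]) => key);
console.log(multiAlt); // [ 'GENE_A:10' ]
```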
2.3 Per-position Spearman rank-correlation
Restrict to positions with ≥ 3 alts to ensure the rank-correlation is meaningful. For each such position:
- Rank the alts by AM score (assign average rank for ties).
- Rank the alts by REVEL score (same procedure).
- Compute Spearman r between the two rank vectors.
Tabulate the per-position r distribution.
After filtering: 5,436 positions with ≥ 3 alts having both AM and REVEL scores.
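The ranking and correlation steps of this section can be sketched as below: Spearman r is computed as the Pearson correlation of average-rank vectors. The function names are illustrative, and returning `null` when one rank vector is constant (all alts tied) is an assumption, since the text does not say how such positions are handled.

```javascript
// Assign 1-based ranks, giving tied values the average of the ranks they span.
function averageRanks(values) {
  const order = values.map((v, i) => [v, i]).sort((a, b) => a[0] - b[0]);
  const ranks = new Array(values.length);
  let i = 0;
  while (i < order.length) {
    let j = i;
    while (j + 1 < order.length && order[j + 1][0] === order[i][0]) j++;
    const avg = (i + j) / 2 + 1; // mean of ranks i+1 .. j+1
    for (let k = i; k <= j; k++) ranks[order[k][1]] = avg;
    i = j + 1;
  }
  return ranks;
}

// Pearson correlation; null when either vector has zero variance.
function pearson(x, y) {
  const n = x.length;
  const mx = x.reduce((s, v) => s + v, 0) / n;
  const my = y.reduce((s, v) => s + v, 0) / n;
  let sxy = 0, sxx = 0, syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - mx, dy = y[i] - my;
    sxy += dx * dy; sxx += dx * dx; syy += dy * dy;
  }
  if (sxx === 0 || syy === 0) return null;
  return sxy / Math.sqrt(sxx * syy);
}

// Spearman r = Pearson correlation of the two rank vectors.
function spearman(am, revel) {
  return pearson(averageRanks(am), averageRanks(revel));
}

// Toy scores for three alts at one position (made-up values).
console.log(spearman([0.1, 0.5, 0.9], [0.2, 0.6, 0.7])); // 1
console.log(spearman([0.1, 0.5, 0.9], [0.9, 0.5, 0.1])); // -1
console.log(spearman([0.1, 0.5, 0.9], [0.2, 0.9, 0.5])); // 0.5
```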
2.4 Distribution analysis
Bin the per-position r into 6 ranges: −1.0, (−1, −0.5], (−0.5, 0), [0, 0.5), [0.5, 1), +1.0. Tabulate the counts per bin.
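The six-range binning can be written as a small lookup. In this sketch the tolerance `EPS` for treating values as exactly ±1 is an assumption added to absorb floating-point round-off, and the labels use ASCII hyphens rather than the typographic minus used in the tables.

```javascript
const EPS = 1e-12; // assumed tolerance for round-off at the +1 / -1 endpoints

// Map one per-position Spearman r to its bin label.
function binLabel(r) {
  if (r >= 1 - EPS) return "+1.0";
  if (r <= -1 + EPS) return "-1.0";
  if (r >= 0.5) return "[0.5, 1)";
  if (r >= 0) return "[0, 0.5)";
  if (r > -0.5) return "(-0.5, 0)";
  return "(-1, -0.5]";
}

// Count how many r values fall in each of the six bins.
function histogram(rs) {
  const labels = ["+1.0", "[0.5, 1)", "[0, 0.5)", "(-0.5, 0)", "(-1, -0.5]", "-1.0"];
  const counts = Object.fromEntries(labels.map((label) => [label, 0]));
  for (const r of rs) counts[binLabel(r)] += 1;
  return counts;
}

// One toy value per bin, including the boundary cases 0.5, 0, and -0.5.
console.log(histogram([1, 0.5, 0, -0.25, -0.5, -1]));
```

Note that r = +0.5 falls in [0.5, 1) and r = −0.5 falls in (−1, −0.5], which matters because both values are common at n = 3.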
3. Results
3.1 The per-position alt-count distribution
The full alt-count-per-position distribution (across all positions with ≥1 alt having both scores):
| Alts at position | Position count |
|---|---|
| 1 | 198,529 |
| 2 | 18,405 |
| 3 | 3,875 |
| 4 | 1,088 |
| 5 | 368 |
| 6 | 121 |
| 7 | 24 |
| 8 | 9 |
| 9 | 2 |
| 11 | 1 |
The 5,436 positions with ≥ 3 alts are the analyzed subset for the per-position Spearman r computation.
3.2 The per-position Spearman r distribution
| Per-position Spearman r | Count | % of 5,436 |
|---|---|---|
| +1.0 (perfect agreement) | 1,309 | 24.08% |
| [0.5, 1) | 1,853 | 34.09% |
| [0, 0.5) | 579 | 10.65% |
| (−0.5, 0) | 946 | 17.40% |
| (−1, −0.5] | 296 | 5.45% |
| −1.0 (perfect disagreement) | 453 | 8.33% |
Mean per-position r = +0.264; Median = +0.500.
3.3 The 24.08% perfect-agreement subset
1,309 positions (24.08%) have AM and REVEL ranking the alts in identical order (Spearman r = +1.0). These are positions where the two predictors fully agree on which alt is most disruptive, second-most disruptive, etc.
The perfect-agreement positions are likely those in:
- Well-folded structural cores where the chemistry-class of substitution dominates the variant effect; both predictors capture the chemistry signal similarly.
- Conserved active-site or DNA-binding residues where the per-alt severity is monotonic in chemistry-distance from the wild-type, and both predictors learn this.
3.4 The 8.33% perfect-disagreement subset
453 positions (8.33%) have AM and REVEL ranking the alts in completely opposite order (Spearman r = −1.0). These are positions where the two predictors fundamentally disagree on which alt is most disruptive.
The perfect-disagreement positions are likely those in:
- Mixed structural/conservation contexts where AM weights the structural-impact-of-substitution feature heavily, but REVEL weights the conservation-rate feature heavily, and the two features point in different directions.
- Boundary positions between folded and disordered regions where each predictor's primary feature interpretation differs.
3.5 The 31.18% any-negative subset
1,695 positions (31.18%) have a negative correlation (r < 0) between AM and REVEL alt rankings. This is the broader "disagreement" subset, in which the two predictors order at least some pairs of alts differently.
3.6 The aggregate per-variant correlation masks per-position disagreement
The per-variant Pearson correlation between AM and REVEL on the full ClinVar P + B subset is approximately +0.55. Taken alone, this aggregate metric suggests that the two predictors largely agree.
But the per-position Spearman r distribution reveals that at a substantial fraction (31.18%) of positions, the two predictors rank the alts at that position in negatively correlated order. The aggregate per-variant correlation is dominated by the shared overall ranking of variants by pathogenicity (high-AM variants tend to be high-REVEL); the per-position disagreement on alt ranking is invisible at the aggregate level.
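A toy example may make the masking effect concrete. The scores below are made up, not taken from the dataset: two positions differ strongly in overall score level, so the pooled Pearson correlation is high, yet within each position the two predictors rank the three alts in opposite order. The helper names are illustrative, and the simple ranking function assumes no ties.

```javascript
// Pearson correlation of two equal-length numeric vectors.
function pearson(x, y) {
  const n = x.length;
  const mx = x.reduce((s, v) => s + v, 0) / n;
  const my = y.reduce((s, v) => s + v, 0) / n;
  let sxy = 0, sxx = 0, syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - mx, dy = y[i] - my;
    sxy += dx * dy; sxx += dx * dx; syy += dy * dy;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// 1-based ranks; valid only when all values are distinct (true for this toy data).
function ranks(values) {
  const sorted = [...values].sort((a, b) => a - b);
  return values.map((v) => sorted.indexOf(v) + 1);
}

const spearman = (a, b) => pearson(ranks(a), ranks(b));

// Made-up scores: one low-scoring and one high-scoring position, three alts each.
const positions = [
  { am: [0.10, 0.15, 0.20], revel: [0.20, 0.15, 0.10] },
  { am: [0.80, 0.85, 0.90], revel: [0.90, 0.85, 0.80] },
];

const perPosition = positions.map((p) => spearman(p.am, p.revel));
const pooled = pearson(
  positions.flatMap((p) => p.am),
  positions.flatMap((p) => p.revel)
);

console.log(perPosition);  // [ -1, -1 ]
console.log(pooled > 0.9); // true: between-position level differences dominate
```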
3.7 Implications for variant-prioritization
For variant-prioritization pipelines combining AM and REVEL:
- Per-position perfect-agreement subset (24.08%): ensemble adds little information. Either predictor alone is sufficient.
- Per-position perfect-disagreement subset (8.33%): ensemble combining both is most useful. The two predictors carry complementary signal about which alt is most disruptive. Variants in this subset warrant manual position-level review before clinical recommendation.
- Per-position partial disagreement subset (other 67.6%): standard ensemble weighting is appropriate.
The per-position Spearman r is a precomputable meta-feature derivable from a single ClinVar pass and provides a per-position ensemble-utility prior.
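As a rough illustration of using the precomputed r as a lookup, the sketch below maps a variant's position to one of the three subsets listed above. The map contents are toy values, and the `ensembleTier` name and the extra "no-prior" tier for positions outside the multi-alt subset are hypothetical additions.

```javascript
// Hypothetical lookup: "gene:pos" -> precomputed per-position Spearman r (toy values).
const positionR = new Map([
  ["GENE_A:10", 1],
  ["GENE_A:57", -1],
  ["GENE_B:42", 0.5],
]);

// Classify a position into the subsets used for ensemble weighting.
function ensembleTier(gene, pos) {
  const r = positionR.get(`${gene}:${pos}`);
  if (r === undefined) return "no-prior"; // position not in the multi-alt subset
  if (r === 1) return "agreement";        // either predictor alone may suffice
  if (r === -1) return "disagreement";    // flag for position-level review
  return "partial";                       // standard ensemble weighting
}

console.log(ensembleTier("GENE_A", 57)); // disagreement
console.log(ensembleTier("GENE_C", 1));  // no-prior
```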
4. Confound analysis
4.1 Stop-gain explicitly excluded
We filter alt = X. Reported numbers are missense-only.
4.2 The ≥3-alt threshold restricts to 5,436 positions
Positions with < 3 alts cannot have a meaningful Spearman r. Of the 222,422 positions with ≥1 alt having both scores, only 5,436 (2.4%) have ≥3 alts. The reported distribution applies to the multi-alt subset only.
4.3 At small alt counts, Spearman r is discrete
For n = 3 alts, the only possible Spearman r values are −1.0, −0.5, +0.5, +1.0 (assuming no ties). The per-position r distribution is therefore discrete-valued at small alt counts; the binning into ranges captures this.
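The discreteness can be checked by brute force. The sketch below enumerates every permutation of n tie-free ranks and applies the standard no-ties formula r = 1 − 6Σd² / (n(n² − 1)); the function names are illustrative.

```javascript
// All permutations of a small array.
function permutations(items) {
  if (items.length <= 1) return [items];
  return items.flatMap((item, i) =>
    permutations([...items.slice(0, i), ...items.slice(i + 1)]).map((rest) => [item, ...rest])
  );
}

// Distinct Spearman r values attainable with n alts and no ties.
function possibleSpearman(n) {
  const base = Array.from({ length: n }, (_, i) => i + 1);
  const values = new Set();
  for (const perm of permutations(base)) {
    const sumD2 = perm.reduce((s, rank, i) => s + (rank - base[i]) ** 2, 0);
    values.add(1 - (6 * sumD2) / (n * (n * n - 1)));
  }
  return [...values].sort((a, b) => a - b);
}

console.log(possibleSpearman(3)); // [ -1, -0.5, 0.5, 1 ]
```

Without ties, r = 0 first becomes attainable at n = 4, for example with the rank permutation (2, 4, 1, 3), whose squared rank differences sum to 10.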
4.4 The per-position Spearman r is computed on the alts present in ClinVar
Not all 19 possible alts are present at every position; the per-position Spearman r is computed on the subset of alts that have been submitted to ClinVar with both AM and REVEL scores. The per-position rankings are therefore conditional on the ClinVar-observed alt subset.
4.5 ClinVar curator labels are not used
The per-position Spearman r is independent of ClinVar's Pathogenic/Benign labels — it only uses the AM and REVEL scores and the alt identity. The analysis is predictor-vs-predictor, not predictor-vs-curator.
4.6 Per-isoform max-AM and max-REVEL aggregation
We use max-AM and max-REVEL across isoforms reported by MyVariant.info per variant. Per-isoform variability is small.
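The max-aggregation might look like the sketch below. It assumes a score field can arrive either as a single number or as an array with one entry per isoform, possibly containing missing-value placeholders such as null or "."; that input shape and the `maxScore` name are assumptions for illustration.

```javascript
// Reduce a score field to its maximum across isoforms, or null if no usable value.
function maxScore(value) {
  const list = Array.isArray(value) ? value : [value];
  const nums = list
    .filter((v) => v !== null && v !== undefined && v !== "") // drop empties before Number()
    .map(Number)
    .filter(Number.isFinite); // drops placeholders such as "." (NaN)
  return nums.length ? Math.max(...nums) : null;
}

console.log(maxScore([0.2, 0.9, 0.5])); // 0.9
console.log(maxScore(0.4));             // 0.4
console.log(maxScore([null, "."]));     // null
```

Filtering out null and empty strings before conversion matters because `Number(null)` and `Number("")` both evaluate to 0, which would otherwise be mistaken for a real score.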
4.7 The mechanism interpretation is post-hoc
The interpretation of the perfect-agreement vs perfect-disagreement subsets in §3.3 and §3.4 is post-hoc; we have not validated the interpretation with per-position residue-class annotations.
5. Implications
- AlphaMissense and REVEL rank alts at the same position in identical order at 24.08% of multi-alt ClinVar positions (perfect agreement r = +1.0).
- AlphaMissense and REVEL rank alts in completely opposite order at 8.33% of multi-alt positions (perfect disagreement r = −1.0).
- 31.18% of multi-alt positions have any negative AM-REVEL ranking correlation — substantial position-level disagreement.
- The per-position disagreement is invisible to per-variant aggregate correlation (which is +0.55 between AM and REVEL); the aggregate masks per-position predictor disagreement.
- For variant-prioritization: per-position Spearman r is a precomputable meta-feature; perfect-disagreement positions warrant manual review and benefit most from ensemble combining AM and REVEL.
6. Limitations
- Stop-gain excluded (§4.1).
- ≥3-alt threshold restricts to 5,436 of 222,422 positions (§4.2).
- Spearman r is discrete at small alt counts (§4.3).
- Per-position r is conditional on ClinVar-observed alt subset (§4.4).
- ClinVar labels not used (§4.5) — the analysis is predictor-vs-predictor.
- Per-isoform max-aggregation (§4.6).
- Mechanism interpretation post-hoc (§4.7).
7. Reproducibility
- Script: analyze.js (Node.js, ~50 LOC, zero deps).
- Inputs: ClinVar P + B JSON cache from MyVariant.info.
- Outputs: result.json with the 5,436-position Spearman r distribution, mean, median, the 6-bin histogram, and the alt-count-per-position distribution.
- Verification mode: 5 machine-checkable assertions: (a) total positions with ≥3 alts ≥ 5,000; (b) ≥20% positions at r = +1.0; (c) ≥5% positions at r = −1.0; (d) median r in [0.4, 0.6]; (e) mean r in [0.2, 0.4].
node analyze.js
node analyze.js --verify
8. References
- Cheng, J., et al. (2023). Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science 381, eadg7492.
- Ioannidis, N. M., et al. (2016). REVEL: an ensemble method for predicting the pathogenicity of rare missense variants. Am. J. Hum. Genet. 99, 877–885.
- Landrum, M. J., et al. (2018). ClinVar. Nucleic Acids Res. 46, D1062–D1067.
- Liu, X., Li, C., Mou, C., Dong, Y., & Tu, Y. (2020). dbNSFP v4. Genome Med. 12, 103.
- Wu, C., et al. (2021). MyVariant.info. Bioinformatics 37, 4029–4031.
- Spearman, C. (1904). The proof and measurement of association between two things. Am. J. Psychol. 15, 72–101.
- Pejaver, V., et al. (2022). Calibration of computational tools for missense variant pathogenicity classification and ClinGen recommendations. Am. J. Hum. Genet. 109, 2163–2177.
- Karczewski, K. J., et al. (2020). gnomAD constraint spectrum. Nature 581, 434–443.
- Cohen, J. (1960). A coefficient of agreement for nominal scales. Educ. Psychol. Meas. 20, 37–46.
- Richards, S., et al. (2015). ACMG/AMP variant interpretation guidelines. Genet. Med. 17, 405–424.