2604.01935 Codon-Position-2 Missense Single-Nucleotide Variants in ClinVar Have a 30.94% Pathogenic-Fraction (39,413 of 127,404; Wilson 95% CI [30.68, 31.19]), 4.94 Percentage Points Higher Than Codon-Position-1 Variants (26.00%, [25.75, 26.25]) and 1.14 pp Higher Than Codon-Position-3 (29.80%, [29.12, 30.48]) Across 266,198 Codon-Position-Assignable ClinVar Variants — A Genetic-Code-Structural Asymmetry Reflecting That Position-2 Nucleotide Identity Determines Amino-Acid Chemistry-Class
We compute the per-codon-position Pathogenic fraction of ClinVar missense single-nucleotide variants. For each variant, we parse the nucleotide change from the HGVS _id field and the (refAA, altAA) pair from dbnsfp.
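As a minimal sketch of the two building blocks named so far, the snippet below parses a single-nucleotide change from an HGVS-style identifier and computes a Wilson 95% interval for a pathogenic fraction. The identifier layout (`chr<chrom>:g.<pos><ref>><alt>`) and the function names `parse_snv` and `wilson_interval` are assumptions made for illustration; they are not taken from the original pipeline, and the step that assigns a codon position is not shown.

```python
import math
import re

# Assumed identifier layout: "chr7:g.140453136A>T" (genomic HGVS for an SNV).
# This pattern is an illustrative assumption, not the authors' parser.
_SNV_RE = re.compile(r"^chr([0-9A-Za-z]+):g\.(\d+)([ACGT])>([ACGT])$")


def parse_snv(hgvs_id):
    """Return (chrom, pos, ref, alt) for a single-nucleotide change, else None."""
    match = _SNV_RE.match(hgvs_id)
    if match is None:
        return None  # not a simple substitution (e.g. indel) or unexpected format
    chrom, pos, ref, alt = match.groups()
    return chrom, int(pos), ref, alt


def wilson_interval(successes, total, z=1.959963984540054):
    """Wilson score interval for a binomial proportion (default z gives 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p_hat = successes / total
    z2 = z * z
    denom = 1.0 + z2 / total
    center = (p_hat + z2 / (2.0 * total)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1.0 - p_hat) / total + z2 / (4.0 * total * total))
    ) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    print(parse_snv("chr7:g.140453136A>T"))
    # Position-2 counts reported in the title: 39,413 Pathogenic of 127,404.
    low, high = wilson_interval(39413, 127404)
    print(f"{100 * low:.2f}% to {100 * high:.2f}%")
```

Applied to the position-2 counts in the title (39,413 of 127,404), this interval rounds to [30.68, 31.19] in percent, which agrees with the reported Wilson 95% CI.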