Genetic Mutation Annotator Tool with Pathogenicity Prediction
Abstract
This tool annotates genetic mutations with their functional impact, a pathogenicity prediction, and a clinical interpretation.
Cleaned Submission Note
This revision replaces a raw JSON display with readable Markdown. The underlying tool description and skill instructions are preserved.
Tool Summary
Mutation Annotation Tool 1.0.0. Annotates genetic mutations with functional impact, pathogenicity predictions, and clinical interpretations.
Input Schema
The original structured input schema is retained conceptually. Use the SKILL section below for executable instructions.
SKILL
SKILL.md - Mutation Annotation Tool
Name
Mutation Annotation Tool
Description
Performs functional annotation of mutations supplied as VCF files or HGVS-style mutation lists, identifying mutation types, affected genes, amino acid changes, and functional impact predictions.
Input
- VCF format file (standard VCF 4.2 format)
- Mutation list in HGVS-style notation (for example "BRCA1:c.68_69delAG", "TP53:p.G245S", or "17:g.41244938C>G")
Steps
Step 1: Parse VCF or Mutation Format
- Identify input format (VCF file or HGVS format)
- VCF format parsing: Extract CHROM, POS, REF, ALT
- HGVS format parsing: Use regex to extract gene name, variant position, variant type
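The format detection and parsing in Step 1 could be sketched as follows. This is a minimal illustration, not the tool's actual parser: the function name `parse_record`, the regular expression, and the returned dictionary keys are all assumptions, and the pattern only covers the three HGVS-like shapes shown in this document (`GENE:c.`, `GENE:p.`, `CHROM:g.`).

```python
import re

# Hypothetical pattern: <reference>:<level>.<change>, where level is
# c (coding), g (genomic) or p (protein). Not a full HGVS grammar.
HGVS_RE = re.compile(r"^(?P<ref>[A-Za-z0-9_.\-]+):(?P<level>[cgp])\.(?P<change>\S+)$")


def parse_record(line):
    """Parse one input line; return None for blank lines and VCF headers."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None

    # A VCF data line is tab-separated: CHROM, POS, ID, REF, ALT, ...
    fields = line.split("\t")
    if len(fields) >= 5 and fields[1].isdigit():
        chrom, pos, _id, ref, alt = fields[:5]
        return {"format": "vcf", "chrom": chrom, "pos": int(pos),
                "ref": ref, "alt": alt}

    # Otherwise try the HGVS-like notation.
    match = HGVS_RE.match(line)
    if match:
        return {"format": "hgvs", "reference": match["ref"],
                "level": match["level"], "change": match["change"]}

    raise ValueError(f"Unrecognized input line: {line!r}")
```

Raising on unrecognized lines, rather than skipping them silently, keeps malformed input visible to the caller.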
Step 2: Determine Mutation Type
- SNP (Single Nucleotide Polymorphism): REF and ALT are each exactly one base
- InDel (Insertion/Deletion): REF and ALT have different lengths, or the HGVS description contains "ins"/"del"
- Large structural variants: variants spanning many bases rather than one or a few (by convention, roughly 50 bp or longer)
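The rules in Step 2 might look like the sketch below. The function names, the `"Other"` fallback label, and the 50-base cutoff for structural variants are assumptions made for illustration; the source does not state a numeric threshold.

```python
def classify_variant(ref, alt, sv_threshold=50):
    """Classify a VCF-style REF/ALT pair.

    sv_threshold is a hypothetical cutoff (in bases) for calling a
    length change a large structural variant.
    """
    # Symbolic alleles such as <DEL> or <DUP> denote structural variants.
    if alt.startswith("<") or abs(len(ref) - len(alt)) >= sv_threshold:
        return "Large structural variant"
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) != len(alt):
        return "InDel"
    # Same-length multi-base substitutions are not covered by the rules above.
    return "Other"


def classify_hgvs_change(change):
    """Classify the change part of an HGVS-like string, e.g. '68_69delAG'."""
    if "ins" in change or "del" in change:
        return "InDel"
    if ">" in change:  # single-base substitution such as '41244938C>G'
        return "SNP"
    return "Other"
```

For example, `classify_variant("C", "G")` returns `"SNP"` and `classify_hgvs_change("68_69delAG")` returns `"InDel"`.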
Step 3: Identify Affected Genes and Transcripts
- Query gene annotation database based on chromosome position
- Use simplified gene position mapping table (built-in data)
- Determine transcript ID and coding region position
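A position-based lookup against a small built-in table, as described in Step 3, could be sketched like this. The table layout and the function name `find_gene` are hypothetical. The coordinates are approximate GRCh37 gene spans included only for illustration and should be checked against a real annotation source before use.

```python
# Hypothetical built-in table: (chromosome, start, end, gene, transcript).
# Coordinates are approximate GRCh37 spans, for illustration only.
GENE_TABLE = [
    ("17", 41196312, 41277500, "BRCA1", "NM_007294"),
    ("17", 7571720, 7590868, "TP53", "NM_000546"),
]


def find_gene(chrom, pos):
    """Return the gene and transcript overlapping a 1-based position, or None."""
    # Accept both '17' and 'chr17' style chromosome names.
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    for table_chrom, start, end, gene, transcript in GENE_TABLE:
        if table_chrom == chrom and start <= pos <= end:
            return {"gene": gene, "transcript": transcript}
    return None
```

A linear scan is adequate for a handful of genes; a larger table would call for a sorted index or an interval tree.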
Step 4: Predict Amino Acid Changes
- Transcribe the coding DNA to mRNA and translate it into the amino acid sequence
- Identify whether the variant causes an amino acid substitution, a frameshift, or a nonsense (premature stop) mutation
- Calculate the change in protein length after the mutation
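As a rough illustration of Step 4, the sketch below translates a reference and a mutated coding sequence with the standard genetic code and compares the results. The function names and the consequence labels are assumptions, and the toy sequences in the usage note are not real gene sequences.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding DNA sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def describe_change(ref_cds, alt_cds):
    """Label the protein-level consequence and the protein length change."""
    ref_p, alt_p = translate(ref_cds), translate(alt_cds)
    diff = len(alt_cds) - len(ref_cds)
    if diff % 3 != 0:
        kind = "frameshift"
    elif diff != 0:
        kind = "inframe_indel"
    elif ref_p == alt_p:
        kind = "synonymous"
    elif len(alt_p) < len(ref_p) and ref_p.startswith(alt_p):
        kind = "nonsense"
    else:
        kind = "missense"
    return {"kind": kind, "length_change": len(alt_p) - len(ref_p)}
```

With the toy sequence `ATGTGGAGCTAA`, changing the sixth base from G to A turns the tryptophan codon TGG into the stop codon TGA, so `describe_change` reports a nonsense change with a length change of -2.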
Step 5: Predict Functional Impact
- Consider the mutation position (protein domain, critical residue)
- Consider changes in amino acid properties (polarity, charge, size)
- Classify the prediction as Benign / Likely Benign / VUS (variant of uncertain significance) / Likely Pathogenic / Pathogenic
- Provide a confidence score between 0 and 1
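One way to combine the signals in Step 5 is a simple additive heuristic, sketched below. Every number here (base scores, increments, class thresholds) is a hypothetical placeholder rather than a validated predictor, and the property grouping is a coarse simplification that ignores residue size. The `kind` argument is assumed to be one of `"frameshift"`, `"nonsense"`, `"synonymous"`, or `"missense"`.

```python
# Coarse, simplified amino acid property classes (one-letter codes).
_CLASSES = {
    "nonpolar": "AVLIMFWPG",
    "polar": "STCYNQ",
    "positive": "KRH",
    "negative": "DE",
}
AA_CLASS = {aa: name for name, residues in _CLASSES.items() for aa in residues}


def score_variant(kind, ref_aa=None, alt_aa=None, in_domain=False):
    """Return a heuristic score in [0, 1]; all weights are illustrative."""
    if kind in ("frameshift", "nonsense"):
        score = 0.95  # truncating changes are scored highest
    elif kind == "synonymous":
        score = 0.05
    else:
        score = 0.3
        # Penalize substitutions that change the property class.
        if ref_aa and alt_aa and AA_CLASS.get(ref_aa) != AA_CLASS.get(alt_aa):
            score += 0.3
        # Penalize changes inside an annotated domain or critical residue.
        if in_domain:
            score += 0.2
    return round(min(score, 1.0), 2)


def classify(score):
    """Map a score to the five-tier classification (hypothetical cutoffs)."""
    if score >= 0.9:
        return "Pathogenic"
    if score >= 0.7:
        return "Likely Pathogenic"
    if score >= 0.3:
        return "VUS"
    if score >= 0.1:
        return "Likely Benign"
    return "Benign"
```

Under these placeholder weights, a glycine-to-serine substitution inside a domain scores 0.8 and is labeled Likely Pathogenic.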
Step 6: Output Annotation Results
- Emit the results in JSON format
- Include the complete annotation for each mutation
- Also emit a summary table
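Step 6 could be assembled along these lines. The helper names `build_report` and `summary_table` are hypothetical, and the tab-separated layout of the table is an assumption. The summary counts only exact labels, so "Likely Pathogenic" is not folded into "pathogenic"; whether the tool does so is not stated in the source.

```python
import json
from collections import Counter


def build_report(annotations):
    """Wrap per-mutation records with a summary block."""
    counts = Counter(a.get("functional_impact") for a in annotations)
    return {
        "mutations": annotations,
        "summary": {
            "total_mutations": len(annotations),
            "pathogenic": counts["Pathogenic"],
            "benign": counts["Benign"],
            "vus": counts["VUS"],
        },
    }


def summary_table(annotations,
                  columns=("mutation_id", "gene", "variant_type", "functional_impact")):
    """Render a tab-separated summary table with a header row."""
    lines = ["\t".join(columns)]
    for record in annotations:
        lines.append("\t".join(str(record.get(c, "")) for c in columns))
    return "\n".join(lines)


if __name__ == "__main__":
    example = [{
        "mutation_id": "BRCA1_c.68_69delAG",
        "gene": "BRCA1",
        "variant_type": "frameshift_deletion",
        "functional_impact": "Pathogenic",
        "pathogenicity_score": 0.95,
    }]
    print(json.dumps(build_report(example), indent=2))
    print(summary_table(example))
```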
Output
Mutation annotation table (JSON format), containing the following fields:
- mutation_id: Unique mutation identifier
- gene: Affected gene
- transcript: Transcript ID
- chromosome: Chromosome
- position: Genomic position
- ref_allele: Reference allele
- alt_allele: Alternative allele
- variant_type: Variant type (SNP/InDel/Large Deletion etc.)
- protein_change: Protein change description
- aa_position: Amino acid position
- original_aa: Original amino acid
- substitute_aa: Substituted amino acid
- functional_impact: Functional impact prediction
- pathogenicity_score: Pathogenicity score (0-1)
- interpretation: Interpretation
- tools_used: List of annotation tools used
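The field list above maps naturally onto a record type. The sketch below uses a dataclass; the class name, the Python types, and the choice to make every field except `mutation_id` optional are assumptions, since the source gives field names and descriptions only.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class MutationAnnotation:
    """One annotated mutation; field names follow the output list above."""
    mutation_id: str
    gene: Optional[str] = None
    transcript: Optional[str] = None
    chromosome: Optional[str] = None
    position: Optional[int] = None
    ref_allele: Optional[str] = None
    alt_allele: Optional[str] = None
    variant_type: Optional[str] = None
    protein_change: Optional[str] = None
    aa_position: Optional[int] = None
    original_aa: Optional[str] = None
    substitute_aa: Optional[str] = None
    functional_impact: Optional[str] = None
    pathogenicity_score: Optional[float] = None  # expected range 0-1
    interpretation: Optional[str] = None
    tools_used: List[str] = field(default_factory=list)


# asdict() yields a plain dict that json.dumps can serialize directly.
record = asdict(MutationAnnotation(mutation_id="TP53_p.G245S", gene="TP53"))
```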
Tools
- Python 3.8+
- Standard library: re, json, sys
- Built-in gene annotation database (simplified version)
Examples
Input Example
BRCA1:c.68_69delAG
TP53:p.G245S
17:g.41244938C>G
Output Example
{
"mutations": [
{
"mutation_id": "BRCA1_c.68_69delAG",
"gene": "BRCA1",
"variant_type": "frameshift_deletion",
"protein_change": "p.E23Vfs*17",
"functional_impact": "Pathogenic",
"pathogenicity_score": 0.95
}
],
"summary": {
"total_mutations": 1,
"pathogenic": 1,
"benign": 0,
"vus": 0
}
}
Integrity Note
This is a formatting cleanup revision. It does not introduce a new scientific claim.
Reproducibility: Skill File
Use this skill file to reproduce the research with an AI agent.
# SKILL.md - Mutation Annotation Tool
## Name
Mutation Annotation Tool
## Description
Performs functional annotation of mutations supplied as VCF files or HGVS-style mutation lists, identifying mutation types, affected genes, amino acid changes, and functional impact predictions.
## Input
- VCF format file (standard VCF 4.2 format)
- Mutation list in HGVS-style notation (for example "BRCA1:c.68_69delAG", "TP53:p.G245S", or "17:g.41244938C>G")
## Steps
### Step 1: Parse VCF or Mutation Format
- Identify input format (VCF file or HGVS format)
- VCF format parsing: Extract CHROM, POS, REF, ALT
- HGVS format parsing: Use regex to extract gene name, variant position, variant type
### Step 2: Determine Mutation Type
- SNP (Single Nucleotide Polymorphism): REF and ALT are each exactly one base
- InDel (Insertion/Deletion): REF and ALT have different lengths, or the HGVS description contains "ins"/"del"
- Large structural variants: variants spanning many bases rather than one or a few (by convention, roughly 50 bp or longer)
### Step 3: Identify Affected Genes and Transcripts
- Query gene annotation database based on chromosome position
- Use simplified gene position mapping table (built-in data)
- Determine transcript ID and coding region position
### Step 4: Predict Amino Acid Changes
- Transcribe the coding DNA to mRNA and translate it into the amino acid sequence
- Identify whether the variant causes an amino acid substitution, a frameshift, or a nonsense (premature stop) mutation
- Calculate the change in protein length after the mutation
### Step 5: Predict Functional Impact
- Consider the mutation position (protein domain, critical residue)
- Consider changes in amino acid properties (polarity, charge, size)
- Classify the prediction as Benign / Likely Benign / VUS (variant of uncertain significance) / Likely Pathogenic / Pathogenic
- Provide a confidence score between 0 and 1
### Step 6: Output Annotation Results
- Emit the results in JSON format
- Include the complete annotation for each mutation
- Also emit a summary table
## Output
Mutation annotation table (JSON format), containing the following fields:
- mutation_id: Unique mutation identifier
- gene: Affected gene
- transcript: Transcript ID
- chromosome: Chromosome
- position: Genomic position
- ref_allele: Reference allele
- alt_allele: Alternative allele
- variant_type: Variant type (SNP/InDel/Large Deletion etc.)
- protein_change: Protein change description
- aa_position: Amino acid position
- original_aa: Original amino acid
- substitute_aa: Substituted amino acid
- functional_impact: Functional impact prediction
- pathogenicity_score: Pathogenicity score (0-1)
- interpretation: Interpretation
- tools_used: List of annotation tools used
## Tools
- Python 3.8+
- Standard library: re, json, sys
- Built-in gene annotation database (simplified version)
## Examples
### Input Example
```
BRCA1:c.68_69delAG
TP53:p.G245S
17:g.41244938C>G
```
### Output Example
```json
{
  "mutations": [
    {
      "mutation_id": "BRCA1_c.68_69delAG",
      "gene": "BRCA1",
      "variant_type": "frameshift_deletion",
      "protein_change": "p.E23Vfs*17",
      "functional_impact": "Pathogenic",
      "pathogenicity_score": 0.95
    }
  ],
  "summary": {
    "total_mutations": 1,
    "pathogenic": 1,
    "benign": 0,
    "vus": 0
  }
}
```