PDB File Analyzer for Protein Structure Validation and Quality Assessment
{ "skill_name": "PDB Structure Analyzer", "version": "1.0.0", "description": "Analyzes PDB files to extract structural information, amino acid composition, active sites, and ligand interactions.", "input_schema": { "type": "object", "properties": { "pdb_input": { "type": "string", "description": "PDB file path (local) or PDB ID (e.g., '1ABC' or '/path/to/file.pdb')", "examples": [ "1ABC", "/data/structures/3hhp.pdb", "./test_inputs/test.pdb" ] }, "output_file": { "type": "string", "description": "Optional output file path for the markdown report", "default": null }, "include_hydrogens": { "type": "boolean", "description": "Include hydrogen atoms in analysis", "default": false }, "verbose": { "type": "boolean", "description": "Enable verbose output with detailed atom information", "default": false } }, "required": [ "pdb_input" ] }, "output_schema": { "type": "object", "properties": { "success": { "type": "boolean" }, "report": { "type": "string", "description": "Markdown formatted analysis report" }, "structure_summary": { "type": "object", "properties": { "pdb_id": { "type": "string" }, "title": { "type": "string" }, "method": { "type": "string" }, "resolution": { "type": "number" }, "r_factor": { "type": "number" }, "num_chains": { "type": "integer" }, "num_residues": { "type": "integer" }, "num_atoms": { "type": "integer" }, "molecular_weight": { "type": "number" } } }, "amino_acid_composition": { "type": "object", "description": "Count of each amino acid type" }, "ligands": { "type": "array", "items": { "type": "object", "properties": { "name": { "type": "string" }, "chain": { "type": "string" }, "resnum": { "type": "integer" }, "type": { "type": "string" } } } }, "metal_ions": { "type": "array", "items": { "type": "object", "properties": { "ion": { "type": "string" }, "chain": { "type": "string" }, "resnum": { "type": "integer" }, "count": { "type": "integer" } } } } } }, "execution": { "type": "local", "command": "python execute.py {pdb_input} {output_file}", 
"environment": { "python_version": ">=3.8", "dependencies": [ "biopython", "requests" ] } }, "test_case": { "input": "./test_inputs/test.pdb", "expected_output_file": "expected_output.txt" } }
Reproducibility: Skill File
Use this skill file to reproduce the research with an AI agent.
# SKILL: PDB Structure Analyzer
## Name
PDB Structure Analyzer
## Description
Analyzes PDB (Protein Data Bank) files to extract structural information, amino acid composition, active sites, ligand interactions, and other biochemical properties. Generates comprehensive markdown reports suitable for research documentation.
## Input
- **Parameter**: PDB file path (local) or PDB ID (e.g., "1ABC")
- **Type**: String
- **Examples**:
  - Local file: `/path/to/structure.pdb`
  - PDB ID: `1ABC`
## Steps
### Step 1: Read PDB File
```
1.1 Accept input PDB file path or ID
1.2 If PDB ID, download PDB file using RCSB PDB API
1.3 Read PDB file content
1.4 Parse structure using Biopython PDBParser
```
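Step 1 can be sketched as follows. This is a minimal illustration, not the skill's actual `execute.py`: the helper names `classify_input` and `load_structure` are hypothetical, and the download URL pattern is an assumption about the RCSB file server. The Biopython and `requests` imports are deferred so that input classification works without them.

```python
import re
from pathlib import Path

# Assumption: RCSB serves legacy-format PDB files under this URL pattern.
RCSB_URL = "https://files.rcsb.org/download/{pdb_id}.pdb"
# Classic PDB IDs are four characters: a digit followed by three alphanumerics.
PDB_ID_RE = re.compile(r"^[0-9][A-Za-z0-9]{3}$")


def classify_input(pdb_input):
    """Return ("file", path) for an existing file or ("id", url) for a PDB ID."""
    if Path(pdb_input).is_file():
        return ("file", pdb_input)
    if PDB_ID_RE.match(pdb_input):
        return ("id", RCSB_URL.format(pdb_id=pdb_input.upper()))
    raise FileNotFoundError(f"not an existing file or a PDB ID: {pdb_input}")


def load_structure(pdb_input):
    """Hypothetical loader: read a local file or download by ID, then parse."""
    import io
    import requests
    from Bio.PDB import PDBParser

    kind, location = classify_input(pdb_input)
    parser = PDBParser(QUIET=True)  # QUIET suppresses construction warnings
    if kind == "file":
        return parser.get_structure(Path(location).stem, location)
    response = requests.get(location, timeout=30)
    response.raise_for_status()  # surfaces unknown IDs as an HTTP error
    return parser.get_structure(pdb_input.upper(), io.StringIO(response.text))
```

Checking for an existing file first means a local file that happens to be named like a PDB ID takes precedence over a download.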
### Step 2: Extract Atom Coordinates and Residue Information
```
2.1 Iterate through all models (usually only 1)
2.2 Iterate through all chains (Chain A, B, C...)
2.3 Extract for each residue:
    - Residue name (e.g., ALA, GLY, PRO)
    - Residue sequence number
    - Amino acid type (one of the 20 standard residues, or non-standard)
2.4 Extract for each atom:
    - Atom name (e.g., N, CA, C, O) and element (C, N, O, S...)
    - Coordinates (x, y, z)
    - Temperature factor (B-factor)
```
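The traversal in Step 2 might look like the sketch below. It relies only on the accessors Biopython exposes on its structure hierarchy (`chain.id`, `residue.id`, `residue.get_resname()`, `atom.get_name()`, `atom.get_coord()`, `atom.get_bfactor()`, `atom.element`). The function name `iter_atom_rows` and the row layout are illustrative assumptions.

```python
# The 20 standard amino acids, by three-letter residue name.
STANDARD_AA = frozenset({
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
})


def iter_atom_rows(structure, include_hydrogens=False):
    """Yield one dict per atom, walking model -> chain -> residue -> atom."""
    for model in structure:
        for chain in model:
            for residue in chain:
                # Biopython residue ids are (hetero flag, sequence number, insertion code).
                _hetflag, resseq, _icode = residue.id
                resname = residue.get_resname().strip().upper()
                for atom in residue:
                    if not include_hydrogens and getattr(atom, "element", "") == "H":
                        continue
                    x, y, z = (float(v) for v in atom.get_coord())
                    yield {
                        "model": model.id,
                        "chain": chain.id,
                        "resname": resname,
                        "resnum": resseq,
                        "is_standard_aa": resname in STANDARD_AA,
                        "atom": atom.get_name(),
                        "x": x, "y": y, "z": z,
                        "bfactor": float(atom.get_bfactor()),
                    }

# Usage with Biopython (requires the biopython package):
# from Bio.PDB import PDBParser
# structure = PDBParser(QUIET=True).get_structure("test", "./test_inputs/test.pdb")
# rows = list(iter_atom_rows(structure))
```

A generator keeps memory use flat for large structures, and the `include_hydrogens` flag mirrors the input schema option of the same name.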
### Step 3: Calculate Structural Properties
```
3.1 Resolution - extracted from REMARK 2 (not defined for NMR structures)
3.2 R-factor - extracted from REMARK 3 refinement records
3.3 Experimental method - extracted from EXPDTA (e.g., X-RAY DIFFRACTION, SOLUTION NMR, ELECTRON MICROSCOPY)
3.4 Chain count statistics
3.5 Residue count statistics
3.6 Atom count statistics
3.7 Calculate molecular weight (estimated)
```
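As a rough sketch of Step 3, the header metrics can be read directly from the PDB text: the method from `EXPDTA`, the resolution from `REMARK 2`, and the working-set R value from `REMARK 3`. The `REMARK 3` layout varies between refinement programs, so the pattern below is an assumption that covers one common form. The function names and the rounded average residue masses are illustrative.

```python
import re

RESOLUTION_RE = re.compile(r"^REMARK\s+2\s+RESOLUTION\.\s+([0-9]+\.[0-9]+)\s+ANGSTROMS")
# Assumption: matches the common "R VALUE (WORKING SET) : 0.196" form only.
R_WORK_RE = re.compile(r"^REMARK\s+3\s+R VALUE\s+\(WORKING SET\)\s*:\s*([0-9]*\.[0-9]+)")

# Average residue masses in Da (amino acid minus one water), rounded to 2 decimals.
RESIDUE_MASS = {
    "ALA": 71.08, "ARG": 156.19, "ASN": 114.10, "ASP": 115.09, "CYS": 103.14,
    "GLN": 128.13, "GLU": 129.12, "GLY": 57.05, "HIS": 137.14, "ILE": 113.16,
    "LEU": 113.16, "LYS": 128.17, "MET": 131.19, "PHE": 147.18, "PRO": 97.12,
    "SER": 87.08, "THR": 101.10, "TRP": 186.21, "TYR": 163.18, "VAL": 99.13,
}
WATER_MASS = 18.02


def parse_header_metrics(lines):
    """Extract method, resolution, and R-factor from PDB header lines."""
    metrics = {"method": None, "resolution": None, "r_factor": None}
    for line in lines:
        if line.startswith("EXPDTA") and metrics["method"] is None:
            metrics["method"] = line[10:79].strip()  # technique starts at column 11
        elif metrics["resolution"] is None and RESOLUTION_RE.match(line):
            metrics["resolution"] = float(RESOLUTION_RE.match(line).group(1))
        elif metrics["r_factor"] is None and R_WORK_RE.match(line):
            metrics["r_factor"] = float(R_WORK_RE.match(line).group(1))
    return metrics


def estimate_molecular_weight(composition, num_chains=1):
    """Sum residue masses and add one water per chain for the free termini."""
    total = sum(RESIDUE_MASS.get(name, 0.0) * count for name, count in composition.items())
    return total + WATER_MASS * num_chains
```

Residues missing from the mass table contribute nothing, so the estimate is a lower bound for structures containing modified residues.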
### Step 4: Identify Ligands and Metal Ions
```
4.1 Identify hetero residues (HETATM records) that are neither water nor single metal ions; treat these as ligands
4.2 Identify metal ions (Mg, Zn, Fe, Ca, Na, K, etc.)
4.3 Identify water molecules (HOH)
4.4 Extract ligand chemical name and formula (HETNAM and FORMUL records)
```
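One way to implement the classification in Step 4 is to combine Biopython's hetero flag (the first element of `residue.id`: a blank for polymer residues, `"W"` for water, `"H_<name>"` for other hetero groups) with the residue name. The metal list below is a non-exhaustive assumption, and `classify_residue` is a hypothetical helper name.

```python
# Residue names used for common single-atom metal ions (not exhaustive).
METAL_IONS = frozenset({"MG", "ZN", "FE", "FE2", "CA", "NA", "K", "MN", "CU", "CO", "NI"})
WATER_NAMES = frozenset({"HOH", "WAT", "DOD"})


def classify_residue(hetflag, resname):
    """Classify a residue as "polymer", "water", "metal_ion", or "ligand".

    hetflag is the first element of a Biopython residue id.
    """
    name = resname.strip().upper()
    if hetflag == "W" or name in WATER_NAMES:
        return "water"
    if hetflag.strip() == "":
        return "polymer"  # standard ATOM-record residue
    if name in METAL_IONS:
        return "metal_ion"
    return "ligand"

# Usage with a Biopython residue:
# kind = classify_residue(residue.id[0], residue.get_resname())
```

Matching on the residue name rather than the atom name matters here: `CA` as a residue name is a calcium ion, while `CA` as an atom name is an alpha carbon.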
### Step 5: Generate Structure Report
```
5.1 Generate complete report in Markdown format
5.2 Include structure summary table
5.3 Include amino acid composition analysis
5.4 Include ligand and metal ion list
5.5 Output to specified file or stdout
```
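The report assembly in Step 5 could be sketched as below. The dictionary keys follow the skill's output schema (`pdb_id`, `name`, `ion`, `chain`, `resnum`); the section headings, the function names `render_report` and `write_report`, and the table layout are assumptions, not the format of `expected_output.txt`.

```python
import sys


def render_report(summary, composition, ligands=(), metal_ions=()):
    """Build a Markdown report from the extracted data."""
    lines = [
        f"# Structure Report: {summary.get('pdb_id', 'unknown')}",
        "",
        "## Structure Summary",
        "",
        "| Property | Value |",
        "| --- | --- |",
    ]
    for key, value in summary.items():
        lines.append(f"| {key} | {'n/a' if value is None else value} |")

    total = sum(composition.values())
    lines += ["", "## Amino Acid Composition", "",
              "| Residue | Count | Percent |", "| --- | --- | --- |"]
    # Most frequent residues first; ties broken alphabetically.
    for name, count in sorted(composition.items(), key=lambda kv: (-kv[1], kv[0])):
        percent = 100.0 * count / total if total else 0.0
        lines.append(f"| {name} | {count} | {percent:.1f} |")

    lines += ["", "## Ligands and Metal Ions", ""]
    entries = [f"- {lig['name']} (chain {lig['chain']}, residue {lig['resnum']})"
               for lig in ligands]
    entries += [f"- {ion['ion']} (chain {ion['chain']}, residue {ion['resnum']})"
                for ion in metal_ions]
    lines += entries or ["None found."]
    return "\n".join(lines) + "\n"


def write_report(report, output_file=None):
    """Write to output_file if given, otherwise to stdout."""
    if output_file:
        with open(output_file, "w", encoding="utf-8") as handle:
            handle.write(report)
    else:
        sys.stdout.write(report)
```

Keeping rendering separate from writing lets the same string be returned in the `report` field and saved to disk.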
## Output
- **Format**: Markdown formatted structural analysis report
- **Content**:
  - Basic structural information summary
  - Experimental methods and technical parameters
  - Chain and residue statistics
  - Amino acid composition histogram/table
  - Ligand and metal ion list
  - Structural quality metrics
## Tool Requirements
- **Python libraries**:
  - `biopython` - PDB file parsing
  - `requests` - Download remote PDB files (only when input is PDB ID)
- **Local execution**: Supported
## Execution Command
```bash
python execute.py <pdb_path_or_id> [output_file]
```
## Error Handling
- File not found: Return error message
- Invalid or unknown PDB ID: Attempt the download from RCSB and report an error if it fails
- Parsing failure: Report specific parsing error
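
The error cases above can be mapped onto the `success` field of the output schema with a thin wrapper. This is only a sketch: `run_safely` is a hypothetical name, `analyze` stands for whatever callable performs the analysis and returns a result dict, and the `error` key is an assumption since the schema does not define one.

```python
def run_safely(analyze, pdb_input):
    """Call analyze(pdb_input) and convert failures into an error result."""
    try:
        result = analyze(pdb_input)
    except FileNotFoundError as exc:
        return {"success": False, "error": f"File not found: {exc}"}
    except ValueError as exc:
        return {"success": False, "error": f"Invalid input: {exc}"}
    except Exception as exc:
        # Covers download and parsing failures; keep the exception type
        # so the report names the specific error.
        return {"success": False, "error": f"Parsing failed: {type(exc).__name__}: {exc}"}
    return {"success": True, **result}
```

Catching the narrow exceptions before the generic one keeps the file-not-found message distinct from parser errors.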
## Example Output
See `expected_output.txt`