Gene Ontology Enrichment Analysis Tool for Functional Annotation Discovery
{ "organism": "hsapiens", "query": [ "EGFR", "KRAS", "TP53", "BRCA1", "BRCA2", "MYC", "PTEN", "RB1", "APC", "BRAF" ], "sources": [ "GO:BP", "GO:MF", "GO:CC" ], "user_threshold": 0.05, "significance_threshold_method": "g:SCS", "organism_version": null, "numeric_namespace": "ENTREZGENE", "background": null, "background_type": "g:background_type_perturbed", "min_set_size": 5, "max_set_size": 500, "min_subset_size": 5, "max_result_size": 0, "pool_categories": false, "hierarchical": true, "hierarchy_node_size": null, "domain_scope": "annotated", "domain_scope_size": null, "exclude_ec": false, "no_evidences": false, "no_iea": false, "short_slimmer": null, "measure_set_alignment": false, "permutation_number": 1000, "term_alignment": null, "optimizer": true }
Reproducibility: Skill File
Use this skill file to reproduce the research with an AI agent.
# SKILL: GO Enrichment Analyzer
## Name
GO Enrichment Analyzer
## Description
Performs Gene Ontology (GO) enrichment analysis on gene lists, identifying significant enrichment in biological processes, molecular functions, and cellular components.
## Input
- **Type**: Gene symbol list
- **Format**: Comma-, space-, or newline-separated gene symbols (e.g., `EGFR, KRAS, TP53`)
- **Example**: `EGFR KRAS TP53 BRCA1 BRCA2 MYC`
## Steps
### Step 1: Validate Gene Symbol Format
- Check that each input has a valid gene symbol format
- Remove whitespace and non-standard characters
- Validate that each symbol is 1-10 characters long and uses only letters, digits, and hyphens (HGNC-style, e.g., `TP53`, `BRCA1`)
- Filter out empty values and invalid inputs
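The validation rules above could be sketched roughly as follows. The regular expression, the function name `parse_gene_symbols`, and the de-duplication step are illustrative assumptions rather than part of the skill; the 1-10 character limit is taken from the step description and is a heuristic.

```python
import re

# Assumed pattern: a leading letter followed by letters, digits, or hyphens,
# 1-10 characters in total (heuristic from Step 1, not an official HGNC rule).
SYMBOL_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9-]{0,9}$")


def parse_gene_symbols(raw):
    """Split raw input and return (valid, invalid) symbol lists."""
    valid, invalid, seen = [], [], set()
    for token in re.split(r"[,\s]+", raw.strip()):
        # Strip characters outside the allowed alphabet.
        cleaned = re.sub(r"[^A-Za-z0-9-]", "", token)
        if not cleaned:
            continue  # drop empty values
        if not SYMBOL_PATTERN.match(cleaned):
            invalid.append(cleaned)  # caller can log a warning for these
            continue
        if cleaned not in seen:  # de-duplication is an added assumption
            seen.add(cleaned)
            valid.append(cleaned)
    return valid, invalid
```

Returning the rejected tokens separately lets the caller log them as warnings instead of silently dropping them.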
### Step 2: Call g:Profiler API for Enrichment Analysis
- Use `g:Convert` API to convert gene symbols to Ensembl IDs (optional)
- Use `g:GOSt` API to perform GO enrichment analysis
- API endpoint: `https://biit.cs.ut.ee/gprofiler/api/gost/profile/`
- Request method: POST
- Content-Type: `application/json`
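A minimal sketch of the request, using only the Python standard library. The URL constant and the `g_SCS` method value reflect g:Profiler's public API as commonly documented and should be treated as assumptions to verify against the current documentation; `build_gost_payload` and `run_gost` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed endpoint; check the current g:Profiler API documentation.
GOST_URL = "https://biit.cs.ut.ee/gprofiler/api/gost/profile/"


def build_gost_payload(genes, organism="hsapiens", threshold=0.05, method="g_SCS"):
    """Build the JSON body for a GO-only enrichment query."""
    return {
        "organism": organism,
        "query": list(genes),
        "sources": ["GO:BP", "GO:MF", "GO:CC"],
        "user_threshold": threshold,
        "significance_threshold_method": method,
    }


def run_gost(genes, url=GOST_URL, timeout=30, **options):
    """POST the payload as application/json and return the decoded response."""
    body = json.dumps(build_gost_payload(genes, **options)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping the payload builder separate from the network call makes the request body easy to inspect or log before anything is sent.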
### Step 3: Retrieve GO Classification Results
Extract results from three GO branches in the API response:
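- **BP (Biological Process)**: Larger processes or pathways the gene products take part in (source `GO:BP`)
- **MF (Molecular Function)**: Molecular-level activities of the gene products, such as binding or catalysis (source `GO:MF`)
- **CC (Cellular Component)**: Cellular locations where the gene products act (source `GO:CC`)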
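Splitting a response into the three branches might look like the sketch below. It assumes the response is a dictionary with a `result` list whose items carry a `source` field (`GO:BP`, `GO:MF`, or `GO:CC`); those field names are assumptions about the response layout, and `group_by_branch` is a hypothetical helper.

```python
GO_SOURCES = ("GO:BP", "GO:MF", "GO:CC")


def group_by_branch(response):
    """Return {source: [terms]} for the three GO branches."""
    grouped = {source: [] for source in GO_SOURCES}
    for term in response.get("result", []):
        source = term.get("source")
        if source in grouped:  # ignore any non-GO sources
            grouped[source].append(term)
    return grouped
```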
### Step 4: Correct p-values (Benjamini-Hochberg)
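- Use the Benjamini-Hochberg (BH) method for multiple hypothesis testing correction
- g:GOSt applies the correction selected by `significance_threshold_method` on the server and reports adjusted p-values, so request BH there rather than correcting the returned values a second time
- Set the significance threshold (typically 0.05)
- Keep only terms that remain significantly enriched after correction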
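For reference, the Benjamini-Hochberg adjustment itself can be sketched in a few lines of plain Python. This is an illustration of the procedure for a list of raw p-values (for instance, from another enrichment source), not a description of how g:Profiler computes its values; the function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is capped by the ones above it (monotonicity).
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def significant_indices(p_values, threshold=0.05):
    """Indices of tests whose adjusted p-value is at or below the threshold."""
    adjusted = benjamini_hochberg(p_values)
    return [i for i, p in enumerate(adjusted) if p <= threshold]
```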
### Step 5: Output Enrichment Results Report
Generate structured report containing:
- Raw p-values and corrected p-values
- Enrichment score/ratio
- Number of genes involved
- GO Term description and ID
- Results output in JSON or Markdown table format
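One way to render the report as a Markdown table is sketched below. The term field names (`native`, `name`, `p_value`, `intersection_size`, `query_size`, `term_size`, `effective_domain_size`) are assumptions about the response layout, and defining the enrichment ratio as fold enrichment (observed overlap divided by expected overlap) is one possible choice, not something the skill specifies.

```python
def fold_enrichment(term):
    """Observed/expected overlap: (k/n) / (K/N), computed as k*N / (n*K)."""
    denominator = term["query_size"] * term["term_size"]
    if denominator == 0:
        return float("nan")
    return term["intersection_size"] * term["effective_domain_size"] / denominator


def to_markdown_table(terms):
    """Render terms as a Markdown table, most significant first."""
    rows = [
        "| GO ID | Term | Adjusted p-value | Gene count | Fold enrichment |",
        "| --- | --- | --- | --- | --- |",
    ]
    for term in sorted(terms, key=lambda t: t["p_value"]):
        rows.append(
            f"| {term['native']} | {term['name']} | {term['p_value']:.2e} "
            f"| {term['intersection_size']} | {fold_enrichment(term):.2f} |"
        )
    return "\n".join(rows)
```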
## Output
- **Format**: JSON or Markdown table
- **Content**:
  - Success/failure status
  - List of enrichment analysis results
  - Each enrichment term: GO ID, name, p-value, corrected p-value, gene count, related genes
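The JSON form of this output could be assembled along the following lines. The key names (`status`, `error`, `results`, `go_id`, and so on) and the shape of the input term dictionaries are hypothetical, chosen only to mirror the fields listed above.

```python
import json


def build_output(terms, error=None):
    """Wrap per-term results in a status envelope ready for json.dumps."""
    results = [
        {
            "go_id": term["go_id"],
            "name": term["name"],
            "p_value": term.get("p_value"),  # None when no raw value is available
            "corrected_p_value": term["corrected_p_value"],
            "gene_count": len(term["genes"]),
            "genes": sorted(term["genes"]),
        }
        for term in terms
    ]
    return {
        "status": "failure" if error else "success",
        "error": error,
        "results": results,  # may hold partial results when an error occurred
    }
```

Calling `json.dumps(build_output(terms), indent=2)` then yields the JSON report.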
## Tools
### g:Profiler APIs
1. **g:Convert** - Gene ID conversion
   - Endpoint: `https://biit.cs.ut.ee/gprofiler/api/convert/convert/`
   - Purpose: Convert gene symbols to Ensembl IDs
2. **g:GOSt** - GO enrichment analysis
   - Endpoint: `https://biit.cs.ut.ee/gprofiler/api/gost/profile/`
   - Purpose: Perform GO enrichment analysis
   - Parameters:
     - `organism`: Organism identifier (default: `hsapiens`, human)
     - `query`: Gene list
     - `sources`: GO branches (`GO:BP`, `GO:MF`, `GO:CC`)
     - `user_threshold`: p-value threshold
     - `significance_threshold_method`: Multiple-testing correction method, BH (API value `fdr`) or g:SCS (API value `g_SCS`)
## Error Handling
- API connection failure: Retry 3 times with 2-second intervals
- Invalid gene: Skip and log warning
- API returns error: Log error message and return partial results
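The retry policy above might be implemented as a small wrapper like this sketch. It reads "retry 3 times" as at most three attempts in total, which is an interpretation; `call_with_retry` is a hypothetical helper, and the `sleep` parameter exists only so the delay can be replaced in tests.

```python
import logging
import time

logger = logging.getLogger("go_enrichment")


def call_with_retry(func, attempts=3, delay=2.0, sleep=time.sleep):
    """Call func(), retrying on connection-level errors with a fixed delay."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except OSError as exc:  # urllib.error.URLError and ConnectionError are OSError subclasses
            last_error = exc
            logger.warning("Attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt < attempts:
                sleep(delay)
    raise last_error
```

A caller can catch the final exception, log it, and return whatever partial results were already collected.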