CopyNumberEngine: Circular Binary Segmentation, Ploidy Estimation, and Chromosomal Instability Scoring from WGS
Somatic copy number alterations (SCNAs) are ubiquitous in cancer, driving oncogene amplification and tumor suppressor deletion. We present CopyNumberEngine, a pure-Python pipeline for copy number analysis from whole-genome sequencing (WGS). The engine implements circular binary segmentation (CBS) for breakpoint detection, ploidy and purity estimation by grid search, allele-specific copy number calling (major and minor alleles from B-allele frequency, BAF), chromosomal instability (CIN) scoring, and classification of amplifications and deletions as focal or arm-level. Applied to 50 tumor samples with 100,000 genomic bins, the pipeline reports a mean ploidy of 3.32 ± 1.15, an aneuploidy score of 0.508, 2155 segments per sample, a median segment size of 10 Mb, and 1000 focal events per sample.
Introduction
Somatic copy number alterations (SCNAs) are among the most common genomic alterations in cancer. Circular binary segmentation (CBS) is a standard algorithm for detecting copy number breakpoints from sequencing read-depth data.
Methods
CBS
CBS recursively splits each genomic segment at the position that maximizes the t-statistic for the difference in mean between the two sides of the split.
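The recursive splitting described above can be sketched as follows. This is a minimal illustration, not the CopyNumberEngine implementation: the function names, the pooled-variance t-statistic, the fixed stopping threshold `t_threshold`, and the `min_size` limit are all assumptions made for the example. Full CBS as originally published tests circular arcs (two breakpoints at a time) and assesses significance by permutation; the sketch follows the simpler single-breakpoint description given here.

```python
import math

def best_split(x, lo, hi, min_size=2):
    """Find the split of x[lo:hi] with the largest two-sample t-statistic.

    Returns (t, k), where k is the index of the first bin of the right part,
    or (0.0, -1) if the segment is too short to split.
    """
    n = hi - lo
    if n < max(4, 2 * min_size):
        return 0.0, -1
    total = sum(x[lo:hi])
    total_sq = sum(v * v for v in x[lo:hi])
    left_sum = 0.0
    left_sq = 0.0
    best_t, best_k = 0.0, -1
    for k in range(lo + 1, hi):
        v = x[k - 1]
        left_sum += v
        left_sq += v * v
        n1, n2 = k - lo, hi - k
        if n1 < min_size or n2 < min_size:
            continue
        m1 = left_sum / n1
        m2 = (total - left_sum) / n2
        # Within-group sums of squares, pooled over both sides.
        ss1 = left_sq - n1 * m1 * m1
        ss2 = (total_sq - left_sq) - n2 * m2 * m2
        pooled_var = max((ss1 + ss2) / (n - 2), 1e-12)
        t = abs(m1 - m2) / math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
        if t > best_t:
            best_t, best_k = t, k
    return best_t, best_k

def segment(x, t_threshold=5.0, min_size=2):
    """Split x recursively; return a list of (start, end) half-open segments.

    t_threshold is a hypothetical fixed cutoff standing in for a
    significance test.
    """
    breakpoints = []
    stack = [(0, len(x))]  # explicit stack instead of recursion
    while stack:
        lo, hi = stack.pop()
        t, k = best_split(x, lo, hi, min_size)
        if k >= 0 and t > t_threshold:
            breakpoints.append(k)
            stack.append((lo, k))
            stack.append((k, hi))
    edges = [0] + sorted(breakpoints) + [len(x)]
    return list(zip(edges[:-1], edges[1:]))
```

Calling `segment(log_ratios)` on a list of per-bin values returns half-open bin ranges, one per segment. The prefix-sum bookkeeping keeps each split search linear in the segment length, which matters at the scale of 100,000 bins.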
Ploidy/Purity
Ploidy and purity are estimated jointly by grid search over ploidy (1.5-6.0) and purity (0.3-1.0), selecting the pair that minimizes the distance between observed and expected copy number states.
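One way such a grid search could look is sketched below. The text does not specify the distance function or grid resolution, so everything beyond the two ranges is an assumption: the sketch takes per-segment depth ratios (relative to the sample mean), assumes a diploid normal contaminant, converts each ratio to an implied tumor copy number under a candidate (purity, ploidy) pair, and scores the pair by the length-weighted squared distance to the nearest non-negative integer. The function names and step sizes are hypothetical.

```python
def implied_copy_number(ratio, purity, ploidy):
    """Tumor copy number implied by a depth ratio, assuming a diploid normal.

    Inverts ratio = (purity*cn + 2*(1-purity)) / (purity*ploidy + 2*(1-purity)).
    """
    denom = purity * ploidy + 2.0 * (1.0 - purity)
    return (ratio * denom - 2.0 * (1.0 - purity)) / purity

def fit_cost(ratios, weights, purity, ploidy):
    """Weighted mean squared distance to the nearest integer copy number."""
    total = 0.0
    for r, w in zip(ratios, weights):
        cn = implied_copy_number(r, purity, ploidy)
        nearest = max(0, round(cn))
        total += w * (cn - nearest) ** 2
    return total / sum(weights)

def grid_search(ratios, weights, ploidy_step=0.1, purity_step=0.05):
    """Scan ploidy in [1.5, 6.0] and purity in [0.3, 1.0].

    Returns (cost, purity, ploidy) for the best-scoring grid point.
    The step sizes are illustrative assumptions.
    """
    n_purity = int(round((1.0 - 0.3) / purity_step)) + 1
    n_ploidy = int(round((6.0 - 1.5) / ploidy_step)) + 1
    best = (float("inf"), None, None)
    for i in range(n_purity):
        purity = min(1.0, 0.3 + i * purity_step)
        for j in range(n_ploidy):
            ploidy = min(6.0, 1.5 + j * ploidy_step)
            cost = fit_cost(ratios, weights, purity, ploidy)
            if cost < best[0]:
                best = (cost, purity, ploidy)
    return best
```

Here `weights` would typically be segment lengths, so long segments dominate the fit. Under this simple cost, several (purity, ploidy) pairs can fit the same data equally well, so a practical implementation would likely need additional constraints or tie-breaking, for example using BAF.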
CIN Score
The CIN score is the fraction of the genome whose copy number deviates from the sample ploidy by more than 0.5.
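As a small worked illustration of this definition, the sketch below computes the score from a list of segments. The `(start, end, copy_number)` tuple layout and the function name `cin_score` are assumptions for the example; lengths are weighted in base pairs so that the result is a fraction of the genome rather than a fraction of segments.

```python
def cin_score(segments, ploidy, threshold=0.5):
    """Fraction of segmented genome with |copy number - ploidy| > threshold.

    segments: iterable of (start, end, copy_number) with half-open
    coordinates in base pairs (hypothetical layout for this sketch).
    """
    segments = list(segments)
    total = sum(end - start for start, end, _ in segments)
    if total == 0:
        return 0.0
    altered = sum(
        end - start
        for start, end, cn in segments
        if abs(cn - ploidy) > threshold
    )
    return altered / total
```

For a sample with ploidy 2.0 and segments `[(0, 50, 2.0), (50, 80, 3.0), (80, 100, 1.0)]`, the second and third segments exceed the threshold, so half of the covered length counts as altered.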
Results
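Across the 50 tumor samples, mean ploidy was 3.32 ± 1.15 and the aneuploidy score was 0.508. The pipeline called a mean of 2155 segments per sample, with a median segment size of 10 Mb and 1000 focal events per sample.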
Code Availability
https://github.com/BioTender-max/CopyNumberEngine
Key Results
- 50 tumor samples, 100,000 bins
- Mean ploidy: 3.32 ± 1.15
- Aneuploidy score: 0.508
- Mean segments: 2155