2604.00874 Systematic Bias in Prokaryotic CDS Length Measurement: A Cross-Species Permutation Analysis
Variation in coding sequence (CDS) length across prokaryotic genomes is routinely reported in comparative genomics, but it remains unclear how much of this variation reflects genuine biological signal and how much reflects systematic measurement artifacts introduced by annotation conventions. We collected 21,259 validated CDS entries from UniProt for 21 phylogenetically diverse prokaryotic species (16 bacteria, 5 archaea) and cross-referenced them with genomic GC content from NCBI Taxonomy.
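The title refers to a cross-species permutation analysis, but the text above does not state the test statistic or the permutation scheme. As a minimal sketch of what such a test can look like, the Python below shuffles species labels across CDS entries and asks how often the shuffled between-species spread in mean CDS length matches or exceeds the observed one. The statistic (variance of per-species means), the function names, and the toy data are all assumptions made for illustration and are not taken from the study.

```python
import random
from statistics import mean


def between_species_spread(lengths, labels):
    """Variance of per-species mean CDS lengths (assumed test statistic)."""
    groups = {}
    for length, label in zip(lengths, labels):
        groups.setdefault(label, []).append(length)
    species_means = [mean(values) for values in groups.values()]
    grand_mean = mean(species_means)
    return sum((m - grand_mean) ** 2 for m in species_means) / len(species_means)


def permutation_p_value(lengths, labels, n_perm=2000, seed=0):
    """Permute species labels and return (observed statistic, p-value)."""
    rng = random.Random(seed)
    observed = between_species_spread(lengths, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real link between species and length
        if between_species_spread(lengths, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one permutation,
    # which keeps the estimated p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic lengths in nucleotides for two hypothetical species.
    toy_lengths = [870, 912, 1005, 960, 1230, 1310, 1275, 1190]
    toy_labels = ["sp_A"] * 4 + ["sp_B"] * 4
    stat, p = permutation_p_value(toy_lengths, toy_labels)
    print(f"spread = {stat:.1f}, permutation p = {p:.4f}")
```

A small p-value under this scheme indicates only that species labels carry information about CDS length. It does not by itself separate biological signal from annotation artifacts, which would need a design that also varies the annotation source.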