2604.01204 Ignoring Compositionality Reverses the Direction of Association in 5 of 12 Published Microbiome-Disease Studies: A Reanalysis Using Log-Ratio Transformations
Microbiome sequencing yields compositional data: the read counts for each taxon carry only relative information, because once normalized they are constrained to sum to a constant. Applying standard statistical methods (Pearson correlation, linear regression, t-tests on proportions) to such data can produce spurious associations, because an increase in one component mechanically forces a decrease in the proportions of the others.
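
The closure effect described above can be illustrated with a minimal sketch. It is not the reanalysis pipeline of the studies in question: the helper names `closure` and `clr`, the toy abundances, and the simulation parameters are all assumptions chosen for illustration. The sketch shows that a taxon's proportion can fall while its absolute abundance stays fixed, that proportions of independent taxa can become correlated, and that log-ratios are unaffected by the sum constraint.

```python
import numpy as np

def closure(counts):
    """Rescale each sample (row) so that its components sum to 1."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)

def clr(x):
    """Centered log-ratio: log of each part minus the row mean of the logs.

    Requires strictly positive entries; zeros need separate handling
    (for example a pseudocount), which is not covered here.
    """
    log_x = np.log(np.asarray(x, dtype=float))
    return log_x - log_x.mean(axis=1, keepdims=True)

# Hypothetical absolute abundances of taxa A, B, C in two samples.
# Only taxon A changes between the samples.
absolute = np.array([[100.0, 100.0, 100.0],
                     [400.0, 100.0, 100.0]])
proportions = closure(absolute)

# B's share drops from 1/3 to 1/6 even though its absolute abundance is constant.
share_b = proportions[:, 1]

# The log-ratio of B to C is the same before and after closure (0 in both samples).
log_ratio_bc = np.log(proportions[:, 1] / proportions[:, 2])

# Simulated example: three independent taxa, with A far more variable than B and C.
rng = np.random.default_rng(0)
sim_abs = rng.lognormal(mean=[5.0, 5.0, 5.0], sigma=[1.0, 0.3, 0.3], size=(500, 3))
sim_prop = closure(sim_abs)

# Pearson correlation between B and C, on absolute values and on proportions.
r_abs = np.corrcoef(sim_abs[:, 1], sim_abs[:, 2])[0, 1]
r_prop = np.corrcoef(sim_prop[:, 1], sim_prop[:, 2])[0, 1]
print(f"B vs C correlation, absolute abundances: {r_abs:.2f}")
print(f"B vs C correlation, proportions:         {r_prop:.2f}")
```

In the simulation, B and C are generated independently, so any sizeable correlation between their proportions comes from the shared denominator rather than from biology. Because the centered log-ratio depends only on ratios between parts, `clr(sim_prop)` and `clr(sim_abs)` agree, which is why log-ratio transformations are a common starting point for compositional analysis.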