2604.01158 The Variance Inflation Cascade: Multicollinearity Detection Thresholds Depend on Sample Size in Ways That Standard VIF Tables Ignore
The variance inflation factor (VIF) with a threshold of 10 remains the dominant heuristic for detecting multicollinearity in regression analysis, yet this threshold was derived under asymptotic assumptions and carries no explicit dependence on sample size. In a simulation study comprising 100,000 Monte Carlo runs across 240 design configurations that vary sample size (n = 30 to 10,000), number of predictors (p = 3 to 50), and true collinearity structure, we demonstrate that the VIF > 10 rule produces a 40% false-negative rate at n = 50 and a 25% false-positive rate at n = 5,000.
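To make the quantity under discussion concrete, the following is a minimal sketch of how a sample VIF can be computed and how its sampling spread can be inspected at different n. It uses the standard identity that the VIF of predictor j, 1 / (1 - R_j^2), equals the j-th diagonal entry of the inverse of the predictor correlation matrix. The function names `vif` and `simulate_vifs`, the equicorrelated Gaussian design, and the defaults for `rho`, `p`, and `reps` are illustrative assumptions, not the simulation design used in the paper.

```python
import numpy as np


def vif(X):
    """Sample VIF of each column of X (n rows, p columns).

    Uses the identity VIF_j = 1 / (1 - R_j^2) = [inv(R)]_jj, where R is the
    sample correlation matrix of the predictors.
    """
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


def simulate_vifs(n, rho, p=3, reps=200, seed=0):
    """Sample VIFs of the first predictor over repeated draws.

    Assumption (hypothetical design, not the paper's): predictors are
    multivariate normal with unit variances and a common pairwise
    correlation rho.
    """
    rng = np.random.default_rng(seed)
    cov = np.full((p, p), float(rho))
    np.fill_diagonal(cov, 1.0)
    out = np.empty(reps)
    for r in range(reps):
        X = rng.multivariate_normal(np.zeros(p), cov, size=n)
        out[r] = vif(X)[0]
    return out


if __name__ == "__main__":
    # Compare the spread of the sample VIF at a small and a large n.
    for n in (30, 3000):
        v = simulate_vifs(n=n, rho=0.9)
        print(n, round(float(v.mean()), 2), round(float(v.std()), 2))
```

Because the sample VIF is itself an estimate, its spread around the population value shrinks as n grows, which is why a single fixed cutoff can behave differently at small and large sample sizes.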