Reproduce convergence experiments from 25 published AEC papers (LMS, NLMS, RLS, affine projection). With the exact reported parameters, convergence rates match the published claims in only 15 of 25 papers (60%).
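For orientation, a minimal NLMS sketch of the kind of convergence run being reproduced. The echo path `h_true`, the step size `mu`, the tap count, and the noise-free white-noise input are all illustrative assumptions, not parameters from any of the 25 papers.

```python
import numpy as np

def nlms(x, d, num_taps=8, mu=0.5, eps=1e-8):
    """Normalized LMS. Returns the final weights and the a-priori error signal."""
    w = np.zeros(num_taps)
    e = np.zeros(len(x))
    for n in range(num_taps - 1, len(x)):
        u = x[n - num_taps + 1:n + 1][::-1]    # regressor, most recent sample first
        e[n] = d[n] - w @ u                    # a-priori error
        w = w + mu * e[n] * u / (eps + u @ u)  # normalized update
    return w, e

rng = np.random.default_rng(0)
h_true = np.array([0.6, -0.3, 0.1, 0.05])      # hypothetical echo path (assumption)
x = rng.standard_normal(5000)                  # white far-end signal (assumption)
d = np.convolve(x, h_true)[:len(x)]            # noise-free echo

w, e = nlms(x, d, num_taps=len(h_true))

# Normalized misalignment in dB, a common convergence metric.
err = np.sum((w - h_true) ** 2) + 1e-300       # floor avoids log10(0)
misalignment_db = 10 * np.log10(err / np.sum(h_true ** 2))
print(f"final misalignment: {misalignment_db:.1f} dB")
```

Tracking `misalignment_db` (or the smoothed squared error `e**2`) over iterations gives the convergence curve that would be compared against a published claim.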
Compare MUSIC, ESPRIT, Capon, and MVDR on 8-element ULA in 4 multipath scenarios (2-path, 3-path, diffuse, specular+diffuse). At SNR≥10dB: all methods agree within 0.
Analyze recovery of structured sparse signals (block-sparse, tree-sparse, group-sparse) when sparsity assumptions are violated. Standard RIP-based guarantees assume exact sparsity; we characterize performance for approximately sparse signals with sparsity defect δ = ||x - x_s||₁/||x_s||₁ where x_s is the best s-sparse approximation.
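The sparsity defect defined above can be computed directly. The sketch below uses the plain (unstructured) best s-sparse approximation, i.e. the s largest-magnitude entries; block-, tree-, or group-sparse variants would need a structured projection instead. The function names and the example vector are illustrative assumptions.

```python
import numpy as np

def best_s_sparse(x, s):
    """Keep the s largest-magnitude entries of x and zero the rest."""
    x = np.asarray(x, dtype=float)
    xs = np.zeros_like(x)
    if s > 0:
        keep = np.argsort(np.abs(x))[-s:]
        xs[keep] = x[keep]
    return xs

def sparsity_defect(x, s):
    """delta = ||x - x_s||_1 / ||x_s||_1 for the best s-sparse approximation x_s."""
    x = np.asarray(x, dtype=float)
    xs = best_s_sparse(x, s)
    return np.sum(np.abs(x - xs)) / np.sum(np.abs(xs))

# Two dominant entries plus a small tail: approximately 2-sparse.
x = np.array([4.0, -3.0, 0.2, 0.1, -0.2, 0.0])
delta = sparsity_defect(x, s=2)
print(delta)  # tail l1 norm 0.5 over head l1 norm 7.0
```

An exactly s-sparse signal gives δ = 0, so δ measures how far the signal departs from the assumption behind the standard guarantees.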
Simulation study: generate data from t-distributions (df=2, 3, 5, 10, 30, ∞, the last being the normal limit) at sample sizes N=20 to 10,000. Compute 95% CIs using 4 bootstrap methods: percentile, BCa, studentized, and double bootstrap.
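As an illustration of the simplest of the four methods, here is a percentile-bootstrap sketch for the mean of one t-distributed sample. The choice of statistic (the mean), `n_boot=2000`, df=5, and N=200 are assumptions for the example; the BCa, studentized, and double bootstrap variants are not shown.

```python
import numpy as np

def percentile_ci(sample, stat=np.mean, n_boot=2000, alpha=0.05, rng=None):
    """Percentile bootstrap CI: quantiles of the statistic over resamples."""
    rng = np.random.default_rng() if rng is None else rng
    sample = np.asarray(sample)
    n = len(sample)
    idx = rng.integers(0, n, size=(n_boot, n))  # resample indices with replacement
    boot_stats = stat(sample[idx], axis=1)      # one statistic per resample
    lo, hi = np.quantile(boot_stats, [alpha / 2, 1 - alpha / 2])
    return lo, hi

rng = np.random.default_rng(42)
sample = rng.standard_t(df=5, size=200)         # one simulated data set
lo, hi = percentile_ci(sample, rng=rng)
print(f"95% percentile CI for the mean: [{lo:.3f}, {hi:.3f}]")
```

Repeating this over many simulated data sets and counting how often the interval contains the true mean (0 here) would give the empirical coverage for one (df, N) cell.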
Compare ADVI (automatic differentiation variational inference) against HMC (NUTS) on 6 hierarchical models from the Stan case studies (8-schools, radon, election forecasting, disease mapping, IRT, occupancy). ADVI posterior means match HMC within 3% (mean absolute deviation).
Compare 8 popular power calculators (G*Power, PASS, R pwr package, Stata power, nQuery, PS, ClinCalc, SampleSize4ClinicalTrials) on clustered designs (ICC=0.01-0.
Compare MICE (PMM), EM algorithm, kNN imputation, and MissForest on 6 datasets with MAR/MNAR missingness at 5-60%. Below 20% missing: all methods agree within 5% on regression coefficients.
Simulate survival data (N=200-2000, exponential/Weibull) with 5 censoring mechanisms: uniform, early, late, informative, and administrative. Log-rank test Type I error: correct (5%) under uniform censoring but inflated to 8.
Apply p-curve analysis to 500 meta-analyses from Psychological Bulletin and Psychological Review (2010-2023). Under true effects, the distribution of significant p-values is expected to be right-skewed (more p-values near 0 than near 0.05).
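The expected right skew can be checked with a small simulation. This is a sketch under simplifying assumptions: one-sample z-tests with known unit variance, a hypothetical true effect of 0.5 SD, n=20 per study, and the share of significant p-values below 0.025 as a crude skew summary rather than a formal p-curve test.

```python
import math
import numpy as np

def simulate_significant_pvalues(effect, n=20, n_studies=20000, seed=0):
    """Two-sided z-test p-values from simulated studies, keeping only p < .05."""
    rng = np.random.default_rng(seed)
    # Sample means of n unit-variance observations with true mean `effect`.
    means = rng.normal(loc=effect, scale=1 / np.sqrt(n), size=n_studies)
    z = means * np.sqrt(n)
    p = np.array([math.erfc(abs(v) / math.sqrt(2)) for v in z])
    return p[p < 0.05]

def share_below_025(p_sig):
    """Fraction of significant p-values in the lower half of (0, .05)."""
    return float(np.mean(p_sig < 0.025))

true_share = share_below_025(simulate_significant_pvalues(effect=0.5))
null_share = share_below_025(simulate_significant_pvalues(effect=0.0, seed=1))
print(f"share of p < .025 among p < .05: true effect {true_share:.2f}, null {null_share:.2f}")
```

Under the null the significant p-values are uniform on (0, 0.05), so the share sits near one half; a true effect pushes it well above that.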
Re-examine 200 published TWFE DiD studies with staggered treatment adoption from 15 economics journals (2010-2023). Apply Callaway-Sant'Anna (CS) and Sun-Abraham (SA) estimators alongside original TWFE.
Re-analyze 100 published synthetic control studies from top economics journals. For each, systematically vary the donor pool: remove 1, 2, or 5 donors (all combinations up to 1000 draws).
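The donor-pool variation can be sketched as follows. The reading of "all combinations up to 1000 draws" used here is an assumption: enumerate every leave-k-out combination when there are at most 1000, otherwise sample 1000 distinct combinations at random. The function name and the unit labels are hypothetical.

```python
import itertools
import math
import random

def donor_pool_variants(donors, k, max_draws=1000, seed=0):
    """Return reduced donor pools, each with k donors removed."""
    donors = list(donors)
    total = math.comb(len(donors), k)
    if total <= max_draws:
        # Few enough combinations: enumerate them all.
        dropped_sets = list(itertools.combinations(donors, k))
    else:
        # Too many: draw max_draws distinct combinations at random.
        rng = random.Random(seed)
        seen = set()
        while len(seen) < max_draws:
            seen.add(tuple(sorted(rng.sample(donors, k))))
        dropped_sets = sorted(seen)
    return [[d for d in donors if d not in dropped] for dropped in dropped_sets]

donors = [f"unit_{i:02d}" for i in range(20)]   # hypothetical 20-unit donor pool
pools_1 = donor_pool_variants(donors, 1)        # 20 combinations, all enumerated
pools_5 = donor_pool_variants(donors, 5)        # 15504 combinations, capped at 1000
print(len(pools_1), len(pools_5))
```

Each returned pool would then be passed to the synthetic control estimator, and the spread of the resulting treatment effect estimates summarizes sensitivity to donor choice.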
Monte Carlo simulation (10,000 replications) of first-stage F-test, Cragg-Donald, and Kleibergen-Paap statistics for IV strength at N=50-5000. At N=200, the F>10 rule rejects a truly strong instrument (first-stage R²=0.