Implement Jegadeesh-Titman (1993) 12-1 momentum strategy on CRSP data (1990-2023), stratified into 3 market cap tiers: large (>$10B), mid ($500M-$10B), small (<$500M). Gross returns: large 0.
Backtest Almgren-Chriss (AC) optimal execution vs TWAP on 200 US equities over 24 months, stratified by liquidity (ADV percentile). Above 50th percentile ADV: AC outperforms TWAP by 3.
Evaluate 3 credit risk models (logistic regression, XGBoost, neural network) on a loan portfolio (N=120,000) under 3 default definitions: 90 days past due (DPD90, Basel standard), 180 DPD, and 60 DPD. Model rankings change: at DPD90, XGBoost leads (AUC=0.
Simulate 10,000 return series from Student-t distributions (df=3,4,5,10,∞) at N=250,500,1000 trading days. Compute the 99% VaR under a Gaussian assumption (deliberately misspecified for t-distributed returns).
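A minimal sketch of the misspecification check described above, assuming standardized Student-t returns and a parametric VaR built from the sample mean and standard deviation. The function names, seed, and the reduced simulation count are illustrative choices, not part of the original setup.

```python
import numpy as np
from scipy import stats


def gaussian_var(returns, level=0.99):
    """Parametric VaR (reported as a positive loss) assuming normal returns."""
    mu = returns.mean(axis=-1)
    sigma = returns.std(axis=-1, ddof=1)
    return -(mu + sigma * stats.norm.ppf(1.0 - level))


def true_t_var(df, level=0.99):
    """Exact VaR of a standard Student-t(df) return; df=inf is the normal case."""
    if np.isinf(df):
        return -stats.norm.ppf(1.0 - level)
    return -stats.t.ppf(1.0 - level, df)


rng = np.random.default_rng(0)          # fixed seed, illustrative only
df, n_days, n_sims = 4, 500, 2000       # fewer series than 10,000 to keep the sketch fast
sims = rng.standard_t(df, size=(n_sims, n_days))

estimates = gaussian_var(sims)          # one Gaussian VaR estimate per simulated series
bias = estimates.mean() - true_t_var(df)
print(f"df={df}, N={n_days}: mean Gaussian VaR bias = {bias:.3f}")
```

A negative bias means the Gaussian VaR understates the true 99% loss quantile; looping over the df and N grids gives the full comparison.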
Train ECAPA-TDNN speaker verification on VoxCeleb2 with 4 augmentation strategies: none, noise-only (MUSAN), reverb-only (simulated RIR), full (noise+reverb+speed). Test on VOiCES corpus at 5 RT60 conditions (0.
Benchmark 5 QP solvers (OSQP, qpOASES, Gurobi, ECOS, CVXPY+SCS) on MPC problems with horizons N=5-200 and 3 system dimensions (2-state, 10-state, 50-state). Computation time t(N): O(N³) in theory for a dense QP formulation.
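One way to compare measured solve times against the theoretical O(N³) scaling is to fit the exponent p in t(N) ≈ c·N^p by least squares on log-log axes. The sketch below assumes timings are already collected; the `synthetic_times` array is a made-up placeholder, not benchmark data.

```python
import numpy as np


def fit_scaling_exponent(horizons, times):
    """Return (p, c) from a least-squares fit of log(t) = p*log(N) + log(c)."""
    slope, intercept = np.polyfit(np.log(horizons), np.log(times), 1)
    return slope, np.exp(intercept)


horizons = np.array([5, 10, 20, 50, 100, 200], dtype=float)
synthetic_times = 2e-7 * horizons**3    # placeholder following an exact cubic law

p, c = fit_scaling_exponent(horizons, synthetic_times)
print(f"fitted exponent p = {p:.2f}")
```

A fitted exponent well below 3 on real timings would suggest the solver is exploiting sparsity or structure rather than treating the QP as dense.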
Reproduce convergence experiments from 25 published acoustic echo cancellation (AEC) papers (LMS, NLMS, RLS, affine projection). Using the exact parameters reported, convergence rates match the published claims in only 15/25 papers (60%).
Compare MUSIC, ESPRIT, Capon, and MVDR on 8-element ULA in 4 multipath scenarios (2-path, 3-path, diffuse, specular+diffuse). At SNR≥10dB: all methods agree within 0.
Analyze recovery of structured sparse signals (block-sparse, tree-sparse, group-sparse) when sparsity assumptions are violated. Standard RIP-based guarantees assume exact sparsity; we characterize performance for approximately sparse signals with sparsity defect δ = ||x - x_s||₁/||x_s||₁ where x_s is the best s-sparse approximation.
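The sparsity defect δ defined above can be computed directly once x_s is available. The sketch below uses the unstructured best s-term approximation (keep the s largest-magnitude entries); for block-, tree-, or group-sparse models the projection step would need to be replaced by the corresponding model-based approximation. Function names are illustrative.

```python
import numpy as np


def best_s_sparse(x, s):
    """Keep the s largest-magnitude entries of x and zero out the rest."""
    x = np.asarray(x, dtype=float)
    xs = np.zeros_like(x)
    if s > 0:
        keep = np.argsort(np.abs(x))[-s:]
        xs[keep] = x[keep]
    return xs


def sparsity_defect(x, s):
    """delta = ||x - x_s||_1 / ||x_s||_1 for the best s-sparse approximation x_s."""
    x = np.asarray(x, dtype=float)
    xs = best_s_sparse(x, s)
    return np.abs(x - xs).sum() / np.abs(xs).sum()


x = np.array([5.0, -3.0, 0.2, 0.1, -0.1])
delta = sparsity_defect(x, s=2)   # tail mass 0.4 over head mass 8.0
print(delta)
```

Exactly s-sparse signals give δ = 0, so δ measures how far a signal departs from the assumption behind the standard guarantees.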
Simulation study: generate data from t-distributions (df=2,3,5,10,30,∞) at N=20-10000. Compute 95% CIs using 4 bootstrap methods: percentile, BCa, studentized, and double bootstrap.
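A minimal sketch of the coverage simulation for the simplest of the four methods, the percentile bootstrap, assuming the target parameter is the population mean (zero for a Student-t with df > 1). The choice of statistic, replication counts, and seed are assumptions made for illustration; BCa, studentized, and double bootstrap intervals would slot into the same loop.

```python
import numpy as np


def percentile_ci(sample, stat=np.mean, n_boot=1000, level=0.95, rng=None):
    """Percentile bootstrap CI: quantiles of the statistic over resamples."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(sample)
    idx = rng.integers(0, n, size=(n_boot, n))   # resample indices with replacement
    boot_stats = stat(sample[idx], axis=1)
    alpha = 1.0 - level
    lo, hi = np.quantile(boot_stats, [alpha / 2, 1 - alpha / 2])
    return lo, hi


def coverage(df, n, n_rep=200, rng=None):
    """Fraction of replications whose CI contains the true mean (0)."""
    rng = np.random.default_rng() if rng is None else rng
    hits = 0
    for _ in range(n_rep):
        sample = rng.standard_t(df, size=n)
        lo, hi = percentile_ci(sample, rng=rng)
        hits += int(lo <= 0.0 <= hi)
    return hits / n_rep


cov = coverage(df=10, n=50, n_rep=200, rng=np.random.default_rng(1))
print(f"empirical coverage: {cov:.3f}")
```

Comparing the empirical coverage with the nominal 95% across the df and N grid shows where heavy tails or small samples degrade each method.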
Compare ADVI (automatic differentiation variational inference) against HMC (NUTS) on 6 hierarchical models from the Stan case studies (8-schools, radon, election forecasting, disease mapping, IRT, occupancy). ADVI posterior means match HMC within 3% (mean absolute deviation).
Compare 8 popular power calculators (G*Power, PASS, R pwr package, Stata power, nQuery, PS, ClinCalc, SampleSize4ClinicalTrials) on clustered designs (ICC=0.01-0.
Compare MICE (PMM), EM algorithm, kNN imputation, and MissForest on 6 datasets with MAR/MNAR missingness at 5-60%. Below 20% missing: all methods agree within 5% on regression coefficients.
Simulate survival data (N=200-2000, exponential/Weibull) with 5 censoring mechanisms: uniform, early, late, informative, and administrative. Log-rank test Type I error: correct (5%) under uniform censoring but inflated to 8.
Apply p-curve analysis to 500 meta-analyses from Psychological Bulletin and Psychological Review (2010-2023). Expected distribution under true effects: right-skewed (more small p-values).
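As an illustration of testing for right skew, the sketch below implements the simple binomial variant of the p-curve test: among significant results (p < .05), it asks whether more than half fall below .025. This is one possible test, not necessarily the one used in the analysis, and the example p-values are invented.

```python
import numpy as np
from scipy import stats


def pcurve_binomial_test(p_values, alpha=0.05):
    """Binomial right-skew test on the significant p-values.

    Returns (k, n, p_right_skew): k significant p-values below alpha/2,
    n significant p-values in total, and the one-sided probability
    P(X >= k) for X ~ Binomial(n, 0.5), i.e. under a flat p-curve.
    """
    p = np.asarray(p_values, dtype=float)
    sig = p[p < alpha]
    n = int(sig.size)
    k = int((sig < alpha / 2).sum())
    p_right_skew = float(stats.binom.sf(k - 1, n, 0.5))
    return k, n, p_right_skew


# Invented p-values for illustration only
example = [0.001, 0.002, 0.004, 0.008, 0.011, 0.013, 0.02, 0.03, 0.041, 0.2]
k, n, p_skew = pcurve_binomial_test(example)
print(k, n, round(p_skew, 4))
```

A small `p_skew` indicates more very small p-values than a flat curve would produce, which is the pattern expected under true effects.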