Conformal Prediction for Distribution-Free Volatility Forecasting in High-Frequency Equity Returns
1. Introduction
The forecasting of return volatility is a central task in financial econometrics with direct consequences for option pricing, margining, and regulatory capital. The dominant families of forecasters, namely GARCH(1,1), HAR-RV, and more recently sequence-to-sequence neural models, typically deliver a single point prediction with no defensible interval. Bootstrap intervals are sometimes reported but rely on stationarity assumptions that are demonstrably violated during stress episodes [Andersen and Bollerslev 2018].
We propose to wrap arbitrary volatility models in a split conformal layer that produces prediction intervals with provable marginal coverage under the relatively weak assumption that the calibration and test residuals are exchangeable. This paper makes three contributions:
- A nonconformity score tailored to log-volatility errors that is robust to heteroskedasticity in the residuals themselves.
- An empirical evaluation across 412 large-cap U.S. equities and 11 calendar years.
- A diagnostic procedure for detecting coverage drift during regime changes.
2. Background and Threat Model
Let $r_t$ be the log return on day $t$ and $\sigma_t^2$ the conditional variance. A forecaster $\mu$ outputs $\hat{\sigma}_t^2 = \mu(\mathcal{F}_{t-1})$. We treat $\mu$ as a black box.
For a user-specified miscoverage level $\alpha$, we wish to publish an interval $\hat{C}_t$ such that
$$\Pr\bigl(\sigma_t^2 \in \hat{C}_t\bigr) \ge 1 - \alpha.$$
The difficulty in finance is that $\sigma_t^2$ is unobserved; we use the realized variance $\mathrm{RV}_t$ over 5-minute intraday returns as a noisy proxy [Barndorff-Nielsen 2002].
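As a rough illustration of the proxy described above, realized variance for one day can be computed as the sum of squared intraday log returns. This is a minimal sketch: the function name and the assumption that prices arrive on a regular 5-minute grid within one session are ours, and no microstructure-noise correction is applied.

```python
import numpy as np

def realized_variance(intraday_prices):
    """Sum of squared log returns over one trading day.

    `intraday_prices` is assumed to be a 1-D sequence of prices sampled
    on a regular 5-minute grid within a single session (hypothetical input).
    """
    prices = np.asarray(intraday_prices, dtype=float)
    log_returns = np.diff(np.log(prices))
    return float(np.sum(log_returns ** 2))

# Toy example with four 5-minute prices (made-up numbers).
rv = realized_variance([100.0, 100.5, 100.2, 100.8])
```

Overnight returns are ignored here; whether to include them is a design choice the text does not specify.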
3. Method
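Given a held-out calibration window $\mathcal{I}_{\text{cal}}$ of size $n$, we compute residuals
$$R_i = \bigl|\log \mathrm{RV}_i - \log \hat{\sigma}_i^2\bigr|, \quad i \in \mathcal{I}_{\text{cal}}.$$
Let $\hat{q}$ be the $\lceil (1-\alpha)(n+1) \rceil / n$ empirical quantile of $\{R_i\}_{i \in \mathcal{I}_{\text{cal}}}$. The conformal interval at test time $t$ is
$$\hat{C}_t = \bigl[\hat{\sigma}_t^2 \, e^{-\hat{q}},\; \hat{\sigma}_t^2 \, e^{\hat{q}}\bigr].$$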
The log-domain construction prevents the lower endpoint from going negative and matches the multiplicative noise structure typical of volatility processes.
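
    import numpy as np

    def conformal_vol_interval(sigma_hat, log_resids_cal, alpha=0.1):
        n = len(log_resids_cal)
        k = int(np.ceil((n + 1) * (1 - alpha)))  # finite-sample corrected rank
        if k > n:  # too few calibration points: only the trivial interval is valid
            return 0.0, np.inf
        q = np.sort(np.abs(log_resids_cal))[k - 1]
        # q is on the log-variance scale, so halve it for the volatility scale.
        return sigma_hat * np.exp(-q / 2), sigma_hat * np.exp(q / 2)

For settings with non-stationarity, we adopt the adaptive variant of [Gibbs and Candès 2021], which updates the effective miscoverage level $\alpha_t$ via online gradient steps on miscoverage indicators.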
4. Experimental Setup
Data. We assemble a panel of 412 S&P-1500 constituents with continuous listings from 2014-01-02 through 2024-12-31, yielding 1,132,408 stock-day observations. Five-minute intraday data are sourced from Polygon.io. Our base forecasters are GARCH(1,1), HAR-RV, and a 2-layer LSTM with 64 hidden units.
Protocol. We use a rolling-origin evaluation with a 1000-day calibration window and a 250-day test fold, repeated annually.
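One way to enumerate the folds of such a protocol is sketched below. It assumes that "repeated annually" means the origin advances by one 250-day test fold at a time and that the calibration window immediately precedes each test fold; the function name and the example panel length of 2750 days are hypothetical.

```python
def rolling_origin_folds(n_days, cal_len=1000, test_len=250):
    """Return (cal_start, cal_end, test_end) index triples, half-open ranges.

    Calibration uses days [cal_start, cal_end); testing uses
    [cal_end, test_end). The origin advances by one test fold each time.
    """
    folds = []
    cal_end = cal_len
    while cal_end + test_len <= n_days:
        folds.append((cal_end - cal_len, cal_end, cal_end + test_len))
        cal_end += test_len
    return folds

# Hypothetical panel length in trading days.
folds = rolling_origin_folds(n_days=2750)
```

Any data used to fit the base forecaster would need to precede each calibration window; that split is not shown here.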
5. Results
| Model | Coverage (90% nominal) | Mean Width | Tail Loss |
|---|---|---|---|
| GARCH (Gaussian) | 84.2% | 0.0181 | 1.42 |
| HAR-RV (Gaussian) | 86.7% | 0.0163 | 1.18 |
| LSTM (Gaussian) | 81.9% | 0.0202 | 1.61 |
| GARCH + Conformal | 90.3% | 0.0160 | 1.04 |
| HAR-RV + Conformal | 90.1% | 0.0151 | 0.97 |
| LSTM + Conformal | 90.4% | 0.0173 | 1.12 |
Conformal wrapping brings coverage to within 0.4 points of the 90% target for all three forecasters while shrinking mean width by roughly 7-14% (11.6% for GARCH, 7.4% for HAR-RV, 14.4% for LSTM). Notably, the LSTM, which had the worst Gaussian coverage, achieves the largest absolute coverage improvement of 8.5 points.
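For reference, the coverage and mean-width columns can be computed from a set of intervals and realized values along the following lines. This is a minimal sketch with made-up numbers; the function name is ours, and the tail-loss column is not reproduced because its definition is not given in the text.

```python
import numpy as np

def coverage_and_width(lower, upper, realized):
    """Return (empirical coverage, mean interval width)."""
    lower, upper, realized = (
        np.asarray(a, dtype=float) for a in (lower, upper, realized)
    )
    covered = (realized >= lower) & (realized <= upper)
    return float(covered.mean()), float((upper - lower).mean())

# Toy intervals and realized volatilities (illustrative values only).
cov, width = coverage_and_width(
    lower=[0.01, 0.02, 0.01, 0.03],
    upper=[0.03, 0.04, 0.02, 0.05],
    realized=[0.02, 0.05, 0.015, 0.04],
)
```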
Regime stress test. Restricting attention to the 22 trading days following 2020-03-01, the unwrapped HAR-RV under-covers at 72.0% (an 18-point shortfall from the 90% target). The static conformal variant recovers to 81.5%, and the adaptive variant to 88.9%, which supports the value of online recalibration during structural breaks.
6. Discussion and Limitations
Two limitations deserve emphasis. First, conformal coverage is marginal: a 90% interval need not cover with 90% probability conditional on, say, a particular sector or volatility regime. Our adaptive variant partially addresses this but provides only asymptotic guarantees. Second, we treat realized variance as ground truth despite known microstructure noise; subsampled estimators [Zhang et al. 2005] would be a robustness check.
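The gap between marginal and conditional coverage can be inspected by breaking coverage down by group. The sketch below is a simple check, not a procedure described in the paper; the function name and the "calm"/"stress" labels are hypothetical.

```python
import numpy as np

def coverage_by_group(covered, groups):
    """Empirical coverage within each group label.

    covered : 0/1 (or bool) indicators, one per stock-day
    groups  : labels of the same length, e.g. sector or a volatility-regime
              bucket (hypothetical labels, not defined in the paper)
    """
    covered = np.asarray(covered, dtype=float)
    groups = np.asarray(groups)
    return {str(g): float(covered[groups == g].mean()) for g in np.unique(groups)}

# Toy indicators: overall coverage is 5/6, but it differs sharply by regime.
by_regime = coverage_by_group(
    [1, 1, 1, 1, 0, 1],
    ["calm", "calm", "calm", "calm", "stress", "stress"],
)
```

A marginal figure near the target can hide a group that is well below it, which is the situation the stress-period numbers suggest.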
Operationally, the calibration step adds negligible compute (under 50 ms per asset) and integrates cleanly with existing risk-management pipelines.
7. Conclusion
Distribution-free coverage guarantees are within reach for volatility forecasters that practitioners already deploy. The proposed conformal wrapper restores nominal coverage, with narrower mean widths than the Gaussian baselines in our experiments, and exposes a tunable coverage-versus-width trade-off via the choice of $\alpha$. Code and the panel construction script are released to enable reproduction.
References
- Andersen, T. G. and Bollerslev, T. (2018). Volatility Forecasting in Practice.
- Barndorff-Nielsen, O. (2002). Econometric Analysis of Realized Volatility.
- Gibbs, I. and Candès, E. (2021). Adaptive Conformal Inference Under Distribution Shift.
- Vovk, V., Gammerman, A. and Shafer, G. (2005). Algorithmic Learning in a Random World.
- Zhang, L., Mykland, P. and Aït-Sahalia, Y. (2005). A Tale of Two Time Scales.