Hizmeri, Rodrigo
(2021)
*Essays on High Frequency Financial Econometrics: (Co)Jumps, Aggregation, Asymmetry and Measurement Error.*
PhD thesis, UNSPECIFIED.

## Abstract

Over the last twenty years, the use of statistical and econometric methods for analyzing high-frequency data has increased substantially. This growth has been driven by the greater availability of intraday data and by technological advances. High-frequency data provide a finer characterization of the components of asset prices; for instance, they make it possible to separate jumps from the diffusive component. However, the most notable contribution of high-frequency data is the Realized Volatility (RV), which is estimated as the sum of all squared intraday returns. The RV is a consistent estimator (as Δn → 0) of the true latent volatility process, and as such it allows volatility to be treated as "quasi"-observable. In the absence of jumps, the RV converges to the integrated variance; in the presence of jumps, it converges to the quadratic variation, i.e. the sum of the integrated variance and the squared jumps. However, since RV is only a proxy for the true latent volatility process, it is, of course, subject to estimation error. There are many potential reasons why RV is an imperfect measure. The most relevant one is that we work with finite samples, which makes RV a less efficient estimator. While this issue can be mitigated by increasing the sample size, well-known features of high-frequency data undermine this alternative. The most notable are microstructure noise and intraday periodicity (intraday periodicity does not affect the realized variance, since it integrates to 1, but it does affect other realized measures that are essential for estimating jumps and higher-order moments). Therefore, the parameter estimates of econometric models based on realized measures are subject to the errors-in-variables problem.

Accurate estimates and forecasts of both univariate and multivariate volatility play a central role in many financial economic applications.
Examples include the comparison of the total risk of two portfolios, as measured by their volatility, and, of course, the estimation of portfolio weights. Not only do poor out-of-sample volatility forecasts lead to poor financial decisions, but inaccurate forecasts of the covariance matrix also generally lead to extreme positions that increase both transaction costs and portfolio risk. Thus, an investor may end up with a riskier portfolio with a smaller expected return. Another important issue in modelling and forecasting asset price volatility is understanding the underlying components of high-frequency data. Jumps are the main culprit behind the extreme variations and fat tails observed in asset prices. The current evidence suggests that jumps are unpredictable and have different sizes and signs. Therefore, it is imperative to assess the information content of these different types of jumps, and also to evaluate whether assets with distinct levels of liquidity share similar underlying components.

This dissertation focuses on the aforementioned issues, and its contribution is split into two main parts. The first part examines the impact of estimation error on the modelling and forecasting of both univariate and multivariate volatility, and on portfolio choice. The second part evaluates the underlying components of high-frequency data, and explores the predictive information content of different types of jumps and the role of systematic jumps in modelling and forecasting realized variances and covariances, respectively. A summary of each chapter is as follows.

Chapter 2 examines the impact of intraday periodicity on forecasting realized variance using a heterogeneous autoregressive (HAR) model framework. We show that intraday periodicity inflates both the unconditional and conditional variance of the realized variance, and therefore biases the autoregressive parameter estimates and jump estimators.
This combined effect adversely affects forecasting. To overcome this issue, we propose a periodicity-adjusted model, HARP, in which the predictors are built from periodicity-filtered data. We demonstrate empirically, using 30 stocks and the SPDR S&P 500 ETF, and via Monte Carlo simulations that the HARP models produce significantly better forecasts. We also show that our results are robust to various sources of intraday periodicity estimation error and to a "possibly" time-varying feature of the intraday periodicity.

Chapter 3 proposes a dilution bias correction approach to deal with the errors-in-variables problem observed in realized volatility (RV) measures. Given that the weekly and monthly measures of the RV are less prone to measurement error, we show that the absolute difference between the daily and monthly RV is proportional to the relative magnitude of the estimation error. By using this metric and allowing the daily autoregressive parameter to vary as a function of the error term, we obtain more responsive forecasts, with greater persistence when the measurement error is low and faster mean reversion when it is high. Empirical results indicate that our models outperform some of the most popular HAR and GARCH models across various forecasting horizons.

In Chapter 4, we model and forecast realized (co)variances using a factor structure, which implies that (co)variances are the sum of systematic and idiosyncratic components. First, we show that idiosyncratic volatility is the main driver of total realized volatility. Given the evidence that assets with a high level of idiosyncratic volatility suffer from low predictability, (co)variance forecasts for these assets are likely to have larger forecasting errors. To take this issue into account, we incorporate the market factor information and show significant improvements in the in- and out-of-sample performance of the models.
We evaluate these forecasting gains using statistical loss functions and global minimum variance portfolios. We create 100 random portfolios of 5 and 10 assets, and show that the proposed models improve upon their benchmark models not only statistically but also economically, in that a risk-averse investor is willing to sacrifice up to 157 annual basis points to obtain greater forecasting accuracy, which translates into more informed financial decisions.

Chapter 5 examines the underlying components of high-frequency data using novel theoretical tests for the presence of: a) Brownian motion; b) jumps; and c) finite vs. infinite activity jumps. Given that the asymptotic distribution of most of these procedures has been derived under the assumption of noiseless prices, we first evaluate their finite sample properties under different types of microstructure noise, such as Gaussian, t-distributed and Gaussian-t mixture noise. The Monte Carlo results show that 1-min data provide a good trade-off between bias and statistical power. Using 100 stocks and SPY, we find that both a Brownian and a jump component characterize the 1-min data, and that the jump component should allow for both finite and infinite activity. We also find evidence of time-varying rejection rates, such that more jump days are usually associated with an increase in infinite-activity jumps relative to finite-activity jumps.

Chapter 6 proposes a novel approach for disentangling realized jump measures by activity (infinite/finite) and by sign (positive/negative). It also provides noise-robust versions of the ABD jump test (Andersen et al., 2007b) and realized semivariance measures for use at high-frequency sampling intervals. The volatility forecasting exercise involves the use of different types of jumps, forecast horizons, sampling frequencies, calendar and transaction time-based sampling schemes, as well as standard and noise-robust volatility measures.
We find that infinite (finite) jumps improve the forecasts at shorter (longer) horizons, but the contribution of signed jumps is limited. Noise-robust estimators, which identify jumps in the presence of microstructure noise, deliver substantial forecast improvements at higher sampling frequencies. However, standard volatility measures at the 300-second frequency generate the smallest MSPEs. Since no single model dominates across sampling frequencies and forecasting horizons, we show that model-averaged volatility forecasts, using time-varying weights and models from the model confidence set, generally outperform forecasts from both the benchmark and the single best extended HAR model.

Finally, Chapter 7 proposes a robust framework for disentangling undiversifiable common jumps within the realized covariance matrix. Simultaneous jumps detected in our empirical study are strongly related to major financial and economic news, and their occurrence raises correlation and persistence among assets. Our application shows that common jumps and directional common jumps substantially improve the in- and out-of-sample forecasts of the realized variance at the daily, weekly and monthly horizons. This finding is corroborated via Monte Carlo simulations. Applying these new specifications to minimum variance portfolios results in superior positions from reduced turnover. Thus, investors are willing to sacrifice up to 100 annual basis points to switch to those strategies.
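
As a rough illustration of two objects the abstract relies on, the sketch below computes realized variance as the sum of squared intraday returns and fits a baseline HAR regression by least squares. It is a minimal sketch under stated assumptions, not the thesis's implementation: the 5- and 22-day windows for the weekly and monthly components are the usual HAR convention rather than something the abstract specifies, the data are simulated toy returns (in percent), and all function names are hypothetical. It does not implement the periodicity-adjusted HARP model or any of the jump-based extensions.

```python
import math
import random


def realized_variance(intraday_returns):
    """Realized variance for one day: the sum of squared intraday returns."""
    return sum(r * r for r in intraday_returns)


def har_design(rv, weekly=5, monthly=22):
    """Build HAR regressors [1, daily RV, weekly mean, monthly mean] and the
    next-day target. Window lengths are assumed (conventional) values."""
    X, y = [], []
    for t in range(monthly - 1, len(rv) - 1):
        rv_w = sum(rv[t - weekly + 1 : t + 1]) / weekly
        rv_m = sum(rv[t - monthly + 1 : t + 1]) / monthly
        X.append([1.0, rv[t], rv_w, rv_m])
        y.append(rv[t + 1])
    return X, y


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved by Gaussian elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= f * A[col][c]
    beta = [0.0] * k
    for i in reversed(range(k)):
        s = sum(A[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (A[i][k] - s) / A[i][i]
    return beta


# Toy data (assumption): 300 days of 78 five-minute returns, in percent,
# with a persistent log-variance so that lagged RV carries information.
random.seed(0)
n_days, n_intraday = 300, 78
log_var, rv = 0.0, []
for _ in range(n_days):
    log_var = 0.95 * log_var + 0.2 * random.gauss(0.0, 1.0)
    sigma = math.sqrt(math.exp(log_var) / n_intraday)
    day_returns = [random.gauss(0.0, sigma) for _ in range(n_intraday)]
    rv.append(realized_variance(day_returns))

X, y = har_design(rv)
beta = ols(X, y)

# One-step-ahead forecast from the most recent daily, weekly and monthly RV.
last_row = [1.0, rv[-1], sum(rv[-5:]) / 5, sum(rv[-22:]) / 22]
forecast = sum(b * x for b, x in zip(beta, last_row))
```

The three regressors average RV over different horizons, which is how a HAR specification approximates long memory with a simple linear regression. A periodicity-adjusted variant in the spirit of Chapter 2 would build the same regressors from periodicity-filtered returns instead of raw ones.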