This dissertation investigates several distinct but related issues in validating risk measurement models for market risk management. In Chapter 2, we evaluate the effectiveness of backtesting methodologies, comparing the standard binomial approach with the interval forecast backtest, the density forecast backtest and the probability forecast backtest. Our comparison is conducted for three risk measures: value-at-risk, expected shortfall and spectral risk measures (SRM). Our goal is to analyze how well the various backtesting methodologies gauge the accuracy of risk models. We ask whether the general results from risk model validation are specific to the narrow backtesting method that has been widely applied in the literature (the binomial test), and whether these results hold consistently across a broad range of risk measures. In addition, we test how strongly the distribution and volatility specifications affect backtesting results. Based on a Monte Carlo simulation, we provide evidence of the overall dominance of the density forecast backtest among the hypothesis-based backtesting methods (namely, the binomial backtest, the interval forecast backtest and the density forecast backtest). We suggest a loss function for SRM with which the probability forecast backtest is capable of identifying accurate models among the alternatives. We also confirm the empirical finding that the choice of the distribution specification is a more important factor in determining evaluation performance than the choice of the volatility specification.
The empirical validation of risk measurement models is given in Chapter 3. In this chapter, we investigate the performance of risk models at measuring extreme tail risks in the recent crisis. We focus on two issues: 1) the appropriateness and robustness of risk measures in capturing extreme risks; and 2) the suitability and reliability of Extreme Value Theory in modeling extreme tail events. Using the FTSE 100 index futures and the WTI crude oil futures from January 1998 to June 2010 as proxies, with a backtesting sample from January 2008 to June 2010, we assess the appropriateness of risk models and their ability to capture a portfolio’s risk exposures during the recent crisis. We extend and encompass the existing research in this area by comparing the validation outcomes of the binomial backtest (Kupiec, 1995), the interval forecast backtest (Christoffersen, 1998) and the density forecast backtest (Berkowitz, 2001) in order to assess the appropriateness and robustness of the estimated value-at-risk (VaR) and expected shortfall (ES) measures. We pay special attention to extreme tail risk exposures by applying the generalized Pareto distribution (hereafter GPD) to the underlying process of a risk model. The backtesting results show, first, that ES is a more appropriate and robust risk measure than VaR in capturing extreme risks, because ES is rejected significantly less often than VaR across the three risk models. Second, the GARCH specification with GPD offers significant statistical advantages for measuring risk exposure over the Student’s t or the normal distribution.
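As an illustration of the binomial backtest (Kupiec, 1995) referred to above, the following is a minimal sketch of the proportion-of-failures likelihood-ratio test in its standard textbook form. It is not the implementation used in the dissertation; the function name `kupiec_pof` and its arguments are hypothetical, and the inputs in the usage note are illustrative rather than taken from the study.

```python
import math


def kupiec_pof(exceptions, observations, coverage):
    """Proportion-of-failures likelihood-ratio test (textbook form).

    exceptions   -- number of days on which the loss exceeded the VaR forecast
    observations -- number of days in the backtesting sample
    coverage     -- VaR tail probability p (e.g. 0.01 for a 99% VaR)

    Returns (lr_statistic, p_value). Under the null hypothesis that the
    true exception rate equals p, the statistic is asymptotically
    chi-square distributed with one degree of freedom.
    """
    x, n, p = exceptions, observations, coverage
    if n <= 0 or not 0 <= x <= n or not 0.0 < p < 1.0:
        raise ValueError("need n > 0, 0 <= x <= n and 0 < p < 1")

    def log_lik(q):
        # Binomial log-likelihood without the constant term,
        # using the convention 0 * log(0) = 0.
        ll = 0.0
        if x > 0:
            ll += x * math.log(q)
        if n - x > 0:
            ll += (n - x) * math.log(1.0 - q)
        return ll

    # Likelihood ratio: null rate p versus observed rate x / n.
    lr = max(-2.0 * (log_lik(p) - log_lik(x / n)), 0.0)

    # Survival function of a chi-square(1) variable: erfc(sqrt(lr / 2)).
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value
```

For example, `kupiec_pof(20, 500, 0.01)` tests whether 20 exceptions in 500 days are compatible with a 99% VaR; a p-value below the chosen significance level leads to rejection of the risk model. The test uses only the exception count, which is one reason the interval and density forecast backtests are considered alongside it.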
Chapter 4 extends the previous study by investigating the model risk associated with the omission or misspecification of risk factors in the underlying process and its impact on the performance of a risk measurement model. We pay special attention to testing whether nonlinearity, stochastic volatility and jumps are essential components of the underlying interest rate dynamics. We investigate whether the risk factors found to be important for in-sample performance retain their materiality for out-of-sample forecasts, and whether the best in-sample performing models still outperform in out-of-sample forecasts. We contribute to the existing literature by providing a comprehensive empirical study of the out-of-sample performance of a wide variety of popular interest rate models in identifying the material risk factors in interest rate dynamics. Using the three-month T-bill as a proxy for short-term interest rates and based on the VaR backtesting procedure, we find that introducing nonlinear drift does not significantly improve the modeling of interest rate dynamics, either in-sample or out-of-sample. A linear drift specification is adequate for modeling the conditional mean of the T-bill data. However, introducing a GARCH effect significantly improves in-sample performance, and the same effect, in conjunction with jumps, significantly improves out-of-sample performance. In addition, jump-diffusion models tend to capture the excess kurtosis and heavy tails in interest rates effectively, both in-sample and out-of-sample. We find that a linear drift model with GARCH effects, level effects or both, combined with jumps, performs well in the VaR backtesting results.
Chapter 5 presents the main conclusions and directions for further research.