Statistical Forecasting of Time Series Data Part 4: Forecasting Volatility using GARCH
Data Visualization

Last Updated on January 6, 2023 by Editorial Team

Author(s): Yashveer Singh Sohi

Photo by Chris Liverani on Unsplash

In this series of articles, the S&P 500 Market Index is analyzed using two popular statistical models: SARIMA (Seasonal Autoregressive Integrated Moving Average) and GARCH (Generalized AutoRegressive Conditional Heteroskedasticity).

In the first part, the series was scraped from the yfinance API in Python. It was cleaned and used to derive the S&P 500 Returns (percent change in successive prices) and Volatility (magnitude of returns). In the second part, a number of time series exploration techniques were used to derive insights from the data about characteristics like trend, seasonality, and stationarity. With these insights, the SARIMA class of models was explored in the third part.

In this article, a GARCH model is built to model the volatility in S&P 500 Returns. The code used in this article is from the Volatility Models/GARCH for SPX Volatility.ipynb notebook in this repository.

Table of Contents

  1. Importing Data
  2. Train-Test Split
  3. GARCH Model
  4. Volatility of S&P 500 Returns
  5. Parameter Estimation for GARCH
  6. Fitting GARCH on S&P 500 Returns
  7. Predicting Volatility
  8. Evaluating the Performance
  9. Conclusion
  10. Links to other parts of this series
  11. References

Importing Data

Here we import the dataset that was scraped and preprocessed in part 1 of this series. Refer to part 1 to get the data ready, or download the data.csv file from this repository.

Output for the previous code cell showing the first 5 rows of the dataset

Since this is the same code used in the previous parts of this series, the individual lines are not explained in detail here for brevity.

Train-Test Split

We now split the data into train and test sets. All observations on or after 2019-01-01 form the test set, and all observations before that date form the train set.

Output of previous code cell showing the shape of training and testing sets
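
The split described above can be sketched as follows. This is a minimal illustration, not the notebook's code: the synthetic dataframe, the column name spx_ret, and the short date range are assumptions made so that the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for data.csv: weekday dates with made-up returns.
dates = pd.date_range("2018-12-24", "2019-01-08", freq="B")
rng = np.random.default_rng(0)
data = pd.DataFrame({"spx_ret": rng.normal(0.0, 1.0, len(dates))}, index=dates)

split_date = pd.Timestamp("2019-01-01")
train_df = data.loc[data.index < split_date]    # strictly before the split date
test_df = data.loc[data.index >= split_date]    # on and after the split date

print(train_df.shape, test_df.shape)
```

Because the split is done on the date index rather than by row position, the chronological order is preserved and no test observation leaks into the train set.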

GARCH Model

GARCH stands for Generalized Auto-Regressive Conditional Heteroskedasticity. Conditional heteroskedasticity means that the variance (volatility) of the series is not constant but depends on what happened at earlier time steps. The GARCH model uses the concept of volatility clustering to model the volatility of a series. Volatility clustering essentially means that the volatility today depends on the volatility at recent time steps. A GARCH model is specified using 2 parameters: GARCH(p, q). The GARCH model is formulated as shown below.

Mathematical formulation of how GARCH models Volatility
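
In standard notation, a GARCH(p, q) model is commonly written as below. The symbol names are the conventional ones and are assumed here: ω is the constant, ε are the residuals, σ² is the conditional variance, and α_i and β_j are the coefficients on the lagged terms.

```latex
\sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i \, \epsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_j \, \sigma_{t-j}^2
```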

The above equation shows how GARCH models volatility. The squared volatility (the conditional variance) at some time step is represented as a linear combination of a constant, a group of past squared residual terms, and a group of past squared volatility terms. The parameters p and q control the number of lagged residual terms and lagged volatility terms in the above equation, respectively.
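
To make the recursion concrete, the sketch below computes the conditional variance path for given coefficients. It is an illustration only: the function name garch_variance, the coefficient values, and the choice to seed the first steps with the sample variance are assumptions, and no parameters are estimated here.

```python
import numpy as np

def garch_variance(resid, omega, alphas, betas):
    """Conditional variance path of a GARCH(p, q) model.

    alphas weight the p lagged squared residuals and betas weight the
    q lagged variances. The first max(p, q) steps are seeded with the
    sample variance of the residuals (one common choice, assumed here).
    """
    resid = np.asarray(resid, dtype=float)
    p, q = len(alphas), len(betas)
    sigma2 = np.full(len(resid), resid.var())
    for t in range(max(p, q), len(resid)):
        arch_part = sum(alphas[i] * resid[t - 1 - i] ** 2 for i in range(p))
        garch_part = sum(betas[j] * sigma2[t - 1 - j] for j in range(q))
        sigma2[t] = omega + arch_part + garch_part
    return sigma2

# Made-up residuals with two large shocks in the middle.
resid = np.array([0.1, -0.2, 1.5, -1.8, 0.3, 0.1])
sigma2 = garch_variance(resid, omega=0.05, alphas=[0.1], betas=[0.8])
print(sigma2)
```

The variance rises in the steps that follow the large residuals and then decays, which is the volatility clustering behaviour described above.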

Volatility of S&P 500 Returns

In this article, the volatility of S&P 500 Returns is modeled using GARCH. To test whether the predicted volatility matches the volatility of future returns, the magnitude of S&P 500 Returns is calculated and stored in the series spx_vol.

Thus, the model is fit on the spx_ret series, and the predicted volatility is compared with spx_vol.
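
As a reminder of how the two series relate, here is a small sketch on made-up prices. The price values and the column name spx are hypothetical, and scaling the returns to percent is an assumption based on the description of returns in part 1.

```python
import pandas as pd

# Made-up closing prices; the name "spx" is an assumption.
prices = pd.Series(
    [100.0, 102.0, 99.96, 99.96, 104.958],
    index=pd.date_range("2020-01-01", periods=5, freq="D"),
    name="spx",
)

spx_ret = prices.pct_change().mul(100).dropna()  # percent change in successive prices
spx_vol = spx_ret.abs()                          # magnitude of returns, used as a volatility proxy

print(pd.DataFrame({"spx_ret": spx_ret, "spx_vol": spx_vol}))
```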

Parameter Estimation for GARCH

The PACF (Partial Auto-Correlation Function) plot is used to get an initial estimate of the parameters p and q of the GARCH model. The number of significant lags in this plot is used as the initial value for both parameters. Then the model summary table (displayed after fitting the model) is used to understand which coefficients in the model are significant. Based on this, the model is fine-tuned.

Now, the PACF plot for S&P 500 Returns is generated:

PACF plot of S&P 500 Returns

Using the plot_pacf() function in the statsmodels.graphics.tsaplots module, the PACF plot is generated for the spx_ret series. On examining the plot, the first 2 lags are significant. Therefore, a GARCH(2, 2) model is a suitable starting point.
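
To show what "significant lags" means numerically, the sketch below computes partial autocorrelations by hand and compares them with the approximate 95% band of ±1.96/√n. Note that this deliberately swaps plot_pacf() for a NumPy-only calculation, and it runs on a simulated AR(2) series rather than on spx_ret; the helper name pacf_ols and the simulation coefficients are assumptions.

```python
import numpy as np

def pacf_ols(x, nlags):
    """Partial autocorrelations for lags 1..nlags via successive OLS fits."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    out = []
    for k in range(1, nlags + 1):
        # Columns are x lagged by 1..k; the target is the unlagged series.
        lagged = np.column_stack([x[k - j : len(x) - j] for j in range(1, k + 1)])
        coefs, *_ = np.linalg.lstsq(lagged, x[k:], rcond=None)
        out.append(coefs[-1])  # the coefficient on the k-th lag is the PACF at lag k
    return np.array(out)

# Simulated AR(2) series, so only the first two lags should stand out clearly.
rng = np.random.default_rng(42)
n = 2000
noise = rng.normal(size=n)
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.5 * x[t - 1] + 0.3 * x[t - 2] + noise[t]

pacf_vals = pacf_ols(x, nlags=10)
band = 1.96 / np.sqrt(n)
significant_lags = [k + 1 for k, v in enumerate(pacf_vals) if abs(v) > band]
print(significant_lags)
```

Lags whose partial autocorrelation falls outside the band are the ones that appear as bars leaving the shaded region in a PACF plot.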

Fitting GARCH on S&P 500 Returns

Now, a GARCH(2, 2) model is fit on the S&P 500 Returns series.

Summary Table of the GARCH(2, 2) Model fitted on S&P 500 Returns

The arch_model() function in the arch package is used to implement the GARCH model. The implementation here is inspired by the one in the official documentation. Before fitting the model, a new dataframe is prepared. This dataframe consists of all the time steps in the original dataset (before the train-test split). The training time steps hold the Returns of S&P 500; these are the values actually used for training the GARCH model. The testing time steps hold the Returns observed one time step earlier. This is tantamount to saying that the model will forecast tomorrow's volatility in returns using the returns observed today.
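
One way to build such a dataframe is sketched below on a tiny made-up series. The values, the dates, the split date, and the name model_df are assumptions for illustration; the notebook may construct it differently.

```python
import pandas as pd

dates = pd.date_range("2018-12-27", periods=6, freq="D")
data = pd.DataFrame({"spx_ret": [0.5, -1.0, 0.2, 1.5, -0.7, 0.3]}, index=dates)

split_date = pd.Timestamp("2018-12-30")
is_test = data.index >= split_date

# Training rows keep the observed return; test rows hold the return
# observed one time step earlier.
shifted = data["spx_ret"].shift(1)
model_df = data["spx_ret"].where(~is_test, shifted).to_frame()

print(model_df)
```

With this layout, each test row contains only information that was already available on the previous day.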

Using the arch_model() function, the model is defined. The function takes the dataset described above as input, along with the parameters p=2 and q=2. The vol="GARCH" argument specifies that the model to be used is GARCH. The model definition is stored in the variable model, and the fit() method is called on it to train the model. The last_obs argument is used to ensure that the model is trained on the train data only. The fitted model is stored in the model_results variable, and its summary is displayed by calling summary() on the fitted model.

The first few lines before the summary table in the output image show the fitting information printed by the fit() function as it iterates. The update_freq=5 argument in the fit() function limits this output to every 5th iteration. Next, the summary table has 3 sections: Constant Mean - GARCH Model Results, Mean Model, and Volatility Model. In the Volatility Model section, the P>|t| column indicates that all the coefficients are significant at the 5% significance level.

Predicting Volatility

Here the fitted model is used to forecast the volatility of S&P 500 Returns in the test set.

Plot showing the Volatility predicted by the model against the assumed volatility (magnitude of the returns) of the S&P 500 Returns time series.

The forecast() method is called on the fitted model, model_results. This outputs an ARCHModelForecast object that contains the predictions for the mean model and the volatility model. Next, the residual_variance attribute is accessed to get the predictions for volatility. The predictions are stored in a dataframe with the same number of periods as in data. All the periods on which the model was trained hold NaN values, and only the periods for which the model should generate forecasts are populated. In this case, all the periods of the test set hold real-valued numbers, and those of the train set hold NaNs. The predictions are then plotted against the volatility (magnitude of returns) calculated heuristically.

Evaluating the Performance

The image above shows that whenever the predicted volatility of the model spikes violently, the magnitude of returns (heuristically calculated volatility) also fluctuates greatly. On the other hand, when the predicted volatility is stable, the magnitude of returns is also relatively stable. Thus, the model is able to identify periods of high and low volatility in S&P 500 Returns.

The primary purpose of this model is to identify periods where the market is stable and periods where the market is volatile, and the model is successful in capturing this information. For that reason, the model is not evaluated here on error metrics like RMSE (Root Mean Squared Error).

Conclusion

In this article, the volatility in S&P 500 Returns was analysed and predicted using the GARCH model. In the next article, first, an ARIMA model will be used to fit the S&P 500 Returns. Then, a GARCH model will be used to model the residuals of ARIMA. This will allow us to generate much more reliable confidence intervals than the ones generated by the ARIMA model alone.

Links to other parts of this series

References

[1] 365DataScience course on Time Series Analysis.

[2] machinelearningmastery blogs on Time Series Analysis.

[3] Wikipedia article on GARCH.

[4] ritvikmath YouTube videos on the GARCH model.

[5] arch documentation for forecasting using the GARCH model.


Statistical Forecasting of Time Series Data Part 4: Forecasting Volatility using GARCH was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
