A Curated List of Important Time Series Forecasting Concepts
Last Updated on September 13, 2022 by Editorial Team
Author(s): Kumar kaushal
Originally published on Towards AI.
A detailed list of concepts related to time series forecasting, with explanations and relevant Python packages. It is meant as a go-to list for anyone interested in understanding the major topics of time series forecasting.
You may have come across several articles on time series forecasting. Most explain a few concepts, but not all or even most of them. This article intends to be the go-to list for anyone who wants to understand the major concepts of time series forecasting. It explains the concepts and lists Python libraries and packages for time series analysis and forecasting.
What is time series data?
Time series data is a sequence of data points collected or recorded at consistent time intervals. Each value has a timestamp attached to it. The interval between consecutive values could be minutes, hours, days, months, or anything else, but it should be the same throughout the series. This interval is called the frequency of the time series data.
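As a minimal sketch of what a fixed frequency looks like in practice, the snippet below builds a small daily series with pandas and checks that its timestamps are equally spaced. The dates and values are made up for illustration.

```python
import pandas as pd

# Hypothetical daily readings; dates and values are invented for this example.
index = pd.date_range(start="2022-01-01", periods=6, freq="D")
series = pd.Series([10.0, 12.0, 11.0, 13.0, 14.0, 12.5], index=index)

# The DatetimeIndex carries the frequency of the series ("D" means daily).
frequency = series.index.freqstr

# Check that the gap between consecutive timestamps is always the same.
gaps = series.index.to_series().diff().dropna()
is_regular = gaps.nunique() == 1

print(frequency, is_regular)
```

If the gaps are not all equal, the series would usually need to be resampled or have its missing timestamps filled in before most forecasting models can be applied.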
What are trends, seasonality, and cyclicity?
If the data has a long-term increase or decrease, a trend is present in the data.
A pattern that repeats at a fixed, known period (for example, at the same time every year) is referred to as seasonality. The pattern should repeat consistently. Increased sales of warm clothes during winter is an example of seasonality. Cyclicity differs from seasonality in that cyclic fluctuations do not have a fixed period, so their timing and length are irregular.
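To make trend and seasonality concrete, here is a small sketch on synthetic data. It assumes daily data with a weekly pattern (period of 7) and a linear trend, both invented for the example. A centered moving average over one full period is one simple way to estimate the trend; it is not the only decomposition method.

```python
import numpy as np
import pandas as pd

period = 7                       # assumed: weekly pattern in daily data
n = 8 * period                   # eight full weeks
t = np.arange(n)

trend = 0.5 * t                                              # long-term increase
weekly = np.array([3.0, 1.0, 0.0, -1.0, -2.0, -2.0, 1.0])    # sums to zero
seasonal = np.tile(weekly, n // period)                      # repeats every 7 days
y = pd.Series(trend + seasonal)

# A centered moving average over one full period averages the seasonal
# pattern away, leaving an estimate of the trend.
trend_est = y.rolling(window=period, center=True).mean()

# Averaging the detrended values by position in the cycle gives an
# estimate of the seasonal pattern.
detrended = (y - trend_est).dropna()
seasonal_est = detrended.groupby(detrended.index % period).mean()

print(seasonal_est.round(2).tolist())
```

On real data the estimates will not be exact, since noise and any cyclic component remain in the detrended values.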
What are the models generally used for forecasting?
- Classical models are mostly statistical. Well-known examples are Autoregressive (AR), Moving Average (MA), Autoregressive Moving Average (ARMA), Autoregressive Integrated Moving Average (ARIMA), ARIMAX (X for eXogenous), SARIMAX (S for Seasonal), exponential smoothing, and others.
- ML-based approaches include XGBoost, Random Forest, and others that treat forecasting as a regression problem. Deep learning methods used in time series forecasting include LSTM, N-BEATS, DeepAR, and others.
- Combinations of models have also been used. For example, the M5 competition winner used an equal-weighted combination of six models, each exploiting a different learning approach and training set.
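As an illustration of one of the classical models listed above, the following is a minimal sketch of simple exponential smoothing written in plain Python. The function name, the choice to initialise the level with the first observation, and the sample values are assumptions made for this example; in practice a library implementation would normally be used.

```python
def simple_exponential_smoothing(values, alpha):
    """Return the smoothed level after each observation.

    The last level serves as the one-step-ahead forecast.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")
    level = values[0]            # assumed initialisation: first observation
    levels = [level]
    for y in values[1:]:
        # New level = weighted average of the new observation and the old level.
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels


history = [10.0, 12.0, 11.0, 13.0]   # made-up observations
levels = simple_exponential_smoothing(history, alpha=0.5)
forecast = levels[-1]
print(levels, forecast)
```

A larger `alpha` makes the forecast react faster to recent observations, while a smaller one gives a smoother result.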
Which model is to be used in the case of multivariate time series?
For multivariate time series, models such as VAR (Vector Autoregression) and its extensions (VARMA and VARMAX) can be used.
Which model is to be used in case of volatile data?
For volatile data, where the variance changes over time, advanced models such as GARCH (Generalized Autoregressive Conditional Heteroskedasticity) can be used.
While training and validating a model, can we randomly shuffle the data before doing a train-test split?
No. Time series data is chronological, so we cannot randomly shuffle it before the train-test split. The most recent values, at the end of the series, form the test set, and the remaining earlier values form the train set.
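A chronological split can be sketched as follows. The series and the holdout length of three observations are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical daily series of ten observations.
index = pd.date_range("2022-01-01", periods=10, freq="D")
series = pd.Series(range(10), index=index)

test_size = 3                        # assumed holdout length

# No shuffling: the earliest observations train the model,
# the most recent ones are held out for testing.
train = series.iloc[:-test_size]
test = series.iloc[-test_size:]

print(train.index.max(), test.index.min())
```

Every timestamp in the train set comes before every timestamp in the test set, so the model is never evaluated on data older than what it was trained on.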
What to do if there are missing values in the data?
Treating the missing values is the obvious answer. However, because consecutive values in a time series influence each other, the approach differs from the better-known methods used for other kinds of data. A few suitable methods are linear interpolation, spline interpolation, Last Observation Carried Forward (LOCF), and Next Observation Carried Backward (NOCB).
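Three of these methods can be sketched with pandas as shown below. The series is made up for illustration. Spline interpolation is left out because, in pandas, it relies on SciPy being installed.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series with gaps.
index = pd.date_range("2022-01-01", periods=6, freq="D")
series = pd.Series([1.0, np.nan, 3.0, np.nan, np.nan, 6.0], index=index)

# Linear interpolation: draw a straight line between known neighbours.
linear = series.interpolate(method="linear")

# Carry the last known value forward into the gap.
carried_forward = series.ffill()

# Carry the next known value backward into the gap.
carried_backward = series.bfill()

print(linear.tolist())
print(carried_forward.tolist())
print(carried_backward.tolist())
```

Which method fits best depends on the data: interpolation suits smoothly changing quantities, while carrying values forward suits quantities that stay constant until they are updated.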
What are the metrics to evaluate the performance of a time series forecasting model?
The evaluation metrics are the same as for regression models: R-squared, Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and others.
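The three metrics named above can be computed directly with NumPy, as in this minimal sketch. The function names and the sample actual and predicted values are made up for the example.

```python
import numpy as np


def mae(actual, predicted):
    """Mean Absolute Error: average size of the errors."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return float(np.mean(np.abs(actual - predicted)))


def rmse(actual, predicted):
    """Root Mean Squared Error: penalises large errors more heavily."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def r_squared(actual, predicted):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1 - ss_res / ss_tot)


actual = [3.0, 5.0, 7.0, 9.0]        # made-up held-out values
predicted = [2.0, 5.0, 8.0, 11.0]    # made-up forecasts

print(mae(actual, predicted), rmse(actual, predicted), r_squared(actual, predicted))
```

For these values the errors are 1, 0, -1, and -2, which gives an MAE of 1.0, an RMSE of the square root of 1.5, and an R-squared of 0.7.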
What are the Python-based packages available for time series forecasting?
Some Python-based packages are listed below:
tsfresh, AutoTS, Kats, Prophet, darts, sktime, statsmodels, PyFlux, and Orbit. Links to the packages are given in the References section of this article.
What is stationarity of time series data, and why is it important?
Stationary data has a constant mean, a constant variance, and a covariance between two points that depends only on the distance between them, not on their position in time. The Augmented Dickey-Fuller (ADF) test is commonly used to check the stationarity of time series data. Stationarity is important because it means the statistical properties of the data do not change with time.
What are exogenous and endogenous variables?
An exogenous variable is determined outside the model: it does not depend on the other variables in the model, but the output variable depends on it. Including such a variable may help the forecasting model explain the time series better.
An endogenous variable is determined within the model: it depends on other variables, and the output variable depends on it in turn.
What is the significance of White Noise?
The error terms (residuals) of a time series model should be white noise. If they are not, there is still some pattern in the data that the model has not accounted for.
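One rough way to check residuals for leftover structure is to look at their autocorrelation. The sketch below compares synthetic white noise with synthetic residuals that still contain an autoregressive pattern. The helper `autocorrelation`, the data, and the approximate 95% band of 1.96 divided by the square root of n are assumptions for this illustration; formal tests such as Ljung-Box are also common.

```python
import numpy as np


def autocorrelation(values, lag):
    """Sample autocorrelation at the given lag (lag must be at least 1)."""
    x = np.asarray(values, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))


rng = np.random.default_rng(42)
n = 1000
white = rng.normal(size=n)           # residuals with no structure

# Hypothetical residuals where each value still depends on the previous one.
patterned = np.zeros(n)
for i in range(1, n):
    patterned[i] = 0.8 * patterned[i - 1] + white[i]

bound = 1.96 / np.sqrt(n)            # approximate 95% band for white noise
white_acf1 = autocorrelation(white, 1)
patterned_acf1 = autocorrelation(patterned, 1)

print(f"band: +/-{bound:.3f}")
print(f"white noise lag-1 autocorrelation: {white_acf1:.3f}")
print(f"patterned lag-1 autocorrelation:   {patterned_acf1:.3f}")
```

An autocorrelation far outside the band, as with the patterned residuals here, suggests the model has missed some structure in the data.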
References
- M5 accuracy competition: Results, findings, and conclusions
- Autoregressive conditional heteroskedasticity – Wikipedia
- tsfresh – tsfresh 0.18.1.dev39+g611e04f documentation
- AutoTS – AutoTS 0.5.0 documentation
- Kats | Kats
- Welcome to sktime – sktime documentation
- Time Series analysis tsa – statsmodels
- Introduction – PyFlux 0.4.7 documentation
- About Orbit – orbit 1.1.3dev documentation
- GitHub – unit8co/darts: A python library for easy manipulation and forecasting of time series.