A Curated List of Important Time Series Forecasting Concepts
Last Updated on September 13, 2022 by Editorial Team
Author(s): Kumar kaushal
Originally published on Towards AI.
A detailed list of time series forecasting concepts and their explanations, along with packages for Python. It is meant as a go-to list for anyone interested in understanding the major topics of time series forecasting.
You may have come across several articles on time series forecasting. Each probably explained a few concepts, but not all or even most of them. This article intends to be the go-to list for anyone who wants to understand the major concepts of time series forecasting. It explains the concepts and lists Python-based libraries and packages for time series analysis and forecasting.
What is time series data?
It is a sequence of data points collected or recorded at consistent time intervals. Each value in a time series is attached to a point in time. The interval between observations could be minutes, hours, days, months, or anything else, but it should be the same throughout the series. This interval is called the frequency of the time series data.
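As a minimal sketch of what "equal intervals" looks like in practice, the snippet below builds a small daily series with pandas. The dates and values are made up for illustration; pandas itself is an assumption here, since it is not one of the packages listed in this article.

```python
import pandas as pd

# Hypothetical daily readings; dates and values are invented for illustration.
index = pd.date_range(start="2022-01-01", periods=7, freq="D")
series = pd.Series([12.0, 13.5, 13.1, 14.2, 15.0, 14.8, 15.6], index=index)

# pandas can infer the frequency from the timestamps ("D" means daily).
inferred = pd.infer_freq(series.index)

# Gap between each observation and the previous one.
gaps = series.index.to_series().diff().dropna()

print(inferred)
print(gaps.unique())
```

If the gaps were not all identical, the series would have an irregular frequency and would usually need resampling before modeling.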
What are trends, seasonality, and cyclicity?
If the data shows a long-term increase or decrease, a trend is present in the data.
A pattern that repeats at a fixed, known period (for example, at the same time every year) is referred to as seasonality. The pattern should repeat consistently. Increased sales of warm clothes during winter is an example of seasonality. Cyclicity differs from seasonality in that cyclical fluctuations do not have a fixed period: their length and timing vary.
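To make trend and seasonality concrete, here is a rough sketch of a classical additive decomposition using only NumPy. The monthly data is synthetic (a straight-line trend plus a 12-month sine pattern), and the centered moving average is just one common way to estimate the trend, not the only one.

```python
import numpy as np

# Synthetic monthly data (an assumption for illustration): four years of a
# linear trend plus a pattern that repeats every 12 months.
period = 12
t = np.arange(4 * period)
y = 100 + 0.5 * t + 10 * np.sin(2 * np.pi * t / period)

# Trend estimate: centered moving average spanning one full season.
# Half weights at both ends keep the window centered for an even period.
weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
trend = np.convolve(y, weights, mode="valid")  # drops period/2 points per end

# Seasonal estimate: average the detrended values month by month.
half = period // 2
detrended = y[half:-half] - trend
month = t[half:-half] % period
seasonal = np.array([detrended[month == m].mean() for m in range(period)])

print(trend[:3])
print(seasonal.round(2))
```

Because the moving average spans exactly one season, the seasonal pattern averages out of the trend estimate, and what is left after removing the trend is the repeating seasonal component.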
What are the models generally used for forecasting?
Time series forecasting models can be broadly grouped into three categories: classical (statistical), ML (Machine Learning) based, and combination approaches.
- Classical models are mostly statistical. Well-known examples are Autoregression (AR), Moving Average (MA), Autoregressive Moving Average (ARMA), Autoregressive Integrated Moving Average (ARIMA), ARIMAX (X stands for eXogenous), SARIMAX (S stands for Seasonal), exponential smoothing, and others.
- ML-based approaches include XGBoost, Random Forest, and others that treat forecasting as a regression problem. Deep learning methods used in time series forecasting include LSTM, N-BEATS, DeepAR, and others.
- Combination methods have also been used. For example, the M5 competition winner used an equal-weighted combination of six models, each of which exploited a different learning approach and training set.
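The idea of an equal-weighted combination can be sketched in a few lines. This is not the M5 winner's actual setup: the three base forecasters below (naive, seasonal naive, and historical mean) are deliberately simple stand-ins, and the data is made up.

```python
import numpy as np

# Hypothetical two weeks of daily sales with a weekly pattern (made-up values).
history = np.array([20, 22, 25, 24, 30, 38, 35,
                    21, 23, 26, 25, 31, 40, 36], dtype=float)
horizon, season = 7, 7

# Three simple base forecasters standing in for real models.
naive = np.full(horizon, history[-1])              # repeat the last value
seasonal_naive = history[-season:][:horizon]       # repeat the last week
mean_forecast = np.full(horizon, history.mean())   # long-run average

# Equal-weighted combination: each model contributes with weight 1/3.
forecasts = np.vstack([naive, seasonal_naive, mean_forecast])
combined = forecasts.mean(axis=0)

print(combined.round(2))
```

Averaging tends to smooth out the individual models' errors, which is one reason combinations are popular in forecasting competitions.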
Which model is to be used in the case of multivariate time series?
For multivariate time series, models such as VAR (Vector Autoregression) and its extensions (VARMA and VARMAX) can be used.
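To show what a VAR model does, the sketch below simulates a two-variable VAR(1) process and recovers its coefficient matrix with ordinary least squares in NumPy. The coefficient values, noise level, and sample size are assumptions chosen for illustration; in practice a library implementation would normally be used instead of this hand-rolled fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed "true" coefficients of a two-variable VAR(1):
# y_t = A @ y_{t-1} + noise. Values are illustrative only.
A_true = np.array([[0.5, 0.2],
                   [0.1, 0.4]])

n = 5000
y = np.zeros((n, 2))
for t in range(1, n):
    y[t] = A_true @ y[t - 1] + rng.normal(scale=0.1, size=2)

# Regress every observation on the previous one (both variables at once).
X, Y = y[:-1], y[1:]
B, *_ = np.linalg.lstsq(X, Y, rcond=None)  # solves X @ B ~ Y
A_hat = B.T                                # so that y_t ~ A_hat @ y_{t-1}

# One-step-ahead forecast for both variables from the last observation.
forecast = A_hat @ y[-1]

print(A_hat.round(2))
```

The off-diagonal entries of the matrix are what make the model multivariate: each variable's forecast uses the past of the other variable as well as its own.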
Which model is to be used in the case of volatile data?
Advanced models such as GARCH (Generalized Autoregressive Conditional Heteroskedasticity) can be used here.
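A GARCH(1,1) model describes today's variance as a mix of a constant, yesterday's squared shock, and yesterday's variance. The sketch below only evaluates that recursion for fixed parameters; it does not estimate them. The parameter values and the shock series are assumptions for illustration.

```python
import numpy as np

def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variance path of a GARCH(1,1) for given (not fitted) parameters."""
    residuals = np.asarray(residuals, dtype=float)
    sigma2 = np.empty(len(residuals))
    # Start from the unconditional variance omega / (1 - alpha - beta).
    sigma2[0] = omega / (1.0 - alpha - beta)
    for t in range(1, len(residuals)):
        sigma2[t] = omega + alpha * residuals[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

# Made-up shocks: calm, then one large shock, then calm again.
shocks = np.array([0.1, -0.1, 0.1, 3.0, 0.1, -0.1, 0.1])
sigma2 = garch11_variance(shocks, omega=0.05, alpha=0.1, beta=0.8)

print(sigma2.round(4))
```

The variance jumps right after the large shock and then decays gradually, which is the volatility clustering that GARCH is designed to capture.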
While training and validating a model, can we randomly shuffle the data before doing a train-test split?
No. Time series data is chronological, so we cannot randomly shuffle it before the train-test split. The most recent values, at the end of the series, are used as the test set, and the earlier values are used as the training set.
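A chronological split is just a cut at a point in time. The helper below is a hypothetical function written for this article, and the observations are made up.

```python
def chronological_split(values, test_size):
    """Split an ordered sequence so the last `test_size` points form the test set."""
    if not 0 < test_size < len(values):
        raise ValueError("test_size must be between 1 and len(values) - 1")
    cut = len(values) - test_size
    return values[:cut], values[cut:]

# Hypothetical monthly observations, oldest first.
observations = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]
train, test = chronological_split(observations, test_size=3)

print(train)
print(test)
```

Every training point comes before every test point, so the model is never evaluated on data older than what it was trained on.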
What to do if there are missing values in the data?
The missing values should be treated, but the approach differs from the well-known methods used for other kinds of data. This is because values at consecutive time intervals influence each other, so a fill value should respect the order of the observations. Linear interpolation, spline interpolation, Last Observation Carried Forward (LOCF), and Next Observation Carried Backward (NOCB) are a few such methods.
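Three of these fills can be sketched with pandas (an assumption, since pandas is not in the package list of this article). The series is made up. Spline interpolation is left out because in pandas it needs SciPy as an extra dependency.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series with gaps (values are invented).
index = pd.date_range("2022-01-01", periods=6, freq="D")
series = pd.Series([10.0, np.nan, np.nan, 16.0, np.nan, 20.0], index=index)

linear = series.interpolate(method="linear")  # straight line between known neighbours
carried_forward = series.ffill()              # fill from the last known observation
carried_backward = series.bfill()             # fill from the next known observation

print(pd.DataFrame({"linear": linear,
                    "ffill": carried_forward,
                    "bfill": carried_backward}))
```

Note that filling from the next observation uses future information, so it is better suited to cleaning historical data than to preparing inputs for a live forecast.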
What are the metrics to evaluate the performance of a time series forecasting model?
Evaluation metrics are the same as in the case of regression models. These are R-squared, Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and others.
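The three metrics named above can be computed directly from their definitions. The function name and the sample numbers below are illustrative assumptions.

```python
import numpy as np

def forecast_metrics(actual, predicted):
    """Return MAE, RMSE, and R-squared for a set of forecasts."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = actual - predicted
    mae = np.mean(np.abs(errors))
    rmse = np.sqrt(np.mean(errors ** 2))
    ss_res = np.sum(errors ** 2)                     # unexplained variation
    ss_tot = np.sum((actual - actual.mean()) ** 2)   # total variation
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}

# Made-up test-set values and forecasts.
actual = [100, 110, 120, 130]
predicted = [98, 112, 119, 133]
metrics = forecast_metrics(actual, predicted)

print(metrics)
```

MAE and RMSE are in the same units as the data, while RMSE penalizes large errors more heavily because the errors are squared before averaging.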
What are the Python-based packages available for time series forecasting?
Some Python-based packages are listed below:
tsfresh, AutoTS, Kats, Prophet, darts, sktime, statsmodels, PyFlux, and Orbit. Links to the packages are given in the References section of this article.
What is stationarity of time series data, and why is it important?
Stationary data has a constant mean, a constant variance, and a covariance between observations that depends only on the distance between them, not on when they occur. The Augmented Dickey-Fuller (ADF) test is commonly used to check the stationarity of time series data. Stationarity is important because the statistical properties of stationary data do not change with time, which is what many classical models assume.
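The sketch below illustrates a non-constant mean and how first differencing can remove it. It is an informal check on synthetic data, not a substitute for the ADF test (statsmodels provides one as `adfuller` in `statsmodels.tsa.stattools`, which is not run here). The drift and noise values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic non-stationary series: a random walk with drift (illustrative).
steps = 0.5 + rng.normal(scale=1.0, size=1000)
series = np.cumsum(steps)

def half_means(x):
    """Mean of the first and second half; a large gap hints at a changing mean."""
    mid = len(x) // 2
    return x[:mid].mean(), x[mid:].mean()

raw_first, raw_second = half_means(series)

# First differencing: y_t - y_{t-1}, a common way to remove a trend.
diffed = np.diff(series)
diff_first, diff_second = half_means(diffed)

print(round(raw_first, 1), round(raw_second, 1))
print(round(diff_first, 2), round(diff_second, 2))
```

The raw series has very different means in its two halves, while the differenced series has roughly the same mean throughout, which is the behavior expected of stationary data.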
What are exogenous and endogenous variables?
An exogenous variable is independent of the other variables in the model, while the output variable depends on it. Including such a variable may improve the explanatory power of a time series forecasting model.
An endogenous variable depends on other variables in the model, and the output variable in turn depends on it.
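One way to see the distinction is a small regression in which the target depends on its own past (determined inside the model) and on an outside driver. Everything below is a sketch under assumptions: the driver, the coefficients, and the noise level are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Hypothetical exogenous driver (e.g. temperature), generated outside the model.
x = rng.normal(size=n)

# Target series: depends on its own previous value and on x.
a_true, b_true = 0.6, 2.0  # assumed coefficients
y = np.zeros(n)
for t in range(1, n):
    y[t] = a_true * y[t - 1] + b_true * x[t] + rng.normal(scale=0.1)

# Design matrix: lagged target (endogenous part) and current x (exogenous part).
design = np.column_stack([y[:-1], x[1:]])
coef, *_ = np.linalg.lstsq(design, y[1:], rcond=None)
a_hat, b_hat = coef

print(round(a_hat, 3), round(b_hat, 3))
```

Here `x` influences `y` but nothing in the model influences `x`, which is the role the "X" plays in models such as ARIMAX and SARIMAX.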
What is the significance of White Noise?
The error terms in the time series model should be White Noise. If they are not White Noise, then there is still some pattern that the model needs to account for.
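A quick, informal way to check residuals is to look at their autocorrelation: for White Noise it should be close to zero at every lag. The sketch below compares two synthetic residual series. The helper function, seed, and the 0.8 carry-over factor are assumptions for illustration.

```python
import numpy as np

def autocorrelation(x, lag):
    """Sample autocorrelation of x at a positive lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.sum(x[lag:] * x[:-lag]) / np.sum(x * x))

rng = np.random.default_rng(7)
n = 2000

# Residuals that behave like white noise.
white = rng.normal(size=n)

# Residuals with leftover structure: each value carries over 80% of the previous one.
patterned = np.zeros(n)
for t in range(1, n):
    patterned[t] = 0.8 * patterned[t - 1] + rng.normal()

# Rough 95% band for white noise: about +/- 1.96 / sqrt(n).
band = 1.96 / np.sqrt(n)
white_acf1 = autocorrelation(white, 1)
patterned_acf1 = autocorrelation(patterned, 1)

print(round(band, 3), round(white_acf1, 3), round(patterned_acf1, 3))
```

A lag-1 autocorrelation far outside the band, as in the second series, suggests the model has left a predictable pattern in its errors.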
References:
- M5 accuracy competition: Results, findings, and conclusions
- Autoregressive conditional heteroskedasticity – Wikipedia
- tsfresh – tsfresh 0.18.1.dev39+g611e04f documentation
- AutoTS – AutoTS 0.5.0 documentation
- Kats | Kats
- Welcome to sktime – sktime documentation
- Time Series analysis tsa – statsmodels
- Introduction – PyFlux 0.4.7 documentation
- About Orbit – orbit 1.1.3dev documentation
- Prophet
- GitHub – unit8co/darts: A python library for easy manipulation and forecasting of time series.