
Automatic Trend Change Points Detection in Time Series Analysis

Last Updated on June 10, 2024 by Editorial Team

Author(s): Daniel Pollak

Originally published on Towards AI.

Photo by rc.xyz NFT gallery on Unsplash

Lately, I've been extensively involved in analyzing high-frequency time series characterized by linear trends. My focus has been on developing a dependable, simple forecasting model based on a limited span of "training data" covering a few weeks.

One recurrent phenomenon across numerous series has been the presence of a few trend change points, often manifesting as sudden shifts in either the rate or the magnitude of the trend. I'd like to share a simple technique for automatically identifying these points and modeling them in linear regressions.

Here are the contents of the article:

  1. A quick overview of linear regression as a tool for time-series analysis.
  2. Examples of trend change points.
  3. The identification & modeling techniques.

Linear Regression Model

Image by GeeksForGeeks

Although time series analysis has seen significant advancements with the emergence of sophisticated forecasting models built on SOTA learning architectures such as LSTMs and Transformers (N-BEATS, TimeGPT, PatchTST, …), linear regression remains a fundamental tool in statistics and machine learning. In my opinion, there are several reasons to prefer linear regression over these advanced tools (a minimal fitting sketch follows the list):

  1. Interpretability: Regression terms are clearly defined, and interpreting their coefficients is straightforward (although a coefficient doesn't necessarily imply causality, that's not our concern here).
  2. Simplicity: Linear regression is easy to understand and visualize.
  3. Computational Efficiency: Training a regression model involves solving a simple equation with a clear solution, contrasting with the complex weight optimizations and extensive training data required by advanced models.
  4. Flexibility: Exogenous terms can be incorporated into the regression to model specific behaviors of time-series data, such as trends, seasonalities, lagged terms, and more.
  5. Long-period forecast: Forecasting over a long time horizon is likely to remain difficult for basic and advanced models alike. Therefore, we opt for a small training dataset and forecast over a shorter period instead.
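
As a baseline, here is a minimal sketch of fitting a plain linear trend with statsmodels. The synthetic series (hourly data with a temporary bump of about 28 around May 22nd) is an assumed stand-in for the article's dataset, not the author's actual data:

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Assumed stand-in for the Figure 1 data: a linear trend plus a temporary
# bump around May 22nd, sampled hourly (865 observations, as in the outputs).
rng = np.random.default_rng(0)
idx = pd.date_range("2024-05-01", periods=865, freq="h")
t = np.arange(865)
bump = ((idx >= "2024-05-22") & (idx < "2024-05-23")).astype(float) * 28.0
y = pd.Series(1.1 + 0.083 * t + bump + rng.normal(0, 0.1, 865),
              index=idx, name="metric_value")

# Baseline: regress the metric on an intercept and a linear time index.
X = sm.add_constant(pd.Series(t, index=idx, name="beta"))
fit = sm.OLS(y, X).fit(cov_type="HC1")  # HC1 robust errors, as in the outputs below
print(fit.summary())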

Change Point Examples

A simple example of a change point is shown in the following plot:

Figure 1: A simple trend change point

From May 1st to approximately May 22nd, the dataset adheres to a simple linear pattern. Subsequently, there is a sudden bump, after which the graph resumes its previous linear trend.

A slightly more intricate example involves both a bump and a change in rate, as illustrated below:

Figure 2: Bump and rate change point

Piecewise Linear Regression Models

It's worth mentioning "Piecewise Linear Regression" (also known as "Segmented Regression") here [1]. This method applies separate regressions to distinct segments, but it requires prior knowledge of those segments. Piecewise regression can be implemented either as separate, connected regression models or within a single regression equation that incorporates segment terms: binary dummy variables.

For instance, to model the change point illustrated in Figure 1, we could utilize the following equation:

Model 1: modeling a single change point
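
Written out, a plausible reconstruction of this model from the coefficient names in the output below (const, cp, beta) is:

$$y_t = b + \beta t + d \cdot cp_t + \varepsilon_t$$

where $cp_t$ is a dummy that equals 1 over the window of the shift and 0 elsewhere, and $d$ measures the size of the shift.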

Fitting this equation to the dataset depicted in Figure 1 yields the following regression output:

 OLS Regression Results 
==============================================================================
Dep. Variable: metric_value R-squared: 1.000
Model: OLS Adj. R-squared: 1.000
Method: Least Squares F-statistic: 1.420e+06
Date: Wed, 29 May 2024 Prob (F-statistic): 0.00
Time: 12:30:33 Log-Likelihood: -751.30
No. Observations: 865 AIC: 1509.
Df Residuals: 862 BIC: 1523.
Df Model: 2
Covariance Type: HC1
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
const 1.0977 0.046 23.817 0.000 1.007 1.188
cp 28.0949 0.075 374.505 0.000 27.948 28.242
beta 0.0831 0.000 566.808 0.000 0.083 0.083
==============================================================================

This implies that the model fits the dataset almost perfectly (R-squared = 1.000). An interesting observation is the high z-value and low p-value of the d parameter (the cp coefficient above). Both signify considerable statistical significance, indicating that the cp_t regressor captures a substantial portion of the dataset's variation. We'll leverage this insight in the subsequent sections.

However, a drawback of this approach is that we need to manually construct the cp_t regressor based on prior knowledge that there is a change point on May 22nd.
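
For concreteness, that manual construction might look like this sketch, continuing the synthetic example above (the bump window is assumed known):

# Manually constructed cp_t dummy: 1 during the May 22nd bump, 0 elsewhere.
cp = ((y.index >= "2024-05-22") & (y.index < "2024-05-23")).astype(float)

X = sm.add_constant(pd.DataFrame(
    {"cp": cp, "beta": np.arange(len(y))}, index=y.index))
fit = sm.OLS(y, X).fit(cov_type="HC1")
print(fit.summary())  # reports const, cp, beta, as in the output above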

Automatic Segment Identification

Let’s try to fit the following model:

Model 2: modeling all possible change points
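
In the same notation, a plausible reconstruction with one dummy $cp_{i,t}$ per candidate date $t_i$ is:

$$y_t = b + \beta t + \sum_{i=1}^{N} d_i \cdot cp_{i,t} + \varepsilon_t$$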

Instead of utilizing a dummy variable representing a change on one particular date, we'll incorporate such a dummy variable for every single date in the model. The resulting regression output will appear as follows:

 OLS Regression Results 
==============================================================================
Dep. Variable: metric_value R-squared: 0.999
Model: OLS Adj. R-squared: 0.999
Method: Least Squares F-statistic: 4.940e+05
Date: Wed, 29 May 2024 Prob (F-statistic): 0.00
Time: 12:46:10 Log-Likelihood: -3937.8
No. Observations: 865 AIC: 7950.
Df Residuals: 828 BIC: 8126.
Df Model: 36
Covariance Type: HC1
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
const -6.9122 2.089 -3.309 0.001 -11.006 -2.818
i 2.7750 0.153 18.092 0.000 2.474 3.076
cp_1.0 -18.5993 4.007 -4.642 0.000 -26.453 -10.746
cp_2.0 -18.5993 4.007 -4.642 0.000 -26.453 -10.746
cp_3.0 -18.5993 4.007 -4.642 0.000 -26.453 -10.746
cp_4.0 -18.5993 4.007 -4.642 0.000 -26.453 -10.746
cp_5.0 -18.5993 4.007 -4.642 0.000 -26.453 -10.746
cp_6.0 -18.5993 4.007 -4.642 0.000 -26.453 -10.746
cp_7.0 -18.5993 4.007 -4.642 0.000 -26.453 -10.746
...
cp_18.0 -18.5993 4.007 -4.642 0.000 -26.453 -10.746
cp_19.0 -18.5993 4.007 -4.642 0.000 -26.453 -10.746
cp_20.0 -18.5993 4.007 -4.642 0.000 -26.453 -10.746
cp_21.0 93.2063 19.304 4.828 0.000 55.371 131.041
cp_22.0 429.7896 29.416 14.611 0.000 372.134 487.445
cp_23.0 93.2063 19.304 4.828 0.000 55.371 131.041
cp_24.0 -18.5993 4.007 -4.642 0.000 -26.453 -10.746
cp_25.0 -18.5993 4.007 -4.642 0.000 -26.453 -10.746
cp_26.0 -18.5993 4.007 -4.642 0.000 -26.453 -10.746
cp_27.0 -18.5993 4.007 -4.642 0.000 -26.453 -10.746
cp_28.0 -18.5993 4.007 -4.642 0.000 -26.453 -10.746
...
==============================================================================

It’s apparent that among all the coefficients, cp_22 stands out with the highest z-value, indicating that it captures the most variance in the dataset.
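
A sketch of this scan, continuing the synthetic example above (the day numbering and dummy labels are my assumptions; the article's outputs label the dummies cp_1.0 through cp_35.0):

# One indicator per day instead of a single known dummy, fit alongside
# the global trend; then rank candidate days by the |z| of their dummy.
day = pd.Index((y.index - y.index[0]).days, name="day")
dummies = pd.get_dummies(day, prefix="cp", dtype=float).iloc[:, 1:]  # drop reference day
dummies.index = y.index

X = sm.add_constant(pd.concat(
    [pd.Series(np.arange(len(y)), index=y.index, name="i"), dummies], axis=1))
fit = sm.OLS(y, X).fit(cov_type="HC1")

z = fit.tvalues.filter(like="cp_").abs().sort_values(ascending=False)
print(z.head())  # the dummy covering May 22nd should rank first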

Let’s try to introduce another change point:

Figure 3: Two change points

And the regression output:

 OLS Regression Results 
==============================================================================
Dep. Variable: metric_value R-squared: 0.998
Model: OLS Adj. R-squared: 0.998
Method: Least Squares F-statistic: 1.199e+05
Date: Wed, 29 May 2024 Prob (F-statistic): 0.00
Time: 12:59:45 Log-Likelihood: -4509.6
No. Observations: 865 AIC: 9093.
Df Residuals: 828 BIC: 9269.
Df Model: 36
Covariance Type: HC1
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
const -22.1901 4.571 -4.854 0.000 -31.149 -13.231
i 4.1035 0.297 13.816 0.000 3.521 4.686
cp_1.0 -50.4838 8.323 -6.066 0.000 -66.796 -34.171
cp_2.0 -50.4838 8.323 -6.066 0.000 -66.796 -34.171
cp_3.0 -50.4838 8.323 -6.066 0.000 -66.796 -34.171
cp_4.0 -50.4838 8.323 -6.066 0.000 -66.796 -34.171
cp_5.0 -50.4838 8.323 -6.066 0.000 -66.796 -34.171
...
cp_19.0 -50.4838 8.323 -6.066 0.000 -66.796 -34.171
cp_20.0 -50.4838 8.323 -6.066 0.000 -66.796 -34.171
cp_21.0 61.3218 18.753 3.270 0.001 24.566 98.078
cp_22.0 397.9051 27.505 14.467 0.000 343.996 451.814
cp_23.0 61.3218 18.753 3.270 0.001 24.566 98.078
cp_24.0 -50.4838 8.323 -6.066 0.000 -66.796 -34.171
cp_25.0 -50.4838 8.323 -6.066 0.000 -66.796 -34.171
cp_26.0 -50.4838 8.323 -6.066 0.000 -66.796 -34.171
cp_27.0 -50.4838 8.323 -6.066 0.000 -66.796 -34.171
cp_28.0 -50.4838 8.323 -6.066 0.000 -66.796 -34.171
cp_29.0 141.1829 32.296 4.372 0.000 77.884 204.482
cp_30.0 718.1829 49.021 14.650 0.000 622.103 814.263
cp_31.0 141.1829 32.296 4.372 0.000 77.884 204.482
cp_32.0 -50.4838 8.323 -6.066 0.000 -66.796 -34.171
cp_33.0 -50.4838 8.323 -6.066 0.000 -66.796 -34.171
cp_34.0 -50.4838 8.323 -6.066 0.000 -66.796 -34.171
cp_35.0 -51.5355 8.473 -6.082 0.000 -68.143 -34.928
==============================================================================

Once more, the two cp_i coefficients exhibiting the highest z-values are cp_22 and cp_30.

Overfitting

Certainly, in practical scenarios, incorporating a large number of irrelevant regressors in a regression model can result in overfitting, correlation among regressors, and increased statistical variability [2].

What I usually do is use the method above to pinpoint significant potential change points. I then fit the regression with only these identified points and check whether the fit remains accurate, e.g., by validating a high R-squared value. If it does, I generate the forecast using this last regression.
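
Here is a rough end-to-end sketch of that workflow, continuing the example above; the z-value threshold, the R-squared cutoff, and the forecast construction are illustrative assumptions, not the author's exact procedure:

def forecast_with_changepoints(y, z_threshold=10.0, r2_min=0.99, horizon=24):
    """Screen for change points, refit with only the significant ones, forecast."""
    n = len(y)
    trend = pd.Series(np.arange(n), index=y.index, name="i")
    day = pd.Index((y.index - y.index[0]).days)
    dummies = pd.get_dummies(day, prefix="cp", dtype=float).iloc[:, 1:]
    dummies.index = y.index

    # Step 1: screening regression with one dummy per day.
    screen = sm.OLS(
        y, sm.add_constant(pd.concat([trend, dummies], axis=1))
    ).fit(cov_type="HC1")
    z = screen.tvalues.filter(like="cp_").abs()
    keep = z[z > z_threshold].index.tolist()

    # Step 2: refit with only the significant change-point dummies.
    X = sm.add_constant(pd.concat([trend, dummies[keep]], axis=1))
    final = sm.OLS(y, X).fit(cov_type="HC1")
    if final.rsquared < r2_min:
        raise ValueError("fit quality too low; inspect the series manually")

    # Step 3: forecast by extending the trend; past change-point dummies stay 0.
    future_idx = pd.date_range(y.index[-1], periods=horizon + 1,
                               freq=y.index.freq)[1:]
    Xf = pd.DataFrame(0.0, index=future_idx, columns=X.columns)
    Xf["const"] = 1.0
    Xf["i"] = np.arange(n, n + horizon)
    return final.predict(Xf)

print(forecast_with_changepoints(y).head())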

Wrapping up

The method demonstrated here can certainly be applied to more intricate time series. Additionally, as illustrated in Figure 2, we can employ the same concept to model a change point in slope.

Certainly, there are alternative approaches. For instance, we could difference the time series of the very basic example we discussed earlier and apply a simple anomaly-detection technique to identify the change points. We could then save these points in our database and incorporate them whenever our training set's time period includes them.
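
For intuition, that alternative might look like this sketch (the five-standard-deviation cutoff is an arbitrary illustrative choice):

# Difference the series and flag outsized jumps as change-point candidates.
diff = y.diff().dropna()
z = (diff - diff.mean()) / diff.std()
candidates = diff[z.abs() > 5.0].index
print(candidates)  # should flag the jumps into and out of the May 22nd bump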

What I find appealing about this technique is that it lets me reuse the same regression framework I already use for forecasting. It also minimizes the need to store state about my training set (historically identified change points).

References

[1] Piecewise Linear Regression Models, STAT 501: Regression Methods, Penn State University

[2] The Consequences of Including Irrelevant Variables in a Linear Regression Model, Statistical Modeling and Forecasting

[3] Rob J. Hyndman and George Athanasopoulos, Forecasting: Principles and Practice (2nd ed.), 2018, Monash University, Australia
