
Bias & Variance in Machine Learning

Last Updated on July 28, 2020 by Editorial Team

Author(s): Shaurya Lalwani

Photo by Etienne Girardet on Unsplash

Linear Regression is a machine learning algorithm used to predict a quantitative target with the help of independent variables that are modeled in a linear manner, fitting a line (or a plane, or a hyperplane) through the predicted data points. For simplicity, let’s call this the best-fit line. Usually, points from the training data don’t lie exactly on the best-fit line, and that makes perfect sense: no data is perfect. That is why we are making predictions in the first place, and not just plotting a random line.
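
To make this concrete, here is a minimal sketch of fitting a linear regression with scikit-learn. The two-feature dataset is synthetic and purely illustrative:

```python
# A minimal sketch: fit a plane (two features) through noisy
# synthetic data with scikit-learn.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(100, 2))                                    # two independent variables
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(0, 1.0, size=100)  # noisy target

model = LinearRegression().fit(X, y)
print("Intercept:", model.intercept_)  # close to 3.0
print("Slopes:", model.coef_)          # close to [2.0, -1.5]
```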

Understanding Bias

Photo by David Talley on Unsplash

The linear regression line cannot be curved to include all the training set data points, and hence it is sometimes unable to capture the true relationship. This is called bias. In the model’s equation, the intercept obtained in linear regression is often referred to as the bias term.

Why do I say that?

Let me explain: Here’s a random Linear Regression equation:

y = Intercept + Slope1*x1 + Slope2*x2

The target (y) has actual values in the dataset, and the above equation calculates the predicted values for it. If the intercept itself is very large and already comes close to the predicted y values, then the contribution of the other two parts of our equation, the independent variables (x1 and x2), must be small. This means that the amount of variance explained by x1 and x2 is low, which eventually produces an underfitting model. An underfitting model has a low R-squared (the proportion of variance in the target explained by the independent variables).
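
A hypothetical illustration of this point: in the synthetic data below, the constant term dominates the target, so the fitted intercept sits close to the predictions and the features explain almost none of the variance:

```python
# Illustrative only: when the intercept does most of the work, x1 and x2
# explain little of the target's variance and R-squared is near zero.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(100, 2))
y = 100.0 + 0.1 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(0, 1.0, size=100)

model = LinearRegression().fit(X, y)
print("Intercept:", model.intercept_)   # close to 100
print("R-squared:", model.score(X, y))  # near zero: an underfitting model
```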

Underfitting can also be understood by thinking about how the best-fit line/plane is found in the first place. The best-fit line/plane captures the relationship between the target and the independent variables. If this relationship is captured to a very high extent, bias is low, and vice versa.

Now that we understand what bias is, and how high bias causes an underfitting model, it becomes clear that for a robust model, we need to reduce this underfitting.

In a scenario where we could create a curve that passes through all the data points and showcases the existing relationship between the independent variables and the dependent variable, there would be no bias in the model.

Understanding Variance

Photo by Nick Fewings on Unsplash

A model that has overfitted on the train data gives rise to a second phenomenon, called “variance”. Time to consider a few models:

Model1: High Bias (Unable to capture the relationship properly)

Model2: Low Bias (Captures relationship to a very high extent)

Error measurement while validating a model:

Error = Actual Values - Predicted Values
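
In code, this error measurement is just a subtraction. The numbers below are made up for illustration:

```python
# Errors (residuals): actual values minus predicted values.
import numpy as np

y_actual = np.array([3.0, 5.0, 7.0, 9.0])  # hypothetical actual values
y_pred = np.array([2.8, 5.3, 6.9, 9.4])    # hypothetical predictions
errors = y_actual - y_pred
print("Errors:", errors)
print("Mean squared error:", np.mean(errors ** 2))
```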

On calculating the errors on the training data (test data is not in the picture yet), we observe the following:

Model1: Validation of model on train data shows that errors are high

Model2: Validation of model on train data shows that errors are low

Now, let’s bring in the test data, and understand variance.

So, if the model has overfitted on the train data, it “understands” and “knows” the train data to such a high extent that it will likely struggle with the test data, and hence it will be unable to capture the relationship when test data is used as input to the model. In broader terms, this means there will be a large difference of fit between the train data and the test data (the train data validates almost perfectly while the test data does not). This difference of fit is referred to as “variance”, and it is usually caused when the model understands only the train data and struggles with any new input given to it.

On validating the above models on test data, we notice this:

Model1: The relationship isn’t captured correctly here either, but there isn’t a huge gap of understanding between the train and test data, so the variance is low

Model2: There is a huge gap of understanding between the train and test data, so the variance is high
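
A quick sketch of this train-versus-test comparison, using an unconstrained decision tree as a stand-in for Model2 (a tree with no depth limit tends to memorize the training data). The dataset is synthetic:

```python
# The gap between train and test scores is a practical symptom of variance.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(7)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.5, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=7)

tree = DecisionTreeRegressor().fit(X_train, y_train)     # no depth limit: prone to overfit
print("Train R-squared:", tree.score(X_train, y_train))  # ~1.0: memorized the train data
print("Test R-squared:", tree.score(X_test, y_test))     # noticeably lower: high variance
```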

The Trade-Off between Bias & Variance

Photo by André Noboa on Unsplash

Now we understand that both bias and variance can cause problems in our prediction model. So, how do we go about solving this issue?

A couple of terms to understand before we proceed:

Overfit: Low Bias & High Variance - Model fits great on train data, but struggles with test data because it understands only the train data well

Underfit: High Bias & Low Variance - Model is unable to capture the relationship on the train data, but since it hasn’t captured the relationship anyway, there isn’t much of a gap of understanding between train and test data, so variance is low
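
Both cases can be reproduced with polynomial regression on synthetic data: degree 1 is too rigid to follow a sine curve (high bias), while degree 15 chases the noise (high variance). The degrees and data here are illustrative choices:

```python
# Underfit vs. overfit on the same data, varying only model flexibility.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.3, size=60)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

for degree in (1, 15):  # degree 1 underfits; degree 15 tends to overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(f"degree={degree}: train R2={model.score(X_train, y_train):.2f}, "
          f"test R2={model.score(X_test, y_test):.2f}")
```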

Coming back to the solution, we can do the following to strike a trade-off between bias and variance:

1. Cross-Validation

Photo by Mae Mu on Unsplash

Usually, a model is built on train data and tested on the same, but there’s one more practice that people prefer: testing the model on a held-out part of the train data, which is called the validation data.

So, what is Cross-Validation?

As mentioned, model validation is done on a part of the train data. So, if we keep choosing a new set of data points from the train data for validation in each iteration, and keep averaging the results obtained from these sets of data, we are doing cross-validation. This is an effective way to understand the behavior of the model on the train data and to check whether an overfit is present.
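
A minimal sketch with scikit-learn's cross_val_score on a synthetic regression dataset; the five per-fold scores are averaged into one steadier estimate:

```python
# Cross-validation: score the model on several train/validation splits
# and average, instead of trusting a single split.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=5)  # five splits
print("Fold scores:", np.round(scores, 3))
print("Mean score:", round(scores.mean(), 3))
```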

Types of Cross-Validation:

K-Fold CV: K here represents the number of sets we break our train set into. Each of these K sets is used once for model validation while the model trains on the remaining K-1 sets, and the results obtained from these K sets are averaged to give a final result, which helps guard against overfitting.
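
A small sketch of K-Fold splitting with scikit-learn (K=5 on ten toy samples); every sample lands in the validation set exactly once:

```python
# K-Fold: each of the 5 folds serves once as validation data while the
# remaining folds are used for training.
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)  # ten toy samples
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(kf.split(X)):
    print(f"Fold {fold}: train={train_idx}, validate={val_idx}")
```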

Leave-One-Out CV: The working of Leave-One-Out CV is similar to that of K-Fold CV, but it takes the process to its extreme, since it computes the cross-validation result using each and every data point in the train data as its own validation set. This is obviously time-consuming, but it definitely helps in avoiding overfitting.
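
The same idea taken to its limit with scikit-learn's LeaveOneOut: with four toy samples there are four folds, each holding out a single point:

```python
# Leave-One-Out: n samples -> n folds, so cost grows with dataset size.
import numpy as np
from sklearn.model_selection import LeaveOneOut

X = np.arange(8).reshape(4, 2)  # four toy samples -> four folds
for train_idx, val_idx in LeaveOneOut().split(X):
    print(f"train={train_idx}, validate={val_idx}")
```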

Forward Chaining: While working with time-series data, K-Fold CV and Leave-One-Out CV can create a problem: it is quite possible that some years show a pattern that other years don’t, so using random sets of data points for cross-validation would not make sense. In fact, existing trends could go unnoticed, which is not what we want. So, in such cases, a forward-chaining method is usually used, wherein each fold’s train set is created by adding the data of one more consecutive year to the previous train set, and the model is validated on the year immediately following the latest year in that train set.
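
scikit-learn implements forward chaining as TimeSeriesSplit. In the toy split below, the train window only ever grows forward in time, and validation always follows it, so no future data leaks into training:

```python
# Forward chaining for time series: train on the past, validate on the
# period that immediately follows.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(12).reshape(6, 2)  # six time-ordered samples
for train_idx, val_idx in TimeSeriesSplit(n_splits=3).split(X):
    print(f"train={train_idx}, validate={val_idx}")
```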

2. Regularization

Photo by Simon Berger on Unsplash

Regularization is a technique that helps control variance by penalizing the beta coefficients attached to our model’s independent variables: it accepts a small increase in bias in exchange for a larger reduction in variance.
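
A hedged sketch of this shrinkage using Ridge (L2) and Lasso (L1) on a synthetic dataset; the alpha value is arbitrary, and larger alphas penalize the coefficients more heavily:

```python
# Regularization shrinks the beta coefficients relative to plain
# least squares; Lasso can drive some of them exactly to zero.
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge

X, y = make_regression(n_samples=50, n_features=10, noise=5.0, random_state=0)
for est in (LinearRegression(), Ridge(alpha=1.0), Lasso(alpha=1.0)):
    est.fit(X, y)
    print(type(est).__name__, "coefficients:", est.coef_.round(1))
```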

I’ve written a whole article on “Feature Selection in Machine Learning”, where I have described Regularization and its types in much more depth. Feel free to check it out here:

Feature Selection in Machine Learning

Conclusion

There is no perfect model; a model is made better by treating its imperfections constructively. Once you are able to identify that bias or variance exists in your model, there is a lot you can do to change that: you may try feature selection and feature transformation, or remove some of the variables driving the overfit. Based on what is possible at the moment, a decision can be made, and the model can usually be improved.

Thank you for reading! Happy learning!

Support my writing here 😃


Bias & Variance in Machine Learning was originally published in Towards AI - Multidisciplinary Science Journal on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
