
Learning Curves

Last Updated on September 6, 2020 by Editorial Team

Author(s): NVS Yashwanth

Photo by Isaac Smith on Unsplash

Machine Learning

Evaluating machine learning models the right way

Learning curves are useful in analyzing a machine learning model's performance over various sample sizes of the training dataset.

To understand learning curves, it is important to have a good understanding of the Bias-Variance Tradeoff. You can check out my article on the topic.

Bias-Variance

Evaluating Models

"Always plot learning curves while evaluating models"

The basics first: if a model performs well on the training data but generalizes poorly to unseen data, it is overfitting. If it performs poorly on both, it is underfitting.

The hyperparameters must be set so that both bias and variance are as low as possible.
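In practice, hyperparameters are usually tuned with cross-validation to strike this balance. Here is a minimal sketch using scikit-learn's GridSearchCV to pick Ridge regression's regularization strength; the data and the alpha grid are illustrative, not from this article's example.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Illustrative data: a noisy linear signal
X = np.random.rand(100, 1)
y = 3 + 3 * X.ravel() + np.random.randn(100)

# Search over regularization strengths; larger alpha -> simpler model
search = GridSearchCV(Ridge(), {'alpha': [0.01, 0.1, 1.0, 10.0]}, cv=5)
search.fit(X, y)
print(search.best_params_)  # the alpha with the best cross-validated score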

How are Learning Curves helpful?

"Learning curves are plots of the model's performance on the training set and the validation set as a function of varying sample sizes of the training dataset."

To be specific, learning curves show training and validation scores on the y-axis against varying sample sizes of the training dataset on the x-axis.

The training and validation scores can be any evaluation metric, such as MSE or RMSE, computed on the training and validation sets.

Learning curves can be used to understand the bias and variance errors of a model.

Understanding Learning Curves

Let's generate some random data, fit a linear regression model to it, and plot the learning curves to evaluate the model.

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error as mse
import numpy as np
import matplotlib.pyplot as plt

plt.style.use('seaborn')

# Generate noisy linear data
X = np.random.rand(100, 1)
y = 3 + 3 * X + np.random.randn(100, 1)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)

# Fit a line on the full training set and plot it
regressor = LinearRegression()
regressor.fit(X_train, y_train)
predictions = regressor.predict(X_val)

plt.figure(1, figsize=(15, 5))
plt.subplot(121)
plt.scatter(X, y)
plt.plot(X_val, predictions, color='black')
plt.title('Scikit Learn Linear Regression')

# Refit on growing subsets of the training set and record the errors
train_errors = []
val_errors = []
plt.subplot(122)
for i in range(1, len(X_train)):
    regressor.fit(X_train[:i], y_train[:i])
    train_preds = regressor.predict(X_train[:i])
    val_preds = regressor.predict(X_val)
    train_errors.append(mse(y_train[:i], train_preds))
    val_errors.append(mse(y_val, val_preds))

# Plot RMSE (the square root of MSE) for both curves
plt.plot(range(1, len(X_train)), np.sqrt(train_errors), label='Training error')
plt.plot(range(1, len(X_train)), np.sqrt(val_errors), label='Validation error')
plt.title('Learning Curves')
plt.xlabel('Train set size')
plt.ylabel('RMSE')
plt.legend()
plt.show()

Look at the output of the above code:

Left: Regression Line; Right: Learning Curves. Image by the author.

Nice plots, but what do they mean? It may seem like a lot at first. The following steps walk through the code and the images.

  1. We generated random data (X, y).
  2. Derived a training and a validation dataset from it.
  3. Used Scikit-Learn's LinearRegression class to fit a line to the data, which is what the image on the left shows.
  4. We then fit the model the same way as above, but this time on training sample sizes ranging from 1 up to the full training set size.
  5. For every training sample size, we make predictions on the chosen training subset and on the entire validation dataset.
  6. We calculate the RMSE (Root Mean Square Error) and store it for plotting later. Done!

We can see the training and validation scores converge at a particular point. As seen in the image on the right, the point of convergence on the x-axis is at a training sample size of about 10. Beyond this point, the model will not benefit from a larger training sample. On the y-axis, the curves converge at an RMSE of about 1. This is acceptable, and the model seems to generalize properly.
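As an aside, scikit-learn also ships a learning_curve utility that automates the manual loop above using cross-validation; here is a minimal sketch (the size grid and scoring choice are illustrative):

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve

# Reuse noisy linear data as in the earlier snippet
X = np.random.rand(100, 1)
y = 3 + 3 * X.ravel() + np.random.randn(100)

sizes, train_scores, val_scores = learning_curve(
    LinearRegression(), X, y,
    train_sizes=np.linspace(0.1, 1.0, 10),
    cv=5, scoring='neg_mean_squared_error')

# Convert negated MSE back to RMSE, averaged over the folds
train_rmse = np.sqrt(-train_scores.mean(axis=1))
val_rmse = np.sqrt(-val_scores.mean(axis=1))
print(sizes, train_rmse, val_rmse)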

However, consider a case where the value at the point of convergence on the y-axis is high (as seen in the image below). This shows the model is suffering from high bias: both training and validation errors are high, the model does not benefit from more training data, and the result is underfitting.

Learning curves. High bias. Source: Andrew Ng's ML course.

On the other hand, if there is no visible point of convergence (as seen in the image below), the model has high variance and too little data: the validation error stays high and the model is overfitting.

Learning curves. High variance. Source: Andrew Ng's ML course.

How to improve model performance?

In the case of high bias, increase the number of features or decrease the regularization parameter, thereby increasing the model complexity.
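For instance, a common way to add features and raise model complexity is a polynomial expansion; here is a minimal sketch (degree=2 is illustrative, and X_train, y_train are from the earlier snippet):

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Expand x into [x, x^2] before fitting, increasing model complexity
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X_train, y_train)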

In the case of high variance, decrease the number of features or increase the regularization parameter, thereby decreasing the model complexity. To close the remaining gap between the curves, add more data (not more features).
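A minimal sketch of the regularization side, assuming Ridge regression (the alpha value is illustrative; X_train, y_train are from the earlier snippet):

from sklearn.linear_model import Ridge

# Larger alpha -> stronger regularization -> lower model complexity
model = Ridge(alpha=10.0)
model.fit(X_train, y_train)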

Conclusion

Always plot learning curves. Having a good understanding of learning curves helps you evaluate your model and analyze bias-variance problems. Hope you understood the importance of learning curves. See you at the next one.

Originally published on machinelearningmaster.

Hey, if you liked this article, please show your support by smashing that clap button and sharing this article. Follow me for more articles on Machine Learning, Deep Learning, and Data Science. See you at the next one!

Find me around the web

Blog: MachineLearningMaster

GitHub Profile: This is where I fork

LinkedIn Profile: Connecting and sharing professional updates

Twitter: Sharing tech tweets

Thank you

