Statistics

Regression Line with Mathematics for the Linear Regression

Last Updated on July 9, 2020 by Editorial Team

Author(s): Gul Ershad

Introduction

Regression is a prediction task in which the target is continuous, and it has many applications. Linear regression is the simplest parametric model for it. Each example in a data set is a pair consisting of an input feature vector and a label value. The goal is to estimate the model parameters from the training data set so that the model can predict the target values of the test data.

The table below has two variables, X and Y. Here, Y is known as the target variable or dependent variable, and X is known as the explanatory (independent) variable.

X and Y variables

Predicting the height of a child from the child's age and weight is an example of a regression problem.

Let X be real-valued:

Values of X

And let Y be real-valued:

Values of Y

So, the regression process is based on the following rule:

Approach to Regression

The general approach to regression is as follows:

  1. Collect data.
  2. Prepare data: Regression requires numeric values. Nominal values should be mapped to binary values.
  3. Analysis: Visualizing the data in 2D plots is helpful.
  4. Train: Find the regression weights.
  5. Test: Measure R², or the correlation between the predicted values and the actual data, to assess the accuracy of the model.
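
The steps above can be sketched in a few lines of Python. This is a minimal illustration with NumPy, not the author's code: the data values and the feature names `size` and `color` are hypothetical, and for brevity R² is measured on the training data itself rather than on a held-out test set.

```python
import numpy as np

# Collect: hypothetical toy data with one numeric and one nominal feature.
size = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
color = np.array(["red", "blue", "red", "blue", "red", "blue"])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1, 13.0])

# Prepare: map the nominal feature to a binary (0/1) column.
is_red = (color == "red").astype(float)

# Train: find the regression weights by least squares.
# The leading column of ones lets the model learn an intercept.
X = np.column_stack([np.ones_like(size), size, is_red])
weights, *_ = np.linalg.lstsq(X, y, rcond=None)

# Test: R^2 = 1 - (residual sum of squares) / (total sum of squares).
y_pred = X @ weights
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print("weights:", weights)
print("R^2:", r_squared)
```

In practice the test step would use data that was not seen during training, so that R² reflects how well the model generalizes.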

Regression Line

Linear regression consists of finding the best-fitting straight line through the points. The best-fitting line is called a regression line.

Regression Line

The equation of Regression Line:

The Equation of Regression Line

The equation of Intercept a:

The Equation of intercept a

The equation of Slope b:
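
The figures that carried these formulas are not reproduced here. As a hedged reconstruction, and not necessarily the author's exact notation, the standard least-squares expressions for a line with intercept a and slope b, where n is the number of items, are:

```latex
Y = a + bX

a = \frac{\sum Y \sum X^2 - \sum X \sum XY}{n \sum X^2 - \left(\sum X\right)^2}

b = \frac{n \sum XY - \sum X \sum Y}{n \sum X^2 - \left(\sum X\right)^2}
```

Both expressions share the same denominator, so it only needs to be computed once.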

Properties of Regression Line

The regression line has the following properties:

  1. The regression line always passes through the point formed by the mean of x and the mean of y.
  2. This line minimizes the sum of squared differences between observed values and predicted values.
  3. In the regression line, x is the input value and y is the output value.
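
The first two properties can be checked numerically. The sketch below uses hypothetical sample points and `np.polyfit` for the least-squares fit; it is an illustration, not part of the original article.

```python
import numpy as np

# Hypothetical sample points, used only to illustrate the properties.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# np.polyfit with degree 1 returns the least-squares [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)


def sum_squared_error(a, b):
    """Sum of squared differences between observed y and a + b * x."""
    return float(np.sum((y - (a + b * x)) ** 2))


# Property 1: the line passes through the point of means.
passes_through_means = np.isclose(intercept + slope * x.mean(), y.mean())

# Property 2: nudging the intercept or the slope away from the fitted
# values increases the sum of squared errors.
best = sum_squared_error(intercept, slope)
worse_intercept = sum_squared_error(intercept + 0.1, slope)
worse_slope = sum_squared_error(intercept, slope + 0.1)

print(passes_through_means, best < worse_intercept, best < worse_slope)
```

Each of the three checks should come out true for any data set on which a least-squares line can be fitted.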

Residual Error in Regression Line

Residual error is the difference between the observed value of the dependent variable and its predicted value.

Residual Error = Observed value – Predicted value

Residual Error
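
As a small illustration of the definition, the sketch below computes one residual per data point. The data values are hypothetical and the line is fitted with `np.polyfit`.

```python
import numpy as np

# Hypothetical observations.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_observed = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Fit the least-squares line and predict a value for every x.
slope, intercept = np.polyfit(x, y_observed, 1)
y_predicted = intercept + slope * x

# Residual error = observed value - predicted value, one per point.
residuals = y_observed - y_predicted

print(residuals)
print("sum of residuals:", residuals.sum())
```

A positive residual means the observed point lies above the line, a negative one means it lies below. For a least-squares line with an intercept, the residuals sum to zero up to floating-point rounding.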

Derivation of the equation of the Regression Line

Let's consider the following variables x and y with their values:

Variables X and Y with their values

So, to calculate the values of a and b, let's find the values of XY, X², and Y².

Ready values to find Intercept and Slope

Here,

No. of items

Now, find the value of Intercept a:

Value of Intercept

Find the value of Slope b:

Value of Slope

Hence, the Regression Line equation:

The Regression Line equation
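
The article's own table of values is not reproduced here, so the sketch below repeats the same calculation on hypothetical data. It assumes the standard least-squares formulas a = (ΣY·ΣX² − ΣX·ΣXY) / (n·ΣX² − (ΣX)²) and b = (n·ΣXY − ΣX·ΣY) / (n·ΣX² − (ΣX)²).

```python
# Hypothetical data; these are not the values from the original table.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)                                   # number of items
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))   # sum of XY
sum_x2 = sum(x * x for x in xs)               # sum of X squared

# Both formulas share the same denominator.
denominator = n * sum_x2 - sum_x ** 2

intercept_a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
slope_b = (n * sum_xy - sum_x * sum_y) / denominator

print(f"Y = {intercept_a:.2f} + {slope_b:.2f} * X")  # prints: Y = 2.20 + 0.60 * X
```

Note that the sum of Y² is not needed for a or b. It enters when the correlation between X and Y is computed.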

Linear Regression

As an example, suppose we try to forecast the horsepower of a friend's automobile. The equation might be:

Horsepower = 0.0018 * annual_salary - 0.99 * hourslistening_radio

This equation is known as a regression equation. The values 0.0018 and -0.99 are known as regression weights, and the process of finding these regression weights is called regression.

Forecasting new values for a given set of inputs is easy once the regression weights are found.
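
To make that concrete, here is a minimal sketch that applies the weights from the horsepower equation to one new input. The function name `predict_horsepower` and the input values are hypothetical.

```python
# Regression weights taken from the example equation above.
WEIGHT_ANNUAL_SALARY = 0.0018
WEIGHT_HOURS_RADIO = -0.99


def predict_horsepower(annual_salary, hourslistening_radio):
    """Forecast horsepower as a weighted sum of the two inputs."""
    return (WEIGHT_ANNUAL_SALARY * annual_salary
            + WEIGHT_HOURS_RADIO * hourslistening_radio)


# Hypothetical inputs: a salary of 50,000 and 10 hours of radio.
estimate = predict_horsepower(annual_salary=50_000, hourslistening_radio=10)
print(round(estimate, 1))  # prints: 80.1
```

Prediction is just multiplication and addition. The expensive part of regression is finding the weights in the first place.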

The prediction formula for linear regression is shown below:

Equation of Linear Regression

# Fit and plot a one-feature linear regression on mglearn's synthetic "wave" data set.
import mglearn
mglearn.plots.plot_linear_regression_wave()

Linear Regression on Wave data set

There are many different linear models for regression. They differ in how the model parameters w (the weights) and b (the offset) are learned from the training data, and in how model complexity can be controlled. Note that b here denotes the offset, whereas in the regression line sections above b denotes the slope.
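
The figure with the prediction formula is not reproduced here. A common way to write the general linear prediction for an input with features x_0 to x_p, offered as a hedged reconstruction and not necessarily the author's exact notation, is:

```latex
\hat{y} = w_0 x_0 + w_1 x_1 + \dots + w_p x_p + b
```

With a single feature this reduces to a straight line with slope w_0 and offset b.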

Pros of Linear Regression:

  1. It is easy to interpret and computationally inexpensive.

Cons of Linear Regression:

  1. It models non-linear data poorly.

Conclusion

Finding the best-fitting straight line through the points is the core of linear regression, and this line is called the regression line. The least-squares method is used to find it.


Regression Line with Mathematics for the Linear Regression was originally published in Towards AI - Multidisciplinary Science Journal on Medium.
