Regression Line with Mathematics for the Linear Regression
Last Updated on July 9, 2020 by Editorial Team
Author(s): Gul Ershad
Statistics
Introduction
Regression is a prediction where the target is continuous and its applications are several. It is the simplest parametric model. Every data-set is given in a pair consisting of an input feature vector and label value. The main goal is to hypothesize the parameters to predict the target values of the test data after training from the training data-set.
The below table has two variables X and Y. Here, Y is known as the target variable or independent variable, and X is known as the explanatory variable.
The prediction of the height of a child based on his age and weight can be an example of a regression problem.
Lets X is a real-values:
And, the real value ofΒ Y:
So, the regression process based on the givenΒ rule:
Approach to Regression
Following are the general approach to Regression:
- Collect data
- Prepare data: Numeric values should be there for the regression. If there are nominal values, it should be mapped to binaryΒ values.
- Analysis: Good for the visualization into 2DΒ plots.
- Train: Find the regression weights.
- Test: Measure the R2, or correlation of the predicted values and data. It measures the accuracy of theΒ model.
Regression Line
Linear regression consists of finding the best-fitting straight line through the points. The best-fitting line is called a regression line.
The equation of Regression Line:
The equation of Intercept a:
The equation of SlopeΒ b:
Properties of Regression Line
The regression line has the following properties:
- The regression always runs and rise through points x and yΒ mean.
- This line minimizes the sum of square differences between observed values and predicted values.
- In the regression line, x is the input value and y is the outputΒ value.
Residual Error in Regression Line
Residual Error is the difference between the observed value of the dependent value and predicted value.
Residual Error = Observed value – Predicated value
Derivative to find the equation of Regression Line
Letβs consider the following variables x and y with theirΒ values:
So, to calculate the values of a and b lets find the values of XY, XΒ², andΒ YΒ².
Here,
Now, find the value of Intercept aΒ :
Find the value of SlopeΒ b:
Hence, the Regression Line equation:
Linear Regression
Let's take an example, try to forecast the horsepower of a friendβs automobile so its equation willΒ be:
Horsepower = 0.0018 * annual_salaryβββ0.99*hourslistening_radio
This equation is known as a regression equation. The values of 0.0018 and 0.99 are known as regression weights. And, the process of finding these regression weights is called regression.
Forecasting new values given set of input is easy once the regression weights areΒ found.
For regression, the prediction formula for the linear regression is likeΒ below:
import mglearn
mglearn.plots.plot_linear_regression_wave()
There are many different linear models for regression. The difference between these models lies in how the model parameters w and b are learned from the training data, and how model complexity can be controlled.
Pros of Linear Regression:
- It easy to interpret and computationally inexpensive
Cons of Linear Regression:
- It poorly models on non-linear data
Conclusion
To find the best-fitting straight line through the points is an important part of Linear regression and this line is called a regression line. Linear regression consists of finding the best-fitting straight line through the points. The least-squares method is used to find the best-fitting straight line in regression.
References
Introduction to Linear Regression: http://onlinestatbook.com/2/regression/introC.html
Regression Line with Mathematics for the Linear Regression was originally published in Towards AIβββMultidisciplinary Science Journal on Medium, where people are continuing the conversation by highlighting and responding to this story.
Published via Towards AI