Normal Equation in Linear Regression

Last Updated on June 25, 2020 by Editorial Team

Author(s): Saniya Parveez

Gradient descent is a popular first-order iterative optimization algorithm for finding a local minimum of a differentiable function. The Normal Equation is another way of minimizing the cost function J, and it does so without resorting to an iterative algorithm. The Normal Equation method minimizes J by explicitly taking its derivatives with respect to each theta j and setting them to zero.
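The step of setting the derivatives to zero can be sketched as a short derivation. The symbols here are assumptions, since the article does not define them: X is the feature matrix with a leading column of ones, y is the target vector, m is the number of training examples, and X^T X is assumed to be invertible.

```latex
J(\theta) = \frac{1}{2m}\,(X\theta - y)^{T}(X\theta - y)

\nabla_{\theta} J(\theta) = \frac{1}{m}\,X^{T}(X\theta - y) = 0
\;\Longrightarrow\; X^{T}X\,\theta = X^{T}y
\;\Longrightarrow\; \theta = (X^{T}X)^{-1}X^{T}y
```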

Example:

Below is a data set for predicting house prices:

House Features:

Predictor:

Calculation:

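import numpy as np

# Design matrix: a leading column of ones followed by four features per house
x = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1, 852, 2, 1, 36]])
y = np.array([460, 232, 315, 178])  # target values

x_transpose = x.transpose()
x_transpose_x = np.dot(x_transpose, x)
x_transpose_y = np.dot(x_transpose, y)
# With 4 rows and 5 columns, x_transpose_x is singular, so np.linalg.inv
# cannot be used here; the pseudo-inverse is used instead
x_transpose_x_inverse = np.linalg.pinv(x_transpose_x)
theta = np.dot(x_transpose_x_inverse, x_transpose_y)
theta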

Gradient Descent Vs Normal Equation

Gradient Descent

  • It requires choosing a value for alpha, the learning rate.
  • It requires many iterations.
  • It works well even when n (the number of features) is large.

Normal Equation

  • It does not require choosing alpha.
  • It doesn’t require iteration.
  • It requires computing the inverse of X^T X (the transpose of X multiplied by X).
  • It is slow if n (the number of features) is very large, because X^T X is an n × n matrix that must be inverted.
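The trade-off above can be illustrated with a minimal sketch on synthetic data. The data, the learning rate alpha=0.1, and the iteration count are assumptions chosen for illustration, not values from the article. The Normal Equation is solved here with np.linalg.solve rather than an explicit inverse.

```python
import numpy as np

# Synthetic data (hypothetical): intercept column plus two standard-normal features
rng = np.random.default_rng(0)
m = 200
X = np.c_[np.ones(m), rng.normal(size=(m, 2))]
true_theta = np.array([4.0, 3.0, -2.0])
y = X @ true_theta + rng.normal(scale=0.1, size=m)

def gradient_descent(X, y, alpha=0.1, iterations=1000):
    """Batch gradient descent on J(theta) = ||X theta - y||^2 / (2m)."""
    theta = np.zeros(X.shape[1])
    for _ in range(iterations):
        gradient = X.T @ (X @ theta - y) / len(y)
        theta -= alpha * gradient  # step size controlled by alpha
    return theta

def normal_equation(X, y):
    """Closed-form solution of X^T X theta = X^T y, with no alpha and no loop."""
    return np.linalg.solve(X.T @ X, X.T @ y)

theta_gd = gradient_descent(X, y)
theta_ne = normal_equation(X, y)
print("gradient descent:", theta_gd)
print("normal equation: ", theta_ne)
```

On well-scaled data like this, both approaches should arrive at essentially the same weights. Gradient descent needs alpha and the iteration count tuned, while the Normal Equation needs neither.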

Linear Regression with Normal Equation

Import libraries:

from numpy import genfromtxt
import numpy as np
from matplotlib import pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn import linear_model

Load the Portland data:

data = genfromtxt('portland.csv', delimiter=',')

Feature extraction:

area = data[:, 0]   # first column
rooms = data[:, 1]  # second column
price = data[:, 2]  # third column (target)

Visualize the Area against the Price:

def fitTheta(feature, theta):
    # Predicted price for each feature value under a single weight theta (no intercept)
    return np.dot(theta, feature)

def visualiseFeature(feature, featureLabel, thetaVal=None):
    fig = plt.figure(figsize=(20, 10))
    plt.rcParams.update({'font.size': 10})
    plt.xlabel(featureLabel, fontsize=15)
    plt.ylabel("Price", fontsize=15)
    plt.scatter(feature, price, color="red", s=75)
    # Overlay the fitted line only when a weight is supplied
    if thetaVal is not None:
        thetaFit = fitTheta(feature, thetaVal)
        plt.plot(feature, thetaFit)

visualiseFeature(area, "Area (Square Feet)")

Visualize the Number of Rooms against the Price of the House:

visualiseFeature(rooms, "Number of Rooms")

Here, the relationship between the number of rooms and the price of the house appears to be linear.

Define Feature Matrix, and Outcome/Target Vector:

X_data = data[:, 0:2] #Feature Matrix
y = data[:, 2] #Outcome Vector

Calculate Cost Function:

def getMSE(feature, thetaRange):
    thetaRange = np.ravel(thetaRange)
    # Residuals for every candidate theta: one row per example, one column per theta
    costs = np.outer(feature, thetaRange) - price.reshape(-1, 1)
    # Square each residual before summing, then divide by 2m
    MSE = np.sum(costs ** 2, axis=0) / (2 * price.shape[0])
    return MSE
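As a self-contained illustration of evaluating the cost J(Θ) over a range of weights, here is a minimal sketch for a single feature with no intercept. The numbers and the names feature, target, and cost are made up for this sketch and are not taken from the Portland data.

```python
import numpy as np

# Hypothetical single-feature data, roughly following target = 2 * feature
feature = np.array([1.0, 2.0, 3.0, 4.0])
target = np.array([2.1, 3.9, 6.2, 7.8])

def cost(feature, target, thetas):
    """J(theta) = sum((theta * x - y)^2) / (2m), evaluated for each theta."""
    residuals = np.outer(feature, thetas) - target.reshape(-1, 1)
    return np.sum(residuals ** 2, axis=0) / (2 * len(target))

thetas = np.arange(0.0, 4.0, 0.01)
J = cost(feature, target, thetas)

# The grid minimum should sit next to the closed-form least-squares weight
best_on_grid = thetas[np.argmin(J)]
closed_form = feature @ target / (feature @ feature)
print(best_on_grid, closed_form)
```

Because J is a convex quadratic in the weight, the lowest point on the grid lies within one grid step of the closed-form minimizer.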

Visualize Cost Function:

def visualiseLoss(feature, featureName, startInterval, endInterval, stepSize=0.5, thetaVal=None):
    thetaRange = np.arange(startInterval, endInterval, stepSize)
    Loss = getMSE(feature, thetaRange)

    fig = plt.figure(figsize=(20, 10))
    plt.title("Loss Function for the Feature: {}".format(featureName), fontsize=25)
    plt.ylabel("Cost Function, J(Θ)", fontsize=15)
    plt.xlabel("Weight(s), Θ", fontsize=15)
    plt.plot(thetaRange, Loss, zorder=1)

    # Mark the cost at a specific weight only when one is supplied
    if thetaVal is not None:
        thetaLoss = getMSE(feature, np.array(thetaVal).reshape(1, 1))[0]
        plt.scatter(thetaVal, thetaLoss, marker="x", linewidth=5, color="red", s=200, zorder=2)
        plt.annotate("Theta = {}".format(thetaVal), (thetaVal, thetaLoss), fontsize=25)

visualiseLoss(area, "Area", -500, 800)

visualiseLoss(rooms, "Rooms", -40000, 250000)

Split Data

X_data = data[:, 0:2] #Feature Matrix
y = data[:, 2] #Outcome Vector
X = np.c_[np.ones(X_data.shape[0]), X_data]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

Normal Equation

firstTerm = np.linalg.inv(np.dot(X_train.T, X_train))
secondTerm = np.dot(X_train.T, y_train)
Theta = np.dot(firstTerm, secondTerm)
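The explicit inverse above mirrors the formula directly. As a hedged alternative, the same weights can be obtained without forming the inverse, using np.linalg.solve or np.linalg.lstsq. The sketch below uses made-up training data, so X_train and y_train here are hypothetical stand-ins for the Portland split.

```python
import numpy as np

# Hypothetical training data: intercept column plus two features
rng = np.random.default_rng(42)
X_train = np.c_[np.ones(50), rng.uniform(0, 10, size=(50, 2))]
y_train = X_train @ np.array([1.5, 2.0, -0.5]) + rng.normal(scale=0.01, size=50)

# 1) Normal Equation with an explicit inverse, as in the article
theta_inv = np.linalg.inv(X_train.T @ X_train) @ (X_train.T @ y_train)

# 2) Solve the linear system X^T X theta = X^T y directly
theta_solve = np.linalg.solve(X_train.T @ X_train, X_train.T @ y_train)

# 3) Least-squares solver working on X itself
theta_lstsq, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

print(theta_inv, theta_solve, theta_lstsq)
```

All three should agree on well-conditioned data. The solver-based variants are generally preferred numerically, and lstsq also copes with rank-deficient inputs.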

Visualize:

visualiseLoss(area, "Area", -500, 800, thetaVal=Theta[1])
visualiseFeature(area, "Area (Square Feet)", thetaVal=Theta[1])

Prediction using Normal Equation theta value

normal_predictions = np.dot(X_test, Theta)
normal_predictions

Prediction using Linear Regression

reg = linear_model.LinearRegression()
reg.fit(X_train, y_train)
sk_predictions = reg.predict(X_test)
sk_predictions

Here, the predictions from the Normal Equation and from scikit-learn's LinearRegression are the same.

Normal Equation Non-Invertibility

A square matrix that does not have an inverse is called singular (non-invertible). A matrix is singular if and only if its determinant is zero.

Example:

import numpy as np

A = [[2, 6],
     [1, 3]]

The inverse of the matrix:

inverse_A = np.linalg.inv(A)
inverse_A

Error from NumPy:

LinAlgError: Singular matrix
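A minimal sketch of how the singular case can be detected and handled in code. It uses the same matrix A, checks its determinant and rank, catches the error raised by np.linalg.inv, and falls back to the pseudo-inverse. The fallback is one possible choice, not the article's prescribed method.

```python
import numpy as np

A = np.array([[2.0, 6.0],
              [1.0, 3.0]])  # second row is half of the first

det_A = np.linalg.det(A)            # zero, so A is singular
rank_A = np.linalg.matrix_rank(A)   # 1, which is less than the matrix size 2

try:
    np.linalg.inv(A)
    inverted = True
except np.linalg.LinAlgError as err:
    inverted = False
    print("inv failed:", err)

# The Moore-Penrose pseudo-inverse exists even for singular matrices
A_pinv = np.linalg.pinv(A)
print(A_pinv)
```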

Common causes of non-invertibility:

  • Redundant features (linearly dependent columns)
  • Too many features (more features than training examples)

How to solve if there are too many features?

  • Delete some features or use regularization
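The regularization option can be sketched as a regularized Normal Equation, where a multiple of the identity is added to X^T X before solving. The data below is made up, with a deliberately redundant third column, and the value lam = 0.1 is an arbitrary assumption.

```python
import numpy as np

# Hypothetical data: the third column is exactly twice the second (redundant feature)
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 4.0],
              [1.0, 3.0, 6.0],
              [1.0, 4.0, 8.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])  # y = 1 + 2 * (second column)

rank_X = np.linalg.matrix_rank(X)   # 2, so X^T X (3 x 3) is singular

lam = 0.1                           # regularization strength (assumed value)
penalty = lam * np.eye(X.shape[1])
penalty[0, 0] = 0.0                 # leave the intercept term unpenalized

# Regularized Normal Equation: (X^T X + penalty) theta = X^T y
theta = np.linalg.solve(X.T @ X + penalty, X.T @ y)
predictions = X @ theta
print(theta, predictions)
```

With the penalty in place the system has a unique solution even though the features are redundant. The predictions stay close to y because lam is small relative to the scale of the data.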

Conclusion

Gradient descent is one way to minimize J; the Normal Equation is another. It minimizes J without resorting to an iterative algorithm. However, the Normal Equation becomes very slow when the number of features is very large.


Normal Equation in Linear Regression was originally published in Towards AI - Multidisciplinary Science Journal on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
