Learn With Me: Linear Algebra for Data Science — Part 3: Eigenvectors

Last Updated on July 26, 2023 by Editorial Team

Author(s): Matthew Macias

Originally published on Towards AI.

I know what you are thinking: what on earth is an eigenvector? Is that even an actual word?

Source: Photo by Ben White on Unsplash

Yes, they are real. They are also extremely helpful for certain operations in Data Science. If you read Part 2 of this series, you’ll know that I encouraged you to start to think of matrices as a means to perform linear transformations. A common way to think about this is through the classic Ax = b example, where matrix A transforms the input vector x into the output vector b.

So, what is an Eigenvector?

An eigenvector is probably best explained through an example. Let’s start with a matrix A and a vector x:

Source: image generated by author

For those of you who have read the earlier parts of this series, the above matrix-vector product will be a piece of cake! The matrix A transforms vector x into a new vector b, which has a value of [4, 3]. Let’s see this transformation visually.

Source: image generated by author
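
For readers who want to follow along in code, here is a minimal sketch of the same kind of transformation. The matrix and vector are hypothetical values picked for illustration (they are not the ones from the figures in this article), and `angle_deg` is a helper name introduced here.

```python
import numpy as np

# Hypothetical values for illustration only; not the article's A and x.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
x = np.array([1.0, 0.0])

# Transforming x by A is a matrix-vector product.
b = A @ x


def angle_deg(v):
    """Direction of a 2D vector in degrees, measured from the positive x-axis."""
    return float(np.degrees(np.arctan2(v[1], v[0])))


print("b =", b)
print("length:", np.linalg.norm(x), "->", np.linalg.norm(b))
print("angle :", angle_deg(x), "->", angle_deg(b))
```

Both the length and the angle change for this particular vector, which is the combination of stretching and rotating that the plot illustrates.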

We could describe this transformation as a rotating and stretching of the original vector x. We can go one step further and visualize a field of vectors represented by a circle. Imagine that each point that lies on this circle is the tip of a vector that originates at the origin. Now we can see how all different vectors would transform as a result of matrix A.

Source: image generated by author

As you can see, the original circle gets rotated and stretched out quite a bit. If we imagine those points along the circle as vectors, then the same would be said for them. But what if there was a vector that wasn’t rotated during the transformation? How would we find it? And better yet, why should we care about it?

Well, that’s exactly what an eigenvector is: a non-zero vector that is not rotated off its own line by a linear transformation, only scaled (a negative scale factor flips it to point the opposite way along that line). Formally we can denote it as:

Ax = λx

Where λ (lambda) is referred to as an eigenvalue. It’s just a scalar that tells us by how much the eigenvector was shortened or lengthened by the linear transformation. Only square matrices have eigenvectors, and even a square matrix may have no real eigenvalues at all (a pure rotation by 90 degrees, for example, turns every vector off its line). Each eigenvector has a corresponding eigenvalue that tells us about the scaling.
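
The definition can be checked directly in code. The sketch below assumes a hypothetical 2×2 matrix (not the article's A) and a helper called `scale_factor`, a name introduced here, which returns the scalar λ when A @ v is a multiple of v and `None` otherwise.

```python
import numpy as np

# Hypothetical matrix for illustration only; not the article's A.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])


def scale_factor(A, v, tol=1e-9):
    """Return lambda if A @ v equals lambda * v, otherwise None."""
    Av = A @ v
    lam = (v @ Av) / (v @ v)  # best-fitting scalar for Av = lam * v
    return lam if np.allclose(Av, lam * v, atol=tol) else None


print(scale_factor(A, np.array([1.0, 1.0])))   # 5.0  -> eigenvector
print(scale_factor(A, np.array([1.0, -2.0])))  # 2.0  -> eigenvector
print(scale_factor(A, np.array([1.0, 0.0])))   # None -> rotated, not an eigenvector
```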

Continuing our example above, let’s see what an eigenvector and eigenvalue of matrix A look like.

Source: image generated by author

We can see that before and after the transformation, the eigenvector does not rotate off its original axis. This eigenvector also has an eigenvalue of 11, which intuitively makes sense because the vector after transformation looks about 11 times longer.
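
One way to hunt for unrotated directions numerically is to transform every vector on the unit circle and look for the ones whose direction is unchanged. The sketch below does this for a hypothetical matrix (not the article's A); the sampling resolution and the variable names are choices made here for illustration.

```python
import numpy as np

# Hypothetical matrix for illustration only; not the article's A.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Unit vectors around the circle, sampled every 0.1 degrees. The half-step
# offset keeps the samples from landing exactly on an unrotated direction.
n = 3600
step = 360.0 / n
theta_deg = (np.arange(n) + 0.5) * step
theta = np.radians(theta_deg)
circle = np.stack([np.cos(theta), np.sin(theta)])  # shape (2, n)
transformed = A @ circle

# 2D cross product of each vector with its image: positive when A turns the
# vector counter-clockwise, negative when clockwise, zero when it is not turned.
cross = circle[0] * transformed[1] - circle[1] * transformed[0]

# A sign change between neighbouring samples brackets an unrotated direction.
flips = np.where(np.sign(cross) != np.sign(np.roll(cross, -1)))[0]
unrotated_deg = theta_deg[flips] + step / 2

print(unrotated_deg)
```

For this matrix the scan finds four angles, at roughly 45, 116.6, 225 and 296.6 degrees. They come in opposite pairs, so they describe just two lines through the origin: one along [1, 1] and one along [1, -2].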

Calculating Eigenvectors

If all you want to know is what Python code to execute to return both the eigenvalues and eigenvectors, here you are:

import numpy as np
eig_vals, eig_vecs = np.linalg.eig(A)  # A: the square matrix from the example above
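
A slightly fuller sketch of that call, using a hypothetical matrix rather than the article's A, shows how to read the result: `np.linalg.eig` returns the eigenvalues as a 1D array and the eigenvectors as the columns of a 2D array, each scaled to unit length.

```python
import numpy as np

# Hypothetical matrix for illustration only; not the article's A.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eig_vals, eig_vecs = np.linalg.eig(A)

# The i-th COLUMN of eig_vecs is the eigenvector paired with eig_vals[i].
for lam, v in zip(eig_vals, eig_vecs.T):
    holds = np.allclose(A @ v, lam * v)
    print(f"lambda = {lam:.3f}, v = {v}, A v == lambda v: {holds}")
```

The order in which the eigenvalues come back is not guaranteed, so it is safer to sort them yourself than to rely on position.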

For the more curious among you, we will go through how you might derive the eigenvalues and eigenvectors manually. I will preface the math by saying you will probably never do this in practice, but it’s really good to get an understanding of what is going on under the hood of the NumPy functions.

Let’s go back to our original math form of eigenvectors and eigenvalues.

Ax = λx

The first thing we will do is add an identity matrix into the mix. These matrices don’t change anything (they are the equivalent of multiplying a number by 1), but they simplify the calculation. Our new equation becomes:

Ax = λIx

Let’s bring everything to one side because having things equal to zero makes it exciting! We will also factor out an x as that will be a commonality.

(A - λI)x = 0

Remember that an eigenvector x cannot be the zero vector. If the matrix inside the brackets were invertible, we could multiply both sides by its inverse and be left with x = 0, which is not allowed. So the matrix inside the brackets must be non-invertible, and that tells us something pretty important: its determinant must equal 0. That is:

det(A - λI) = 0
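
To see the determinant condition in action, the sketch below evaluates the determinant of (A - λI) for a few trial values of λ. The matrix is hypothetical (not the article's A) and `char_det` is a helper name introduced here; for this matrix the determinant vanishes only at the eigenvalues 2 and 5.

```python
import numpy as np

# Hypothetical matrix for illustration only; not the article's A.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])


def char_det(A, lam):
    """det(A - lam * I) for a square matrix A and a scalar lam."""
    return float(np.linalg.det(A - lam * np.eye(A.shape[0])))


for lam in (0.0, 1.0, 2.0, 3.0, 5.0):
    print(f"lambda = {lam}: det(A - lambda*I) = {char_det(A, lam):.6f}")
```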

Next, we will calculate the matrix within the brackets using our original A matrix:

Source: image generated by the author

Once we have that matrix, we can go about solving for its determinant. For a 2×2 matrix that is relatively simple, but I would not recommend trying it by hand for anything larger. In this case, it’s just (top left * bottom right) - (bottom left * top right). This results in an equation that we can factorize to find our eigenvalues:

Source: image generated by author
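
The same steps can be sketched in code for a hypothetical 2×2 matrix (not the article's A). Expanding the determinant of (A - λI) gives a quadratic in λ, which is solved here with the quadratic formula; the sketch assumes the eigenvalues are real and raises an error otherwise.

```python
import math
import numpy as np

# Hypothetical matrix for illustration only; not the article's A.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
(a, b), (c, d) = A

trace = a + d
det = a * d - b * c

# det(A - lam*I) = (a - lam)(d - lam) - b*c = lam^2 - trace*lam + det
disc = trace**2 - 4.0 * det
if disc < 0:
    raise ValueError("this matrix has no real eigenvalues")

lam_1 = (trace + math.sqrt(disc)) / 2.0
lam_2 = (trace - math.sqrt(disc)) / 2.0
print(lam_1, lam_2)  # 5.0 2.0

# Cross-check against NumPy.
print(np.sort(np.linalg.eigvals(A)))
```

Here the quadratic is λ² - 7λ + 10, which factorizes as (λ - 5)(λ - 2).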

So far, it all looks good: we are getting the same eigenvalues as we did with Python. But the work is not done, because we still need to use our eigenvalues to calculate our eigenvectors. Using what we have defined so far, let's fill in the standard equation for eigenvectors:

Source: image generated by author

We actually have an eigenvalue now that we can use as input into the above equation. Using 11, we will get a system of equations like the one below:

Source: image generated by author

These are pretty straightforward equations, and they are both solved by:

Source: image generated by author

So any non-zero vector that satisfies the above criteria will lie along our eigenvector, which is why any non-zero multiple of an eigenvector is also an eigenvector. There you have it, you’ve just calculated the eigenvalues and eigenvectors for matrix A. But there is still one major question looming…
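
Going from an eigenvalue to its eigenvector can also be sketched in code. The example uses a hypothetical matrix with eigenvalues 5 and 2 (not the article's A) and a helper named `eigenvector_for`, introduced here. It finds a non-zero solution of (A - λI)x = 0 from the singular value decomposition, which is one of several possible approaches and assumes the value passed in really is an eigenvalue.

```python
import numpy as np

# Hypothetical matrix for illustration only; not the article's A.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])


def eigenvector_for(A, lam):
    """A unit vector x with (A - lam*I) x = 0, assuming lam is an eigenvalue."""
    M = A - lam * np.eye(A.shape[0])
    # Singular values come back in descending order, so the last row of vh
    # belongs to the smallest one and spans the null space of a singular M.
    _, _, vh = np.linalg.svd(M)
    return vh[-1]


v = eigenvector_for(A, 5.0)
print(v / v[0])                                 # rescaled so the first entry is 1
print(np.allclose(A @ v, 5.0 * v))              # True
print(np.allclose(A @ (3 * v), 5.0 * (3 * v)))  # True: multiples work too
```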

Who cares?

Source: Photo by Matthew Henry on Unsplash

What was the point of all of this? I hear you say. Well, believe it or not, this does have its place in Data Science. We will briefly discuss one major application of eigenvalues and eigenvectors.

Principal Components Analysis (PCA)

You may or may not have heard of principal components analysis. It is perhaps the most widely used dimensionality reduction technique in Data Science. It reduces the amount of data required to represent a system whilst retaining as much information as possible. PCA effectively takes the eigenvectors of the data’s covariance matrix and returns them as what are known as principal components, with each eigenvalue measuring how much of the variance lies along its component. We will cover PCA in more depth in later parts of the series, but for now, just know how important it is!
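
As a taste of what is to come, here is a minimal sketch of PCA built directly on an eigendecomposition. The data is synthetic and generated just for this example, and the approach is a bare-bones illustration under those assumptions, not necessarily how any particular library implements PCA.

```python
import numpy as np

# Synthetic, correlated 2D data generated only for this illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) @ np.array([[2.0, 0.0],
                                          [1.2, 0.5]])

# Centre the data and compute its covariance matrix (columns are features).
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)

# eigh is intended for symmetric matrices and returns eigenvalues in
# ascending order, so reorder to put the largest first.
eig_vals, eig_vecs = np.linalg.eigh(cov)
order = np.argsort(eig_vals)[::-1]
eig_vals, eig_vecs = eig_vals[order], eig_vecs[:, order]

# Share of the total variance captured by each principal component.
explained = eig_vals / eig_vals.sum()
print("explained variance ratio:", explained)

# Reduce from 2 dimensions to 1 by projecting onto the top component.
X_reduced = X_centered @ eig_vecs[:, :1]
print("reduced shape:", X_reduced.shape)
```

The variance of the projected data equals the largest eigenvalue, which is the sense in which the first principal component retains as much information as a single direction can.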


Hopefully, you enjoyed another part of the linear algebra series. Be sure to check out Part 1 and Part 2 if there was anything you were unsure about while going through this article. We have laid a pretty nice foundation up to this point, and we will start to get into more practical applications of linear algebra, such as PCA and singular value decomposition (SVD).


Published via Towards AI
