The Covariance and Correlation Clutter…

Last Updated on July 25, 2023 by Editorial Team

Author(s): Astha Puri

Originally published on Towards AI.

For the longest time, I remember being confused between these two devils — covariance and correlation. And the resemblance DID NOT help! 🙁

So here I am, writing my first post and making an attempt to simplify the massive world of data — and stats that come with it. I’ll try to keep my posts short and sweet. I hope they help the impatient newbies like me out there to stay motivated. So let's learn and crush the clutter!!

Okay, try to put yourself in the scenarios described below. How would you feel if:
1. you’re in a race but you’re not told how long it is. The deal is that every minute you’re told the distance you’ve covered.
2. your friend takes you to a concert and, at the top of each hour, you’re told how much time has passed but not how long the concert is.
3. you’re writing an exam. You’ve got to finish all the questions BUT you don’t know how long the exam is! (This one gives me the chills...)

Anyway, pretty scary, huh? Okay, let's get back to statistics. Maybe it wouldn’t sound so scary now!

So, covariance and correlation both tell us how two variables are related. Does an increase in one go along with an increase in the other?

  1. Yes? Good. Then you say the two variables have a positive correlation and a positive covariance. For example, the more water you drink, the more you pee!
  2. No? Okay then... maybe you have two variables that are completely independent of each other, so both the covariance and the correlation sit at (or very near) zero. For example, how many hours I sleep does not affect how much rain South Dakota gets.
  3. Hang on! Is an increase in one variable decreasing the other? Voilà! You've got a negative correlation and a negative covariance. Umm, let's see… say, the more I eat, the thinner I get? Haha, I wish.
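The three cases above can be sketched in a few lines. This is only an illustration, assuming NumPy is available; the seed, the sample size, and the made-up variables `x`, `y_up`, `y_indep`, and `y_down` are assumptions for the demo, not data from this post.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (arbitrary) so reruns match
n = 10_000

x = rng.normal(size=n)
y_up = 2.0 * x + rng.normal(size=n)     # tends to rise when x rises
y_indep = rng.normal(size=n)            # generated without looking at x
y_down = -1.5 * x + rng.normal(size=n)  # tends to fall when x rises


def cov_and_corr(a, b):
    """Return (covariance, correlation) for two 1-D arrays."""
    return np.cov(a, b)[0, 1], np.corrcoef(a, b)[0, 1]


results = {
    "moves together": cov_and_corr(x, y_up),
    "independent": cov_and_corr(x, y_indep),
    "moves opposite": cov_and_corr(x, y_down),
}

for label, (cov, corr) in results.items():
    print(f"{label:>15}: covariance = {cov:+.3f}, correlation = {corr:+.3f}")
```

In each case the covariance and the correlation carry the same sign; the independent pair should land close to zero, though with random samples it will rarely be exactly zero.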

Anyway…so then why two different terms? Haven’t we got enough in this world to learn already?

Well, remember the three scary scenarios I gave above? What made them scary? I don’t mind giving an exam or participating in a race. Just tell me how long each of them is, and I’ll be okay!

That is exactly the difference between covariance and correlation. Covariance values have no bound (they scale with the units of the variables), but correlation always stays between -1 and 1.
So I could say A and B have a covariance of 20 or a covariance of 50. How do you judge the strength from that? There is no upper limit to compare against. So we know that A and B are moving together, but how strongly? That's where correlation comes in. I could tell you A and B have a correlation of 0.3 or 0.7 or any other value for that matter. Because 1 is the upper bound of correlation, that number gives you a much better picture of the strength of the relationship!
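One way to see the "no bound" problem is to change the units of A and watch what happens. The sketch below assumes NumPy, and the data is hypothetical (an invented A measured in metres, and a B loosely tied to it). It also checks the standard relationship that correlation is the covariance divided by the two standard deviations.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, chosen arbitrarily

# Hypothetical data: A in metres, B loosely tied to A.
a_metres = rng.normal(1.7, 0.1, size=500)
b = 40 * a_metres + rng.normal(0, 5, size=500)

# The very same measurements, just expressed in centimetres.
a_cm = a_metres * 100

cov_metres = np.cov(a_metres, b)[0, 1]
cov_cm = np.cov(a_cm, b)[0, 1]
corr_metres = np.corrcoef(a_metres, b)[0, 1]
corr_cm = np.corrcoef(a_cm, b)[0, 1]

# Correlation = covariance / (std of A * std of B), using sample std (ddof=1)
# to match the n - 1 divisor that np.cov uses by default.
corr_by_hand = cov_metres / (np.std(a_metres, ddof=1) * np.std(b, ddof=1))

print(f"covariance  (metres): {cov_metres:.4f}")
print(f"covariance  (cm)    : {cov_cm:.4f}")
print(f"correlation (metres): {corr_metres:.4f}")
print(f"correlation (cm)    : {corr_cm:.4f}")
print(f"correlation by hand : {corr_by_hand:.4f}")
```

The covariance grows by a factor of 100 just because the unit changed, while the correlation stays put. That is why a bare covariance number is hard to read and a correlation is not.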

Let's look at an example in Python:

First, we use a random number generator to generate arrays:
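A minimal sketch of this step, assuming NumPy. The seed, the array length, and the value ranges are illustrative assumptions, so the numbers it produces will not match the ones quoted in this post.

```python
import numpy as np

rng = np.random.default_rng(42)  # assumed seed, only for repeatability

# First array: 1000 random values between 0 and 10 (assumed range).
first = rng.uniform(0, 10, size=1000)

# Second array: each value is the matching first value plus a random bump,
# so the second array rises whenever the first one does.
second = first + rng.uniform(0, 2, size=1000)

print(first[:5])
print(second[:5])
```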

Because of the way the arrays are generated, each value in the second array is built up from the corresponding value in the first array, so the two rise together. Let's calculate the covariance matrix.
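Here is one way this could look with `np.cov`, again as a sketch under assumed data (same illustrative seed and ranges, not the author's), so expect a covariance different from the 3.15 discussed in the text.

```python
import numpy as np

rng = np.random.default_rng(42)  # assumed seed, only for repeatability
first = rng.uniform(0, 10, size=1000)
second = first + rng.uniform(0, 2, size=1000)

# np.cov on two 1-D arrays returns a 2x2 matrix of sample covariances
# (it divides by n - 1 by default).
cov_matrix = np.cov(first, second)

print(cov_matrix)
print("covariance between the two arrays:", cov_matrix[0, 1])
```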

The diagonal of the matrix holds the covariance of each variable with itself (that is, its variance). The off-diagonal values hold the covariance between the two variables.

So the covariance here is 3.15. On its own, 3.15 doesn't tell us much without context, right? All we know is that there is a positive relationship, but how strong is it?

Let's look at the correlation.
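A sketch of the correlation step using `np.corrcoef`, under the same assumed data as before (illustrative seed and ranges, so the exact value is specific to this demo).

```python
import numpy as np

rng = np.random.default_rng(42)  # assumed seed, only for repeatability
first = rng.uniform(0, 10, size=1000)
second = first + rng.uniform(0, 2, size=1000)

# np.corrcoef returns a 2x2 matrix of Pearson correlation coefficients:
# ones on the diagonal, the correlation between the arrays off the diagonal.
corr_matrix = np.corrcoef(first, second)

print(corr_matrix)
print("correlation between the two arrays:", corr_matrix[0, 1])
```

Whatever the data, the off-diagonal value stays between -1 and 1, so it can be read directly as a strength.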

Because the value is bounded by 1, this tells us much more: the two arrays are strongly correlated.

Great…I hope you enjoyed the read 🙂 Au revoir, for now..


Published via Towards AI
