How Much Math do I need in Data Science?

Last Updated on August 12, 2020 by Editorial Team

Author(s): Benjamin Obi Tayo Ph.D.

Image by Benjamin O. Tayo

Math skills are essential in data science and machine learning

I. Introduction

If you are a data science aspirant, you no doubt have the following questions in mind:

Can I become a data scientist with little or no math background?

What essential math skills are important in data science?

There are so many good packages that can be used for building predictive models or for producing data visualizations. Some of the most common packages for descriptive and predictive analytics include:

  • ggplot2
  • Matplotlib
  • Seaborn
  • Scikit-learn
  • caret
  • TensorFlow
  • PyTorch
  • Keras

Thanks to these packages, anyone can build a model or produce a data visualization. However, a solid background in mathematics is essential for fine-tuning your models so that they are reliable and perform optimally. It is one thing to build a model; it is another to interpret it and draw meaningful conclusions that can be used for data-driven decision making. Before using these packages, it’s important to understand the mathematical basis of each, so that you are not using them simply as black-box tools.

II. Case Study: Building A Multiple Regression Model

Let’s suppose we are going to build a multiple regression model. Before doing that, we need to ask ourselves the following questions:

How big is my dataset?

What are my feature variables and target variable?

What predictor features correlate the most with the target variable?

What features are important?

Should I scale my features?

How should my dataset be partitioned into training and testing sets?

What is principal component analysis (PCA)?

Should I use PCA for removing redundant features?

How do I evaluate my model? Should I use the R2 score, MSE, or MAE?

How can I improve the predictive power of the model?

Should I use regularized regression models?

What are the regression coefficients?

What is the intercept?

Should I use non-parametric regression models such as KNeighbors regression or support vector regression?

What are the hyperparameters in my model, and how can they be fine-tuned to obtain the model with optimal performance?

Without a sound math background, you would not be able to answer the questions raised above. The bottom line is that in data science and machine learning, mathematical skills are as important as programming skills. As a data science aspirant, it is therefore essential that you invest time in studying the theoretical and mathematical foundations of data science and machine learning. Your ability to build reliable and efficient models that can be applied to real-world problems depends on how good your mathematical skills are. To see how math skills are applied in building a machine learning regression model, see this article: Machine Learning Process Tutorial.
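To make a few of the questions above concrete (regression coefficients, intercept, train/test partitioning, and the R2, MSE, and MAE scores), here is a minimal sketch in plain Python. It is not the workflow from the tutorial mentioned above. The synthetic dataset, the 80/20 split, and all helper names (solve_linear_system, fit_linear_regression, and so on) are assumptions made purely for illustration, and the fit uses the ordinary least squares normal equations rather than any particular package.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy [a | b] so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_linear_regression(X, y):
    """Return (intercept, coefficients) from the normal equations X^T X beta = X^T y."""
    rows = [[1.0] + list(r) for r in X]  # column of ones models the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return beta[0], beta[1:]

def predict(X, intercept, coefs):
    return [intercept + sum(c * v for c, v in zip(coefs, row)) for row in X]

def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Synthetic data (an assumption): y = 3 + 2*x1 - 1.5*x2 + small Gaussian noise.
random.seed(0)
X = [[random.uniform(0, 10), random.uniform(0, 10)] for _ in range(100)]
y = [3.0 + 2.0 * x1 - 1.5 * x2 + random.gauss(0, 0.1) for x1, x2 in X]

# The rows are already in random order, so slicing gives an 80/20 split.
X_train, X_test = X[:80], X[80:]
y_train, y_test = y[:80], y[80:]

intercept, coefs = fit_linear_regression(X_train, y_train)
y_pred = predict(X_test, intercept, coefs)

print("intercept:", intercept)
print("coefficients:", coefs)
print("R2:", r2_score(y_test, y_pred))
print("MSE:", mse(y_test, y_pred))
print("MAE:", mae(y_test, y_pred))
```

Because the data were generated from known weights, the fitted intercept and coefficients should land close to 3, 2, and -1.5. Scoring on the held-out rows rather than the training rows is what makes the metrics an estimate of predictive power.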

Let’s now discuss some of the essential math skills needed in data science and machine learning.

III. Essential Math Skills for Data Science and Machine Learning

1. Statistics and Probability

Statistics and probability are used for visualization of features, data preprocessing, feature transformation, data imputation, dimensionality reduction, feature engineering, model evaluation, etc.

Here are the topics you need to be familiar with: Mean, Median, Mode, Standard deviation/variance, Correlation coefficient and the covariance matrix, Probability distributions (Binomial, Poisson, Normal), p-value, Bayes’ Theorem (Precision, Recall, Positive Predictive Value, Negative Predictive Value, Confusion Matrix, ROC Curve), Central Limit Theorem, R2 score, Mean Square Error (MSE), A/B Testing, Monte Carlo Simulation
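As a small illustration of a few of these topics, the sketch below computes descriptive statistics, a Pearson correlation coefficient, and a positive predictive value via Bayes’ theorem. The numbers (study hours, scores, and the test's sensitivity, specificity, and prevalence) are made up for the example, and the helper names are hypothetical.

```python
import math
import statistics

def pearson_correlation(x, y):
    """Correlation coefficient: covariance divided by the product of spreads."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def positive_predictive_value(sensitivity, specificity, prevalence):
    """Bayes' theorem: P(condition | positive test)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Made-up sample: hours studied and exam scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

mean_hours = statistics.mean(hours)
median_scores = statistics.median(scores)
stdev_hours = statistics.stdev(hours)  # sample standard deviation (n - 1)
r = pearson_correlation(hours, scores)

# Made-up diagnostic test: 90% sensitivity, 95% specificity, 1% prevalence.
ppv = positive_predictive_value(0.90, 0.95, 0.01)

print("mean hours:", mean_hours)
print("median score:", median_scores)
print("stdev hours:", stdev_hours)
print("correlation:", r)
print("positive predictive value:", ppv)
```

The positive predictive value comes out at roughly 15%, even though the hypothetical test is 90% sensitive. This is the kind of result that is hard to interpret correctly without knowing Bayes’ theorem.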

2. Multivariable Calculus

Most machine learning models are built with a dataset having several features or predictors. Hence, familiarity with multivariable calculus is extremely important for building a machine learning model.

Here are the topics you need to be familiar with: Functions of several variables; Derivatives and gradients; Step function, Sigmoid function, Logit function, ReLU (Rectified Linear Unit) function; Cost function; Plotting of functions; Minimum and Maximum values of a function
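The functions listed above are short enough to write out directly. The following sketch defines the step, sigmoid, logit, and ReLU functions and approximates the gradient of a function of two variables with central differences. The example function f(x, y) = x^2 + 3xy and the step size h are assumptions chosen for illustration.

```python
import math

def step(z):
    """Step function: 1 for non-negative input, 0 otherwise."""
    return 1.0 if z >= 0 else 0.0

def sigmoid(z):
    """Squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Inverse of the sigmoid: maps a probability in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))

def relu(z):
    """Rectified Linear Unit: passes positive values, zeroes out negatives."""
    return max(0.0, z)

def numerical_gradient(f, point, h=1e-6):
    """Approximate each partial derivative of f at point by central differences."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2.0 * h))
    return grad

def f(v):
    """Example function of two variables; its exact gradient is (2x + 3y, 3x)."""
    x, y = v
    return x ** 2 + 3.0 * x * y

grad = numerical_gradient(f, [1.0, 2.0])
print("sigmoid(0):", sigmoid(0.0))
print("relu(-2), relu(2):", relu(-2.0), relu(2.0))
print("gradient of f at (1, 2):", grad)
```

At the point (1, 2) the exact gradient is (8, 3), so the numerical result can be checked by hand. Comparing a numerical gradient against an analytical one is a common way to catch mistakes in a derivation.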

3. Linear Algebra

Linear algebra is the most important math skill in machine learning. A data set is represented as a matrix. Linear algebra is used in data preprocessing, data transformation, dimensionality reduction, and model evaluation.

Here are the topics you need to be familiar with: Vectors; Norm of a vector; Matrices; Transpose of a matrix; The inverse of a matrix; The determinant of a matrix; Trace of a Matrix; Dot product; Eigenvalues; Eigenvectors
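To see what several of these operations actually compute, here is a minimal sketch restricted to vectors and 2x2 matrices stored as plain Python lists. The helper names and the example matrix are assumptions for illustration, and the eigenvalue formula used here only applies to symmetric 2x2 matrices with a nonzero off-diagonal entry.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """Euclidean (L2) norm of a vector."""
    return math.sqrt(dot(v, v))

def transpose(m):
    return [list(col) for col in zip(*m)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse2(m):
    """Inverse of a 2x2 matrix; only defined when the determinant is nonzero."""
    d = det2(m)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def matmul(a, b):
    return [[dot(row, col) for col in zip(*b)] for row in a]

def eigen_symmetric2(m):
    """Eigenvalues and (unnormalized) eigenvectors of [[a, b], [b, d]] with b != 0."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    center = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b ** 2)
    values = [center + radius, center - radius]
    vectors = [[b, lam - a] for lam in values]
    return values, vectors

A = [[2.0, 1.0], [1.0, 2.0]]
A_inv = inverse2(A)
identity = matmul(A, A_inv)
eigenvalues, eigenvectors = eigen_symmetric2(A)

print("norm of [3, 4]:", norm([3.0, 4.0]))
print("determinant:", det2(A), "trace:", trace(A))
print("A times its inverse:", identity)
print("eigenvalues:", eigenvalues)
print("eigenvectors:", eigenvectors)
```

For this matrix the eigenvalues are 3 and 1. Their sum equals the trace and their product equals the determinant, which is a quick consistency check. The same eigenvalue and eigenvector machinery underlies principal component analysis, where it is applied to the covariance matrix of the features.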

4. Optimization Methods

Most machine learning algorithms perform predictive modeling by minimizing an objective function on the training data, thereby learning the weights that are then applied to the testing data in order to obtain the predicted labels.

Here are the topics you need to be familiar with: Cost function/Objective function; Likelihood function; Error function; Gradient Descent Algorithm and its variants (e.g. Stochastic Gradient Descent Algorithm)
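The idea of learning weights by minimizing a cost function can be sketched with batch gradient descent on a one-feature linear model. The toy data, learning rate, and number of epochs below are assumptions picked so that the example converges. A stochastic variant would compute the gradient on one randomly chosen sample (or a small batch) per update instead of the full dataset.

```python
def mse_cost(w, b, xs, ys):
    """Mean squared error of the line y = w*x + b on the data."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n

def gradient_descent(xs, ys, learning_rate=0.05, epochs=2000):
    """Minimize the MSE cost by repeatedly stepping against its gradient."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE cost with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Toy data generated from y = 2x + 1, so the minimizer is known in advance.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

initial_cost = mse_cost(0.0, 0.0, xs, ys)
w, b = gradient_descent(xs, ys)
final_cost = mse_cost(w, b, xs, ys)

print("learned weight and bias:", w, b)
print("cost before and after:", initial_cost, final_cost)
```

The learning rate matters: if it is too large for the curvature of the cost function, the updates overshoot and diverge, and if it is too small, convergence takes many more epochs. Understanding why requires the calculus and linear algebra discussed in the previous sections.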

IV. Summary and Conclusion

In summary, we’ve discussed the essential math and theoretical skills that are needed in data science and machine learning. There are several free online courses that teach the math skills you need in data science and machine learning. As a data science aspirant, it’s important to keep in mind that the theoretical foundations of data science are crucial for building efficient and reliable models. You should, therefore, invest enough time to study the mathematical theory behind each machine learning algorithm.

V. References

Linear Regression Basics for Absolute Beginners.

Mathematics of Principal Component Analysis with R Code Implementation.

Machine Learning Process Tutorial.

Bio: Benjamin Tayo is an associate professor of engineering and physics at the University of Central Oklahoma. He is also a data science educator. He obtained his Ph.D. from Lehigh University in computational material sciences. Tayo has published over 100 articles on Medium covering various aspects of data science.


How Much Math do I need in Data Science was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI per author’s request.

Comments (2)

  1. Alabi isama
    August 13, 2020

    Just the article I need now . But can someone with no math background learn all this . What can the writer of the article do to help . Goes have books or courses out there dealing with these topics .
    Lastly can how can I contact the writer.

    1. Stacy
      August 17, 2020

      Hi Alabi,

      You can connect with Dr. Tayo on Linkedin at: https://www.linkedin.com/in/benjamin-o-tayo-ph-d-a2717511/ – he mentors prospect data scientists in his company DataScienceHub
