
The Gradient Descent Algorithm

Last Updated on November 1, 2022 by Editorial Team


Image by Anja from Pixabay

The What, Why, and How of the Gradient Descent Algorithm

Author(s): Pratik Shukla

“The cure for boredom is curiosity. There is no cure for curiosity.” — Dorothy Parker

The Gradient Descent Series of Blogs:

  1. The Gradient Descent Algorithm (You are here!)
  2. Mathematical Intuition behind the Gradient Descent Algorithm
  3. The Gradient Descent Algorithm & its Variants

Table of Contents:

  1. Motivation for the Gradient Descent Series
  2. What is the gradient descent algorithm?
  3. The intuition behind the gradient descent algorithm
  4. Why do we need the gradient descent algorithm?
  5. How does the gradient descent algorithm work?
  6. The formula of the gradient descent algorithm
  7. Why do we use gradients?
  8. A brief introduction to the directional derivatives
  9. What is the direction of the steepest ascent?
  10. An example proving the direction of the steepest ascent
  11. An explanation of the negative (−) sign in the gradient descent algorithm
  12. Why do we need a learning rate?
  13. Some basic rules of differentiation
  14. Gradient descent algorithm for one variable
  15. Gradient descent algorithm for two variables
  16. Conclusion
  17. References and Resources

Motivation for the Gradient Descent Series:

We are pleased to introduce our first blog series on machine learning algorithms! We want to educate our readers on the fundamental principles behind these algorithms. Nowadays, most machine learning algorithms can be implemented with one of the many available Python packages, often in just a few minutes. Convenient, isn’t it? However, many students and professionals struggle as soon as they need to modify an algorithm. To show how machine learning algorithms function at their core, we have developed this series of blogs. We intend to publish short series on more machine learning algorithms in the future, and we hope you will find this one exciting and valuable!

Optimization is at the core of machine learning — it’s a big part of what makes an algorithm’s results “good” in the ways we want them to be. Many machine learning algorithms find the optimal values of their parameters using the gradient descent algorithm. Therefore, understanding the gradient descent algorithm is essential to understanding how AI produces good results.

In the first part of this series, we provide a strong background on the what, why, and how of the gradient descent algorithm. In the second part, we offer a robust mathematical intuition for how the gradient descent algorithm finds the best values of its parameters. In the last part of the series, we compare the variants of the gradient descent algorithm, with elaborated code examples in Python. This series is intended for beginners and experts alike — come one, come all!

What is the Gradient Descent Algorithm?

Wikipedia formally defines gradient descent as follows:

In mathematics, gradient descent is a first-order iterative optimization algorithm for finding a local minimum of a differentiable function.

Gradient descent is an iterative optimization algorithm that finds the optimal values of a model’s parameters. While updating the parameter values, the algorithm considers the function’s gradient, the user-defined learning rate, and the initial values of the parameters.

Intuition Behind the Gradient Descent Algorithm:

Let’s use a metaphor to visualize what gradient descent looks like in action. Say that we’re hiking a mountain and, unfortunately, it begins to rain in the middle of our hike. Our objective is to descend the mountain as rapidly as possible to seek shelter. Because of the rain, we can’t see very far; we can only perceive the terrain immediately around us.

Here’s what comes to mind. We scan the area around us for the step that takes us downhill as rapidly as possible. Once we find that direction, we take a baby step in it, and we keep repeating this until we reach the bottom of the mountain. In essence, this is how the gradient descent method locates a minimum (in general a local minimum; it is also the global minimum when the surface forms a single valley, as in the examples below). Here is how this example maps onto the gradient descent algorithm.

current position → initial parameters
baby step → learning rate
direction → partial derivative (gradient)

Why do We Need the Gradient Descent Algorithm?

In many machine learning models, our ultimate goal is to find the best parameter values that reduce the cost associated with the predictions. To do this, we initially start with random values of these parameters and try to find the optimal ones. To find the optimal values, we use the gradient descent algorithm.

How does the Gradient Descent Algorithm Work?

  1. Start with random initial values for the parameters.
  2. Predict the value of the target variable using the current parameters.
  3. Calculate the cost associated with the predictions.
  4. Have we minimized the cost? If yes, go to Step 6. If not, go to Step 5.
  5. Update the parameter values using the gradient descent update rule and return to Step 2.
  6. We have our final, updated parameters.
  7. Our model is ready to roll (down the mountain)!
Figure — 1: How the gradient descent algorithm works
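
As a preview of the series, here is a minimal sketch of that loop in Python, assuming a single parameter and a user-supplied gradient function (the names are our own illustration, not a library API):

def gradient_descent(gradient, theta, learning_rate=0.1, tolerance=1e-6, max_steps=1000):
    # Steps 2 to 5, repeated until the update becomes negligible.
    for _ in range(max_steps):
        step = learning_rate * gradient(theta)
        if abs(step) < tolerance:  # Step 4: have we (roughly) minimized the cost?
            break
        theta = theta - step       # Step 5: update the parameter
    return theta                   # Step 6: the final, updated parameter

# Example: J(theta) = theta**2 has gradient 2*theta and its minimum at 0.
print(gradient_descent(lambda t: 2 * t, theta=5.0))  # prints a value close to 0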

The Formula of the Gradient Descent Algorithm:

Figure — 2: The formula of the gradient descent algorithm
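
In symbols, the update rule shown above is the standard one: θ_new = θ_old − α · ∇J(θ_old), where θ is the parameter being learned, α is the learning rate, and ∇J is the gradient of the cost function J.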

Now, let’s understand the meaning behind each of the terms mentioned in the above formula. Let’s first start by understanding the directional derivatives.

Note: Our ultimate goal is to find the optimal parameters as quickly as possible. So, we need something that points us in the direction of the fastest improvement at every step.

Why do we use gradients?

Gradients: A gradient is simply a vector whose entries are the partial derivatives of a function.

Suppose we have a function f(x) of one variable, x. In this case, there is only one partial derivative. The partial derivative shown in the image below tells us how quickly the function is changing (increasing or decreasing) in the x direction (along the x-axis). We can write the partial derivative in gradient form as follows.

Figure — 3: Gradient with one element

Let’s say we have a function f(x, y) of two variables, x and y. In this case, we have two partial derivatives. The partial derivatives shown in the image below tell us how quickly the function is changing (increasing or decreasing) in the x and y directions (along the x-axis and y-axis). We can write the partial derivatives in gradient form as follows.

Figure — 4: Gradient with two elements

To generalize this, we can have a function with n variables, and its gradient will have n elements.

Figure — 5: Gradient with n elements
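
Written out, for a function f(x1, x2, …, xn), the gradient is ∇f = [∂f/∂x1, ∂f/∂x2, …, ∂f/∂xn].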

But now the question is: what if we want to find the derivative in a direction other than along one of the axes? From a given point, we can travel in an infinite number of directions. To find the derivative in an arbitrary direction, we use the concept of directional derivatives.

Figure — 6: We can take a step in an infinite number of directions from a point.

A Brief Introduction to the Directional Derivatives:

Unit vector: A unit vector is a vector with a magnitude of 1.

How do we find the length or magnitude of a vector?

Consider the following vector u.

Figure — 7: Vector u

The vector’s length is then calculated as the square root of the sum of the squares of its components.

Figure — 8: The length of vector u

The derivative of a function f(x, y) in the direction of vector u (a unit vector) is given by the dot product of the function’s gradient with the unit vector u. Mathematically, we can represent it in the following form.

Figure — 9: Directional derivative

The above equation gives us the derivative of f(x, y) in any direction u. Now, let’s see how it works if we want to find the partial derivative along the x-axis. If we want the derivative in the x direction, the unit vector u will be (1, 0). Now, let’s calculate the partial derivative along the x-axis.

Figure — 10: Partial derivative along the x-axis

Next, let’s see how it works if we want to find the partial derivative along the y-axis. First of all, if we want to find the partial derivative in the y direction, the unit vector u will be (0, 1). Now, let’s calculate the partial derivative along the y-axis.

Figure — 11: Partial derivative along the y-axis

Note: The length (magnitude) of the unit vector must be 1.

Now that we know how to find the derivative in any direction, we need to find the direction in which the derivative gives us the maximum change, because, in our case, we want to reach the optimal values as quickly as possible.

What is the direction of the steepest ascent?

We already know that the directional derivative is given as follows.

Figure — 12: Directional derivative

Next, we can rewrite the dot product of the two vectors as the product of their magnitudes and the cosine of the angle between them.

Figure — 13: Directional derivative

Now, note that since u is a unit vector, its magnitude is always going to be 1.

Figure — 14: Directional derivative
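
Spelled out, the chain of equalities in Figures 12 through 14 is: D_u f = ∇f · u = ‖∇f‖ ‖u‖ cos θ = ‖∇f‖ cos θ, since ‖u‖ = 1.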

Now, in the above equation, we have no control over the magnitude of the gradient; we can only control the angle θ. So, to maximize the directional derivative of the function, we need to maximize cos θ. We know that cos θ attains its maximum value of 1 when θ = 0. This means the derivative is maximized when the angle between the gradient and the unit vector is 0. In other words, the directional derivative is maximized when the unit (direction) vector points in the direction of the gradient.

So, in conclusion, moving in the direction of the gradient gives us the steepest ascent. Now, let’s verify this with the help of an example.

Example proving the direction of the steepest ascent:

Find the gradient of the function f(x, y) = x² + y² at the point (3, 2).

1. Step — 1:

We have a function f(x, y) of two variables x and y.

Figure — 15: Function f(x, y)

2. Step — 2:

Next, we will find the gradient of the function. Since there are two variables in our function, the gradient vector will have two elements in it.

Figure — 16: Gradient of f(x, y)

3. Step — 3:

Next, we calculate the partial derivatives of the function f(x, y) = x² + y².

Figure — 17: Partial derivatives of f(x, y)

4. Step — 4:

The gradient of the function can be written as follows.

Figure — 18: Gradient of f(x, y)

5. Step — 5:

Next, we calculate the gradient of the function at the point (3, 2).

Figure — 19: Gradient of f(x, y) at point (3, 2)

6. Step — 6:

Next, we are finding the partial derivative of the function f(x, y) along the x-axis (1, 0).

Figure — 20: Gradient of f(x, y) at point (3, 2) along the x-axis

7. Step — 7:

Next, we are finding the partial derivative of the function f(x, y) along the y-axis (0, 1).

Figure — 21: Gradient of f(x, y) at point (3, 2) along the y-axis

8. Step — 8:

Next, we find the derivative of the function f(x, y) in the direction (1, 1). Note that (1, 1) is not a unit vector, so here we must first divide it by its magnitude.

Figure — 22: Gradient of f(x, y) at point (3, 2) in the direction of (1, 1)

9. Step — 9:

Next, we find the derivative of the function f(x, y) in the direction of the gradient, (3, 2). Please note that this is the direction of the gradient vector itself, and again we must first normalize it to unit length.

Figure — 23: Gradient of f(x, y) at point (3, 2) in the direction of (3, 2)

10. Step — 10:

So, based on the calculations shown in Steps 6 through 9, we can confidently say that the direction of the steepest ascent is the direction of the gradient.
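
The whole example is easy to check numerically. Here is a short sketch using NumPy (our choice for illustration; the article itself defers code to part three of the series) that recomputes Steps 6 through 9:

import numpy as np

# Directional derivatives of f(x, y) = x^2 + y^2 at the point (3, 2),
# where the gradient is (2x, 2y) = (6, 4).
grad = np.array([6.0, 4.0])

for direction in [(1, 0), (0, 1), (1, 1), (3, 2)]:
    u = np.array(direction, dtype=float)
    u = u / np.linalg.norm(u)  # normalize to a unit vector
    print(direction, round(float(np.dot(grad, u)), 4))

# Prints 6.0, 4.0, 7.0711, and 7.2111: the gradient direction (3, 2) is the
# steepest, and 7.2111 equals the gradient's magnitude sqrt(6^2 + 4^2).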

In the gradient descent algorithm, we aim to find the optimal parameters as quickly as possible. This is why the gradient descent algorithm uses the gradient (the vector of partial derivatives).

But wait… there is a catch!

In the gradient descent algorithm, we want to find the minimum point. However, following the gradient will lead us uphill, because it gives us the direction of steepest ascent. So, what do we do about it?

Explanation of the Negative (−) Sign in the Gradient Descent Algorithm:

Now, we know that the gradient gives us the steepest ascent. So, if we proceed in the direction of the steepest ascent, we will never reach the minimum point. Our goal is to reach the minimum point quickly. To move in the direction of steepest descent, we therefore travel in the exact opposite direction of the steepest ascent. This is why we use the negative (−) sign.

Why Do We Need a Learning Rate?

Please be aware that we have no control over the gradient’s magnitude; occasionally, we may get a very large gradient value. If we don’t somehow slow down the rate of change, we will end up taking some very large strides. Keep in mind that a learning rate that is too high may leave us with sub-optimal parameter values, while a lower learning rate may require more training epochs to reach the optimal values.

The gradient descent approach has a hyperparameter, known as the learning rate, that regulates how quickly our model updates its parameter values. We must set the learning rate to an appropriate value. If the learning rate is too high, our model may take big steps and overshoot the minimum, so a high learning rate can result in non-convergence. On the other hand, if the learning rate is too small, the model will take too long to converge.

Figure — 24: Learning Rate
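
To make this concrete, here is a small sketch (our own example, borrowing the cost function J(θ) = θ² from the next section) showing how three different learning rates behave:

# Effect of the learning rate on J(theta) = theta^2, whose gradient is 2*theta.
# Each update multiplies theta by (1 - 2 * lr), so the choice of lr decides
# whether the iterates shrink toward 0 or blow up.
def run(lr, theta=5.0, steps=20):
    for _ in range(steps):
        theta = theta - lr * (2 * theta)  # gradient descent update
    return theta

print(run(0.01))  # too small: about 3.34 after 20 steps, still far from 0
print(run(0.1))   # reasonable: about 0.058, close to the minimum at 0
print(run(1.1))   # too large: about 192, the iterates overshoot and diverge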

Some Basic Rules of Differentiation:

1. Scalar multiplication rule:

Figure — 25: Scalar multiplication rule

2. The summation rule:

Figure — 26: Summation rule

3. The power rule:

Figure — 27: Power rule

4. The chain rule:

Figure — 28: Chain rule
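
These rules are easy to sanity-check symbolically. The sketch below uses the sympy library (our choice for illustration; it is not part of the article) to confirm each rule on a small example:

import sympy as sp

x = sp.symbols('x')
f, g = x**3, sp.sin(x)

# Scalar multiplication rule: d/dx [c * f(x)] = c * f'(x)
print(sp.diff(5 * f, x) == 5 * sp.diff(f, x))              # True
# Summation rule: d/dx [f(x) + g(x)] = f'(x) + g'(x)
print(sp.diff(f + g, x) == sp.diff(f, x) + sp.diff(g, x))  # True
# Power rule: d/dx [x^n] = n * x^(n - 1)
print(sp.diff(x**5, x))                                    # 5*x**4
# Chain rule: d/dx [f(g(x))] = f'(g(x)) * g'(x)
print(sp.diff(sp.sin(x**2), x))                            # 2*x*cos(x**2)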

Now, let’s take a couple of examples to understand how the gradient descent algorithm works.

Gradient descent for one variable:

Let’s start with a very simple cost function. Say we have a cost function J(θ) = θ² involving only one parameter, θ, and our goal is to find the value of θ that minimizes it.

From the cost function J(θ) = θ², we can clearly see that it is minimized at θ = 0. However, drawing such conclusions will not be easy when we are working with more complex functions. For that, we use the gradient descent algorithm. Let’s see how we can apply it to find the optimal value of the parameter θ.

1. Step — 1:

Our cost function with one parameter (θ) is given by,

Figure — 29: Cost function with one variable

2. Step — 2:

Our ultimate goal is to minimize the cost function by finding the optimal value of parameter θ.

Figure — 30: Minimizing the cost function

3. Step — 3:

The formula for the gradient descent algorithm is the following.

Figure — 31: Gradient descent algorithm

4. Step — 4:

To ease the calculations, we use a learning rate of 0.1.

Figure — 32: The learning rate

5. Step — 5:

Next, we find the partial derivative of the cost function.

Figure — 33: The partial derivative of the cost function

6. Step — 6:

Next, we take the partial derivative from Step 5 and substitute it into the formula given in Step 3.

Figure — 34: Updating the parameters using the gradient descent algorithm

7. Step — 7:

Now, let’s understand how the gradient descent algorithm works with the help of an example. Here, we start with θ = 5 and find the value of θ that minimizes the cost function. We then also start from θ = −5 to check whether the algorithm can still find the optimal value. Please note that we use the update rule derived above to update the value of the parameter θ.

Figure — 35: Cost values at different data points
Figure — 36: Cost values at different data points

8. Step — 8:

Next, we plot the data shown in the above tables. We can see in the graph that the gradient descent algorithm finds the optimal value of θ and minimizes the cost function J(θ).

Figure — 37: The graph of parameter vs cost function
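
For completeness, here is a short sketch that reproduces the two runs tabulated above, using the update rule derived in Step 6 (θ ← θ − 0.1 · 2θ = 0.8θ):

# Gradient descent on J(theta) = theta^2 with a learning rate of 0.1,
# starting from theta = 5 and theta = -5, as in the tables above.
for start in (5.0, -5.0):
    theta = start
    print(f"starting from theta = {start}")
    for step in range(10):
        print(f"  step {step}: theta = {theta:.4f}, cost = {theta ** 2:.4f}")
        theta = theta - 0.1 * (2 * theta)  # update: theta <- 0.8 * theta
# Both runs shrink theta geometrically toward 0, the minimizer of J.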

Gradient Descent for two variables:

Now, let’s move on to the cost function with two variables and see how it goes.

1. Step — 1:

Our cost function with two parameters (θ1 and θ2) is given by,

Figure — 38: The cost function with two parameters

2. Step — 2:

Our ultimate goal is to minimize the cost function by finding the optimal value of parameters θ1 and θ2.

Figure — 39: Minimize the cost function

3. Step — 3:

The formula for the gradient descent algorithm is as follows.

Figure — 40: Gradient descent algorithm

4. Step — 4:

We will use the formula given in Step — 3 to find the optimal values of our parameters θ1 and θ2.

Figure — 41: Updating the parameters using the gradient descent algorithm

5. Step — 5:

Next, we find the partial derivatives of the cost function with respect to the parameters θ1 and θ2.

Figure — 42: The partial derivatives of the cost function

6. Step — 6:

Next, we substitute the partial derivatives derived in Step 5 into the formula given in Step 3.

Figure — 43: Updating the parameters using the gradient descent algorithm

7. Step — 7:

To simplify the calculations, we again use a learning rate of 0.1.

Figure — 44: Learning rate

8. Step — 8:

Now, let’s understand how the gradient descent algorithm works in this case. Here, we start with θ1 = 1 and θ2 = 1 and find the values of θ1 and θ2 that minimize the cost function. We then also start from θ1 = −1 and θ2 = −1 to check whether the gradient descent algorithm can still find the optimal values.

Figure — 45: Cost values at different data points
Figure — 46: Cost values at different data points

9. Step — 9:

Next, we plot the data shown in the above tables. We can see in the graph that the gradient descent algorithm finds the optimal values of θ1 and θ2 and minimizes the cost function J(θ1, θ2).

Figure — 47: The graph of parameter vs cost function
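
And here is a matching sketch for the two-parameter case, assuming (as the figures suggest) a cost of the form J(θ1, θ2) = θ1² + θ2²; if the article’s actual cost differs, only the two partial derivatives change:

# Gradient descent on the assumed cost J(theta1, theta2) = theta1^2 + theta2^2
# with a learning rate of 0.1. The partials are 2*theta1 and 2*theta2, so each
# parameter is updated independently.
for start in ((1.0, 1.0), (-1.0, -1.0)):
    theta1, theta2 = start
    print(f"starting from (theta1, theta2) = {start}")
    for step in range(10):
        cost = theta1 ** 2 + theta2 ** 2
        print(f"  step {step}: theta1 = {theta1:.4f}, theta2 = {theta2:.4f}, cost = {cost:.4f}")
        theta1 = theta1 - 0.1 * (2 * theta1)
        theta2 = theta2 - 0.1 * (2 * theta2)
# Both starting points converge to (0, 0), the minimizer of the assumed cost.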

Conclusion:

There you have it! We’ve gone over the basics of the gradient descent algorithm and its important role in machine learning. Feel free to revisit any of the calculations or concepts that weren’t clear on the first pass. Now that you’ve successfully learned how to descend the mountain, learn about the other ways gradient descent can help solve problems in the next installment of the Gradient Descent series.


Citation:

For attribution in academic contexts, please cite this work as:

Shukla, et al., “The Gradient Descent Algorithm”, Towards AI, 2022

BibTex Citation:

@article{pratik_2022,
  title={The Gradient Descent Algorithm},
  url={https://towardsai.net/neural-networks-with-python},
  journal={Towards AI},
  publisher={Towards AI Co.},
  author={Shukla, Pratik},
  editor={Keegan, Lauren},
  year={2022},
  month={Oct}
}

References and Resources:

  1. Gradient descent — Wikipedia: https://en.wikipedia.org/wiki/Gradient_descent

