

Testing in Machine Learning: A Comprehensive Guide with Examples from TensorFlow, PyTorch, Keras, scikit-learn, Hugging Face, and More (Part 1 of 3)

Author(s): Hitesh Hinduja

Last Updated on July 25, 2023 by Editorial Team

Originally published on Towards AI.

Here we are with a new blog today. In this three-part series, we will learn about testing frameworks in ML, a rapidly developing area. In the first part, we will look at why testing frameworks are needed, followed by unit testing techniques, covering testing libraries and packages across frameworks such as TensorFlow, PyCaret, DeepDiff, and more. In the second part, we will cover gradient testing techniques using JAX, Theano, and others, along with testing frameworks like Nose and further techniques used in AI/ML, such as smoke tests, A/B testing, canary roll-outs, and blue-green deployments. In the third and final part, we will go through some interesting research papers published in the field of testing in AI/ML, followed by some startups and the interesting work they have been doing in this space. So let’s get started!

Most teams today don’t get the chance to follow these best practices while deploying a model into production, or even during staging and model building. In this blog, I will share some of my past experiences where the need for these frameworks, techniques, and libraries was felt, and how they have evolved over time. So, let’s get started with a story!

Understanding Regression Testing through Experiences:

It was in 2018 when one of my team members, a skilled software developer, came to me with a genuine question. He asked, “Hitesh, I have a question. We have so many testing techniques in software development today, such as test-driven and behavior-driven development, equivalence partitioning, boundary value analysis, code coverage, and mocking and stubbing. Can you please help me understand the complete list of tests we use today to test our software code, pipelines, and so on?” We went to one of the conference rooms and discussed the range of tests in use today, including regression testing, performance testing, and many others. Then my team member asked, “If we consider regression testing, how do we enable that for AI/ML models?”

It was a good question, and it helped connect software testing experience with AI/ML testing. In the end, I believe that a “successful product is not one that has fancy technologies, but one that meets the needs of its users and is tested thoroughly to ensure its reliability and effectiveness”.

Now, let me share the example I walked him through of how regression testing can be done in AI/ML scenarios. In 2018, I was working on an NLP use case that proved to be one of my toughest experiences to date. It involved not only having a tagged dataset with input paragraphs and output classes, but also extracting a specific set of paragraphs and tagging them based on our experts’ classification. Fast-forwarding to the final dataset: I had approximately 1 million rows in the first cut with 4 classes, as it was a multiclass classification problem. I asked my team to take just 50,000 rows through stratified sampling, i.e., preserving the class distribution rather than taking a random sample.

My approach has always been to start simple. Back then, that meant basic n-gram features with ensembling, engineered dense meta-features, and classical ML models on the dataset, all assuming the cleaning had been done as the first step. Cleaning requires a lot of understanding and is not just about dropping stopwords, punctuation, and so on; my team members would know by now the blunders we ran into by dropping them 😀. Once these traditional techniques and models were built, I had a good benchmark of how the probabilistic and rule-based techniques had performed.

My next step was not to jump into deep learning-based approaches, but into RCA (root cause analysis) of the outputs. I would ask my team to create a simple Excel file of inputs and outputs across the complete dataset (train, validation, and test combined) and come to me once done. My whole team and I would sit for hours studying the incorrectly classified samples and what types of samples they were. The multiclass output was a probability distribution over the classes, summing to 100%. I remember starting with an F1 score as low as 18% and closing the final set of models at an F1 of ~85%. I could feel my team members getting exhausted by all that Excel analysis, but the RCA surfaced tons of issues. We would study the incorrectly predicted classes and examine how close we came to predicting the actual class. If we are not predicting the actual class, are we at least predicting the next best class (e.g., “old” being predicted as “very old”), which suggests the model is merely confused, or are the predictions outright opposite (e.g., “old” being predicted as “very new”)? The RCA deserves a separate blog altogether.

What I am trying to convey is the importance of learning from the mistakes the model makes and then re-checking the same set of models over that data to see which ones perform better. I would not call this regression testing exactly, but imagine our RCA observations as a test suite against which we check how our model is performing. If there are differences, we again investigate the root cause and re-run the models.

Now, you might have a question: what happens if we keep on improving and the model simply overfits to that fixed data? That’s why we don’t keep the 50,000 samples constant; we keep adding samples and slowly retrain the models, scaling to a larger dataset. Also, we are not disturbing the labels of the test set, but rather collecting test suites of the cases where the model struggles to understand or fails to predict, along with the suspected reasons. For example, if you are running a sentiment analysis model with more than 8 sentiments in your data, imagine how close “anger” and “hate” would be. In such predictions, even exclamations, emojis, and stopwords can turn out to be expensive sources of incorrect predictions, so one of your test suites could check how these two classes perform on a curated suite of examples with emojis, stopwords, and exclamations.

The advantage of this approach is that it is always easier to understand the behavior of your model on a limited dataset than to blindly take millions of rows, perform cleaning and feature engineering, and compute metrics. In conclusion, AI/ML models are complex, and testing them requires specialized frameworks and techniques that go beyond traditional software testing. Techniques like RCA and curated test suites can help improve the accuracy and reliability of AI/ML models. As we continue to develop and deploy AI/ML models in various applications, it is essential to have a thorough understanding of the testing frameworks and best practices to ensure the success of these models.

Now, let’s continue our discussion of the developments in the testing ecosystem for AI/ML.

Unit Testing and Additional Testing Methods:

Let’s first understand unit testing in a simple way.

Unit testing in ML is like making sure your pizza toppings are spread evenly. Imagine you’re a pizzeria owner and you want to make sure each slice of pizza has the same amount of pepperoni, mushrooms, and cheese. So, you take a small sample slice, called a “unit slice,” and make sure it has the right balance of toppings. If the unit slice is perfect, you can be confident that the rest of the pizza is too.

In the same way, when you’re developing an ML model, you want to make sure each part of the code is working correctly. So, you take a small sample of data, called a “unit test,” and check that the model is producing the expected output. If the unit test passes, you can be confident that the rest of the model is working too.

And just like how you wouldn’t want a pizza with all the toppings on one side, you don’t want your ML model to be biased towards one type of data or prediction. Unit testing helps ensure that your model is working fairly and accurately for all scenarios.

To perform unit testing, there are several techniques, frameworks, and libraries, which we will see below. Let’s try to understand, with example cases, how these can be leveraged for ML models, data, and much more. We will also look at some interesting startups doing significant work on testing in this space.

Let’s take different problem statements and see how these test functions are defined. You can use these functions accordingly for your own problem statements.

TensorFlow’s tf.test.TestCase and tf.test.mock modules

In this example, we are using TensorFlow’s tf.test.TestCase and tf.test.mock modules to perform three different types of unit tests on the deep learning model: checking the model summary, checking the number of trainable weights, and checking that the training logs are printed correctly.
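The original code block did not survive the import, so here is a minimal sketch of what such a test can look like. The architecture is a stand-in, and because tf.test.mock has been dropped from recent TensorFlow releases, a check on the returned training history stands in for the log assertion:

```python
import tensorflow as tf


def build_model():
    # Stand-in architecture; the article's original model was not shown.
    return tf.keras.Sequential([
        tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
        tf.keras.layers.Dense(2, activation="softmax"),
    ])


class ModelTest(tf.test.TestCase):
    def test_model_summary(self):
        lines = []
        build_model().summary(print_fn=lines.append)  # capture instead of printing
        self.assertTrue(any("dense" in line.lower() for line in lines))

    def test_trainable_weights(self):
        # Two Dense layers: two kernels plus two biases.
        self.assertLen(build_model().trainable_weights, 4)

    def test_training_history(self):
        model = build_model()
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
        history = model.fit(tf.zeros([8, 4]), tf.zeros([8], dtype=tf.int32),
                            epochs=1, verbose=0)
        # The History object records the per-epoch training log.
        self.assertIn("loss", history.history)


if __name__ == "__main__":
    tf.test.main()
```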

PyTorch with Python’s unittest.TestCase module

In this example, we are testing the same CNN model (MyModel) using Python’s built-in unittest.TestCase module. We perform two unit tests:

  • test_input_shape: This test checks whether the output shape of the model matches the expected shape given a specific input shape.
  • test_loss_function: This test checks whether the gradients of the model parameters with respect to the loss function are correct (see the sketch below).
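Since the original code block was also lost in formatting, here is a minimal sketch of those two tests; the layout of MyModel is assumed, as the article did not show it:

```python
import unittest

import torch
import torch.nn as nn


class MyModel(nn.Module):
    """Small stand-in CNN; the article's original MyModel was not shown."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 4, kernel_size=3, padding=1)
        self.fc = nn.Linear(4 * 28 * 28, 10)

    def forward(self, x):
        x = torch.relu(self.conv(x))
        return self.fc(x.flatten(start_dim=1))


class TestMyModel(unittest.TestCase):
    def test_input_shape(self):
        model = MyModel()
        out = model(torch.randn(2, 1, 28, 28))
        self.assertEqual(out.shape, (2, 10))

    def test_loss_function(self):
        model = MyModel()
        out = model(torch.randn(2, 1, 28, 28))
        loss = nn.CrossEntropyLoss()(out, torch.tensor([3, 7]))
        loss.backward()
        # Every parameter should have received a finite, non-None gradient.
        for p in model.parameters():
            self.assertIsNotNone(p.grad)
            self.assertTrue(torch.isfinite(p.grad).all())


if __name__ == "__main__":
    unittest.main()
```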

Keras Model.fit() method

In this example, we are using Keras’ Model.fit() method to train the deep learning model and then performing a unit test to check the accuracy of the model after training.
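A minimal sketch of that pattern, using a made-up, linearly separable toy problem and an arbitrary accuracy threshold:

```python
import numpy as np
import tensorflow as tf


def test_model_accuracy_after_training():
    # Tiny, linearly separable toy problem so a short run can fit it.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(256, 2)).astype("float32")
    y = (x[:, 0] + x[:, 1] > 0).astype("int32")

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.fit(x, y, epochs=50, verbose=0)

    _, accuracy = model.evaluate(x, y, verbose=0)
    # The threshold is arbitrary; pick one that flags real regressions for your task.
    assert accuracy > 0.9
```

Seeding the data generator keeps the test deterministic enough for CI; for full reproducibility you would also seed TensorFlow’s weight initialization.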

Similarly, we can use the below functions to unit test our modules. In the context of the length of the blog, I will explain each of these functions in a simple way and you can incorporate these in a similar context as above.

scikit-learn’s assert_allclose() function

assert_allclose() is a testing helper (scikit-learn’s version builds on NumPy’s numpy.testing.assert_allclose) that checks whether two arrays or data frames are equal within a tolerance.

The function takes two inputs: the first is the expected output or the “true” values, and the second is the actual output or the “predicted” values. The function then compares these two inputs element-wise and checks whether the absolute difference between them is less than a specified tolerance level.

If the difference is greater than the specified tolerance level, then assert_allclose() raises an assertion error, which indicates that the test has failed.

Overall, assert_allclose() is a helpful tool for testing the accuracy of machine learning models, as it can quickly highlight when the model is not producing results within an acceptable margin of error.
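A minimal sketch; note that the canonical public function is numpy.testing.assert_allclose, which scikit-learn’s own test utilities build on, and the model and tolerance here are illustrative:

```python
import numpy as np
from numpy.testing import assert_allclose
from sklearn.linear_model import LinearRegression


def test_regression_predictions_close():
    # y = 2x + 1 exactly, so a linear model should recover it almost perfectly.
    X = np.arange(10, dtype=float).reshape(-1, 1)
    y = 2 * X.ravel() + 1

    predicted = LinearRegression().fit(X, y).predict(X)

    # Raises an informative AssertionError if any element drifts past the tolerance.
    assert_allclose(predicted, y, rtol=1e-6)
```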

MLflow’s mlflow.pytest_plugin module:

MLflow’s pytest plugin is a unit testing technique used in machine learning. It allows you to test your ML models, training code, and preprocessing code in a reproducible and automated way. The plugin provides a set of pytest fixtures that you can use to set up your tests, such as loading data and models, and it also provides a set of assertions that you can use to test your models’ accuracy and behavior. By using the MLflow pytest plugin, you can ensure that your ML models are performing as expected and catch regressions early on.
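The fixtures the plugin exposes vary across MLflow versions, so here is a sketch of the same pattern written with a plain pytest fixture and MLflow’s stable model-logging API; the dataset and accuracy threshold are illustrative:

```python
import mlflow
import mlflow.sklearn
import pytest
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


@pytest.fixture
def logged_model():
    # Train a small model and log it, so the test exercises the full
    # save-then-load round trip the way the plugin's fixtures would.
    X, y = make_classification(n_samples=200, random_state=0)
    model = LogisticRegression(max_iter=200).fit(X, y)
    with mlflow.start_run() as run:
        mlflow.sklearn.log_model(model, "model")
    return f"runs:/{run.info.run_id}/model", X, y


def test_logged_model_accuracy(logged_model):
    uri, X, y = logged_model
    reloaded = mlflow.sklearn.load_model(uri)
    assert reloaded.score(X, y) > 0.8  # illustrative threshold
```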

DeepDiff’s DeepDiff class

DeepDiff is a Python package that provides a convenient way to compare complex Python objects such as dictionaries, lists, sets, and tuples. The DeepDiff class is particularly useful in unit testing in ML as it allows developers to quickly identify differences between expected and actual model outputs.

For example, suppose you have a trained ML model that is expected to output a dictionary with specific keys and values. With the DeepDiff class, you can compare the expected and actual output dictionaries and easily identify any differences between them.

In practice, the DeepDiff result is a dict-like report that groups differences by kind (‘values_changed’, ‘dictionary_item_added’, and so on), and options such as significant_digits and ignore_order control how strict the comparison is. It also provides helpers such as to_dict() to convert the report for logging or further checks.

Using the DeepDiff class in unit testing in ML can help developers quickly catch errors and ensure that their models are performing as expected.
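A minimal sketch, with made-up expected and actual model outputs:

```python
from deepdiff import DeepDiff


def test_model_output_matches_expected():
    expected = {"label": "positive", "scores": {"positive": 0.91, "negative": 0.09}}
    actual = {"label": "positive", "scores": {"positive": 0.90, "negative": 0.10}}

    # significant_digits lets small floating-point drift pass, while
    # structural changes (missing keys, wrong labels) still fail.
    diff = DeepDiff(expected, actual, significant_digits=1)
    assert diff == {}, f"Model output changed: {diff}"
```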

Yellowbrick’s VisualAssert class

Yellowbrick is a Python library for machine learning visualization.

The VisualAssert class allows users to create visualizations of model predictions and actual values, which can be compared side by side to ensure that they are consistent. This can be useful for detecting errors in machine learning models that are not immediately apparent from numerical output alone.

For example, if a model is trained to predict house prices based on features such as location, size, and number of bedrooms, a visual assertion could display a scatter plot of predicted prices versus actual prices. If there is a systematic deviation from the 45-degree line (i.e., predicted prices are consistently higher or lower than actual prices), this could indicate a problem with the model that needs to be addressed.
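VisualAssert does not appear in recent Yellowbrick releases, so here is a sketch of that predicted-versus-actual check using Yellowbrick’s documented PredictionError visualizer, with a numeric guard added so the check can also run unattended; the dataset and threshold are illustrative:

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the test can run headless

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from yellowbrick.regressor import PredictionError


def test_prediction_error_plot():
    X, y = make_regression(n_samples=300, noise=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    viz = PredictionError(Ridge())
    viz.fit(X_train, y_train)
    r2 = viz.score(X_test, y_test)  # draws predicted vs. actual with the 45-degree line
    viz.finalize()

    # The visual check is for humans; keep a numeric guard for CI.
    assert r2 > 0.8
```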

Overall, the VisualAssert class provides an additional layer of validation to machine learning models and can help catch errors that might not be caught through traditional unit testing methods.

Hugging Face’s pytest plugin:

Hugging Face’s pytest plugin is a tool for testing machine-learning models built using the Hugging Face Transformers library. The plugin allows developers to easily write tests for their models and evaluate their performance on various metrics.

With the pytest plugin, developers can define test cases that create model instances, input data, and expected output. The plugin then runs the model on the input data and checks that the output matches the expected output. The plugin also provides various assertion helpers to make it easier to write tests for specific tasks such as text classification, language modeling, and sequence-to-sequence tasks.

Overall, the Hugging Face pytest plugin can help developers ensure that their models are functioning correctly and producing the expected output, which is an important part of unit testing in machine learning.

In this example, we first import the pytest module and the pipeline function from the Hugging Face’s transformers library. Then, we define a simple text classification task pipeline using the pre-trained distilbert-base-uncased-finetuned-sst-2-english model.

Next, we define a test case using the @pytest.mark.parametrize decorator. This test case takes in two inputs — a text sample and the expected label for that sample. For each set of inputs specified in the decorator, the test case runs inference using the pipeline and checks if the predicted label matches the expected label using the assert statement.

Finally, we run the test case using the pytest command in the terminal. The plugin automatically discovers the test cases and provides useful output, including any failures or errors.
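The code itself was lost in the import; here is a reconstruction matching the description above, with made-up sample sentences:

```python
import pytest
from transformers import pipeline

# Build the pipeline once at module import so every test case reuses it.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)


@pytest.mark.parametrize(
    "text, expected_label",
    [
        ("I absolutely loved this movie!", "POSITIVE"),
        ("This was a complete waste of time.", "NEGATIVE"),
    ],
)
def test_sentiment(text, expected_label):
    prediction = classifier(text)[0]  # e.g. {'label': 'POSITIVE', 'score': 0.99}
    assert prediction["label"] == expected_label
```

Running pytest on this file then reports a pass or fail for each parametrized case.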

TensorFlow Probability’s tfp.test_util.assert_near() function

The tfp.test_util.assert_near() function in TensorFlow Probability is used in unit testing to compare two tensors and assert that they are nearly equal (within a certain tolerance). This is useful in testing statistical models, where the output may not be exact due to randomness.

The function takes in two tensors and an optional tolerance level, and checks if they are element-wise nearly equal. If the tensors are not nearly equal, an AssertionError is raised.
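TFP’s test utilities are internal and have moved between releases, so this sketch uses the public tf.debugging.assert_near, which performs the same element-wise near-equality check; the tolerance is set loosely to absorb Monte Carlo noise:

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# Monte Carlo estimate of a Normal(0, 1) mean: random, so only *nearly* zero.
samples = tfd.Normal(loc=0.0, scale=1.0).sample(100_000, seed=42)
estimate = tf.reduce_mean(samples)

# Passes if the estimate is within tolerance of 0; raises otherwise.
tf.debugging.assert_near(estimate, 0.0, atol=0.02)
```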

MXNet’s mx.test_utils.assert_almost_equal() function

This works like the assertion functions in the other frameworks: it compares two arrays element-wise within relative and absolute tolerances.
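A tiny sketch with toy arrays:

```python
import numpy as np
from mxnet.test_utils import assert_almost_equal

a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, 2.0, 3.0 + 1e-7])

# Passes because the difference is below the default tolerance;
# raises an AssertionError with a per-element report otherwise.
assert_almost_equal(a, b)
```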

PyTorch Lightning’s LightningTestCase class

LightningTestCase is a class in the PyTorch Lightning library that extends the unittest.TestCase class and provides some additional functionalities for testing PyTorch Lightning models. This class helps in writing test cases for PyTorch Lightning models in a more efficient and easy way.

The LightningTestCase class provides some helper functions for testing various aspects of a PyTorch Lightning model. For example, it provides a run_model_test function that tests whether a model can run properly without any errors, and another function assert_checkpoint_and_resume to test whether a model can be saved and resumed properly.

In this example, we define a simple linear regression model (SimpleModel) and then create a test case (TestSimpleModel) that inherits from LightningTestCase. We define a single test method test_checkpoint_and_resume, which trains the model for one epoch and then uses the assert_checkpoint_and_resume method to test that the model can be checkpointed and resumed without error.
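LightningTestCase does not appear in recent public PyTorch Lightning releases, so here is a sketch of the same checkpoint-and-resume check written with plain unittest and the stable Trainer API; the model and data are toy stand-ins:

```python
import unittest

import pytorch_lightning as pl
import torch
from torch.utils.data import DataLoader, TensorDataset


class SimpleModel(pl.LightningModule):
    """Linear regression, mirroring the article's SimpleModel."""

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(1, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)


class TestSimpleModel(unittest.TestCase):
    def test_checkpoint_and_resume(self):
        x = torch.randn(64, 1)
        data = DataLoader(TensorDataset(x, 2 * x), batch_size=16)

        model = SimpleModel()
        trainer = pl.Trainer(max_epochs=1, logger=False, enable_checkpointing=False)
        trainer.fit(model, data)

        trainer.save_checkpoint("simple_model.ckpt")
        restored = SimpleModel.load_from_checkpoint("simple_model.ckpt")

        # The restored weights must match the trained ones exactly.
        for p1, p2 in zip(model.parameters(), restored.parameters()):
            self.assertTrue(torch.equal(p1.cpu(), p2.cpu()))


if __name__ == "__main__":
    unittest.main()
```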

PyCaret’s assert_model_performance() function

Similar to the other assertion functions; it lets us compare model performance against set thresholds.

AllenNLP’s test_case.AllenNlpTestCase class:

Similar to the other TestCase-style classes; it lets us define test cases for AllenNLP models.

TorchIO’s assert_allclose() function:

Similar to the assert functions in the other libraries.

Skorch’s NeuralNetClassifier class:

This is not exactly a testing library but more of a wrapper class. Skorch is a Python library that allows you to use PyTorch models with scikit-learn. The NeuralNetClassifier class is a scikit-learn compatible wrapper for PyTorch neural networks.

When you create a NeuralNetClassifier object, you define the architecture of your neural network using PyTorch. You can also specify hyperparameters such as the learning rate, number of epochs, and batch size.

Once you’ve defined your model, you can use the NeuralNetClassifier object to fit the model to your training data and make predictions on new data. The fit method trains the neural network using the specified hyperparameters, and the predict method uses the trained model to make predictions on new data.

Overall, the NeuralNetClassifier class allows you to use the power of PyTorch to build and train neural networks, while still having the convenience and compatibility of scikit-learn.
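A minimal sketch, adapted from the pattern in skorch’s documentation; the module architecture and hyperparameters are illustrative:

```python
import numpy as np
import torch.nn as nn
from skorch import NeuralNetClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score


class Module(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 2), nn.Softmax(dim=-1)
        )

    def forward(self, X):
        return self.net(X)


X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X, y = X.astype(np.float32), y.astype(np.int64)

net = NeuralNetClassifier(Module, max_epochs=10, lr=0.1, batch_size=32, verbose=0)

# Because the wrapper is scikit-learn compatible, the whole sklearn toolbox
# (cross-validation, grid search, pipelines) works on it unchanged.
print(cross_val_score(net, X, y, cv=3))
```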

DVC

DVC (Data Version Control) is an open-source library that helps you manage and version your machine learning models and data. It provides a simple and efficient way to track changes to your code, data, and experiments, and to collaborate with your team members.

One of the key features of DVC is its ability to version large datasets without storing them in Git, which can quickly become bloated and slow. Instead, DVC uses Git to track the changes to your data and metadata, while storing the actual data in a remote storage service like Amazon S3 or Google Cloud Storage.

DVC also provides tools for reproducible machine learning experiments, which is crucial for testing and validating your models. With DVC, you can create a pipeline that defines the data processing, model training, and evaluation steps, and run it on different machines or environments with the same results. This makes it easy to test different configurations and hyperparameters and to compare the performance of different models.

Overall, DVC can help you streamline your machine-learning workflow, reduce errors and inconsistencies, and improve the quality and reliability of your models.

There are a few more in the TensorFlow Extended (TFX) framework; feel free to explore those.

So that’s it for Part 1 of this blog. In conclusion, testing is an important aspect of machine learning and is crucial for the success of any ML project. The use of testing frameworks, unit testing, regression testing techniques, and other testing methods helps ensure that the model works as intended and that changes made to the codebase do not result in unexpected behavior or performance issues.

Furthermore, by leveraging testing frameworks and techniques, data scientists and machine learning engineers can catch and address issues early in the development cycle, leading to more robust and reliable ML models. With the growing importance of AI/ML in various industries, it’s essential to focus on testing and ensure that ML models are working as intended, providing accurate and reliable results.

(Note: In the second part of the blog, we will cover gradient testing techniques using JAX, Theano, and more. We will also cover testing frameworks like Nose and further techniques used in AI/ML, such as smoke tests, A/B testing, canary roll-outs, and blue-green deployments, and we will get hands-on in the Azure cloud. In the third and final part, we will go through some interesting research papers published in the field of testing in AI/ML, followed by some startups and the interesting work they have been doing in this space. Stay tuned!)

Just before I go, on a lighter note: I asked our very own Azure OpenAI ChatGPT why machine learning engineers and data scientists avoid testing, and it said 😂

Because it always gives them “false positives” and “true negatives”!

Signing off,

Hitesh Hinduja

Hitesh Hinduja | LinkedIn


Published via Towards AI
