Can ChatGPT beat DeepPavlov in Natural Language Understanding tasks?

Last Updated on July 17, 2023 by Editorial Team

Author(s): Vasily Konovalov

Originally published on Towards AI.

Can you use ChatGPT to build any Natural Language Processing (NLP) model for your application? In this article, I compare the performance of DeepPavlov models with that of ChatGPT on selected GLUE tasks. The Natural Language Understanding (NLU) tasks included in GLUE are crucial for building any NLP application. The findings indicate that while ChatGPT shows impressive results, it does not beat the pretrained Transformer-based models integrated into DeepPavlov on these GLUE tasks.

Photo by Jem Sahagun on Unsplash

Introduction

ChatGPT, released by OpenAI, has changed the way artificial intelligence interacts with humans. Although little information is available on how the model was built, its exceptional quality has made it extremely popular. It is noteworthy that the base model of ChatGPT reportedly has only 3.5B parameters, yet in conversation tasks it outperforms the GPT-3 model with 175B parameters. This shows how valuable human-collected data is for supervised fine-tuning. Several publications evaluate ChatGPT on well-known natural language processing (NLP) tasks.

In this article, I compare the performance of ChatGPT with that of the supervised Transformer-based models from the DeepPavlov library. DeepPavlov is an open-source conversational library for Natural Language Processing (NLP) and Multiskill AI Assistant development. It is based on PyTorch and uses the Transformers and Datasets packages from HuggingFace to train various Transformer-based models on hundreds of datasets.

Install DeepPavlov Library

DeepPavlov Library is an open-source NLP framework. It contains all essential state-of-the-art NLP models that can be used alone or as a part of DeepPavlov Dream — an open-source Multiskill AI Assistant Platform. The library contains various text classification models for topic classification, insult detection, and intent recognition. Before comparing the models, let’s first install DeepPavlov Library by running the command:

pip install deeppavlov==1.0.1

The DeepPavlov Library supports Python 3.6 to 3.9.

How to use DeepPavlov Library

The DeepPavlov framework is implemented in Python and uses PyTorch as its base machine learning framework. DeepPavlov supports HuggingFace Transformers, which allows using all AutoModel Transformer-based models from the HuggingFace Hub. You can interact with DeepPavlov models via a command-line interface (CLI), an application programming interface (API), or Python pipelines. The CLI supports the following commands:

  • To install a model’s specific requirements, run python -m deeppavlov install <config_name>, where <config_name> is the name of the configuration file.
  • To interact with the model through the CLI, run python -m deeppavlov interact <config_name> [-d] [-i], where -d downloads files from the metadata of the configuration file (optional), and -i installs model requirements (optional).
  • To train the model, run python -m deeppavlov train <config_name> [-d] [-i]. The dataset is downloaded whether or not the -d flag is given. To train a model on your own data, modify the dataset_reader path in the model configuration file. The data format is specified on the corresponding model doc page. To change the backbone Transformer architecture, modify the corresponding variable in the variables section.
  • To start a REST API server with the selected model, which might be useful in production deployment scenarios, run python -m deeppavlov riseapi <config_name> [-d] [-i].
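
The commands above can also be assembled from a script. The following is a minimal sketch, not part of DeepPavlov: the helper name deeppavlov_command is hypothetical, and the config name is borrowed from the RTE example later in this article. It only builds the argument list and does not run anything.

```python
import sys

# Sub-commands listed above; anything else is rejected early.
SUPPORTED_COMMANDS = {"install", "interact", "train", "riseapi"}


def deeppavlov_command(command, config_name, download=False, install=False):
    """Return the argument list for `python -m deeppavlov <command> <config_name>`.

    Hypothetical helper for illustration; it is not part of the DeepPavlov API.
    """
    if command not in SUPPORTED_COMMANDS:
        raise ValueError(f"unsupported command: {command}")
    args = [sys.executable, "-m", "deeppavlov", command, config_name]
    # The optional flags are documented for interact, train and riseapi only.
    if command != "install":
        if download:
            args.append("-d")
        if install:
            args.append("-i")
    return args


cmd = deeppavlov_command("riseapi", "glue_rte_roberta_mnli", download=True)
print(" ".join(cmd[1:]))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once DeepPavlov is installed.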

Alternatively, you can interact with DeepPavlov’s models by using Python pipelines.

Comparing ChatGPT and DeepPavlov

There is no easy way to compare ChatGPT with DeepPavlov-based models. Keep in mind that ChatGPT performance is conditioned on the prompt: even the slightest modification of the prompt might change the result completely. Almost all prompts used here were adopted from [2] so that the results are fully comparable with those of the paper. Moreover, testing ChatGPT on well-established NLP benchmarks raises several problems. Evaluating ChatGPT on thousands of test examples can be slow and expensive. Therefore, I either evaluated the GLUE test set of the corresponding task, if it contains fewer than 1000 samples, or followed the approach proposed in [2] and evaluated a sample of the dev set. In the following comparisons, I include the performance of ChatGPT, of the BERT-based models from [2], and of the DeepPavlov-based models. It is worth mentioning that, according to [1], ChatGPT might have memorized many of the evaluation datasets during the pre-training phase. OpenAI is continuously improving ChatGPT by releasing new versions. Since the initial release on Dec 15, 2022, there have been seven major releases. All results mentioned here are based on the ChatGPT version of March 23, 2023.
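
Sampling from the dev set can be sketched as follows. This is an illustration under assumptions: the function name sample_per_class, the fixed seed, and the toy dev set are all hypothetical, and a class-balanced draw is assumed because the RTE comparison in this article uses 25 instances per class.

```python
import random
from collections import defaultdict


def sample_per_class(examples, per_class, seed=0):
    """Pick up to `per_class` examples for every label, reproducibly."""
    by_label = defaultdict(list)
    for example in examples:
        by_label[example["label"]].append(example)

    rng = random.Random(seed)  # fixed seed so every model sees the same subset
    sample = []
    for label in sorted(by_label):
        items = by_label[label]
        sample.extend(rng.sample(items, min(per_class, len(items))))
    return sample


# Toy stand-in for a dev set; real GLUE examples would be loaded from disk.
dev_set = [{"text": f"example {i}", "label": i % 2} for i in range(200)]
subset = sample_per_class(dev_set, per_class=25)
print(len(subset))
```

Fixing the seed matters here because both systems have to be scored on exactly the same instances for the comparison to be fair.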

Recognizing Textual Entailment (RTE)

The Recognizing Textual Entailment (RTE) task is a natural language processing task that involves determining whether a given piece of text (the hypothesis) can be inferred from another piece of text (the premise). The task requires understanding the meaning and context of the text and making a judgment about the logical relationship between the two pieces of text. RTE is often used in applications such as question answering, information retrieval, and text summarization.

For example, given the sentence “Cyprus, divided or not, joins the EU on the 1st of May.” determine if the following statement is entailed: “Cyprus was divided into two parts on May 1.” The correct answer is false (not entailment). To access the pre-trained model from the DeepPavlov library, one should use:

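from deeppavlov import build_model

# Download and load the pretrained RTE model (the config name is a string)
model = build_model('glue_rte_roberta_mnli', download=True, install=True)
# The model takes a batch of premises and a batch of hypotheses
model(["Cyprus, divided or not, joins the EU on the 1st of May."],
      ["Cyprus was divided into two parts on May 1."])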

To test ChatGPT performance, I used the following prompt: “Given the sentence ‘[text_1]’, determine if the following statement is entailed: ‘[text_2]’”. The RTE test set contains thousands of examples, so to compare ChatGPT with the DeepPavlov RTE pretrained model, I used the same 25 sampled instances for each class from the dev set as in [2]. On this sample, DeepPavlov’s RTE model reaches an accuracy of 90%, whereas ChatGPT reaches only 88%. Here are three examples where ChatGPT fails to correctly label the pair of sentences:

Prompt: “Given the sentence ‘Microsoft nearly halved at NYSE following the results as the company said operating income would drop as much as 42 percent in the second quarter.’ determine if the following statement is entailed: ‘Shares of Microsoft fell by almost 50 percent.’”
Expected: entailment
DeepPavlov: entailment
ChatGPT: No, the statement is not entailed. The sentence states that Microsoft nearly halved, which means it decreased by almost 50 percent, but it does not explicitly state that the shares fell by almost 50 percent.

Prompt: “Given the sentence ‘Ssangyong Motor was taken over by creditors after it collapsed under heavy debts during the 1997–98 Asian financial crisis.’ determine if the following statement is entailed: ‘Asian financial crisis takes over Ssangyong Motor’”
Expected: entailment
DeepPavlov: entailment
ChatGPT: No, the statement is not entailed. The sentence states that Ssangyong Motor collapsed under heavy debts during the Asian financial crisis and was taken over by creditors, but it does not imply that the crisis itself took over the company.

Prompt: “Given the sentence ‘Mopresol as the capital of France is the biggest city of Europe.’ determine if the following statement is entailed: ‘Mopresol is the biggest city of France.’”
Expected: entailment
DeepPavlov: entailment
ChatGPT: No, the statement is not entailed.
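
ChatGPT answers in free text, so each reply has to be mapped to a label before it can be scored. The sketch below fills the RTE prompt quoted above and applies a simple keyword heuristic to the reply. The function names and the heuristic are assumptions made for illustration; the article does not describe how replies were actually mapped to labels.

```python
import re

RTE_PROMPT = (
    "Given the sentence '{text_1}', determine if the following "
    "statement is entailed: '{text_2}'"
)


def build_rte_prompt(premise, hypothesis):
    """Fill the RTE prompt template with one sentence pair."""
    return RTE_PROMPT.format(text_1=premise, text_2=hypothesis)


def parse_entailment_reply(reply):
    """Map a free-form reply to a label (keyword heuristic, an assumption)."""
    text = reply.strip().lower()
    # Check negative phrasings first, because "not entailed" contains "entailed".
    if re.match(r"no\b", text) or "not entailed" in text or "not_entailment" in text:
        return "not_entailment"
    if re.match(r"yes\b", text) or "entail" in text:
        return "entailment"
    return "unknown"


prompt = build_rte_prompt(
    "Cyprus, divided or not, joins the EU on the 1st of May.",
    "Cyprus was divided into two parts on May 1.",
)
print(prompt)
print(parse_entailment_reply("No, the statement is not entailed."))
```

Replies that land in the "unknown" bucket are best reviewed by hand rather than silently counted as errors.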

Winograd NLI (WNLI)

ChatGPT has difficulty with coreference resolution, that is, with resolving ambiguous pronouns. The Winograd Schema Challenge assesses a machine learning model’s ability to link a pronoun to the noun it refers to, using the context provided in the sentence. The challenge is considered difficult for NLP models because it requires a deep understanding of natural language and context. The WNLI task combines natural language inference with coreference resolution. To access the pre-trained model from the DeepPavlov library, one should use:

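from deeppavlov import build_model
# Download and load the pretrained WNLI model
model = build_model('glue_wnli_roberta', download=True, install=True)
# Adjacent string literals are concatenated, so the premise can span two lines
model(["Susan knew that Ann's son had been in a car accident, "
       "because she told her about it."],
      ["Ann told her about it."])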

To solve this problem using ChatGPT, I used the following query based on [2]: “Having the premise ‘[sentence1]’, tell me if the following hypothesis ‘[sentence2]’ is entailed with the premise? Answer ‘entailment’ if yes, or ‘not_entailment’ if no”.

In most cases, ChatGPT selects the correct answer. However, it struggles when it has to determine which person a pronoun refers to, or when the solution requires reasoning about physical laws and concepts. In these cases it picks the wrong answer for the given context.

The unlabeled test set of WNLI contains 146 pairs of sequences. Therefore, it is feasible to label the entire test set with ChatGPT and make a fair comparison between ChatGPT and the DeepPavlov pretrained WNLI model. DeepPavlov’s WNLI model reaches an accuracy of 81.5%, whereas ChatGPT reaches only 79.5%. Here are some examples where ChatGPT fails to label the pair of sentences correctly:

Prompt: “Having the premise ‘The cookstove was warming the kitchen, and the lamplight made it seem even warmer’, tell me if the following hypothesis ‘The lamplight made the cookstove seem even warmer’ is entailed with the premise? Answer ‘entailment’ if yes, or ‘not_entailment’ if no.”
Expected: not_entailment
DeepPavlov: not_entailment
ChatGPT: entailment

Prompt: “Having the premise ‘The Harbor folks said Grandma kept her house so clean that you could wipe her floor with a clean handkerchief without getting any dirt on it’, tell me if the following hypothesis ‘You could wipe her floor with a clean handkerchief without getting any dirt on the house’ is entailed with the premise? Answer ‘entailment’ if yes, or ‘not_entailment’ if no.”
Expected: not_entailment
DeepPavlov: not_entailment
ChatGPT: entailment

Prompt: “Having the premise ‘Muriel has hanging in her living room a famous painting by Cecile d’Arblay of a ballerina putting on her shoe’, tell me if the following hypothesis ‘Muriel has hanging in her living room a famous painting by Cecile d’Arblay of a ballerina putting on the ballerina’s shoe’ is entailed with the premise? Answer ‘entailment’ if yes, or ‘not_entailment’ if no.”
Expected: entailment
DeepPavlov: entailment
ChatGPT: not_entailment
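
The accuracy figures quoted in this section are percentages of correctly labeled pairs. A minimal sketch of that computation follows; the function name is hypothetical, and the labels are copied from the three failure cases above. Those three cases were hand-picked, so the numbers they produce say nothing about overall performance.

```python
def accuracy(predictions, references):
    """Percentage of predictions that equal the reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references differ in length")
    if not references:
        raise ValueError("cannot score an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


# Labels copied from the three failure cases listed above.
expected = ["not_entailment", "not_entailment", "entailment"]
chatgpt = ["entailment", "entailment", "not_entailment"]
deeppavlov = ["not_entailment", "not_entailment", "entailment"]

print(accuracy(chatgpt, expected), accuracy(deeppavlov, expected))
```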

Word in Context (WiC)

I also tested ChatGPT on the word sense disambiguation (WSD) task using the WiC dataset. The Word-in-Context (WiC) dataset is a benchmark for evaluating how well machine learning models determine the sense of a word in a given context. It consists of pairs of sentences that share a target word, and the task is to predict whether the word has the same sense in both sentences. The dataset is designed to evaluate a model’s ability to capture contextual information and make accurate sense predictions.

As the ChatGPT input I used the following query from [2]: “Are the given sentences expressing the same sense of the word [‘verb/noun’]: [‘sentence1’] and [‘sentence2’]?”

According to the results in [1], the SOTA accuracy exceeds the ChatGPT accuracy by 9.4 points: 74.00 vs. 64.58. My experiments confirm this: ChatGPT struggles with this task when a word has the same meaning in two different sentences. In this case, ChatGPT explains the meaning of the word in each sentence using synonyms, and when these synonyms differ, it reaches the wrong decision. Here are some examples of mistakes:

Prompt: “Are the given sentences expressing the same sense of the word ‘create’: ‘You can create the color orange by mixing yellow and red.’ and ‘The company was created 25 years ago.’ ? Return your answer as a letter: ‘T’ if the sense is the same, or ‘F’ if it is not the same.”
ChatGPT: F
Expected: True
DeepPavlov: True

Prompt: “Are the given sentences expressing the same sense of the word ‘head’: ‘His horse won by a head.’ and ‘He is two heads taller than his little sister.’ ? Return your answer as a letter: ‘T’ if the sense is the same, or ‘F’ if it is not the same.”
ChatGPT: F
Expected: True
DeepPavlov: True

Prompt: “Are the given sentences expressing the same sense of the word ‘voice’: ‘The voice of the law.’ and ‘Conservatism has many voices.’ ? Return your answer as a letter: ‘T’ if the sense is the same, or ‘F’ if it is not the same.”
ChatGPT: F
Expected: True
DeepPavlov: True
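
For WiC the prompt asks for a single letter, which makes the reply easier to parse than free text. The sketch below fills the prompt used in the examples above and converts the letter to a boolean. The function names are hypothetical, and treating any reply other than T or F as unparsed is an assumption of this sketch.

```python
WIC_PROMPT = (
    "Are the given sentences expressing the same sense of the word "
    "'{word}': '{sentence1}' and '{sentence2}'? Return your answer as a "
    "letter: 'T' if the sense is the same, or 'F' if it is not the same."
)


def build_wic_prompt(word, sentence1, sentence2):
    """Fill the WiC prompt template with a target word and two sentences."""
    return WIC_PROMPT.format(word=word, sentence1=sentence1, sentence2=sentence2)


def parse_letter_answer(reply):
    """Return True for 'T', False for 'F', and None for anything else."""
    letter = reply.strip().strip("'\".").upper()
    return {"T": True, "F": False}.get(letter)


prompt = build_wic_prompt(
    "head",
    "His horse won by a head.",
    "He is two heads taller than his little sister.",
)
print(prompt)
print(parse_letter_answer("F"))
```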

Semantic Textual Similarity Benchmark (STS-B)

The Semantic Textual Similarity (STS) Benchmark task is a natural language processing task that measures the degree of semantic similarity between two sentences. The task involves comparing pairs of sentences and assigning a similarity score between 0 and 5, where 0 indicates no similarity and 5 indicates complete similarity. The STS Benchmark task is used to evaluate the performance of various machine learning models and algorithms in measuring semantic similarity between sentences. STS-B can be particularly challenging for LLMs because of the difference between the language modeling objective and the regression task objective. LLMs are designed to predict the next word in a sequence or generate coherent text, with their pre-training focused on capturing linguistic patterns and relationships. Consequently, their internal representations may not be well-suited for modeling continuous numerical outputs. To access the pre-trained model from the DeepPavlov library, one should use:

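from deeppavlov import build_model

# Download and load the pretrained STS-B model (the config name is a string)
model = build_model('glue_stsb_cased_bert_torch', download=True, install=True)
# Returns a similarity score for each sentence pair in the batch
model(["A man is playing a guitar."], ["A man is playing a flute."])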

The following prompt from [2] was used to determine the score with ChatGPT: “Determine the similarity between the following two sentences: [‘sentence1’] and [‘sentence2’]. The score should be ranging from 0.0 to 5.0 and can be a decimal.” Like the authors of [2], I can confirm that ChatGPT determines semantic textual similarity much worse than the pretrained Transformer-based model. In my experiments, this mostly affects sentence pairs whose similarity lies strictly between 0 and 5. The problem may be that ChatGPT uses unsuitable criteria for comparing texts.

Due to the large number of test examples, I evaluated a sample from the dev set, as in [2]. The pretrained STS-B models in DeepPavlov outperform both ChatGPT and the models from [2] that are based on the same Transformer encoder.

Here are some examples where ChatGPT fails to label the pair of sentences correctly:

Prompt: “Determine the similarity between the following two sentences: ‘A girl is riding a horse.’ and ‘The girl trotted the horse.’ The score should be ranging from 0.0 to 5.0 and can be a decimal.”
Expected: 4.5
ChatGPT: 2.135
DeepPavlov: 3.09

Prompt: “Determine the similarity between the following two sentences: ‘The cook is kneading the flour.’ and ‘A woman is kneading tortilla dough.’ The score should be ranging from 0.0 to 5.0 and can be a decimal.”
Expected: 2.6
ChatGPT: 0.548
DeepPavlov: 2.12

Prompt: “Determine the similarity between the following two sentences: ‘One man is breaking cement on another man’s chest.’ and ‘A man breaks cinder blocks on another man.’ The score should be ranging from 0.0 to 5.0 and can be a decimal.”
Expected: 3.75
ChatGPT: 1.9
DeepPavlov: 3.82
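
To make the gap in these three examples concrete, the sketch below extracts a numeric score from a reply and computes the mean absolute error against the expected scores. Both function names are hypothetical. Mean absolute error is swapped in here purely for illustration: the GLUE benchmark scores STS-B with Pearson and Spearman correlation, which are not meaningful on three data points.

```python
import re


def parse_score(reply, low=0.0, high=5.0):
    """Extract the first number from a reply and clamp it to [low, high]."""
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        return None
    return min(high, max(low, float(match.group())))


def mean_absolute_error(predictions, references):
    """Average absolute difference between predicted and reference scores."""
    if len(predictions) != len(references) or not references:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(p - r) for p, r in zip(predictions, references))
    return total / len(references)


# Scores copied from the three examples above.
expected = [4.5, 2.6, 3.75]
chatgpt = [2.135, 0.548, 1.9]
deeppavlov = [3.09, 2.12, 3.82]

print(round(mean_absolute_error(chatgpt, expected), 3))
print(round(mean_absolute_error(deeppavlov, expected), 3))
```

Taking the first number in the reply is a heuristic; a reply that mentions another number before the score would need stricter parsing.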

The Corpus of Linguistic Acceptability (CoLA)

ChatGPT can serve as a valuable writing aid. Its effectiveness can be evaluated with the CoLA dataset, which assesses a model’s ability to determine whether a sentence is grammatically correct. The CoLA task involves classifying a sentence as either grammatically acceptable or unacceptable. The dataset comprises English sentences from diverse sources, ranging in complexity from basic to intricate. Human annotators label each sentence for grammatical acceptability. You can use the DeepPavlov CoLA pretrained model as follows:

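from deeppavlov import build_model

# Download and load the pretrained CoLA model
model = build_model('glue_cola_cased_bert_torch', install=True, download=True)
# Returns an acceptability label for each sentence in the batch
model(["The more people you give beer to, the more people get sick."])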

As the input to ChatGPT, I used the following prompt from [2]: “Is the following sentence [‘sentence’] grammatically correct? Answer ‘T’ if yes, or ‘F’ if no.” To compare ChatGPT with the DeepPavlov CoLA pretrained model, I used a sample from the dev set, as in [2]. The comparison shows that the DeepPavlov CoLA model outperforms ChatGPT. Here are two examples where ChatGPT mislabels a sentence:

Prompt: “Is the following sentence ‘The proof this set is recursive is difficult.’ grammatically correct? Answer ‘T’ if yes, or ‘F’ if no.”
ChatGPT: T
Expected: F
DeepPavlov: F

Prompt: “Is the following sentence ‘Himself is understood by Rutherford.’ grammatically correct? Answer ‘T’ if yes, or ‘F’ if no.”
ChatGPT: T
Expected: F
DeepPavlov: F
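
The GLUE benchmark scores CoLA with the Matthews correlation coefficient rather than plain accuracy, because the two classes are unbalanced. The article does not state which metric was used for its comparison, so the sketch below only illustrates that standard metric. The labels in it are hypothetical toy data, not results from the experiments.

```python
import math


def matthews_corrcoef(predictions, references):
    """Matthews correlation coefficient for boolean labels (True = acceptable)."""
    pairs = list(zip(predictions, references))
    tp = sum(p and r for p, r in pairs)
    tn = sum((not p) and (not r) for p, r in pairs)
    fp = sum(p and (not r) for p, r in pairs)
    fn = sum((not p) and r for p, r in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the score is 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy labels, not taken from the article's experiments.
references = [True, True, False, False]
predictions = [True, True, True, False]
print(matthews_corrcoef(predictions, references))
```

A score of 1 means perfect agreement, 0 is no better than chance, and -1 means every prediction is inverted.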

Conclusion

In this article, I compared ChatGPT with DeepPavlov models on selected GLUE tasks and showed that on these tasks DeepPavlov-based models consistently outperform ChatGPT: by about two accuracy points on RTE and WNLI, and more clearly on STS-B. For each task, I provided examples where ChatGPT performed worse or failed completely. I hope you find this helpful in building your own NLP models.

References:

[1] Kocoń, Jan, et al. “ChatGPT: Jack of all trades, master of none.” Information Fusion (2023): 101861.

[2] Zhong, Qihuang, et al. “Can ChatGPT understand too? A comparative study on ChatGPT and fine-tuned BERT.” arXiv preprint arXiv:2302.10198 (2023).

[3] Burtsev, Mikhail, et al. “DeepPavlov: An open source library for conversational AI.” (2018).

Published via Towards AI


',*/ ]; var replaceText = { '': '', '': '', '
': '
' + ctaLinks + '
', }; Object.keys(replaceText).forEach((txtorig) => { //txtorig is the key in replacetext object const txtnew = replaceText[txtorig]; //txtnew is the value of the key in replacetext object let entryFooter = document.querySelector('article .entry-footer'); if (document.querySelectorAll('.single-post').length > 0) { //console.log('Article found.'); const text = entryFooter.innerHTML; entryFooter.innerHTML = text.replace(txtorig, txtnew); } else { // console.log('Article not found.'); //removing comment 09/04/24 } }); var css = document.createElement('style'); css.type = 'text/css'; css.innerHTML = '.post-tags { display:none !important } .article-cta a { font-size: 18px; }'; document.body.appendChild(css); //Extra //This function adds some accessibility needs to the site. function addAlly() { // In this function JQuery is replaced with vanilla javascript functions const imgCont = document.querySelector('.uw-imgcont'); imgCont.setAttribute('aria-label', 'AI news, latest developments'); imgCont.title = 'AI news, latest developments'; imgCont.rel = 'noopener'; document.querySelector('.page-mobile-menu-logo a').title = 'Towards AI Home'; document.querySelector('a.social-link').rel = 'noopener'; document.querySelector('a.uw-text').rel = 'noopener'; document.querySelector('a.uw-w-branding').rel = 'noopener'; document.querySelector('.blog h2.heading').innerHTML = 'Publication'; const popupSearch = document.querySelector$('a.btn-open-popup-search'); popupSearch.setAttribute('role', 'button'); popupSearch.title = 'Search'; const searchClose = document.querySelector('a.popup-search-close'); searchClose.setAttribute('role', 'button'); searchClose.title = 'Close search page'; // document // .querySelector('a.btn-open-popup-search') // .setAttribute( // 'href', // 'https://medium.com/towards-artificial-intelligence/search' // ); } // Add external attributes to 302 sticky and editorial links function extLink() { // Sticky 302 links, this fuction opens the link we send to Medium on 
a new tab and adds a "noopener" rel to them var stickyLinks = document.querySelectorAll('.grid-item.sticky a'); for (var i = 0; i < stickyLinks.length; i++) { /* stickyLinks[i].setAttribute('target', '_blank'); stickyLinks[i].setAttribute('rel', 'noopener'); */ } // Editorial 302 links, same here var editLinks = document.querySelectorAll( '.grid-item.category-editorial a' ); for (var i = 0; i < editLinks.length; i++) { editLinks[i].setAttribute('target', '_blank'); editLinks[i].setAttribute('rel', 'noopener'); } } // Add current year to copyright notices document.getElementById( 'js-current-year' ).textContent = new Date().getFullYear(); // Call functions after page load extLink(); //addAlly(); setTimeout(function() { //addAlly(); //ideally we should only need to run it once ↑ }, 5000); }; function closeCookieDialog (){ document.getElementById("cookie-consent").style.display = "none"; return false; } setTimeout ( function () { closeCookieDialog(); }, 15000); console.log(`%c 🚀🚀🚀 ███ █████ ███████ █████████ ███████████ █████████████ ███████████████ ███████ ███████ ███████ ┌───────────────────────────────────────────────────────────────────┐ │ │ │ Towards AI is looking for contributors! │ │ Join us in creating awesome AI content. 
│ │ Let's build the future of AI together → │ │ https://towardsai.net/contribute │ │ │ └───────────────────────────────────────────────────────────────────┘ `, `background: ; color: #00adff; font-size: large`); //Remove latest category across site document.querySelectorAll('a[rel="category tag"]').forEach(function(el) { if (el.textContent.trim() === 'Latest') { // Remove the two consecutive spaces (  ) if (el.nextSibling && el.nextSibling.nodeValue.includes('\u00A0\u00A0')) { el.nextSibling.nodeValue = ''; // Remove the spaces } el.style.display = 'none'; // Hide the element } }); // Add cross-domain measurement, anonymize IPs 'use strict'; //var ga = gtag; ga('config', 'G-9D3HKKFV1Q', 'auto', { /*'allowLinker': true,*/ 'anonymize_ip': true/*, 'linker': { 'domains': [ 'medium.com/towards-artificial-intelligence', 'datasets.towardsai.net', 'rss.towardsai.net', 'feed.towardsai.net', 'contribute.towardsai.net', 'members.towardsai.net', 'pub.towardsai.net', 'news.towardsai.net' ] } */ }); ga('send', 'pageview'); -->