
Trends in AI — August 2022

Last Updated on July 26, 2023 by Editorial Team

Author(s): Sergi Castella i Sapé

Originally published on Towards AI.

3% of Google’s new code is written by a Language Model, DeepSpeed for inference with a Trillion-scale Mixture of Experts, beating neural scaling laws, prompting in Information Retrieval, Language Model Cascades, and much more.

Image by Zeta Alpha.

While blockbuster research has slowed down slightly in the past month, probably because of the summer season, conferences are back at full speed in person: NAACL in Seattle, SIGIR in Madrid, and also ICML, for which we created a special guide with the help of GPT-3. Other news we’d like to highlight before diving in:

🔬 Research

Every month we analyze the most recent research literature and select a varied set of 10 papers you should know about. This month we’re covering topics such as Reinforcement Learning (RL), Scaling Laws, Information Retrieval, Language Models, and more.

1. Beyond neural scaling laws: beating power law scaling via data pruning

By Ben Sorscher, Robert Geirhos, Shashank Shekhar, Surya Ganguli, Ari S. Morcos.

❓ Why → Scaling laws¹ are a pervasive empirical phenomenon in modern Neural Networks, where the error is observed to fall off as a power of the training set size, model size, or both. While some have embraced this fact to devise a research agenda focused on scaling up, many think there must be ways to build better models without the need for outrageous scale. This paper explores a technique — data pruning — that can improve the learning efficiency of NNs, “beating” scaling laws.
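
For reference, the power law being “beaten” here has roughly the following empirical form (standard notation from the scaling-laws literature¹, not an equation copied from this paper):

```latex
% Test error as a function of training set size N (illustrative form):
E(N) \approx a\,N^{-\alpha} + E_{\infty}, \qquad \alpha > 0
% Data pruning aims to make the error fall faster than any such power law
% as a function of the number of *retained* training examples.
```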

💡 Key insights → In the context of this work, pruning refers to dropping samples from the training dataset, not pruning weights of the Neural Network. The intuition behind the proposed method is simple: imagine you could rank the samples in a training dataset from “easy to learn” to “hard to learn” for a model. A typical dataset will contain too many of the easy-to-learn samples — that is to say, fewer would suffice to reach good performance on them — and too few from the hard-to-learn side — meaning you’d need more examples to train the model on them properly.

One way of solving this problem is to scale up the whole training dataset because, given enough scaling up — assuming the data distribution is uniform — you’ll eventually get enough of the “hard to learn” samples. But that’s very wasteful. What if we could use a priori knowledge to curate a training dataset that contains a better balance of easy-to-learn and hard-to-learn samples? This is what this paper investigates.

This problem can be formalized as finding a pruning metric to assign to each training sample (a proxy for its hardness), which is then used to prune the training dataset to the desired size. The paper proposes a new metric that performs comparably to existing ones that require labeled data.

Source: https://arxiv.org/pdf/2206.14486.pdf

However, the most interesting contribution, in my opinion, is their section on data pruning without labels. They perform k-means clustering on the embeddings of a pre-trained ImageNet model and define the hardness of each sample as its distance to the nearest centroid: the intuition is that prototypical samples that are easy to learn will be closest to a centroid, whereas hard-to-learn samples will fall further away from their cluster centroids. The results show how around 20% of the training samples from ImageNet can be pruned without sacrificing performance.
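
A minimal sketch of that label-free pruning metric, assuming you already have embeddings from a pre-trained encoder (the cluster count, keep fraction, and function name below are illustrative choices, not the paper’s exact settings):

```python
import numpy as np
from sklearn.cluster import KMeans

def prune_by_cluster_distance(embeddings: np.ndarray, keep_fraction: float = 0.8) -> np.ndarray:
    """Keep the 'hardest' keep_fraction of samples, where hardness is the
    distance of each embedding to its nearest k-means centroid."""
    kmeans = KMeans(n_clusters=100, n_init=10).fit(embeddings)
    # Per-sample distance to the assigned (nearest) centroid.
    dists = np.linalg.norm(embeddings - kmeans.cluster_centers_[kmeans.labels_], axis=1)
    order = np.argsort(dists)               # ascending: prototypical/easy samples first
    n_keep = int(keep_fraction * len(embeddings))
    return order[-n_keep:]                  # drop the easiest ~20% when keep_fraction=0.8
```

In the abundant-data regime studied here, the easy, prototypical samples are the ones you can afford to drop; the paper also reports that with scarce data the opposite ordering tends to work better.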

To be fair, the results in the paper are not jaw-dropping, but the key ideas behind it have the potential to be useful in other tasks such as image segmentation, language modeling, or any other multimodal dataset curation.

2. Denoised MDPs: Learning World Models Better Than the World Itself | Project Page | Code

By Tongzhou Wang, Simon S. Du, Antonio Torralba, Phillip Isola, Amy Zhang, Yuandong Tian.

❓ Why → The world contains a lot of information that’s irrelevant for navigating it. At the core of many Machine Learning techniques is the ability to discern relevant and useful signals — or patterns — from noise.

💡 Key insights → This work formalizes the problem of separating “the good from the irrelevant information” in the context of reinforcement learning by identifying information that’s both controllable by the agent and relevant for the reward, as described in the figure below.

Source: https://arxiv.org/pdf/2206.15477.pdf

Based on this idea, the authors propose Denoised MDPs (Markov Decision Processes), a method for learning a factorization of state representations that disentangles the controllable and reward-relevant bits of the state using information-theoretic principles. The gist of it is that different factors of the state should be maximally or minimally predictive of other factors depending on their relationship, which enables the authors to set up a variational objective for the agent to optimize (you’ll need to check out the paper for the real mathy goodies).

Source: https://arxiv.org/pdf/2206.15477.pdf

The result is a world model that explicitly models what information should be discarded as noise and what information should be used for modeling the agent’s decisions. The authors show that this approach is competitive on the DeepMind Control Suite, and, fascinatingly, they showcase qualitatively how the Denoised MDP representations work by training a decoder to reconstruct the input, so you can see what the signal representation of the state learns to capture. You can watch those reconstructions on their project page.

Source: https://www.tongzhouwang.info/denoised_mdp/

3. Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers | Code

By Weng Lam Tam, Xiao Liu, Kaixuan Ji, Lilong Xue, Xingjian Zhang, Yuxiao Dong, Jiahua Liu, Maodi Hu, Jie Tang.

❓ Why → Prompting has been making strides in NLP for the past couple of years, and now it also seems promising for Information Retrieval.

💡 Key insights → Prompt tuning is a technique for adapting a pre-trained frozen model to a given task by adding trainable prompt tokens to the input of the sequence model. One of the main advantages of this approach over full model finetuning is that it only requires training a small number of parameters, which is much more efficient and also enables higher reusability of the original pre-trained model.
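
As a rough illustration of the mechanics, here is a sketch of vanilla prompt tuning in PyTorch style (not the authors’ code; the class and parameter names are my own, and P-Tuning v2² additionally injects prompts at every Transformer layer rather than only at the input):

```python
import torch
import torch.nn as nn

class PromptTunedEncoder(nn.Module):
    """A frozen pre-trained backbone plus a small set of trainable prompt vectors."""

    def __init__(self, backbone: nn.Module, embed_dim: int, n_prompt_tokens: int = 20):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False                       # keep the pre-trained LM frozen
        # The only trainable parameters: continuous prompt embeddings.
        self.prompt = nn.Parameter(0.02 * torch.randn(n_prompt_tokens, embed_dim))

    def forward(self, token_embeddings: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, seq_len, embed_dim) from the frozen embedding layer.
        prompts = self.prompt.unsqueeze(0).expand(token_embeddings.size(0), -1, -1)
        # Prepend the trainable prompt tokens to every input sequence.
        return self.backbone(torch.cat([prompts, token_embeddings], dim=1))
```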

The authors train both a Dense Passage Retriever (retrieval by nearest-neighbor search over query and document embeddings) and a ColBERT⁴-style model with late interaction (which includes joint modeling of query and document). The difference here is that instead of finetuning the whole model, they only finetune a prompt while keeping the pre-trained LM frozen. They base their implementation on the P-Tuning v2² method, a more advanced form of prompt tuning where the trainable prompt is not only prepended to the input but also injected at each layer of the Transformer.

Source: https://arxiv.org/pdf/2207.07087.pdf

The most interesting part of the results concerns generalization. While prompt tuning performs comparably to full fine-tuning on in-domain benchmarks, it substantially outperforms it on various cross-domain datasets from the BEIR³ benchmark.

Source: https://arxiv.org/pdf/2207.07087.pdf

Yet again, this work strengthens the hypothesis that prompting and all its advantages are a viable alternative to fine-tuning and will probably keep increasing in popularity.

4. DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale | Code

By Reza Yazdani Aminabadi, Samyam Rajbhandari, Minjia Zhang, Ammar Ahmad Awan, Cheng Li, Du Li, Elton Zheng, Jeff Rasley, Shaden Smith, Olatunji Ruwase, Yuxiong He.

❓ Why → DeepSpeed — the framework for large-scale distributed training of large NNs developed and used by Microsoft — now supports inference as well as training.

💡 Key insights → The landscape of large Transformer architectures has diversified in the past year since the success of sparse Mixture of Experts models in scaling up to the trillion-parameter regime. The key advantage of these architectures is that they gain expressive power from their larger size, but at inference time only an input-dependent subset of the weights is used, which makes them more efficient (if the implementation is also optimized!). The downside is that training and running these models efficiently is more involved, because most existing Deep Learning libraries and hardware were not designed with this type of computation in mind.
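
To see why only a subset of the weights is used per input, here is a toy top-1 Mixture-of-Experts layer (a simplified sketch of the general technique, not DeepSpeed’s implementation; real systems shard experts across GPUs and batch the routing):

```python
import torch
import torch.nn as nn

class TinyMoELayer(nn.Module):
    """Toy top-1 Mixture-of-Experts feed-forward layer."""

    def __init__(self, dim: int, n_experts: int = 8):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, dim). Each token is routed to its single best expert.
        gate = self.router(x).softmax(dim=-1)      # (n_tokens, n_experts)
        top_expert = gate.argmax(dim=-1)           # (n_tokens,)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_expert == e
            if mask.any():                         # only the selected experts are evaluated
                out[mask] = expert(x[mask]) * gate[mask, e].unsqueeze(-1)
        return out
```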

While DeepSpeed was previously designed to enable training large Transformers, the latest update focuses on making inference faster, both in latency and throughput, for all kinds of Transformer models, including sparsely activated architectures.

We’re talking about a system that enables parallelism across heterogeneous hardware (hundreds of GPUs, CPU memory, and NVMe storage) and delivers high-speed inference even when large models can’t fit in aggregate GPU memory.

Source: https://arxiv.org/pdf/2207.00032.pdf

Let’s be honest, though: most people reading this article won’t ever need to use such a framework to train trillion-scale models themselves, but it’s still a fascinating read if you’re interested in the engineering that goes into training and running massive Neural Networks.

5. Language Models (Mostly) Know What They Know

By Saurav Kadavath et al.

❓ Why → Performance is far from the only desirable property of ML models. Accurately knowing how confident they are in their outputs can be even more important, especially in safety-focused applications.

💡 Key insights → Calibration is the concept used in Machine Learning to indicate how well-tuned a model’s prediction confidence is (e.g., when a perfectly calibrated model gives an output with 90% certainty, it should be right 9 out of 10 times, no less, no more).

This work first investigates the calibration of LMs answering questions by formulating the prompt in a multiple-choice fashion, where the output of the model is a single token with the answer. Given that the answer is a single token, its probability can be calculated directly from the model’s likelihood of that token, normalized over the valid answer tokens (e.g. A, B, or C).
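
A small sketch of that normalization step (an illustrative helper of my own, not the paper’s code): take the model’s next-token logits after the prompt and renormalize over the allowed answer tokens only.

```python
import torch

def answer_probabilities(next_token_logits: torch.Tensor, option_token_ids: list[int]) -> torch.Tensor:
    """Probability of each multiple-choice option, normalized over valid answers only.

    next_token_logits: (vocab_size,) logits for the token that follows the prompt.
    option_token_ids:  token ids of the allowed answers, e.g. for " A", " B", " C".
    """
    option_logits = next_token_logits[option_token_ids]
    return torch.softmax(option_logits, dim=-1)
```

For a well-calibrated model, predictions made with probability p should be correct about a fraction p of the time.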

While LMs are highly sensitive to prompt design and formatting, given an appropriate question formulation, large LMs are very well calibrated. Interestingly, this capability breaks down at smaller scales (see figure below), and also at the tails of very low or very high confidence.

Source: https://arxiv.org/pdf/2207.05221.pdf

The paper dives into many more calibration comparisons and analyses than we can cover here, but the takeaway remains: LMs know what they know (they are well calibrated), but results are still very vulnerable to prompt variations, and models need to be large enough.

Source: https://arxiv.org/pdf/2207.05221.pdf

6. Towards Grand Unification of Object Tracking (Unicorn 🦄) | Code

By Bin Yan, Yi Jiang, Peize Sun, Dong Wang, Zehuan Yuan, Ping Luo, Huchuan Lu.

❓ Why → Decluttering and unifying Machine Learning model architectures has proven fruitful in NLP in the past few years; now it’s time for video Computer Vision tasks.

💡 Key insights → When it comes to video-related tasks, existing top-performing models still tend to rely on task-specific designs and, as a result, overspecialize for their particular application.

The authors propose a single model architecture that performs competitively in 4 modes for Object Tracking: Single Object Tracking (SOT), Multiple Object Tracking (MOT), Video Object Segmentation (VOS), and Multi-Object Tracking and Segmentation (MOTS).

This architecture is quite complex and is best understood through the figure below. In broad strokes, it starts with a unified backbone for embedding images; then a unified embedding is computed for the reference and current frames. A Transformer handles the feature interaction between the unified embedding and the current frame to output the class, box, and masks corresponding to all object-tracking flavors.

Source: https://arxiv.org/pdf/2207.07078.pdf

The system is evaluated on several object tracking benchmarks such as LaSOT, TrackingNet, MOT17, and BDD100K (among others) and sets new state-of-the-art performance on most of them.

Source: https://arxiv.org/pdf/2207.07078.pdf

7. Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?

By Yi Tay, Mostafa Dehghani, Samira Abnar, Hyung Won Chung, William Fedus, Jinfeng Rao, Sharan Narang, Vinh Q. Tran, Dani Yogatama, Donald Metzler.

❓ Why → Are we missing phenomena by benchmarking models at a particular compute/data scale? Yes.

💡 Key insights → The authors perform hundreds of experiments across scales for a wide range of architectures: vanilla, efficient, and advanced Transformers, MLP-Mixer, and convolution-based models. The experiments consist of pretraining with autoregressive language modeling, which yields an upstream performance measure (language perplexity), followed by supervised fine-tuning on GLUE, SuperGLUE, and SQuAD (downstream performance).

The TL;DR is straightforward. While not the best option in all scale regimes, the vanilla Transformer shows the most robust and consistent performance scaling across scales. Other useful tidbits of wisdom are:

  • Convolution- and MLP-based architectures do well in pretraining (upstream performance) but fail to transfer properly when finetuned. This points toward the importance of architectural inductive biases when it comes to transfer learning.
  • Efficient Transformers only perform competitively with their full-attention counterparts in a certain scale regime and lag behind if scaled up sufficiently.
Source: https://arxiv.org/pdf/2207.10551.pdf

8. Discrete Key-Value Bottleneck

By Frederik Träuble, Anirudh Goyal, Nasim Rahaman, Michael Mozer, Kenji Kawaguchi, Yoshua Bengio, Bernhard Schölkopf.

❓ Why → As the benchmarking culture in ML slowly but surely shifts focus to out-of-domain generalization, inductive biases tailored for it will become more relevant.

💡 Key insights → This method is part of a longer research agenda from Bengio's group that we've highlighted before (e.g. Coordination Among Neural Modules Through a Shared Global Workspace), focused on how introducing bottlenecks on information flow in architectures leads to representations that are robust and generalize better to out-of-distribution data. In this case, the proposed method consists of 3 steps (sketched in code below):

  • Encoding a high dimensional input (e.g. image) into an embedding with an encoder pre-trained on a large dataset.
  • Splitting the embedding into C lower dimensional heads and finding the nearest neighbor from a set of predefined vectors which are frozen during training. Then use that nearest neighbor’s representation across heads to reconstruct the embedding. The discretized value codes are frozen during the training of the decoder.
  • The decoder gets the reconstructed embedding as input and produces the task-specific output.
Source: https://arxiv.org/pdf/2207.11240.pdf
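
A compact sketch of the second and third steps described above (dimensions, codebook sizes, and class name are illustrative stand-ins, not the paper’s exact configuration):

```python
import torch
import torch.nn as nn

class DiscreteKeyValueBottleneck(nn.Module):
    """Quantize a pre-trained embedding through C per-head key-value codebooks."""

    def __init__(self, embed_dim: int = 512, n_heads: int = 8, codes_per_head: int = 64):
        super().__init__()
        assert embed_dim % n_heads == 0
        self.n_heads, self.head_dim = n_heads, embed_dim // n_heads
        # Keys are matched against the input; the paired values feed the task decoder.
        self.keys = nn.Parameter(torch.randn(n_heads, codes_per_head, self.head_dim))
        self.values = nn.Parameter(torch.randn(n_heads, codes_per_head, self.head_dim))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, embed_dim) embedding from a frozen, pre-trained encoder.
        batch = z.size(0)
        heads = z.view(batch, self.n_heads, self.head_dim).transpose(0, 1)  # (heads, batch, d)
        nearest = torch.cdist(heads, self.keys).argmin(dim=-1)              # nearest code per head
        picked = torch.stack(
            [self.values[h, nearest[h]] for h in range(self.n_heads)], dim=1
        )                                                                    # (batch, heads, d)
        return picked.reshape(batch, -1)   # quantized embedding passed to the task decoder
```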

The experiments focus on the setting where a model is trained on one distribution and then adapted to a new one, as you can see in the figure below. The model is initialized by training on i.i.d. data, updating the decoder weights, and keeping the codebooks. When adapting the model to a distribution shift, the decoder is frozen, and only the codebook values are updated.

Source: https://arxiv.org/pdf/2207.11240.pdf

Their experiments show that this approach reduces catastrophic forgetting and results in more robust predictions. This work will not have a big short-term impact — results are not groundbreaking — but some of these ideas could be a key catalyst for the next leap in true generalization.

9. Language Model Cascades

By David Dohan, Winnie Xu, Aitor Lewkowycz, Jacob Austin, David Bieber, Raphael Gontijo Lopes, Yuhuai Wu, Henryk Michalewski, Rif A. Saurous, Jascha Sohl-dickstein, Kevin Murphy, Charles Sutton.

❓ Why → Large Language Models have become so powerful that they’re increasingly used as a black-box building block for other applications such as Reinforcement Learning or data augmentation.

💡 Key insights → This work formalizes the interaction of Language Models through the lens of probabilistic programming: directed graphical models of random variables that map to natural language strings. This turns out to be a powerful language for analyzing existing advanced prompting techniques such as Scratchpad⁶ or chain-of-thought⁷. Moreover, it can be used to design new forms of interaction between LMs or new advanced prompting techniques.
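
To make the idea concrete, here is a toy cascade where `lm` is a hypothetical stand-in for any LM sampler (not the paper’s code): chain-of-thought becomes one string-valued random variable (the reasoning) feeding another (the answer).

```python
from typing import Callable

# Any "prompt in, continuation out" sampler works here: a local model or an API wrapper.
LM = Callable[[str], str]

def chain_of_thought_cascade(lm: LM, question: str) -> str:
    """Two-node cascade: sample a reasoning string, then condition the answer on it."""
    reasoning = lm(f"Q: {question}\nLet's think step by step.\n")
    answer = lm(f"Q: {question}\nReasoning: {reasoning}\nTherefore, the answer is")
    return answer.strip()

# Usage (assuming a `my_model.generate(prompt) -> str` callable exists):
# print(chain_of_thought_cascade(my_model.generate, "What is 17 * 24?"))
```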

The practical value of this method is still a bit of an open question, but it has the potential to be a powerful conceptual tool for building complex ML systems by abstracting away the classic Deep Learning toolbox (weights, embeddings, gradient descent, objective functions), creating space for new higher-level mechanisms for reasoning.

Source: https://arxiv.org/pdf/2207.10342.pdf

10. ZeroC: A Neuro-Symbolic Model for Zero-shot Concept Recognition and Acquisition at Inference Time

By Tailin Wu, Megan Tjandrasuwita, Zhengxuan Wu, Xuelin Yang, Kevin Liu, Rok Sosič, Jure Leskovec.

❓ Why → Neurosymbolics delivering on its promise…?

💡 Key insights → ZeroC is a method for representing concepts as graphs of constituent concept models (i.e. primary shapes such as lines). The main objective of this paper is to build a system that can recognize unseen concepts at inference time. For instance, in the figure below, the letter F was not seen by the model, but it’s able to disentangle its components (lines) and their relationships (angles and positions), representing them as an explicit graph with 3 nodes and 3 edges.

Source: https://arxiv.org/pdf/2206.15049.pdf

The recipe for training such a system relies on Energy-Based Models (EBMs): feed positive and negative image/graph representation pairs and minimize the energy of positive pairs while accounting for the graph invariances of the representation. The experiments show success in modest gridworld environments where the primary shapes and relationships are quite simple, but this represents a first step toward learning structure-rich representations that could become useful in the context of few-shot learning and generalization.

References:

[1] “Scaling Laws for Neural Language Models” by Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei; 2020.

[2] “P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks” by Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang, 2022.

[3] “BEIR: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models” by Nandan Thakur, Nils Reimers, Andreas Rücklé, Abhishek Srivastava, Iryna Gurevych; 2021.

[4] “ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT” by Omar Khattab, Matei Zaharia; 2020.

[6] “Show Your Work: Scratchpads for Intermediate Computation with Language Models” by Maxwell Nye et al. 2021.

[7] “Chain of Thought Prompting Elicits Reasoning in Large Language Models” by Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou; 2022.

Join over 80,000 subscribers of the AI newsletter and keep up to date with the latest developments in AI, from research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.

Published via Towards AI
