

Last Updated on July 25, 2023 by Editorial Team

Author(s): Towards AI Editorial Team

Originally published on Towards AI.

Introduction to Reinforcement Learning Series, Tutorial 1: Motivation, States, Actions, and Rewards

Table of Contents:

1. What is Reinforcement Learning?

2. Why is this Useful?

3. Markov Decision Process

4. State, Actions & Rewards

State sₜ

Action aₜ

Reward rₜ

5. Mini-Exercise — Identify State, Action & Rewards

6. Policy π(s)

Review

7. Example Solved MDP

The Optimal Policy

8. Exercise 2 — Barry’s Boeings

Exercise 3 — Barry’s Execs

Welcome to the first tutorial for our Introduction to Reinforcement Learning series!

The idea that we learn by interacting with our environment probably occurs to us first when we think about learning something. When an infant plays or waves its arms, it has no teacher. However, it interacts with its environment by using its senses (e.g. vision, hearing) and taking actions (e.g. moving, speaking). This connection produces a wealth of information about cause and effect, the consequences of actions, and what to do to achieve goals.

This series initially formed part of Delta Academy and is brought to you in partnership with the Delta Academy team’s new project Superflows!

Superflows turns replying to emails into a single click. It suggests multiple 1-click replies in your tone of voice, customised with info like calendar invites. It’s the ultimate productivity tool to reclaim time from your inbox.

1. What is Reinforcement Learning?

The term Reinforcement Learning refers to both a class of problems & a set of solutions.

Reinforcement Learning problems are sequential decision-making problems, i.e., you must repeatedly make decisions.

Reinforcement Learning solutions are computational approaches to learning from interactions with an environment that solve sequential decision-making problems. They are a type of machine learning that learns from interacting with an environment that gives feedback.

The broader picture of Machine Learning

How does reinforcement learning fit into the broader picture of machine learning algorithms? Machine learning strategies can be divided into three categories.

  • Supervised Learning: Observations in the dataset are all labeled. The algorithm learns the mapping between data points and their labels. So, after learning, the algorithm can assign labels to new unlabelled data.
  • Unsupervised Learning: Observations in the dataset are unlabeled. The algorithm learns patterns in the data and relationships between data points. An example is clustering, where an algorithm learns how best to assign data points to groups.
  • Reinforcement Learning: RL differs from both supervised learning and unsupervised learning in that it does not have access to an unlabelled or labeled dataset, but rather selects actions and learns only from the feedback given by the environment.
Figure — 1: Machine Learning Strategies

2. Why is this Useful?

RL can learn highly complex behavior — it has seen remarkable breakthroughs in recent years. The highest-profile of these have been in playing games.

An RL algorithm, AlphaGo, famously beat the world champion in the game of Go in 2016. There’s a movie about it that’s free on YouTube; we’d strongly recommend it.

RL is also the technology behind this amazing feat accomplished by OpenAI in 2019 — they trained a robotic hand to solve a Rubik’s cube one-handed! It uses the video from cameras around the robotic hand to give it feedback as to how it’s doing.

Reinforcement Learning has also reached a superhuman level of play in poker and in numerous video games, including Dota 2 & StarCraft 2.

While game-playing may seem like a small niche, RL can be applied far beyond this — it’s already used practically in a massive range of industries — from autonomous driving and robotics to advertising and agriculture. And as the technology develops in capabilities, so too will the number of practical applications!

Examples

A good way to understand reinforcement learning is to consider some examples of applications.

  • A master chess player looks at the pieces on the board and makes a move.
  • An adaptive controller adjusts an oil refinery’s operation in real-time. The controller optimizes the yield/cost/quality trade-off.
  • A mobile robot decides whether it should enter a new room in search of more trash to collect or start trying to find its way back to its battery recharging station.

These examples share features that are so basic that they are easy to overlook. All involve interaction between a decision-making agent and its environment. The agent seeks to achieve a goal. The agent’s actions affect the future state of the environment (e.g. the level of reservoirs of the refinery, the robot’s next location, and the future charge level of its battery). This, therefore, affects the actions and opportunities available to the agent at later times.

Note: from here on, this tutorial is unusually terminology-heavy. Stick with it and keep the above examples in mind; it’ll be worth the effort in the end!

3. Markov Decision Process

A Markov Decision Process (MDP) is the technical name for the broad set of problems that Reinforcement Learning algorithms solve.

They are discrete-time processes. This means time progresses in discrete steps. E.g. a single action is taken every 0.1 seconds, so at 0.1s, 0.2s, 0.3s, …

At each timestep t, the agent takes an action a, the state s updates as a result of this action, and a reward r is given.

They have specific definitions of their 3 key elements, which are detailed below:

  1. The state (denoted s ) — the state the environment is in
  2. An action (denoted a ) — one of the possible decisions made or actions taken by the agent
  3. The reward (denoted r ) — the feedback signal from the environment

The aim of solving the MDP is to find a way of acting that maximizes the sum of future rewards, i.e., the reward at every future timestep, added up.

Figure — 2: Markov Decision Process
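To make this loop concrete, here’s a minimal Python sketch of an agent interacting with a toy environment. The environment, its step method and the random agent are made up for illustration (they’re not from any particular RL library); the point is just how state, action and reward pass back and forth at each discrete timestep.

import random

# A toy, hypothetical environment: the state is a counter, the goal is to reach 3.
class ToyEnv:
    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # The next state depends only on the current state and the action taken.
        self.state = max(0, self.state + action)
        reward = 1 if self.state == 3 else 0  # feedback signal from the environment
        done = self.state == 3
        return self.state, reward, done

env = ToyEnv()
state = env.reset()
total_reward = 0

# At each timestep t: the agent picks an action a_t, the environment
# updates the state s_t and returns a reward r_t.
for t in range(10):
    action = random.choice([-1, +1])  # a (very naive) agent
    state, reward, done = env.step(action)
    total_reward += reward
    if done:
        break

print(f"Sum of rewards collected: {total_reward}")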

Review

This course uses an experimental learning system called Orbit, intended to make it much easier for you to remember and apply this material, over the long term. Throughout the tutorials, we occasionally pause to ask a few quick questions, testing your understanding of the material just explained. In the weeks ahead, you’ll be re-tested in follow-up review sessions. Orbit carefully expands the testing schedule to ensure you consolidate the answers into your long-term memory while minimizing the review time required. The review sessions take no more than a few minutes per session, and Orbit will notify you when you need to review.

The benefit is that instead of remembering how reinforcement learning works for a few hours or days, you’ll remember for years. It’ll become a much more deeply internalized part of your thinking. Please indulge us by answering these questions — it’ll take just a few seconds. For each question, think about what you believe the answer to be, click to reveal the actual answer, and then mark whether you remembered or not. If you can recall, that’s great. But if not, that’s fine, just mentally note the correct answer, and continue.

Review the Questions and Answers:

1. Give two examples of problems RL could solve.

➥ e.g. choosing moves to make in a board game; adjusting the shower knobs to maintain a comfortable temperature; a search-and-rescue robot deciding how to navigate a building.

2. What kind of problems do reinforcement learning systems solve?

➥ Sequential decision-making problems (i.e. situations where an agent must repeatedly choose an action)

3. What three key elements occur at each time step in a Markov decision process?

➥ Agent takes action a; the environment state s is updated; a reward r is given.

4. What defines the optimal solution to a Markov decision process?

➥ It maximizes the sum of future rewards.

5. How do Markov decision processes model time?

➥ As a series of discrete, fixed-size steps.

6. What’s the technical name for the problems which RL algorithms solve?

➥ Markov decision processes.

7. How does the data used to train RL systems differ from supervised and unsupervised learning?

➥ Supervised/unsupervised systems learn from a dataset; RL systems learn from feedback given by an environment.

4. State, Actions & Rewards

State sₜ

Note: the subscript t in sₜ just means it’s the state at time t. We’ll follow this convention throughout the course when talking about quantities that change over time t.

The key aspect of the state in a Markov Decision Process is that it is ‘memoryless’.

This means that the current state fully summarises the environment. Put another way, you gain no information about what will happen in the future by knowing preceding states (or actions).

Put a third way, if you knew only the current state, you could recreate the game completely from this point on.

Examples:

  1. The pieces and their locations on a chessboard
  2. The position & orientation of the mobile robot plus the layout of its environment
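To make the memoryless property concrete, here’s a small Python sketch (our own toy example, borrowing the states from the example MDP later in this tutorial). The environment’s dynamics are a lookup keyed only on the current state and action, so the history of how you got there can’t change what happens next.

# Toy dynamics table: (current_state, action) -> next_state.
# Because the key contains only the current state and the action,
# earlier states and actions cannot affect what happens next.
transitions = {
    ("Sleep", "Study"): "Learning RL",
    ("Sleep", "Go to sleep"): "Sleep",
    ("Learning RL", "Go to sleep"): "Sleep",
    ("Learning RL", "Study"): "Learning RL",
}

def next_state(state, action):
    return transitions[(state, action)]

print(next_state("Sleep", "Study"))  # -> Learning RL, regardless of any earlier states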

Action aₜ

These are the choices the agent can make. At each timestep, the agent takes one action. These affect the successor state of the environment — the next state you are in.

An action can have multiple independent dimensions. E.g. a robot can decide its acceleration/deceleration independently of deciding its change in turning angle.

Examples:

  1. Any move of your piece on the chessboard
  2. The heating/cooling and the stirring that occur in the oil refinery
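As a small illustration of independent action dimensions, here’s a sketch representing a mobile robot’s action as a single object with two independently chosen components (the field names are our own, not from any robot API).

from dataclasses import dataclass

@dataclass
class RobotAction:
    # Two independent dimensions of a single action, chosen each timestep.
    acceleration: float       # e.g. m/s^2, negative values mean braking
    turn_angle_change: float  # e.g. radians, change in steering angle

# One action per timestep, but with multiple independent components.
a_t = RobotAction(acceleration=0.5, turn_angle_change=-0.1)
print(a_t)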

Reward rₜ

A reward signal defines the goal of a reinforcement learning problem. At each timestep, the environment sends a single number called the reward to the reinforcement learning agent.

The agent’s objective is to maximize its total reward over the long run.

The reward is received as the environment transitions between states. So the sequence of events when interacting with an environment is:

  1. The environment provides the initial state.
  2. The agent takes an action.
  3. The environment transitions to a new state and gives a reward.

After step 3, steps 2 & 3 alternate. Your agent is only in control of step 2, while the environment is in control of step 3.

For many tasks you might want to train a reinforcement learning agent to solve, there isn’t a well-defined reward function.

Take robot navigation as an example. Reaching the desired destination should give a positive reward, but what about the reward from all the other timesteps? Should they be 0? Or how about higher rewards given based on how close it is to the destination? In such cases, it is often the RL designer’s job to decide what the reward should be, based on the task they want the agent to solve.

This means that by defining the reward function, you can train your agent to achieve different goals. For example, in chess, if you want to train your agent to lose as fast as possible, you could give a positive reward for losing and a negative one for winning.

Examples:

  1. +1 for winning a chess game, -1 for losing, 0 otherwise
  2. +100 for reaching the goal location, -1 otherwise*

* The -1 is to encourage the robot to make it there as fast as possible: the longer it takes, the more negative reward it gets.
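Both example reward schemes can be written as tiny functions. These are just sketches of the design choice; the function names and arguments are our own, not code from the exercises.

def chess_reward(game_result):
    # +1 for winning, -1 for losing, 0 otherwise (e.g. a draw or a non-terminal move)
    return {"win": 1, "loss": -1}.get(game_result, 0)

def navigation_reward(reached_goal):
    # +100 for reaching the goal location, -1 for every other timestep,
    # so slower routes accumulate more negative reward.
    return 100 if reached_goal else -1

print(chess_reward("win"), chess_reward("draw"))          # 1 0
print(navigation_reward(False), navigation_reward(True))  # -1 100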

5. Mini-Exercise — Identify State, Action & Rewards

The code for all our tutorials is hosted on a platform called Replit. It runs the code in the cloud and allows easy, live collaboration in an IDE with zero setup.

Replit is great because it means you don’t have to do any installs yourself — the environment is taken care of. If you have any trouble with it, please contact us on Slack!

Join Replit here

Once you’ve clicked the above link, you should be able to access the link below.

Each of the mini-exercises is an example of an MDP.

For each mini-exercise, there are 5 quantities listed. 2 of these are false flags, and the other 3 are the state, action & reward.

Enter, in the line at the bottom, which string describes the state, action & reward in this scenario.

https://replit.com/team/delta-academy-RL7/12-MDP-MCQs

6. Policy π(s)

(last bit of terminology for this tutorial, I promise!)

A policy is a function that picks which action to take in each state. It’s the core of a reinforcement learning agent: it determines the agent’s behavior.

It’s commonly represented by the Greek letter Pi (pronounced ‘pie’), π. Mathematically, a policy is a function that takes a state as input and outputs an action.

Example:

  • In the case of a game of chess, the policy tells you what action to take in each state.

In some cases, the policy may be a simple function or a lookup table (i.e. in state X, take action Y, defined for all states), whereas in others it may involve extensive computation, such as planning a sequence of actions into the future (and then performing the first action in this sequence).

In general, policies may be stochastic, specifying probabilities for each action.

Note: stochastic simply means ‘includes randomness’, as opposed to deterministic, which will give the same output every time for the same input.
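Here’s a minimal sketch of both kinds of policy in Python (our own illustration, loosely using the trash-collecting robot example from earlier): a deterministic policy maps each state to exactly one action, while a stochastic policy maps each state to a probability distribution over actions and samples from it.

import random

# Deterministic policy: a lookup table, state -> action.
deterministic_policy = {
    "low battery": "return to charger",
    "room clean": "search next room",
}

# Stochastic policy: state -> probabilities over actions.
stochastic_policy = {
    "room clean": {"search next room": 0.8, "return to charger": 0.2},
}

def act_deterministic(state):
    return deterministic_policy[state]

def act_stochastic(state):
    actions, probs = zip(*stochastic_policy[state].items())
    return random.choices(actions, weights=probs, k=1)[0]

print(act_deterministic("low battery"))
print(act_stochastic("room clean"))  # usually "search next room", sometimes "return to charger"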

Review the Questions and Answers:

1. In a Markov decision process, action aₜ produces reward r at what time step?

➥ t+1, i.e. aₜ yields rₜ₊₁

2. How might you cause a robot trained via RL to find the quickest path to its destination?

➥ e.g. Give a negative reward for every time step before it reaches its destination.

3. What does it mean for a policy function to be stochastic?

➥ It outputs a probability for each action, rather than defining a single definite action.

4. Why might it not be the best strategy for an RL agent to simply choose the next action with the highest estimated reward?

➥ Its goal is to maximize the total reward over all future steps, not just the next one.

5. If you wanted to losslessly save and restore a Markov decision process, what information would you need to persist?

➥ Only the current state.

6. What does it mean that Markov decision processes are memoryless?

➥ The current state fully defines the environment — i.e. previous states and actions provide no extra information.

7. In a chess-playing RL system, what might the policy function take as input and produce as output?

➥ The input is the state, i.e. the positions of all pieces on the board. The output is the action, i.e. a move to make.

8. In this course, what’s meant by the notation xₜ?

➥ The value of x at time t.

9. If a Markov decision process takes only one action per time step, how would you model a search-and-rescue robot which can both accelerate and turn?

➥ Model changes to acceleration and orientation as independent dimensions of a single action vector.

10. What is a “successor state” in a Markov decision process?

➥ The agent’s next state (after taking its next action).

11. What is a “policy” in an RL system?

➥ A function which chooses an action to take, given a state.

12. Practically speaking, how does an RL designer express different goals for their agents?

➥ By defining different reward functions.

13. In a Markov decision process modeling a search and rescue robot, what might the state describe?

➥ e.g. Estimates of the robot’s pose and the building’s layout.

7. Example Solved MDP

So what does it mean to solve a Markov Decision Process? It means finding the policy which maximizes the future sum of rewards.

This is typically called the optimal policy. It’s what every Reinforcement Learning algorithm is trying to find.

Here is an example of an MDP that is perhaps relevant to you!

Figure — 3: Example of MDP

There are 3 states — Learning RL, Sleep & Pub. In this MDP, at any point in time, you are in one of these 3 states.

You have 3 actions you can take: Go to sleep, Go to Pub and Study. Each of these gives a different reward based on what state you're in. These rewards are shown in red.

The Optimal Policy

The optimal policy is the mapping from state to action that gets the highest possible reward over time.

Taking a quick look at the MDP diagram, we should quickly see that Pub is a bad state to enter or be in.

Staying in the Sleep state gets a reward of 1 per timestep. Learning RL gets us a reward of 6 per timestep. And going from Sleep to Learning RL (learning RL when fresh is best) gets 10, the biggest reward!

So to get the maximum future reward, you want to alternate between Sleep and Learning RL, giving an average reward per timestep of 7.5.

This isn’t the same as just taking the action that gives the maximum reward at the next timestep. That would mean staying in Learning RL forever, since the reward of 6 for the Study action is higher than the 5 for Go to sleep. Over time, the average reward is higher if you alternate.

And what do we do if we’re in the state Pub? We should take the action that takes us to either Sleep or Learning RL which costs us the least. In this case, taking the Go to sleep action costs us the least.
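To put numbers on this, here’s a quick arithmetic check (our own, using the rewards read off Figure 3) comparing the long-run average reward per timestep of the alternating policy with always choosing Study.

# Rewards read off the example MDP:
#   Sleep --Study--> Learning RL : +10
#   Learning RL --Go to sleep--> Sleep : +5
#   Learning RL --Study--> Learning RL : +6

# Alternating between Sleep and Learning RL repeats the pair (10, 5).
alternating_avg = (10 + 5) / 2
print(alternating_avg)  # 7.5 per timestep

# Greedily staying in Learning RL earns 6 per timestep in the long run.
greedy_avg = 6
print(greedy_avg)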

We can represent this in a Python dictionary like so:

# Dictionary of {state: action} pairs
optimal_policy = {
    "Pub": "Go to sleep",
    "Sleep": "Study",
    "Learning RL": "Go to sleep",
}

for state, action in optimal_policy.items():
    print(f"If in state: {state}, then take action: {action}")

### Result --->
"""
If in state: Pub, then take action: Go to sleep
If in state: Sleep, then take action: Study
If in state: Learning RL, then take action: Go to sleep
"""

8. Exercise 2 — Barry’s Boeings

Figure — 4: Barry’s Boeings

https://replit.com/team/delta-academy-RL7/12-Barrys-Boeings

You’ve ended up in the orbit of Barry. He’s the CEO of a major airline called BarryJet (haven’t you heard of it?).

Boeing has sold him planes with faulty ailerons, but the replacement parts they sell are also faulty! After every flight, the ailerons become faulty again, and after 2 flights without repair, the plane goes out of operation.

He wants you to figure out the right policy for this MDP — what action should he take in each state?

Write the optimal policy dictionary in the Replit linked above.

Exercise 3 — Barry’s Execs

Figure — 5: Barry’s Execs

https://replit.com/team/delta-academy-RL7/13-Barrys-Execs

Barry’s senior execs are spread out across the world. But they have a very important meeting to attend in Hong Kong.

To show brand loyalty, they must always travel by BarryJet, but sadly for them, their planes are economy-only and the passengers are crammed in so tightly that it’s physically impossible to work. So Barry wants to work out the fastest way for his senior execs to travel to Hong Kong.

Barry again wants you to find the policy (in your head) which minimizes the time wasted by his execs.

Write the optimal policy dictionary in the code linked above.


Published via Towards AI
