
Transformers for Multi-Regression — [PART2]

Last Updated on December 9, 2022 by Editorial Team

Author(s): Zeineb Ghrib


🤖 Fine Tuning 🤖

In the context of the FB3 competition, we aim to model six analysis metrics using pre-scored argumentative essays written by 8th-12th grade English Language Learners. The skills we have to model are the following: cohesion, syntax, vocabulary, phraseology, grammar, and conventions. The scores range from 1.0 to 5.0 in increments of 0.5.
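To make the expected data layout concrete, here is a minimal, made-up sketch of what the training dataframe looks like (the column names are the ones used throughout this post; the essay texts and scores below are invented):

import pandas as pd

# Hypothetical illustration of the FB3 training data layout:
# one raw essay per row plus the six analytic scores (1.0 to 5.0, step 0.5).
toy_train = pd.DataFrame({
    "full_text": [
        "I think students should be allowed to use phones in class because ...",
        "In my opinion the school year should not be longer ...",
    ],
    "cohesion":    [3.5, 2.0],
    "syntax":      [3.0, 2.5],
    "vocabulary":  [3.0, 2.5],
    "phraseology": [3.5, 2.0],
    "grammar":     [3.0, 2.5],
    "conventions": [3.5, 3.0],
})
print(toy_train.head())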

Image: Tribute to Marcel Proust (photo: @Mario Breda)

In my last post, I showed you how to use a pre-trained transformer to extract context-capturing embeddings and use them for training a multi-regressor.

This time, I will show you how to train the whole transformer end-to-end, which means also updating the parameters of the pre-trained model.

I will also show you how to use the Weights & Biases platform: from logging in with the wandb API, through model tracking, to creating and using model artifacts.

All the source code can be retrieved from my Kaggle notebook.

Credit: in this part, I borrowed @debarshichanda's model architecture.

🎛Imports and Config

First of all, we will define the CONFIG dictionary and the transformer-related imports that we will be using throughout the project:

import random  # used for the CONFIG dropout value

import numpy as np  # used later in compute_metrics
import torch
import torch.nn as nn
import transformers
from transformers import (
    AutoModel, AutoConfig,
    AutoTokenizer, logging,
    AdamW, get_linear_schedule_with_warmup,
    DataCollatorWithPadding,
    Trainer, TrainingArguments
)
from transformers.modeling_outputs import SequenceClassifierOutput

logging.set_verbosity_error()
logging.set_verbosity_warning()

CONFIG = {
    "model_name": "microsoft/deberta-v3-base",  # alternative: "distilbert-base-uncased"
    "device": 'cuda' if torch.cuda.is_available() else 'cpu',
    "dropout": random.uniform(0.01, 0.60),
    "max_length": 512,
    "train_batch_size": 8,
    "valid_batch_size": 16,
    "epochs": 10,
    "folds": 3,
    "max_grad_norm": 1000,
    "weight_decay": 1e-6,
    "learning_rate": 1e-5,
    "loss_type": "rmse",
    "n_accumulate": 1,
    "label_cols": ['cohesion', 'syntax', 'vocabulary',
                   'phraseology', 'grammar', 'conventions'],
}

🧮Custom Dataset Iterator:

As explained in the previous post, we will define a subclass of torch.utils.data.Dataset and override the __init__, __len__, and __getitem__ special methods as follows:

import pandas as pd

train = pd.read_csv(PATH_TO_TRAIN)
test = pd.read_csv(PATH_TO_TEST)

# let's define the batch generator
class CustomIterator(torch.utils.data.Dataset):
    def __init__(self, df, tokenizer, labels=CONFIG['label_cols'], is_train=True):
        self.df = df
        self.tokenizer = tokenizer
        self.max_seq_length = CONFIG["max_length"]  # or tokenizer.model_max_length
        self.labels = labels
        self.is_train = is_train

    def __getitem__(self, idx):
        tokens = self.tokenizer(
            self.df.loc[idx, 'full_text'],
            add_special_tokens=True,
            padding='max_length',
            max_length=self.max_seq_length,
            truncation=True,
            return_tensors='pt',
            return_attention_mask=True
        )
        res = {
            'input_ids': tokens['input_ids'].to(CONFIG.get('device')).squeeze(),
            'attention_mask': tokens['attention_mask'].to(CONFIG.get('device')).squeeze()
        }

        if self.is_train:
            res["labels"] = torch.tensor(
                self.df.loc[idx, self.labels].to_list(),
            ).to(CONFIG.get('device'))

        return res

    def __len__(self):
        return len(self.df)

🤖 Fine Tune Transformer

With this approach, the hidden states are not fixed but trainable; for this reason, the classification head must be differentiable. Usually, we use a neural network as the classifier.

Zeineb Ghrib

In this section, we will see how to fine-tune an encoder transformer based on the microsoft/deberta-v3-base pre-trained model, using a simple and feature-complete training and evaluation API provided by Hugging Face: the Trainer.

We will define a custom model that extends microsoft/deberta-v3-base with a trainable neural network head.

The custom model will be composed of:

  • Pre-trained baseline model: load the pre-trained microsoft/deberta-v3-base with the AutoModel.from_pretrained function
  • Mean pooling layer: we need to make some changes to the Mean Pooling function from the Part 1 post: make the pooling class inherit from torch.nn.Module and define the mean pooling logic within a forward method (see the code below)
  • Dropout layer: add a dropout layer for regularization
  • Linear layer: input size = hidden_state_dim, output size = number of target features (6)

The logits output of the linear layer is returned through a SequenceClassifierOutput (a subclass of ModelOutput) in the forward method (all models must have outputs that are instances of subclasses of ModelOutput; see the reference here).

class MeanPooling(nn.Module):
    def __init__(self):
        super(MeanPooling, self).__init__()

    def forward(self, last_hidden_state, attention_mask):
        input_mask_expanded = attention_mask.unsqueeze(-1).expand(last_hidden_state.size()).float()
        sum_embeddings = torch.sum(last_hidden_state * input_mask_expanded, 1)
        sum_mask = input_mask_expanded.sum(1)
        sum_mask = torch.clamp(sum_mask, min=1e-9)
        mean_embeddings = sum_embeddings / sum_mask
        return mean_embeddings


class FeedBackModel(nn.Module):
    def __init__(self, model_name):
        super(FeedBackModel, self).__init__()
        self.config = AutoConfig.from_pretrained(model_name)
        self.config.hidden_dropout_prob = 0
        self.config.attention_probs_dropout_prob = 0
        self.model = AutoModel.from_pretrained(model_name, config=self.config)
        self.drop = nn.Dropout(p=0.2)
        self.pooler = MeanPooling()
        self.fc = nn.Linear(self.config.hidden_size, len(CONFIG['label_cols']))

    def forward(self, input_ids, attention_mask):
        out = self.model(input_ids=input_ids,
                         attention_mask=attention_mask,
                         output_hidden_states=False)
        out = self.pooler(out.last_hidden_state, attention_mask)
        out = self.drop(out)
        outputs = self.fc(out)
        return SequenceClassifierOutput(logits=outputs)

Loss & Metric

As we will be using a Trainer, we need to define a new loss function corresponding to the target evaluation metric (in our case, MCRMSE); this loss function will be used to train the transformer. The way to implement this is to define a subclass of Trainer and override its compute_loss() method.

In the same way, we want a per-target evaluation during the evaluation step, so we will provide the Trainer with a custom compute_metrics() function that computes the RMSE of each of the six targets (otherwise, the evaluation would have returned just the loss, i.e., the MCRMSE).
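For reference, here is the competition metric written out (my own transcription, consistent with the compute_metrics() implementation below), where N_t = 6 is the number of analysis measures and n is the number of essays:

MCRMSE = \frac{1}{N_t} \sum_{j=1}^{N_t} \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( y_{ij} - \hat{y}_{ij} \right)^2 }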

class RMSELoss(nn.Module):
    """
    Code taken from Y Nakama's notebook
    (https://www.kaggle.com/code/yasufuminakama/fb3-deberta-v3-base-baseline-train)
    """
    def __init__(self, reduction='mean', eps=1e-9):
        super().__init__()
        self.mse = nn.MSELoss(reduction='none')
        self.reduction = reduction
        self.eps = eps

    def forward(self, predictions, targets):
        loss = torch.sqrt(self.mse(predictions, targets) + self.eps)
        if self.reduction == 'none':
            loss = loss
        elif self.reduction == 'sum':
            loss = loss.sum()
        elif self.reduction == 'mean':
            loss = loss.mean()
        return loss


class CustomTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        outputs = model(inputs['input_ids'], inputs['attention_mask'])
        loss_func = RMSELoss(reduction='mean')
        loss = loss_func(outputs.logits.float(), inputs['labels'].float())
        return (loss, outputs) if return_outputs else loss


def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    colwise_rmse = np.sqrt(np.mean((labels - predictions) ** 2, axis=0))
    res = {
        f"{analytic.upper()}_RMSE": colwise_rmse[i]
        for i, analytic in enumerate(CONFIG["label_cols"])
    }
    res["MCRMSE"] = np.mean(colwise_rmse)
    return res

🧚Weights & Biases🧚

Even though Hugging Face Transformers provides a wide range of training and checkpointing facilities, W&B offers powerful experiment tracking and model versioning tools with friendly, interactive dashboards, and each project's experiments are kept in their own space.

Check out this excellent notebook, which describes in detail how to use W&B in Kaggle.

W&B provides two main utilities:

🤙 Dashboard (experiment tracking): log and visualize experiments in real time and keep data and results in one convenient place. Think of it as a repository of experiments.

🤙 Artifacts (dataset + model versioning): store and version datasets, models, and results, so you know exactly what data a model was trained on.
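As a tiny illustration of the dashboard side (this is my own minimal sketch; later on, the Trainer will do the logging for us via report_to="wandb"), logging custom metrics to a run looks like this:

import wandb

# Minimal sketch: manually logging metrics to a W&B run (anonymous mode).
run = wandb.init(project="FB3-deberta-v3", anonymous="must")
for epoch in range(3):
    # placeholder values, just to show the wandb.log API
    wandb.log({"epoch": epoch, "train/loss": 1.0 / (epoch + 1)})
run.finish()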

To connect to Weights & Biases, we need an API key, which you can get from https://wandb.ai/authorize.
There are two ways you can log in from a Kaggle kernel:

Feel free to skip this part if you are not using Kaggle

  • Use Kaggle secrets to store your API key, and use the code snippet below to log in:
  1. Click on the Add-ons menu in the Notebook Editor, then Secrets.

2. Store the API key as a key-value pair attached to the current notebook.

3. Copy and paste the code snippet to access the API key, then use wandb.login() to connect to W&B:

from kaggle_secrets import UserSecretsClient
import wandb

user_secrets = UserSecretsClient()
api_key = user_secrets.get_secret("wandb_api")
wandb.login(key=api_key)
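The second way is not detailed in the original post, so here is my assumption of the usual alternative: export the key as an environment variable (or simply call wandb.login() and paste the key when prompted):

import os
import wandb

# Assumption: the usual alternative to Kaggle secrets is the WANDB_API_KEY
# environment variable; wandb.login() will pick it up (or prompt interactively).
os.environ["WANDB_API_KEY"] = "<your-api-key>"  # hypothetical placeholder
wandb.login()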

🛠Wandb Arguments

For each CV iteration i, we will create a new run called FB3-fold-i (where i is the iteration number), within a single project that we'll call Feedback3-deberta.

Some other parameters:

  • group: the group parameter is used to organize individual runs into a larger experiment; here are some example use cases
  • tags: we will add the model name and the metric as tags. As explained in the W&B docs, tags are useful for organizing runs together or applying temporary labels like "baseline" or "production". It's easy to add and remove tags in the UI or to filter down to just runs with a specific tag.
  • job_type: usually it's either "train" or "eval". Later, it allows filtering and grouping of similar runs. We will set the job_type to "train".
  • anonymous: this parameter controls anonymous logging. We will set it to "must", which sends the run to an anonymous account instead of a signed-up user account. For the other options, you can check the documentation.

For each CV iteration we can instantiate a wandb run as follows:

run = wandb.init(project="FB3-deberta-v3",
                 config=CONFIG,
                 job_type='train',
                 group="FB3-BASELINE-MODEL",
                 tags=[CONFIG['model_name'], CONFIG['loss_type'], "10-epochs"],
                 name=f'FB3-fold-{fold}',
                 anonymous='must')

Now let's define the training arguments that will be used by the Hugging Face Trainer.

🛠Training Arguments

Before instantiating our custom Trainer, we will create a TrainingArguments to define the training config.
We will set the following parameters:

  • output_dir: the output directory where the model predictions and checkpoints will be written: each CV iteration gets its own directory, named with the fold number prefixed with "outputs-"
  • evaluation_strategy: set to "epoch", which means evaluation is done at the end of each epoch.
  • per_device_train_batch_size: the batch size for training. We will set it to 8
  • per_device_eval_batch_size: the batch size for evaluation. We will set it to 16 (to speed up execution)
  • num_train_epochs: the number of training epochs. As a reminder, during one epoch every sample of the training dataset is seen by the model once
  • group_by_length: since we will be using dynamic padding, we will set this parameter to True to group together samples of roughly the same length in the training dataset (to minimize the padding applied and be more efficient)
  • max_grad_norm: maximum gradient norm (for gradient clipping).
  • learning_rate: the initial learning rate for the AdamW optimizer. As a reminder, AdamW is Adam with decoupled weight decay
  • weight_decay: the weight decay to apply to the AdamW optimizer: in our case, we will apply the weight decay to all layers except biases and normalization layers

Note:
Weight decay is a regularization technique that adds a small penalty to the loss function (usually the L2 norm of the weights).
loss = loss + weight_decay_parameter * L2_norm_of_the_weights

Some implementations apply weight decay only to the weights and not the bias. On the other hand, PyTorch applies weight decay to both weights and bias.

Why weight decay?

1. To prevent overfitting.
2. To avoid exploding gradients: because of the additional L2 norm, each iteration of the network optimizes the size of the weights in addition to the loss. This helps keep the weights as small as possible, preventing them from growing out of control and thus avoiding exploding gradients

  • gradient_accumulation_steps: the number of steps gradients are accumulated across before performing an optimizer update. When using gradient accumulation, the gradient is computed on smaller micro-batches rather than all at once on a big batch; 1 means no gradient accumulation (see the short sketch just after the TrainingArguments code below)

Note:
In this Stack Overflow discussion, it is explained how to use the gradient_accumulation_steps parameter to avoid OOM errors: set gradient_accumulation_steps to a value that fits into memory and reduce per_device_train_batch_size to original_batch_size / gradient_accumulation_steps, so that gradients are accumulated across gradient_accumulation_steps smaller batches and each optimizer update still effectively covers gradient_accumulation_steps * (original_batch_size / gradient_accumulation_steps) = original_batch_size samples. The total number of optimizer steps is then (number_of_training_samples * epochs) / (per_device_train_batch_size * gradient_accumulation_steps), which is exactly how num_training_steps is computed for the scheduler further below.

  • load_best_model_at_end: we will set it to True to load the best model found during training at the end of training; in this case, save_strategy must be the same as evaluation_strategy ("epoch" in our case)
  • metric_for_best_model: we will set it to the competition metric, eval_MCRMSE (the MCRMSE key with the eval_ prefix)
  • greater_is_better: set to False, because we want to keep the model with the lowest MCRMSE
  • save_total_limit: we will set it to 1 to always keep only one checkpoint at a time (older checkpoints in output_dir will be deleted).
  • report_to: as we are connected to W&B, we will set it to "wandb" so that logs are reported there
  • label_names: set the list of label names to ["labels"], which corresponds to the field yielded by our custom dataset iterator that holds the target scores
training_args = TrainingArguments(
    output_dir=f"outputs-{fold}/",
    evaluation_strategy="epoch",
    per_device_train_batch_size=CONFIG['train_batch_size'],
    per_device_eval_batch_size=CONFIG['valid_batch_size'],
    num_train_epochs=CONFIG['epochs'],
    learning_rate=CONFIG['learning_rate'],
    weight_decay=CONFIG['weight_decay'],
    gradient_accumulation_steps=CONFIG['n_accumulate'],
    seed=SEED,
    group_by_length=True,
    max_grad_norm=CONFIG['max_grad_norm'],
    metric_for_best_model='eval_MCRMSE',
    load_best_model_at_end=True,
    greater_is_better=False,
    save_strategy="epoch",
    save_total_limit=1,
    report_to="wandb",
    label_names=["labels"]
)
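As mentioned in the gradient_accumulation_steps bullet above, here is a minimal, self-contained sketch of what gradient accumulation does under the hood. It is purely illustrative (the Trainer handles this internally) and uses a toy linear model rather than the transformer:

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 6)                   # toy stand-in for the transformer
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
loss_func = nn.MSELoss()
accumulation_steps = 4                     # effective batch = 4 x micro-batch size

optimizer.zero_grad()
for step in range(16):                     # 16 micro-batches -> 4 optimizer updates
    x, y = torch.randn(8, 10), torch.randn(8, 6)    # fake micro-batch
    loss = loss_func(model(x), y)
    (loss / accumulation_steps).backward() # scale so the accumulated gradient matches one big batch
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()                   # one update per accumulated group
        optimizer.zero_grad()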

Furthermore, we will define some other parameters for the Trainer:

  • data collator: we need to define how to create a single batch from the list of data inputs returned by the Dataloader
    We will use DataCollatorWithPadding which will dynamically pad the received inputs.
  • optimizer: we will use AdamW with decay in all layers except bias and normalization layers
  • scheduler: we will use get_linear_schedule_with_warmup to create a schedule with a warmup period during which the learning rate increases linearly from 0 to the initial lr set in the optimizer, and then decreases linearly from that initial lr to 0.
    The scheduler lets us keep control of the learning rate if, for example, we want to make sure that every update of the learning rate doesn't exceed a given value (check this Stackoverflow discussion about the utility of an optimizer scheduler). A short sketch of the resulting schedule follows this list.
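Here is the short sketch announced above: a self-contained illustration (with a toy parameter, not the actual model) of the learning-rate shape produced by get_linear_schedule_with_warmup:

import torch
from transformers import get_linear_schedule_with_warmup

# Toy illustration of the warmup-then-linear-decay schedule.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.AdamW([param], lr=1e-5)

num_training_steps = 100
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * num_training_steps),  # 10% warmup, as in the training loop below
    num_training_steps=num_training_steps,
)

lrs = []
for _ in range(num_training_steps):
    optimizer.step()       # dummy optimizer step (PyTorch expects it before scheduler.step())
    scheduler.step()
    lrs.append(scheduler.get_last_lr()[0])

print(max(lrs), lrs[-1])   # peaks around 1e-5 after warmup, then decays towards 0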

To launch the cross-validation training, we first have to create the CV fold column, as explained in the Part 1 post:

from iterstrat.ml_stratifiers import MultilabelStratifiedKFold

# set seed to produce similar folds
cv = MultilabelStratifiedKFold(n_splits=CONFIG.get("folds", 3), shuffle=True, random_state=SEED)

train = train.reset_index(drop=True)
for fold, (_, val_idx) in enumerate(cv.split(X=train, y=train[CONFIG['label_cols']])):
    train.loc[val_idx, "fold"] = int(fold)

train["fold"] = train["fold"].astype(int)

The CV training workflow can be implemented as follows:

# load the tokenizer matching the pre-trained model
tokenizer = AutoTokenizer.from_pretrained(CONFIG['model_name'])
# Data Collator for Dynamic Padding
collate_fn = DataCollatorWithPadding(tokenizer=tokenizer)
# init predictions by fold
predictions = {}
for fold in range(0, CONFIG['folds']):
    print(f" ---- Fold: {fold} ----")
    run = wandb.init(project="FB3-deberta-v3",
                     config=CONFIG,
                     job_type='train',
                     group="FB3-BASELINE-MODEL",
                     tags=[CONFIG['model_name'], CONFIG['loss_type'], "10-epochs"],
                     name=f'FB3-fold-{fold}',
                     anonymous='must')
    # the reset index is VERY IMPORTANT for the Dataset iterator
    df_train = train[train.fold != fold].reset_index(drop=True)
    df_valid = train[train.fold == fold].reset_index(drop=True)
    # create iterators
    train_dataset = CustomIterator(df_train, tokenizer)
    valid_dataset = CustomIterator(df_valid, tokenizer)
    # init model
    model = FeedBackModel(CONFIG['model_name'])
    model.to(CONFIG['device'])

    # SET THE OPTIMIZER AND THE SCHEDULER
    # no decay for bias and normalization layers
    param_optimizer = list(model.named_parameters())
    no_decay = ["bias", "LayerNorm.weight"]
    optimizer_parameters = [
        {
            "params": [p for n, p in param_optimizer if not any(nd in n for nd in no_decay)],
            "weight_decay": CONFIG['weight_decay'],
        },
        {
            "params": [p for n, p in param_optimizer if any(nd in n for nd in no_decay)],
            "weight_decay": 0.0,
        },
    ]
    optimizer = AdamW(optimizer_parameters, lr=CONFIG['learning_rate'])
    num_training_steps = (len(train_dataset) * CONFIG['epochs']) // (CONFIG['train_batch_size'] * CONFIG['n_accumulate'])
    scheduler = get_linear_schedule_with_warmup(
        optimizer,
        num_warmup_steps=0.1 * num_training_steps,
        num_training_steps=num_training_steps
    )
    # CREATE THE TRAINER
    trainer = CustomTrainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=valid_dataset,
        data_collator=collate_fn,
        optimizers=(optimizer, scheduler),
        compute_metrics=compute_metrics
    )
    # LAUNCH THE TRAINER
    trainer.train()

You can access my public W&B dashboard that I created for this project: https://wandb.ai/athena75/FB3-deberta-v3?workspace=user-athena75

🗿Create W&B artifacts

W&B makes it very convenient to create model artifacts once the model is fine-tuned. We can reuse them later and create new versions of our models.

To create a model artifact, all you have to do is :

  1. Create a wandb.Artifact object with a clear and consistent name; you also have to specify the type parameter, which can be either dataset or model (model in our case).
  2. Add the local directory to the artifact: once you instantiate the model and start fine-tuning it, a local checkpoint is created containing the model binary as well as the model state and configuration. You have to add that directory to the artifact.
  3. Once the artifact has all the desired files, you can call wandb.log_artifact() to log it.

Here is a code snippet example of creating an artifact for each CV model:

for fold in range(0, CONFIG['folds']):
    run = wandb.init(project="FB3-deberta-v3",
                     config=CONFIG,
                     job_type='train',
                     group="FB3-BASELINE-MODEL",
                     tags=[CONFIG['model_name'], CONFIG['loss_type'], "10-epochs"],
                     name=f'FB3-fold-{fold}',
                     anonymous='must')

    trainer = CustomTrainer(
        .....
    )
    ##### TRAIN / FINE-TUNE ####
    # create the model artifact
    model_artifact = wandb.Artifact(f'FB3-fold-{fold}', type="model",
                                    description=f"MultilabelStratified - fold--{fold}")
    # save the model locally - this creates a local dir
    trainer.save_model(f'fold-{fold}')
    # add the local dir to the artifact
    model_artifact.add_dir(f'fold-{fold}')
    # log the artifact: this saves the artifact version
    # and declares the artifact as an output of the run
    run.log_artifact(model_artifact)

    run.finish()

✨Use W&B artifacts for inference✨:

Once the training is complete, we can use the artifacts stored on the Weights & Biases server, in our case, to generate model predictions for each fold and aggregate them into a final prediction.

PS: you can extract the usage code directly from the W&B interface https://wandb.ai/athena75/Feedback3-deberta/artifacts/model/FB3-fold-0/93c08783e5b7c696451a/usage

  1. Log in to your wandb account and instantiate a default run with wandb.init().
  2. Pass the path to your artifact, as well as its type (model in our case), to the use_artifact() method to retrieve the artifact.
  3. Download the artifact directory locally using the download() method.
  4. Load the local model and use it to make predictions.

Example of implementation:

import shutil

predictions = torch.zeros((len(test), len(CONFIG['label_cols'])))

for fold in range(CONFIG["folds"]):
    print(f"---- FOLD {fold} -------")
    # instantiate a default run
    run = wandb.init()
    # indicate the artifact we want to use with the use_artifact method
    artifact = run.use_artifact(f'athena75/FB3-deberta-10/FB3-fold-{fold}:v0', type='model')
    # download the model locally
    artifact_dir = artifact.download()
    # load the local model
    # it is a PyTorch model, loaded as described here:
    # https://pytorch.org/tutorials/beginner/saving_loading_models.html
    model = FeedBackModel(CONFIG['model_name'])
    model.load_state_dict(torch.load(f'artifacts/FB3-fold-{fold}:v0/pytorch_model.bin'))
    # generate test embeddings
    test_dataset = CustomIterator(test, tokenizer, is_train=False)
    test_dataloader = torch.utils.data.DataLoader(
        test_dataset,
        batch_size=CONFIG["train_batch_size"],
        shuffle=False
    )
    input_ids, attention_mask = tuple(next(iter(test_dataloader)).values())
    input_ids = input_ids.to('cpu')
    attention_mask = attention_mask.to('cpu')
    # generate predictions
    fold_preds = model(input_ids, attention_mask)
    predictions = fold_preds.logits.add(predictions)
    # remove the local dir to save space
    shutil.rmtree('artifacts')
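The loop above only accumulates the fold logits; as a final step (my own addition, in line with the aggregated prediction mentioned earlier), you would average them over the number of folds before building a submission file. Note that the text_id identifier column is an assumption about the test file layout:

# Average the accumulated logits over the folds to get the ensemble prediction
# (assumes `predictions`, `test`, CONFIG and the label columns from the snippets above).
predictions = (predictions / CONFIG["folds"]).detach().numpy()

submission = test[["text_id"]].copy()           # assumed identifier column
submission[CONFIG["label_cols"]] = predictions
submission.to_csv("submission.csv", index=False)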

🙏Credits:

This work is a synthesis of these excellent resources; do not hesitate to upvote the Kaggle ones.

Conclusion:

Thanks a lot for reading my posts! 🥰 I am sharing all my work with you: this Kaggle notebook, as well as my public W&B dashboard.

I hope it was clear, and feel free to ask me questions.

📬My email address is: schopenhacker75@gmail.com📬

In a later post, I intend to address how to deploy a Transformer in production, or how to build an MLOps pipeline for NLP Transformers; I haven't decided yet…

