Name: Towards AI Legal Name: Towards AI, Inc. Description: Towards AI is the world's leading artificial intelligence (AI) and technology publication. Read by thought-leaders and decision-makers around the world. Phone Number: +1-650-246-9381 Email: [email protected]
228 Park Avenue South New York, NY 10003 United States
Website: Publisher: https://towardsai.net/#publisher Diversity Policy: https://towardsai.net/about Ethics Policy: https://towardsai.net/about Masthead: https://towardsai.net/about
Name: Towards AI Legal Name: Towards AI, Inc. Description: Towards AI is the world's leading artificial intelligence (AI) and technology publication. Founders: Roberto Iriondo, , Job Title: Co-founder and Advisor Works for: Towards AI, Inc. Follow Roberto: X, LinkedIn, GitHub, Google Scholar, Towards AI Profile, Medium, ML@CMU, FreeCodeCamp, Crunchbase, Bloomberg, Roberto Iriondo, Generative AI Lab, Generative AI Lab Denis Piffaretti, Job Title: Co-founder Works for: Towards AI, Inc. Louie Peters, Job Title: Co-founder Works for: Towards AI, Inc. Louis-François Bouchard, Job Title: Co-founder Works for: Towards AI, Inc. Cover:
Towards AI Cover
Logo:
Towards AI Logo
Areas Served: Worldwide Alternate Name: Towards AI, Inc. Alternate Name: Towards AI Co. Alternate Name: towards ai Alternate Name: towardsai Alternate Name: towards.ai Alternate Name: tai Alternate Name: toward ai Alternate Name: toward.ai Alternate Name: Towards AI, Inc. Alternate Name: towardsai.net Alternate Name: pub.towardsai.net
5 stars – based on 497 reviews

Frequently Used, Contextual References

TODO: Remember to copy unique IDs whenever it needs used. i.e., URL: 304b2e42315e

Resources

Take our 85+ lesson From Beginner to Advanced LLM Developer Certification: From choosing a project to deploying a working product this is the most comprehensive and practical LLM course out there!

Publication

Building Reflection-Gemma-2–2b
Artificial Intelligence   Latest   Machine Learning

Building Reflection-Gemma-2–2b

Last Updated on September 17, 2024 by Editorial Team

Author(s): Arthur Lagacherie

Originally published on Towards AI.

image original from google blog mofied with Adobe Firefly

In the world of artificial intelligence, the ability of models to think and correct their mistakes is crucial. This article explores how Gemma 2–2b, can learn to reflect and improve its problem-solving skills.

How does reflection work?

During the inference:

  • First, the model generates a reasoning inside the <thinking> tags.
  • Then if the model finds an error, it corrects the mistake in the <reflection> tags as many times as necessary.
  • To finish the model generate the output in the <output> tags.

Example:

### Question
Jungkook is the 5th place. Find the number of people who crossed the finish line faster than Jungkook.
### Answer
<thinking>
The question asks how many people finished before Jungkook. If he is in 5th place, it means 4 people finished ahead of him.
</thinking>
<reflection>
It's important to understand that place numbers start from the first person to finish. So, 5th place means 4 people finished before him.
</reflection>
<output>4 people crossed the finish line faster than Jungkook.</output>

Reflection-tuning

We first need some datasets to train Gemma. I choose four datasets:

  • mahiatlinux/Reflection-Dataset-v2
  • Harshkmr/orca-math-word-reflection
  • gabrielmbmb/distilabel-reflection-tuning
  • Replete-AI/Replete-Reflection

The first one is composed of 9.17k rows organized into two columns, the prompt and the responses. The question is like “Explain the concept of…”.

image of the dataset

The second is composed of 2.44k rows of mathematical problems.

image of the dataset

Etc…

First, I began by downloading them.

from datasets import load_dataset
dataset1 = load_dataset("mahiatlinux/Reflection-Dataset-ShareGPT-v2")["train"]
dataset2 = load_dataset("gabrielmbmb/distilabel-reflection-tuning")["train"]
dataset3 = load_dataset("Replete-AI/Replete-Reflection")["train"]
dataset4 = load_dataset("mahiatlinux/Reflection-Dataset-v2")["train"]
dataset5 = load_dataset("Harshkmr/orca-math-word-reflection")["train"]
image from author

And now to be able to train our model we need to put the prompt and the answer in the same row, following the prompt template of Gemma2.

<bos><start_of_turn>user
Write a hello world program<end_of_turn>
<start_of_turn>model
```python
print("Hello World")
```<end_of_turn><eos>
import os
import pandas as pd
data = {"text":[]}

### mahiatlinux/Reflection-Dataset-ShareGPT-v2
for i in range(len(dataset1)):
user = f"""<bos><start_of_turn>user\n{dataset1[1]["conversations"][0]["value"]}<end_of_turn>"""
llm = f"""<start_of_turn>model\n{dataset1[1]["conversations"][1]["value"]}<end_of_turn><eos>"""
data["text"].append(user+llm)


### gabrielmbmb/distilabel-reflection-tuning
prompts = dataset2["instruction"]
responses = dataset2["generation"]
for i in range(len(prompts)):
user = f"""<bos><start_of_turn>user\n{prompts[i]}<end_of_turn>"""
llm = f"""<start_of_turn>model\n{responses[i]}<end_of_turn><eos>"""
data["text"].append(user+llm)


### Replete-AI/Replete-Reflection
prompts = dataset3["input"]
responses = dataset3["output"]
for i in range(int(len(prompts)/3)):
user = f"""<bos><start_of_turn>user\n{prompts[i]}<end_of_turn>"""
llm = f"""<start_of_turn>model\n{responses[i]}<end_of_turn><eos>"""
data["text"].append(user+llm)

### mahiatlinux/Reflection-Dataset-v2
prompts = dataset4["prompt"]
responses = dataset4["response"]
for i in range(int(len(prompts)/3)):
user = f"""<bos><start_of_turn>user\n{prompts[i]}<end_of_turn>"""
llm = f"""<start_of_turn>model\n{responses[i]}<end_of_turn><eos>"""
data["text"].append(user+llm)

### Harshkmr/orca-math-word-reflection
prompts = dataset5["question"]
responses = dataset5["Reflection"]
for i in range(len(prompts)):
user = f"""<bos><start_of_turn>user\n{prompts[i]}<end_of_turn>"""
llm = f"""<start_of_turn>model\n{responses[i]}<end_of_turn><eos>"""
data["text"].append(user+llm)

### save as csv
data = pd.DataFrame.from_dict(data)
print(data)
try:
os.mkdir("/kaggle/working/data")
except:
pass
data.to_csv('data/dataset.csv', index=False, escapechar='\\')
image from author

Now we have the dataset we can install AutoTrain which is the library I used to train the model. It manages the training of any Huggingface model with just one command.

!pip install -U autotrain-advanced
!autotrain setup

After the installation, we can set up the settings.

import os
project_name = 'Reflection-Gemma-2-2b' # @param {type:"string"}
model_name = 'Arthur-LAGACHERIE/Gemma-2-2b-4bit' # @param {type:"string"}

push_to_hub = True # @param ["False", "True"] {type:"raw"}
hf_token = "hf_token" #@param {type:"string"}
hf_username = "Arthur-LAGACHERIE" #@param {type:"string"}


unsloth = False # @param ["False", "True"] {type:"raw"}
learning_rate = 1e-4 # @param {type:"number"}
num_epochs = 1 #@param {type:"number"}
batch_size = 2 # @param {type:"slider", min:1, max:32, step:1}
block_size = 1124 # @param {type:"number"}
trainer = "sft" # @param ["generic", "sft"] {type:"string"}
warmup_ratio = 0.1 # @param {type:"number"}
weight_decay = 0.01 # @param {type:"number"}
gradient_accumulation = 4 # @param {type:"number"}
mixed_precision = "fp16" # @param ["fp16", "bf16", "none"] {type:"string"}
peft = True # @param ["False", "True"] {type:"raw"}
quantization = "int4" # @param ["int4", "int8", "none"] {type:"string"}
lora_r = 16 #@param {type:"number"}
lora_alpha = 32 #@param {type:"number"}
lora_dropout = 0.05 #@param {type:"number"}
merge_adapter = "true"

os.environ["HF_TOKEN"] = hf_token
os.environ["HF_USERNAME"] = hf_username

conf = f"""
task: llm-{trainer}
base_model: {model_name}
project_name: {project_name}
log: tensorboard
backend: local

data:
path: data/
train_split: train
valid_split: null
chat_template: null
column_mapping:
text_column: text

params:
block_size: {block_size}
lr: {learning_rate}
warmup_ratio: {warmup_ratio}
weight_decay: {weight_decay}
epochs: {num_epochs}
batch_size: {batch_size}
gradient_accumulation: {gradient_accumulation}
mixed_precision: {mixed_precision}
peft: {peft}
quantization: {quantization}
lora_r: {lora_r}
lora_alpha: {lora_alpha}
lora_dropout: {lora_dropout}
unsloth: {unsloth}
merge_adapter: {merge_adapter}

hub:
username: ${{HF_USERNAME}}
token: ${{HF_TOKEN}}
push_to_hub: {push_to_hub}
"""


with open("conf.yaml", "w") as f:
f.write(conf)

And now we just have to run this simple command to train our model.

!autotrain --config conf.yaml

And 4 hours later, it’s ready our model is fine-tuned. 👍

image from author

Inference

Now we have our fine-tuned model we can test it to solve problems.

!pip install -q bitsandbytes

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch
from transformers import TextIteratorStreamer
from threading import Thread
device = 'cuda' if torch.cuda.is_available() else 'cpu'

model_id = "Arthur-LAGACHERIE/Reflection-Gemma-2-2b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = """
### System
You are a world-class AI system, capable of complex reasoning and reflection.
Reason through the query inside <thinking> tags, and then provide your final response inside <output> tags.
If you detect that you made a mistake in your reasoning at any point, correct yourself inside <reflection> tags.
Try an answer and see if it's correct before generate the ouput.
But don't forget to think very carefully.

### Question
The question here.
"""



chat = [
{ "role": "user", "content": prompt},
]
question = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
question = tokenizer(question, return_tensors="pt").to(device)
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True)
generation_kwargs = dict(question, streamer=streamer, max_new_tokens=4000)
thread = Thread(target=model.generate, kwargs=generation_kwargs)

# generate
thread.start()
for new_text in streamer:
print(new_text, end="")

Problems

Let’s see if it can solve some problems.

1.

The length of one span of Jinseo is about 12 centimeters (cm). 
When Jinseo measured the length of the shorter side of the bookshelf,
it was about two spans. How many centimeters (cm) is the short side of the
bookshelf?
<thinking> 
To find the length of the shorter side of the bookshelf, we need to figure out how many spans are in the shorter side. We know that one span is 12 cm long. The shorter side is two spans, so we need to multiply the length of one span by 2.
</thinking>

<output>
The shorter side of the bookshelf is 2 * 12 cm = 24 cm.
</output>

👌 Ok first passed with success. (but it was very simple)

2.

How many diagonals can you draw in a decagon?
<thinking> 
A decagon has 10 sides. A diagonal is a line segment that connects two non-adjacent vertices.
To find the number of diagonals, we can use the formula for the number of diagonals in a polygon.
The formula is n(n-3)/2, where n is the number of sides.
So, for a decagon with 10 sides, the number of diagonals would be 10(10-3)/2. Let's calculate that. </thinking>

<output>
The number of diagonals in a decagon is 10(10-3)/2 = 10(7)/2 = 35 diagonals.
</output>

✅ Also right.

3.

What is the difference between the largest number and the smallest number 
that can be made by selecting four from the number magnets
2, 0, 3, 5, and 8 and arranging them in a row to form a four-digit number?
<thinking>
To find the largest and smallest numbers, I need to consider the possible arrangements of the numbers 2, 0, 3, 5, and 8. I should also consider the range of values these numbers can take. I need to figure out the largest and smallest possible four-digit numbers that can be formed.
</thinking>

<output>
The largest number that can be made is 8350, and the smallest number is 0235.
</output>

❌ It’s false.

One last:

4.

This year, the age difference between Minsu and his mother is 28 years old, 
and after 13 years, the age of his mother will be twice that of Minsu.
Find the age of Minsu this year.
<thinking>
Let's break down the problem step-by-step:

* **Let Minsu's current age be 'm'**
* **Let his mother's current age be 'M'**

We know:
* M - m = 28 (age difference is 28 years)
* M + 13 = 2(m + 13) (mother's age in 13 years will be twice Minsu's age in 13 years)

Now we have two equations, and we can solve for our two unknowns.
</thinking>

<output>
Let's solve for 'm' (Minsu's current age) and 'M' (his mother's current age).

From the first equation, we can express 'M' in terms of 'm':
M = m + 28

Substitute this value of 'M' into the second equation:
(m + 28) + 13 = 2(m + 13)

Simplify and solve for 'm':
m + 41 = 2m + 26
15 = m

Therefore, Minsu is currently 15 years old.
</output>

End

I hope you enjoyed this article and if this is the case you can clap it. (you can also follow me) = ).

Arthur-LAGACHERIE/Reflection-Gemma-2-2b · Hugging Face

We're on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.

Published via Towards AI

Feedback ↓