Building Reflection-Gemma-2–2b

Last Updated on September 17, 2024 by Editorial Team

Author(s): Arthur Lagacherie

Originally published on Towards AI.

image original from google blog mofied with Adobe Firefly

In the world of artificial intelligence, the ability of models to think and correct their mistakes is crucial. This article explores how Gemma 2–2b, can learn to reflect and improve its problem-solving skills.

How does reflection work?

During the inference:

First, the model generates a reasoning inside the <thinking> tags.
Then if the model finds an error, it corrects the mistake in the <reflection> tags as many times as necessary.
To finish the model generate the output in the <output> tags.

Example:

### Question
Jungkook is the 5th place. Find the number of people who crossed the finish line faster than Jungkook.
### Answer
<thinking>
The question asks how many people finished before Jungkook. If he is in 5th place, it means 4 people finished ahead of him.
</thinking>
<reflection>
It's important to understand that place numbers start from the first person to finish. So, 5th place means 4 people finished before him.
</reflection>
<output>4 people crossed the finish line faster than Jungkook.</output>

Reflection-tuning

We first need some datasets to train Gemma. I choose four datasets:

mahiatlinux/Reflection-Dataset-v2
Harshkmr/orca-math-word-reflection
gabrielmbmb/distilabel-reflection-tuning
Replete-AI/Replete-Reflection

The first one is composed of 9.17k rows organized into two columns, the prompt and the responses. The question is like “Explain the concept of…”.

The second is composed of 2.44k rows of mathematical problems.

Etc…

First, I began by downloading them.

from datasets import load_dataset
dataset1 = load_dataset("mahiatlinux/Reflection-Dataset-ShareGPT-v2")["train"]
dataset2 = load_dataset("gabrielmbmb/distilabel-reflection-tuning")["train"]
dataset3 = load_dataset("Replete-AI/Replete-Reflection")["train"]
dataset4 = load_dataset("mahiatlinux/Reflection-Dataset-v2")["train"]
dataset5 = load_dataset("Harshkmr/orca-math-word-reflection")["train"]

And now to be able to train our model we need to put the prompt and the answer in the same row, following the prompt template of Gemma2.

<bos><start_of_turn>user
Write a hello world program<end_of_turn>
<start_of_turn>model
```python
print("Hello World")
```<end_of_turn><eos>

import os
import pandas as pd
data = {"text":[]}

### mahiatlinux/Reflection-Dataset-ShareGPT-v2
for i in range(len(dataset1)):
 user = f"""<bos><start_of_turn>user\n{dataset1[1]["conversations"][0]["value"]}<end_of_turn>"""
 llm = f"""<start_of_turn>model\n{dataset1[1]["conversations"][1]["value"]}<end_of_turn><eos>"""
 data["text"].append(user+llm)

 
### gabrielmbmb/distilabel-reflection-tuning
prompts = dataset2["instruction"]
responses = dataset2["generation"]
for i in range(len(prompts)):
 user = f"""<bos><start_of_turn>user\n{prompts[i]}<end_of_turn>"""
 llm = f"""<start_of_turn>model\n{responses[i]}<end_of_turn><eos>"""
 data["text"].append(user+llm)

 
### Replete-AI/Replete-Reflection
prompts = dataset3["input"]
responses = dataset3["output"]
for i in range(int(len(prompts)/3)):
 user = f"""<bos><start_of_turn>user\n{prompts[i]}<end_of_turn>"""
 llm = f"""<start_of_turn>model\n{responses[i]}<end_of_turn><eos>"""
 data["text"].append(user+llm)

### mahiatlinux/Reflection-Dataset-v2
prompts = dataset4["prompt"]
responses = dataset4["response"]
for i in range(int(len(prompts)/3)):
 user = f"""<bos><start_of_turn>user\n{prompts[i]}<end_of_turn>"""
 llm = f"""<start_of_turn>model\n{responses[i]}<end_of_turn><eos>"""
 data["text"].append(user+llm)

### Harshkmr/orca-math-word-reflection
prompts = dataset5["question"]
responses = dataset5["Reflection"]
for i in range(len(prompts)):
 user = f"""<bos><start_of_turn>user\n{prompts[i]}<end_of_turn>"""
 llm = f"""<start_of_turn>model\n{responses[i]}<end_of_turn><eos>"""
 data["text"].append(user+llm)
 
### save as csv
data = pd.DataFrame.from_dict(data)
print(data)
try:
 os.mkdir("/kaggle/working/data")
except:
 pass
data.to_csv('data/dataset.csv', index=False, escapechar='\\')

Now we have the dataset we can install AutoTrain which is the library I used to train the model. It manages the training of any Huggingface model with just one command.

!pip install -U autotrain-advanced
!autotrain setup

After the installation, we can set up the settings.

import os
project_name = 'Reflection-Gemma-2-2b' # @param {type:"string"}
model_name = 'Arthur-LAGACHERIE/Gemma-2-2b-4bit' # @param {type:"string"}

push_to_hub = True # @param ["False", "True"] {type:"raw"}
hf_token = "hf_token" #@param {type:"string"}
hf_username = "Arthur-LAGACHERIE" #@param {type:"string"}


unsloth = False # @param ["False", "True"] {type:"raw"}
learning_rate = 1e-4 # @param {type:"number"}
num_epochs = 1 #@param {type:"number"}
batch_size = 2 # @param {type:"slider", min:1, max:32, step:1}
block_size = 1124 # @param {type:"number"}
trainer = "sft" # @param ["generic", "sft"] {type:"string"}
warmup_ratio = 0.1 # @param {type:"number"}
weight_decay = 0.01 # @param {type:"number"}
gradient_accumulation = 4 # @param {type:"number"}
mixed_precision = "fp16" # @param ["fp16", "bf16", "none"] {type:"string"}
peft = True # @param ["False", "True"] {type:"raw"}
quantization = "int4" # @param ["int4", "int8", "none"] {type:"string"}
lora_r = 16 #@param {type:"number"}
lora_alpha = 32 #@param {type:"number"}
lora_dropout = 0.05 #@param {type:"number"}
merge_adapter = "true"

os.environ["HF_TOKEN"] = hf_token
os.environ["HF_USERNAME"] = hf_username

conf = f"""
task: llm-{trainer}
base_model: {model_name}
project_name: {project_name}
log: tensorboard
backend: local

data:
 path: data/
 train_split: train
 valid_split: null
 chat_template: null
 column_mapping:
 text_column: text

params:
 block_size: {block_size}
 lr: {learning_rate}
 warmup_ratio: {warmup_ratio}
 weight_decay: {weight_decay}
 epochs: {num_epochs}
 batch_size: {batch_size}
 gradient_accumulation: {gradient_accumulation}
 mixed_precision: {mixed_precision}
 peft: {peft}
 quantization: {quantization}
 lora_r: {lora_r}
 lora_alpha: {lora_alpha}
 lora_dropout: {lora_dropout}
 unsloth: {unsloth}
 merge_adapter: {merge_adapter}

hub:
 username: ${{HF_USERNAME}}
 token: ${{HF_TOKEN}}
 push_to_hub: {push_to_hub}
"""

with open("conf.yaml", "w") as f:
 f.write(conf)

And now we just have to run this simple command to train our model.

!autotrain --config conf.yaml

And 4 hours later, it’s ready our model is fine-tuned. 👍

Inference

Now we have our fine-tuned model we can test it to solve problems.

!pip install -q bitsandbytes

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch
from transformers import TextIteratorStreamer
from threading import Thread
device = 'cuda' if torch.cuda.is_available() else 'cpu'

model_id = "Arthur-LAGACHERIE/Reflection-Gemma-2-2b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = """
### System
You are a world-class AI system, capable of complex reasoning and reflection. 
Reason through the query inside <thinking> tags, and then provide your final response inside <output> tags. 
If you detect that you made a mistake in your reasoning at any point, correct yourself inside <reflection> tags.
Try an answer and see if it's correct before generate the ouput. 
But don't forget to think very carefully.

### Question
The question here.
"""


chat = [
 { "role": "user", "content": prompt},
]
question = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
question = tokenizer(question, return_tensors="pt").to(device)
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True)
generation_kwargs = dict(question, streamer=streamer, max_new_tokens=4000)
thread = Thread(target=model.generate, kwargs=generation_kwargs)

# generate
thread.start()
for new_text in streamer:
 print(new_text, end="")

Problems

Let’s see if it can solve some problems.

1.

The length of one span of Jinseo is about 12 centimeters (cm). 
When Jinseo measured the length of the shorter side of the bookshelf, 
it was about two spans. How many centimeters (cm) is the short side of the 
bookshelf?

<thinking> 
To find the length of the shorter side of the bookshelf, we need to figure out how many spans are in the shorter side. We know that one span is 12 cm long. The shorter side is two spans, so we need to multiply the length of one span by 2. 
</thinking>

<output> 
The shorter side of the bookshelf is 2 * 12 cm = 24 cm. 
</output>

👌 Ok first passed with success. (but it was very simple)

2.

How many diagonals can you draw in a decagon?

<thinking> 
A decagon has 10 sides. A diagonal is a line segment that connects two non-adjacent vertices. 
To find the number of diagonals, we can use the formula for the number of diagonals in a polygon. 
The formula is n(n-3)/2, where n is the number of sides. 
So, for a decagon with 10 sides, the number of diagonals would be 10(10-3)/2. Let's calculate that. </thinking>

<output> 
The number of diagonals in a decagon is 10(10-3)/2 = 10(7)/2 = 35 diagonals. 
</output>

✅ Also right.

3.

What is the difference between the largest number and the smallest number 
that can be made by selecting four from the number magnets 
2, 0, 3, 5, and 8 and arranging them in a row to form a four-digit number?

<thinking>
To find the largest and smallest numbers, I need to consider the possible arrangements of the numbers 2, 0, 3, 5, and 8. I should also consider the range of values these numbers can take. I need to figure out the largest and smallest possible four-digit numbers that can be formed. 
</thinking>

<output>
The largest number that can be made is 8350, and the smallest number is 0235. 
</output>

❌ It’s false.

One last:

4.

This year, the age difference between Minsu and his mother is 28 years old, 
and after 13 years, the age of his mother will be twice that of Minsu. 
Find the age of Minsu this year.

<thinking>
Let's break down the problem step-by-step:

* **Let Minsu's current age be 'm'**
* **Let his mother's current age be 'M'**

We know:
* M - m = 28 (age difference is 28 years)
* M + 13 = 2(m + 13) (mother's age in 13 years will be twice Minsu's age in 13 years)

Now we have two equations, and we can solve for our two unknowns. 
</thinking>

<output>
Let's solve for 'm' (Minsu's current age) and 'M' (his mother's current age).

From the first equation, we can express 'M' in terms of 'm':
M = m + 28

Substitute this value of 'M' into the second equation:
(m + 28) + 13 = 2(m + 13)

Simplify and solve for 'm':
m + 41 = 2m + 26
15 = m 

Therefore, Minsu is currently 15 years old. 
</output>

✅

End

I hope you enjoyed this article and if this is the case you can clap it. (you can also follow me) = ).

Arthur-LAGACHERIE/Reflection-Gemma-2-2b · Hugging Face

We're on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.

Published via Towards AI

Frequently Used, Contextual References

Resources

Publication

Building Reflection-Gemma-2–2b

Author(s): Arthur Lagacherie

How does reflection work?

Reflection-tuning

Inference

Problems

1.

2.

3.

4.

End

Arthur-LAGACHERIE/Reflection-Gemma-2-2b · Hugging Face

We're on a journey to advance and democratize artificial intelligence through open source and open science.

Feedback ↓ Cancel reply

Popular posts

Best Laptops for Deep Learning, Machine Learning (ML), and Data Science for 2023

Best Workstations for Deep Learning, Data Science, and Machine Learning (ML) for 2022

Descriptive Statistics for Data-driven Decision Making with Python

Best Machine Learning (ML) Books - Free and Paid - Editorial Recommendations for 2022

Best Data Science Books - Free and Paid - Editorial Recommendations for 2022

Updates

Recent Posts

Top 10 AI and LLM Trends Transforming Marketing in 2024

AI That Thinks Before It Speaks — OpenAI “o1” Models

Approaches Used by State-of-the-Art Vision-Language Models for Handling High-Resolution Images

I used AI to Prioritize My Goals

Building Reflection-Gemma-2–2b

The World’s Leading AI and Technology Publication.

Company

CONTACT US

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Frequently Used, Contextual References

Resources

Publication

Building Reflection-Gemma-2–2b

Author(s): Arthur Lagacherie

How does reflection work?

Reflection-tuning

Inference

Problems

1.

2.

3.

4.

End

Arthur-LAGACHERIE/Reflection-Gemma-2-2b · Hugging Face

We're on a journey to advance and democratize artificial intelligence through open source and open science.

Related posts

Feedback ↓ Cancel reply

Popular posts

Updates

Recent Posts

The World’s Leading AI and Technology Publication.

Company

CONTACT US

GDPR CCPA Statement