
Reasoning Model: Short Overview and Feature for Developers

Last Updated on January 21, 2025 by Editorial Team

Author(s): Igor Novikov

Originally published on Towards AI.

Image by the author

When LLMs first came out, they were a bit like children: they would say the first thing that came to mind and didn't bother much with logic. You had to tell them to think before they spoke. And, just like with children, even then it didn't mean they actually would.

Many argued that, because of this, the models do not possess real intelligence and must be supplemented either with human help or with some external framework on top of the LLM, such as Chain of Thought.

It was only a matter of time before major LLM developers like OpenAI decided to replicate this external thinking step (see the picture below) inside the LLM itself. After all, the idea is simple: create a dataset that contains not just question-answer pairs but the whole step-by-step reasoning, and train on that. It does require more compute at inference time, as the model goes through the same step-by-step thinking process when determining the answer.

Added thinking step. Image by OpenAI
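As a rough illustration of the dataset idea (OpenAI has not published its actual format, so the field names and `<think>` markup below are purely hypothetical), such a training example might pair a question with explicit intermediate steps before the final answer:

```python
# Hypothetical sketch of a reasoning-style training example.
# The exact dataset format is not public; this structure is illustrative only.
example = {
    "question": "A train travels 120 km in 2 hours. What is its average speed?",
    "reasoning_steps": [
        "Average speed is distance divided by time.",
        "Distance is 120 km, time is 2 hours.",
        "120 km / 2 h = 60 km/h.",
    ],
    "answer": "60 km/h",
}

def to_training_text(ex):
    """Flatten the example into one training string with an explicit thinking section."""
    steps = "\n".join(ex["reasoning_steps"])
    return f"Q: {ex['question']}\n<think>\n{steps}\n</think>\nA: {ex['answer']}"

print(to_training_text(example))
```

Training on text like this teaches the model to emit (and spend tokens on) the intermediate steps before committing to an answer.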

These models natively break problems down into small pieces, integrating a Chain-of-Thought approach, error correction, and attempts at multiple strategies before answering.

o1 spends more time at inference (it is about 30 times slower than GPT-4o), and, what a surprise: longer thinking time leads to better results!

Image by OpenAI

Reasoning tokens are not passed from one turn to the next; only the visible output is.
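A minimal sketch of what this means for multi-turn conversation management (the response shape below is hypothetical; real API field names differ):

```python
# Sketch: in a multi-turn conversation, the hidden reasoning is discarded
# and only the visible output is appended to the history.
history = [{"role": "user", "content": "Prove that 17 is prime."}]

# Hypothetical response object: reasoning tokens are produced (and billed)
# at inference time, but are never re-sent on the next turn.
response = {
    "hidden_reasoning": "Check divisors up to sqrt(17): 2, 3, 4 all fail ...",
    "output": "17 is prime: no integer from 2 to 4 divides it.",
}

# Append ONLY the output to the conversation history.
history.append({"role": "assistant", "content": response["output"]})
history.append({"role": "user", "content": "Now do 19."})

print(history[1]["content"])
```

The practical consequence: you pay for reasoning tokens on every turn, but they never accumulate in your context window.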

It also verifies the solution by generating multiple answers and choosing the best via consensus, an approach we previously had to implement manually. Here is the overall process:

Image by OpenAI
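The consensus step is essentially self-consistency via majority voting, which used to be wired up by hand. A minimal sketch of that manual version:

```python
from collections import Counter

def consensus_answer(samples):
    """Pick the most common final answer among independently sampled
    solutions (self-consistency / majority voting)."""
    counts = Counter(samples)
    answer, _ = counts.most_common(1)[0]
    return answer

# e.g. five independently sampled solutions to the same problem:
samples = ["42", "42", "41", "42", "40"]
print(consensus_answer(samples))  # → "42"
```

Reasoning models fold this sample-and-vote loop into the model itself, so you no longer orchestrate it from outside.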

One important conclusion is that GPU compute requirements are going to grow. Since longer thinking time (in tokens) clearly leads to better answers, model quality can now be scaled simply by giving the model more compute at inference, whereas before this was mostly true only at the training phase. So GPU requirements for modern models are going to be significantly higher.

These models are thus different, and the old approaches no longer work.

How to work with reasoning models

Interestingly it is kind of similar to working with an intelligent human:

  1. Be simple and direct. State your question clearly.
  2. Don't prompt an explicit Chain of Thought; the model does that internally.
  3. Use a good structure: break the prompt into sections using clear markup.
  4. Show, don't tell: it is better to show the model an example of a good answer or behavior than to describe it in several thousand words.
  5. No more coaxing, intimidating, or bribing the model.

I can even summarize this in one line: know what you want to ask, and ask it clearly.
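To make the contrast concrete, here is a sketch of the old style versus the recommended style (both prompts are illustrative, not taken from any official guide):

```python
# Old style: explicit step-by-step coaxing and bribery,
# which reasoning models no longer need.
old_prompt = (
    "You are a world-class mathematician. Take a deep breath. "
    "Think step by step, and I'll tip you $200 for a correct answer.\n"
    "Question: what is the sum of the first 100 positive integers?"
)

# New style: simple, direct, clearly structured; the model reasons internally.
new_prompt = (
    "## Task\n"
    "Compute the sum of the first 100 positive integers.\n"
    "## Output format\n"
    "A single integer."
)

print(new_prompt)
```

Note that the new prompt relies on clear sectioning rather than persuasion; the markup headers play the role of structure from point 3 above.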

Mini vs Full models

Since reasoning models like o3 consume a lot of tokens during inference, they are rather expensive to use for everything, and the latency is not great. So the idea is to delegate only the most difficult work to them, the high-level thinking and planning, and have faster, more cost-efficient mini models execute the plan. The minis can be used for tasks like coding, math, and science.

This is an agentic approach that lets us combine the best of both worlds: smart but expensive models with small, fast workers.
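A minimal sketch of this planner/worker split (the `call_model` function and model names below are hypothetical stand-ins for real LLM API calls):

```python
def call_model(model, prompt):
    """Placeholder for an LLM API call; in a real system this would hit an endpoint.
    Here it returns canned results so the control flow can be demonstrated."""
    if model == "reasoning-large":
        # The expensive reasoning model returns a high-level plan.
        return ["1. Parse the CSV", "2. Compute column means", "3. Plot the result"]
    # A cheap mini model executes one concrete step.
    return f"[{model}] done: {prompt}"

def solve(task):
    # Expensive model does the thinking and planning once...
    plan = call_model("reasoning-large", f"Plan the steps to: {task}")
    # ...cheap, fast mini models execute each step of the plan.
    return [call_model("mini-fast", step) for step in plan]

for result in solve("summarize sales data"):
    print(result)
```

The design point is that the costly, slow reasoning call happens once per task, while the per-step work runs on cheap, low-latency models.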

How much better are these models?

Much better, and they are going to improve further soon. o1 is approaching expert humans in math and coding (see below):

Math

Image by OpenAI

Coding

Image by OpenAI

An ELO of 2727 puts o3 among the top 200 competitive coders in the world. If you are not yet worried about your job security as a developer, it's time to start. This is exactly the kind of job that scales perfectly with more computing power, and the current rate of progress shows no signs of slowing down.

What is next

I can only speculate, but my take is that for a year or two it will be possible to dramatically improve model quality just by adding more inference compute and improving training datasets. Adding some sort of memory outside the context window also seems logical, although very expensive at scale.

I think the next big step is to implement a multiagent architecture at the LLM level, so the model can run multiple collaborating internal dialogues that share the same memory and context. This follows the current trajectory of embedding external thinking tools into the model, and it also benefits from the linear scaling of compute at training and inference time. So I think by the end of this year or the next we will see an LMM, a Large Multiagent Model, or something similar. The sky is the limit for such a model, so I propose we call it SkyNet.


Published via Towards AI
