
Reasoning Model: Short Overview and Feature for Developers

Last Updated on January 21, 2025 by Editorial Team

Author(s): Igor Novikov

Originally published on Towards AI.

Image by the author

When LLMs first came out, they were a bit like children: they would say the first thing that came to mind and didn't bother much with logic. You had to tell them to think before they spoke. And, just like with children, even then it didn't mean they actually would.

Many argued that, because of this, the models do not possess real intelligence and must be supplemented either with human help or with some external framework on top of the LLM, such as Chain of Thought.

It was only a matter of time before major LLM developers like OpenAI decided to replicate this external thinking step (see the picture below) inside the LLM itself. After all, the idea is simple: create a dataset that contains not just question-answer pairs but the whole step-by-step reasoning, and train on that. It does require more compute at inference time, as the model goes through the same step-by-step thinking process when determining the answer.

Added thinking step. Image by OpenAI
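As a rough illustration of the dataset idea (OpenAI has not published its actual format, so the field names and `<think>` markup below are purely hypothetical), such a training example might pair a question with explicit intermediate steps before the final answer:

```python
# Hypothetical sketch of a reasoning-style training example.
# The exact dataset format is not public; this structure is illustrative only.
example = {
    "question": "A train travels 120 km in 2 hours. What is its average speed?",
    "reasoning_steps": [
        "Average speed is distance divided by time.",
        "Distance is 120 km, time is 2 hours.",
        "120 km / 2 h = 60 km/h.",
    ],
    "answer": "60 km/h",
}

def to_training_text(ex):
    """Flatten the example into one training string with an explicit thinking section."""
    steps = "\n".join(ex["reasoning_steps"])
    return f"Q: {ex['question']}\n<think>\n{steps}\n</think>\nA: {ex['answer']}"

print(to_training_text(example))
```

Training on text like this teaches the model to emit (and spend tokens on) the intermediate steps before committing to an answer.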

These models natively break problems down into small pieces, integrating a Chain-of-Thought approach, error correction, and attempts at multiple strategies before answering.

o1 spends more time at inference (it is about 30 times slower than GPT-4o), and, what a surprise: longer thinking time leads to better results!

Image by OpenAI

Reasoning tokens are not passed from one turn to the next; only the visible output is.
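A minimal sketch of what this means for multi-turn conversation management (the response shape below is hypothetical; real API field names differ):

```python
# Sketch: in a multi-turn conversation, the hidden reasoning is discarded
# and only the visible output is appended to the history.
history = [{"role": "user", "content": "Prove that 17 is prime."}]

# Hypothetical response object: reasoning tokens are produced (and billed)
# at inference time, but are never re-sent on the next turn.
response = {
    "hidden_reasoning": "Check divisors up to sqrt(17): 2, 3, 4 all fail ...",
    "output": "17 is prime: no integer from 2 to 4 divides it.",
}

# Append ONLY the output to the conversation history.
history.append({"role": "assistant", "content": response["output"]})
history.append({"role": "user", "content": "Now do 19."})

print(history[1]["content"])
```

The practical consequence: you pay for reasoning tokens on every turn, but they never accumulate in your context window.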

It also verifies the solution by generating multiple answers and choosing the best via consensus, an approach we previously had to implement manually. Here is the overall process:

Image by OpenAI
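The consensus step is essentially self-consistency via majority voting, which used to be wired up by hand. A minimal sketch of that manual version:

```python
from collections import Counter

def consensus_answer(samples):
    """Pick the most common final answer among independently sampled
    solutions (self-consistency / majority voting)."""
    counts = Counter(samples)
    answer, _ = counts.most_common(1)[0]
    return answer

# e.g. five independently sampled solutions to the same problem:
samples = ["42", "42", "41", "42", "40"]
print(consensus_answer(samples))  # → "42"
```

Reasoning models fold this sample-and-vote loop into the model itself, so you no longer orchestrate it from outside.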

One important conclusion is that GPU compute requirements are going to grow. Since longer thinking time (in tokens) clearly leads to better answers, model quality can now be scaled simply by giving the model more compute at inference, whereas before this was mostly true only at the training phase. So GPU requirements for modern models are going to be significantly higher.

These models are thus different, and the old approaches no longer work.

How to work with reasoning models

Interestingly it is kind of similar to working with an intelligent human:

  1. Be simple and direct. State your question clearly.
  2. Don't prompt an explicit Chain of Thought; the model does that internally.
  3. Use a good structure: break the prompt into sections using clear markup.
  4. Show, don't tell: it is better to show the model an example of a good answer or behavior than to describe it in several thousand words.
  5. No more coaxing, intimidating, or bribing the model.

I can even summarize this in one line: know what you want to ask, and ask it clearly.
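To make the contrast concrete, here is a sketch of the old style versus the recommended style (both prompts are illustrative, not taken from any official guide):

```python
# Old style: explicit step-by-step coaxing and bribery,
# which reasoning models no longer need.
old_prompt = (
    "You are a world-class mathematician. Take a deep breath. "
    "Think step by step, and I'll tip you $200 for a correct answer.\n"
    "Question: what is the sum of the first 100 positive integers?"
)

# New style: simple, direct, clearly structured; the model reasons internally.
new_prompt = (
    "## Task\n"
    "Compute the sum of the first 100 positive integers.\n"
    "## Output format\n"
    "A single integer."
)

print(new_prompt)
```

Note that the new prompt relies on clear sectioning rather than persuasion; the markup headers play the role of structure from point 3 above.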

Mini vs Full models

Since reasoning models like o3 consume a lot of tokens during inference, they are rather expensive to use for everything, and the latency is not great. So the idea is to delegate only the most difficult work to them, the high-level thinking and planning, and have faster, more cost-efficient mini models execute the plan. The minis can be used for tasks like coding, math, and science.

This is an agentic approach that lets us combine the best of both worlds: smart but expensive models with small, fast workers.
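A minimal sketch of this planner/worker split (the `call_model` function and model names below are hypothetical stand-ins for real LLM API calls):

```python
def call_model(model, prompt):
    """Placeholder for an LLM API call; in a real system this would hit an endpoint.
    Here it returns canned results so the control flow can be demonstrated."""
    if model == "reasoning-large":
        # The expensive reasoning model returns a high-level plan.
        return ["1. Parse the CSV", "2. Compute column means", "3. Plot the result"]
    # A cheap mini model executes one concrete step.
    return f"[{model}] done: {prompt}"

def solve(task):
    # Expensive model does the thinking and planning once...
    plan = call_model("reasoning-large", f"Plan the steps to: {task}")
    # ...cheap, fast mini models execute each step of the plan.
    return [call_model("mini-fast", step) for step in plan]

for result in solve("summarize sales data"):
    print(result)
```

The design point is that the costly, slow reasoning call happens once per task, while the per-step work runs on cheap, low-latency models.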

How much better are these models?

Much better, and they are going to improve further soon. o1 is approaching expert humans in math and coding (see below):

Math

Image by OpenAI

Coding

Image by OpenAI

An ELO of 2727 puts o3 among the top 200 competitive coders in the world. If you are not yet worried about your job security as a developer, it's time to start. This is exactly the kind of job that scales perfectly with more computing power, and the current rate of progress shows no signs of slowing down.

What is next

I can only speculate, but my take is that for a year or two it will be possible to dramatically improve model quality just by adding more inference compute and improving training datasets. Adding some sort of memory outside the context window also seems logical, although very expensive at scale.

I think the next big step is to implement a multiagent architecture at the LLM level, so the model can run multiple collaborating internal dialogues that share the same memory and context. This follows the current trajectory of embedding external thinking tools into the model, and it also benefits from the linear scaling of compute at training and inference time. So I think by the end of this year or the next we will see an LMM, a Large Multiagent Model, or something similar. The sky is the limit for such a model, so I propose we call it SkyNet.


Published via Towards AI
