Meta’s Self-Rewarding Models, the Key to Superhuman LLMs?
Author(s): Ignacio de Gregorio

Meta, the company behind Facebook, WhatsApp, and the Ray-Ban Meta glasses, recently announced a highly promising AI breakthrough: Self-Rewarding Language Models.

With this method, their fine-tuned Llama 2 70B model surpasses models like Claude 2, Gemini Pro, and GPT-4 0613 on the AlpacaEval 2.0 leaderboard, despite being at least an order of magnitude smaller.

However, that is not the true breakthrough. These new models also look like a reasonable path to creating the first superhuman LLMs, even if that brings humans one step closer to losing control over our best AI models.

But what does that mean? And is that a good thing?

Let’s find out.

To this day, humans play a crucial role in the creation of every frontier model, such as ChatGPT or Claude.

As explained in my newsletter from two weeks ago, the later stages of the training process of…

