DPO, Open-Source’s New Weapon in the AI War
Last Updated on January 25, 2024 by Editorial Team
Author(s): Ignacio de Gregorio
Originally published on Towards AI.
The end of RLHF?
“It is only rarely that, after reading a research paper, I feel like giving the authors a standing ovation.”
If this is how one of the most prominent researchers in the world, Andrew Ng, refers to a recent research paper, you know it’s awesome.
A group of researchers from Stanford and CZ Biohub has presented Direct Preference Optimization (DPO), a new alignment breakthrough that could give the open-source community back the capacity to challenge the big tech companies, something thought impossible… until now.
When one looks at the numbers, it’s easy to realize that building the best Large Language Models (LLMs) like ChatGPT is a rich people’s game.
The current gold standard to build these models is as follows:
Source: Chip Huyen
You first assemble billions of documents with trillions of words and, in a self-supervised manner, you ask the model to predict the next…