Site icon Towards AI

DeepMind and OpenAI Use Human Feedback to Maximize the Performance of Reinforcement Learning Agents

Author(s): Jesus Rodriguez

A research paper from 2018, introduces a training model that combines human feedback and reward optimization to maximize the knowledge of…

Published via Towards AI

Exit mobile version