DeepMind and OpenAI Use Human Feedback to Maximize the Performance of Reinforcement Learning Agents

Towards AI Team

3 years ago

A research paper from 2018, introduces a training model that combines human feedback and reward optimization to maximize the knowledge of…

Published via Towards AI