Reinforcement Learning: Dynamic Programming and Monte Carlo — Part 2
Author(s): Tan Pengshi Alvin

Introducing two simple iterative techniques to solve the Markov Decision Process
Image by Wil Stewart on Unsplash

In the previous article, Part 1, we formulated the Markov Decision Process (MDP) as a paradigm for solving any Reinforcement Learning (RL) problem. However, that overarching framework did not include a systematic way to solve the MDP. We ruled out linear techniques, like matrix inversion, and briefly raised the possibility of using iterative techniques instead. To revisit the idea of the MDP, check out Part 1 below:

Introducing the backbone of Reinforcement Learning — The Markov Decision Process

