Linear Function Approximation in Reinforcement Learning
Author(s): Shivam Mohan
Originally published on Towards AI.
In reinforcement learning (RL), a key challenge is estimating the value function, which predicts future rewards based on the current state. In large or continuous state spaces, it's often impractical to explicitly store or compute the value function for every possible state. This is where function approximation becomes essential, allowing us to generalize the value of unseen states from observed ones.
A widely used approach in RL is linear function approximation. Instead of learning the value of each state individually, we represent the value function as a weighted combination of features of that state. Mathematically, we express the estimated value function V(s) as:
V(s) ≈ w_1 * φ_1(s) + w_2 * φ_2(s) + … + w_k * φ_k(s)
Where:
- w is a vector of weights (parameters) that we aim to learn.
- φ(s) is a vector of features (or basis functions) that describe the state.
Each feature φ_i(s) represents a specific characteristic of the state. For example, in a Pac-Man-like environment:
- φ_1(s) could represent the distance to the nearest dot.
- φ_2(s) might represent the inverse distance to the nearest ghost.
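To make this concrete, here is a minimal sketch of computing V(s) = w · φ(s) in Python. The state encoding and the two Pac-Man-style features are illustrative assumptions, not taken from the article:

```python
import numpy as np

# Hypothetical feature extractor for a Pac-Man-like state.
# The state encoding (dist_to_dot, dist_to_ghost) is an assumption for illustration.
def features(state):
    dist_to_dot, dist_to_ghost = state
    return np.array([
        dist_to_dot,                    # phi_1(s): distance to the nearest dot
        1.0 / (dist_to_ghost + 1e-6),   # phi_2(s): inverse distance to the nearest ghost
    ])

def value(state, w):
    # V(s) ≈ w_1 * phi_1(s) + w_2 * phi_2(s) = w · phi(s)
    return float(np.dot(w, features(state)))

w = np.array([-0.5, -2.0])              # example weights; in practice these are learned
print(value((3.0, 5.0), w))             # estimated value of one sample state
```

Because the value estimate is just a dot product, it generalizes: states that share similar feature values receive similar value estimates, even if they were never visited.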
By learning the appropriate weights for these features, we can approximate the value function V(s) across the entire state space.
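One common way to learn such weights, shown here as a minimal generic sketch rather than the article's exact method, is a semi-gradient TD(0) update, which nudges w along the feature vector in proportion to the TD error:

```python
import numpy as np

def td0_update(w, phi_s, reward, phi_next, alpha=0.01, gamma=0.99):
    """One semi-gradient TD(0) step on the weight vector w."""
    td_target = reward + gamma * np.dot(w, phi_next)   # r + gamma * V(s')
    td_error = td_target - np.dot(w, phi_s)            # delta = target - V(s)
    return w + alpha * td_error * phi_s                # w <- w + alpha * delta * phi(s)

# Example: one update from a single hypothetical transition (s, r, s').
# The feature vectors would normally come from a feature extractor like the one above.
w = np.array([-0.5, -2.0])
w = td0_update(w, phi_s=np.array([3.0, 0.2]), reward=1.0, phi_next=np.array([2.0, 0.17]))
```

Repeating this update over many observed transitions gradually moves w toward weights whose linear combination of features best predicts future rewards.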