Some Notes on Generalization in RL
This is a personal note on some of the ideas in RL I have learned so far. Some of them may be incorrect, and I'd be happy if you could let me know. In this article, we cover a selection of the ideas proposed in this field. Note that most of the content comes from this book.

Background

First of all, let's define the notation and setting. $$ \begin{align*} {\cal S}\quad&\text{state space}\\ {\cal A}\quad&\text{action space}\\ H\quad&\text{horizon}\\ s\quad&\text{state}\\ a\quad&\text{action}\\ h\quad&\text{step}\\ r_h(s,a)\quad&\text{reward}\\ \Bbb P_h(\cdot\vert s,a)\quad&\text{transition probability}\\ K\quad&\text{number of episodes}\\ k\quad&\text{episode}\\ \end{align*} $$ We work in the episodic setting and consider the finite-horizon MDP ${\cal M}=({\cal S}, {\cal A}, H, \Bbb P, r)$, where at each step $h\in\{1,\dots,H\}$ the agent observes state $s$, takes action $a$, receives reward $r_h(s,a)$, and transitions to the next state according to $\Bbb P_h(\cdot\vert s,a)$. We will use this setting for most of this article. ...
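To make the episodic setting concrete, here is a minimal sketch (not from the note) of the interaction protocol under this notation: an agent plays $K$ episodes, each of length $H$, collecting rewards $r_h(s,a)$ and transitioning according to $\Bbb P_h(\cdot\vert s,a)$. The dynamics below are toy stand-ins, assumed purely for illustration: rewards are random tables and transitions are uniform over states.

```python
import random

random.seed(0)

S, A, H, K = 4, 2, 5, 3  # |S|, |A|, horizon, number of episodes

# Hypothetical step-dependent reward table r[h][s][a] in [0, 1].
r = [[[random.random() for _ in range(A)] for _ in range(S)] for _ in range(H)]

def sample_next_state(h, s, a):
    # Uniform transitions stand in for P_h(. | s, a) in this toy example.
    return random.randrange(S)

def run_episode(policy):
    """Roll out one episode of length H and return the cumulative reward."""
    s, total = 0, 0.0
    for h in range(H):
        a = policy(h, s)          # a policy may depend on both step h and state s
        total += r[h][s][a]       # collect reward r_h(s, a)
        s = sample_next_state(h, s, a)
    return total

# K episodes under a uniformly random policy.
returns = [run_episode(lambda h, s: random.randrange(A)) for _ in range(K)]
print(returns)
```

Since rewards lie in $[0,1]$, each episodic return lies in $[0, H]$; a learning algorithm in this setting would update its policy between the $K$ episodes rather than playing uniformly at random.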