The *ambition* of this page is to discuss (and hopefully dispel) the myth that *large state spaces are hard for RL*.

The main reason this myth exists is that RL problems with large state spaces can be hard to solve with *naive* RL algorithms. For example, Q-learning with an *inexpertly* chosen input feature representation and/or an *inexperienced* choice of function approximator often scales poorly with the size of the state space. Another reason for this myth is that the best-known and most *basic* theoretical analyses of foundational RL algorithms such as Q-learning, TD, or E^3 treat the lookup-table case, and these naturally yield results showing at least a linear dependence on the size of the state space. A linear dependence on the size of the state space is too slow for real problems.
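The lookup-table dependence above can be made concrete with a minimal sketch (my own illustration, not from this page): a tabular Q-learner keeps one row per state, so both its memory footprint and the number of backups needed to touch every state grow linearly with |S|.

```python
# Minimal tabular Q-learning sketch (illustrative toy, not from the page).
# The table has |S| x |A| entries: memory, and the updates needed to cover
# all states, scale linearly with the size of the state space.
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning backup for a single (s, a, r, s') sample."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

num_states, num_actions = 100_000, 4
Q = np.zeros((num_states, num_actions))   # one row per state
q_update(Q, s=0, a=1, r=1.0, s_next=2)
print(Q.shape)                            # (100000, 4) -- linear in |S|
```

Function approximation exists precisely to break this coupling between the number of states and the number of learned parameters.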

This is a myth for the following reasons:

- Several researchers have successfully applied RL to problems with large state spaces. For example, TD-Gammon and channel assignment in cellular telephones (see Successes of RL for a more comprehensive list) involve very large discrete state spaces, while various robot-learning tasks and the recent helicopter-control work involve high-dimensional continuous state spaces.
- One can easily construct artificial, arbitrarily high-dimensional continuous state spaces in which the optimal value function, the optimal policy, or both are very simply defined and thus easily learned.
- There exist RL algorithms, such as sparse sampling, approximate value iteration, approximate policy iteration, policy gradient, and conservative policy iteration, whose complexity is *independent of the size of the state space*.
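The second point above can be sketched concretely. The toy construction below is my own (not from this page): states live in [-1, 1]^n for any n, but the reward depends only on the sign of the first coordinate, so a simple linear Q-value fit recovers the optimal policy no matter how large n is.

```python
# Toy construction (illustrative, not from the page): an n-dimensional
# continuous state space whose optimal policy depends only on sign(s[0]).
# A linear least-squares Q-fit learns it easily regardless of n.
import numpy as np

rng = np.random.default_rng(0)
n, N = 50, 5000                     # state dimension (can be huge), samples

# States uniform in [-1, 1]^n; reward is 1 iff the action matches sign(s[0]).
S = rng.uniform(-1.0, 1.0, size=(N, n))
optimal_action = (S[:, 0] > 0).astype(int)

# Fit a linear Q-value for each of the two actions by least squares.
X = np.hstack([S, np.ones((N, 1))])             # features plus bias column
weights = []
for a in (0, 1):
    r = (optimal_action == a).astype(float)     # reward for always playing a
    w, *_ = np.linalg.lstsq(X, r, rcond=None)
    weights.append(w)

# Greedy policy from the learned Q-values, evaluated on fresh states.
S_test = rng.uniform(-1.0, 1.0, size=(2000, n))
X_test = np.hstack([S_test, np.ones((2000, 1))])
greedy = np.argmax(np.stack([X_test @ w for w in weights]), axis=0)
accuracy = np.mean(greedy == (S_test[:, 0] > 0))
print(f"greedy accuracy with n={n}: {accuracy:.3f}")
```

Raising n adds irrelevant dimensions but changes nothing essential: the sample and compute cost of learning this policy is governed by the problem's structure, not by the volume of the state space.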

Statements of the form "I am interested in solving problem X, and X has a very large state space, therefore X is clearly difficult for RL" are at best lazy and at worst wrong. They hide the speaker's ignorance of what really makes X hard.

In fact, what makes some RL problems hard remains an important and open research question.
