The *ambition* of this page is to discuss (and hopefully dispel) the myth: "Large state spaces are hard for RL."

The main reason this myth exists is that RL problems with large state spaces can be hard to solve with *naive* versions of RL algorithms. For example, Q-learning with an *inexpertly* chosen input feature representation and/or an *inexpert* choice of function approximator can scale poorly with the size of the state space. Another reason for this myth is that the best-known and most *basic* theoretical analyses of foundational RL algorithms such as Q-learning, TD, and E^3 cover the lookup-table case, and these naturally yield results with at least a linear dependence on the size of the state space. Linear dependence on the size of the state space is too slow for real problems.
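To make the lookup-table scaling concrete, here is a minimal sketch of tabular Q-learning; the state and action counts are illustrative assumptions, not from this page. The point is simply that the table has one entry per (state, action) pair, so memory and any full sweep over the table grow linearly with the size of the state space.

```python
import numpy as np

def make_q_table(n_states, n_actions):
    """A lookup-table Q-function: an |S| x |A| array of values."""
    return np.zeros((n_states, n_actions))

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One standard Q-learning backup on a single table entry."""
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

# Doubling |S| doubles the table: storage is Theta(|S| * |A|).
small = make_q_table(1_000, 4)      # 4,000 entries
big = make_q_table(1_000_000, 4)    # 4,000,000 entries
```

This is exactly the regime the basic analyses describe; it says nothing about algorithms that never enumerate the state space.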

This is a myth for the following reasons:

- Several researchers have successfully applied RL to problems involving very large discrete state spaces, as well as to various robot learning tasks involving high-dimensional continuous state spaces. (See Successes of RL for a more comprehensive list.)
- One can easily construct artificial and arbitrarily large-dimensional continuous state spaces in which the optimal value-function or the optimal policy or both may be very simply defined and thus easily learned.
- There exist RL algorithms, such as sparse sampling, approximate value iteration, approximate policy iteration, policy gradient, and conservative policy iteration, whose complexity is *independent of the size of the state space*.
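The second bullet above can be sketched directly. Below is a toy construction, with invented dynamics and rewards, of a d-dimensional continuous state space in which only one coordinate matters: the optimal policy is a one-line rule on coordinate 0, and the remaining d-1 coordinates are pure distraction. The difficulty of the problem does not grow with d.

```python
import numpy as np

def make_env(d, rng):
    """States are points in R^d; only coordinate 0 affects the reward."""
    def reset():
        return rng.standard_normal(d)

    def step(s, a):
        # Action a in {-1, +1} nudges coordinate 0; the other
        # d-1 coordinates never influence the reward.
        s_next = s.copy()
        s_next[0] += 0.1 * a
        reward = -abs(s_next[0])  # best behavior: drive x[0] toward 0
        return s_next, reward

    return reset, step

def optimal_policy(s):
    """Optimal regardless of d: act on the sign of coordinate 0 alone."""
    return -1 if s[0] > 0 else 1

# The same one-line policy is optimal in 10 or 10,000 dimensions.
rng = np.random.default_rng(0)
for d in (10, 10_000):
    reset, step = make_env(d, rng)
    s = reset()
    s, r = step(s, optimal_policy(s))
```

Any learner that can identify the one relevant feature solves this family of problems at a cost unrelated to the nominal dimension of the state space, which is the sense in which "large state space" by itself implies nothing about hardness.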

Statements of the form "I am interested in solving problem X; X has a very large state space, and therefore X is clearly difficult for RL" are at best lazy and at worst wrong. Such statements hide the speaker's ignorance of what really makes X hard.

In fact, what makes some RL problems hard remains an important and open research question.
