• If you are citizen of an European Union member nation, you may not use this service unless you are at least 16 years old.

  • You already know Dokkio is an AI-powered assistant to organize & manage your digital files & messages. Very soon, Dokkio will support Outlook as well as One Drive. Check it out today!


Large state spaces are hard for RL

Page history last edited by Satinder Singh 15 years, 2 months ago

The ambition of this page is to discuss (and hopefully dispel) the myth: Large state spaces are hard for RL.


The main reason this myth exists is that RL problems with large state spaces can be hard to solve with naive versions of RL algorithms.  For example, Q-learning with inexpertly chosen input feature representation and/or inexperienced choice of function approximation can often lead to poor scaling with size of state space. Another reason for this myth is that the most well known and basic theoretical analyses of foundational RL algorithms such as Q-learning, TD, or E^3 are for the lookup table case and these naturally provide results that show at least a linear dependence on the size of the state space. Linear dependence on size of state space is too slow for real problems.


This is a myth for the following reasons:

  1. Several researchers have successfully applied RL to large state space problems involving very large discrete state spaces, and various robot learning tasks involving high-dimensional continuous state spaces. (See Successes of RL for a more comprehensive list)
  2. One can easily construct artificial and arbitrarily large-dimensional continuous state spaces in which the optimal value-function or the optimal policy or both may be very simply defined and thus easily learned.
  3. There exist RL algorithms such as sparse-sampling, approximate value iteration, approximate policy iteration, policy gradient,  conservative policy iteration, etc., whose complexity is independent of the size of the state space.


Statements of the form "I am interested in solving problem X, and X has a very large state space and therefore X is clearly difficult for RL" are at best lazy and at worst wrong. It hides the speaker's ignorance as to what really makes X hard.


In fact, what makes some RL problems hard remains an important and open research question.

Comments (0)

You don't have permission to comment on this page.