This page aims to collect reinforcement learning (RL) success stories. By "success story" we mean an application of RL methods to a substantial and difficult problem domain that is of independent interest to some community. This definition is admittedly vague; if that leads to a longer list than otherwise, that is fine.
Jump to successes in: [[#RoboticS][Robotics]], [[#ControL][Control]], [[#OperationsresearcH][Operations Research]], [[#GameS][Games]], [[#HcI][Human-Computer Interaction]], [[#EcO][Economics/Finance]], [[#CoS][Complex Simulation]], [[#MkT][Marketing]]
-------------
#RoboticS
* %RED%Robotics%ENDCOLOR%
1 (__Quadruped Gait Control__) Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion, by Nate Kohl and Peter Stone.
1 (__Quadruped Ball Acquisition__) Learning Ball Acquisition on a Physical Robot, by Peggy Fidelman and Peter Stone.
1 (__Air Hockey__) Learning from Observation Using Primitives, and particularly the movie of a humanoid robot playing air hockey. An example paper.
1 (__Active Sensing__) Active Sensing Using Reinforcement Learning, by Cody Kwok and Dieter Fox.
#ControL
* %RED%Control%ENDCOLOR%
1 (__Helicopter control__) [[http://www.robotics.stanford.edu/~ang/papers/iser04-invertedflight.pdf][Inverted autonomous helicopter flight via reinforcement learning]], by Andrew Y. Ng, Adam Coates, Mark Diel, Varun Ganapathi, Jamie Schulte, Ben Tse, Eric Berger and Eric Liang. In International Symposium on Experimental Robotics, 2004.
1 (__Helicopter control__) [[http://www.ri.cmu.edu/pubs/pub_3791.html][Autonomous helicopter control using Reinforcement Learning Policy Search Methods]], by J.A. Bagnell and J. Schneider. In Proceedings of the International Conference on Robotics and Automation, 2001.
#OperationsresearcH
* %RED%Operations Research%ENDCOLOR%
1 (__Pricing__) [[http://www.stanford.edu/~bvr/psfiles/GM-pricing.pdf][Opportunities and Challenges in Using Online Preference Data for Vehicle Pricing: A Case Study at General Motors]], by P. Rusmevichientong, J. A. Salisbury, L. T. Truss, B. Van Roy, and P. W. Glynn.
1 (__Vehicle Routing__) [[http://web.engr.oregonstate.edu/~proper/AAAI04SProper.pdf][Scaling Average-reward Reinforcement Learning for Product Delivery]], by S. Proper and P. Tadepalli.
#GameS
* %RED%Games%ENDCOLOR%
1 (__Backgammon__) [[http://www.research.ibm.com/massive/tdl.html][Temporal difference learning and TD-Gammon]], by Gerald Tesauro, Communications of the ACM, 38(3), March 1995.
1 (__Solitaire__) [[http://www.stanford.edu/~bvr/psfiles/solitaire.pdf][Solitaire: Man Versus Machine]], by X. Yan, P. Diaconis, P. Rusmevichientong, and B. Van Roy, to appear in Advances in Neural Information Processing Systems 17, MIT Press, 2005.
1 (__Chess__) [[http://www.syseng.anu.edu.au/lsg/knightcap.html][The KnightCap program]], which went from a rating of 1600 to a rating of 2100 by altering its heuristic evaluation function using TD-lambda. [[http://citeseer.ist.psu.edu/6262.html][CiteSeer]] has a link to the paper.
1 (__Checkers__) [[http://www.cs.ualberta.ca/~jonathan/Papers/Papers/td.ps][Temporal Difference Learning Applied to a High-Performance Game-Playing Program]], by Jonathan Schaeffer, Markian Hlynka, and Vili Jussila, International Joint Conference on Artificial Intelligence (IJCAI), pp. 529-534, 2001.
#HcI
* %RED%Human-Computer Interaction%ENDCOLOR%
1 (__Spoken Dialogue Systems__) [[http://www.eecs.umich.edu/~baveja/Papers/RLDSjair.pdf][Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System]]. S. Singh, D. Litman, M. Kearns and M. Walker. In Journal of Artificial Intelligence Research (JAIR), Volume 16, pages 105-133, 2002.
1 (__Software Agent in MOOs__) [[http://www.eecs.umich.edu/~baveja/Papers/CobotNIPS01.pdf][Cobot: A Social Reinforcement Learning Agent]]. C. Isbell, C. Shelton, M. Kearns, S. Singh, and P. Stone (2002). In Proceedings of Neural Information Processing Systems 14 (NIPS), pp. 1393-1400.
#EcO
* %RED%Economics/Finance%ENDCOLOR%
1 (__Trading__) Learning to Trade via Direct Reinforcement. John Moody and Matthew Saffell, IEEE Transactions on Neural Networks, Vol 12, No 4, July 2001.
#CoS
* %RED%Complex Simulation%ENDCOLOR%
1 (__Robot Soccer__) [[http://www.cs.utexas.edu/~pstone/Papers/bib2html-links/ICML2001.pdf][Scaling Reinforcement Learning toward RoboCup Soccer]], by Peter Stone and Richard S. Sutton, Proceedings of the Eighteenth International Conference on Machine Learning, pp. 537–544, Morgan Kaufmann, San Francisco, CA, 2001.
#MkT
* %RED%Marketing%ENDCOLOR%
1 (__Targeted Marketing__) [[http://www.research.ibm.com/people/n/nabe/kdd04AVAS.pdf][Cross Channel Optimized Marketing by Reinforcement Learning]], by Naoki Abe, Naval Verma, Chid Apte and Robert Schroko, Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2004.