Whenever an intelligent agent learns to control an unknown environment, two opposing objectives have to be combined. On the one hand, the environment must be explored sufficiently to identify a (sub)optimal controller. For instance, a robot facing an unknown environment has to spend time moving around and acquiring knowledge. On the other hand, the environment must also be exploited during learning, i.e., experience gained during learning must also be considered for action selection if the costs of learning are to be minimized. For example, although a robot has to explore its environment, it should avoid collisions with obstacles once it has received some negative reward for collisions. For efficient learning, actions should thus be generated in such a way that the environment is explored and pain is avoided. This fundamental trade-off between exploration and exploitation demands efficient exploration capabilities that maximize the effect of learning while minimizing the costs of exploration.
This chapter explores the role of exploration in learning control.
 
@INCOLLECTION{Thrun92c,
 AUTHOR = {S. Thrun},
 YEAR = {1992},
 TITLE = {The Role of Exploration in Learning Control},
 BOOKTITLE = {Handbook of Intelligent Control: Neural, Fuzzy, and
 Adaptive Approaches},
 EDITOR = {D.A. White and D.A. Sofge},
 PUBLISHER = {Van Nostrand Reinhold},
 ADDRESS = {Florence, Kentucky 41022}
}