See also the more recent overview blog post (2021) on artificial curiosity and creativity since 1990!
Only data with still unknown but learnable statistical or algorithmic regularities are truly novel, surprising, or interesting, and thus deserve attention. Even beautiful things are not necessarily interesting: beauty reflects low complexity with respect to the observer's current knowledge; interestingness and curiosity reflect the learning process leading from high to low subjective complexity. More.
ON TV: Schmidhuber's theory of interestingness / curiosity / beauty / surprise / novelty / creativity was the subject of a TV documentary (BR "Faszination Wissen", 29 May 2008, 21:15, plus several later repeats on other channels).
See also an interview in HPlus Magazine: Build Optimal Scientist, Then Retire. This got slashdotted in 2010.
We show [e.g., refs 3, 4, 6, 8, 12 below] that intrinsic curiosity reward can speed up both the construction of predictive world models and the collection of external reward.
More recent work on PowerPlay [29,30,31] uses artificial creativity & curiosity to incrementally build a more and more general problem solver. It does not just solve given tasks but keeps inventing new ones, without forgetting old skills. How? By continually searching for the simplest (fastest to find) still unsolvable task and its solution.
Fundamental Principle of Artificial Curiosity and Creativity:
Reward the reward-optimizing controller for actions yielding data that cause improvements of the adaptive predictor or data compressor!
(Formulated in the early 1990s; basis of much of the recent work in Developmental Robotics since 2004)
Variant 1: Reward the controller whenever the predictor errs [1990; refs 1a, 1, 2]. The predictor minimizes the objective function maximized by the generative controller. The first Generative Adversarial Networks of 1990!
Variant 2: Reward the controller whenever the predictor improves / becomes more reliable [1991; refs 3, 4, 6, 13, 14].
Variant 3: Reward the controller in proportion to the Kullback-Leibler distance between the predictor's subjective probability distributions before and after an observation - the relative entropy between its prior and posterior [1995; ref 6].
Variant 4 (zero-sum intrinsic reward games): Two reward-maximizing modules bet on outcomes of potentially surprising experiments they have agreed upon [1997-2002; refs 8, 11, 12].
Variant 5 (progress in data compression): Store entire life, keep trying to compress it, reward controller for actions that yield data causing compressor improvements [1990s - 2008; e.g., refs 14-17].
Both art and science are by-products of the desire to create / discover more data that is compressible in hitherto unknown ways! [Refs 14-21, 35]
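To make Variants 2-4 above concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the implementation of the cited papers; `predictor`, `left`, `right`, and `run_experiment` are hypothetical placeholders.

```python
import numpy as np

def variant2_reward(predictor, observation, target):
    """Variant 2 (sketch): intrinsic reward = the predictor's improvement,
    measured here as the drop in prediction error after one learning step.
    `predictor` is a hypothetical adaptive world model."""
    error_before = predictor.error(observation, target)
    predictor.learn(observation, target)            # adapt the world model
    error_after = predictor.error(observation, target)
    return max(0.0, error_before - error_after)     # learning progress

def variant3_reward(prior, posterior, eps=1e-12):
    """Variant 3 (sketch): intrinsic reward = Kullback-Leibler divergence
    between the predictor's subjective beliefs after and before the
    observation (the relative entropy between posterior and prior)."""
    p = np.asarray(prior, dtype=float) + eps
    q = np.asarray(posterior, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(q * np.log(q / p)))

def variant4_bet(left, right, experiment, run_experiment):
    """Variant 4 (sketch): two reward-maximizing modules bet on the outcome
    of an experiment they agreed upon; in this zero-sum game the module whose
    prediction turns out correct wins intrinsic reward from the other."""
    p_left, p_right = left.predict(experiment), right.predict(experiment)
    if p_left == p_right:
        return 0.0, 0.0                             # no disagreement, no bet
    outcome = run_experiment(experiment)            # run the agreed experiment
    if outcome == p_left:
        return +1.0, -1.0                           # left surprises right
    if outcome == p_right:
        return -1.0, +1.0                           # right surprises left
    return 0.0, 0.0                                 # neither was correct
```

In each case the returned value would simply be added to whatever external reward the reinforcement learning controller is maximizing.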
Greedy but practical Variant 6 (PowerPlay): Incrementally build a more and more general problem solver as follows.
Systematically generate pairs of new (possibly self-invented) tasks and modifications of the current problem solver (where subjectively simple, low-complexity pairs come first), until a more powerful problem solver is found that provably solves all previously learned tasks plus the new one, while the unmodified predecessor does not. New skills may (partially) re-use previously learned skills, that is, tasks and solver modifications that used to be subjectively complex may become subjectively simple. Wow-effects are achieved by continually making previously learned skills more computationally efficient such that they require less time and storage space
[2011-; e.g., refs 29-31].
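The following Python sketch merely mirrors the verbal description above; `enumerate_task_solver_pairs`, `solves`, and `solves_all` are hypothetical placeholders, not the actual PowerPlay implementation of refs [29-31].

```python
def powerplay(initial_solver, search_budget=10**6):
    """Sketch of PowerPlay (Variant 6): repeatedly search, in order of
    increasing subjective complexity, for a pair (new task, modified solver)
    such that the modified solver solves the new task and the whole repertoire
    of previously learned tasks, while the current solver fails on the new task."""
    solver, repertoire = initial_solver, []
    while True:
        # Hypothetical generator: candidate (task, solver modification) pairs,
        # subjectively simplest / cheapest to find first.
        for task, candidate in enumerate_task_solver_pairs(solver, search_budget):
            if solves(solver, task):
                continue                      # not novel: already solvable
            if solves(candidate, task) and solves_all(candidate, repertoire):
                repertoire.append(task)       # remember the self-invented task
                solver = candidate            # accept the more general solver
                break                         # restart the search from scratch
        else:
            return solver, repertoire         # budget exhausted without progress
```

Ordering candidate pairs by increasing search cost is what makes this greedy variant practical: subjectively simple, low-complexity pairs are tried first, and skills acquired earlier can make formerly complex pairs cheap to find later.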
Related links:
1. Formal Theory of Creativity explains science, art, music, humor
2. Reinforcement learning
3. Recurrent network predictors
4. Learning attentive vision
5. Reinforcement learning economies
6. Learning to learn
7. Learning robots
8. Self-modeling robots
9. Hierarchical learning & subgoal generation
10. Beauty
11. Low-Complexity Art
12. Femmes Fractales
13. CoTeSys group
14. Full publication list
Recent videos / invited talks on Creativity, Curiosity, Beauty, Novel Patterns, True Surprise & Novelty, Art & Science & Humor:
13 June 2012: JS featured in Through the Wormhole with Morgan Freeman on the Science Channel. See Teaser Video and more.
20 Jan 2012: TEDx Talk (uploaded 10 March) at TEDx Lausanne: When creative machines overtake man (12:47).
15 Jan 2011: Winter Intelligence Conference, Oxford (on universal AI and theory of fun). See video at Vimeo of Sept 2011. Also available at youtube.
22 Sep 2010: Banquet Talk in Palau de la Musica Catalana for Joint Conferences ECML / PKDD 2010, Barcelona: Formal Theory of Fun & Creativity. 4th slide. All slides and a video of this talk (Dec 14) at videolectures.net
12 Nov 2009: Keynote in Cinema Corso (Lugano) for Multiple Ways to Design Research 09: Art & Science
3 Oct 2009: Invited talk for Singularity Summit, New York City. See original video (40 min). Or save time by watching the condensed but jagged video (20 min), also available at the ShanghAI Lectures. Save even more time by watching the short video (10 min, also at the bottom of this page).
25 Aug 2009: Dirac summer school, Leuven, Belgium
12 Jul 2009: Dagstuhl Castle Seminar on Computational Creativity
3 Sep 2008: Keynote for Knowledge-Based and Intelligent Information & Engineering Systems KES 2008, Zagreb
2 Oct 2007: Joint invited lecture for Algorithmic Learning Theory (ALT 2007) and Discovery Science (DS 2007), Sendai, Japan (the only joint invited lecture). Preprint
23 Aug 2007: Keynote for A*STAR Meeting on Expectation & Surprise, Singapore
12 July 2007: Keynote for Art Meets Science 2007: "Randomness vs simplicity & beauty in physics and the fine arts"
35. J. Schmidhuber. A Formal Theory of Creativity to Model the Creation of Art. In J. McCormack (ed.), Computational Creativity. MIT Press, 2012. PDF of older preprint.
34. L. Pape, C. M. Oddo, M. Controzzi, C. Cipriani, A. Foerster, M. C. Carrozza, J. Schmidhuber. Learning tactile skills through curious exploration. Frontiers in Neurorobotics 6:6, 2012, doi: 10.3389/fnbot.2012.00006
33. H. Ngo, M. Luciw, A. Foerster, J. Schmidhuber. Learning Skills from Play: Artificial Curiosity on a Katana Robot Arm. Proc. IJCNN 2012. PDF. Video.
32. V. R. Kompella, M. Luciw, M. Stollenga, L. Pape, J. Schmidhuber. Autonomous Learning of Abstractions using Curiosity-Driven Modular Incremental Slow Feature Analysis. Proc. IEEE Conference on Development and Learning / EpiRob 2012 (ICDL-EpiRob'12), San Diego, 2012, in press.
31. R. K. Srivastava, B. Steunebrink, J. Schmidhuber. First Experiments with PowerPlay. Neural Networks, 2013. ArXiv preprint (2012): arXiv:1210.8385 [cs.AI].
30. R. K. Srivastava, B. R. Steunebrink, M. Stollenga, J. Schmidhuber. Continually Adding Self-Invented Problems to the Repertoire: First Experiments with POWERPLAY. Proc. IEEE Conference on Development and Learning / EpiRob 2012 (ICDL-EpiRob'12), San Diego, 2012. PDF.
29. J. Schmidhuber. POWERPLAY: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem. Frontiers in Cognitive Science, 2013. ArXiv preprint (2011): arXiv:1112.5309 [cs.AI]
28. Yi Sun, F. Gomez, J. Schmidhuber. Planning to Be Surprised: Optimal Bayesian Exploration in Dynamic Environments. In Proc. Fourth Conference on Artificial General Intelligence (AGI-11), Google, Mountain View, California, 2011. PDF.
27. V. Graziano, T. Glasmachers, T. Schaul, L. Pape, G. Cuccu, J. Leitner, J. Schmidhuber. Artificial Curiosity for Autonomous Space Exploration. Acta Futura 4:41-51, 2011 (DOI: 10.2420/AF04.2011.41). PDF.
26. G. Cuccu, M. Luciw, J. Schmidhuber, F. Gomez. Intrinsically Motivated Evolutionary Search for Vision-Based Reinforcement Learning. In Proc. Joint IEEE International Conference on Development and Learning (ICDL) and on Epigenetic Robotics (ICDL-EpiRob 2011), Frankfurt, 2011. PDF.
25. M. Luciw, V. Graziano, M. Ring, J. Schmidhuber. Artificial Curiosity with Planning for Autonomous Visual and Perceptual Development. In Proc. Joint IEEE International Conference on Development and Learning (ICDL) and on Epigenetic Robotics (ICDL-EpiRob 2011), Frankfurt, 2011. PDF.
24. T. Schaul, L. Pape, T. Glasmachers, V. Graziano, J. Schmidhuber. Coherence Progress: A Measure of Interestingness Based on Fixed Compressors. In Proc. Fourth Conference on Artificial General Intelligence (AGI-11), Google, Mountain View, California, 2011. PDF.
23. T. Schaul, Yi Sun, D. Wierstra, F. Gomez, J. Schmidhuber. Curiosity-Driven Optimization. IEEE Congress on Evolutionary Computation (CEC-2011), 2011. PDF.
22. H. Ngo, M. Ring, J. Schmidhuber. Curiosity Drive based on Compression Progress for Learning Environment Regularities. In Proc. Joint IEEE International Conference on Development and Learning (ICDL) and on Epigenetic Robotics (ICDL-EpiRob 2011), Frankfurt, 2011.
21. J. Schmidhuber. Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990-2010). IEEE Transactions on Autonomous Mental Development, 2(3):230-247, 2010. IEEE link. PDF of draft.
20. J. Schmidhuber. Artificial Scientists & Artists Based on the Formal Theory of Creativity. In Proceedings of the Third Conference on Artificial General Intelligence (AGI-2010), Lugano, Switzerland. PDF.
19. J. Schmidhuber. Art & science as by-products of the search for novel patterns, or data compressible in unknown yet learnable ways. In M. Botta (ed.), Multiple ways to design research. Research cases that reshape the design discipline, Milano-Lugano, Swiss Design Network - Et al. Edizioni, 2009, pp. 98-112. (Keynote talk.) PDF of preprint.
18. J. Schmidhuber. Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes. Based on keynote talk for KES 2008 (below) and joint invited lecture for ALT 2007 / DS 2007 (below). Short version: ref 17 below. Long version in G. Pezzulo, M. V. Butz, O. Sigaud, G. Baldassarre, eds.: Anticipatory Behavior in Adaptive Learning Systems, from Sensorimotor to Higher-level Cognitive Capabilities, Springer, LNAI, 2009. Preprint (2008, revised 2009): arXiv:0812.4360. PDF (Dec 2008). PDF (April 2009).
17. J. Schmidhuber. Simple Algorithmic Theory of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes. Journal of SICE, 48(1):21-32, 2009. PDF.
16. J. Schmidhuber. Driven by Compression Progress. In Proc. Knowledge-Based Intelligent Information and Engineering Systems KES-2008, Lecture Notes in Computer Science LNCS 5177, p. 11, Springer, 2008. (Abstract of invited keynote talk.) PDF.
15. J. Schmidhuber. Simple Algorithmic Principles of Discovery, Subjective Beauty, Selective Attention, Curiosity & Creativity. In V. Corruble, M. Takeda, E. Suzuki, eds., Proc. 10th Intl. Conf. on Discovery Science (DS 2007), p. 26-38, LNAI 4755, Springer, 2007. Also in M. Hutter, R. A. Servedio, E. Takimoto, eds., Proc. 18th Intl. Conf. on Algorithmic Learning Theory (ALT 2007), p. 32, LNAI 4754, Springer, 2007. (Joint invited lecture for DS 2007 and ALT 2007, Sendai, Japan, 2007.) Preprint: arxiv:0709.0674. PDF. Curiosity as the drive to improve the compression of the lifelong sensory input stream: interestingness as the first derivative of subjective "beauty" or compressibility.
14. J. Schmidhuber. Developmental Robotics, Optimal Artificial Curiosity, Creativity, Music, and the Fine Arts. Connection Science, 18(2):173-187, June 2006. PDF. On mathematically optimal universal artificial curiosity, based on theoretically best possible ways of maximizing learning progress in embedded agents or robots with an intrinsic motivation to learn skills that lead to a better understanding of the world and what can be done in it. It is also pointed out how music and the arts can be formally understood as a consequence of the principle of artificial curiosity and creativity.
13. J. Schmidhuber. Self-Motivated Development Through Rewards for Predictor Errors / Improvements. Developmental Robotics 2005 AAAI Spring Symposium, March 21-23, 2005, Stanford University, CA. PDF.
12. J. Schmidhuber. Exploring the Predictable. In A. Ghosh, S. Tsutsui, eds., Advances in Evolutionary Computing, p. 579-612, Springer, 2002. PDF. HTML. One of the key publications - see more details under refs [8, 11, 1997-].
11. J. Schmidhuber. Artificial Curiosity Based on Discovering Novel Algorithmic Predictability Through Coevolution. In P. Angeline, Z. Michalewicz, M. Schoenauer, X. Yao, Z. Zalzala, eds., Congress on Evolutionary Computation, p. 1612-1618, IEEE Press, Piscataway, NJ, 1999.
11a. J. Schmidhuber. What's interesting? In Abstract Collection of SNOWBIRD: Machines That Learn. Utah, April 1998.
10. M. Wiering and J. Schmidhuber. Efficient model-based exploration. In R. Pfeifer, B. Blumberg, J. Meyer, S. W. Wilson, eds., From Animals to Animats 5: Proceedings of the Fifth International Conference on Simulation of Adaptive Behavior, p. 223-228, MIT Press, 1998.
9. M. Wiering and J. Schmidhuber. Learning exploration policies with models. In Proc. CONALD, 1998.
8. J. Schmidhuber. What's interesting? Technical Report IDSIA-35-97, IDSIA, July 1997 (23 pages, 10 figures, 157 K, 834 K gunzipped). Here we focus on the automatic creation of predictable internal abstractions of complex spatio-temporal events: two competing, intrinsically motivated agents agree on essentially arbitrary algorithmic experiments and bet on their possibly surprising (not yet predictable) outcomes in zero-sum games, each agent potentially profiting from outwitting / surprising the other by inventing experimental protocols where both modules disagree on the predicted outcome. The focus is on exploring the space of general algorithms (as opposed to traditional simple mappings from inputs to outputs); the general system [12] focuses on the interesting things by losing interest in both predictable and unpredictable aspects of the world. Unlike the previous systems with intrinsic motivation (1990, 1991, 1995; see below), this system also takes into account the computational cost of learning new skills, learning when to learn and what to learn. See also refs [11, 12, 1998-2002].
7. J. Schmidhuber, J. Zhao, N. Schraudolph. Reinforcement learning with self-modifying policies. In S. Thrun and L. Pratt, eds., Learning to learn, Kluwer, pages 293-309, 1997. PDF; HTML.
6. J. Storck, S. Hochreiter, and J. Schmidhuber. Reinforcement-driven information acquisition in non-deterministic environments. In Proc. ICANN'95, vol. 2, pages 159-164. EC2 & CIE, Paris, 1995. PDF. HTML. In this paper the curiosity reward is again proportional to the predictor's surprise / information gain, this time measured as the Kullback-Leibler distance between the learning predictor's subjective probability distributions before and after new observations - the relative entropy between its prior and posterior. (In 2005, Itti & Baldi called this "Bayesian surprise" and demonstrated experimentally that it explains certain patterns of human visual attention better than certain previous approaches.) Note the differences to "Active Learning": the latter typically focuses on choosing which data points to evaluate next in order to maximize information gain (i.e., one-step look-ahead), assuming all data point evaluations are equally costly. The 1995 system, however, is more general and takes into account: (1) arbitrary delays between an agent's experimental actions and the corresponding information gains, and (2) the highly environment-dependent costs of obtaining or creating not just individual data points but entire data sequences.
5. J. Schmidhuber. On learning how to learn learning strategies. Technical Report FKI-198-94, Fakultät für Informatik, Technische Universität München, November 1994. PDF.
4. J. Schmidhuber. Curious model-building control systems. In Proc. International Joint Conference on Neural Networks, Singapore, volume 2, pages 1458-1463. IEEE, 1991. PDF. HTML. The second peer-reviewed English-language publication on artificial curious agents with intrinsic motivation. The system uses reinforcement learning to create behaviors that lead to parts of the environment where previous experience indicates that the prediction error can be improved (not necessarily where it is high). So the agent is neither attracted by unpredictable randomness nor by totally predictable aspects of the world; instead it likes to go where it learnt to expect additional learning progress. (Quite a few later publications on developmental robotics and intrinsic reward took up this basic idea, e.g., Oudeyer & Kaplan (2007), whose work is restricted to one-step look-ahead though, and doesn't allow for delayed intrinsic rewards like the 1991 paper above.)
3. J. Schmidhuber. Adaptive confidence and adaptive curiosity. Technical Report FKI-149-91, Inst. f. Informatik, Tech. Univ. Munich, April 1991. PDF.
2. J. Schmidhuber. A possibility for implementing curiosity and boredom in model-building neural controllers. In J. A. Meyer and S. W. Wilson, editors, Proc. of the International Conference on Simulation of Adaptive Behavior: From Animals to Animats, pages 222-227. MIT Press/Bradford Books, 1991. PDF. HTML. The first peer-reviewed English-language publication on artificial curious agents with intrinsic motivation. The system uses reinforcement learning to create behaviors that lead the agent to parts of the environment where the separate predictor's prediction error is expected to be high, assuming one can learn something there. Quite a few later publications on developmental robotics and/or intrinsic reward took up this basic idea, e.g., Singh & Barto & Chentanez (2005).
1. J. Schmidhuber. Making the world differentiable: On using fully recurrent self-supervised neural networks for dynamic reinforcement learning and planning in non-stationary environments. Technical Report FKI-126-90, TUM, Feb 1990, revised Nov 1990. PDF. The first paper on planning with reinforcement learning recurrent neural networks (NNs) (more) and on generative adversarial networks where a generator NN is fighting a predictor NN in a minimax game (more).
1a. J. Schmidhuber. Dynamische neuronale Netze und das fundamentale raumzeitliche Lernproblem (Dynamic neural nets and the fundamental spatio-temporal credit assignment problem). Dissertation, Institut fuer Informatik, Technische Universitaet Muenchen, 1990. PDF . HTML.
Differences to Shannon / Boltzmann's notion of surprise. Since the early 1990s, the papers above have repeatedly pointed out an essential difference between our theory of surprise & novelty and Shannon's traditional information theory based on Boltzmann's entropy notion. Consider two extreme examples of uninteresting, unsurprising, boring data. A vision-based agent that always stays in the dark will experience an extremely compressible, soon totally predictable and unsurprising history of unchanging visual inputs. In front of a screen full of white noise conveying a lot of information and "novelty" and "surprise" in the traditional sense of Boltzmann (1800s) and Shannon (1948), however, it will experience highly unpredictable and fundamentally uncompressible data. In both cases the data gets boring quickly as it does not allow for learning new things or for further compression progress. Neither the arbitrary nor the fully predictable is truly novel or surprising or interesting - only data with still unknown but learnable statistical or algorithmic regularities are! That's why our theory of surprise and curiosity and creativity takes the time-varying state of the subjective, learning observer into account.
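A tiny illustration of this difference, using Python's standard zlib as a crude stand-in for the subjective adaptive compressor (the chunking, the compressor, and all names below are assumptions for illustration only): for the "dark room" stream the per-chunk code length collapses to almost nothing right away, while for white noise it stays near the raw chunk size, so in both cases compression progress, and hence intrinsic reward, is essentially zero after the very first observations.

```python
import os
import zlib

def conditional_bytes(history, chunk, level=9):
    """Bytes zlib needs to encode `chunk` given `history` (a crude stand-in
    for the agent's adaptive compressor applied to the lifelong history)."""
    return len(zlib.compress(history + chunk, level)) - len(zlib.compress(history, level))

def code_lengths(stream, chunk_size=64):
    """Conditional code length of each successive chunk of the stream."""
    history, lengths = b"", []
    for i in range(0, len(stream), chunk_size):
        chunk = stream[i:i + chunk_size]
        lengths.append(conditional_bytes(history, chunk))
        history += chunk
    return lengths

dark = bytes(64) * 50        # agent staying in the dark: unchanging, trivially predictable inputs
noise = os.urandom(64 * 50)  # white noise: maximal Shannon entropy, yet nothing learnable

for name, stream in (("dark", dark), ("noise", noise)):
    lengths = code_lengths(stream)
    # Intrinsic reward would be the decrease of these code lengths over time;
    # for both streams that decrease is (close to) zero almost immediately.
    print(name, "first chunks:", lengths[:3], "last chunks:", lengths[-3:])
```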
Check out related papers on adaptive visual attention with foveas (overview page):
J. Schmidhuber and R. Huber. Learning to generate artificial fovea trajectories for target detection. International Journal of Neural Systems, 2(1 & 2):135-141, 1991. Figures in overview page. PDF. HTML.
J. Schmidhuber and R. Huber. Using sequential adaptive neuro-control for efficient learning of rotation and translation invariance. In T. Kohonen, K. Mäkisara, O. Simula, and J. Kangas, editors, Artificial Neural Networks, pages 315-320. Elsevier Science Publishers B.V., North-Holland, 1991.
See also Peter Redgrave's comment in Nature (473, 450, 26 May 2011): Neuroscience: What makes us laugh.