Intrinsic motivation (artificial intelligence)

Intrinsic motivation in the study of artificial intelligence and robotics is a mechanism for enabling artificial agents (including robots) to exhibit inherently rewarding behaviours such as exploration and curiosity, grouped under the same term in the study of psychology. Psychologists consider intrinsic motivation in humans to be the drive to perform an activity for inherent satisfaction – just for the fun or challenge of it.[1]

Definition

An intelligent agent is intrinsically motivated to act if the information content of the experience resulting from the action is, by itself, the motivating factor.

Information content in this context is measured in the information-theoretic sense of quantifying uncertainty. A typical intrinsic motivation is to search for unusual, surprising situations (exploration), in contrast to a typical extrinsic motivation such as the search for food (homeostasis).[2] Extrinsic motivations are typically described in artificial intelligence as task-dependent or goal-directed.
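As a minimal illustration of information content in this sense, the sketch below computes the surprisal of a single outcome and the Shannon entropy of a discrete distribution, both in bits. The function names `surprisal` and `entropy` are chosen here for illustration and do not come from the cited sources.

```python
import math


def surprisal(p: float) -> float:
    """Information content, in bits, of an outcome that has probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    return -math.log2(p)


def entropy(dist: dict) -> float:
    """Shannon entropy, in bits, of a discrete distribution {outcome: probability}."""
    # Outcomes with zero probability contribute nothing to the sum.
    return sum(p * surprisal(p) for p in dist.values() if p > 0.0)
```

Under this measure a rare outcome carries more information than a common one, which is one way to read the idea that surprising situations are attractive to an intrinsically motivated agent.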

Origins in psychology

The study of intrinsic motivation in psychology and neuroscience began in the 1950s, with some psychologists explaining exploration through drives to manipulate and explore; however, this homeostatic view was criticised by White.[3] An alternative explanation from Berlyne in 1960 was the pursuit of an optimal balance between novelty and familiarity.[4] Festinger described the difference between internal and external views of the world as dissonance that organisms are motivated to reduce.[5] A similar view was expressed in the 1970s by Kagan as the desire to reduce the incompatibility between cognitive structure and experience.[6] In contrast to the idea of optimal incongruity, Deci and Ryan identified in the mid-1980s an intrinsic motivation based on competence and self-determination.[7]

Computational models

An influential early computational approach to implementing artificial curiosity, proposed by Schmidhuber in the early 1990s, has since been developed into a "Formal theory of creativity, fun, and intrinsic motivation".[8]

Intrinsic motivation is often studied in the framework of computational reinforcement learning[9][10] (introduced by Sutton and Barto), where the rewards that drive agent behaviour are intrinsically derived rather than externally imposed and must be learnt from the environment.[11] Reinforcement learning is agnostic to how the reward is generated: an agent will learn a policy (action strategy) from the distribution of rewards afforded by actions and the environment. Each approach to intrinsic motivation in this scheme is essentially a different way of generating the reward function for the agent.
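The separation between the learning rule and the source of the reward can be sketched as follows. This is an illustrative example under stated assumptions, not the method of any cited paper: the intrinsic reward is taken to be the surprisal of an observed transition under a count-based model with add-one smoothing, and the learner is ordinary tabular Q-learning. The names `TransitionModel`, `intrinsic_reward` and `q_update` are hypothetical.

```python
import math
from collections import defaultdict


class TransitionModel:
    """Count-based estimate of P(next_state | state, action) with add-one smoothing."""

    def __init__(self, n_states: int):
        self.n_states = n_states
        self.counts = defaultdict(lambda: defaultdict(int))

    def prob(self, s, a, s_next) -> float:
        row = self.counts[(s, a)]
        total = sum(row.values())
        return (row[s_next] + 1) / (total + self.n_states)

    def update(self, s, a, s_next) -> None:
        self.counts[(s, a)][s_next] += 1


def intrinsic_reward(model: TransitionModel, s, a, s_next) -> float:
    """Surprisal (bits) of the observed transition; the model is updated afterwards."""
    r = -math.log2(model.prob(s, a, s_next))
    model.update(s, a, s_next)
    return r


def q_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9) -> None:
    """Standard tabular Q-learning step; it does not depend on how r was produced."""
    best_next = max(q[(s_next, b)] for b in actions)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
```

A possible usage, with `q = defaultdict(float)`, is to call `q_update(q, s, a, intrinsic_reward(model, s, a, s_next), s_next, actions)` after each step. Swapping in a different intrinsic reward changes only the function that produces `r`, while the update rule stays the same.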

Curiosity vs. exploration

Intrinsically motivated artificial agents exhibit behaviour that resembles curiosity or exploration. Exploration in artificial intelligence and robotics has been extensively studied in reinforcement learning models,[12] usually by encouraging the agent to explore as much of the environment as possible, to reduce uncertainty about the dynamics of the environment (learning the transition function) and how best to achieve its goals (learning the reward function). Intrinsic motivation, in contrast, encourages the agent to first explore aspects of the environment that confer more information, to seek out novelty. Recent work unifying state visit count exploration and intrinsic motivation has shown faster learning in a video game setting.[13]
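A simple tabular form of a visit-count exploration bonus is sketched below. It is a simplification for illustration only: it assumes states are hashable and few enough to count directly, and the class name `CountBonus` and the scale parameter `beta` are hypothetical. The bonus shrinks with the inverse square root of the visit count, so rarely visited states are rewarded more.

```python
import math
from collections import Counter


class CountBonus:
    """Exploration bonus that decays as a state is visited more often."""

    def __init__(self, beta: float = 1.0):
        self.beta = beta          # scale of the bonus (illustrative parameter)
        self.visits = Counter()   # visit count per state

    def __call__(self, state) -> float:
        # Record the visit first, so the first visit yields a bonus of beta.
        self.visits[state] += 1
        return self.beta / math.sqrt(self.visits[state])
```

The returned value would typically be added to whatever reward the agent already receives. In large or continuous state spaces exact counts are not available, which is why an approach of this kind needs some substitute for the raw count.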

Types of models

Oudeyer and Kaplan have made a substantial contribution to the study of intrinsic motivation.[14][2][15] They define intrinsic motivation based on Berlyne's theory,[4] and divide approaches to the implementation of intrinsic motivation into three categories that broadly follow the roots in psychology: "knowledge-based models", "competence-based models" and "morphological models".[2] Knowledge-based models are further subdivided into "information-theoretic" and "predictive".[15] Baldassarre and Mirolli present a similar typology, dividing knowledge-based models into prediction-based and novelty-based.[16]

Information-theoretic intrinsic motivation

The quantification of prediction and novelty to drive behaviour is generally enabled through the application of information-theoretic models, where agent state and strategy (policy) over time are represented by probability distributions describing a Markov decision process, and the cycle of perception and action is treated as an information channel.[17][18] These approaches claim biological feasibility as part of a family of Bayesian approaches to brain function. The main criticism of, and difficulty with, these models is the intractability of computing probability distributions over large discrete or continuous state spaces.[2] Nonetheless, a considerable body of work has built up modelling the flow of information around the sensorimotor cycle, leading to de facto reward functions derived from the reduction of uncertainty, including most notably active inference,[19] but also infotaxis,[20] predictive information[21][22] and empowerment.[23]
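Treating the perception-action cycle as an information channel can be made concrete with a small sketch of empowerment, understood here as the channel capacity from an agent's action sequences to its resulting state. The sketch assumes fully deterministic dynamics, in which case that capacity reduces to the base-2 logarithm of the number of distinct states reachable within a fixed horizon; stochastic dynamics would need a capacity computation instead. The function `empowerment_deterministic`, the `step` callback and the corridor world are hypothetical illustrations.

```python
import math
from itertools import product


def empowerment_deterministic(state, actions, step, horizon: int) -> float:
    """n-step empowerment in bits, assuming step(state, action) is deterministic."""
    reachable = set()
    # Enumerate every action sequence of the given length.
    for seq in product(actions, repeat=horizon):
        s = state
        for a in seq:
            s = step(s, a)
        reachable.add(s)
    return math.log2(len(reachable))


def corridor_step(s: int, a: int) -> int:
    """Toy world: a corridor of 5 cells (0..4); moves are clamped at the walls."""
    return min(4, max(0, s + a))
```

In this toy corridor with actions `(-1, 0, 1)`, a cell next to a wall can reach fewer distinct cells in one step than a cell in the middle, so its empowerment is lower. The enumeration grows exponentially with the horizon, which reflects the tractability problem described above.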

Competence-based models

Steels' autotelic principle[24] is an attempt to formalise the psychological concept of flow.[25]

Intrinsically motivated learning

Intrinsically motivated (or curiosity-driven) learning is an emerging research topic in artificial intelligence and developmental robotics[26] that aims to develop agents that can learn general skills or behaviours, which can then be deployed to improve performance in extrinsic tasks such as acquiring resources.[27] Intrinsically motivated learning has been studied as an approach to autonomous lifelong learning in machines.[28][29] Despite the impressive success of deep learning in specific domains (e.g. AlphaGo), many in the field (e.g. Gary Marcus) have pointed out that the ability to generalise remains a fundamental challenge in artificial intelligence. Intrinsically motivated learning, although promising in its ability to generate goals from the structure of the environment without externally imposed tasks, faces the same challenge of generalisation: how to reuse policies or action sequences, how to compress and represent continuous or complex state spaces, and how to retain and reuse the salient features that have been learnt.[27]

References

  1. Ryan, Richard M; Deci, Edward L (2000). "Intrinsic and extrinsic motivations: Classic definitions and new directions". Contemporary Educational Psychology. 25 (1): 54–67. doi:10.1006/ceps.1999.1020. PMID 10620381.
  2. Oudeyer, Pierre-Yves; Kaplan, Frederic (2008). "How can we define intrinsic motivation?". Proc. of the 8th Conf. on Epigenetic Robotics. 5. pp. 29–31.
  3. White, R. (1959). "Motivation reconsidered: The concept of competence". Psychological Review. 66 (5): 297–333. doi:10.1037/h0040934. PMID 13844397.
  4. Berlyne, D.: Conflict, Arousal and Curiosity. McGraw-Hill, New York (1960)
  5. Festinger, L.: A theory of cognitive dissonance. Evanston, Row, Peterson (1957)
  6. Kagan, J.: Motives and development. Journal of Personality and Social Psychology 22, 51–66
  7. Deci, E.L., Ryan, R.M.: Intrinsic motivation and self-determination in human behavior. Plenum, New York (1985)
  8. Schmidhuber, J (2010). "Formal theory of creativity, fun, and intrinsic motivation (1990-2010)". IEEE Trans. Auton. Mental Dev. 2 (3): 230–247. doi:10.1109/TAMD.2010.2056368.
  9. Barto, A., Singh, S., Chentanez, N.: Intrinsically motivated learning of hierarchical collections of skills. In: ICDL 2004. Proceedings of the 3rd International Conference on Development and Learning, Salk Institute, San Diego (2004)
  10. Singh, S., Barto, A. G., and Chentanez, N. (2005). Intrinsically motivated reinforcement learning. In Proceedings of the 18th Annual Conference on Neural Information Processing Systems (NIPS), Vancouver, B.C., Canada.
  11. Barto, A.G.: Intrinsic motivation and reinforcement learning. In: Baldassarre, G., Mirolli, M. (eds.) Intrinsically Motivated Learning in Natural and Artificial Systems. Springer, Berlin (2012)
  12. Thrun, S. B. (1992). Efficient Exploration in Reinforcement Learning. https://doi.org/10.1007/978-1-4899-7687-1_244
  13. Bellemare, M. G., Srinivasan, S., Ostrovski, G., Schaul, T., Saxton, D., & Munos, R. (2016). Unifying count-based exploration and intrinsic motivation. Advances in Neural Information Processing Systems, 1479–1487.
  14. Kaplan, F. and Oudeyer, P. (2004). Maximizing learning progress: an internal reward system for development. Embodied artificial intelligence, pages 629–629.
  15. Oudeyer, P. Y., & Kaplan, F. (2009). What is intrinsic motivation? A typology of computational approaches. Frontiers in Neurorobotics, 3(NOV). https://doi.org/10.3389/neuro.12.006.2007
  16. Baldassarre, Gianluca; Mirolli, Marco (2013). "Intrinsically Motivated Learning Systems: An Overview". Intrinsically Motivated Learning in Natural and Artificial Systems. Rome, Italy: Springer. pp. 1–14.
  17. Klyubin, A., Polani, D., and Nehaniv, C. (2008). Keep your options open: an information-based driving principle for sensorimotor systems. PloS ONE, 3(12):e4018. https://dx.doi.org/10.1371%2Fjournal.pone.0004018
  18. Biehl, Martin; Guckelsberger, Christian; Salge, Christoph; Smith, Simón C.; Polani, Daniel (2018). "Expanding the Active Inference Landscape: More Intrinsic Motivations in the Perception-Action Loop". Frontiers in Neurorobotics. 12: 45. arXiv:1806.08083. doi:10.3389/fnbot.2018.00045. ISSN 1662-5218. PMC 6125413. PMID 30214404.
  19. Friston, Karl; Kilner, James; Harrison, Lee (2006). "A free energy principle for the brain" (PDF). Journal of Physiology-Paris. Elsevier BV. 100 (1–3): 70–87. doi:10.1016/j.jphysparis.2006.10.001. ISSN 0928-4257. PMID 17097864.
  20. Vergassola, M., Villermaux, E., & Shraiman, B. I. (2007). ‘Infotaxis’ as a strategy for searching without gradients. Nature, 445(7126), 406–409. https://doi.org/10.1038/nature05464
  21. Ay, N., Bertschinger, N., Der, R., Güttler, F. and Olbrich, E. (2008), ‘Predictive information and explorative behavior of autonomous robots’, The European Physical Journal B 63(3), 329–339.
  22. Martius, G., Der, R., and Ay, N. (2013). Information driven self-organization of complex robotic behaviors. PLoS ONE 8:e63400. doi: 10.1371/journal.pone.0063400
  23. Salge, C; Glackin, C; Polani, D (2014). "Empowerment -- An Introduction". In Prokopenko, M (ed.). Guided Self-Organization: Inception. Emergence, Complexity and Computation. 9. Springer. pp. 67–114. arXiv:1310.1863. doi:10.1007/978-3-642-53734-9_4. ISBN 978-3-642-53733-2.
  24. Steels, Luc: The autotelic principle. In: Iida, F., Pfeifer, R., Steels, L., Kuniyoshi, Y. (eds.) Embodied Artificial Intelligence. LNCS (LNAI), vol. 3139, pp. 231–242. Springer, Heidelberg (2004)
  25. Csikszentmihalyi, M. (2000). Beyond boredom and anxiety. Jossey-Bass.
  26. Lungarella, M., Metta, G., Pfeifer, R., and Sandini, G. (2003). Developmental robotics: a survey. Connect. Sci. 15, 151–190. doi: 10.1080/09540090310001655110
  27. Santucci, V. G., Oudeyer, P. Y., Barto, A., & Baldassarre, G. (2020). Editorial: Intrinsically motivated open-ended learning in autonomous robots. Frontiers in Neurorobotics, 13(January), 2019–2021. https://doi.org/10.3389/fnbot.2019.00115
  28. Barto, A. G. (2013). “Intrinsic motivation and reinforcement learning,” in Intrinsically Motivated Learning in Natural and Artificial Systems (Berlin; Heidelberg: Springer), 17–47
  29. Mirolli, M., and Baldassarre, G. (2013). “Functions and mechanisms of intrinsic motivations,” in Intrinsically Motivated Learning in Natural and Artificial Systems, eds G. Baldassarre and M. Mirolli (Berlin; Heidelberg: Springer), 49–72
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.