Internship

Structuring the action space of a naive agent in sensorimotor interaction


The AMAC team (Architectures and Models for Adaptation and Cognition) at the Institut des Systèmes Intelligents et de Robotique (ISIR) has been working for several years on the sensorimotor approach to perception. This approach is fundamentally opposed to the modeling approach classically used in robotics, in which the engineer models the system through a set of laws and parameters drawn directly from physics. The resulting model then depends heavily on how relevant these parameters are and how well they match physical reality. The sensorimotor approach, on the other hand, considers the system from an intrinsic point of view, where every structure is induced by the dependencies between its inputs (exteroceptive sensations) and its outputs (motor commands). The aim of the AMAC team is to apply this approach, already partially stated or theorized by neuroscientists, psychologists and mathematicians, to the concrete framework of developmental robotics.

The aim of the work carried out at ISIR on this theme has been to formalize, mathematically and algorithmically, the processes that enable a totally naive agent (i.e. one with no a priori knowledge of its kinematic structure or environment) to extract information from its interaction with the physical space in which it is immersed. Intuitively, this is made possible by the coupling between its motor actions and the information obtained by its sensors. The team has already obtained convincing theoretical results on these subjects, demonstrating mathematically that a naive robot is capable of constructing an image of its body [Marcel2017] or of its workspace [Marcel2021] without any model, and of structuring or interpreting the sensory consequences of its own motor actions even though they are initially unknown [Goasguen2022].


The aim of this internship is to continue the developments proposed in [Goasguen2022], whose scope is still limited by the assumptions made. That contribution shows how elementary sensory prediction capabilities can be used to structure the agent's actions. However, the prediction functions it exploits only take the form of permutation functions within a simple visual sensor: by applying an action, the agent only causes a "sliding" of visual information within the camera observing its environment. While this hypothesis provides formal evidence for action structures, it remains largely unrealistic and limits the analysis to visual data.
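As an illustration of the permutation hypothesis described above, the following minimal sketch represents each action as a permutation of the pixels of a tiny one-dimensional sensor and checks which single action, if any, is equivalent to a sequence of two actions. It is not the method of [Goasguen2022]: the sensor size, the wrap-around (toroidal) shift, the action names and all function names are assumptions made for the example.

```python
N_PIXELS = 5  # hypothetical sensor size


def shift_permutation(k, n=N_PIXELS):
    """Permutation sending the content of pixel i to pixel (i + k) % n.

    The wrap-around is an assumption that keeps the map a true permutation.
    """
    return tuple((i + k) % n for i in range(n))


def apply(perm, observation):
    """Predict the next observation: the value at pixel i moves to perm[i]."""
    out = [None] * len(observation)
    for i, value in enumerate(observation):
        out[perm[i]] = value
    return out


def compose(p, q):
    """Permutation equivalent to applying p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))


# Hypothetical action set: each action is known only through its permutation.
ACTIONS = {
    "left": shift_permutation(-1),
    "stay": shift_permutation(0),
    "right": shift_permutation(1),
}


def find_equivalent_action(perm):
    """Return the name of a single action with the same sensory effect, if any."""
    for name, candidate in ACTIONS.items():
        if candidate == perm:
            return name
    return None


observation = ["a", "b", "c", "d", "e"]
after_right = apply(ACTIONS["right"], observation)
right_then_left = compose(ACTIONS["right"], ACTIONS["left"])
print(after_right)
print(find_equivalent_action(right_then_left))
```

In this toy setting, comparing composed permutations is enough to recover relations between actions (here, that "left" undoes "right") without any model of the agent's body. This is the kind of structure that becomes harder to obtain once predictions are no longer permutations.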

The idea is therefore to focus on a more general case where the influence of actions on the agent's sensory observations is no longer limited to simple permutations, but can instead be characterized by more general, and in particular non-linear, functions. In this context, it is proposed to design a simple learning model that performs a probabilistic prediction of future sensory observations based on past actions and observations. On this basis, it will then be necessary (i) to understand how combinations of these predictions can be exploited to capture the corresponding action combinations, and (ii) to study the structures obtained from the agent's actions in different environments. The question of extending these considerations to other sensory modalities, such as audition, may also arise, as the general framework is intended from the outset to be multimodal. Another extension is to study the use of this structure for learning disentangled state representations. Indeed, [Caselles2019, Quessard2020] show that in the symmetry-based disentangled representation learning framework, disentanglement is obtained by enforcing the state representation and the action space to share the same group decomposition. The internship could therefore, in a second stage, investigate how the action structure discovery method can be exploited for learning disentangled state representations in various environments.
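To make the idea of probabilistic prediction and of combining predictions more concrete, here is a minimal sketch under strong simplifying assumptions. It replaces the learning model to be designed with a plain count-based estimate of P(next observation | observation, action) over a handful of discrete observations, and chains one-step predictions to predict the effect of a sequence of actions. The toy environment, its noise level and all names are hypothetical.

```python
import random
from collections import Counter, defaultdict

N_STATES = 4  # hypothetical number of discrete sensory observations


def noisy_step(obs, action, rng, noise=0.1):
    """Hypothetical environment: the action shifts the observation but sometimes fails."""
    if rng.random() < noise:
        return obs
    return (obs + action) % N_STATES


def fit_predictor(transitions):
    """Estimate P(next_obs | obs, action) by counting (obs, action, next_obs) triples."""
    counts = defaultdict(Counter)
    for obs, action, next_obs in transitions:
        counts[(obs, action)][next_obs] += 1
    model = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        model[key] = {o: n / total for o, n in counter.items()}
    return model


def predict_sequence(model, obs, actions):
    """Chain one-step predictions into a distribution after a sequence of actions."""
    dist = {obs: 1.0}
    for action in actions:
        new_dist = defaultdict(float)
        for o, p in dist.items():
            for o_next, q in model[(o, action)].items():
                new_dist[o_next] += p * q
        dist = dict(new_dist)
    return dist


# Collect random interaction data, as a naive agent exploring its actions might.
rng = random.Random(0)
data = []
for _ in range(5000):
    o = rng.randrange(N_STATES)
    a = rng.choice([-1, 1])
    data.append((o, a, noisy_step(o, a, rng)))

model = fit_predictor(data)
dist = predict_sequence(model, 0, [1, -1])
print(max(dist, key=dist.get), dist)
```

Here the sequence (+1, -1) most probably brings the observation back to its starting value, which hints that the two actions are inverses of each other, but only in a probabilistic sense. A tabular estimate like this does not scale to rich, continuous or non-linear sensory signals. It is meant only to show what combining predictions can look like.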

This research work will be based on a detailed bibliography covering the context, subject and state of the art of the project and, depending on the results obtained, may lead to the writing of a scientific article.

[Marcel2017] V. Marcel, S. Argentieri and B. Gas, Building a Sensorimotor Representation of a Naive Agent's Tactile Space, IEEE Transactions on Cognitive and Developmental Systems (2017), volume 9, number 2, 141--152.

[Caselles2019] H. Caselles-Dupré, M. Garcia Ortiz and D. Filliat, Symmetry-based disentangled representation learning requires interaction with environments, Advances in Neural Information Processing Systems, 32 (2019)

[Quessard2020] R. Quessard, T. Barrett and W. Clements, Learning disentangled representations and group structure of dynamical environments, Advances in Neural Information Processing Systems, 33 (2020)

[Marcel2021] V. Marcel, S. Argentieri and B. Gas, Where Do I Move My Sensors? Emergence of a Topological Representation of Sensors Poses From the Sensorimotor Flow, IEEE Transactions on Cognitive and Developmental Systems (2021), volume 13, number 2, 312--325.

[Goasguen2022] L. Goasguen, J.-M. Godon and S. Argentieri, From State Transitions to Sensory Regularity: Structuring Uninterpreted Sensory Signals from Naive Sensorimotor Experiences, IEEE Transactions on Cognitive and Developmental Systems (2022), in press.

Profile required

  • Student in the second year of a Master's program (M2) or in the final year of an engineering school, in robotics, artificial intelligence, or a related field;
  • Strong interest in theoretical research;
  • Ability to work independently and as part of a team;
  • Excellent communication and writing skills;
  • Good command of English.

How to apply

Interested candidates are invited to send their CV, cover letter and transcripts to
Louis Annabi (annabi(at) and Sylvain Argentieri (sylvain.argentieri(at)

Please include "Internship application - ISIR" in the subject line of the e-mail. Shortlisted candidates will be contacted for an interview.

This article was updated on April 17, 2024