
Bio-inspiration and modularity make robotic locomotion adaptable in the NRP

Current robotic control strategies are mainly based on trajectory plans that adjust movements toward the next desired state. These control policies perform poorly when the dimensionality of the control problem increases or when external disturbances perturb the system. Both issues are critical in locomotion tasks, so the need for different control methods arises. To enable autonomous robots to move in real, dynamic environments, research has focused on biologically inspired controllers, such as neuro-controllers.

The Neurorobotics Platform makes it possible for different bio-inspired motion controllers to interact, with their communication representing a simplified model of neural locomotion control in vertebrates.

archi_eli

The presented solution combines classical control strategies with reservoir computing and spiking neural networks (see "Reservoir computing with spiking populations" by Alex Vandesompele) to obtain a scalable, adaptable controller that takes advantage of the different learning properties of neural networks. To demonstrate the scalability of the controller, the experiments are performed on the simulated modular Fable robot. In the experiment, the robot is built in a quadruped configuration, so the control architecture is composed of 4 cerebellar microcircuits, called Unit Learning Machines (ULMs).

Using a spiking neural network with reservoir computing as a trajectory planner (a Central Pattern Generator, CPG) allows the learning of complex periodic trajectories for the movements of the robotic modules, whose frequency can be modulated simply by changing the frequency of the input signal to the network. PIDs that are not optimally tuned give the robot basic stability during the first part of the simulation and provide a torque command for each module. On top of this, a cerebellar network composed of 4 microcomplexes computes and provides corrective effort contributions based on the inverse dynamics model of each robotic module.
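As a rough sketch, the per-module effort command described above can be written as a PID feedback term plus a cerebellar feed-forward correction. The class and function names below are illustrative assumptions, not the actual NRP transfer-function API:

```python
class PID:
    """Textbook PID controller; deliberately loose gains give only coarse stability."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def command(self, desired, measured):
        error = desired - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def module_torque(pid, cerebellar_correction, desired_angle, measured_angle):
    """Total effort: coarse PID feedback plus the cerebellar feed-forward term."""
    return pid.command(desired_angle, measured_angle) + cerebellar_correction
```

Here the cerebellar correction would come from the ULM that has learned the inverse dynamics model of the module; the PID alone only supplies the early, coarse stability.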

In the video below, you can appreciate the locomotion improvements of the robot in an experimental simulation. The recording shows the simulation around seconds 100-130, when the position error is decreasing and stabilizing. The brain visualizer shows the spiking activity of the input populations of the Central Pattern Generator (the upper groups), which is reflected in the blinking of one population of the reservoir (lower group). The spike train window (on the left) shows the periodicity of the spike trains that generate the trajectories for the modules (starting from the bottom, the activities of the input populations and of one reservoir population are displayed).

video_4mod_fable_v1

The feed-forward cerebellar effort contribution decreases the mean of the position error by 0.3 rad and its variance by 0.01 compared to the case where only the effort command from the CPG is provided to the robot (the plot for the second module is shown below). Moreover, the error keeps decreasing over the simulation time, and the distance covered by the robot with the cerebellar-like neural network contribution is 9.48 m, whereas the CPG controller alone makes the robot walk 1.39 m.

The modular configuration of the Fable robot makes it easier to test the control strategy on different robot configurations and locomotion patterns: the cerebellar-like neural network compensates the error after only a short learning phase, since it has previously learned the internal model of the module.

This work was done in collaboration between DTU, Ghent and SSSA teams.

 


A new integration experiment for the visual system of the NRP

We implemented a new experiment to demonstrate that the NRP is able to run many models together, as a single visual system. Here, a retina model, a deep neural network for saliency computation, a spiking cortical model for early-stage visual segmentation and an echo state network for saccade generation collaborate, despite the differences in their structures. The NRP provides a common framework where models can talk to each other easily.

In the video below, the robot has to keep track of a stimulus that moves on a screen. The saliency model computes where the stimulus lies (the most salient region in the visual field) and, whenever it is not in the fovea, delivers a signal to the saccade model so that an eye movement is generated towards the visual stimulus. When the stimulus reaches the fovea, the second task is to segregate the target (small tilted bars) from the flanking square (which impairs target detection). For this purpose, when a saccade is generated, the segmentation model triggers saccadic inhibition and then sends local signals that initiate a spreading segmentation process residing in the network's dynamics, attempting to segregate the target from the flanker. These local signals are sent using the saliency output as a 2D probability density distribution, so that segmentation is only triggered around regions that are worthy of attention. Only when the segmentation is successful is the target detected. During the whole experiment, the retina model provides adaptation to the lighting of the scene. Thanks to this adaptation, the cortical representation of the stimulus is stable and segmentation is possible even in low-lighting conditions.
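The saccade-trigger logic described above (find the most salient location and move the eye only if it lies outside the fovea) can be sketched as follows; the array shapes, the foveal radius and the function name are illustrative assumptions:

```python
import numpy as np

def saccade_target(saliency_map, fovea_center, fovea_radius):
    """Return the displacement to the most salient point, or None if foveated."""
    # locate the peak of the saliency map (most salient region)
    peak = np.unravel_index(np.argmax(saliency_map), saliency_map.shape)
    displacement = np.array(peak) - np.array(fovea_center)
    if np.linalg.norm(displacement) <= fovea_radius:
        return None          # stimulus already in the fovea: no saccade
    return displacement      # drives an eye movement towards the stimulus
```

The same saliency map, normalised to sum to one, can then serve as the 2D probability density from which the local segmentation signals are drawn.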

The NRP demonstrates its ability to run large-scale, collaborative simulations. Being able to run a whole visual system gives many novel opportunities to vision research, especially to explain global effects in human vision.

Reservoir computing with spiking populations

Passively compliant robots are robots with passive compliant parts, for instance springs or soft body parts. They can be cheaper, safer and more versatile than traditional stiff robots. Since the compliance introduces non-linearities that are not easy to model analytically, we need to monitor the body with sensors and use machine learning to interpret those sensors.

Reservoir computing makes it possible to train non-linear dynamical systems using only simple machine learning techniques. The unit of our reservoir is a population of spiking neurons. During training, the desired motor commands are gradually taught to the closed-loop system (with gradual FORCE learning), as illustrated in the figure below. The only weights that need to be learned are those to the readouts.

Model_overview_v2

The Neurorobotics Platform provides a convenient interface between the robot model and a spiking ‘brain’. After training, we have a closed-loop system consisting of only the body and its ‘brain’: the body sensors drive the ‘brain’ activity, which in turn drives the actuators. The motor commands are ‘embedded’ into the dynamics of this system. The animation below shows the resulting trained closed-loop gait controller. Spike trains are shown for all neurons in one population.

CL_walkingGait_spikes

If desired, the system can be trained with an extra input to the reservoir (in addition to sensor inputs). This extra input can be coupled with different motor commands (for instance different gaits, or different frequency of the same gait). After training, this extra input can be set by an external actor (be it another ‘brain’ region or just a human) to control the system in real time:
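One way to picture the extra input is as an additional channel concatenated to the sensor vector. The leaky rate reservoir below is a generic stand-in for the spiking populations, used purely for illustration; all names and parameter values are assumptions:

```python
import numpy as np

class CommandedReservoir:
    """Generic leaky reservoir driven by body sensors plus an external command."""

    def __init__(self, n_sensors, n_units=100, seed=0):
        rng = np.random.default_rng(seed)
        # input weights cover the sensors plus one extra command channel
        self.w_in = rng.normal(scale=0.5, size=(n_units, n_sensors + 1))
        self.w_rec = rng.normal(scale=1.0 / np.sqrt(n_units),
                                size=(n_units, n_units))
        self.state = np.zeros(n_units)

    def step(self, sensors, command, leak=0.3):
        u = np.concatenate([sensors, [command]])   # sensors + external command
        pre = self.w_in @ u + self.w_rec @ self.state
        self.state = (1 - leak) * self.state + leak * np.tanh(pre)
        return self.state
```

After training, an external actor simply changes the value passed as command to switch between the behaviours (gaits, frequencies) that were coupled to it.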

CL_GaitTransition_spikes

New mechanisms in the Laminart model

The Laminart model is a spiking cortical network for early-stage visual segmentation. Using network dynamics simulated in NEST, it is able to parse its visual input into several perceptual groups, and it is the largest simulated network available on the Neurorobotics Platform. It explains very well many behavioural results about visual crowding, a behavioural paradigm in which human observers try to identify a target disrupted by nearby flanking shapes; the target is “crowded” when identification performance is impaired by the flankers (see fig. 1).

crowding
Fig. 1. a) Crowding in real life. If you look at the bull’s eye, the kid on the left is easily identifiable. However, the one on the right is harder to identify, because the nearby elements have similar features (orange color, human shape). c) Crowding in behavioural experiments. The visual stimuli on the x-axis are presented in the periphery of the visual field of human observers. The task is to identify the direction of the offset of the target (small tilted bars). Flanking squares try to decrease performance. What is plotted on the y-axis is the target offset at which observers give 75% correct answers (low values indicate good performance). When the target is alone (dashed line), performance is very good. When only one square flanks the target, performance decreases dramatically. However, when more squares are added, the task becomes easier and easier. All classical models of crowding fail at explaining the latter condition (uncrowding), because they all predict that more flankers induce more interference.

The model explains the results by stating that the target is crowded when it is considered to be in the same perceptual group as the flanker, allowing interference. When the flankers form a group of their own, they “frame” the target, increasing performance (less interference happens between elements that are in different perceptual groups). This idea explains the behavioural results very well. More than that, the model offers neural mechanisms for perceptual grouping and segmentation. However, these mechanisms work properly only with very simple flanking shapes (squares, rectangles, etc.). As soon as the shapes are slightly more complex (more orientations, various scales, etc.), the mechanisms that are necessary to produce the results stop working properly. This is a problem, because uncrowding happens for any flanking shape, not only squares. Here, we introduce new ways of implementing the key mechanisms of the model.

The first mechanism groups boundaries together (see fig. 2). This is linked to illusory contour computation: activity spreads to connect edges that are well aligned. The new idea implemented here is that the illusory contours can now spread at different scales, according to the configuration of the stimulus. For example, when you are reading, you tune your illusory contour mechanism so that horizontal grouping happens between letters of the same word, but not between words, exploiting the fact that letters of the same word are closer together than letters of neighbouring words. We used the same idea for our crowding paradigms, where different grouping scales lead to different crowding results.


Laminart grouping
Fig. 2. Top: illustration of how the multi-scale grouping mechanism works (V2, layer 2/3 activity). Left: if tuned for short distances, no illusory contour spreads from the flankers to the target. Right: if tuned for long distances along the horizontal direction, illusory contours easily spread over long distances, linking all the flanking octagons. Bottom: dynamics of the grouping mechanism. Activity naturally spreads in V2. However, spreading also activates interneurons that trigger spreading control; in this state, nothing spreads. However, stimulus onset triggers a damping signal whose duration is specific to the orientation of spreading and highly dependent on the stimulus shape. For example, in the top-left stimulus, the closed surfaces impair spreading around the target. In the top-right stimulus, the horizontal alignment of the flankers means that the damping signal lasts quite long along the horizontal direction, allowing horizontal boundaries to spread across flankers. When the damping signal stops, the illusory contours are already stabilised and do not spread back.

The second mechanism parses subsets of the image that are linked by boundary grouping to different segmentation layers (different regions of the network’s activity – see fig. 3). In our case, this allows the model to block interferences between parts of the image that belong to different groups. The new mechanism is better, because the former one required tonic activity of many neurons in the segmentation network, which was biologically implausible. Now it is only driven by input-related activity and by a brief segmentation signal.

segmentation1

Laminart segmentation
Fig. 3. Top: illustration of the mechanisms of segmentation. After the segmentation signal is sent, activity spreads from V2 segmentation layer 0 (SL0) to V2 SL1-2 (after competition, all the flanker-related activity ends up in SL2). Activity can spread throughout all flankers thanks to the illusory contours between them, which is only possible with the new grouping mechanism. Bottom: dynamics of the segmentation mechanism. The interneuron layer is the one that does all the spreading, and it also triggers its own control by spreading. The control layer is inhibited by activity in SL1, allowing the interneurons to shut down activity in SL0, disinhibiting activity in SL1. Because all the disinhibition relies on V2 activity, it only spreads along connected boundaries. You can see in the top figure that the control layer acts as a sheath, ensuring that the interneuron activity only spreads along V2 activity.

In the future, the new mechanisms will be generalised to more orientations (ongoing work), and the new version of the model will be integrated into the NRP. Top-down influence on how damping signals distribute across orientations will also be investigated.

Going beyond conventional AI

European Robotics Forum 2018 in Tampere

The Neurorobotics Platform developed in SP10 keeps improving its usability and reliability, and is looking to expand its user base. If the feedback obtained from the audience at the European Robotics Forum (900 registered guests, all roboticists from research and industry) is anything to go by, the NRP is in a prime position to fill the need expressed by this community for an interdisciplinary simulation platform that connects neuroscience, AI and robotics.

Indeed, during our workshop at the ERF and the various discussions that ensued, we were able to speak with a large number of researchers and company representatives from different backgrounds and activities. The overwhelming majority had clearly caught on to the potential advantages of using the NRP, especially together with standard AI tools such as TensorFlow. Furthermore, we found they were open to considering the ability of the NRP to establish brain-derived intelligent controllers that go beyond conventional AI. Finally, compliant robotics, based on the up-and-coming technology of elastic elements that can make robots safe by design, is an active area of research where ERF participants also saw potential for the NRP (OpenSim, custom robot designer, etc.).

We are thus looking forward to collaborating with our new contacts in the industry, and to improving the platform even further for their benefit.

20180315_140814_resized.jpg

(Benedikt Feldotto (TUM) walking the audience through the NRP’s many features)

 

Fable robot simulator

Fable is a 2-DoF modular robot arm that the DTU group is using to develop the task of “Self-Adaptation in Modular Robotics”.

Thanks to the modularity provided by Fable, several modules can be combined to create different robotic configurations of increasing complexity. In this way, one can work on manipulation tasks as well as locomotion tasks just by plugging a few modules together to form an arm, a worm, a spider,…

In the process of making the Fable robot as accessible as possible to the community, here at DTU we have been working on the implementation of the Fable v2.0 simulator.

We have created 3 different configurations:

A simple robotic arm, 2 DoF (1 Fable module)

fable1

A worm-like robot, 4 DoF (2 Fable modules)

fable2

A quadruped-like robot, 8 DoF (4 Fable modules)

fable3

This robot model has not been included in the NRP yet, but it will soon be available to users. We will keep you updated.

 

 

 

Sensory driven hind-limb mouse locomotion model

In a paper on hind-limb locomotion of the cat in simulation [1], the authors studied the importance of two main sensory feedback pathways for swing-stance phase switching, and which of these feedbacks matter more than the others for stable locomotion. In this preliminary work we set up similar rules to produce locomotion in the mouse model developed in the Neurorobotics Platform (NRP). This work will be used to study the role of sensory feedback in locomotion and its integration with feed-forward components such as Central Pattern Generators (CPGs).


Bio-mechanical model:
We use the Neurorobotics Platform (NRP) to develop the simulation model and its environment. The rigid body model of the mouse available in the NRP was obtained from a high-resolution 3D scan of a real mouse. Relationships between the segments are established via joints. For the purpose of this experiment, only the hind limbs are actuated; the current model thus has eight actuated joints in total, four in each hind limb. Muscles are modeled as Hill-type muscles with passive and active dynamics. Muscle morphometry and related parameters were obtained from [2]. Each actuated joint is driven by at least one pair of antagonist muscles, and some joints also have bi-articular muscles. In total the model consists of sixteen muscles. Proprioceptive feedback from the muscles and the rigid body, together with tactile information, closes the loop between the different components of locomotion.
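A Hill-type muscle of the kind described above combines an active term, shaped by force-length and force-velocity curves, with a passive elastic term. The sketch below uses illustrative curve shapes and parameter values, not the actual morphometric parameters taken from [2]:

```python
import math

def hill_muscle_force(activation, length, velocity,
                      f_max=1.0, l_opt=1.0, v_max=10.0, k_passive=2.0):
    """Active force shaped by force-length/velocity curves, plus a passive term.

    velocity is the shortening velocity (assumed >= 0 here); all curves and
    constants are illustrative stand-ins, not fitted mouse parameters.
    """
    fl = math.exp(-((length - l_opt) / 0.45) ** 2)                 # force-length
    fv = max((v_max - velocity) / (v_max + 3.0 * velocity), 0.0)   # force-velocity
    passive = k_passive * max(length - l_opt, 0.0) ** 2            # passive stretch
    return f_max * activation * fl * fv + passive
```

The passive term acts even at zero activation, which is what lets compliant muscle models contribute to stability without any neural drive.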



Reflex controller:
The idea is to break hind-limb locomotion into four phases: (i) swing, (ii) touch-down, (iii) stance, (iv) lift-off. Proprioceptive feedback and joint angles dictate the reflex conditions under which one phase transitions to the next. The figure shows the four phases and their sequence of transitions. For the hind limbs to change from one phase to another, we optimize the muscle activation patterns as a function of proprioceptive feedback and joint angle. This ensures a smooth transition from one phase to the next when the necessary condition is met.
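The four-phase reflex logic amounts to a small state machine: each phase advances to the next when its sensory condition is met. The thresholds and sensor names below are illustrative assumptions, not the optimized values used in the experiment:

```python
# Order of the locomotion phases for one hind limb.
PHASES = ["swing", "touch_down", "stance", "lift_off"]

def next_phase(phase, sensors):
    """Advance the limb phase when its reflex condition is satisfied."""
    conditions = {
        "swing":      sensors["hip_angle"] > 0.4,     # leg swung far enough forward
        "touch_down": sensors["ground_contact"],      # foot has hit the ground
        "stance":     sensors["hip_angle"] < -0.3,    # leg extended backwards
        "lift_off":   not sensors["ground_contact"],  # foot has left the ground
    }
    if conditions[phase]:
        return PHASES[(PHASES.index(phase) + 1) % len(PHASES)]
    return phase
```

Converting each dictionary entry into a neuron-based reflex loop is exactly step 1 of the next-steps list below.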


Discussion:
With the bio-mechanical model of the mouse in the NRP and the reflex control law, we are able to reproduce stable hind-limb gait patterns that are purely sensory driven. The next steps to be taken in the experiment are:

  1. Convert reflex laws into neuron based reflex loops
  2. Extend the reflex model for quadruped locomotion
  3. Add a CPG layer to interface with the reflex loops

References:

  1. O. Ekeberg and K. Pearson, “Computer simulation of stepping in the hind legs of the cat: an examination of mechanisms regulating the stance-to-swing transition,” Journal of Neurophysiology, vol. 94, no. 6, pp. 4256–68, Dec. 2005.
  2. J. P. Charles, O. Cappellari, A. J. Spence, J. R. Hutchinson, and D. J. Wells, “Musculoskeletal geometry, muscle architecture and functional specialisations of the mouse hindlimb,” PLoS ONE, vol. 11, no. 4, pp. 1–21, 2016.