## Abstract

We present a spiking neuron model of the motor cortices and cerebellum of the motor control system. The model consists of anatomically organized spiking neurons encompassing premotor, primary motor, and cerebellar cortices. The model proposes novel neural computations within these areas to control a nonlinear three-link arm model that can adapt to unknown changes in arm dynamics and kinematic structure. We demonstrate the mathematical stability of both forms of adaptation, suggesting that this is a robust approach for common biological problems of changing body size (e.g. during growth), and unexpected dynamic perturbations (e.g. when moving through different media, such as water or mud). To demonstrate the plausibility of the proposed neural mechanisms, we show that the model accounts for data across 19 studies of the motor control system. These data include a mix of behavioural and neural spiking activity, across subjects performing adaptive and static tasks. Given this proposed characterization of the biological processes involved in motor control of the arm, we provide several experimentally testable predictions that distinguish our model from previous work.

## 1. Introduction

Biological motor control systems are extremely robust, adaptive and versatile. Advances in control theory continue to improve the performance of engineered control systems [1,2], but biological systems remain far more capable. For this reason, and to better understand motor diseases, unravelling the control algorithms used by the brain is an important challenge. In this paper, we provide a novel, mathematically stable and neurally plausible approach to realizing the fundamental biological process of robust adaptation for limb control. While we specify some of the model's basic operations using control theory, we also propose specific biological processes, in terms of anatomically localized neural computations and a spike-based learning rule, to realize an adaptive, robust motor controller for an arm.

In order to relate control algorithms to measured neural spiking activity, it is crucial to construct neural models able to reproduce both observable behaviour and spiking activity comparable to empirical data. Contemporary work on the motor system typically focuses either on behavioural phenomena (such as stereotypical trajectories [3], velocity profiles [4] and motor learning ability [5–7]) or on neural phenomena (such as spiking neuron activity profiles [8] and neural population-level data analysis [9]) (see electronic supplementary material, section S1). In this paper, we employ recent advances in neural modelling methods [10,11] to present a novel, adaptive, nonlinear motor control algorithm fully realized in a spiking neuron model. We refer to this model as the recurrent error-driven adaptive control hierarchy (REACH) model.

We demonstrate the REACH model's abilities by controlling a three-link, nonlinear arm through complex paths, including handwritten words and numbers. We also demonstrate its ability to adapt to environmental changes (e.g. an unknown force field) and changes to the physical properties of the arm (e.g. from growth). We suggest a mapping between neural structures in our model and anatomical brain areas based on biologically plausible computational processes realized within the model. Using this mapping, we provide a variety of comparisons with behavioural performance during static and adaptive tasks, as well as detailed spiking neural response analyses compared with classic and recent results in experimental neuroscience. To the best of our knowledge, the REACH model is unique among motor system models of limb control in the variety of both behavioural and neural data it is able to account for.

## 2. Material and methods

We begin by providing an overview of the architecture and algorithms of the REACH model, and direct the reader to electronic supplementary material, section S3.4 for an in-depth discussion of the implementation details. Figure 1 identifies the major neural and physical components of the REACH model. The arm, shown at the bottom of figure 1, is a three-link planar arm physically modelled after the human arm, using parameters described in [12]. The arm simulation was built using MapleSim, and is available online (https://github.com/studywolf/REACH-paper/). The equations of motion for the arm can be expressed in joint coordinates **q** = [*q*_{0}, *q*_{1}, *q*_{2}]^{T} as [1]
2.1where **M**(**q**) is the inertia matrix, **u** is the joint torque applied to the arm and captures the centrifugal and Coriolis effects. The purpose of the motor system is to determine a control signal **u** that will move the arm to a desired target or through a desired trajectory.

The premotor cortex (PMC) is known to play an important role in the preparation and planning of movements, as well as the control of timing during execution [13–15]. The neural activity recorded from PMC during tasks where a path is traced out using different limbs suggests that it is largely responsible for abstract trajectory generation [16,17]. For the purposes of modelling, based on this functional characterization, we refer to the trajectory generation network of our model as the PMC. The PMC in the REACH model implements a neural circuit based on dynamical movement primitives (DMPs) [18,19] (described in electronic supplementary material, section S3.4.1), which provides desired trajectories that can be complex (e.g. letters and numbers, see electronic supplementary material, figure S5).

Intuitively, DMPs can be thought of as building blocks of movement that can be strung together or executed in parallel to efficiently generate complicated movements, similar to motor primitives [20]. In REACH, the PMC generates trajectories specifying how to move the hand in a two-dimensional (*x*, *y*) space, where movements are more easily planned and generalized than in a lower-level representation like joint space (see electronic supplementary material, figure S5*c* for an example of generalization in REACH). However, purely DMP-based approaches do not solve the central problems of adaptation and redundant control, which is a focus of the REACH model.

We now demonstrate how to determine **u** in equation (2.1) to give us a desired motion to a target position. The control signal calculated to move in the two-dimensional space will take the form of a standard proportional–derivative (PD) control signal [21], which is composed of two parts: a term that accounts for the position error, and a term for the velocity error. The force to correct the position error, which we denote **u _{x}**, is calculated in hand-space as
2.2where

**x**is the position of the hand, provided through a visual or proprioceptive feedback,

**x*** is the desired position and

*K*

_{p}is a gain term. Once calculated, this force command must be translated into joint torques to apply to the arm.

M1 has been shown to have cells sensitive to the activation of specific muscles, and cells sensitive to the general direction of movement, regardless of which muscles activate to achieve that movement [22]. Additionally, Graziano [14] has shown that M1 is not a simple fixed mapping to muscles, but rather a complex function of the state of the system. To model M1, we suggest a functional assignment of transforming forces from an abstracted hand-space representation to joint torques that can be sent out to the arm (see electronic supplementary material §3.4.2). While much important detail is lost in this characterization of the role of M1 (especially by drawing a clear distinction to PMC), we believe that it is useful for advancing models of motor control, and show in §3 that even this simplified description captures numerous features of M1's neural activity.

In control-theoretic terms, the mapping from hand-space to joint torques is described, using a Jacobian matrix that relates the movement of the joint angles, **q**, to movement of the hand position, **x**. That is,
2.3where and are velocity of the hand and joints, respectively. The fact that **J** is a function of **q** highlights that this can be a very complex mapping.

The Jacobian also defines an approximate relationship between forces in hand-space and joint-space [23]
2.4where **u** is the set of torques sent to the arm, **u _{x}** are the forces we want to apply to the hand, and functional dependencies have been suppressed for clarity.

To this point, we have described how the PMC provides our system with desired forces to apply to the hand, and how M1 transforms these forces into joint torques that approximate the desired movement. To accurately control the arm, the controller must also account for the inertia generated by its own movement (i.e. the **M** term in equation (2.1)). To do this, we reformulate the transformation performed by M1 as
2.5where **M _{x}** estimates the inertia matrix in hand-space, allowing the effects of inertia to be cancelled out [24].

The function performed by M1 also needs to be robust to changes in the physical properties of the arm (e.g. changes owing to growth). In the REACH model, we employ a novel reformulation of recent nonlinear adaptive control methods [1] to allow our neural model to learn the Jacobian on-the-fly (see electronic supplementary material §3.3). We replace **J** with to indicate an adaptive Jacobian. This gives
2.6Notably, the control signal in equation (2.6) does not include the standard PD controller's velocity error term, ignores the centrifugal and Coriolis effects described in equation (2.1), and provides no means of adapting to unexpected forces. The lateral and intermediate cerebellum have been suggested to be responsible for developing and storing internal models of the system dynamics, and generating adaptive error correction signals, respectively [25–29]. The REACH model accounts for these terms by (i) using internal models to estimate the inertia matrix of the body to give the velocity error term, and (ii) introducing the adaptive error correction to implement dynamics adaptation, denoted **u**_{adapt}. These new terms have a natural mapping to the terms missing from equation (2.6) (see electronic supplementary material, section S3.4.3 for details). With these contributions to the outgoing control signal from the cerebellum, the output of the REACH model becomes
2.7where **u**_{adapt} is a function of **q** and . The **u**_{adapt} term is implemented using a novel reformulation of nonlinear adaptive methods [30,31] for neural systems (see electronic supplementary material, section S3.3).

Finally, the formulation of the model in equation (2.7) allows only a single controller to drive goal-directed movements. Animals, however, are able to optimize movements with respect to several goals simultaneously [32]. For instance, monkeys can reach to a target while staying as close to a default resting position as possible, minimizing energy usage. Task priority [33,34] and operational space control methods [35] provide a natural framework for such additional constraints. To account for the contributions of secondary controllers, we add an additional force term, **u**_{null}, which is filtered such that only movements not affecting movement in the primary operational space (i.e. hand-space) are allowed.

As a result of this last extension (see electronic supplementary material, figure S4), the full command sent to the arm is 2.8

As a model of the motor control system, the absence of the basal ganglia (BG), generally considered an important part of motor control, is conspicuous. The BG have been shown to be critical for reinforcement learning [36,37] and the volitional control of movement [38,39]. We justify not including the BG by noting that neither of these aspects of motor control are central to the tasks we are exploring here. For instance, the error adaptation for dynamics relevant to the REACH model is believed to occur almost exclusively in the cerebellum [40]. Studies have shown that patients with BG lesions, but not those with cerebellar lesions, are capable of adapting to such reach errors [41]. The BG are also important for determining which actions are most appropriate for a given situation, but the tasks we present include explicit direction at the beginning of each simulation as to which action to perform, making flexible action selection outside the scope of the model. However, REACH naturally integrates with spiking BG models, like those we have proposed in the past [42].

To implement the dynamics specified above for the PMC, M1 and CB network models in spiking neurons we use the neural engineering framework (NEF; [43]; see electronic supplementary material §3.1). The NEF can be thought of as a ‘neural compiler’, which guarantees a globally optimal approximation of the underlying dynamical equations by computing connection weights for a spiking network. Importantly, the resulting network behaviour is an *approximation* of the ideal mathematical formulation because of neural heterogeneity, stochasticity and connectivity, all of which affect performance. These sources of error provide important implementational constraints on any proposed model. The REACH model contains approximately 30 000 spiking neurons that map to anatomical areas as described above, and detailed in the electronic supplementary material §3.4. This number of neurons has been chosen to ensure that the subsystem computations required are performed with at least 95% normalized accuracy over a sufficient range of the state space for the subsystem. The methods for making this determination are detailed in [43]. This neural implementation allows testing of the behavioural performance of the REACH model while recording single neuron spiking behaviour throughout various tasks in any of the modelled anatomical areas.

## 3. Results

The first task we perform is the standard clinical behavioural test of eight centre–out reaches. While it would be possible to encode a human-like trajectory for these reaches into the DMP network, we instead only give the target location and use equation (2.2) to generate the desired hand forces. Figure 2 shows the arm trajectories and velocity profiles from the REACH model compared with human data collected by Shadmehr & Mussa-Ivaldi [44]. In [47], the total squared jerk (TSJ) is suggested as a metric for comparing reaching movements, where jerk is the rate of change of acceleration. The TSJ of 100 model-generated reaches compares well with the normal movements reported in [47]: the 95% confidence interval of the REACH model data spans [0.89–1.05], whereas the confidence interval of the human data spans [0.7–1.1]. Similarly, the velocity profiles in figure 2*b* show strong matches with the data presented in [44]: a correlation of 0.96 across the mean and a 0.703 correlation across the variance.

In addition to performing static reaching tasks and complex path generation (see handwriting in electronic supplementary material, figure S5), the REACH model incorporates robust, online adaptation based on observation of its own errors during execution of the trajectories. The two distinct types of online adaptation discussed above, cerebellar and cortical, are implemented using a spike-timing-based learning rule [11]. These two kinds of learning correspond to the dynamics and kinematics adaptation, respectively, discussed in [1,31] (see electronic supplementary material, section S3.3 for details).

Figure 3*a* shows simulation results of the model employing cerebellar adaptation while moving in an unknown force field. The results are compared with human data collected from [44] performing the same task under the same force field. In both the model and human trajectories, the effects of the force field are evident in the initial, unadapted reaches. After adaptation to the force field, both the human subject and REACH model are able to accurately move to the targets in a straight line.

Figure 3*b* shows results for cortical adaptation. Specifically, the arm segment lengths are initially set to 3.0 feet for calculating the Jacobian, instead of their actual values of 2.0, 1.2 and 0.7 feet. As the model repeatedly attempts to trace an ellipse, it learns to adapt to this change in the system kinematics and eventually accurately reproduces the trajectory.

The comparisons we have made to this point, while reflecting underlying neural performance and deficits, have focused on behavioural data. A major benefit of implementing models at the level of individual spiking neurons is that it allows for detailed comparison with a large body of experimental evidence from neuroscientific studies. Here, we present three such comparisons.

The first, in figure 4*a*, shows that the REACH model reproduces correlations exhibited during centre–out reaching movements between neural activity and various movement parameters. The neural activity profiles show correlations with acceleration, arm position, distance to target, hand position, movement force, movement magnitude, and movement velocity, consistent with correlations reported across the 19 studies listed in electronic supplementary material, table S1. Additionally, the system shows correlations with movement direction, displayed in detail in figure 4.

These correlations are clearly understandable based on the model structure, providing insights into the biological processes giving rise to this empirical information. For instance, the actual and target hand positions are represented in the primary motor cortex, as shown in figure 1, because they are required for the computation of the task-space control signal and Jacobian matrix. Hence, neurons in this area correlate with acceleration, arm position, distance to target, hand position, movement direction, movement magnitude and movement velocity, as shown in figure 4*a*. Similarly, the cerebellum represents position and velocity information, as well as an efferent copy of the motor signal, which is necessary for dynamics compensation and adaptive error correction. The representation of these signals means that neurons will correlate with arm position and movement direction, magnitude and velocity.

More specifically, work from the Georgopolous laboratory [48] provides a classic example of the correlations observed between neurons in M1 and movement parameters. That work suggests that neurons respond to a ‘preferred direction’ of arm movement in the plane. Figure 4 shows a comparison between a neuron in the original experimental data and simulated data taken from a REACH neuron with a similar preferred direction, demonstrating that the model is using neurons with biologically plausible response curves while generating plausible behaviour.

Finally, recent work has challenged the classic view that neural activity represents basic movement parameters, suggesting instead that neural responses include dynamics as part of the representation [8]. To demonstrate this claim, a variation on principal component analysis called ‘jPCA’ was presented in [8] and applied to single-cell neural recordings from monkeys made during a similar reach task. Figure 4*b* shows a comparison of a jPCA analysis performed on the spiking data from our model and the data from [8]. Figure 4*b* demonstrates that the same kinds of rotational dynamics captured by this analysis are found both in the model neural activity and empirical data (see electronic supplementary material, section S3.5.3). Finding systems-level constraints on neural activity is especially important for neural models, such as the REACH, where sheer volume makes finding and matching neuroscientific constraints at the level of single neurons across the whole model implausible.

## 4. Discussion

Thus far, we have presented a spiking neural model of motor control, and compared it with a variety of behavioural and neural data. We believe that the REACH model represents a significant departure from past work in several respects (discussed in electronic supplementary material, section S1), and significantly improves on the motor system used in our Spaun model [49] by incorporating kinematic and dynamic adaptation, and introducing DMP-based trajectory generation. Despite improvements over past models, there are clear limitations of this work. Most obvious, it does not approach the scale or behavioural sophistication of the primate motor system. The motor cortex in the model is four orders of magnitude smaller than human motor cortex [50], and the cerebellum is five orders of magnitude smaller [51].

As to the complexity of the REACH model, there are well-known anatomical structures in both CB and motor cortex that are not captured by the model, including the cerebellar Purkinje cell circuit and the six-layer cortical structure found throughout mammalian cortex. In addition, the model maintains artificial separation of areas, such as the PMC and M1. Past work suggests that models of the kind presented here can be mapped onto these structures while preserving function [52,53], but incorporating additional neuroanatomy remains important future work.

The arm being controlled by this model is also significantly less complex than its biological counterpart. Including the dynamics of the musculoskeletal system, as well as details of spinal cord circuitry, would enhance the biological realism of the model overall. It would also provide for comparisons of myoelectrical activity in the biological and model arm, acting as an additional source of empirical constraint.

Despite these limitations, the REACH model embodies a testable set of neural hypotheses about the motor control system. To highlight the consequences of these hypotheses, we present four predictions of the REACH model, three of them in the context of contrasting predictions from past work. These first three predictions were formed by comparing the details of the REACH and previous models to identify situations where distinct neural activity is expected given different modelling assumptions.

We begin with two predictions that distinguish REACH from what we call ‘compositional control’ models [54,55]. By ‘compositional control’, we mean an implementation of optimal feedback control theory [4] based on Kullback–Leibler (KL) control [56], or a mixture-of-experts model [57]. The first prediction relates to the expected activity of the PMC during the generalization of complex trajectories, for example drawing a ‘2’ right-side up and upside down. The REACH model uses the DMP framework for trajectory generation, where the same circuit generates both the right-side up and upside down ‘2’, so we expect very similar patterns of activation across cells in the PMC during the production of both the normal and transformed movements. By contrast, the compositional controller generates a right-side up and upside down ‘2’ using different circuits, because the kinematics of the two trajectories share little similarity; as a result, we expect little similarity in the spiking activity patterns in the PMC between the two movements.

The second prediction distinguishes the REACH and KL control models from the hierarchical modular selection and identification for control (HMOSAIC) mixture-of-experts model [58]. During the execution of learned movements, PCA analysis on the PMC in both the REACH and compositional control models (as described in electronic supplementary material, section S2) is expected to reveal a strong negative correlation between variance explained by the primary principal component and the number of degrees of freedom being controlled. That is, we expect more variance explained in a task with 1 degree of freedom than with 2 degrees of freedom, which in turn will be higher than in a task with 3 degrees of freedom. In the HMOSAIC model, the number of degrees of freedom under control should not correlate with the structure of the neural activity of the PMC, because that activity relates to choosing a particular low-level controller rather than explicitly providing a trajectory to follow. For highly trained movements, higher levels of HMOSAIC will specify one or two controllers, regardless of the number of degrees of freedom involved. However, the amount of training will not affect the degrees of freedom generated by PMC for either the REACH or compositional control models.

The third prediction distinguishes between HMOSAIC and the REACH model. When the dynamics of the underlying system change (e.g. when the arm becomes heavier from grasping an object), the REACH model predicts little to no qualitative changes in the neural activity of the PMC in comparison with moving under normal dynamics. In the HMOSAIC model, however, a change in system dynamics requires appropriately changing the low-level controllers used. We expect this change to be reflected in the neural activity of the higher levels of the system, including the PMC.

We note that these predictions are based on our understanding of how competing models map onto the underlying neuroanatomy, which has often not been explicitly specified. Regardless, performing such experiments will narrow the set of plausible alternatives for how the impressive skill and adaptability of the mammalian motor system is generated by the underlying neural controllers.

Finally, as explained in electronic supplementary material, section S3.5.3, we also predict that the same rotational dynamics found in the neural activity of M1 and the PMC with jPCA analysis will be found in any area of the brain representing the position and velocity of the body during directed movement. In particular, we predict that these rotations will be present in the sensory motor cortex and the cerebellum. If true, this suggests that jPCA is identifying general properties of dynamical systems rather than providing specific insights about cortical representation in the motor cortex.

We also note that, while the specific arm being controlled here has been parametrized to match a human limb, there is little else specific to human arm movements that has informed the specification of the model. Consequently, we would argue that the same model architecture and implementation should capture limb control movement across a wide variety of mammalian motor systems. To effect such comparison, it is critical to determine a reasonable limb model, but the adaptive mechanisms proposed, and the mapping to neuroanatomy should remain largely unchanged.

Even more generally, we believe that the adaptive algorithms we have identified can be applied much more broadly than to limb control. For instance, much of the motor system faces the same adaptive challenges. Indeed, much of the brain faces the challenge of robustly adapting its dynamics to the uncertain and changing dynamics of the environment it faces. Suggestively, the same neural algorithm that we have employed here for motor control can also be used, with minor modification, to tackle the challenge of predicting the dynamics of sensory input (technically, the perceptual ‘filtering’ problem is the dual of the control problem [59]). Consequently, it will be interesting to explore whether the same biological processes identified here prove broadly applicable to more forms of adaptive behaviour.

In conclusion, detailed models that bridge the gap between neural activity and behaviour provide quantitative and testable hypotheses that will improve our understanding of the complex relationship between brain and behaviour. While approaches to building this bridge are new and often controversial [60], understanding this relationship is critical for developing and testing new therapeutic interventions, advancing the state-of-the-art in machine learning, and deepening our understanding of the fundamental biological mechanisms driving behaviour.

## Data accessibility

All of the data and analysis scripts used in this paper can be found at https://github.com/studywolf/REACH-paper/.

## Authors' contributions

T.D. implemented the algorithms and collected the data. All authors developed the neural adaptation algorithms. T.D. and C.E. interpreted the data and wrote the manuscript, with discussion and feedback from T.C.S. and J.-J.S.

## Competing interests

We declare that none of the authors have competing interests.

## Funding

This work was supported by the Natural Sciences and Engineering Research Council of Canada, ONR grant no. N000141310419, Canadian Foundation for Innovation, Ontario Innovation Trust, the Canada Research Chairs programme, AFOSR grant no. FA8655-13-1-3084, and DARPA grant no N66001-15-C-4024.

## Acknowledgements

We thank the Brains in Silicon laboratory at Stanford for their useful discussion and feedback.

## Footnotes

Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3578777.

- Received October 4, 2016.
- Accepted November 3, 2016.

- © 2016 The Author(s)

Published by the Royal Society. All rights reserved.