 Original Article
 Open Access
Extended DMPs Framework for Position and Decoupled Quaternion Learning and Generalization
Chinese Journal of Mechanical Engineering volume 35, Article number: 95 (2022)
Abstract
Dynamic movement primitives (DMPs), a robust and efficient framework, have been studied widely for robot learning from demonstration. The classical DMPs framework mainly focuses on movement learning in Cartesian or joint space and cannot properly represent end-effector orientation. In this paper, we present an extended DMPs framework (EDMPs) in both Cartesian space and the 2-dimensional (2D) sphere manifold for quaternion-based orientation learning and generalization. Gaussian mixture model and Gaussian mixture regression (GMM-GMR) are adopted as the initialization phase of EDMPs to handle multi-demonstrations and obtain their mean and covariance. Additionally, evaluation indicators including reachability and similarity are defined to characterize the learning and generalization abilities of EDMPs. Finally, a real-world experiment was conducted with human demonstrations: the endpoint poses of a human arm were recorded and successfully transferred from the human to the robot. The experimental results show that the absolute errors of the Cartesian and Riemannian space skills are less than 3.5 mm and 1.0°, respectively. The Pearson’s correlation coefficients of the Cartesian and Riemannian space skills are mostly greater than 0.9. The developed EDMPs exhibit superior reachability and similarity for multi-space skill learning and generalization. This research proposes a fused framework of EDMPs and GMM-GMR that is capable of handling multi-space skills from multi-demonstrations.
1 Introduction
Learning from Demonstration (LfD) plays a key role in enabling robots to learn movement and manipulation skills from humans owing to its high efficiency [1]. Conventional LfD interfaces, e.g., teach pendants, joysticks, and keyboards, are used for fast programming and focus mainly on endpoint trajectory planning and control. Such interfaces are suitable only for simple tasks and are inadequate for anthropomorphic, skillful operations. In recent years, many LfD approaches have been developed for complicated tasks, of which DMPs [2], stable estimator of dynamical systems (SEDS) [3], Gaussian mixture model/regression (GMM-GMR) [4], probabilistic movement primitives (ProMP) [5], kernelized movement primitives (KMP) [6] and Hidden (Semi) Markov model (H(s)MM) [7] are outstanding representatives.
As a widespread LfD approach, DMPs were proposed and developed by Ijspeert et al. [8,9,10] to describe a trajectory by a series of action units. Such movement primitives are formalized as a stable attractor system that generates the trajectory in either task or joint space [11]. The classical DMPs framework, composed of a canonical system module, a transformation system module, and a locally weighted regression (LWR) module, encodes movements, learns their characteristics, and generalizes them to other similar targets.
In recent years, many approaches based on the classical DMPs have been presented to extend its functionality, such as obstacle avoidance [12,13,14], stiffness learning [15, 16], and collaborative behavior imitation [17]. As one of the most commonly used skill learning frameworks, the DMPs model offers robustness to perturbations, convergence to attractors, and time independence. The approach is extensively applied to learn anthropomorphic skills such as skillful sports [18]. Although the classical DMPs framework is widely used, it still has some drawbacks [19]. In this paper, we aim to endow the classical DMPs with the capability to handle multi-demonstrations and Riemannian space skills such as orientations.
In the LfD community, GMM-GMR provides a suitable option for multi-demonstrations because it extracts more information from the demonstrations, such as the probability distribution of multiple trajectories. GMM-GMR treats the encoding of human skills as a clustering problem, estimating the joint distribution over the state variables and performing regression with the conditional distribution. As a robust learning algorithm, GMM-GMR is widely used for learning and reproducing human skills in kinematics and dynamics. When dealing with multi-demonstrated trajectories, the data are usually projected onto a latent space and then encoded and reproduced by GMM and GMR successively [20]. Compared with the DMPs approach, GMM-GMR can obtain the mean and the probability distribution simultaneously from multi-demonstrations. These parameters help to summarize the demonstrated behavior and can even provide guidance for variable impedance controllers [21]. Despite these merits, GMM-GMR lacks generalization capacity when the target lies outside its distribution range. On that account, TP-GMM [22] was developed to adapt to the context by extracting the relevance between different tasks. Owing to the mutual complementarity between DMPs and GMM-GMR, in Ref. [23] GMM-GMR is introduced into the DMPs framework as the nonlinear term for multiple trajectories, but this approach was applied in joint space, is suitable only for Cartesian space parameters, and ignores the probability distribution of multi-demonstrations. Similarly, we incorporate GMM-GMR into DMPs, but we focus on task space and Riemannian space skills such as orientation, and we make effective use of the covariance characteristics.
Position and orientation are both important for robots to accurately learn movement skills. Many existing works have addressed position learning based on the classical DMPs framework in Cartesian space. Since orientation is a skill on a manifold, the classical DMPs framework is unable to handle it precisely. Therefore, in recent years, many studies have represented the distance between orientations with geodesics on Riemannian manifolds. Such approaches make it possible to properly represent end-effector orientations. In Ref. [24], several concepts of Riemannian manifolds such as geodesics and logarithm/exponential maps are discussed specifically for robotics, and four kinds of manifolds are listed: the sphere manifold \(\mathcal{S}^{d}\), the special orthogonal group \(SO(d)\), the special Euclidean group \(SE(3)\), and the manifold of SPD matrices \(S_{ + + }^{d}\). In Refs. [25, 26], a modified DMPs framework is proposed to learn orientations in Cartesian space based on the quaternion \(S^{3}\) and rotation matrix \(SO(3)\) with the logarithmic map. These approaches are effective for robot end-effector orientations, but they lack the ability to handle multi-space skills, such as poses that include both positions and orientations. Moreover, the methods inherit the drawback of the classical DMPs that they cannot handle multi-demonstrations. In Ref. [27], skills on the \(S_{ + + }^{d}\) manifold are learned using the geometry of the SPD matrix space. Although the method successfully learns endpoint stiffness skills, which have the SPD property, a rotation matrix is in general neither symmetric nor positive definite, which limits the application of this method to orientations.
To this end, we provide a new approach for learning quaternion-based orientations based on the concepts of geodesics and the exponential function on the Riemannian manifold. Different from the above-mentioned publications, our approach focuses on the 2D sphere manifold \(S^{2}\). We decompose the quaternion \(S^{3}\) into a Cartesian term \({\mathbb{R}}\) and a Riemannian term \(S^{2}\), i.e., the rotation angle and axis \({\varvec{q}} = q + \lambda {\varvec{v}}\). Thus, our framework can handle the Cartesian term \(q \in {\mathbb{R}}\) and the Riemannian term \({\varvec{v}} \in S^{2}\) separately. In brief, compared with state-of-the-art research [28], our framework can learn multi-space skills in Cartesian space and on the 2D sphere manifold. The demonstrated human arm endpoint poses, including positions and orientations, can be transferred to robots simultaneously.
The contributions of this paper can be summarized as follows:

(1) We proposed an EDMPs framework to learn and generalize quaternion-based orientations from humans to robots by extending the classical DMPs to the 2D sphere manifold.

(2) We combined the GMM-GMR and EDMPs frameworks according to their mutual complementarity. The fused framework can not only handle multiple demonstrations to obtain more demonstrated information, but also has good generalization ability.

(3) We proposed several evaluation indicators, including reachability and similarity, to evaluate the learning results of EDMPs under the determined RBFs and time constants of the algorithms.
The remainder of this paper is organized as follows. Section 2 presents the methodology of data preprocessing, the EDMPs framework, the GMM-GMR algorithm and the evaluation indicators. In Section 3, a real-world experiment is performed to evaluate the effectiveness of the approach. Discussion is carried out in Section 4. Section 5 provides the conclusion of this paper.
2 Methodology
To support orientation learning from humans to robots and help robots acquire multi-space skills conveniently and autonomously, the architecture shown in Figure 1 consists of four layers: human demonstrations (green), data preprocessing (blue), skills learning (yellow) and robot control (red). Data preprocessing and the EDMPs framework are described in Sections 2.1 and 2.2. The methodology of GMM-GMR for multi-space parameters under multi-demonstrations is introduced in Section 2.3. Additionally, we design several evaluation indicators in Section 2.4 to evaluate the learning and generalization results. For a better understanding, the key notations and abbreviations are summarized in Table 1.
2.1 Data Processing
2.1.1 Orthogonal Processing
As described in Figure 1, several trajectories of the reference points are recorded from human demonstrations with the VICON motion capture system. From these reference points, we calculate the positions \(\left\{ {\left\{ {\varvec{p}} \right\}} \right\} \in {\mathbb{R}}^{3}\) and orientations \(\left\{ {\left\{ {{\varvec{o}}_{x} } \right\}} \right\}\), \(\left\{ {\left\{ {{\varvec{o}}_{y} } \right\}} \right\}\), \(\left\{ {\left\{ {{\varvec{o}}_{z} } \right\}} \right\} \in \mathcal{S}^{2}\). The pose matrices can then be constructed from the multi-dimensional orientations, \({\varvec{R}} \in SO(3)\), \({\varvec{R}} = \left[ {{\varvec{o}}_{x}^{\rm {T}} ,{\varvec{o}}_{y}^{\rm {T}} ,{\varvec{o}}_{z}^{\rm {T}} } \right]\). To guarantee the orthogonality of the columns of the pose matrices, we first adopt the Gram-Schmidt orthogonalization approach to fine-tune the demonstrated multi-dimensional orientations.
where \(\left\langle {{\varvec{o}},{\varvec{\xi}}} \right\rangle\) represents the inner product between \({\varvec{o}}\) and \({\varvec{\xi}}\), i.e., \(\left\langle {{\varvec{o}},{\varvec{\xi}}} \right\rangle = {\varvec{o}}^{\rm {T}} {\varvec{\xi}}\). Thus, we can obtain a set of orthogonal basis vectors \(\left\{ {{\varvec{\xi}}_{x} ,{\varvec{\xi}}_{y} ,{\varvec{\xi}}_{z} } \right\}\) as well as their normalized form \(\left\{ {\eta_{x} ,\eta_{y} ,\eta_{z} } \right\}\), wherein \(\eta_{x} = {\varvec{\xi}}_{x} /\left\| {{\varvec{\xi}}_{x} } \right\|\), \(\eta_{y} = {\varvec{\xi}}_{y} /\left\| {{\varvec{\xi}}_{y} } \right\|\) and \(\eta_{z} = {\varvec{\xi}}_{z} /\left\| {{\varvec{\xi}}_{z} } \right\|\). The pose matrices are then constructed from the axes with orthogonal constraints, \(\hat{{\varvec{R}}} \in SO(3)\), \(\hat{{\varvec{R}}} = \left[ {\eta_{x}^{\rm {T}} ,\eta_{y}^{\rm {T}} ,\eta_{z}^{\rm {T}} } \right]\).
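As an illustration of the orthogonalization step described above, the following minimal sketch applies classical Gram-Schmidt to three measured axes and normalizes the result. The function names and the sample axes are hypothetical and are not taken from the paper; only the inner-product projection and the normalization follow the text.

```python
import math

def dot(a, b):
    """Inner product <a, b> = a^T b."""
    return sum(x * y for x, y in zip(a, b))

def normalize(a):
    n = math.sqrt(dot(a, a))
    return [x / n for x in a]

def gram_schmidt(o_x, o_y, o_z):
    """Return orthonormal axes (eta_x, eta_y, eta_z) from three measured axes."""
    xi_x = list(o_x)
    # Remove from o_y its component along xi_x.
    c = dot(o_y, xi_x) / dot(xi_x, xi_x)
    xi_y = [b - c * a for a, b in zip(xi_x, o_y)]
    # Remove from o_z its components along xi_x and xi_y.
    cx = dot(o_z, xi_x) / dot(xi_x, xi_x)
    cy = dot(o_z, xi_y) / dot(xi_y, xi_y)
    xi_z = [w - cx * a - cy * b for a, b, w in zip(xi_x, xi_y, o_z)]
    return normalize(xi_x), normalize(xi_y), normalize(xi_z)

# Slightly non-orthogonal axes, as might come from noisy marker data (made-up values).
eta_x, eta_y, eta_z = gram_schmidt([1.0, 0.02, 0.0], [0.03, 1.0, 0.01], [0.0, 0.02, 1.0])
```

The first axis keeps its measured direction, so the choice of which axis to process first decides which measurement is trusted most.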
2.1.2 Continuous Quaternion Solution and Decomposition
In screw theory, every transformation of the robot end-effector with respect to the base coordinate system can be expressed by a screw displacement, which is a translation along an axis \({\varvec{v}} \in S^{2}\) and a rotation by an angle \(\theta \in {\mathbb{R}}\) about that axis. The quaternion-based representation of robot end-effector poses has been widely used because of its high efficiency and non-singularity. Because a specific pose can be represented by two different quaternions, i.e., \((\theta ,{\varvec{v}})\) and \((-\theta ,-{\varvec{v}})\), we introduce a constraint rule for adjacent quaternions to ensure that the quaternion-based trajectories are continuous.
where the sign of \({\varvec{q}}_{i}\) is determined by \({\varvec{q}}_{i-1}\). On this basis, we decompose the quaternion into a Cartesian term \(q = \cos (\theta /2)\) and a Riemannian term \({\varvec{v}} = [x, y, z]\). The multi-space parameters \(\theta\) and \({\varvec{v}}\) can then be learned separately with the presented EDMPs framework.
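The sketch below illustrates one common way to realize such a continuity rule and the subsequent decomposition. It assumes unit quaternions stored as (w, x, y, z) and flips the sign of a quaternion whenever its inner product with the predecessor is negative; this criterion is an assumption, since the text only states that the sign of each quaternion is determined by the previous one. All names are hypothetical.

```python
import math

def make_continuous(quats):
    """Flip the sign of each quaternion that points away from its predecessor."""
    out = [list(quats[0])]
    for q in quats[1:]:
        prev = out[-1]
        d = sum(a * b for a, b in zip(prev, q))
        out.append([-c for c in q] if d < 0 else list(q))
    return out

def decompose(q):
    """Split a unit quaternion (w, x, y, z) into rotation angle theta and unit axis v."""
    w, x, y, z = q
    w = max(-1.0, min(1.0, w))          # guard against rounding outside [-1, 1]
    theta = 2.0 * math.acos(w)          # from w = cos(theta / 2)
    s = math.sqrt(x * x + y * y + z * z)
    if s < 1e-12:
        # The axis is undefined for a zero rotation; an arbitrary axis is returned.
        return theta, [1.0, 0.0, 0.0]
    return theta, [x / s, y / s, z / s]

# Two rotations about the z-axis (0.2 rad, then 0.4 rad); the second is stored with flipped sign.
quats = [[math.cos(0.1), 0.0, 0.0, math.sin(0.1)],
         [-math.cos(0.2), 0.0, 0.0, -math.sin(0.2)]]
fixed = make_continuous(quats)
theta, axis = decompose(fixed[1])
```

After the sign correction, the second sample decomposes into an angle of about 0.4 rad about the z-axis, so the angle and axis sequences vary smoothly along the trajectory.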
2.1.3 Quaternion Dimension Reduction before GMM-GMR
In the initial stage, to obtain the mean and covariance from multi-demonstrations, the dimension of the quaternion-based orientations should be reduced before GMM-GMR initialization. As depicted in Eq. (4), quaternions can be written in exponential form.
Thus, the dimension of the quaternion can be reduced through the logarithmic map.
Based on the above conversion, we can handle the quaternion-based orientations with GMM and GMR in Cartesian space, and finally obtain the mean and covariance in all decoupled dimensions. Hereinafter, we use DR-quaternion to denote the quaternion after dimensionality reduction.
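A minimal sketch of this reduction and its inverse is given below. It assumes that the DR-quaternion is the three-vector obtained by scaling the unit axis by the rotation angle, which is what the components \((\theta x,\theta y,\theta z)\) used in Section 2.3 suggest; any constant factor in the paper's own logarithmic map is not reproduced here. The function names are hypothetical.

```python
import math

def reduce_quaternion(theta, v):
    """Map (angle, unit axis) to a 3-vector theta * v that lives in Cartesian space."""
    return [theta * c for c in v]

def restore_quaternion(r):
    """Recover (angle, unit axis) from the reduced 3-vector."""
    theta = math.sqrt(sum(c * c for c in r))
    if theta < 1e-12:
        # The axis is undefined for a zero rotation; an arbitrary axis is returned.
        return 0.0, [1.0, 0.0, 0.0]
    return theta, [c / theta for c in r]

r = reduce_quaternion(0.4, [0.0, 0.6, 0.8])
theta_back, v_back = restore_quaternion(r)
```

Because the reduced vector is an ordinary element of \({\mathbb{R}}^{3}\), its mean and covariance can be estimated with standard Gaussian tools before it is mapped back to angle and axis.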
2.2 Methodology of EDMPs
For notational simplicity, in the rest of this paper, we denote the rotation angle and the rotation axis of a quaternion as the angle-quaternion \(\theta\) and the axis-quaternion \({\varvec{v}}\).
As described in Figure 1, the EDMPs framework combines a transformation system module, an LWR updating module, and a canonical system module. The transformation system module includes two components: the transformation system in Cartesian space and the transformation system on the 2D sphere manifold. We use the transformation system in Cartesian space to learn the angle-quaternions and positions, and the extended transformation system on the 2D sphere manifold is developed for the axis-quaternions. LWR is applied to update the nonlinear terms, and the canonical system is used to avoid explicit time dependency.
To be specific, under the proposed EDMPs framework, at the learning stage, positions and angle-quaternions \(\left\{ {({\varvec{p}},\theta ),(\dot{{\varvec{p}}},\dot{\theta }),(\ddot{{\varvec{p}}},\ddot{\theta })} \right\}\) and axis-quaternions \(\left\{ {{\varvec{v}},\dot{{\varvec{v}}},\ddot{{\varvec{v}}}} \right\}\) are processed with the transformation system in Cartesian space and on the 2D sphere manifold, respectively. The target nonlinear terms \(\left\{ {{\varvec{f}}_{{\varvec{p}}} \in {\mathbb{R}}^{3} ,f_{\theta } \in {\mathbb{R}}} \right\}\) and \(f_{{\varvec{v}}} \in {\mathbb{R}}\) are calculated from the input parameters and then encoded as a linear combination of several RBFs. The weights of the RBFs in the nonlinear terms are finally updated with the LWR approach. In the generalization stage, the target position and angle-quaternion \(\hat{{\varvec{p}}}_{g}\), \(\hat{\theta }_{g}\) and the target axis-quaternion \(\hat{{\varvec{v}}}_{g}\) are provided as the unique attractors of the second-order differential equations to calculate the corresponding generalized trajectories.
2.2.1 Transformation System Module
In this section, we take the angle-quaternions \(\theta \in {\mathbb{R}}\) and the axis-quaternions \({\varvec{v}} \in S^{2}\) as the objects of study to describe the transformation system in Cartesian space and on the 2D sphere manifold, respectively.
As depicted in Figure 1, the transformation system in Cartesian space is composed of a simple dynamic system and a nonlinear function. The simple dynamics relates the position, velocity and acceleration of the angle-quaternions \(\left\{ {\theta ,\dot{\theta },\ddot{\theta }} \right\}\) through a second-order differential equation. The nonlinear term is formalized with several nonlinear radial basis functions so that it can fit an arbitrary curve. The mathematical model of the transformation system is defined as Eq. (6).
where \(\theta\), \(z\) and \(\dot{z}\) denote the position, velocity and acceleration of the angle-quaternions, respectively, and \(\tau\) adjusts the duration of the task. \(\alpha_{\theta }\) and \(\beta_{\theta }\) are time constants that guarantee that the angle-quaternion \(\theta\) finally converges to the target \(\theta_{g}\). In this paper, we set \(\alpha = 4\beta\) for position, angle-quaternion and axis-quaternion learning so that Eqs. (6) and (7) become critically damped; the values are determined by the specific task.
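To make the role of the constants concrete, the sketch below integrates a transformation system of the commonly used form \(\tau \dot{z} = \alpha_{\theta } (\beta_{\theta } (\theta_{g} - \theta ) - z) + f_{\theta }\), \(\tau \dot{\theta } = z\) with explicit Euler steps. This form is an assumption based on the classical DMPs literature and on the symbols defined above, not a transcription of Eq. (6); the step size, the zero forcing term and the function name are illustrative choices.

```python
def rollout(theta0, theta_g, alpha, beta, tau, dt, steps, forcing=lambda k: 0.0):
    """Explicit Euler integration of the assumed transformation system."""
    theta, z = theta0, 0.0
    traj = [theta]
    for k in range(steps):
        # tau * z_dot = alpha * (beta * (theta_g - theta) - z) + f
        z_dot = (alpha * (beta * (theta_g - theta) - z) + forcing(k)) / tau
        # tau * theta_dot = z
        theta_dot = z / tau
        theta += theta_dot * dt
        z += z_dot * dt
        traj.append(theta)
    return traj

# alpha = 4 * beta = 25 gives critical damping: the state approaches the target without overshoot.
traj = rollout(theta0=0.0, theta_g=1.0, alpha=25.0, beta=6.25, tau=1.0, dt=0.001, steps=3000)
```

With the forcing term set to zero the system reduces to a critically damped spring-damper, which is why the target acts as the unique attractor regardless of the learned weights.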
The extended unit of the transformation system is developed on the 2D sphere manifold for the axis-quaternions. The distance between two axis-quaternions is represented by the geodesic on the 2D sphere manifold, and the modified mathematical model is described as Eq. (7).
where \(\lambda_{i}\), \(\dot{\lambda }_{i} \in {\mathbb{R}}\) denote the velocity and acceleration terms between \({\varvec{v}}_{i}\) and \({\varvec{v}}_{i + 1}\). \({\rm {d}}\left( {{\varvec{v}}_{i + 1} ,{\varvec{v}}_{i} } \right) = \arccos ({\varvec{v}}_{i + 1}^{\rm {T}} {\varvec{v}}_{i} ) \in {\mathbb{R}}\) is the geodesic distance between \({\varvec{v}}_{i}\) and \({\varvec{v}}_{i + 1}\), and \({\rm {d}}t\) is the time interval between them. \({\varvec{v}}_{i}\) represents the axis-quaternion in the \(i\)th state of the trajectory. \(\tau\), \(\alpha_{v}\) and \(\beta_{v}\) are constants.
To account for situations in which the initial and target axis-quaternions are changed, the rotation matrix \({\varvec{R}}_{O}^{{\hat{O}}} \in {\mathbb{R}}^{3 \times 3}\) is introduced to update the mapping direction \({\varvec{\gamma}} = \log_{{\varvec{v}}_{i}} {\varvec{v}}_{i + 1} \in {\mathbb{R}}^{3 \times 1}\) between neighboring axis-quaternions.
where \({\varvec{R}}_{O}^{{\hat{O}}}\) is determined by the initial and target axis-quaternions of the demonstrated and the generalized trajectory.
where \({\varvec{o}} \in {\mathbb{R}}^{3 \times 1}\) and \(\hat{{\varvec{o}}} \in {\mathbb{R}}^{3 \times 1}\) represent the vectors from the initial to the target axis-quaternions of the demonstrated and the generalized trajectory, respectively.
The rotation angle \(\theta_{{\varvec{o}}}^{{\hat{{\varvec{o}}}}} \in {\mathbb{R}}\) and the rotation axis \({\varvec{\omega}} \in {\mathbb{R}}^{3 \times 1}\) between \({\varvec{o}}\) and \(\hat{{\varvec{o}}}\) can be calculated as Eq. (10).
\({\varvec{R}}_{O}^{{\hat{O}}}\) can be deduced with Rodrigues’ formula.
where \({\varvec{I}} \in {\mathbb{R}}^{3 \times 3}\) is the identity matrix and \(\hat{{\varvec{\omega}}} \in {\mathbb{R}}^{3 \times 3}\) is the antisymmetric matrix of \({\varvec{\omega}}\). \({\varvec{v}}_{i + 1}\) can be calculated from \({\varvec{v}}_{i}\) with the exponential function [24].
where the vector \(\overline{{\varvec{\gamma}}}_{i}\) is obtained from the normalized \(\hat{{\varvec{\gamma}}}_{i}\) and \(\arccos ({\varvec{v}}_{i + 1}^{\rm {T}} {\varvec{v}}_{i} )\), in which the geodesic distance between \({\varvec{v}}_{i}\) and \({\varvec{v}}_{i + 1}\) is calculated with Eq. (7). On this basis, the nonlinear sequences \(\left\{ {f_{\theta } } \right\}\) and \(\left\{ {f_{{\varvec{v}}} } \right\}\) can be calculated with Eqs. (6) and (7) successively.
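The building blocks named in this subsection can be sketched as follows: the rotation between two direction vectors via Rodrigues' formula, and the logarithmic and exponential maps on the unit sphere. The sketch uses the standard textbook forms of these operations; how the paper combines them in its own equations is not reproduced, and all function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def norm(a):
    return math.sqrt(dot(a, a))

def matvec(R, v):
    return [dot(row, v) for row in R]

def rodrigues(o, o_hat):
    """Rotation matrix R = I + sin(t) K + (1 - cos(t)) K^2 turning o toward o_hat."""
    I = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    axis = cross(o, o_hat)
    s = norm(axis)
    if s < 1e-12:
        # Parallel or anti-parallel vectors: the axis is undefined; identity is a simplification.
        return I
    w = [a / s for a in axis]
    t = math.acos(max(-1.0, min(1.0, dot(o, o_hat) / (norm(o) * norm(o_hat)))))
    K = [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]  # antisymmetric matrix of w
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    return [[I[i][j] + math.sin(t) * K[i][j] + (1.0 - math.cos(t)) * K2[i][j]
             for j in range(3)] for i in range(3)]

def log_map(v, w):
    """Tangent vector at v pointing toward w, with length arccos(v^T w)."""
    c = max(-1.0, min(1.0, dot(v, w)))
    u = [wi - c * vi for vi, wi in zip(v, w)]
    n = norm(u)
    if n < 1e-12:
        return [0.0, 0.0, 0.0]
    d = math.acos(c)
    return [d * ui / n for ui in u]

def exp_map(v, g):
    """Point reached from v by moving along the geodesic with initial tangent g."""
    n = norm(g)
    if n < 1e-12:
        return list(v)
    return [math.cos(n) * vi + math.sin(n) * gi / n for vi, gi in zip(v, g)]

R = rodrigues([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
rotated = matvec(R, [1.0, 0.0, 0.0])
gamma = log_map([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
v_next = exp_map([1.0, 0.0, 0.0], gamma)
```

The logarithmic and exponential maps are inverse to each other for points that are not antipodal, which is what allows a step stored as a tangent direction and a geodesic length to be replayed from a new starting point.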
2.2.2 LWR Updating Module
In this paper, we use a linear combination of several nonlinear RBFs to fit the proposed nonlinear terms. The LWR approach is introduced to update the weights of the RBFs in the linear combinations.
where \(c_{i} = \exp (-\alpha \cdot i \cdot T/N_{1} )\), \(h_{i} = 1/(c_{i + 1} - c_{i} )^{2}\) when \(i = 1,2, \cdots, N\), and \(h_{N} = h_{N-1}\). Each RBF \(\Psi_{i} (s)\) is weighted by \(W_{i}\), which can be updated by the LWR approach.
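The following sketch shows how centers and widths defined in this way can be used to evaluate a nonlinear term and to fit its weights. It assumes Gaussian basis functions \(\Psi_{i}(s) = \exp (-h_{i}(s - c_{i})^{2})\), the normalized and phase-scaled combination that is customary for DMPs, and the closed-form per-basis LWR solution; these forms, the reading of \(\alpha\) and \(T\) as a decay constant and a duration, and all function names are assumptions rather than the paper's exact equations.

```python
import math

def rbf_params(n, alpha, T):
    """Centers c_i = exp(-alpha * i * T / n) and widths h_i = 1 / (c_{i+1} - c_i)^2."""
    c = [math.exp(-alpha * i * T / n) for i in range(1, n + 1)]
    h = [1.0 / (c[i + 1] - c[i]) ** 2 for i in range(n - 1)]
    h.append(h[-1])  # the last width repeats the previous one
    return c, h

def forcing(s, w, c, h):
    """Assumed form: f(s) = s * sum(psi_i * w_i) / sum(psi_i)."""
    psi = [math.exp(-hi * (s - ci) ** 2) for ci, hi in zip(c, h)]
    total = sum(psi)
    if total < 1e-300:
        return 0.0
    return s * sum(p * wi for p, wi in zip(psi, w)) / total

def lwr_weights(s_samples, f_samples, c, h):
    """Per-basis weighted least squares: w_i = sum(s psi f) / sum(s^2 psi)."""
    weights = []
    for ci, hi in zip(c, h):
        num = den = 0.0
        for s, f in zip(s_samples, f_samples):
            psi = math.exp(-hi * (s - ci) ** 2)
            num += s * psi * f
            den += s * s * psi
        weights.append(num / den if den > 0.0 else 0.0)
    return weights

c, h = rbf_params(n=25, alpha=4.0, T=1.0)
s_samples = [math.exp(-4.0 * k / 199.0) for k in range(200)]
f_samples = [3.0 * s for s in s_samples]   # synthetic target that the model can represent exactly
w = lwr_weights(s_samples, f_samples, c, h)
```

For this synthetic target every fitted weight is 3, and evaluating `forcing(0.5, w, c, h)` returns approximately 1.5, which confirms that the fit and the evaluation are consistent with each other.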
2.2.3 Canonical System Module
To avoid explicit time dependency during learning and generalization, the phase variable \({\varvec{s}} \in {\mathbb{R}}\) is introduced as the state of a first-order linear dynamic system, i.e., the canonical system.
where \({\varvec{s}} \in [0,1]\), \({\varvec{s}}(0) = 1\), and \(\dot{{\varvec{s}}}\) denotes the derivative of \({\varvec{s}}\); \(\tau\) and \(\alpha_{s}\) are constants. When \({\varvec{s}}\) converges to zero, the nonlinear term \(f(s) = 0\), and \(\theta\) and \({\varvec{v}}\) finally converge to the targets \(\theta_{g}\) and \({\varvec{v}}_{g}\). The whole system depends on the phase variable \({\varvec{s}}\) rather than on time. Thus, the EDMPs framework can be generalized to other situations without changing the shape of the trajectories.
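A short numerical illustration of the phase variable is given below. It assumes the usual first-order form \(\tau \dot{s} = -\alpha_{s} s\) with \(s(0) = 1\), which matches the description above but is not copied from the paper's equation; the constants and the function name are illustrative.

```python
import math

def canonical_rollout(alpha_s, tau, dt, steps):
    """Explicit Euler integration of tau * s_dot = -alpha_s * s, starting from s = 1."""
    s = 1.0
    out = [s]
    for _ in range(steps):
        s += dt * (-alpha_s * s / tau)
        out.append(s)
    return out

phase = canonical_rollout(alpha_s=4.0, tau=1.0, dt=0.001, steps=1000)
```

The phase decays monotonically from 1 toward 0, and changing \(\tau\) stretches or compresses this decay in time without altering the values that the phase passes through.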
2.3 GMM-GMR Algorithm for Multi-space Parameters
GMM-GMR is used at the initialization stage to handle multiple trajectories from human demonstrations. As depicted in Figure 1, \(\left\{ {\left\{ {\varvec{p}} \right\}} \right\}\) and \(\left\{ {\left\{ {(\theta ,{\varvec{v}})} \right\}} \right\}\) are obtained from multi-demonstrations of a human tutor. In the initialization stage, the multi-demonstrated positions \(\left\{ {\left\{ {\varvec{p}} \right\}} \right\}\), DR-quaternions \(\left\{ {\left\{ {(\theta x,\theta y,\theta z) \in {\mathbb{R}}^{3 \times 1} } \right\}} \right\}\) and phase variables \(\left\{ {\varvec{s}} \right\}\) are imported into the GMM unit in Cartesian space to learn the distribution of the trajectories, and the GMR unit is applied to generate a single trajectory and the corresponding probability distribution. After that, the output DR-quaternions \(\left\{ {\left\{ {(\theta x,\theta y,\theta z)} \right\}} \right\}\) are converted back to the quaternion representation \(\left\{ {\left\{ {(\theta ,{\varvec{v}})} \right\}} \right\}\), and the resulting single trajectory, including positions and orientations, can be learned by the EDMPs. Moreover, variable impedance control can be realized with the probability distribution of the trajectories. The specific process is as follows.
The demonstrated data are collected as Eq. (16).
In this paper, we have \(K\) demonstrations, and each demonstration has \(M\) discrete points. \(\left\{ {{\varvec{\xi}}^{I} } \right\}\) denotes the phase variables \(\left\{ {\varvec{s}} \right\}\) in EDMPs, and \(\left\{ {{\varvec{\xi}}^{O} } \right\}\) is composed of the positions \(\left\{ {\varvec{p}} \right\}\) and DR-quaternions \(\left\{ {(\theta x,\theta y,\theta z)} \right\}\).
As depicted in Eq. (16), we have \(K + M\) discrete data, and each datum follows the probability distributions \(P({\varvec{p}}(s))\), \(P(\theta (s))\) and \(P({\varvec{v}}(s))\). Hereinafter, we take the position \({\varvec{p}}(s)\) as an example.
where \(d\) denotes the dimension of the output parameters. The posterior probability \(\uppi\), mean \({\varvec{u}}\) and covariance matrix \({\varvec{\varSigma}}\) of the \(N_{2}\) Gaussian distribution functions can be determined by the EM algorithm.
To avoid local optima, the K-means algorithm is first used to initialize the clustering centers. The EM algorithm is then applied to update the parameters. The whole process is divided into an E-step and an M-step. The E-step optimizes the expectation function, i.e., the sum of posterior probabilities \(E = \sum\nolimits_{k = 1}^{M + K} {P({\varvec{\mu}}_{k} \,\,{\varvec{\varSigma}}_{k} \,\,{\varvec{\xi}}_{k} )}\); in this phase, the parameters \(\left\{ {\uppi ,{\varvec{\mu}},{\varvec{\varSigma}}} \right\}\) are held fixed. Conversely, the M-step updates the parameters \(\left\{ {\uppi ,{\varvec{\mu}},{\varvec{\varSigma}}} \right\}\) while the expectation function \(E\) is held fixed. For a detailed explanation of the EM algorithm and the parameter updating process, please refer to Ref. [29].
Based on the updated GMM parameters \(\left\{ {{\hat{\uppi }},\hat{{\varvec{\mu}}},\hat{{\varvec{\Sigma}}}} \right\}\), for positions, GMR is applied to calculate the expectation \(E(P(p|{\varvec{s}}))\) and the covariance \({\text{cov}} \left( {P(p|{\varvec{s}})} \right)\) of the conditional probability \(P(p|{\varvec{s}})\). In brief, the conditional probability \(P({\varvec{\xi}}^{O} |{\varvec{\xi}}^{I} )\) with several Gaussian distribution functions can be calculated based on the updated mean and covariance matrix, i.e., \(\hat{{\varvec{u}}}_{k} = \left[ {\begin{array}{*{20}c} {\hat{u}_{k}^{I} } & {\hat{u}_{k}^{O} } \\ \end{array} } \right]^{\rm {T}}\), \(\hat{{\varvec{\Sigma}}}_{k} = \left[ {\begin{array}{*{20}c} {\hat{\Sigma }_{k}^{O} } & {\hat{\Sigma }_{k}^{OI} } \\ {\hat{\Sigma }_{k}^{IO} } & {\hat{\Sigma }_{k}^{I} } \\ \end{array} } \right]\).
On this basis, the reconstructed data and the constraints are deduced as Eqs. (23) and (24).
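For intuition, the sketch below performs Gaussian mixture regression for a scalar input (the phase) and a scalar output, given GMM parameters that are assumed to be already fitted. It uses one widely used formulation in which the conditional mean is a responsibility-weighted sum of per-component linear predictions and the conditional variance is weighted by the squared responsibilities; this may differ in detail from Eqs. (23) and (24), and the dictionary keys and numbers are hypothetical.

```python
import math

def gaussian(x, mu, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gmr(s, components):
    """Conditional mean and variance of the output given input s.

    Each component is a dict with keys: pi (mixing weight), mu_i, mu_o (means),
    var_i, var_o (variances) and cov_oi (input-output covariance).
    """
    resp = [c["pi"] * gaussian(s, c["mu_i"], c["var_i"]) for c in components]
    total = sum(resp)
    h = [r / total for r in resp]   # responsibility of each component for this input
    mean = 0.0
    var = 0.0
    for hk, c in zip(h, components):
        gain = c["cov_oi"] / c["var_i"]
        mean += hk * (c["mu_o"] + gain * (s - c["mu_i"]))
        var += hk ** 2 * (c["var_o"] - gain * c["cov_oi"])
    return mean, var

# A single made-up component, for which the result can be checked by hand.
single = [{"pi": 1.0, "mu_i": 0.5, "mu_o": 2.0, "var_i": 0.04, "var_o": 1.0, "cov_oi": 0.1}]
mean, var = gmr(0.7, single)
```

With one component the regression reduces to the familiar linear conditional of a joint Gaussian, giving a mean of about 2.5 and a variance of about 0.75 for the values above.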
After the initialization stage with GMM-GMR, a single trajectory with covariance is obtained. The trajectory is used to train the EDMPs framework, and the covariance is used to estimate the stiffness matrices \({\varvec{K}}_{i} \in {\mathbb{R}}^{6 \times 6}\) of the impedance control loop.
where \({\mathbf{0}} \in {\mathbb{R}}^{3 \times 3}\), and \({\varvec{K}}_{{T_{i} }} = {\rm {diag}}(k_{px} ,k_{py} ,k_{pz} ) \in {\mathbb{R}}^{3 \times 3}\) and \({\varvec{K}}_{{R_{i} }} = {\rm {diag}}(k_{rx} ,k_{ry} ,k_{rz} ) \in {\mathbb{R}}^{3 \times 3}\) represent the translational and rotational stiffness, respectively. \(k_{i} = k_{\min } + (k_{\max } - k_{\min } )\frac{{\phi_{i} - \phi_{\min } }}{{\phi_{\max } - \phi_{\min } }}\), where \(\phi\) are the stiffness indicators determined by the inverse expected covariance matrices \((\hat{\Sigma }^{O} )^{-1}\) in Eq. (24). \(k_{\min }\) and \(k_{\max }\) are the predetermined minimum and maximum stiffness according to the specific application scenario.
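The linear mapping from the stiffness indicator to a stiffness value can be sketched as follows. The sketch assumes that, for one axis, the indicator is simply the reciprocal of the variance returned by GMR at each step, which is a scalar simplification of the inverse covariance matrix; the function name and the numerical values are hypothetical.

```python
def stiffness_profile(variances, k_min, k_max):
    """Map per-step variances of one axis to stiffness values in [k_min, k_max]."""
    phi = [1.0 / v for v in variances]   # low variance across demonstrations gives a high indicator
    lo, hi = min(phi), max(phi)
    if hi - lo < 1e-12:
        # Constant variance: the normalization is undefined, so fall back to k_min.
        return [k_min for _ in phi]
    return [k_min + (k_max - k_min) * (p - lo) / (hi - lo) for p in phi]

k = stiffness_profile([4.0, 1.0, 2.0], k_min=100.0, k_max=500.0)
```

Steps at which the demonstrations agree closely receive stiffness near the maximum, whereas steps with large variation across demonstrations are executed more compliantly.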
2.4 Evaluation Indicators of Learning Results
Although DMPs have the merit of convergence to the attractor, the behavior within a limited execution time largely depends on the number of RBFs and on the constants \(\alpha\) and \(\beta\) in Eqs. (6) and (7). In this section, to properly exhibit the reproduction and generalization capability of our approach under the determined RBFs and constants, we define evaluation indicators, including reachability and similarity, for the learning results. In Cartesian space, the reachability is determined by the absolute error \(e_{c}\) between the target and actual position/angle-quaternion in the end state, and by the relative error \(\Delta e_{c}\), which is \(e_{c}\) relative to the range of the trajectory. The similarity is determined by the PCCc \(\rho_{c}\) between the scaled demonstration and the actual generalized trajectory, wherein the scaling factor \(\eta = \left| {\hat{\theta }_{g} - \hat{\theta }_{0} } \right| / \left| {\theta_{g} - \theta_{0} } \right|\) is calculated from the difference between the demonstrated target and the new target. On the 2D sphere manifold, the reachability is determined by the absolute error \(e_{r}\) between the target and actual axis-quaternions in the end state. The similarity is determined by the PCCr \(\rho_{r}\) between the rotated demonstration and the actual generalized axis-quaternions. The evaluation indicators \(\Delta e_{c}\), \(\rho_{c}\) and \(\rho_{r}\) are dimensionless.
The acceptable reachability and similarity can be determined according to the actual application scenario. In this paper, we regard the generalized results as satisfactory when \(\Delta e_{c}\) is smaller than 0.005, \(e_{c}\) is between −5° and 5°, and \(\rho_{c}\), \(\rho_{r}\) are greater than 0.8. Under these criteria, the generalized trajectories converge to the target poses with high accuracy and strong correlation with the demonstrated trajectory.
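The indicators can be computed along the following lines. The sketch covers the absolute and relative end-state error, the scaled-demonstration Pearson correlation for a Cartesian quantity, and the angular end-state error on the sphere; the function names, the sample trajectories and the choice to scale the demonstration values directly by \(\eta\) are assumptions made for illustration.

```python
import math

def pearson(a, b):
    """Pearson's correlation coefficient of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den

def cartesian_reachability(traj, target):
    """Absolute end-state error e_c and error relative to the trajectory range."""
    e_c = abs(target - traj[-1])
    return e_c, e_c / (max(traj) - min(traj))

def cartesian_similarity(demo, generalized):
    """Correlation between the demonstration scaled by eta and the generalized trajectory."""
    eta = abs(generalized[-1] - generalized[0]) / abs(demo[-1] - demo[0])
    scaled = [eta * x for x in demo]
    return pearson(scaled, generalized)

def sphere_reachability(v_end, v_goal):
    """Geodesic end-state error e_r on the unit sphere, in degrees."""
    c = max(-1.0, min(1.0, sum(x * y for x, y in zip(v_end, v_goal))))
    return math.degrees(math.acos(c))

demo = [0.0, 1.0, 2.0, 3.0]
generalized = [0.0, 2.0, 4.0, 5.99]
e_c, delta_e_c = cartesian_reachability(generalized, target=6.0)
rho_c = cartesian_similarity(demo, generalized)
e_r = sphere_reachability([0.0, 0.0, 1.0], [0.0, 0.0, 1.0])
```

Because Pearson's coefficient is invariant to positive scaling, the scaling factor mainly matters when the scaled demonstration is compared with the generalized trajectory point by point rather than through correlation alone.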
3 Experiment
In this section, the Franka Panda robot is used as the experimental platform. A pick-up task with different poses was designed to verify the learning and generalization ability of the proposed method in both Cartesian space and the 2D sphere manifold.
3.1 Multi-space Skills Processing and Learning
Multi-demonstrations of the pick-up task were conducted as shown in Figure 2. The VICON motion capture system, composed of 10 cameras and 4 optical markers, was used to record the demonstrated trajectories. Three of the optical markers were placed at the center of the palm, the radial styloid and the ulnar styloid, so that the plane formed by these points is approximately parallel to the subject’s palm and determines the z-axis of the palm during movement. The last optical marker was placed between the radial and ulnar styloid to facilitate the determination of the y-axis. The x-axis is determined with the right-hand rule. The trajectories of these points are processed to represent the positions and orientations of the palm.
After multi-demonstrations and data preprocessing, GMM is used to encode the distribution of the demonstrations, and GMR is used to generate a single trajectory and the corresponding probability distribution according to the input phase variables. To properly characterize the distribution of the trajectories and generate a suitable trajectory for the EDMPs framework, we selected 5 Gaussian distribution functions for learning the multi-space parameters in our experiment, i.e., \(N_{2} = 5\). The learning results are depicted in Figure 3.
On this basis, the positions and quaternions of the generated trajectory are imported into the EDMPs framework to learn their characteristics in both Cartesian space and the 2D sphere manifold. In this scenario, we selected three targets at different positions and with different poses to test the generalization ability of the presented approach in multiple spaces. Moreover, to obtain relatively high learning accuracy, we set \(\alpha_{\theta } = 4\beta_{\theta } = 25\) for position and angle-quaternion, \(\alpha_{{\varvec{v}}} = 4\beta_{{\varvec{v}}} = 25\) for axis-quaternion, and selected 25 RBFs, i.e., \(N_{1} = 25\), to fit the corresponding nonlinear terms. The reproduced and generalized trajectories, including positions and quaternion-based orientations for the different targets, are shown in Figure 4.
Figure 4(a) shows the generalization of the positions, and Figure 4(b) shows the generalization of the decoupled quaternion-based orientations, including the angle-quaternion and the axis-quaternion. To characterize the learning and generalization capability of the EDMPs framework in multiple spaces, the reachability and similarity of the reproduced and generalized trajectories are calculated, as shown in Tables 2 and 3.
In Tables 2 and 3, the average \(e_{c}\), \(\Delta e_{c}\) and \(\rho_{c}\) of the generalized positions on the x-, y-, and z-axes are 3.2943 mm, 3.4869 mm, 2.4576 mm; 0.0081, 0.2574, 0.0114; and 0.9984, 0.8353, 0.9998, respectively. The average \(e_{c}\), \(\Delta e_{c}\) and \(\rho_{c}\) of the generalized angle-quaternions are 0.0466°, 0.0041, and 0.9981, respectively. The average \(e_{r}\) and \(\rho_{r}\) of the generalized axis-quaternions are 0.4675° and 0.9939. The absolute errors of the positions and the quaternion-based orientations are less than 3.5 mm and 1°, respectively. Except for \(G_{1}\) and \(G_{3}\) of the position on the y-axis, the Pearson’s correlation coefficients between the demonstrated and the generalized trajectories are mostly greater than 0.9. The result for \(G_{1}\) on the y-axis arises because the sign of its target is opposite to that of the demonstrated one, and the result for \(G_{3}\) arises because its target is too close to the starting point. To solve these problems, please refer to Ref. [19]. Nevertheless, the experimental results reveal that the presented approach achieves relatively good learning and generalization in both Cartesian space and the 2D sphere manifold.
Based on the definition of the satisfactory region in Section 2.4, we calculate the satisfactory generalization region of axisquaternions to further verify the generalization capability of our approach on the 2D sphere manifold, as shown in Figure 5.
As shown in Figure 5, the satisfactory generalized region with \(e_{r} \in \left[ {  5^\circ ,5^\circ } \right]\) and \(\rho_{r} \in [0.8,\;1]\) is determined. The region can cover nearly 1/3 of the spherical coordinate. All reachable targets are located on the same hemisphere with the demonstrated target. If the generalized target is too close to the starting point, the nonlinear terms may produce an unexpected influence on the generalized trajectories, and the reachability and similarity will be unsatisfactory. Moreover, if the vectors from the generalized targets to the starting point are opposite to the demonstration, or the generalized targets are located on the other hemisphere of the spherical coordinate, the generalized trajectories will also show an undesired correlation with the demonstrated one, and the reachability is also unsatisfactory. The phenomena are consistent with our experimental results.
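The membership test for this region can be sketched as follows. The thresholds are those stated above; the use of the geodesic angle between two unit vectors as the error on the sphere is an assumption (the definition of \(e_{r}\) belongs to Section 2.4), and the function names are hypothetical.

```python
import numpy as np


def geodesic_angle_deg(u, v):
    """Angle in degrees between two directions on the unit sphere."""
    u = np.asarray(u, dtype=float) / np.linalg.norm(u)
    v = np.asarray(v, dtype=float) / np.linalg.norm(v)
    # arctan2 of the cross and dot products stays accurate for small angles.
    return float(np.degrees(np.arctan2(np.linalg.norm(np.cross(u, v)), np.dot(u, v))))


def is_satisfactory(e_r_deg, rho_r, e_max_deg=5.0, rho_min=0.8):
    """True if a generalized target meets both thresholds of the region."""
    return abs(e_r_deg) <= e_max_deg and rho_min <= rho_r <= 1.0
```

With the average values reported for the axis-quaternions, `is_satisfactory(0.4675, 0.9939)` evaluates to `True`.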
3.2 Experimental Verification on Real Robot
To apply our approach in a real scenario and further verify its effectiveness, we designed a pick-up task with the Panda robot based on the above learning and generalization results. First, the variable stiffness, including the translational and rotational stiffness profiles, was obtained through the GMM-GMR initialization, and distribution-based variable impedance control was realized, as shown in Figure 6. The whole control system is based on the ROS network. Figure 7 shows several typical results of this task; the robot successfully completed the corresponding tasks with trajectory profiles similar to the demonstration.
As shown in Figure 6, the action starts at the initial phase variable \(s(t_{0} ) = 1\) and finishes at \(s(t_{{{\text{end}}}} ) = 0\). According to the probability distribution of the multiple trajectories, the diagonal elements of the translational and rotational stiffness matrices can be obtained from Eq. (25). For the translational stiffness, the stiffness along the x- and y-axes remained low in the initial stage and gradually increased as the task was executed. The stiffness along the z-axis first decreased in the initial stage and then increased to a high level near the targets. A similar trend can be seen in the different dimensions of the rotational stiffness. From these results, it can be concluded that, in this task, the x- and y-axes have a higher priority than the z-axis.
As shown in Figure 7, four bottles were placed on the desk: the one with the blue cap is the demonstrated target, and the others, with yellow caps, are the generalized targets, which were placed randomly. The robot was first moved to the initial pose shown in Figure 7(a), which is similar to the demonstrated initial pose in Figure 2(a). The initial homogeneous matrix of the human tutor was mapped to the real initial pose of the robot in Figure 7(a) through a transformation matrix, and the demonstrated trajectory was transformed accordingly. Figure 7(b) and (c) show the reproduced trajectory for the demonstrated target. On this basis, we manually adjusted the joint angles of the robot to reach the corresponding generalized targets with reasonable grasping poses. The obtained end poses were imported into the EDMPs framework, and three trajectories similar to the demonstrated curve were generated in turn. Figure 7(d), (f), and (h) show the intermediate process of the generalized movements, and Figure 7(e), (g), and (i) show the end poses of the robot for the generalized targets.
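The idea of deriving a stiffness profile from the spread of the demonstrations can be sketched as below. Eq. (25) is not reproduced in this section, so the mapping used here is an assumption and not the authors' formula: a linear inverse relation in which the largest variance maps to the lowest stiffness and the smallest variance to the highest. The function name and the bounds `k_min` and `k_max` are hypothetical.

```python
import numpy as np


def stiffness_from_variance(variance, k_min, k_max):
    """Map per-sample variance of one axis to stiffness (assumed inverse-linear rule)."""
    variance = np.asarray(variance, dtype=float)
    span = variance.max() - variance.min()
    if span == 0.0:
        # No variation between samples: use the upper bound everywhere.
        return np.full_like(variance, k_max)
    return k_min + (k_max - k_min) * (variance.max() - variance) / span
```

Under this rule, a variance that shrinks as the motion approaches the target gives a stiffness that grows over the course of the task, the qualitative trend described for the x- and y-axes.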
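The mapping of the demonstration into the robot's frame can be sketched as follows, under the assumption that a single constant transformation is chosen so that the first demonstrated pose coincides with the robot's initial pose and is then applied to every pose of the trajectory. The function name `align_demo` is hypothetical, and the authors' transformation may be constructed differently.

```python
import numpy as np


def align_demo(demo_poses, robot_start):
    """Re-express a sequence of 4x4 homogeneous poses so that it starts at robot_start."""
    demo_poses = np.asarray(demo_poses, dtype=float)        # shape (N, 4, 4)
    robot_start = np.asarray(robot_start, dtype=float)      # shape (4, 4)
    # Constant transform that maps the first demonstrated pose onto robot_start.
    mapping = robot_start @ np.linalg.inv(demo_poses[0])
    # Matrix multiplication broadcasts over the leading axis of demo_poses.
    return mapping @ demo_poses
```

Because the same transformation multiplies every pose from the left, the motion of each pose relative to the first one is preserved.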
4 Discussion
It is worth noting that the learning results of existing DMPs-based frameworks depend heavily on the number and distribution of the selected RBFs and on the time constants of the transformation system. These parameters are determined empirically for the specific task. To the best of our knowledge, there is still no literature on how to evaluate the algorithm under the selected RBFs and time constants. Therefore, we proposed several evaluation indicators to characterize the performance of the EDMPs and determined the satisfactory generalized region in our application scenario, as shown in Figure 5. If the generalized targets are similar to the demonstrated one or located in the satisfactory generalized region, the EDMPs framework performs well. However, when the difference is too large, especially if the target is located on the other hemisphere of the manifold, the results will be unsatisfactory. This limitation may be overcome by building a knowledge database for the robot that includes different skills for various tasks and covers the whole sphere on the manifold.
The proposed EDMPs framework can be applied to more complex tasks, such as human-robot cooperation and skillful manipulation, in which positions and orientations should be considered simultaneously. The main difference between our contribution and its predecessors is that our approach can handle skills on different kinds of manifolds, including the sphere manifold \(S^{d}\), the special orthogonal group \(SO(d)\), the special Euclidean group \(SE(3)\), and the manifold of SPD matrices \(\mathcal{S}_{ + + }^{d}\), by reducing the dimensions of these skills and combining the classical transformation system with our extended transformation system on the 2D sphere manifold. We use quaternions to represent the Riemannian space skills and decouple the quaternions into Euclidean space and Riemannian space terms \({(}\theta \in {\mathbb{R}}{,}{\varvec{v}} \in S^{2} {)}\). Thus, the decoupled quaternions, as well as the positions, can be learned simultaneously with our EDMPs framework. The EDMPs provide a new way to learn and generalize multi-space skills.
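The decoupling of a quaternion into an angle \(\theta \in {\mathbb{R}}\) and a unit axis \({\varvec{v}} \in S^{2}\) can be illustrated with the standard axis-angle relation \(q = (\cos (\theta /2),\;{\varvec{v}}\sin (\theta /2))\). The sketch assumes a scalar-first quaternion and this half-angle convention; the convention used by the authors is defined in an earlier section and may differ, and the function names are hypothetical.

```python
import numpy as np


def decouple(q):
    """Split a scalar-first quaternion into a rotation angle and a unit axis."""
    q = np.asarray(q, dtype=float)
    q = q / np.linalg.norm(q)
    vec_norm = np.linalg.norm(q[1:])
    theta = 2.0 * np.arctan2(vec_norm, q[0])
    # The axis is undefined for the identity rotation; an arbitrary one is returned.
    axis = q[1:] / vec_norm if vec_norm > 1e-12 else np.array([0.0, 0.0, 1.0])
    return theta, axis


def recombine(theta, axis):
    """Rebuild the scalar-first unit quaternion from an angle and a unit axis."""
    axis = np.asarray(axis, dtype=float)
    return np.concatenate(([np.cos(theta / 2.0)], np.sin(theta / 2.0) * axis))
```

After decoupling, the angle is an ordinary scalar that a classical transformation system can handle, while the axis is a point on the unit sphere and calls for a transformation system defined on that manifold.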
5 Conclusions

(1) An EDMPs framework in both Cartesian space and the 2D sphere manifold has been presented for transferring kinematic skills, including positions and orientations, from humans to robots. The quaternion-based orientations can be successfully learned and generalized with the 2D-sphere-manifold-based transformation system of the EDMPs framework.
(2) GMM-GMR algorithms are combined with the presented EDMPs framework, which yields not only a smooth regression trajectory but also the corresponding probability distribution. The former can be learned with the EDMPs, and the latter can be used as a reference for designing variable impedance controllers.
(3) Reachability and similarity are defined as evaluation indicators to characterize the learning and generalization capability of the EDMPs framework under the chosen RBFs and the constants α and β.
(4) A real-world experiment was carried out with a Panda robot. The experimental results show that the absolute errors of the Cartesian and Riemannian space skills are less than 3.5 mm and 1.0°, respectively. The Pearson’s correlation coefficients of the Cartesian and Riemannian space skills are mostly greater than 0.9. The developed EDMPs exhibit a relatively good learning ability for multi-space skills.
The present study provides a reference for transferring multi-space skills from humans to robots. In the future, we will extend our framework to other industrial applications and various skillful tasks in which position, orientation, force, and stiffness need to be considered simultaneously in both Cartesian space and Riemannian manifolds, such as polishing, scraping, welding, and human-robot cooperation.
References
[1] H Ravichandar, A S Polydoros, S Chernova, et al. Recent advances in robot learning from demonstration. Annual Review of Control, Robotics, and Autonomous Systems, 2020, 3(1): 297-330.
[2] A J Ijspeert, J Nakanishi, S Schaal. Movement imitation with nonlinear dynamical systems in humanoid robots. Proceedings 2002 IEEE International Conference on Robotics and Automation, Washington, DC, USA, May 11-15, 2002, 2: 1398-1403.
[3] N Figueroa, A Billard. Locally active globally stable dynamical systems: Theory, learning, and experiments. The International Journal of Robotics Research, 2022: 02783649211030952.
[4] N Jaquier, L Rozo, D G Caldwell, et al. Geometry-aware manipulability learning, tracking, and transfer. The International Journal of Robotics Research, 2021, 40(2-3): 624-650.
[5] S Gomez-Gonzalez, G Neumann, B Schölkopf, et al. Adaptation and robust learning of probabilistic movement primitives. IEEE Transactions on Robotics, 2020, 36(2): 366-379.
[6] Y Huang, L Rozo, J Silvério, et al. Kernelized movement primitives. The International Journal of Robotics Research, 2019, 38(7): 833-852.
[7] A K Tanwani, A Yan, J Lee, et al. Sequential robot imitation learning from observations. The International Journal of Robotics Research, 2021, 40(10-11): 1306-1325.
[8] A J Ijspeert, J Nakanishi, H Hoffmann, et al. Dynamical movement primitives: Learning attractor models for motor behaviors. Neural Computation, 2013, 25(2): 328-373.
[9] A Gams, B Nemec, A J Ijspeert, et al. Coupling movement primitives: Interaction with the environment and bimanual tasks. IEEE Transactions on Robotics, 2014, 30(4): 816-830.
[10] T Petrič, A Gams, L Colasanto, et al. Accelerated sensorimotor learning of compliant movement primitives. IEEE Transactions on Robotics, 2018, 34(6): 1636-1642.
[11] T Kulvicius, K J Ning, M Tamosiunaite, et al. Joining movement sequences: Modified dynamic movement primitives for robotics applications exemplified on handwriting. IEEE Transactions on Robotics, 2011, 28(1): 145-157.
[12] M Chi, Y Yao, Y Liu, et al. Learning, generalization, and obstacle avoidance with dynamic movement primitives and dynamic potential fields. Applied Sciences, 2019, 9(8): 1535.
[13] M Ginesi, D Meli, A Roberti, et al. Dynamic movement primitives: Volumetric obstacle avoidance using dynamic potential functions. Journal of Intelligent & Robotic Systems, 2021, 101(4): 1-20.
[14] Z Lu, N Wang, C Yang. A constrained DMPs framework for robot skills learning and generalization from human demonstrations. IEEE/ASME Transactions on Mechatronics, 2021, 26(6): 3265-3275.
[15] C Yang, C Zeng, C Fang, et al. A DMPs-based framework for robot learning and generalization of human-like variable impedance skills. IEEE/ASME Transactions on Mechatronics, 2018, 23(3): 1193-1203.
[16] F Bian, D Ren, R Li, et al. An extended DMP framework for robot learning and improving variable stiffness manipulation. Assembly Automation, 2019, 40(1): 85-94.
[17] B Nemec, N Likar, A Gams, et al. Human robot cooperation with compliance adaptation along the motion trajectory. Autonomous Robots, 2018, 42(5): 1023-1035.
[18] A Ude, A Gams, T Asfour, et al. Task-specific generalization of discrete and periodic dynamic movement primitives. IEEE Transactions on Robotics, 2010, 26(5): 800-815.
[19] M Ginesi, N Sansonetto, P Fiorini. Overcoming some drawbacks of dynamic movement primitives. Robotics and Autonomous Systems, 2021, 144: 103844.
[20] S Calinon. Mixture models for the analysis, edition, and synthesis of continuous time series. Mixture Models and Applications. Springer Press, Cham, 2020: 39-57.
[21] S Calinon, I Sardellitti, D G Caldwell. Learning-based control strategy for safe human-robot interaction exploiting task and robot redundancies. 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, Taipei, Taiwan, Oct 18-22, 2010: 249-254.
[22] S Calinon. Robot learning with task-parameterized generative models. Robotics Research, Springer Press, Cham, 2018: 111-126.
[23] C Yang, C Chen, W He, et al. Robot learning system based on adaptive neural control and dynamic movement primitives. IEEE Transactions on Neural Networks and Learning Systems, 2018, 30(3): 777-787.
[24] S Calinon. Gaussians on Riemannian manifolds: Applications for robot learning and adaptive control. IEEE Robotics & Automation Magazine, 2020, 27(2): 33-45.
[25] A Ude, B Nemec, T Petrić, et al. Orientation in Cartesian space dynamic movement primitives. 2014 IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, May 31 - June 7, 2014: 2997-3004.
[26] F J Abu-Dakka, B Nemec, J A Jørgensen, et al. Adaptation of manipulation skills in physical contact with the environment to reference force profiles. Autonomous Robots, 2015, 39(2): 199-217.
[27] F J Abu-Dakka, V Kyrki. Geometry-aware dynamic movement primitives. 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, May 31 - Aug 31, 2020: 4421-4426.
[28] M Saveriano, F J Abu-Dakka, A Kramberger, et al. Dynamic movement primitives in robotics: A tutorial survey. arXiv preprint arXiv:2102.03861, 2021.
[29] S Calinon, P Kormushev, D G Caldwell. Compliant skills acquisition and multi-optima policy search with EM-based reinforcement learning. Robotics and Autonomous Systems, 2013, 61(4): 369-379.
Acknowledgements
Not applicable.
Funding
Supported by National Natural Science Foundation of China (Grant No. 52175029), and Key Industrial Chain Projects of Shaanxi Province (Grant No. 2018ZDCXLGY0605).
Author information
Contributions
ZL was in charge of the whole research and wrote the manuscript; FZ discussed and read the manuscript; GJ and XM assisted with the analysis and validation. All authors read and approved the final manuscript.
Authors’ Information
Zhiwei Liao received the B.Eng. degree from Xi’an University of Science and Technology, China, in 2015, and the M.Eng. degree from Fuzhou University, China, in 2018. He is currently pursuing the Ph.D. degree at the Shaanxi Key Laboratory of Intelligent Robots, Xi’an Jiaotong University, China. His research interests include robot skill learning, robot kinematics, and robot impedance control.
Fei Zhao received the Ph.D. degree in mechanical engineering from Xi’an Jiaotong University, China, in 2013. He joined the School of Mechanical Engineering, Xi’an Jiaotong University, and the Shaanxi Key Laboratory of Intelligent Robots, in 2017. His research interests include robotics and intelligent manufacturing, smart factory, and robotics technology.
Gedong Jiang received the Ph.D. degree in mechanology from Xi’an Jiaotong University, China, in 1998. She is a Professor with the School of Mechanical Engineering, Xi’an Jiaotong University, China. Her research interests include robotics, smart factory and robotics technology, precision measurement technology, and electromechanical system dynamics.
Xuesong Mei received the Ph.D. degree in mechanical engineering from Xi’an Jiaotong University, Xi’an, China, in 1991. He is a Full Professor with the School of Mechanical Engineering and the Director of the Shaanxi Key Laboratory of Intelligent Robots, Xi’an Jiaotong University, China. His research interests include intelligent manufacturing, robotics, and theory and method for precision laser processing.
Ethics declarations
Competing Interests
The authors declare no competing financial interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Liao, Z., Zhao, F., Jiang, G. et al. Extended DMPs Framework for Position and Decoupled Quaternion Learning and Generalization. Chin. J. Mech. Eng. 35, 95 (2022). https://doi.org/10.1186/s10033-022-00761-w
DOI: https://doi.org/10.1186/s10033-022-00761-w
Keywords
 Learning from demonstration
 Dynamic movement primitives
 2D sphere manifold
 Gaussian mixture model
 Gaussian mixture regression
 Quaternion-based orientation