  • Research Highlight
  • Open access

Revitalizing Human-Robot Interaction: Phygital Twin Driven Robot Avatar for China–Sweden Teleoperation

1 Introduction

A Digital Twin (DT) is built and maintained in the digital rather than the physical realm [1, 2]. A dynamic DT mirrors the exact state of a physical system by means of multidimensional models that describe the behavior of the physical entity, together with sensors that couple those models to the entity in real time [3]. The outbreak of the COVID-19 pandemic accelerated the virtualization and digitization of physical entities with the support of DT. At the same time, the quarantine and separation caused by the pandemic made people more eager for physical interaction. The physical world and the digital world are becoming interwoven, which opens a new opportunity: the phygital twin. The questions so often asked are: how can I get closer to my distant companion? How can I extend my service or emotional concern to support interaction despite spatial separation? These questions deserve reflection as a lesson learned from the COVID-19 pandemic.

DT focuses on how digital elements represent physical objects in abstract and condensed ways, building on a confluence of technological advances [4]. The intrinsic differences between human beings and physical systems lead to different emphases in how a DT describes them [5]. In a human-centric DT, users expect a tangible and intuitive experience that meets their requirements for engagement and immersion, so such a DT needs to prioritize transparency and intuitiveness for humans. Moreover, a next-generation DT that involves humans must not only enhance interaction but also remove the limitations of space. The phygital twin emerges at this critical moment.

“Phygital” is a portmanteau of “physical” and “digital”. It denotes the use of technology to bridge reality and cross-space reality through interaction between the digital world and the physical world. The aim is to give users unique and interactive experiences and a positive outcome, with seamless switching between the physical realm and the digital realm.

The phygital twin is a term that describes a trio of the physical reality, the digital world, and another, cross-space physical reality. Connecting them changes the paradigm by unifying multiple hardware and software infrastructures [6]. With advanced technology, the phygital twin has the opportunity to endow an avatar with more rational and emotional factors. Imagine taking care of distant family, sharing your joys or sorrows with a distant friend, or receiving a warm hug from a loved one: such experiences surpass verbal communication [7].

In terms of realization, the robot is an appropriate vehicle for the phygital twin thanks to its advanced technologies and broad application [8, 9]. Telerobotics is one of the earliest branches of robotics. It refers to robotics with a human operator in control, or human-in-the-loop: the human is responsible for high-level planning and cognitive decisions, while the robot carries out the mechanical execution. Researchers have attempted to apply human intelligence, cognition, and skills to robots. Selvaggio et al. introduced the first haptic-guided shared-control teleoperation architecture [10], used to transport an object on a tray-like end-effector. It integrates the human operator’s experience and cognitive capabilities with the precision, reactivity, and repeatability of autonomous systems. Beyond the control strategy, transparency and stability are also challenging. Feizi et al. proposed a real-time, frequency-based delay compensation approach that maximizes transparency while reducing the activation of the stabilization layer [11]. However, although human-robot teleoperation epitomizes the phygital twin in terms of technology, the phygital twin does not refer only to operating systems or machines at a distance. Zhang et al. presented a mixed-reality avatar attached to a robot and studied the dynamic interaction among humans, the robot, and the mixed-reality avatar robot [12]. Beik-Mohammadi et al. integrated reinforcement learning with model-mediated teleoperation to handle challenging problems in a long-distance teleoperated grasping scenario with long time delays [13]. In comparison with telerobotics, research on the phygital twin strives to create an environment that maximizes the value of time, transcends space, and extends human abilities. No matter how far away you are, where you are, or who you are, the robot becomes your avatar and does what you want to do without you needing to be there in person. Meanwhile, humans cherish both real-world and virtual-world experiences. The robot avatar is therefore expected to perceive the real world and thereby help people sense both the real world and the virtual world without constraints of place.

This work takes a further step toward a transcontinental phygital twin system. Section 2 introduces the architecture of the proposed system, and Section 3 presents a validation case study. Conclusions are given in Section 4.

2 Phygital Twin

The phygital twin focuses on applying knowledge and abilities gained from human beings to a phygital avatar, i.e., transferring information from humans, who have subjective initiative, to a physical avatar (e.g., a robot). The integrated phygital twin system involves an operator located in Sweden and a robot avatar in China (Figure 1). A set of wearable devices captures the operator’s motion and provides feedback information. The robot receives the transmitted information and executes tasks in place of the human from a first-person perspective.

Figure 1 Phygital twin system for China–Sweden teleoperation

2.1 Interactive Interface

The interactive interface supports information transmission between the natural side and the artificial side of human-robot interaction. Intuitiveness, functionality, and affordability are considered with human well-being and rights in mind. Well-known traditional interface devices, such as touch joystick and keyboard, offer an easy and portable way to interact, but they place a cognitive burden on the user. Interfaces with a high level of autonomy (e.g., the brain-machine interface), in turn, place heavy demands on the autonomy and intelligence of both the interface and the robot; once that level is reached, it will offer an experience rarely seen before. For now, researchers are going further by developing intuitive interfaces for teleoperation tasks, such as wearable devices [14]. The developed inertial motion capture device (InMoCap) is employed for human motion acquisition. A wearable feedback device tracks the openness of the robotic gripper and reflects the degree of proximity sensed by the robot. Equipped with a vibrator and a Light Emitting Diode (LED), the wearable devices translate the operating states of the robotic grippers into feedback that serves as a reference for teleoperating the grippers.
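The translation from gripper state to wearable cues can be pictured with a small sketch. It is only an illustration under stated assumptions: the text does not say which cue carries which signal, so the assignment below (vibration for proximity, LED for a mostly closed gripper), the names `Feedback` and `gripper_feedback`, and the threshold values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feedback:
    vibration_duty: float  # 0.0 (off) to 1.0 (strongest)
    led_on: bool


def gripper_feedback(openness: float, proximity_m: float,
                     near_m: float = 0.10) -> Feedback:
    """Map gripper state to wearable cues.

    Assumed mapping (not taken from the paper): vibration grows as the
    sensed object comes closer than `near_m`; the LED lights while the
    gripper is less than half open.
    """
    if near_m <= 0.0:
        raise ValueError("near_m must be positive")
    openness = min(max(openness, 0.0), 1.0)
    # 0.0 when the object is at or beyond near_m, 1.0 when it touches.
    closeness = 1.0 - min(max(proximity_m / near_m, 0.0), 1.0)
    return Feedback(vibration_duty=round(closeness, 3), led_on=openness < 0.5)
```

For instance, a fully open gripper far from any object yields no vibration and an unlit LED, while a nearly closed gripper 5 cm from an object yields a half-strength vibration and a lit LED.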

Using inertial measurement units (IMUs), InMoCap measures the motion of the primary body segments. The device consists of IMU nodes, a hub microcontroller, a communication module, and a battery module. Each IMU node integrates a gyroscope, an accelerometer, and a magnetometer with a processing unit, and acquires acceleration, angular velocity, and magnetic field data from which human motion is obtained. The built-in microcontroller of each IMU node reads the motion data in real time and communicates with the hub microcontroller, which passes control commands to each IMU node and collects the motion data of the attached IMUs. To support multiple operating systems and application software environments, a corresponding data interface was developed to receive data over Transmission Control Protocol/Internet Protocol (TCP/IP). Figure 2 shows the operator and her animation as acquired by InMoCap and by an optical motion capture system (Qualisys, Sweden). The interface is embedded into the Robot Operating System (ROS) to interact with the robot avatar.
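A TCP/IP data interface of this kind might look like the following sketch. The frame layout is an assumption made for illustration, not the InMoCap wire format: one node identifier, a millisecond timestamp, and three float triplets for acceleration, angular velocity, and magnetic field. The names `FRAME`, `ImuSample`, `parse_frame`, `recv_exact`, and `read_sample` are hypothetical.

```python
import socket
import struct
from typing import NamedTuple, Tuple

# Hypothetical frame layout (NOT the InMoCap wire format), little-endian:
# node id (uint8), timestamp in ms (uint32), then accel, gyro and mag
# as three float32 values each.
FRAME = struct.Struct("<BI9f")

Vec3 = Tuple[float, float, float]


class ImuSample(NamedTuple):
    node_id: int
    t_ms: int
    accel: Vec3
    gyro: Vec3
    mag: Vec3


def parse_frame(buf: bytes) -> ImuSample:
    """Decode one fixed-size frame into an ImuSample."""
    node_id, t_ms, *v = FRAME.unpack(buf)
    return ImuSample(node_id, t_ms, tuple(v[0:3]), tuple(v[3:6]), tuple(v[6:9]))


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """TCP is a byte stream, so loop until exactly n bytes have arrived."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-frame")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def read_sample(sock: socket.socket) -> ImuSample:
    """Read and decode the next frame from a connected socket."""
    return parse_frame(recv_exact(sock, FRAME.size))
```

The explicit `recv_exact` loop matters because a single `recv` call may return only part of a frame; a receiver that assumes one call equals one frame will eventually misalign the stream.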

Figure 2 IMU-based human motion capture and visualization

2.2 Dual-Arm Robot

A robot is an appropriate choice of human avatar in the phygital twin. Robots excel at executing repetitive tasks and at performing independent, fast actions in structured or known environments. Building on advanced robotics technologies, this work proposes the future potential of “toward surrogate of human: robot avatar”, which allows human skills to be transferred to a remote place. With a robot avatar, the interaction space in which the operator completes manipulation tasks is extended.

The robot represents the avatar of the human operator in the physical system. Two six-degree-of-freedom KINOVA JACO2 robotic arms are mounted face to face at a slight V-shaped angle to imitate the configuration of human arms. The design of the robotic arms also approximates the smoothness and versatility of a fully functioning human arm. Each arm has a maximum continuous payload of 1.6 kg, a maximum workspace reach of 900 mm, and a maximum gripper force of 40 N, with a power consumption of 25 W. Each arm joint has a torque sensor for force and torque measurement at the robotic arms. To enrich the operational information, depth cameras are mounted perpendicular to the robotic grippers. The robot’s head captures images and streams video using one stabilized camera, placed midway between the two robotic arms to mimic the position of the human head. A high-definition image transmission system enables real-time, long-distance video transmission with ultra-low delay, giving the robot high-performance vision. The upper body of the robot avatar is mounted on a holonomic platform with four wheels and a laser radar, which enables positioning, navigation, obstacle avoidance, and omnidirectional movement.

2.3 Human-Robot Interaction

The phygital twin envisions a borderless, interactive, and fully immersive realm that encompasses avatars in the virtual world. Beyond the virtual domain, can we extend the world past its existing borders? The phygital twin thus proposes a “reconfiguration of world” from the viewpoint of humans and their physical avatars in both the physical world and the virtual world. It consists of the human, the physical system, and the cyber system [5].

The human operator is responsible for perception, decision, interaction, and control. The physical robot becomes the human’s avatar and promotes human-robot interaction. Sensor fusion algorithms and body segment calibration yield the orientations of the human body segments, with the hip regarded as the root node of the human global coordinate system. From the known segment orientations and the constructed skeleton model, the position and orientation of the human hands in the human global coordinate system are mapped to the robotic end-effectors through coordinate transformations between human and robot [15]. The converted motion data are transmitted to the robot and drive the path planning of the robotic arms. For human-robot communication, InMoCap and the robot connect to their respective Local Area Networks (LANs) by wired or wireless links. Each sub-network is controlled by its own sub-router, and the sub-networks are connected through a virtual private network. ROS is chosen as the unified platform for communication, control, and development. The operation station is represented as a ROS node that communicates with the robot avatar by sharing topics and services, which ensures the transmission of interactive data and services between human and robot.
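The mapping from segment orientations to an end-effector target can be sketched as a short kinematic chain. This is a simplified illustration rather than the method of [15]: it covers position only, assumes each segment orientation is a unit quaternion expressed in the hip-rooted global frame, and uses a single rotation, translation, and scale factor between the human frame and the robot base frame. All names and numeric values are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]  # (w, x, y, z), unit norm


def rotate(q: Quat, v: Vec3) -> Vec3:
    """Rotate v by unit quaternion q using v' = v + w*t + u x t, t = 2 (u x v)."""
    w, x, y, z = q
    tx = 2.0 * (y * v[2] - z * v[1])
    ty = 2.0 * (z * v[0] - x * v[2])
    tz = 2.0 * (x * v[1] - y * v[0])
    return (v[0] + w * tx + (y * tz - z * ty),
            v[1] + w * ty + (z * tx - x * tz),
            v[2] + w * tz + (x * ty - y * tx))


def hand_position(shoulder: Vec3,
                  segments: Sequence[Tuple[Quat, Vec3]]) -> Vec3:
    """Walk the chain shoulder -> elbow -> wrist in the hip-rooted frame.

    Each segment is (global orientation, bone vector in the segment's
    own frame), so rotating the bone vector gives its global direction.
    """
    p = shoulder
    for q, bone in segments:
        d = rotate(q, bone)
        p = (p[0] + d[0], p[1] + d[1], p[2] + d[2])
    return p


def human_to_robot(p: Vec3, q_hr: Quat, t_hr: Vec3, scale: float) -> Vec3:
    """p_robot = scale * R(q_hr) * p_human + t_hr.

    The scale factor absorbs the mismatch between human and robot arm
    lengths (an assumed, simplified mapping).
    """
    r = rotate(q_hr, p)
    return (scale * r[0] + t_hr[0],
            scale * r[1] + t_hr[1],
            scale * r[2] + t_hr[2])
```

Hand orientation would be handled in the same spirit by composing the wrist quaternion with the human-to-robot rotation; it is left out here to keep the sketch short.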

3 Validation

The phygital twin seeks to release the potential abilities of the avatar system and to combine the strengths of the operator and the robot. Inspired by the robot avatar, the phygital twin system is validated in a transcontinental demonstration involving the human operator and the robot system: a human wearing the InMoCap in Västerås, Sweden, controls the robot system located in Hangzhou, China.

Effective and reliable human motion and robotic motion lay the foundation for collaborative interaction. The characteristics of InMoCap are shown in Table 1. The computed human motion data were evaluated against the optical system as the gold standard. The Root Mean Square Error (RMSE) of the joint angles is 3.9° in static poses and 4.7° in dynamic movements. The Pearson correlation coefficient of the dynamic motion data is about 92.5%, and the average agreement certainty of the overall joint angles is about 97.8%. The RMSE between the IMU-based computed motion trajectories and the reference values is about 1.2 cm.
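For reference, the two headline metrics can be computed as in the sketch below. It assumes the IMU-based and optical series have already been time-aligned and resampled to the same length, which the text does not describe; the function names are illustrative.

```python
import math
from typing import Sequence


def rmse(estimate: Sequence[float], reference: Sequence[float]) -> float:
    """Root mean square error between two equally long series."""
    if len(estimate) != len(reference) or not estimate:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((e - r) ** 2 for e, r in zip(estimate, reference))
    return math.sqrt(squared / len(estimate))


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient, in the range [-1, 1]."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)
```

Applied to a joint-angle series in degrees, `rmse` returns an error in degrees; multiplying the result of `pearson` by 100 gives the percentage form used above.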

Table 1 Measured metrics of IMU-based motion capture system using optical system as standard

With the motion trajectories mapped from the human operator, the robotic arms replace human arms in operation tasks, for example, placing blocks into a container (Figure 3(a)). The active robotic arm takes on the primary part of the task, i.e., picking up the target and placing it down. By moving the container, the passive robotic arm collaborates with the active arm in placing the target. Furthermore, the dual-arm operation ability shows humans’ strength in interaction: the robot system approaches targeted objects and executes the task with two robotic arms working together. As shown in Figure 3(b), one robotic arm of the robot avatar fetches a bamboo slip from the desk, and the other robotic arm fetches a bamboo tube. The human operator needs to coordinate the two arms to put the bamboo slip into the tube. In the operation tasks, operators are equipped with basic feedback devices. For example, when the operator teleoperates the robot, the robot’s perception information is transmitted to the operator. Specifically, the operator needs to know whether the robotic grippers are open or not according to the results of grabbing tasks. Delivered through vibration and LED flashes, this direct feedback helps the operator make decisions conveniently.

Figure 3 Transcontinental operation tasks based on phygital twin

4 Discussions and Conclusions

This paper presents preliminary reflections on and discussion of the phygital twin as a novel human-robot interaction paradigm, including a preliminary framework, its components, and a validation case. The existing human-machine interaction paradigm is at a transitional stage in which multiple enabling technologies are available to extend and enrich the related conceptual framework. In this paper, the phygital twin facilitates interaction between the human and the robot avatar: the robot avatar replaces the human in executing multiple tasks while providing a greater sense of perception, feedback, and cognition. In particular, operation cases have been completed through transcontinental teleoperation. With increasing awareness of the need to bridge physical reality and cross-space physical reality, it is foreseeable that the phygital twin will overturn traditional ideas and become part of the next-generation human-machine paradigm.

Challenges remain in realizing and promoting the phygital twin system. To promote and popularize the phygital twin, it is critical to establish a comprehensive basic framework for implementing it at various industrial levels. Despite the optimism around the phygital twin, adoption of this vision in multiple areas is still at an early stage. It is expected to benefit from future developments across broader application areas, such as robotics, cognitive computing, and artificial intelligence. However, many challenges remain to be solved, such as the availability of data from which to build a phygital twin system and the safety issues related to integrating a robot avatar into a human-centric system. Moreover, the development of the phygital twin also requires legal and political support. Growing awareness of the phygital twin vision will certainly bring more discussion about how to handle these challenges and promote the phygital twin.

Data availability

Due to the nature of this research, participants of this study did not agree for their data to be shared publicly, so supporting data is not available.


  1. F Tao, H Zhang, A Liu, et al. Digital twin in industry: State-of-the-art. IEEE Trans. Ind. Inform., 2019, 15(4): 2405–2415.

  2. D Jones, C Snider, A Nassehi, et al. Characterising the Digital Twin: A systematic literature review. CIRP J. Manuf. Sci. Technol., 2020, 29: 36–52.

  3. A Rasheed, O San, T Kvamsdal. Digital twin: Values, challenges and enablers from a modeling perspective. IEEE Access, 2020, 8: 21980–22012.

  4. S Aheleroff, X Xu, R Y Zhong, et al. Digital twin as a service (DTaaS) in Industry 4.0: An architecture reference model. Adv. Eng. Inform., 2021, 47: 101225.

  5. B C Wang, H Y Zhou, G Yang, X Y Li, H Y Yang. Human Digital Twin (HDT) driven Human-Cyber-Physical Systems: Key technologies and applications. Chin. J. Mech. Eng., 2022, 35:11.

  6. J N Ortiz, P R Diaz, S Sendra, et al. A survey on 5G usage scenarios and traffic models. IEEE Commun. Surv. Tutor., 2020, 22(2): 905–929.

  7. B Kang, I Hwang, S Lee, et al. My being to your place, your being to my place: Co-present robotic avatars create illusion of living together. Proceedings of the 16th Annual International Conference on Mobile Systems, Applications, and Services, Germany, June 10–15, 2018: 54–67.

  8. H Y Zhou, G Yang, B C Wang, et al. An attention-based deep learning approach for inertial motion recognition and estimation in human-robot collaboration. J. Manuf. Syst., 2023, 67: 97–110.

  9. G Yang, H H Lv, Z Y Zhang, et al. Keep Healthcare workers safe: Application of teleoperated robot in isolation ward for COVID-19 prevention and control. Chin. J. Mech. Eng., 2020, 33:47.

  10. M Selvaggio, J Cacace, C Pacchierotti, et al. A shared-control teleoperation architecture for nonprehensile object transportation. IEEE Trans. Robot., 2022, 38(1): 569–583.

  11. N Feizi, R V Patel, M R Kermani, et al. Adaptive wave reconstruction through regulated-bmflc for transparency-enhanced telerobotics over delayed networks. IEEE Trans. Robot., 2022, 38(5): 2928–2942.

  12. J X Zhang, O Janeh, N Katzakis, et al. Evaluation of proxemics in dynamic interaction with a mixed reality avatar robot. International Conference on Artificial Reality and Telexistence & Eurographics Symposium on Virtual Environments, Japan, September 11–13, 2019: 37–44.

  13. H Beik-Mohammadi, M Kerzel, B Pleintinger, et al. Model mediated teleoperation with a hand-arm exoskeleton in long time delays using reinforcement learning. 29th IEEE International Conference on Robot and Human Interactive Communication, Italy, August 31–September 4, 2022: 713–720.

  14. B Fang, X Wei, F C Sun, et al. Skill learning for human-robot interaction using wearable device. Tsinghua Science and Technology, 2019, 24(6): 654–662.

  15. H Y Zhou, G Yang, H H Lv, et al. IoT-enabled dual-arm motion capture and mapping for telerobotics in home care. IEEE J. Biomed. Health Inform., 2020, 24(6): 1541–1549.


Supported by National Natural Science Foundation of China (Grant Nos. 51975513, 51890884), Zhejiang Provincial Natural Science Foundation of China (Grant No. LR20E050003), Major Research Plan of Ningbo Innovation 2025 of China (Grant No. 2020Z022), and Bellwethers Research and Development Plan of Zhejiang Province of China (Grant No. 2023C01045).

Author information


HZ: conceptualization; methodology; investigation; writing—review & editing. HL and RW: methodology; investigation. HW: resources. GY: methodology; resources; writing—review; supervision. All authors read and approved the final manuscript.

Author's information

Huiying Zhou is currently a Ph.D. candidate at State Key Laboratory of Fluid Power and Mechatronic Systems, School of Mechanical Engineering, Zhejiang University, China. Her main research interests include human-cyber-physical systems, wearable sensors, and human-robot interaction.

Honghao Lv is currently a Ph.D. candidate at State Key Laboratory of Fluid Power and Mechatronic Systems, School of Mechanical Engineering, Zhejiang University, China. His main research interests include advanced robotics and human-robot interaction.

Ruohan Wang is currently a Ph.D. candidate at State Key Laboratory of Fluid Power and Mechatronic Systems, School of Mechanical Engineering, Zhejiang University, China. Her main research interests include advanced robotics and human-robot interaction.

Haiteng Wu is currently working at Hangzhou Shenhao Technology Co., Ltd., China.

Geng Yang is currently a Professor at State Key Laboratory of Fluid Power and Mechatronic Systems, School of Mechanical Engineering, Zhejiang University, China, and Zhejiang Key Laboratory of Intelligent Operation and Maintenance Robot, China. He received his doctoral degree from the Royal Institute of Technology (KTH), Stockholm, Sweden. His main research interests include human-cyber-physical systems, flexible sensors, advanced robotics, artificial intelligence, and human-robot interaction.

Corresponding author

Correspondence to Geng Yang.

Ethics declarations

Competing Interests

The authors declare no competing financial interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


About this article


Cite this article

Zhou, H., Lv, H., Wang, R. et al. Revitalizing Human-Robot Interaction: Phygital Twin Driven Robot Avatar for China–Sweden Teleoperation. Chin. J. Mech. Eng. 36, 124 (2023).
