Learn to adapt to human walking: A Model-based
Reinforcement Learning Approach for a Robotic
Assistant Rollator
Georgia Chalvatzaki1 , Xanthi S. Papageorgiou2 , Petros Maragos1 and Costas S. Tzafestas1
Abstract—In this paper, we tackle the problem of adapting the motion of a robotic assistant rollator to patients with different mobility status. The goal is to achieve a coupled human-robot motion in a front-following setting, as if the patient were pushing the rollator him/herself. To this end, we propose a novel approach using Model-based Reinforcement Learning (MBRL) for adapting the control policy of the robotic assistant. This approach encapsulates our previous work on human tracking and gait analysis from RGB-D and laser streams into a human-in-the-loop decision-making strategy. We use Long Short-Term Memory (LSTM) networks for designing a Human Motion Intention Model (HuMIM) and a Coupling Parameters Forecast model, leveraging the outcome of human gait analysis. An initial LSTM-based policy network was trained via Imitation Learning (IL) from human demonstrations in a Motion Capture setup. This policy is then fine-tuned with the MBRL framework using tracking data from real patients. A thorough evaluation analysis proves the efficiency of the MBRL approach as a user-adaptive controller.

Index Terms—Human-Centered Robotics; Learning and Adaptive Systems; Automation in Life Sciences: Biotechnology, Pharmaceutical and Health Care

Fig. 1: The robotic agent observes the predicted human motion intention and learns through Model-based Reinforcement Learning to adapt its control actions accordingly.

Fig. 2: Left and middle: The prototype robotic assistant rollator, equipped with an RGB-D sensor for capturing the upper-body pose and a 2D laser sensor for detecting the legs' motion. Right: Example of the MoCap markers on an elderly user and a passive rollator, from which the data for imitation learning were derived.

I. INTRODUCTION
The development of robotic mobility assistants is a major research area with great impact on society. The constant increase of the aged population in recent years has created new challenges in the healthcare sector, making it difficult for the existing care and nursing staff to keep up with these evolving needs. The necessity for robotic assistants that help with elderly mobility and rehabilitation is clear. It has now been close to twenty years since the first robotic rollators emerged [1], [2]. An intelligent robotic mobility assistant should serve many purposes: postural support, gait analysis, sit-to-stand transfer, navigation, and cognitive assistance. Adaptation to user needs is important for seamless human-robot interaction in such applications.

In this paper, we tackle the problem of adapting the motion of a robotic rollator that moves along with an elderly user while staying in front of them. The applied control should comply with the user's needs whether the user wants to walk supported or unsupported by the rollator, i.e., whenever feeling confident, leaving the handles and walking along with the robot in front of them (Fig. 1). However, the robot should follow at a close distance in front of the user, not only to provide support whenever needed, but also to prevent possible falls.

Motivated by this need, and taking into account the variability in human walking, especially in pathological gait (e.g., ataxic and freezing gait types present different velocities and patterns), we propose a unified method for continuously monitoring each user and adapting the robotic platform's motion accordingly. We propose an MBRL method for adapting the robot's motion in front of the user. Fig. 1 gives an overview of the problem we aim to solve: the robotic assistant should infer the human's motion intention and learn a control policy using MBRL to select control actions that comply with the human's way of walking.

Manuscript received: February 24, 2019; Revised: May 31, 2019; Accepted: June 27, 2019. This paper was recommended for publication by Editor Allison Okamura upon evaluation of the Associate Editor and Reviewers' comments. This research work has been co-financed by the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH-CREATE-INNOVATE (i-Walk, project code: T1EDK-01248 / MIS: 5030856).
1 G. Chalvatzaki, C.S. Tzafestas and P. Maragos are with the School of Electrical and Computer Engineering, National Technical University of Athens, Greece (email: gchal@mail.ntua.gr).
2 X.S. Papageorgiou is with the Institute for Language and Speech Processing (ILSP / Athena R.C.), Greece (email: xpapag@ilsp.gr).
Digital Object Identifier (DOI): see top of this page.
IEEE ROBOTICS AND AUTOMATION LETTERS. PREPRINT VERSION. ACCEPTED MONTH, YEAR

We build upon our previous work regarding human tracking and gait analysis, fusing 2D laser data capturing the legs' motion [3] and RGB-D streams for upper-body pose estimation
using the OpenPose library [4], from sensors mounted on a robotic rollator (Fig. 2). Laser data are used to perform robust gait tracking and reliable on-line gait analysis, exploiting the high scanning frequency and precision of the laser sensor, while RGB-D streams provide additional information from which we can infer human gait stability [5]. In this work, we integrate the aforementioned methods into a human-in-the-loop control framework using MBRL for adapting the robot motion to each user.
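To make this architecture concrete, the human-in-the-loop control loop could be sketched as follows; all class and function names here are illustrative assumptions, not the paper's actual implementation:

```python
from collections import deque

class HumanInTheLoopController:
    """Illustrative sketch of a human-in-the-loop control loop:
    track the user, predict motion intention, then query a learned
    policy for a control action. Names are hypothetical."""

    def __init__(self, intention_model, policy, history_len=50):
        self.intention_model = intention_model  # e.g. an LSTM-based predictor
        self.policy = policy                    # learned control policy
        self.history = deque(maxlen=history_len)

    def step(self, observation):
        # 1) Keep a short history of fused tracking observations
        #    (laser-based gait tracking + RGB-D upper-body pose).
        self.history.append(observation)
        # 2) Predict the human's motion intention from the history.
        intention = self.intention_model(list(self.history))
        # 3) Query the policy for a control action, e.g. (v, omega).
        return self.policy(observation, intention)
```

During deployment, the same loop would log the tracked states and applied actions so that the control policy can be further fine-tuned with MBRL.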
In the human-robot coupled navigation context, our main contribution resides in a novel approach considering human motion intentions within an MBRL framework for the online motion adaptation of a robotic assistant in a challenging front-following scenario (Fig. 1). In this framework, we start by developing LSTM-based prediction models for estimating human motion intention using a history of motion tracking data. We then train models which associate the human motion orientation and the estimated stride length provided by gait analysis with the desired coupling parameters for the robot's heading and position, i.e., the desired separation distance and bearing in the human-robot frame. Further on, we use this information to train a policy for suggesting robot control actions according to the human motion intentions and the expected desired coupling. We developed an initial policy model trained with IL from human demonstrations, using data from motion markers (VICON system) which were placed on the human and a passive rollator frame in a series of data collection experiments (Fig. 2). Although such a model behaves well for the demonstrated cases and gives insight into how the user wants the platform to be placed in front of him/her while walking, this policy has no experience of recovering from drift cases or unexpected detection loss of the user. To cope with such situations, the proposed MBRL framework performs fine-tuning of the initial control policy (as seen in Fig. 3), while using random-sampling Model Predictive Control (MPC) for planning [6]–[8]. Detailed experimental results are presented in the paper, showing the efficiency of the proposed MBRL framework for the motion adaptation of a robotic assistant rollator using data from real patients.

Fig. 3: The Model-based Reinforcement Learning framework for policy adaptation using human motion intention predictions.

II. RELATED WORK

State-of-the-art research for robotic assistants mostly relies on admittance control schemes [9], [10]. A control strategy using human velocity and orientation as inputs was proposed in [11]. A formation control scheme for a robot-human following scenario was presented in [12], for safely navigating blind people. In our previous work [13], we considered a front-following problem with a kinematic controller adapting to users according to their pathological mobility class. A Reinforcement Learning (RL) shared control for a walking aid, with human intention prediction from force sensors, is presented in [14].

A lot of research focuses on social robot navigation [15], i.e., robot motion planning among crowds [16], using RL. Most methods for robot navigation require predictions of pedestrian motion for the robot to learn how to navigate among the pedestrians in a compliant way [17]. An interaction-aware motion prediction approach for pedestrians with an LSTM-based model for learning human motion behavior was presented in [18]. In [19], deep RL was used for navigating according to social norms across crowds, while in [20], RL is used for unfreezing the robot in the crowd by taking into account the coordination between robots and detected humans. In such cases, the robot does not accompany humans; rather, it learns how to move through and avoid collisions with them.

Regarding robotic companions, a method for human-robot navigation using the social force model and a Bayesian predictor for human motion is described in [21]. A model based on social force and human motion prediction is presented in [22], making robots capable of approaching people with a human-like behavior while walking in a side-by-side formation with a person and avoiding several pedestrians in the environment. An MPC technique that accounts for safety and comfort requirements for a robot accompanying a human in a search-and-rescue scenario is presented in [23].

The use of deep RL is prevalent in modern research aiming to plan robot motion [24] and control [25] for various tasks. Robot navigation systems which have integrated such RL decision-making schemes can be found in [26]–[28]. Approaches combining IL with RL for learning control policies are presented in [29], [30]. Although model-free RL approaches have many successful applications, they require large amounts of training data, which are often simulated; thus their applicability is limited. On the other hand, model-based RL first learns a model of the system and then trains a control policy using feedback [31]. MBRL has been used for robot control in both simulated and real-world experiments [32]–[34]. MBRL relies on MPC for planning control actions; thus using learned models along with MPC as a control policy is an active topic of RL and IL research [8], [35], [36]. We were inspired by recent advances in adaptive control using MBRL [7], [37]. In this work, we propose a novel MBRL framework for learning and adapting the control policy of a robotic assistant rollator to human walking. To the best of our knowledge, this is the first approach aiming to solve a front-following problem using MBRL and human motion prediction models, for either a robotic assistant or a robotic companion.

III. PRELIMINARIES

In RL, the goal is to learn a policy that proposes actions for an agent so as to maximize the sum of the expected future rewards [38]. Given the current state x_t ∈ X, the agent