A goal of interactive machine learning (IML) is to enable people with no specialized training to intuitively teach intelligent agents how to perform tasks. Toward that goal, we study how the design of the interaction method for a Bayesian Q-Learning algorithm affects the human's experience of teaching the agent, measured with human-centric metrics such as frustration in addition to traditional ML performance metrics. This study investigated two methods of natural language instruction: critique and action advice. We conducted a human-in-the-loop experiment in which people trained two agents with different teaching methods but, unknown to each participant, the same underlying reinforcement learning algorithm. The results show that an agent that learns from action advice creates a better user experience than an agent that learns from binary critique in terms of frustration, perceived performance, transparency, immediacy, and perceived intelligence. We identified nine main characteristics of an IML algorithm's design that impact the human's experience with the agent: using human instructions about the future, compliance with input, empowerment, transparency, immediacy, a deterministic interaction, the complexity of the instructions, the accuracy of the speech recognition software, and the robust and flexible nature of the interaction algorithm.
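The two teaching methods differ in where the human signal enters the learning loop. As a minimal sketch (not the paper's Bayesian Q-Learning implementation; the tabular agent, environment, and parameters here are assumptions for illustration), action advice can steer action selection directly, while binary critique can be folded into the reward:

```python
import random

class AdvisedQLearner:
    """Tabular Q-learner accepting action advice and binary critique (a sketch)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.n_actions = n_actions

    def act(self, state, advice=None):
        # Action advice: comply with the human's suggested action when given.
        if advice is not None:
            return advice
        # Otherwise, epsilon-greedy over learned values.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: self.q[state][a])

    def update(self, s, a, r, s_next, critique=0):
        # Binary critique (+1 / -1 / 0) is folded into the reward signal.
        r += critique
        target = r + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])
```

Note how advice acts on behavior before learning happens, whereas critique only reshapes the value estimate after the fact; this asymmetry is one plausible reason advice feels more immediate and compliant to the teacher.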
The benefits of personalized social robots must be evaluated in real-world educational contexts over periods longer than a single session in order to understand their full potential to impact learning outcomes. In this work, we describe a personalization system designed for longer-term personalization that orders curriculum based on an adaptive Hidden Markov Model (HMM) that evaluates students' skill proficiencies. We present a study investigating the effectiveness of this system in a five-session interaction with a robot tutor, taking place over the course of two weeks. Our system is evaluated in the context of native Spanish-speaking first graders interacting with a social robot tutor while completing an English Language Learning (ELL) educational task. Participants received lessons either (1) ordered by our adaptive HMM personalization system, which selects a lesson based on a skill that the individual participant needs more practice with (``personalized condition''), or (2) ordered randomly from among the lessons the participant had not yet seen (``non-personalized condition''). We found that participants who received personalized lessons from the robot tutor outperformed participants who received non-personalized lessons on a post-test by 2.0 standard deviations on average, corresponding to a mean learning gain in the 98th percentile.
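The core loop of such a system is to update a per-skill proficiency estimate after each response and then pick the unseen lesson targeting the weakest skill. The study's system uses an adaptive HMM; the stand-in below tracks a simple running estimate instead, and the skill and lesson names are invented for illustration:

```python
def update_proficiency(prof, skill, correct, rate=0.3):
    """Move the proficiency estimate toward 1.0 on a correct answer, 0.0 otherwise."""
    target = 1.0 if correct else 0.0
    prof[skill] += rate * (target - prof[skill])
    return prof

def next_lesson(prof, lessons, seen):
    """Personalized condition: pick an unseen lesson drilling the weakest skill."""
    unseen = [name for name in lessons if name not in seen]
    return min(unseen, key=lambda name: prof[lessons[name]])

# Hypothetical curriculum: lesson name -> skill it exercises.
lessons = {"colors": "vocabulary", "plurals": "grammar", "letters": "spelling"}
prof = {"vocabulary": 0.8, "grammar": 0.2, "spelling": 0.5}
```

With these estimates, `next_lesson` selects "plurals" first, since grammar is the weakest skill; the non-personalized condition would instead draw uniformly from the unseen lessons.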
In this paper, we present a bi-directional communication scheme that facilitates interaction between a person and a mobile robot that follows the person. A person-following robot can assist people in many applications, including load carrying, elder care, and emotional support. However, commercially available personal robot systems usually have limited sensing and actuation capabilities. They are not expected to function perfectly in complex environments, and human intervention is required when the robot fails. We propose using a holdable mechatronic device to reduce the user's communication effort and enable natural interaction during the intervention. Our design of the holdable device consists of two parts: a haptic interface that displays touch cues to convey the robot's failure status via asymmetric vibrations, and a command interface for teleoperating the robot follower with hand gestures. We experimentally evaluated the device and the communication strategy in two sets of user studies in a controlled environment with a physical robot follower. Results show that with the proposed method, users are able to perform their tasks better, respond to robot failure events faster, and adjust their walking speed according to the robot's limitations. We also demonstrate that users can successfully teleoperate the robot to avoid obstacles when navigating in challenging environments.
As robots become increasingly prevalent in human environments, there will inevitably be times when the robot needs to interrupt a human to initiate an interaction. Our work introduces the first interruptibility-aware mobile-robot system, which uses social and contextual cues online to accurately determine when to interrupt a person. We evaluate multiple non-temporal and temporal models on the interruptibility classification task, and show that a variant of Conditional Random Fields (CRFs), the Latent-Dynamic CRF, is the most robust, accurate, and appropriate model for use on our system. Additionally, we evaluate different classification features, and show that a person's observed demeanor can help in interruptibility classification, while in the presence of detection noise, robust detection of object labels as a visual cue to the interruption context can improve interruptibility estimates. Finally, we deploy our system in a large-scale user study to understand the effects of interruptibility-awareness on human task performance, robot task performance, and human interpretation of the robot's social aptitude. Our results show that while participants are able to maintain task performance even in the presence of interruptions, interruptibility-awareness improves the robot's task performance and improves participants' social perceptions of the robot.
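The advantage of temporal models on this task comes from exploiting the sequence structure of the cues: a person's interruptibility rarely flips frame to frame. The deployed system uses a Latent-Dynamic CRF, which is not reproduced here; as a minimal stand-in, the sketch below shows the underlying intuition by smoothing noisy per-frame interruptibility predictions with a sliding majority vote:

```python
from collections import Counter

def smooth_predictions(frames, window=3):
    """Stabilize noisy per-frame labels (1 = interruptible, 0 = not)
    by majority vote over a sliding window of neighboring frames."""
    smoothed = []
    for i in range(len(frames)):
        lo = max(0, i - window // 2)
        hi = min(len(frames), i + window // 2 + 1)
        smoothed.append(Counter(frames[lo:hi]).most_common(1)[0][0])
    return smoothed
```

A single spurious "not interruptible" frame in an otherwise interruptible sequence gets voted out, which is (in simplified form) the kind of temporal consistency a sequence model learns automatically.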
Socially assistive robots can autonomously provide activity assistance to vulnerable populations, including those living with cognitive impairments. To provide effective assistance, these robots must be capable of displaying appropriate behaviors and personalizing them to a user's cognitive abilities. Our research focuses on the development of a novel robot learning architecture that uniquely combines learning from demonstration (LfD) and reinforcement learning (RL) to effectively teach socially assistive robots personalized behaviors in order to aid users during human-robot interaction. Caregivers can demonstrate a series of assistive behaviors for an activity to the robot, which it uses to learn general behaviors via LfD. This information is used to obtain initial assistive state-behavior pairings using a decision tree. Then, the robot uses RL to obtain a personalized policy in order to select the appropriate behavior to achieve a desirable user state based on the user's cognition level. Experiments were conducted with the socially assistive robot Casper to investigate the effectiveness of our proposed learning architecture. Results showed that Casper was able to learn personalized behaviors for the new assistive activity of tea-making, and that combining LfD and RL significantly reduces the time required for a robot to learn a new activity.
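The speed-up from combining LfD and RL comes from warm-starting: demonstrations give the RL stage an informed initial policy rather than a cold start. The sketch below is a hypothetical, much-simplified version of that idea (the state names, behaviors, and value-table form are invented; the actual architecture uses a decision tree over demonstrated state-behavior pairings):

```python
def seed_from_demonstrations(demos, behaviors, bonus=1.0):
    """Initialize a value table so demonstrated behaviors start favored."""
    q = {state: {b: 0.0 for b in behaviors} for state, _ in demos}
    for state, behavior in demos:
        q[state][behavior] = bonus
    return q

def rl_update(q, state, behavior, reward, alpha=0.2):
    """Personalize the policy from interaction feedback; reward reflects
    whether the behavior moved the user toward a desirable state."""
    q[state][behavior] += alpha * (reward - q[state][behavior])
    return q

# Hypothetical assistive behaviors and caregiver demonstrations.
behaviors = ["prompt", "demonstrate_step", "encourage"]
demos = [("idle", "prompt"), ("confused", "demonstrate_step")]
```

Starting from the demonstrated pairings, the RL updates only need to correct cases where the general behavior does not suit the individual user, rather than discovering the whole policy from scratch.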