Agent-advising approaches in an interactive reinforcement learning scenario

Research output: Chapter in book/report/conference proceeding › Conference contribution › peer-reviewed

14 Citations (Scopus)

Abstract

Reinforcement learning has become one of the fundamental topics in the fields of robotics and machine learning. In this paper, we extend the classical reinforcement learning framework with external interaction that supports the learning process. To this end, we review several advising approaches proposed for interactive reinforcement learning, namely probabilistic advising, early advising, importance advising, and mistake correcting, and discuss their implications. Moreover, we implement these advice strategies in a simulated robotic scenario of a domestic cleaning task. The results show that the mistake correcting approach outperforms the purely probabilistic advice approach as well as the early and importance advising approaches: it collects more reward and converges faster.
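The abstract names the four advising strategies without defining them. As a rough illustration only, the sketch below uses formulations commonly found in the advising literature; these are assumptions, not the paper's stated method: probabilistic advising gives advice with a fixed probability at each step, early advising spends a limited advice budget on the first states visited, importance advising advises only in states where the trainer's action values differ widely, and mistake correcting additionally requires that the learner's intended action differs from the trainer's preferred one. The names `should_advise`, `importance`, `prob`, `threshold`, and `budget_left` are hypothetical.

```python
import random


def importance(teacher_q):
    """State importance: gap between the best and worst action value (assumed definition)."""
    return max(teacher_q) - min(teacher_q)


def should_advise(strategy, teacher_q, student_action, budget_left,
                  prob=0.25, threshold=0.5, rng=random):
    """Decide whether the trainer gives advice in the current state.

    teacher_q      -- trainer's action values for the current state (list of floats)
    student_action -- index of the action the learner intends to take
    budget_left    -- remaining pieces of advice (ignored by the probabilistic strategy)
    """
    if strategy == "probabilistic":
        # Advise with a fixed probability at every step (assumed: no budget).
        return rng.random() < prob
    if budget_left <= 0:
        return False
    if strategy == "early":
        # Spend the budget on the first states encountered.
        return True
    important = importance(teacher_q) >= threshold
    if strategy == "importance":
        # Advise only where the choice of action matters.
        return important
    if strategy == "mistake_correcting":
        # Advise only where it matters AND the learner is about to err.
        best = max(range(len(teacher_q)), key=teacher_q.__getitem__)
        return important and student_action != best
    raise ValueError(f"unknown strategy: {strategy}")
```

In a training loop, the caller would decrement `budget_left` each time the function returns `True` for a budgeted strategy and replace the learner's action with the trainer's preferred one.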

Original language: English
Title of host publication: 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 209-214
Number of pages: 6
ISBN (electronic): 9781538637159
DOI
Status: Published - 2 Jul 2017
Event: 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017 - Lisbon, Portugal
Duration: 18 Sep 2017 to 21 Sep 2017

Publication series

Name: 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017
Volume: 2018-January

Conference

Conference: 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017
Country/Territory: Portugal
City: Lisbon
Period: 18/09/17 to 21/09/17
