Action Selection Methods in a Robotic Reinforcement Learning Scenario

  • Francisco Cruz
  • Peter Wuppen
  • Alvin Fazrie
  • Cornelius Weber
  • Stefan Wermter

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

15 Scopus citations

Abstract

Reinforcement learning allows an agent to learn a new task while autonomously exploring its environment. To do so, the agent chooses an action to perform from those available in a given state. A common problem for a reinforcement learning agent, however, is finding a proper balance between exploration and exploitation of actions in order to achieve optimal behavior. This paper compares multiple approaches to the exploration/exploitation dilemma in reinforcement learning and implements an exemplary reinforcement learning task in the domain of domestic robotics to show how different exploration policies perform on it. We perform the domestic task using ε-greedy, softmax, VDBE, and VDBE-Softmax with online and offline temporal-difference learning. The results show that the agent collects more reward, and collects it faster, when using the VDBE-Softmax exploration strategy with both Q-learning and SARSA.
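
As a rough illustration of two of the exploration strategies named in the abstract, the following minimal Python sketch shows ε-greedy and softmax (Boltzmann) action selection over a list of action values. It is not the authors' implementation: the function names, the `temperature` parameter, and the tie-breaking rule are assumptions made for this example, and the VDBE variants are not shown.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise pick a greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    best = max(q_values)
    # Assumption: ties among equally valued actions are broken at random.
    return rng.choice([a for a, q in enumerate(q_values) if q == best])


def softmax_probs(q_values, temperature):
    """Boltzmann distribution over actions; a higher temperature is more uniform."""
    top = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def softmax_action(q_values, temperature, rng=random):
    """Sample an action in proportion to its softmax probability."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

With `epsilon` set to 0 the first function always returns a greedy action, and with `epsilon` set to 1 it chooses uniformly at random; the softmax variant instead weights exploration toward actions with higher estimated value.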

Original language: English
Title of host publication: 2018 IEEE Latin American Conference on Computational Intelligence, LA-CCI 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781538646250
DOIs
State: Published - 23 Jan 2019
Event: 2018 IEEE Latin American Conference on Computational Intelligence, LA-CCI 2018 - Guadalajara, Mexico
Duration: 6 Nov 2018 → 9 Nov 2018

Publication series

Name: 2018 IEEE Latin American Conference on Computational Intelligence, LA-CCI 2018

Conference

Conference: 2018 IEEE Latin American Conference on Computational Intelligence, LA-CCI 2018
Country/Territory: Mexico
City: Guadalajara
Period: 6/11/18 → 9/11/18
