TY - GEN
T1 - Explainable Reinforcement Learning Using Introspection in a Competitive Scenario
AU - Opazo, Alfonso
AU - Ayala, Angel
AU - Barros, Pablo
AU - Fernandes, Bruno
AU - Cruz, Francisco
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Reinforcement learning (RL) is inspired by behavioral psychology and helps solve problems where no previous data are available; that is, the agent learns through trial and error by interacting with the environment. Explainable RL aims to address the trust and transparency concerns that people without technical knowledge may have about these systems. This work proposes a novel explainable reinforcement learning approach based on introspection in the Deep Q-network and Proximal Policy Optimization algorithms. The integration of the introspection method enables RL agents to assess the probability of success in a game based solely on the obtained Q-values. In this regard, the agent can measure how high the chance of winning is for each available action during the game using the output of the value function approximation. Finally, the introspection-based agents won several rounds during training, being more competitive than their opponents at different moments of the game. The computed probabilities of success showed that, although the agent completed a reasonable number of games and generated winning strategies, it could not maintain a constant rhythm and learning process.
KW - competitive environment
KW - human-robot interaction
KW - introspection-based explainability
KW - reinforcement learning
UR - https://www.scopus.com/pages/publications/85216540863
U2 - 10.1109/LA-CCI62337.2024.10814839
DO - 10.1109/LA-CCI62337.2024.10814839
M3 - Conference contribution
AN - SCOPUS:85216540863
T3 - 2024 IEEE Latin American Conference on Computational Intelligence, LA-CCI 2024 - Proceedings
BT - 2024 IEEE Latin American Conference on Computational Intelligence, LA-CCI 2024 - Proceedings
A2 - Orjuela-Canon, Alvaro David
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE Latin American Conference on Computational Intelligence, LA-CCI 2024
Y2 - 13 November 2024 through 15 November 2024
ER -