TY - GEN
T1 - Human Decision-Making Concepts with Goal-Oriented Reasoning for Explainable Deep Reinforcement Learning
AU - Lee, Chris
AU - Sandoval, Eduardo Benitez
AU - Cruz, Francisco
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
N2 - Recently, the development and integration of Artificial Intelligence (AI) have accelerated and become widespread throughout modern society. AI is becoming a powerful tool, with uses ranging from leisure to critical applications. However, due to the black-box nature of some AI approaches, such as Deep Reinforcement Learning (DRL), complex AI algorithms now face growing concerns about trust in ethical and responsible decision-making. Explainable Artificial Intelligence (XAI) is a subfield of AI focused on deriving interpretable information from otherwise incomprehensible statistics in order to generate explanations for an AI’s decisions. This paper proposes an architecture that combines two XAI techniques, Testing with Concept Activation Vectors (TCAV) and Reward Decomposition, to create goal-oriented explanations. The approach is evaluated in a simulated movement-prediction environment in which DRL agents are trained to represent different human concepts and goal prioritizations; within a human-centric framework, these concepts can be confidently distinguished between agents. The results demonstrate that our method allows users to insert their own high-level thinking into XAI and use it to generate explanations.
KW - Artificial Intelligence
KW - Explainable Artificial Intelligence
KW - Neural Networks
KW - Reinforcement Learning
KW - Reward Decomposition
UR - https://www.scopus.com/pages/publications/85210842022
U2 - 10.1007/978-981-96-0348-0_17
DO - 10.1007/978-981-96-0348-0_17
M3 - Conference contribution
AN - SCOPUS:85210842022
SN - 9789819603473
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 228
EP - 240
BT - AI 2024
A2 - Gong, Mingming
A2 - Song, Yiliao
A2 - Koh, Yun Sing
A2 - Xiang, Wei
A2 - Wang, Derui
PB - Springer Science and Business Media Deutschland GmbH
ER -
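
The abstract pairs Reward Decomposition (one Q-value head per reward component) with TCAV (concept sensitivity measured in a network's latent space). As a minimal sketch of how the two could interact, the hypothetical Python below probes each decomposed Q-head along a concept activation vector. The network shapes, the two reward components, and the mean-difference CAV (a stand-in for TCAV's trained linear classifier) are all illustrative assumptions, not the authors' implementation.

    # Minimal sketch (not the paper's code): reward decomposition
    # probed with a TCAV-style concept activation vector (CAV).
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical 2-component reward (e.g. "reach goal" vs "avoid collision").
    STATE_DIM, HIDDEN, N_ACTIONS, N_COMPONENTS = 4, 16, 3, 2

    W1 = rng.normal(0, 0.1, (STATE_DIM, HIDDEN))
    heads = [rng.normal(0, 0.1, (HIDDEN, N_ACTIONS)) for _ in range(N_COMPONENTS)]

    def hidden(s):
        return np.tanh(s @ W1)          # shared representation (the probed layer)

    def q_components(s):
        h = hidden(s)
        return np.stack([h @ Wh for Wh in heads])   # (components, actions)

    def explain_action(s):
        qc = q_components(s)
        a = int(np.argmax(qc.sum(axis=0)))          # greedy on the total Q-value
        return a, qc[:, a]                          # per-component contribution

    # A CAV is the normal of a linear separator between concept and random
    # activations; a mean-difference direction is used here for brevity.
    def cav(concept_states, random_states):
        d = hidden(concept_states).mean(0) - hidden(random_states).mean(0)
        return d / np.linalg.norm(d)

    def concept_sensitivity(s, v, component, action, eps=1e-4):
        # Directional derivative of one decomposed Q-value along the CAV.
        h = hidden(s)
        base = h @ heads[component]
        pert = (h + eps * v) @ heads[component]
        return (pert[action] - base[action]) / eps

    concept = rng.normal(1.0, 0.5, (32, STATE_DIM))   # stand-in "concept" states
    random_ = rng.normal(0.0, 0.5, (32, STATE_DIM))
    v = cav(concept, random_)

    s = rng.normal(size=STATE_DIM)
    a, contrib = explain_action(s)
    print("action:", a, "per-component Q:", contrib)
    for c in range(N_COMPONENTS):
        print(f"component {c} sensitivity to concept:", concept_sensitivity(s, v, c, a))

Under these assumptions, each sensitivity score indicates how strongly one decomposed objective's Q-value responds to a user-defined concept direction, which is the goal-oriented style of explanation the abstract describes.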