TY - GEN
T1 - MERCI
T2 - 2025 International Conference on Content-Based Multimedia Indexing, CBMI 2025
AU - Althubyani, Mohammed
AU - Meng, Zhijin
AU - Xie, Shengyuan
AU - Cruz, Francisco
AU - Razzak, Imran
AU - Prasad, Mukesh
AU - Sandoval, Eduardo B.
AU - Kocaballi, Baki
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - The integration of conversational agents into daily life has become increasingly common. However, sustaining deeply engaging and natural interactions remains challenging due to a lack of multimodal datasets capturing personal and emotional nuances. In this paper, we introduce MERCI (Multimodal dataset for Emotionally-aware peRsonalised Conversational Interactions), a dataset derived from user-robot dialogues involving thirty participants who completed user profile questionnaires covering ten personal topics (e.g., hobbies, music). A conversational system called PERCY then engaged with each participant in open-domain conversations, leveraging GPT-4, real-time facial-expression and sentiment analysis to generate contextually appropriate, empathetic responses. MERCI contains 1860 utterances, equating to about 12.5 hours of aligned audio, three-view video, transcripts with timestamps, emotion labels, and sentiment scores. This dataset serves as a reproducible test-bed for tasks such as emotion-aware response generation, multimodal affect recognition, and personalised policy learning. Baseline performance results have been established using advanced models such as BERT, T5, BART, and GPT-3.5/4/4o-mini across generation, regression, and classification. Evaluations through human and automated methods have demonstrated strong naturalness, relevance, and consistency in responses while indicating areas for enhanced personalisation and empathic depth. We expect that MERCI will enhance the development of emotionally intelligent, user-centric conversational AI applications, potentially ranging from social robotics to mental health support.
AB - The integration of conversational agents into daily life has become increasingly common. However, sustaining deeply engaging and natural interactions remains challenging due to a lack of multimodal datasets capturing personal and emotional nuances. In this paper, we introduce MERCI (Multimodal dataset for Emotionally-aware peRsonalised Conversational Interactions), a dataset derived from user-robot dialogues involving thirty participants who completed user profile questionnaires covering ten personal topics (e.g., hobbies, music). A conversational system called PERCY then engaged with each participant in open-domain conversations, leveraging GPT-4, real-time facial-expression and sentiment analysis to generate contextually appropriate, empathetic responses. MERCI contains 1860 utterances, equating to about 12.5 hours of aligned audio, three-view video, transcripts with timestamps, emotion labels, and sentiment scores. This dataset serves as a reproducible test-bed for tasks such as emotion-aware response generation, multimodal affect recognition, and personalised policy learning. Baseline performance results have been established using advanced models such as BERT, T5, BART, and GPT-3.5/4/4o-mini across generation, regression, and classification. Evaluations through human and automated methods have demonstrated strong naturalness, relevance, and consistency in responses while indicating areas for enhanced personalisation and empathic depth. We expect that MERCI will enhance the development of emotionally intelligent, user-centric conversational AI applications, potentially ranging from social robotics to mental health support.
KW - dialogue systems
KW - empathetic conversational agents
KW - human-robot interaction
KW - multimodal dataset
KW - personalisation
UR - https://www.scopus.com/pages/publications/105033149200
U2 - 10.1109/CBMI66578.2025.11339324
DO - 10.1109/CBMI66578.2025.11339324
M3 - Conference contribution
AN - SCOPUS:105033149200
T3 - CBMI 2025 - 2025 International Conference on Content-Based Multimedia Indexing, Conference Proceedings
BT - CBMI 2025 - 2025 International Conference on Content-Based Multimedia Indexing, Conference Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 22 October 2025 through 24 October 2025
ER -