The Consensus Paradox: When Low Disagreement Leads to Catastrophic Failure in Multi-teacher Reinforcement Learning

Research output: Chapter in book/report/conference proceeding; Conference contribution; peer-reviewed

Abstract

In multi-teacher reinforcement learning, conventional wisdom suggests that combining expert knowledge through ensemble methods should improve performance. We reveal a striking paradox: in environments with changing goals, ensemble methods that achieve the highest agreement among teachers deliver the worst performance (32.3% success rate) – even worse than random teacher selection (34.5%). Through controlled experiments in a drifting grid world where four expert teachers guide a learning agent, we demonstrate that confidence-weighted voting creates false consensus by amplifying outdated expertise. Our analysis of 30 random seeds (F = 8957.6, p < 0.0001) shows that when environments change, teacher disagreement is not noise to be reduced but a valuable signal of adaptation. We introduce the Teacher Confusion Index (TCI) and Goal Coherence Score (GCS) to quantify this phenomenon, revealing a positive correlation (r = 0.277) between disagreement and performance. These findings challenge fundamental assumptions about ensemble learning in non-stationary environments, with implications for any multi-expert system facing concept drift.
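The false-consensus effect described in the abstract can be illustrated with a toy sketch. It is not the paper's implementation: the function names, the action labels, the confidence values, and the simple disagreement measure are all assumptions made for illustration, and the disagreement measure is not the TCI or GCS, whose definitions are not given here.

```python
from collections import defaultdict


def confidence_weighted_vote(advice):
    """Return the action with the largest summed confidence.

    `advice` is a list of (action, confidence) pairs, one per teacher.
    Hypothetical helper, not taken from the paper.
    """
    totals = defaultdict(float)
    for action, confidence in advice:
        totals[action] += confidence
    return max(totals, key=totals.get)


def disagreement(advice):
    """Fraction of teachers that do not recommend the most common action.

    A stand-in measure for illustration only; it is NOT the paper's TCI.
    """
    counts = defaultdict(int)
    for action, _ in advice:
        counts[action] += 1
    return 1.0 - max(counts.values()) / len(advice)


# Assumed scenario: the goal has just moved to the right. Three teachers
# still point confidently at the old goal; one teacher has adapted but is
# less confident.
advice = [("left", 0.9), ("left", 0.9), ("left", 0.8), ("right", 0.6)]

chosen = confidence_weighted_vote(advice)
spread = disagreement(advice)
print(chosen, spread)
```

In this assumed scenario the vote follows the three outdated teachers, and the low disagreement value hides the fact that the one dissenting teacher is the one tracking the new goal.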

Original language: English
Title of host publication: AI 2025
Subtitle of host publication: Advances in Artificial Intelligence - 38th Australasian Joint Conference on Artificial Intelligence, AI 2025, Proceedings
Editors: Miaomiao Liu, Xin Yu, Chang Xu, Yiliao Song
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 426-438
Number of pages: 13
ISBN (print): 9789819549719
DOI
Status: Published - 2026
Event: 38th Australasian Joint Conference on Artificial Intelligence, AI 2025 - Canberra, Australia
Duration: 1 Dec 2025 to 5 Dec 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16371 LNAI
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 38th Australasian Joint Conference on Artificial Intelligence, AI 2025
Country/Territory: Australia
City: Canberra
Period: 1/12/25 to 5/12/25
