Exploiting structures in reinforcement learning: multi-agent homogeneity, Euclidean symmetry, and natural languages

Tuesday, March 11, 2025 - 03:00 pm

Online

DISSERTATION DEFENSE

Author: Dingyang Chen

Advisor: Dr. Qi Zhang

Date: March 11, 2025

Time: 3:00 pm – 5:00 pm

Place: Zoom

Meeting Link: https://us06web.zoom.us/j/85964374977

Abstract

Reinforcement learning (RL) has emerged as a powerful paradigm for decision-making in complex environments. However, many RL tasks exhibit inherent structural properties—such as homogeneity, symmetry, and linguistic patterns—that are often underutilized, leading to inefficiencies in learning and generalization. This dissertation systematically exploits these structures to improve the efficiency, scalability, and robustness of RL algorithms across multi-agent and sequential decision-making settings.

First, we investigate homogeneity in multi-agent systems, where agents share similar roles and objectives. By leveraging this structure, we develop communication-efficient actor-critic methods for homogeneous Markov games, enabling scalable learning with reduced coordination overhead.

Second, we introduce Euclidean symmetry in RL, demonstrating how equivariant function approximators can significantly enhance sample efficiency and generalization in spatially structured tasks, such as robotic control.
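As a rough illustration of what equivariance means here (not the dissertation's architecture), a function f is rotation-equivariant if rotating its input rotates its output in the same way: f(Rx) = R f(x). The sketch below checks this numerically for a toy 2D policy. The names `equivariant_policy`, `shifted_policy`, and `equivariance_gap` are hypothetical and chosen only for this example.

```python
import math


def rotate(v, theta):
    """Rotate a 2D vector v = (x, y) counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = v
    return (c * x - s * y, s * x + c * y)


def equivariant_policy(state):
    """Hypothetical toy policy: rescale the state by a function of its norm.

    The norm is unchanged by rotation, so the output rotates with the input.
    """
    x, y = state
    r = math.hypot(x, y)
    gain = math.tanh(r) / (1.0 + r)
    return (gain * x, gain * y)


def shifted_policy(state):
    """Hypothetical counterexample: a fixed offset breaks rotational symmetry."""
    x, y = state
    return (x + 1.0, y)


def equivariance_gap(policy, state, theta):
    """Largest coordinate difference between f(R x) and R f(x)."""
    a = policy(rotate(state, theta))
    b = rotate(policy(state), theta)
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


if __name__ == "__main__":
    state, theta = (0.3, -1.2), 0.7
    print("equivariant gap:", equivariance_gap(equivariant_policy, state, theta))
    print("shifted gap:", equivariance_gap(shifted_policy, state, theta))
```

The first gap should be zero up to floating-point error, while the second should be clearly nonzero. An approximator that is equivariant by construction does not have to learn the rotated copies of a behavior separately, which is the intuition behind the sample-efficiency claim.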

Third, we integrate large language models (LLMs) into RL to improve sequential decision-making while avoiding expensive retraining. Our framework efficiently combines LLM inference with RL-based optimization, leading to better adaptability and reduced computational costs in contextual decision-making tasks.

Finally, we explore Markov Potential Games (MPGs), a subclass of multi-agent RL with inherent homogeneity. We develop best-response learning dynamics that mitigate non-stationarity and improve equilibrium quality, providing theoretical guarantees on convergence and the first known Price of Anarchy (PoA) bounds for policy gradient methods in MPGs.
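For readers unfamiliar with the two terms, the following is a sketch of the standard textbook definitions; the dissertation's exact formulation may differ. All notation is assumed for illustration: V_i is agent i's value function, Φ is the potential function, W is a nonnegative social welfare (for example, the sum of agents' values), and NE is the set of Nash equilibrium policies.

```latex
% Markov Potential Game: a unilateral policy change by agent i shifts
% agent i's value and the shared potential by the same amount.
\[
V_i^{(\pi_i, \pi_{-i})}(s) - V_i^{(\pi_i', \pi_{-i})}(s)
  = \Phi^{(\pi_i, \pi_{-i})}(s) - \Phi^{(\pi_i', \pi_{-i})}(s)
  \quad \text{for all } i,\ s,\ \pi_i,\ \pi_i',\ \pi_{-i}
\]

% Price of Anarchy: best achievable welfare relative to the worst equilibrium.
\[
\mathrm{PoA} = \frac{\max_{\pi} W(\pi)}{\min_{\pi \in \mathrm{NE}} W(\pi)} \ge 1
\]
```

A PoA close to 1 means that even the worst equilibrium is nearly as good as the welfare-optimal joint policy.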

Through extensive theoretical analysis and empirical validation on diverse benchmarks, this work demonstrates the power of structural exploitation in RL. By leveraging homogeneity, symmetry, and natural languages, this research lays the foundation for more efficient, generalizable, and scalable RL algorithms, with applications in multi-robot coordination, traffic management, recommendation systems, and strategic game playing.
