Generating Diverse Cooperative Agents by Learning Incompatible Policies

Charakorn, Rujikorn, Manoonpong, Poramate and Dilokthanakul, Nat (2023) Generating Diverse Cooperative Agents by Learning Incompatible Policies. In: International Conference on Learning Representations (ICLR).

Abstract

Training a robust cooperative agent requires diverse partner agents, but obtaining such agents is difficult. Previous works aim to learn diverse behaviors by changing the state-action distribution of agents. However, without information about the task's goal, the diversified agents are not guided to find other important, albeit sub-optimal, solutions: the agents might learn only variations of the same solution. In this work, we propose to learn diverse behaviors via policy compatibility. Conceptually, policy compatibility measures whether policies of interest can coordinate effectively. We theoretically show that incompatible policies are not similar. Thus, policy compatibility, which has been used exclusively as a measure of robustness, can be used as a proxy for learning diverse behaviors. We then incorporate the proposed objective into a population-based training scheme to allow concurrent training of multiple agents. Additionally, we use state-action information to induce local variations of each policy. Empirically, the proposed method consistently discovers more solutions than baseline methods across various multi-goal cooperative environments. Finally, in multi-recipe Overcooked, we show that our method produces populations of behaviorally diverse agents, which enables generalist agents trained with such a population to be more robust.

Item Type: Conference or Workshop Item (Paper)

Subjects:
Subjects > Computer Science > Artificial Intelligence
Subjects > Computer Science > Machine Learning
Subjects > Computer Science > Robotics

Deposited by: Nat Dilokthanakul

Date Deposited: 2024-03-07 14:09:47

Last Modified: 2024-08-27 23:18:02
