Charakorn, Rujikorn; Manoonpong, Poramate; and Dilokthanakul, Nat. "n-LIPO: Framework for Diverse Cooperative Agent Generation using Policy Compatibility." IEEE Transactions on Artificial Intelligence.
Diverse training partners in multi-agent tasks are crucial for training a robust and adaptable cooperative agent. Prior methods often rely on state-action information to diversify partners' behaviors, but this can lead to minor variations instead of genuinely diverse behaviors and solutions. We address this limitation by introducing a novel training objective based on "policy compatibility." Our method learns diverse behaviors by encouraging agents within a team to be compatible with each other while being incompatible with agents from other teams. We theoretically prove that incompatible policies are inherently dissimilar, allowing us to use policy compatibility as a proxy for diversity. We call this method Learning Incompatible Policies for n-Player Cooperative Games (n-LIPO). We further diversify individual policies by incorporating a mutual information objective based on state-action information. We empirically demonstrate that n-LIPO effectively generates diverse joint policies in various two-player and multi-player cooperative environments. In a complex cooperative task, two-player multi-recipe Overcooked, we find that n-LIPO generates a population of behaviorally diverse partners. These populations are then used to train robust generalist agents that generalize better than agents trained with baseline populations. Finally, we demonstrate that n-LIPO can be applied to a high-dimensional StarCraft Multi-Agent Challenge (SMAC) multi-player cooperative environment to discover diverse winning strategies even when only a single goal exists. Additional visualizations can be accessed at https://sites.google.com/view/n-lipo/home.
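As a rough illustration only (not the authors' implementation), the compatibility objective described in the abstract can be sketched as rewarding a team's within-team (self-play) return while penalizing its best cross-play return against other teams, so that each team remains incompatible with the rest of the population. The function name `lipo_objective` and the trade-off weight `lam` are hypothetical names introduced here for the sketch:

```python
def lipo_objective(sp_return, xp_returns, lam=0.5):
    """Hypothetical sketch of a compatibility-based diversity objective.

    sp_return:  self-play return of team i (within-team compatibility,
                to be maximized)
    xp_returns: list of cross-play returns of team i when paired with
                each other team (cross-team compatibility, to be minimized)
    lam:        assumed trade-off hyperparameter
    """
    # Reward high self-play return; penalize the best cross-play pairing,
    # since a team is incompatible only if even its best pairing is poor.
    return sp_return - lam * max(xp_returns)

# Toy example: a team scores 10.0 in self-play and 4.0 / 6.0 when
# paired with two other teams.
print(lipo_objective(10.0, [4.0, 6.0], lam=0.5))  # 10 - 0.5 * 6 = 7.0
```

In practice the returns would be estimated from rollouts and the objective optimized with a policy-gradient method; this sketch only shows the shape of the trade-off between within-team and cross-team compatibility.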
Item Type:
Article
Subjects:
Subjects > Computer Science > Artificial Intelligence
Deposited by:
Nat Dilokthanakul
Date Deposited:
2025-07-03 12:00:09
Last Modified:
2025-07-03 13:39:23