6.4.4. Multi-agents
Multi-agent reinforcement learning (MARL) extends RL to scenarios where multiple independent agents interact with each other and their environment to achieve a common goal or optimize their own individual rewards.
Unlike single-agent RL, where an agent’s decisions are based solely on its own observations and actions, multi-agent interactions introduce complexity, as each agent’s behavior depends on:
Its own actions and observations
The actions and observations of other agents
This dynamic interdependence makes cooperation, competition, and adaptation key challenges in multi-agent RL.
Multi-Agent RL in MLPro
MLPro-RL supports both single-agent and multi-agent RL, providing a structured approach to managing multiple agents within an environment.
Here are some key characteristics of multi-agent RL in MLPro:
Multi-Agent Model → Combines multiple single agents into a cohesive system
Independent Agent Policies → Each agent can have its own policy
Separate Observation & Action Spaces → Each agent operates within a unique portion of the multi-agent environment
Scalar Reward Per Agent → Each agent receives individual feedback on performance
Native & Third-Party Environments → Compatible with MLPro environments and PettingZoo environments (via wrapper class)