PettingZoo

Ver. 2.2.0 (2023-03-26)

This module provides wrapper classes for PettingZoo multi-agent environments.

See also: https://pypi.org/project/PettingZoo/

class mlpro.wrappers.pettingzoo.WrEnvPZOO2MLPro(p_zoo_env, p_state_space: MSpace = None, p_action_space: MSpace = None, p_visualize: bool = True, p_logging=True)

Bases: Wrapper, Environment

This class is a ready-to-use wrapper class for PettingZoo environments. Objects of this type can be treated as an environment object. The encapsulated PettingZoo environment must be compatible with class pettingzoo.env.

Parameters:
  • p_zoo_env – PettingZoo environment object

  • p_state_space (MSpace) – Optional external state space object that meets the state space of the PettingZoo environment

  • p_action_space (MSpace) – Optional external action space object that meets the action space of the PettingZoo environment

  • p_visualize (bool) – Boolean switch for env/agent visualisation. Default = True.

  • p_logging – Log level (see constants of class Log). Default = Log.C_LOG_ALL.

C_TYPE = 'Wrapper PettingZoo2MLPro'
C_WRAPPED_PACKAGE = 'pettingzoo'
C_MINIMUM_VERSION = '1.20.0'
C_PLOT_ACTIVE: bool = True
C_SUPPORTED_MODULES = ['pettingzoo.classic', 'pettingzoo.butterfly', 'pettingzoo.atari', 'pettingzoo.mpe', 'pettingzoo.sisl']
_reduce_state(p_state: dict, p_path: str, p_os_sep: str, p_filename_stub: str)

The embedded PettingZoo environment itself cannot be pickled due to its dependency on Pygame. Therefore, the current environment instance needs to be removed before pickling the object.

See also: https://stackoverflow.com/questions/52336196/how-to-save-object-using-pygame-surfaces-to-file-using-pickle
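The general pattern behind this is Python's __getstate__/__setstate__ pickling hooks. The following is a minimal stdlib-only sketch of that pattern with a stand-in class, not MLPro's actual implementation: the unpicklable member is dropped before pickling and reattached afterwards.

```python
import pickle

class EnvHolder:
    """Stand-in for an object holding an unpicklable resource
    (here a lambda plays the role of the Pygame-based env)."""

    def __init__(self):
        self._zoo_env = lambda: None            # unpicklable member
        self.config = {'render_mode': 'human'}  # picklable configuration

    def __getstate__(self):
        state = self.__dict__.copy()
        del state['_zoo_env']                   # remove before pickling
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._zoo_env = lambda: None            # recreate after unpickling

restored = pickle.loads(pickle.dumps(EnvHolder()))
```

MLPro routes this through its _reduce_state()/_complete_state() hooks so that further custom data files can be written and read alongside the pickle file.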

_complete_state(p_path: str, p_os_sep: str, p_filename_stub: str)

Custom method to complete the object state (=self) from external data sources. This method is called by standard method __setstate__() during unpickling the object from an external file.

Parameters:
  • p_path (str) – Path of the object pickle file (and further optional related files)

  • p_os_sep (str) – OS-specific path separator.

  • p_filename_stub (str) – Filename stub to be used for further optional custom data files

_recognize_space(p_zoo_space, dict_name) → ESpace
static setup_spaces()

Static template method to set up and return state and action space of environment.

Returns:

  • state_space (MSpace) – State space object

  • action_space (MSpace) – Action space object

_reset(p_seed=None)

Custom method to reset the system to an initial/defined state. Use method _set_state() to set the state.

Parameters:

p_seed (int) – Seed parameter for an internal random generator

simulate_reaction(p_state: State, p_action: Action) → State

Simulates a state transition based on a state and an action. The simulation step itself is carried out either by an internal custom implementation in method _simulate_reaction() or by an embedded external function.

Parameters:
  • p_state (State) – Current state.

  • p_action (Action) – Action.

Returns:

Subsequent state after transition

Return type:

State
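Conceptually, the wrapper feeds the per-agent actions to the embedded AEC-style environment one agent at a time and packages the resulting observations as the subsequent state. The following stdlib-only sketch illustrates that flow with stand-in classes; the names ToyZooEnv and simulate_reaction are hypothetical and do not reflect MLPro's actual classes.

```python
class ToyZooEnv:
    """Minimal stand-in for an AEC-style environment with two agents."""

    def __init__(self):
        self.agents = ['agent_0', 'agent_1']
        self.agent_selection = 'agent_0'
        self._pos = {a: 0 for a in self.agents}

    def step(self, action):
        # apply the action of the currently selected agent ...
        agent = self.agent_selection
        self._pos[agent] += action
        # ... then hand control to the next agent (round robin)
        idx = self.agents.index(agent)
        self.agent_selection = self.agents[(idx + 1) % len(self.agents)]

    def observe(self, agent):
        return self._pos[agent]

def simulate_reaction(env, actions):
    """Apply one action per agent and return the new joint observation."""
    for agent, action in actions.items():
        env.step(action)
    return {a: env.observe(a) for a in env.agents}

new_state = simulate_reaction(ToyZooEnv(), {'agent_0': 2, 'agent_1': 3})
```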

compute_reward(p_state_old: State = None, p_state_new: State = None) → Reward

Computes a reward for the state transition, given by two successive states. The reward computation itself is carried out either by a custom implementation in method _compute_reward() or by an embedded adaptive function.

Parameters:
  • p_state_old (State) – Optional state before transition. If None the internal previous state of the environment is used.

  • p_state_new (State) – Optional state after transition. If None the internal current state of the environment is used.

Returns:

Reward object.

Return type:

Reward
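PettingZoo exposes one scalar reward per agent, whereas an MLPro Reward object can carry agent-individual values as well as an overall value. The following stand-in function (hypothetical, stdlib only) sketches one plausible aggregation; it is not MLPro's actual reward computation.

```python
def compute_reward(per_agent_rewards):
    """Collect per-agent scalar rewards into a record with an overall sum."""
    overall = sum(per_agent_rewards.values())
    return {'agents': dict(per_agent_rewards), 'overall': overall}

reward = compute_reward({'agent_0': 1.0, 'agent_1': -0.5})
```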

compute_success(p_state: State) → bool

Assesses whether the given state is a ‘success’ state. The assessment is carried out either by a custom implementation in method _compute_success() or by an embedded external function.

Parameters:

p_state (State) – State to be assessed.

Returns:

success – True, if the given state is a ‘success’ state. False otherwise.

Return type:

bool

compute_broken(p_state: State) → bool

Assesses whether the given state is a ‘broken’ state. The assessment is carried out either by a custom implementation in method _compute_broken() or by an embedded external function.

Parameters:

p_state (State) – State to be assessed.

Returns:

broken – True, if the given state is a ‘broken’ state. False otherwise.

Return type:

bool
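To make the success/broken distinction concrete, here are two hypothetical assessment rules, sketched in stdlib-only Python. Both the rules and the function names are illustrative assumptions; MLPro's actual criteria depend on the concrete environment.

```python
def compute_broken(terminations: dict) -> bool:
    # hypothetical rule: the state is 'broken' once every agent has terminated
    return bool(terminations) and all(terminations.values())

def compute_success(rewards: dict, threshold: float = 1.0) -> bool:
    # hypothetical rule: 'success' once the summed reward reaches a threshold
    return sum(rewards.values()) >= threshold

broken = compute_broken({'agent_0': True, 'agent_1': False})
success = compute_success({'agent_0': 0.6, 'agent_1': 0.5})
```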

init_plot(p_figure: Figure = None, p_plot_settings: list = Ellipsis, p_plot_depth: int = 0, p_detail_level: int = 0, p_step_rate: int = 0, **p_kwargs)

Initializes the plot functionalities of the class.

Parameters:
  • p_figure (Matplotlib.figure.Figure, optional) – Optional Matplotlib host figure, where the plot shall be embedded. The default is None.

  • p_plot_settings (PlotSettings) – Optional plot settings. If None, the default view is plotted (see attribute C_PLOT_DEFAULT_VIEW).

update_plot(**p_kwargs)

Updates the plot.

Parameters:

**p_kwargs – Implementation-specific plot data and/or parameters.

get_cycle_limit()

Returns limit of cycles per training episode.

class mlpro.wrappers.pettingzoo.WrEnvMLPro2PZoo(p_mlpro_env: Environment, p_num_agents, p_state_space: MSpace = None, p_action_space: MSpace = None, p_logging=True)

Bases: Wrapper

This class is a ready-to-use wrapper class for converting MLPro environments to PettingZoo environments. Objects of this type can be treated as an AECEnv object. The encapsulated MLPro environment must be compatible with class Environment. Note that this wrapper does not yet support parallel environments.

Parameters:
  • p_mlpro_env (Environment) – MLPro’s Environment object

  • p_num_agents (int) – Number of Agents

  • p_state_space (MSpace) – Optional external state space object that meets the state space of the MLPro environment

  • p_action_space (MSpace) – Optional external action space object that meets the action space of the MLPro environment

C_TYPE = 'Wrapper MLPro2PettingZoo'
C_WRAPPED_PACKAGE = 'pettingzoo'
C_MINIMUM_VERSION = '1.20.0'
class raw_env(p_mlpro_env, p_num_agents, p_state_space: MSpace = None, p_action_space: MSpace = None, p_render_mode='human')

Bases: AECEnv

metadata: dict[str, Any] = {'name': 'pzoo_custom', 'render_modes': ['human', 'ansi']}
_recognize_space(p_mlpro_space)
step(action)

Accepts and executes the action of the current agent_selection in the environment.

Automatically switches control to the next agent.
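The control switch after each step follows PettingZoo's round-robin agent selection. The class below is a stdlib-only stand-in modeled on that behavior, not PettingZoo's actual agent_selector utility.

```python
class AgentSelector:
    """Round-robin selection over a fixed agent list (illustrative sketch)."""

    def __init__(self, agents):
        self._agents = list(agents)
        self._idx = 0

    @property
    def selected(self):
        return self._agents[self._idx]

    def next(self):
        # advance to the next agent, wrapping around at the end
        self._idx = (self._idx + 1) % len(self._agents)
        return self.selected

sel = AgentSelector(['agent_0', 'agent_1', 'agent_2'])
order = [sel.selected, sel.next(), sel.next(), sel.next()]
```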

observe(agent_id)

Returns the observation that the given agent can currently make.

last() calls this method.

reset(seed, options)

Resets the environment to a starting state.

render(mode='human')

Renders the environment as specified by self.render_mode.

Render mode can be ‘human’ to display a window. Other render modes in the default environments are ‘rgb_array’, which returns a numpy array and is supported by all environments outside of classic, and ‘ansi’, which returns the strings printed (specific to classic environments).

close()

Closes any resources that should be released.

Closes the rendering window, subprocesses, network connections, or any other resources that should be released.

Cross References

  • Howto RL-006: Run own multi-agent with Petting Zoo environment

  • Howto RL-009: Wrap native MLPro environment class to PettingZoo environment