Grid World

Ver. 2.0.4 (2023-04-12)

This module provides a customizable grid world environment.

class mlpro.rl.pool.envs.gridworld.GridWorld(p_logging: bool = True, p_grid_size=(8, 8), p_random_start_position: bool = True, p_random_goal_position: bool = True, p_max_step: int = 50, p_action_type: int = 0, p_visualize=True, p_start_position=None, p_goal_position=None)

Bases: Environment

Custom n-D grid world environment in which the agent must reach a goal position that is either randomized or user-defined.

Parameters:
  • p_logging (bool) – Logging setting (see the constants of class Log). Default = Log.C_LOG_ALL.

  • p_grid_size (dimension) – Dimension of the grid world (n-D grid world), e.g. (8,8) for 2-D or (8,8,8) for 3-D. Default = (8,8)

  • p_random_start_position (bool) – Randomize start position. Default = True.

  • p_random_goal_position (bool) – Randomize goal position. Default = True.

  • p_max_step (int) – Maximum number of steps per episode. Default = 50.

  • p_action_type (int) – Type of actions, either continuous (C_ACTION_TYPE_CONT) or discrete (C_ACTION_TYPE_DISC_2D). Discrete actions are currently limited to 2-D grid worlds. Default = C_ACTION_TYPE_CONT.

  • p_start_position (dimension) – Start position, used if p_random_start_position is False, e.g. (3,2). Default = None.

  • p_goal_position (dimension) – Goal position, used if p_random_goal_position is False, e.g. (5,5). Default = None.
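The parameters above can be combined as in the following sketch. It only assembles keyword arguments: the helper gridworld_kwargs is hypothetical and not part of MLPro, the zero-based bounds check is an assumption about how positions are indexed, and the two action-type values are copied from the class constants C_ACTION_TYPE_CONT and C_ACTION_TYPE_DISC_2D. The commented lines at the end show how the arguments would presumably be passed on when MLPro is installed.

```python
# Action-type values copied from the class constants documented for GridWorld.
C_ACTION_TYPE_CONT = 0
C_ACTION_TYPE_DISC_2D = 1


def gridworld_kwargs(grid_size=(8, 8), start=None, goal=None,
                     max_step=50, action_type=C_ACTION_TYPE_CONT):
    """Hypothetical helper: assemble constructor arguments for GridWorld.

    A fixed start or goal switches off the corresponding randomization.
    """
    for name, pos in (("start", start), ("goal", goal)):
        if pos is None:
            continue
        if len(pos) != len(grid_size):
            raise ValueError(f"{name} must have {len(grid_size)} coordinates")
        # Assumption: positions are zero-based cell indices.
        if any(not 0 <= c < n for c, n in zip(pos, grid_size)):
            raise ValueError(f"{name} {pos} lies outside grid {grid_size}")
    if action_type == C_ACTION_TYPE_DISC_2D and len(grid_size) != 2:
        raise ValueError("discrete actions are limited to 2-D grids")
    return {
        "p_grid_size": grid_size,
        "p_random_start_position": start is None,
        "p_random_goal_position": goal is None,
        "p_start_position": start,
        "p_goal_position": goal,
        "p_max_step": max_step,
        "p_action_type": action_type,
        "p_visualize": False,
        "p_logging": False,
    }


kwargs = gridworld_kwargs(start=(3, 2), goal=(5, 5))

# With MLPro installed, the arguments would be passed on like this:
# from mlpro.rl.pool.envs.gridworld import GridWorld
# env = GridWorld(**kwargs)
```

Deriving the two p_random_* flags from whether a position was given keeps the flags and the positions from contradicting each other.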

C_NAME = 'Grid World'
C_LATENCY = datetime.timedelta(seconds=1)
C_INFINITY = 3.4028235e+38
C_REWARD_TYPE = 0
C_ACTION_TYPE_CONT = 0
C_ACTION_TYPE_DISC_2D = 1
static setup_spaces()

Static template method to set up and return state and action space of environment.

Returns:

  • state_space (MSpace) – State space object

  • action_space (MSpace) – Action space object

_setup_spaces()
_reset(p_seed=None) → None

Resets the environment.
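What a seeded reset could look like is sketched below, as a standalone illustration rather than MLPro's actual code. The function reset_positions and its arguments are hypothetical, and the rule that a randomized goal never coincides with the start is an assumption.

```python
import random


def reset_positions(grid_size, seed=None, start=None, goal=None):
    """Return (start, goal); positions left as None are drawn at random."""
    if (start is None or goal is None) and all(n == 1 for n in grid_size):
        raise ValueError("random placement needs a grid with at least two cells")
    rng = random.Random(seed)  # private generator, so a seed gives repeatable draws

    def draw():
        return tuple(rng.randrange(n) for n in grid_size)

    while True:
        s = tuple(start) if start is not None else draw()
        g = tuple(goal) if goal is not None else draw()
        # Assumption: a randomized goal should not coincide with the start.
        if s != g or (start is not None and goal is not None):
            return s, g


# The same seed reproduces the same episode layout.
first = reset_positions((8, 8), seed=42)
second = reset_positions((8, 8), seed=42)
```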

get_all_states()
_simulate_reaction(p_state: State, p_action: Action) → State

Custom method for a simulated state transition. Implement this method if no external state transition function is used. See method simulate_reaction() for further details.
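For orientation, a grid world state transition can be sketched in a few lines. This is not MLPro's implementation: step_position is a hypothetical function working on plain tuples instead of State and Action objects, and both the rounding of continuous action components to whole cells and the clipping at the grid border are assumptions.

```python
def step_position(position, move, grid_size):
    """Apply one move per axis and clip the result to the grid bounds."""
    return tuple(
        min(max(p + round(m), 0), n - 1)  # round to whole cells, then clip
        for p, m, n in zip(position, move, grid_size)
    )


# One cell along each axis from (3, 2) on an 8x8 grid gives (4, 3).
new_pos = step_position((3, 2), (1.0, 1.0), (8, 8))

# A move that would leave the grid is clipped at the border, giving (7, 0).
edge_pos = step_position((7, 0), (2.0, -1.0), (8, 8))
```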

_compute_reward(p_state_old: State, p_state_new: State) → Reward

Custom reward method. See method compute_reward() for further details.
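One common reward design for goal-reaching tasks is to reward progress towards the goal. The sketch below illustrates that idea only; the reward actually used by GridWorld is not described in this reference, and distance_reward with its tuple arguments is hypothetical.

```python
import math


def distance_reward(old_pos, new_pos, goal):
    """Positive when the step reduced the Euclidean distance to the goal."""
    return math.dist(old_pos, goal) - math.dist(new_pos, goal)


# Distance to the goal (3, 4) shrinks from 5.0 to 4.0, so the reward is 1.0.
reward = distance_reward((0, 0), (3, 0), (3, 4))
```

A step away from the goal yields a negative value, and a step that leaves the distance unchanged yields zero.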

_compute_success(p_state: State) → bool

Custom method to assess success. Implement this method if no external function is used. See method compute_success() for further details.

_compute_broken(p_state: State) → bool

Custom method to assess breakdown. Implement this method if no external function is used. See method compute_broken() for further details.
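The success and breakdown checks can be pictured with the following sketch. Both rules are assumptions made for illustration: success is taken to mean that the agent stands on the goal cell, and breakdown is taken to mean that the step budget given by p_max_step is used up. The functions is_success and is_broken are hypothetical and operate on plain values, not on State objects.

```python
def is_success(agent_pos, goal_pos):
    """Assumed rule: the episode succeeds once the agent is on the goal cell."""
    return tuple(agent_pos) == tuple(goal_pos)


def is_broken(num_steps, max_step=50):
    """Assumed rule: the episode breaks once the step budget is exhausted."""
    return num_steps >= max_step


reached = is_success((5, 5), (5, 5))   # True
timed_out = is_broken(50)              # True with the default budget of 50
```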

init_plot(p_figure=None)

Initializes the plot functionalities of the class.

Parameters:
  • p_figure (matplotlib.figure.Figure, optional) – Optional Matplotlib host figure in which the plot is embedded. The default is None.

  • p_plot_settings (PlotSettings) – Optional plot settings. If None, the default view is plotted (see attribute C_PLOT_DEFAULT_VIEW).

update_plot()

Updates the plot.

Parameters:
  • **p_kwargs – Implementation-specific plot data and/or parameters.