Grid World

[[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 2, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]]

By default, the agent is placed in a two-dimensional grid world of size 8x8, as illustrated above, and is tasked to reach the goal through position-increment actions. The user can customize the dimensions of the grid and set the maximum number of steps per episode. The agent is represented by the number 1 and the goal by the number 2.

This Grid World environment can be imported via:

import mlpro.rl.pool.envs.gridworld
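
A minimal setup could look like the sketch below. Note that the class name GridWorld and the constructor parameters p_grid_size and p_max_step are assumptions based on MLPro's naming conventions and the grid_size and max_step parameters described in this section; please check the API documentation for the exact signature.

import numpy as np
from mlpro.rl.pool.envs.gridworld import GridWorld   # class name assumed

# Hypothetical setup: an 8x8 grid with at most 50 steps per episode.
# The parameter names p_grid_size and p_max_step are assumptions.
env = GridWorld(p_grid_size=np.array([8, 8]),
                p_max_step=50)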

Prerequisites

NumPy

General Information

Parameter                  Value
Agents                     1
Native Source              MLPro
Action Space Dimension     Depends on grid_size
Action Space Base Set      Real number
Action Space Boundaries    Depends on grid_size
State Space Dimension      Depends on grid_size
State Space Base Set       Integer number
State Space Boundaries     Depends on grid_size
Reward Structure           Overall reward

Action Space

The action directly affects the location of the agent: the action values are interpreted as increments that are added to the current position. The dimension of the action space depends on the grid_size parameter, as shown in the sketch below.
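
The following standalone sketch illustrates this increment behaviour; the clipping of the new position to the grid boundaries is an assumption made for illustration, not a confirmed detail of the MLPro implementation.

import numpy as np

# Illustrative only: an increment action applied to the agent position on an 8x8 grid
grid_size = np.array([8, 8])
agent_pos = np.array([1, 7])          # agent position from the example grid above
action    = np.array([1.0, -1.0])     # real-valued increments, one per dimension

# Assumption: positions stay inside the grid boundaries
new_pos = np.clip(agent_pos + action, 0, grid_size - 1)
print(new_pos)                        # [2. 6.]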

State Space

The state space is initialized from the grid_size parameter, which can be set to as many dimensions as needed. For example, the agent can be placed in a three-dimensional world of size 4x4x4 by setting grid_size = (4,4,4), as sketched below.
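
Following that example, a three-dimensional setup could be sketched as follows; as above, the class name GridWorld and the parameter name p_grid_size are assumptions based on MLPro's conventions.

import numpy as np
from mlpro.rl.pool.envs.gridworld import GridWorld   # class name assumed

# Hypothetical 4x4x4 grid world; the parameter name p_grid_size is an assumption
env_3d = GridWorld(p_grid_size=np.array([4, 4, 4]))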

Reward Structure

The overall reward is the inverse of the Euclidean distance between the agent and the goal, with a penalty applied once the maximum number of steps is exceeded, as shown in the following excerpt from the environment's implementation:

reward = Reward(self.C_REWARD_TYPE)

# Reward is the inverse Euclidean distance between goal and agent;
# it equals 1 when the goal is reached
rew = 1
euclidean_distance = np.linalg.norm(self.goal_pos - self.agent_pos)
if euclidean_distance != 0:
    rew = 1 / euclidean_distance

# Penalty once the maximum number of steps per episode is exceeded
if self.num_step >= self.max_step:
    rew -= self.max_step

reward.set_overall_reward(rew.item())
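
As a sanity check of this formula, the reward for the example grid at the top of this page (agent at index (1,7), goal at index (6,3)) can be computed in a standalone snippet:

import numpy as np

# Positions taken from the example grid above
agent_pos = np.array([1, 7])
goal_pos  = np.array([6, 3])

euclidean_distance = np.linalg.norm(goal_pos - agent_pos)   # sqrt(5**2 + 4**2) ~ 6.40
rew = 1 / euclidean_distance                                # ~ 0.156
print(round(rew, 3))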

Change Log

Version    Changes
1.0.8      First public version

Cross Reference