Grid World
[[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 2, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]]
By default, the agent is placed in a two-dimensional grid world of size 8x8 and is tasked to reach the goal through position-increment actions. The user can customize the dimensions of the grid and set the maximum number of steps. The agent is represented by the number 1 and the goal by the number 2.
This Grid World environment can be imported via:
import mlpro.rl.pool.envs.gridworld
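A minimal setup could then look as follows. This is a sketch only: the class name GridWorld and the parameter names grid_size and max_step are assumptions based on the parameters mentioned on this page, so the exact constructor signature should be verified against the MLPro API reference.

from mlpro.rl.pool.envs.gridworld import GridWorld

# Hypothetical instantiation: grid_size and max_step are the parameters
# referenced in this documentation; verify names against the MLPro API.
env = GridWorld(grid_size=(8,8), max_step=50)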
Prerequisites
NumPy (the environment stores positions as NumPy arrays and uses them in the reward computation shown below)
General Information
Parameter | Value
---|---
Agents | 1
Native Source | MLPro
Action Space Dimension | Depends on grid_size
Action Space Base Set | Real numbers
Action Space Boundaries | Depends on grid_size
State Space Dimension | Depends on grid_size
State Space Base Set | Integer numbers
State Space Boundaries | Depends on grid_size
Reward Structure | Overall reward
Action Space
The action directly affects the location of the agent: each action is interpreted as a vector of increments added to the agent's current position, with one component per grid axis, so the action space dimension depends on the grid_size parameter. A sketch of this update rule follows below.
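The following is a minimal sketch of the update rule described above, not MLPro's internal code; rounding the real-valued increments and clipping to the grid boundaries are assumptions.

import numpy as np

# Sketch of the position update: add the increment vector to the current
# position, then round and clip to the grid (rounding/clipping assumed).
grid_size = np.array([8, 8])
agent_pos = np.array([1, 7])          # agent (value 1) in the grid above
action    = np.array([0.9, -1.2])     # real-valued increments, one per axis
agent_pos = np.clip(np.round(agent_pos + action), 0, grid_size - 1).astype(int)
print(agent_pos)                      # -> [2 6]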
State Space
The state space is initialized from the grid_size parameter, which can define as many dimensions as needed. For example, the agent can be placed in a three-dimensional world of size 4x4x4 by setting
grid_size = (4,4,4)
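For illustration, a three-dimensional grid following the encoding shown at the top of this page (1 = agent, 2 = goal) could be represented as below; the exact internal layout used by MLPro is an assumption based on the 2D example above.

import numpy as np

# Illustrative 4x4x4 state representation (layout assumed from the 2D example)
grid = np.zeros((4, 4, 4), dtype=int)
grid[0, 3, 1] = 1   # agent at an arbitrary cell
grid[2, 2, 3] = 2   # goal at an arbitrary cell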
Reward Structure
The reward is 1 when the agent is on the goal and the inverse of the Euclidean distance to the goal otherwise, so it grows as the agent approaches the goal; once the maximum number of steps is reached, a penalty of max_step is subtracted. The relevant snippet from the environment's reward computation:

reward = Reward(self.C_REWARD_TYPE)
# Start from 1 as a NumPy scalar so that .item() below works even when
# the agent is already on the goal (distance 0).
rew = np.float64(1)
euclidean_distance = np.linalg.norm(self.goal_pos - self.agent_pos)
if euclidean_distance != 0:
    # Reward grows as the agent gets closer to the goal
    rew = 1 / euclidean_distance
if self.num_step >= self.max_step:
    # Penalize exhausting the step budget
    rew -= self.max_step
reward.set_overall_reward(rew.item())
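For the 8x8 grid shown at the top of this page (agent at row 1, column 7; goal at row 6, column 3), the per-step reward works out as follows:

import numpy as np

goal_pos  = np.array([6, 3])
agent_pos = np.array([1, 7])
distance  = np.linalg.norm(goal_pos - agent_pos)   # sqrt(5**2 + 4**2) = sqrt(41)
print(1 / distance)                                # ~0.156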
Change Log
Version | Changes
---|---
1.0.8 | First public version