Howto RL-008: Wrap a native MLPro environment class as an OpenAI Gym environment

Prerequisites

Please install the packages imported by the code below (MLPro and OpenAI Gym) to run this example properly.

Executable code

## -------------------------------------------------------------------------------------------------
## -- Project : MLPro - A Synoptic Framework for Standardized Machine Learning Tasks
## -- Package : mlpro
## -- Module  : howto_rl_008_wrap_mlpro_environment_to_gym_environment.py
## -------------------------------------------------------------------------------------------------
## -- History :
## -- yyyy-mm-dd  Ver.      Auth.    Description
## -- 2021-09-30  0.0.0     SY       Creation
## -- 2021-09-30  1.0.0     SY       Released first version
## -- 2021-10-04  1.0.1     DA       Minor fixes
## -- 2021-12-22  1.0.2     DA       Cleaned up a bit
## -- 2022-03-21  1.0.3     MRD      Use Gym Env Checker
## -- 2022-05-30  1.0.4     DA       Little refactoring
## -------------------------------------------------------------------------------------------------

"""
Ver. 1.0.4 (2022-05-30)

This module shows how to wrap a native MLPro environment class as an OpenAI Gym environment.
"""


from mlpro.bf.various import Log
from mlpro.wrappers.openai_gym import WrEnvMLPro2GYM
from mlpro.rl.pool.envs.gridworld import GridWorld
from gym.utils.env_checker import check_env



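# 1. Instantiate the native MLPro environment with full logging
mlpro_env   = GridWorld(p_logging=Log.C_LOG_ALL)

# 2. Wrap it as an OpenAI Gym environment (no explicit Gym spaces are passed here)
env         = WrEnvMLPro2GYM(mlpro_env, p_state_space=None, p_action_space=None)

# 3. Let Gym's environment checker validate the wrapped environment
check_env(env)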
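
To illustrate the general idea behind such a wrapper, the following is a minimal, self-contained sketch of the adapter pattern. It does not use MLPro or Gym and is not how WrEnvMLPro2GYM is implemented. The classes NativeGridEnv and GymStyleWrapper, their methods, and the reward rule are hypothetical names chosen for this illustration, and the classic Gym convention (reset() returns an observation, step() returns a 4-tuple) is assumed.

```python
class NativeGridEnv:
    """Hypothetical native environment with its own, non-Gym interface."""

    def __init__(self, size=4):
        self.size = size
        self.goal = (size - 1, size - 1)
        self.position = (0, 0)

    def restart(self):
        # Put the agent back into the start cell
        self.position = (0, 0)

    def apply(self, dx, dy):
        # Move the agent and clip the position to the grid boundaries
        x = min(max(self.position[0] + dx, 0), self.size - 1)
        y = min(max(self.position[1] + dy, 0), self.size - 1)
        self.position = (x, y)

    def goal_reached(self):
        return self.position == self.goal


class GymStyleWrapper:
    """Adapter exposing reset()/step() in the classic Gym 4-tuple convention."""

    # Discrete actions mapped to grid moves (assumption for this sketch)
    MOVES = {0: (1, 0), 1: (-1, 0), 2: (0, 1), 3: (0, -1)}

    def __init__(self, native_env):
        self._env = native_env

    def reset(self):
        self._env.restart()
        return self._env.position

    def step(self, action):
        dx, dy = self.MOVES[action]
        self._env.apply(dx, dy)
        done = self._env.goal_reached()
        reward = 1.0 if done else 0.0
        # observation, reward, done flag, info dictionary
        return self._env.position, reward, done, {}


wrapped = GymStyleWrapper(NativeGridEnv(size=3))
obs = wrapped.reset()
for action in (0, 0, 2, 2):
    obs, reward, done, info = wrapped.step(action)
print(obs, reward, done)  # prints: (2, 2) 1.0 True
```

The wrapper only translates calls; the native environment keeps all of its own logic, which is why code written against the Gym-style interface can drive it unchanged.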

Results

The native MLPro GridWorld environment is wrapped as an OpenAI Gym environment. Gym’s environment checker then confirms that the wrapping succeeded.

YYYY-MM-DD  HH:MM:SS.SSSSSS  I  Environment GridWorld: Instantiated
YYYY-MM-DD  HH:MM:SS.SSSSSS  I  Environment GridWorld: Instantiated
YYYY-MM-DD  HH:MM:SS.SSSSSS  I  Environment GridWorld: Operation mode set to 0
YYYY-MM-DD  HH:MM:SS.SSSSSS  I  Environment GridWorld: Reset
YYYY-MM-DD  HH:MM:SS.SSSSSS  I  Environment GridWorld: Reset
YYYY-MM-DD  HH:MM:SS.SSSSSS  I  Environment GridWorld: Start processing action
YYYY-MM-DD  HH:MM:SS.SSSSSS  I  Environment GridWorld: Actions of agent 0 = [3.415721893310547, -7.9934492111206055]
YYYY-MM-DD  HH:MM:SS.SSSSSS  I  Environment GridWorld: Action processing finished successfully
...

The environment checker exercises the environment repeatedly, so several more lines of action processing logs follow. If the checker detects no failure, the environment has been wrapped successfully.
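
The repeated action processing in the log can be understood from a rough sketch of what an environment checker does: it resets the environment, applies sampled actions, and validates each return value. The code below is a simplified illustration only and is not Gym's check_env. TinyEnv, simple_check, and the attributes n_actions, obs_low, and obs_high are hypothetical names, and the classic 4-tuple step() convention is assumed.

```python
import random


class TinyEnv:
    """Hypothetical Gym-style environment: a counter that terminates at 5."""

    n_actions = 2              # valid actions: 0 (stay) and 1 (increment)
    obs_low, obs_high = 0, 5   # bounds of the observation

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        self.state = min(self.state + action, self.obs_high)
        done = self.state == self.obs_high
        return self.state, float(done), done, {}


def simple_check(env, n_steps=20, seed=0):
    """Run basic sanity checks; raise AssertionError on the first violation."""
    rng = random.Random(seed)
    obs = env.reset()
    assert env.obs_low <= obs <= env.obs_high, "reset() observation out of bounds"

    steps_run = 0
    for _ in range(n_steps):
        # Sample a random valid action, as a checker typically would
        action = rng.randrange(env.n_actions)
        result = env.step(action)
        assert len(result) == 4, "step() must return 4 values"
        obs, reward, done, info = result
        assert env.obs_low <= obs <= env.obs_high, "step() observation out of bounds"
        assert isinstance(reward, float), "reward must be a float"
        assert isinstance(done, bool), "done must be a bool"
        assert isinstance(info, dict), "info must be a dict"
        steps_run += 1
        if done:
            env.reset()
    return steps_run


steps = simple_check(TinyEnv())
print(f"{steps} steps checked, no failure detected")
```

Because the checker samples its actions rather than following a policy, the logged action values look arbitrary, as in the output above.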