Howto RL-008: Wrap a native MLPro environment class as an OpenAI Gym environment
Prerequisites
- Please install the packages imported by the example below (mlpro and gym) to run it properly.
Executable code
## -------------------------------------------------------------------------------------------------
## -- Project : MLPro - A Synoptic Framework for Standardized Machine Learning Tasks
## -- Package : mlpro
## -- Module : howto_rl_008_wrap_mlpro_environment_to_gym_environment.py
## -------------------------------------------------------------------------------------------------
## -- History :
## -- yyyy-mm-dd Ver. Auth. Description
## -- 2021-09-30 0.0.0 SY Creation
## -- 2021-09-30 1.0.0 SY Released first version
## -- 2021-10-04 1.0.1 DA Minor fixes
## -- 2021-12-22 1.0.2 DA Cleaned up a bit
## -- 2022-03-21 1.0.3 MRD Use Gym Env Checker
## -- 2022-05-30 1.0.4 DA Little refactoring
## -------------------------------------------------------------------------------------------------
"""
Ver. 1.0.4 (2022-05-30)
This module shows how to wrap a native MLPro environment class as an OpenAI Gym environment.
"""
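from mlpro.bf.various import Log
from mlpro.wrappers.openai_gym import WrEnvMLPro2GYM
from mlpro.rl.pool.envs.gridworld import GridWorld
from gym.utils.env_checker import check_env

# Instantiate the native MLPro environment with full logging
mlpro_env = GridWorld(p_logging=Log.C_LOG_ALL)

# Wrap it as an OpenAI Gym environment; no explicit state or action space is passed
env = WrEnvMLPro2GYM(mlpro_env, p_state_space=None, p_action_space=None)

# Validate the wrapped environment with Gym's environment checker
check_env(env)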
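To illustrate what "wrapping" means conceptually, the following is a minimal, dependency-free sketch of the adapter idea. It is not MLPro's implementation of WrEnvMLPro2GYM: the classes ToyNativeEnv and ToyGymStyleWrapper, their method names, and the reward rule are all hypothetical and exist only for this illustration. The sketch assumes the classic Gym convention in which step() returns four values (observation, reward, done, info); newer Gym releases may use a different signature.

```python
class ToyNativeEnv:
    """Hypothetical stand-in for a native environment (not an MLPro class)."""

    def __init__(self, size=4):
        self.size = size
        self.position = 0

    def reset_state(self):
        self.position = 0

    def process_action(self, delta):
        # Move along a 1-D track, clamped to the range [0, size - 1].
        self.position = max(0, min(self.size - 1, self.position + delta))

    def goal_reached(self):
        return self.position == self.size - 1


class ToyGymStyleWrapper:
    """Hypothetical adapter exposing reset()/step() in the classic Gym style."""

    def __init__(self, native_env):
        self._env = native_env

    def reset(self):
        self._env.reset_state()
        return self._env.position

    def step(self, action):
        # Delegate to the native environment, then translate its state
        # into the (observation, reward, done, info) tuple.
        self._env.process_action(action)
        done = self._env.goal_reached()
        reward = 1.0 if done else 0.0
        return self._env.position, reward, done, {}


env = ToyGymStyleWrapper(ToyNativeEnv(size=4))
first_obs = env.reset()
trajectory = [env.step(+1) for _ in range(3)]
```

The wrapper holds a reference to the native environment and only translates calls, so the native class stays unchanged and can still be used directly.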
Results
The native MLPro GridWorld environment is wrapped as an OpenAI Gym environment. Gym's environment checker is then used to confirm that the wrapping succeeded.
YYYY-MM-DD HH:MM:SS.SSSSSS I Environment GridWorld: Instantiated
YYYY-MM-DD HH:MM:SS.SSSSSS I Environment GridWorld: Instantiated
YYYY-MM-DD HH:MM:SS.SSSSSS I Environment GridWorld: Operation mode set to 0
YYYY-MM-DD HH:MM:SS.SSSSSS I Environment GridWorld: Reset
YYYY-MM-DD HH:MM:SS.SSSSSS I Environment GridWorld: Reset
YYYY-MM-DD HH:MM:SS.SSSSSS I Environment GridWorld: Start processing action
YYYY-MM-DD HH:MM:SS.SSSSSS I Environment GridWorld: Actions of agent 0 = [3.415721893310547, -7.9934492111206055]
YYYY-MM-DD HH:MM:SS.SSSSSS I Environment GridWorld: Action processing finished successfully
...
Several more lines of action processing logs follow, because the environment checker resets and steps the environment repeatedly. If check_env(env) completes without reporting a failure, the environment has been wrapped successfully.
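The idea of "no detected failure" can be sketched with a toy checker. This is not Gym's check_env and does not reproduce its actual checks: the function toy_check_env and the classes CountingEnv and BrokenEnv are hypothetical, and the sketch assumes the classic four-value step() return. It only shows the general pattern of exercising an environment and raising an error when the interface contract is violated.

```python
class CountingEnv:
    """Hypothetical toy environment with a classic Gym-style interface."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, float(action), done, {}


class BrokenEnv(CountingEnv):
    """Hypothetical environment that violates the step() contract."""

    def step(self, action):
        return self.t, 0.0  # wrong on purpose: only two values


def toy_check_env(env, n_steps=5):
    """Reset and step the environment; raise AssertionError on a violation.

    Returns the number of step() calls performed.
    """
    obs = env.reset()
    assert obs is not None, "reset() must return an observation"
    calls = 0
    for _ in range(n_steps):
        result = env.step(0)
        calls += 1
        assert isinstance(result, tuple) and len(result) == 4, \
            "step() must return a 4-tuple"
        obs, reward, done, info = result
        assert isinstance(reward, float), "reward must be a float"
        assert isinstance(done, bool), "done must be a bool"
        assert isinstance(info, dict), "info must be a dict"
        if done:
            env.reset()  # start a new episode and keep checking
    return calls


steps_run = toy_check_env(CountingEnv())

try:
    toy_check_env(BrokenEnv())
    broken_detected = False
except AssertionError:
    broken_detected = True
```

A checker of this kind calls reset() and step() many times, which is why a run with full logging produces many repeated log lines.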