RL-ENV-ADA - Environment Models
Ver. 1.0.2 (2023-03-10)
This module provides model classes for adaptive environment models.
- class mlpro.rl.models_env_ada.AFctBase(p_afct_cls, p_state_space: ~mlpro.bf.math.basics.MSpace, p_action_space: ~mlpro.bf.math.basics.MSpace, p_input_space_cls=<class 'mlpro.bf.math.basics.ESpace'>, p_output_space_cls=<class 'mlpro.bf.math.basics.ESpace'>, p_output_elem_cls=<class 'mlpro.bf.math.basics.Element'>, p_threshold=0, p_buffer_size=0, p_ada: bool = True, p_visualize: bool = False, p_logging=True, **p_kwargs)
Bases:
Model
Base class for all special adaptive functions (state transition, reward, success, broken).
- Parameters:
p_afct_cls – Adaptive function class (compatible with class mlpro.sl.SLAdaptiveFunction)
p_state_space (MSpace) – State space of an environment or observation space of an agent
p_action_space (MSpace) – Action space of an environment or agent
p_input_space_cls – Space class used for the generated input space of the embedded adaptive function (compatible with class MSpace)
p_output_space_cls – Space class used for the generated output space of the embedded adaptive function (compatible with class MSpace)
p_output_elem_cls – Output element class (compatible with or inherited from class Element)
p_threshold (float) – Threshold for the difference between a set point and a computed output. Computed outputs with a difference less than this threshold will be assessed as ‘good’ outputs. Default = 0.
p_buffer_size (int) – Initial size of internal data buffer. Default = 0 (no buffering).
p_ada (bool) – Boolean switch for adaptivity. Default = True.
p_visualize (bool) – Boolean switch for visualisation. Default = False.
p_logging – Log level (see constants of class Log). Default: Log.C_LOG_ALL
p_kwargs (Dict) – Further model specific parameters (to be specified in child class).
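The p_threshold rule can be illustrated with a short sketch. The helper is_good_output below is hypothetical and not part of MLPro. It assumes that the difference is measured as a Euclidean distance and that the comparison is strict, as the wording "less than" suggests.

```python
import math


def is_good_output(p_setpoint, p_output, p_threshold=0.0):
    """Hypothetical helper (not part of MLPro): assess a computed output.

    Assumption: the difference is the Euclidean distance between the set
    point and the computed output, compared strictly against the threshold.
    """
    distance = math.dist(p_setpoint, p_output)
    return distance < p_threshold


setpoint = [1.0, 2.0]

# An output close to the set point passes a threshold of 0.1 ...
close_is_good = is_good_output(setpoint, [1.0, 2.05], p_threshold=0.1)

# ... while a distant output does not.
far_is_good = is_good_output(setpoint, [1.5, 2.5], p_threshold=0.1)

# With the default threshold of 0, even an exact match is not "less than" it.
exact_is_good_at_zero = is_good_output(setpoint, setpoint, p_threshold=0.0)
```

Under this strict reading, the default threshold of 0 marks no output as good. The text above does not state how the library handles that case.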
- _afct
Embedded adaptive function
- Type:
SLAdaptiveFunction
- C_TYPE = 'AFct Base'
- get_afct() SLAdaptiveFunction
- get_hyperparam() HyperParamTuple
Returns the internal hyperparameter tuple to get access to single values.
- switch_adaptivity(p_ada: bool)
Switches adaptation functionality on or off.
- Parameters:
p_ada (bool) – Boolean switch for adaptivity
- switch_logging(p_logging)
Sets new log level.
- Parameters:
p_logging – Log level (constant C_LOG_LEVELS contains valid values)
- set_random_seed(p_seed=None)
Resets the internal random generator using the given seed.
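The effect of reseeding can be sketched with Python's standard random module. ToySeededModel is a hypothetical stand-in and does not reflect how MLPro manages its random generator internally. It only shows that resetting with the same seed reproduces the same sequence.

```python
import random


class ToySeededModel:
    """Hypothetical stand-in (not an MLPro class) owning a private generator."""

    def __init__(self):
        self._rng = random.Random()

    def set_random_seed(self, p_seed=None):
        # Reset the private generator; None falls back to a system-derived seed.
        self._rng.seed(p_seed)

    def sample(self):
        return self._rng.random()


model = ToySeededModel()

model.set_random_seed(42)
first_run = [model.sample() for _ in range(3)]

model.set_random_seed(42)
second_run = [model.sample() for _ in range(3)]
```

Both runs contain identical values because the generator restarts from the same seed.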
- get_adapted() bool
Returns True if the model was adapted at least once, False otherwise.
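The interplay between switch_adaptivity() and get_adapted() can be sketched as follows. ToyAdaptiveModel is a hypothetical stand-in, not an MLPro class. It assumes that adapt() does nothing and returns False while adaptivity is switched off.

```python
class ToyAdaptiveModel:
    """Hypothetical stand-in (not an MLPro class) mimicking the switch semantics."""

    def __init__(self, p_ada=True):
        self._adaptivity = p_ada
        self._adapted = False

    def switch_adaptivity(self, p_ada):
        self._adaptivity = p_ada

    def adapt(self, **p_kwargs):
        # Assumption: adaptation is skipped entirely while the switch is off.
        if not self._adaptivity:
            return False
        self._adapted = True
        return True

    def get_adapted(self):
        return self._adapted


model = ToyAdaptiveModel(p_ada=False)

model.adapt()                          # switch is off, nothing is adapted
adapted_while_off = model.get_adapted()

model.switch_adaptivity(True)
model.adapt()                          # switch is on, the model adapts
adapted_after_on = model.get_adapted()
```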
- clear_buffer()
Clears internal buffer (if buffering is active).
- get_accuracy()
Determines the accuracy of the model.
- Returns:
accuracy – Accuracy of the model as a scalar value in interval [0,1]
- Return type:
float
- init_plot(p_figure: Figure | None = None, p_plot_settings: list = [], p_plot_depth: int = 0, p_detail_level: int = 0, p_step_rate: int = 0, **p_kwargs)
Initializes the plot functionalities of the class.
- Parameters:
p_figure (matplotlib.figure.Figure, optional) – Optional Matplotlib host figure in which the plot is embedded. The default is None.
p_plot_settings (list) – Optional plot settings. If empty, the default view is plotted (see attribute C_PLOT_DEFAULT_VIEW).
- update_plot(**p_kwargs)
Updates the plot.
- Parameters:
**p_kwargs – Implementation-specific plot data and/or parameters.
- class mlpro.rl.models_env_ada.AFctSTrans(p_afct_cls, p_state_space: ~mlpro.bf.math.basics.MSpace, p_action_space: ~mlpro.bf.math.basics.MSpace, p_input_space_cls=<class 'mlpro.bf.math.basics.ESpace'>, p_output_space_cls=<class 'mlpro.bf.math.basics.ESpace'>, p_output_elem_cls=<class 'mlpro.bf.systems.basics.State'>, p_threshold=0, p_buffer_size=0, p_ada: bool = True, p_visualize: bool = False, p_logging=True, **p_par)
Online adaptive version of a state transition function. See parent classes for further details.
- C_TYPE = 'AFct STrans'
- class mlpro.rl.models_env_ada.AFctSuccess(p_afct_cls, p_state_space: ~mlpro.bf.math.basics.MSpace, p_action_space: ~mlpro.bf.math.basics.MSpace, p_input_space_cls=<class 'mlpro.bf.math.basics.ESpace'>, p_output_space_cls=<class 'mlpro.bf.math.basics.ESpace'>, p_output_elem_cls=<class 'mlpro.bf.math.basics.Element'>, p_threshold=0, p_buffer_size=0, p_ada: bool = True, p_visualize: bool = False, p_logging=True, **p_kwargs)
Bases:
AFctBase, FctSuccess
Online adaptive version of a function that determines whether or not a state is a success state. See parent classes for further details.
- C_TYPE = 'AFct Success'
- class mlpro.rl.models_env_ada.AFctBroken(p_afct_cls, p_state_space: ~mlpro.bf.math.basics.MSpace, p_action_space: ~mlpro.bf.math.basics.MSpace, p_input_space_cls=<class 'mlpro.bf.math.basics.ESpace'>, p_output_space_cls=<class 'mlpro.bf.math.basics.ESpace'>, p_output_elem_cls=<class 'mlpro.bf.math.basics.Element'>, p_threshold=0, p_buffer_size=0, p_ada: bool = True, p_visualize: bool = False, p_logging=True, **p_kwargs)
Online adaptive version of a function that determines whether or not a state is a broken state. See parent classes for further details.
- C_TYPE = 'AFct Broken'
- class mlpro.rl.models_env_ada.AFctReward(p_afct_cls, p_state_space: ~mlpro.bf.math.basics.MSpace, p_action_space: ~mlpro.bf.math.basics.MSpace, p_input_space_cls=<class 'mlpro.bf.math.basics.ESpace'>, p_output_space_cls=<class 'mlpro.bf.math.basics.ESpace'>, p_output_elem_cls=<class 'mlpro.bf.math.basics.Element'>, p_threshold=0, p_buffer_size=0, p_ada: bool = True, p_visualize: bool = False, p_logging=True, **p_kwargs)
Online adaptive version of a reward function. See parent classes for further details.
- C_TYPE = 'AFct Reward'
- class mlpro.rl.models_env_ada.SARSElement(p_state: State, p_action: Action, p_reward: Reward, p_state_new: State)
Bases:
BufferElement
Element of a SARSBuffer.
- class mlpro.rl.models_env_ada.SARSBuffer(p_size=1)
Bases:
Buffer
State-action-reward-state (SARS) buffer, stored as a dictionary.
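A dictionary-backed SARS buffer can be sketched in plain Python. ToySARSElement and ToySARSBuffer are hypothetical stand-ins that do not reproduce the MLPro Buffer API. They assume one list per field and that the oldest entry is dropped once p_size is exceeded.

```python
from dataclasses import dataclass


@dataclass
class ToySARSElement:
    """Hypothetical stand-in for one state-action-reward-state record."""
    state: list
    action: list
    reward: float
    state_new: list


class ToySARSBuffer:
    """Hypothetical dictionary-backed buffer: one list per field, bounded by p_size."""

    def __init__(self, p_size=1):
        self._size = p_size
        self._data = {"state": [], "action": [], "reward": [], "state_new": []}

    def add_element(self, p_elem):
        for key in self._data:
            self._data[key].append(getattr(p_elem, key))
            # Assumption: the oldest entry is dropped once the buffer is full.
            if len(self._data[key]) > self._size:
                self._data[key].pop(0)

    def get_all(self):
        return self._data

    def __len__(self):
        return len(self._data["state"])


buffer = ToySARSBuffer(p_size=2)
for step in range(3):
    buffer.add_element(
        ToySARSElement(
            state=[step], action=[1], reward=float(step + 1), state_new=[step + 1]
        )
    )

stored_rewards = buffer.get_all()["reward"]
```

After three insertions into a buffer of size 2, only the two most recent records remain.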
- class mlpro.rl.models_env_ada.EnvModel(p_observation_space: MSpace, p_action_space: MSpace, p_latency: timedelta, p_afct_strans: AFctSTrans, p_afct_reward: AFctReward | None = None, p_afct_success: AFctSuccess | None = None, p_afct_broken: AFctBroken | None = None, p_ada: bool = True, p_init_states: State | None = None, p_visualize: bool = False, p_logging=True)
Environment model class as part of a model-based agent.
- Parameters:
p_observation_space (MSpace) – Observation space of related agent.
p_action_space (MSpace) – Action space of related agent.
p_latency (timedelta) – Latency of related environment.
p_afct_strans (AFctSTrans) – Mandatory external adaptive function for state transition.
p_afct_reward (AFctReward) – Optional external adaptive function for reward computation.
p_afct_success (AFctSuccess) – Optional external adaptive function for state assessment ‘success’.
p_afct_broken (AFctBroken) – Optional external adaptive function for state assessment ‘broken’.
p_ada (bool) – Boolean switch for adaptivity.
p_init_states (State) – Initial state of the environment model.
p_visualize (bool) – Boolean switch for env/agent visualisation. Default = False.
p_logging – Log level (see class Log for more details).
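How one mandatory state transition function and three optional functions fit together can be sketched with plain callables. ToyEnvModel is a hypothetical stand-in, not the MLPro EnvModel. In particular, the neutral defaults used for functions that are not supplied are an assumption of this sketch, not documented library behaviour.

```python
class ToyEnvModel:
    """Hypothetical stand-in (not the MLPro EnvModel): composes four callables."""

    def __init__(self, p_strans, p_reward=None, p_success=None, p_broken=None):
        self._strans = p_strans      # mandatory
        self._reward = p_reward      # optional
        self._success = p_success    # optional
        self._broken = p_broken      # optional

    def simulate(self, p_state, p_action):
        state_new = self._strans(p_state, p_action)
        # Assumption: functions that were not supplied yield neutral defaults.
        reward = self._reward(p_state, state_new) if self._reward else 0.0
        success = self._success(state_new) if self._success else False
        broken = self._broken(state_new) if self._broken else False
        return state_new, reward, success, broken


# Toy one-dimensional task: reach the value 10.
model = ToyEnvModel(
    p_strans=lambda state, action: state + action,
    p_reward=lambda state_old, state_new: -abs(state_new - 10),
    p_success=lambda state: state == 10,
)

state_new, reward, success, broken = model.simulate(7, 3)
```

Here the transition reaches the target state, so the success check fires, while the broken check falls back to its default because no function was supplied.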
- C_TYPE = 'EnvModel'
- C_NAME = 'Default'
- static setup_spaces()
Static template method to set up and return state and action space of environment.
- Returns:
state_space (MSpace) – State space object
action_space (MSpace) – Action space object
- get_cycle_limit() int
Returns limit of cycles per training episode.
- switch_adaptivity(p_ada: bool)
Switches adaptation functionality on or off.
- Parameters:
p_ada (bool) – Boolean switch for adaptivity
- adapt(**p_kwargs) bool
Reactivated adaptation mechanism. See method Model.adapt() for further details.
- get_adapted() bool
Returns True if the model was adapted at least once, False otherwise.
- get_accuracy()
Returns accuracy of environment model as average accuracy of the embedded adaptive functions.
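The averaging described above can be sketched in a few lines. The helper average_accuracy is hypothetical and not part of MLPro. It assumes that optional functions that were not provided (represented here as None) are left out of the average.

```python
def average_accuracy(p_accuracies):
    """Hypothetical helper: mean accuracy over the functions that are present."""
    present = [acc for acc in p_accuracies if acc is not None]
    if not present:
        return 0.0
    return sum(present) / len(present)


# Order: state transition, reward, success, broken.
# Only the first two functions are provided in this example.
model_accuracy = average_accuracy([1.0, 0.5, None, None])
```

With accuracies of 1.0 and 0.5 for the two embedded functions, the sketch yields 0.75.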
- clear_buffer()
Clears internal buffer (if buffering is active).