MPC - Model Predictive Control

Ver. 1.1.1 (2023-02-04)

This module provides a default implementation of model predictive control (MPC).

class mlpro.rl.pool.actionplanner.mpc.MPC(p_range_max=0, p_state_thsld=1e-08, p_logging=True)

Bases: ActionPlanner, ScientificObject, Async

Template class for MPC to be used as part of model-based planning agents. The goal is to find the best sequence of actions that leads to a maximum reward.

Parameters:
  • p_range_max (int) – Maximum range of asynchronicity. Default = 0.

  • p_state_thsld (float) – Threshold for metric difference between two states to be equal. Default = 0.00000001.

  • p_logging – Log level (see constants of class Log). Default = Log.C_LOG_ALL.
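The role of p_state_thsld can be illustrated with a small, self-contained sketch. It does not use MLPro itself: the helper name states_equal, the choice of the Euclidean metric, and the strict "less than" comparison are assumptions made for illustration only, and the library may compare states differently.

```python
import math

def states_equal(p_state1, p_state2, p_state_thsld=1e-08):
    """Hypothetical helper (not part of MLPro): treats two states as equal
    if their Euclidean distance is below the given threshold."""
    distance = math.dist(p_state1, p_state2)  # Euclidean metric is an assumption
    return distance < p_state_thsld

# Two states that differ only by numerical noise are treated as equal.
same = states_equal([0.5, 1.0], [0.5, 1.0 + 1e-10])

# A visible difference in one dimension exceeds the threshold.
different = states_equal([0.5, 1.0], [0.5, 1.001])
```

A small threshold such as the default 1e-08 mainly absorbs floating-point noise. A larger value would merge states that are merely close to each other.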

C_TYPE = 'Model Predictive Control'
_plan_action(p_obs: State) → SARSBuffer

Custom planning algorithm to fill the internal action path (self._action_path). Search width and depth are restricted by the attributes self._width_limit and self._prediction_horizon. The default implementation utilizes MPC.

Parameters:
  • p_obs (State) – Observation data.

Returns:
  action_path – Sequence of SARSElement objects with included actions that lead to the best possible reward.

Return type:
  SARSBuffer
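The idea of a planning step bounded by a search width and a prediction horizon can be sketched without MLPro as follows. This is a random-shooting variant of MPC under stated assumptions, not the library's implementation: the toy model, the action set ACTIONS, the goal TARGET, and the function plan_action are all hypothetical, and plain tuples (state, action, reward, next_state) stand in for SARSElement objects.

```python
import random

ACTIONS = (-1.0, 0.0, 1.0)   # hypothetical discrete action set
TARGET = 3.0                 # hypothetical goal state

def model(state, action):
    """Toy environment model (assumption): returns (next_state, reward)."""
    next_state = state + action
    reward = -abs(TARGET - next_state)   # closer to the target is better
    return next_state, reward

def plan_action(obs, prediction_horizon=5, width_limit=20, seed=0):
    """Random-shooting MPC sketch: simulate width_limit random action
    sequences of length prediction_horizon and keep the best one."""
    rng = random.Random(seed)
    best_path, best_return = [], float("-inf")
    for _path_id in range(width_limit):
        state, path, total = obs, [], 0.0
        for _step in range(prediction_horizon):
            action = rng.choice(ACTIONS)
            next_state, reward = model(state, action)
            path.append((state, action, reward, next_state))
            total += reward
            state = next_state
        if total > best_return:
            best_path, best_return = path, total
    return best_path, best_return

# Plan from the observation 0.0; in a receding-horizon setting only the
# first action of the returned path would be executed before replanning.
path, total = plan_action(obs=0.0)
first_action = path[0][1]
```

Increasing width_limit explores more candidate paths per planning step, while prediction_horizon sets how far ahead each candidate is simulated. Both raise the computational cost of a single planning call.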

execute(**p_kwargs)
_async_subtask(p_tid: int, p_obs: State)