MPC - Model Predictive Control

Ver. 1.1.1 (2023-02-04)

This module provides a default implementation of model predictive control (MPC).

class mlpro.rl.pool.actionplanner.mpc.MPC(p_range_max=0, p_state_thsld=1e-08, p_logging=True)

Bases: ActionPlanner, ScientificObject, Async

Template class for MPC to be used as part of model-based planning agents. The goal is to find the best sequence of actions that leads to a maximum reward.

Parameters:

p_range_max (int) – Range of asynchronicity. Default = 0.
p_state_thsld (float) – Threshold on the metric distance between two states below which they are treated as equal. Default = 0.00000001.
p_logging – Log level (see constants of class Log). Default = Log.C_LOG_ALL.
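The role of p_state_thsld can be illustrated with a small standalone sketch. It does not use MLPro: the helper states_equal is hypothetical, and both the Euclidean metric and the inclusive comparison are assumptions made for illustration; the metric actually applied depends on the state space in use.

```python
import math

def states_equal(values_a, values_b, threshold=1e-8):
    """Hypothetical helper: treat two states as equal when their
    Euclidean distance does not exceed the threshold (assumed metric)."""
    return math.dist(values_a, values_b) <= threshold

# Two states that differ only by numerical noise count as the same state.
same = states_equal([1.0, 2.0], [1.0, 2.0 + 1e-9])
# A visible difference in one dimension makes them distinct.
different = states_equal([1.0, 2.0], [1.0, 2.1])
```

A small threshold such as the default keeps distinct states apart while tolerating floating-point noise in predicted states.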

C_TYPE = 'Model Predictive Control'

_plan_action(p_obs: State) → SARSBuffer

Custom planning algorithm to fill the internal action path (self._action_path). Search width and depth are restricted by the attributes self._width_limit and self._prediction_horizon. The default implementation utilizes MPC.

Parameters: p_obs (State) – Observation data.
Returns: action_path – Sequence of SARSElement objects whose actions lead to the best possible reward.
Return type: SARSBuffer
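The idea of a planner bounded in width and depth can be sketched independently of MLPro. The following is a minimal random-shooting example, not the implementation of this class: toy_model, plan_action_path and all of their parameters are hypothetical, with width_limit and horizon standing in for the attributes self._width_limit and self._prediction_horizon, and plain tuples standing in for SARSElement objects.

```python
import random

def toy_model(state, action):
    """Hypothetical 1-D environment model: returns (next_state, reward)."""
    next_state = state + action
    reward = -abs(next_state - 10.0)  # the closer to the target 10.0, the better
    return next_state, reward

def plan_action_path(state, model, actions, horizon, width_limit, rng):
    """Sample width_limit action sequences of length horizon, roll each one
    out through the model, and keep the path with the highest total reward."""
    best_path, best_return = [], float("-inf")
    for _ in range(width_limit):          # search width
        s, total, path = state, 0.0, []
        for _ in range(horizon):          # search depth
            a = rng.choice(actions)
            s_next, r = model(s, a)
            path.append((s, a, r, s_next))  # (state, action, reward, next state)
            total += r
            s = s_next
        if total > best_return:
            best_path, best_return = path, total
    return best_path, best_return

path, ret = plan_action_path(0.0, toy_model, [-1.0, 0.0, 1.0],
                             horizon=3, width_limit=50, rng=random.Random(0))
first_action = path[0][1]  # in receding-horizon control only this action is applied
```

In a receding-horizon setup, only the first action of the returned path would be executed before planning is repeated from the next observation.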

execute(**p_kwargs)

_async_subtask(p_tid: int, p_obs: State)