Dynamic Games

[Image: MLPro-GT class diagram (MLPro-GT_class_diagram.drawio.png)]

Ver. 2.3.0 (2023-09-25)

This module provides model classes for tasks related to Game Theory in dynamic games.

class mlpro.gt.dynamicgames.basics.GameBoard(p_mode=0, p_latency: timedelta = None, p_fct_strans: FctSTrans = None, p_fct_reward: FctReward = None, p_fct_success: FctSuccess = None, p_fct_broken: FctBroken = None, p_mujoco_file=None, p_frame_skip: int = 1, p_state_mapping=None, p_action_mapping=None, p_camera_conf: tuple = (None, None, None), p_visualize: bool = False, p_logging=True)

Bases: Environment

Model class for a game-theoretical game board. See super class for more information.

C_TYPE = 'Game Board'
C_REWARD_TYPE = 1
_compute_reward(p_state_old: State, p_state_new: State) → Reward

Custom reward method. See method compute_reward() for further details.

_utility_fct(p_state: State, p_player_id)

Computes the utility of the given player. To be redefined.
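As a minimal, hypothetical sketch (MyGameBoard and its common-interest utility are assumptions for illustration, not part of MLPro), a custom game board redefines _utility_fct like this:

    from mlpro.bf.systems import State
    from mlpro.gt.dynamicgames.basics import GameBoard

    class MyGameBoard(GameBoard):
        """Hypothetical game board with a common-interest utility."""

        C_NAME = 'My Game Board'

        def _utility_fct(self, p_state: State, p_player_id):
            # Illustrative utility only: every player prefers states close
            # to the origin (a common-interest game). A complete board
            # would also set up its state/action spaces and redefine the
            # state transition (see super class Environment).
            return -sum(abs(v) for v in p_state.get_values())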

class mlpro.gt.dynamicgames.basics.Player(p_policy: Policy, p_envmodel: EnvModel = None, p_em_acc_thsld=0.9, p_action_planner: ActionPlanner = None, p_predicting_horizon=0, p_controlling_horizon=0, p_planning_width=0, p_name='', p_ada=True, p_visualize: bool = True, p_logging=True, **p_mb_training_param)

Bases: Agent

This class implements a game-theoretical player model. See super class for more information.

C_TYPE = 'Player'
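A player wraps a policy. The sketch below assumes a hypothetical random policy MyRandomPolicy and a game board instance board (e.g. the MyGameBoard sketch above) providing the spaces; it is an illustration, not MLPro's reference implementation:

    import numpy as np

    from mlpro.bf.systems import State, Action
    from mlpro.rl import Policy
    from mlpro.gt.dynamicgames.basics import Player

    class MyRandomPolicy(Policy):
        """Hypothetical policy sampling uniformly random actions."""

        C_NAME = 'MyRandomPolicy'

        def compute_action(self, p_obs: State) -> Action:
            # Sample one random value per action dimension
            values = np.random.rand(self._action_space.get_num_dim())
            return Action(self.get_id(), self._action_space, values)

        def _adapt(self, **p_kwargs) -> bool:
            # Nothing to learn in this sketch
            return False

    # board: a game board instance as sketched above
    policy = MyRandomPolicy(p_observation_space=board.get_state_space(),
                            p_action_space=board.get_action_space(),
                            p_logging=True)

    player = Player(p_policy=policy, p_name='Player 1', p_logging=True)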
class mlpro.gt.dynamicgames.basics.MultiPlayer(p_name: str = '', p_ada: bool = True, p_visualize: bool = False, p_logging=True)

Bases: MultiAgent

This class implements a game-theoretical model for a team of players. See super class for more information.

C_TYPE = 'Multi-Player'
add_player(p_player: Player, p_weight=1.0) → None
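Assembling a team is then a matter of adding players with optional weights (presumably scaling each player's influence on the joint result). A short sketch, with player_1 and player_2 assumed to be Player instances:

    from mlpro.gt.dynamicgames.basics import MultiPlayer

    team = MultiPlayer(p_name='Team A', p_ada=True, p_logging=True)
    team.add_player(p_player=player_1, p_weight=2.0)  # doubled weight
    team.add_player(p_player=player_2, p_weight=1.0)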
class mlpro.gt.dynamicgames.basics.Game(p_mode=0, p_ada: bool = True, p_cycle_limit=0, p_visualize: bool = True, p_logging=True)

Bases: RLScenario

This class implements a game consisting of a game board and a (multi-)player. See super class for more information.

C_TYPE = 'Game'
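Since Game is based on RLScenario, a concrete game typically redefines _setup() to create the game board and return the (multi-)player. A hedged sketch, reusing the hypothetical MyGameBoard and MyRandomPolicy classes from above:

    from mlpro.bf.ml import Model
    from mlpro.gt.dynamicgames.basics import Game, MultiPlayer, Player

    class MyGame(Game):
        """Hypothetical game wiring a board and a team of two players."""

        C_NAME = 'My Game'

        def _setup(self, p_mode, p_ada: bool, p_visualize: bool, p_logging) -> Model:
            # The game board acts as the environment of the scenario
            self._env = MyGameBoard(p_visualize=p_visualize, p_logging=p_logging)

            team = MultiPlayer(p_name='Team A', p_ada=p_ada, p_logging=p_logging)

            for no in range(2):
                policy = MyRandomPolicy(
                    p_observation_space=self._env.get_state_space(),
                    p_action_space=self._env.get_action_space(),
                    p_logging=p_logging)
                team.add_player(p_player=Player(p_policy=policy,
                                                p_name='Player ' + str(no),
                                                p_logging=p_logging),
                                p_weight=1.0)

            return team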
class mlpro.gt.dynamicgames.basics.GTTrainingResults(p_scenario: RLScenario, p_run, p_cycle_id, p_logging='W')

Bases: RLTrainingResults

Results of a GT training.

C_NAME = 'GT'
class mlpro.gt.dynamicgames.basics.GTTraining(**p_kwargs)

Bases: RLTraining

This class implements a standardized episodic training process. See super class for more information.

Parameters:
  • p_game_cls – Name of GT game class, compatible to/inherited from class Game.

  • p_cycle_limit (int) – Maximum number of training cycles (0=no limit). Default = 0.

  • p_cycles_per_epi_limit (int) – Optional limit of cycles per episode (0=no limit, -1=get environment limit). Default = -1.

  • p_adaptation_limit (int) – Maximum number of adaptations (0=no limit). Default = 0.

  • p_stagnation_limit (int) – Optional limit of consecutive evaluations without training progress. Default = 0.

  • p_eval_frequency (int) – Optional evaluation frequency (0=no evaluation). Default = 0.

  • p_eval_grp_size (int) – Number of evaluation episodes (eval group). Default = 0.

  • p_hpt (HyperParamTuner) – Optional hyperparameter tuner (see class mlpro.bf.ml.HyperParamTuner). Default = None.

  • p_hpt_trials (int) – Optional number of hyperparameter tuning trials. Default = 0. Must be > 0 if p_hpt is supplied.

  • p_path (str) – Optional destination path to store training data. Default = None.

  • p_collect_states (bool) – If True, the environment states will be collected. Default = True.

  • p_collect_actions (bool) – If True, the agent actions will be collected. Default = True.

  • p_collect_rewards (bool) – If True, the environment reward will be collected. Default = True.

  • p_collect_training (bool) – If True, global training data will be collected. Default = True.

  • p_visualize (bool) – Boolean switch for env/agent visualisation. Default = False.

  • p_logging – Log level (see constants of class mlpro.bf.various.Log). Default = Log.C_LOG_WE.

C_NAME = 'GT'
C_CLS_RESULTS

alias of GTTrainingResults
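Putting it together, a training run might look like the following sketch; MyGame is the hypothetical Game subclass from above, and the path and cycle limits are arbitrary choices:

    from mlpro.gt.dynamicgames.basics import GTTraining

    training = GTTraining(p_game_cls=MyGame,
                          p_cycle_limit=200,
                          p_cycles_per_epi_limit=10,
                          p_path='results',
                          p_visualize=False,
                          p_logging=True)

    training.run()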

Ver. 1.1.0 (2023-05-11)

This module provides model classes for potential games in the context of dynamic games.

class mlpro.gt.dynamicgames.potential.PGameBoard(p_mode=0, p_latency: timedelta = None, p_fct_strans: FctSTrans = None, p_fct_reward: FctReward = None, p_fct_success: FctSuccess = None, p_fct_broken: FctBroken = None, p_mujoco_file=None, p_frame_skip: int = 1, p_state_mapping=None, p_action_mapping=None, p_camera_conf: tuple = (None, None, None), p_visualize: bool = False, p_logging=True)

Bases: GameBoard

Model class for the game board of a potential game. See super class for more information.

C_TYPE = 'Potential Game Board'
compute_potential()

Computes (weighted) potential level of the game board.
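A hedged sketch of a potential game board; the utility is a made-up common-interest function, and whether compute_potential() returns the potential level or stores it internally should be checked against the implementation:

    from mlpro.bf.systems import State
    from mlpro.gt.dynamicgames.potential import PGameBoard

    class MyPGameBoard(PGameBoard):
        """Hypothetical potential game board."""

        C_NAME = 'My Potential Game Board'

        def _utility_fct(self, p_state: State, p_player_id):
            # Illustrative common-interest utility shared by all players
            return -sum(abs(v) for v in p_state.get_values())

    board = MyPGameBoard(p_logging=True)
    board.compute_potential()  # (weighted) potential level of the board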

Ver. 1.1.1 (2023-08-22)

This module provides model classes for Stackelberg games in the context of dynamic games.

class mlpro.gt.dynamicgames.stackelberg.GTPlayer_SG(p_policy: Policy, p_envmodel: EnvModel = None, p_em_acc_thsld=0.9, p_action_planner: ActionPlanner = None, p_predicting_horizon=0, p_controlling_horizon=0, p_planning_width=0, p_name='', p_ada=True, p_visualize: bool = True, p_logging=True, p_role: int = 0, **p_mb_training_param)

Bases: Player

This class implements a game-theoretical player model for the Stackelberg game mode, in which each player can be assigned the role of either leader or follower.

The leader(s) compute actions and adapt their policies before the followers. The followers then react to the actions selected by the leaders while computing their own actions and adapting their own policies. For followers, the actions selected by the leaders are therefore passed as an additional input to both the _adapt and compute_action methods.

Parameters:

p_role – Role of the player. Default = C_PLAYER_LEADER.

C_TYPE = 'GT Player SG'
C_PLAYER_LEADER = 0
C_PLAYER_FOLLOWER = 1
_adapt(**p_args) → bool

Default adaptation implementation of a single agent.

Parameters:
  • p_state (State) – State object.

  • p_reward (Reward) – Reward object.

Returns:

result – True, if something has been adapted. False otherwise.

Return type:

bool

compute_action(p_state: State, p_action_leaders=False) → Action

Default action computation of a single player.

Parameters:
  • p_state (State) – State of the related environment.

  • p_action_leaders – Actions already selected by the leaders; used as an additional input for followers (see class description). Default = False.

Returns:

action – Action object.

Return type:

Action
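A sketch of creating a leader and a follower; MyRandomPolicy and board are the hypothetical helpers from the sketches in the basics section above:

    from mlpro.gt.dynamicgames.stackelberg import GTPlayer_SG

    def make_policy():
        # MyRandomPolicy/board as sketched in the basics section above
        return MyRandomPolicy(p_observation_space=board.get_state_space(),
                              p_action_space=board.get_action_space(),
                              p_logging=True)

    leader = GTPlayer_SG(p_policy=make_policy(),
                         p_name='Leader 1',
                         p_role=GTPlayer_SG.C_PLAYER_LEADER,
                         p_logging=True)

    follower = GTPlayer_SG(p_policy=make_policy(),
                           p_name='Follower 1',
                           p_role=GTPlayer_SG.C_PLAYER_FOLLOWER,
                           p_logging=True)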

class mlpro.gt.dynamicgames.stackelberg.GTMultiPlayer_SG(p_name: str = '', p_ada: bool = True, p_visualize: bool = False, p_logging=True)

Bases: MultiPlayer

This class implements a game-theoretical multi-player model for the Stackelberg game mode.

C_TYPE = 'GT Multi-Player SG'
_adapt(**p_args) → bool

Default adaptation implementation of the multi-player.

Parameters:
  • p_state (State) – State object.

  • p_reward (Reward) – Reward object.

Returns:

result – True, if something has been adapted. False otherwise.

Return type:

bool

compute_action(p_state: State) → Action

Default action computation of the multi-player.

Parameters:

p_state (State) – State of the related environment.

Returns:

action – Action object.

Return type:

Action
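Finally, a sketch of grouping Stackelberg players into a multi-player, reusing the leader and follower from the previous sketch; as described above, leaders act and adapt before followers:

    from mlpro.gt.dynamicgames.stackelberg import GTMultiPlayer_SG

    team = GTMultiPlayer_SG(p_name='Stackelberg Team', p_ada=True, p_logging=True)
    team.add_player(p_player=leader, p_weight=1.0)
    team.add_player(p_player=follower, p_weight=1.0)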