Double Pendulum System

../../../../../../../_images/MLPro-BF-Systems-NativeDoublePendulumSystem_class_diagram.drawio.png

Ver. 1.0.1 (2023-03-08)

The Double Pendulum System is an implementation of the classic double pendulum control problem. The dynamics of the system are based on the double pendulum implementation by Matplotlib. The double pendulum is a system of two poles: the inner pole is connected to a fixed point at one end and to the outer pole at the other end. The native implementation of the Double Pendulum includes an input motor that provides torque in either direction to actuate the system.
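As a rough illustration of the dynamics referenced above, the following sketch restates the equations of motion used in the Matplotlib double pendulum gallery example in plain Python. It is not MLPro's own code: the function names `double_pendulum_derivs` and `euler_step` are hypothetical, the motor torque is left out, angles are assumed to be in radians measured from the downward vertical, and the explicit Euler step is chosen only for brevity.

```python
import math


def double_pendulum_derivs(state, l1=1.0, l2=1.0, m1=1.0, m2=1.0, g=9.8):
    """Time derivatives of (th1, w1, th2, w2) for an unforced double pendulum.

    Assumption: angles in radians from the downward vertical, no torque input.
    """
    th1, w1, th2, w2 = state
    delta = th2 - th1
    sin_d, cos_d = math.sin(delta), math.cos(delta)

    # Angular acceleration of the inner pole
    den1 = (m1 + m2) * l1 - m2 * l1 * cos_d * cos_d
    dw1 = (m2 * l1 * w1 * w1 * sin_d * cos_d
           + m2 * g * math.sin(th2) * cos_d
           + m2 * l2 * w2 * w2 * sin_d
           - (m1 + m2) * g * math.sin(th1)) / den1

    # Angular acceleration of the outer pole
    den2 = (l2 / l1) * den1
    dw2 = (-m2 * l2 * w2 * w2 * sin_d * cos_d
           + (m1 + m2) * g * math.sin(th1) * cos_d
           - (m1 + m2) * l1 * w1 * w1 * sin_d
           - (m1 + m2) * g * math.sin(th2)) / den2

    return (w1, dw1, w2, dw2)


def euler_step(state, dt=0.001, **params):
    """Advance the state by one explicit Euler step (illustrative only)."""
    derivs = double_pendulum_derivs(state, **params)
    return tuple(s + dt * d for s, d in zip(state, derivs))
```

A pendulum hanging straight down at rest, `(0.0, 0.0, 0.0, 0.0)`, has all derivatives equal to zero, which is a quick sanity check for the sign conventions assumed here.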

class mlpro.bf.systems.pool.doublependulum.DoublePendulumSystemRoot(p_mode=0, p_latency=None, p_max_torque=20, p_l1=1.0, p_l2=1.0, p_m1=1.0, p_m2=1.0, p_init_angles='random', p_g=9.8, p_fct_strans: FctSTrans | None = None, p_fct_success: FctSuccess | None = None, p_fct_broken: FctBroken | None = None, p_mujoco_file=None, p_frame_skip=None, p_state_mapping=None, p_action_mapping=None, p_camera_conf=None, p_history_length=5, p_visualize: bool = False, p_random_range: list | None = None, p_balancing_range: list = (-0.2, 0.2), p_swinging_outer_pole_range=(0.2, 0.5), p_break_swinging: bool = False, p_logging=True)

Bases: System

This is the root double pendulum system class, derived from the System class. It provides a four-dimensional state space, the underlying implementation of the double pendulum dynamics, and a default reward strategy.

Parameters:
  • p_mode – Mode of environment. Possible values are Mode.C_MODE_SIM (default) or Mode.C_MODE_REAL.

  • p_latency (timedelta) – Optional latency of environment. If not provided, the internal value of constant C_LATENCY is used by default.

  • p_max_torque (float, optional) – Maximum torque applied to pendulum. The default is 20.

  • p_l1 (float, optional) – Length of pendulum 1 in m. The default is 1.0.

  • p_l2 (float, optional) – Length of pendulum 2 in m. The default is 1.0.

  • p_m1 (float, optional) – Mass of pendulum 1 in kg. The default is 1.0.

  • p_m2 (float, optional) – Mass of pendulum 2 in kg. The default is 1.0.

  • p_init_angles (str, optional) – Initial position of the pendulum: C_ANGLES_UP starts it in an upright position, C_ANGLES_DOWN starts it in a downward position, and C_ANGLES_RND starts it from a random position. The default is C_ANGLES_RND ('random').

  • p_g (float, optional) – Gravitational acceleration. The default is 9.8

  • p_history_length (int, optional) – Historical trajectory points to display. The default is 5.

  • p_fct_strans (FctSTrans, optional) – The custom State transition function.

  • p_fct_success (FctSuccess, optional) – The custom Success Function.

  • p_fct_broken (FctBroken, optional) – The custom Broken Function.

  • p_mujoco_file (optional) – The corresponding mujoco file

  • p_frame_skip (optional) – Number of frames to be skipped for visualization.

  • p_state_mapping (optional) – State mapping configurations.

  • p_action_mapping (optional) – Action mapping configurations.

  • p_camera_conf (optional) – Camera configurations for mujoco specific visualization.

  • p_visualize (bool) – Boolean switch for visualisation. Default = False.

  • p_random_range (list) – The state space boundaries used when the environment is initialized randomly.

  • p_balancing_range (list) – The state space boundaries of the balancing region. The default is (-0.2, 0.2).

  • p_swinging_outer_pole_range – The state space boundaries of the region in which the outer pole is swinging. The default is (0.2, 0.5).

  • p_break_swinging (bool) – Whether the environment is considered broken outside the balancing region. The default is False.

  • p_logging – Log level (see constants of class mlpro.bf.various.Log). Default = True.
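To make the interplay of p_balancing_range and p_break_swinging more concrete, here is a minimal sketch in plain Python. It does not reproduce MLPro's implementation: the helper names `in_balancing_region` and `is_broken` are hypothetical, and it assumes that the range is applied to normalized pole angles, which the parameter descriptions above do not state explicitly.

```python
def in_balancing_region(norm_angles, balancing_range=(-0.2, 0.2)):
    """True if every normalized pole angle lies inside the balancing range.

    Assumption: the range bounds normalized angles of both poles.
    """
    low, high = balancing_range
    return all(low <= angle <= high for angle in norm_angles)


def is_broken(norm_angles, balancing_range=(-0.2, 0.2), break_swinging=False):
    """True only if breaking is enabled and the state left the balancing region."""
    return break_swinging and not in_balancing_region(norm_angles, balancing_range)
```

With the default `break_swinging=False`, a state outside the balancing region is never reported as broken in this sketch; enabling the switch turns the same state into a broken one.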

C_NAME = 'DoublePendulumSystemRoot'
C_SCIREF_TYPE = 'Online'
C_SCIREF_AUTHOR = 'John Hunter, Darren Dale, Eric Firing, Michael                                        Droettboom and the Matplotlib development team'
C_SCIREF_TITLE = 'The Double Pendulum Problem'
C_SCIREF_URL = 'https://matplotlib.org/stable/gallery/animation/double_pendulum.html'
C_PLOT_ACTIVE: bool = True
C_PLOT_DEFAULT_VIEW: str = '2D'
C_CYCLE_LIMIT = 0
C_LATENCY = datetime.timedelta(microseconds=40000)
C_ANGLES_UP = 'up'
C_ANGLES_DOWN = 'down'
C_ANGLES_RND = 'random'
C_VALID_ANGLES = ['up', 'down', 'random']
C_THRSH_GOAL = 0
C_ANI_FRAME = 30
C_ANI_STEP = 0.001
setup_spaces()

Method to set up the spaces for the Double Pendulum root environment. This method sets up a four-dimensional Euclidean space for the root DP environment.
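As a small illustration of what a four-dimensional Euclidean state space allows, the sketch below represents a state as a plain tuple and measures the distance between two states. The type name `State4`, its field names and their order (angle and angular velocity of the inner pole, then of the outer pole) are assumptions made for this example and are not taken from the MLPro source.

```python
import math
from typing import NamedTuple


class State4(NamedTuple):
    """Hypothetical layout of a four-dimensional double pendulum state."""
    theta1: float  # angle of the inner pole
    omega1: float  # angular velocity of the inner pole
    theta2: float  # angle of the outer pole
    omega2: float  # angular velocity of the outer pole


def euclidean_distance(a, b):
    """Euclidean distance between two states of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

A distance of this kind could, for example, be compared against a goal threshold to decide whether a target state has been reached.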

class mlpro.bf.systems.pool.doublependulum.DoublePendulumSystemS4(p_mode=0, p_latency=None, p_max_torque=20, p_l1=1.0, p_l2=1.0, p_m1=1.0, p_m2=1.0, p_init_angles='random', p_g=9.8, p_fct_strans: FctSTrans | None = None, p_fct_success: FctSuccess | None = None, p_fct_broken: FctBroken | None = None, p_mujoco_file=None, p_frame_skip=None, p_state_mapping=None, p_action_mapping=None, p_camera_conf=None, p_history_length=5, p_visualize: bool = False, p_random_range: list | None = None, p_balancing_range: list = (-0.2, 0.2), p_swinging_outer_pole_range=(0.2, 0.5), p_break_swinging: bool = False, p_logging=True)

Bases: DoublePendulumSystemRoot

This is the static four-dimensional Double Pendulum environment. It inherits the dynamics and the default reward strategy from the double pendulum root class.

Parameters:
  • p_mode – Mode of environment. Possible values are Mode.C_MODE_SIM (default) or Mode.C_MODE_REAL.

  • p_latency (timedelta) – Optional latency of environment. If not provided, the internal value of constant C_LATENCY is used by default.

  • p_max_torque (float, optional) – Maximum torque applied to pendulum. The default is 20.

  • p_l1 (float, optional) – Length of pendulum 1 in m. The default is 1.0.

  • p_l2 (float, optional) – Length of pendulum 2 in m. The default is 1.0.

  • p_m1 (float, optional) – Mass of pendulum 1 in kg. The default is 1.0.

  • p_m2 (float, optional) – Mass of pendulum 2 in kg. The default is 1.0.

  • p_init_angles (str, optional) – Initial position of the pendulum: C_ANGLES_UP starts it in an upright position, C_ANGLES_DOWN starts it in a downward position, and C_ANGLES_RND starts it from a random position. The default is C_ANGLES_RND ('random').

  • p_g (float, optional) – Gravitational acceleration. The default is 9.8

  • p_history_length (int, optional) – Historical trajectory points to display. The default is 5.

  • p_visualize (bool) – Boolean switch for visualisation. Default = False.

  • p_plot_level (int) – Types and number of plots to be plotted: C_PLOT_DEPTH_ENV plots only the environment, C_PLOT_DEPTH_REWARD plots only the reward, and C_PLOT_ALL plots both the reward and the environment. Default = ALL.

  • p_rst_balancing – Reward strategy to be used for the balancing region of the environment

  • p_rst_swinging – Reward strategy to be used for the swinging region of the environment

  • p_reward_weights (list) – List of weights to be added to the dimensions of the state space for reward computation

  • p_reward_trend (bool) – Boolean value stating whether to plot reward trend

  • p_reward_window (int) – The number of latest rewards to be shown in the plot. Default is 0

  • p_random_range (list) – The state space boundaries used when the environment is initialized randomly.

  • p_balancing_range (list) – The state space boundaries of the balancing region. The default is (-0.2, 0.2).

  • p_break_swinging (bool) – Whether the environment is considered broken outside the balancing region. The default is False.

  • p_logging – Log level (see constants of class mlpro.bf.various.Log). Default = True.

C_NAME = 'DoublePendulumSystemS4'
class mlpro.bf.systems.pool.doublependulum.DoublePendulumSystemS7(p_mode=0, p_latency=None, p_max_torque=20, p_l1=1.0, p_l2=1.0, p_m1=1.0, p_m2=1.0, p_init_angles='random', p_g=9.8, p_fct_strans: FctSTrans | None = None, p_fct_success: FctSuccess | None = None, p_fct_broken: FctBroken | None = None, p_mujoco_file=None, p_frame_skip=None, p_state_mapping=None, p_action_mapping=None, p_camera_conf=None, p_history_length=5, p_visualize: bool = False, p_random_range: list | None = None, p_balancing_range: list = (-0.2, 0.2), p_swinging_outer_pole_range=(0.2, 0.5), p_break_swinging: bool = False, p_logging=True)

Bases: DoublePendulumSystemS4

This is the classic implementation of the Double Pendulum with a seven-dimensional state space that includes the derived accelerations of both poles and the input torque. The dynamics of the system are inherited from the Double Pendulum root class.

Parameters:
  • p_mode – Mode of environment. Possible values are Mode.C_MODE_SIM (default) or Mode.C_MODE_REAL.

  • p_latency (timedelta) – Optional latency of environment. If not provided, the internal value of constant C_LATENCY is used by default.

  • p_max_torque (float, optional) – Maximum torque applied to pendulum. The default is 20.

  • p_l1 (float, optional) – Length of pendulum 1 in m. The default is 1.0.

  • p_l2 (float, optional) – Length of pendulum 2 in m. The default is 1.0.

  • p_m1 (float, optional) – Mass of pendulum 1 in kg. The default is 1.0.

  • p_m2 (float, optional) – Mass of pendulum 2 in kg. The default is 1.0.

  • p_init_angles (str, optional) – Initial position of the pendulum: C_ANGLES_UP starts it in an upright position, C_ANGLES_DOWN starts it in a downward position, and C_ANGLES_RND starts it from a random position. The default is C_ANGLES_RND ('random').

  • p_g (float, optional) – Gravitational acceleration. The default is 9.8

  • p_history_length (int, optional) – Historical trajectory points to display. The default is 5.

  • p_visualize (bool) – Boolean switch for visualisation. Default = False.

  • p_plot_level (int) – Types and number of plots to be plotted: C_PLOT_DEPTH_ENV plots only the environment, C_PLOT_DEPTH_REWARD plots only the reward, and C_PLOT_ALL plots both the reward and the environment. Default = ALL.

  • p_rst_balancing – Reward strategy to be used for the balancing region of the environment

  • p_rst_swinging – Reward strategy to be used for the swinging region of the environment

  • p_reward_weights (list) – List of weights to be added to the dimensions of the state space for reward computation

  • p_reward_trend (bool) – Boolean value stating whether to plot reward trend

  • p_reward_window (int) – The number of latest rewards to be shown in the plot. Default is 0

  • p_random_range (list) – The state space boundaries used when the environment is initialized randomly.

  • p_balancing_range (list) – The state space boundaries of the balancing region. The default is (-0.2, 0.2).

  • p_break_swinging (bool) – Whether the environment is considered broken outside the balancing region. The default is False.

  • p_logging – Log level (see constants of class mlpro.bf.various.Log). Default = True.

C_NAME = 'DoublePendulumSystemS7'
setup_spaces()

Method to set up the state and action spaces of the classic Double Pendulum environment. Building on the spaces inherited from the root class, this method adds three dimensions: the accelerations of the two poles and the input torque.
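The step from four to seven dimensions can be sketched as follows. This is only an illustration under stated assumptions: the helper name `extend_state` is hypothetical, the accelerations are approximated here as finite differences of the angular velocities over one latency interval (C_LATENCY, 40000 microseconds), and the order of the seven entries is chosen for readability rather than taken from the MLPro source.

```python
from datetime import timedelta

# Step length taken from the documented C_LATENCY constant (0.04 s)
LATENCY_S = timedelta(microseconds=40000).total_seconds()


def extend_state(prev_state4, state4, torque, dt=LATENCY_S):
    """Build a seven-dimensional state from two consecutive 4D states.

    Assumptions: each 4D state is (th1, w1, th2, w2); accelerations are
    finite differences of the angular velocities over the interval dt.
    """
    _, w1_prev, _, w2_prev = prev_state4
    th1, w1, th2, w2 = state4
    a1 = (w1 - w1_prev) / dt  # derived acceleration of the inner pole
    a2 = (w2 - w2_prev) / dt  # derived acceleration of the outer pole
    return (th1, w1, a1, th2, w2, a2, torque)
```

For instance, if the inner pole's angular velocity rises from 1.0 to 1.4 within one 0.04 s step, the derived acceleration in this sketch is roughly 10 in the corresponding units.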