Reinforcement Learning
The following examples demonstrate various functionalities of MLPro-RL:
- Howto RL-001: Types of reward
- Howto RL-002: Run an agent with its own policy in an OpenAI Gym environment
- Howto RL-003: Train an agent with its own policy in an OpenAI Gym environment
- Howto RL-004: Run a multi-agent with its own policy in an OpenAI Gym environment
- Howto RL-005: Train a multi-agent with its own policy in the Multi-Cartpole environment
- Howto RL-006: Run a multi-agent with its own policy in a PettingZoo environment
- Howto RL-007: Training of a wrapped Stable Baselines 3 policy
- Howto RL-008: Wrap native MLPro environment class to OpenAI Gym environment
- Howto RL-009: Wrap native MLPro environment class to PettingZoo environment
- Howto RL-010: Train a wrapped Stable Baselines 3 policy on MLPro’s native UR5 environment
- Howto RL-011: Train a wrapped Stable Baselines 3 policy on MLPro’s native UR5 environment (Paper)
- Howto RL-012: Train a wrapped Stable Baselines 3 policy on MLPro’s native RobotHTM environment
- Howto RL-013: Model-based reinforcement learning
- Howto RL-014: Advanced training with stagnation detection
- Howto RL-015: Train a wrapped Stable Baselines 3 policy with stagnation detection
- Howto RL-016: Comparison of native and wrapped Stable Baselines 3 policy
- Howto RL-017: Comparison of native and wrapped Stable Baselines 3 policy (off-policy)
- Howto RL-018: Train a wrapped Stable Baselines 3 policy on MLPro’s native MultiGeo environment
- Howto RL-019: Train and reuse a single agent
- Howto RL-020: Run a native random agent in MLPro’s native DoublePendulum environment
- Howto RL-021: Train a wrapped Stable Baselines 3 policy on MLPro’s native DoublePendulum environment
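The pattern shared by most of these howtos is an agent with its own policy interacting with an environment over an episode. As an orientation, here is a minimal, dependency-free sketch of that loop; all class and method names below are illustrative stand-ins, not MLPro, Gym, or PettingZoo API:

```python
import random


class RandomPolicy:
    """Illustrative stand-in for an 'own policy': picks a random action."""

    def __init__(self, actions, seed=42):
        self._actions = actions
        self._rng = random.Random(seed)

    def compute_action(self, state):
        # A real policy would map the state to an action; here we sample.
        return self._rng.choice(self._actions)


class ToyEnv:
    """Minimal episodic environment: move a 1-D position to 3 within 10 steps."""

    def __init__(self):
        self.pos = 0
        self.steps = 0

    def step(self, action):
        # action is -1 or +1; episode ends on success or after 10 steps
        self.pos += action
        self.steps += 1
        done = self.pos >= 3 or self.steps >= 10
        reward = 1.0 if self.pos >= 3 else 0.0
        return self.pos, reward, done


def run_episode(env, policy):
    """Run one episode and return the accumulated reward."""
    state, total_reward, done = env.pos, 0.0, False
    while not done:
        action = policy.compute_action(state)
        state, reward, done = env.step(action)
        total_reward += reward
    return total_reward


print(run_episode(ToyEnv(), RandomPolicy([-1, 1])))
```

In the actual howtos, the environment is supplied by MLPro natively or via the Gym/PettingZoo wrappers, and the policy is either user-defined or a wrapped Stable Baselines 3 model.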