6.2. Getting started

This guide provides a structured introduction to MLPro-RL, catering to both newcomers and experienced MLPro users.

If you are new to MLPro, please refer to the Getting Started page of MLPro to gain foundational knowledge before proceeding.

By following the step-by-step guidelines below, users will develop a practical understanding of MLPro-RL and learn to use it effectively.

1. What is reinforcement learning?

Reinforcement Learning (RL) is a branch of machine learning in which an agent learns to make decisions by interacting with an environment. The agent receives feedback in the form of rewards or penalties, which guides it toward maximizing its cumulative reward.
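The interaction loop described above can be sketched in a few lines of plain Python. This is an illustrative toy, not MLPro-RL code: the `CorridorEnv` class, its `reset`/`step` methods, the reward values, and the policy helpers are all hypothetical names chosen for this example.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: positions 0..length-1, goal at the right end."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        """Put the agent back at the left end and return the initial state."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action (-1 = left, +1 = right) and return (state, reward, done)."""
        # The position is clamped so the agent cannot leave the corridor.
        self.position = max(0, min(self.length - 1, self.position + action))
        done = self.position == self.length - 1
        # Assumed reward scheme: +1.0 on reaching the goal, -0.1 penalty otherwise.
        reward = 1.0 if done else -0.1
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode and return the cumulative reward collected by the policy."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # the agent decides
        state, reward, done = env.step(action)  # the environment responds
        total_reward += reward                  # feedback accumulates
        if done:
            break
    return total_reward


def always_right(state):
    """A fixed policy that always moves toward the goal."""
    return 1


def make_random_policy(seed=0):
    """Build a policy that ignores the state and moves left or right at random."""
    rng = random.Random(seed)
    return lambda state: rng.choice((-1, 1))


if __name__ == "__main__":
    env = CorridorEnv()
    print("always right:", run_episode(env, always_right))
    print("random      :", run_episode(env, make_random_policy()))
```

In this toy setup, the policy that always moves right reaches the goal in four steps and collects three penalties plus the goal reward, so no other policy can achieve a higher return. A learning agent would have to discover that behavior from the rewards alone.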

Unlike supervised learning, which relies on labeled data, RL involves the agent exploring different actions and learning from their consequences. Key RL concepts include:

  • Exploration: Trying new actions to discover better strategies.

  • Exploitation: Choosing the best-known actions based on prior learning.
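One common way to balance the two concepts above is an epsilon-greedy rule: with a small probability the agent explores a random action, otherwise it exploits the action with the best current estimate. The sketch below applies this rule to a multi-armed bandit in plain Python. It is an assumption-laden illustration rather than MLPro-RL code: the function name, the Gaussian reward model, and the parameter values are hypothetical.

```python
import random


def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=2000, seed=0):
    """Estimate the mean reward of each arm using an epsilon-greedy rule.

    true_means: assumed (hidden) mean reward of each arm.
    Returns (estimates, counts), where counts[a] is how often arm a was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms

    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: try a uniformly random arm.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Assumed reward model: Gaussian noise around the arm's true mean.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental update of the running average for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = epsilon_greedy_bandit([0.1, 0.5, 0.9])
    for arm, (est, cnt) in enumerate(zip(estimates, counts)):
        print(f"arm {arm}: estimate={est:.2f}, pulls={cnt}")
```

Setting `epsilon=0.0` gives a purely exploiting agent that can lock onto a poor arm early, while `epsilon=1.0` gives a purely exploring agent that never uses what it has learned. Values in between trade the two off.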

Common applications of RL include robotics, game playing, and autonomous systems. For an in-depth treatment, we recommend the book Reinforcement Learning: An Introduction by Sutton and Barto.

2. What is MLPro-RL?

If you are already familiar with MLPro and RL, the next step is understanding MLPro-RL. Start with:

  1. MLPro-RL introduction page

  2. Section 4 of the MLPro 1.0 paper

3. Understanding environments in MLPro-RL

The environment is a crucial component of RL. Begin by learning its structure in MLPro:

For practical examples, refer to the following guides:

  1. Howto RL-001: Reward

  2. Howto RL-AGENT-001: Run an agent with own policy

4. Understanding agents in MLPro-RL

MLPro-RL supports both single-agent and multi-agent RL. Learn more here:

Then, explore how to set up different agent types:

  1. Howto RL-AGENT-001: Run an agent with own policy

  2. Howto RL-AGENT-003: Run multi-agent with own policy

5. Selecting between model-free and model-based RL

Decide on your RL training approach by first reviewing these pages:

6. Additional guidance

After completing the above steps, you should be comfortable working with MLPro-RL. For further learning, consider these advanced topics:

  1. Howto RL-AGENT-001: Train and reload single agent (Gymnasium)

  2. Howto RL-HT-001: Hyperparameter tuning using Hyperopt

  3. Howto RL-HT-001: Hyperparameter tuning using Optuna

  4. Howto RL-ATT-001: Train and reload single agent using stagnation detection (Gymnasium)

By following this guide, you will be well-equipped to integrate MLPro-RL into your reinforcement learning projects.