Tutorial 2: Learning to Act: Multi-Armed Bandits

In this tutorial, you will use multi-armed bandits to understand the fundamentals of how a policy interacts with the learning algorithm in reinforcement learning.

Topics covered in this lesson
  • The fundamental tradeoff between exploration and exploitation in a policy
  • How the learning rate interacts with exploration to find the best available action
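As a rough illustration of the two topics above, here is a minimal sketch of an epsilon-greedy agent on a Gaussian multi-armed bandit. It is not taken from the tutorial itself: the function name `run_bandit`, the parameters `true_means`, `epsilon` (exploration probability), and `alpha` (learning rate), and the unit-variance reward noise are all assumptions made for this example.

```python
import random


def run_bandit(true_means, epsilon, alpha, n_steps=2000, seed=0):
    """Run an epsilon-greedy agent on a Gaussian bandit (illustrative sketch).

    true_means: hypothetical mean reward of each arm (unknown to the agent).
    epsilon:    probability of picking a random arm (exploration).
    alpha:      learning rate for the incremental value update.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    q = [0.0] * n_arms      # estimated value of each arm
    counts = [0] * n_arms   # how often each arm was pulled
    total_reward = 0.0

    for _ in range(n_steps):
        if rng.random() < epsilon:
            # Explore: choose any arm uniformly at random.
            action = rng.randrange(n_arms)
        else:
            # Exploit: choose the arm with the highest estimate,
            # breaking ties at random.
            best = max(q)
            action = rng.choice([a for a, v in enumerate(q) if v == best])

        # Assumed reward model: Gaussian noise with standard deviation 1.
        reward = rng.gauss(true_means[action], 1.0)

        # Move the estimate a fraction alpha toward the observed reward.
        q[action] += alpha * (reward - q[action])
        counts[action] += 1
        total_reward += reward

    return q, counts, total_reward / n_steps


if __name__ == "__main__":
    for eps in (0.0, 0.1, 0.5):
        q, counts, avg = run_bandit([0.0, 0.5, 2.0], epsilon=eps, alpha=0.1)
        print(f"epsilon={eps}: pulls per arm={counts}, average reward={avg:.2f}")
```

Varying `epsilon` and `alpha` in the loop at the bottom gives a feel for the tradeoff: with too little exploration the agent may settle on a poor arm, while with too much it keeps pulling arms it already knows are worse. A larger `alpha` makes the estimates react faster but also makes them noisier.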

Prerequisites: experience with the Python programming language.