
Tutorial 2: Learning to Act: Multi-Armed Bandits

Difficulty level
Beginner

In this tutorial, you will use multi-armed bandits to understand the fundamentals of how a policy interacts with the learning algorithm in reinforcement learning.

Topics covered in this lesson
  • The fundamental tradeoff between exploration and exploitation in a policy
  • How the learning rate interacts with exploration to find the best available action
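As a rough illustration of the two topics above, the following is a minimal sketch of an epsilon-greedy bandit, not the tutorial's own code. The function name `run_bandit`, the parameters `epsilon` (exploration probability) and `alpha` (learning rate), and the choice of Gaussian rewards with unit variance are all assumptions made for this example.

```python
import random


def run_bandit(true_means, epsilon, alpha, n_steps=2000, seed=0):
    """Simulate an epsilon-greedy agent on a Gaussian multi-armed bandit.

    All names here are hypothetical and chosen for illustration only.
    Rewards are assumed to be drawn from Normal(true_means[arm], 1).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    q = [0.0] * n_arms          # current value estimate for each arm
    total_reward = 0.0

    for _ in range(n_steps):
        if rng.random() < epsilon:
            # Explore: pick any arm uniformly at random.
            action = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate
            # (ties go to the lowest-indexed arm).
            action = max(range(n_arms), key=lambda a: q[a])

        reward = rng.gauss(true_means[action], 1.0)

        # Learning step: move the estimate a fraction alpha toward the reward.
        q[action] += alpha * (reward - q[action])
        total_reward += reward

    return q, total_reward / n_steps


if __name__ == "__main__":
    # Compare a purely greedy policy with one that explores 10% of the time.
    for eps in (0.0, 0.1):
        estimates, avg = run_bandit([0.0, 0.5, 2.0], epsilon=eps, alpha=0.1)
        print(f"epsilon={eps}: average reward={avg:.2f}, estimates={estimates}")
```

In this sketch, `epsilon` controls how often the policy tries an arm it does not currently prefer, while `alpha` controls how strongly each new reward shifts the estimate. A purely greedy policy (`epsilon=0.0`) can settle on whichever arm happened to look good early on, which is the tradeoff the first topic refers to.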
Prerequisites

Experience with the Python programming language
