Tutorial 2: Learning to Act: Multi-Armed Bandits
In this tutorial, you will use multi-armed bandits to understand the fundamentals of how a policy interacts with the learning algorithm in reinforcement learning.
Topics covered in this lesson
- The fundamental tradeoff between exploration and exploitation in a policy
- How the learning rate interacts with exploration to find the best available action
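To make these two topics concrete, here is a minimal sketch of one common bandit setup. It is not taken from the tutorial itself: it assumes an epsilon-greedy policy, Gaussian rewards with unit variance, and a constant learning rate. The names `run_bandit`, `arm_means`, `epsilon`, and `alpha` are hypothetical and chosen only for illustration.

```python
import random


def run_bandit(arm_means, epsilon=0.1, alpha=0.1, steps=1000, seed=0):
    """Run an epsilon-greedy agent on a Gaussian multi-armed bandit.

    arm_means: true mean reward of each arm (assumed, for illustration).
    epsilon:   probability of picking a random arm (exploration).
    alpha:     learning rate for the value-estimate update.
    Returns the learned value estimates and the average reward per step.
    """
    rng = random.Random(seed)
    q = [0.0] * len(arm_means)  # value estimate for each arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: choose an arm uniformly at random.
            action = rng.randrange(len(q))
        else:
            # Exploit: choose the arm with the highest current estimate.
            action = max(range(len(q)), key=lambda a: q[a])

        # Sample a noisy reward around the chosen arm's true mean.
        reward = rng.gauss(arm_means[action], 1.0)

        # Move the estimate a fraction alpha toward the observed reward.
        q[action] += alpha * (reward - q[action])
        total_reward += reward

    return q, total_reward / steps


if __name__ == "__main__":
    estimates, avg_reward = run_bandit([0.0, 0.0, 5.0], steps=2000)
    print("estimates:", estimates)
    print("average reward:", avg_reward)
```

In this sketch, `epsilon` controls how often the policy explores, while `alpha` controls how quickly each observed reward changes the estimates that the greedy choice relies on. Trying different values of both is one way to see how they interact.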
Prerequisites
Experience with the Python programming language