
Reinforcement Learning

Level
Beginner

Neuromatch Academy aims to introduce traditional and emerging tools of computational neuroscience to trainees. It is appropriate for a student population ranging from undergraduates to faculty in academic settings, as well as industry professionals. In addition to teaching the technical details of computational methods, Neuromatch Academy also provides a curriculum centered on modern neuroscience concepts taught by leading professors, along with explicit instruction on how and why to apply models.


This course provides an introduction to the features of a Reinforcement Learning (RL) system, general methods for predicting state values, an overview of the control problem in RL, and a brief introduction to function approximation and deep RL.

Course Features
Lectures
Videos
Tutorials
Suggested reading
Recordings of question and answer sessions
Discussion forum on Neurostars.org
Lessons of this Course
Lesson 1
Duration: 39:12

This lecture provides an introduction to a variety of topics in reinforcement learning.

Lesson 2
Duration: 6:57

This tutorial shows how to estimate state-value functions in a classical conditioning paradigm using Temporal Difference (TD) learning, and how to examine TD errors at the presentation of the conditioned and unconditioned stimuli (CS and US) under different CS-US contingencies. These exercises will give you an understanding of both how reward prediction errors (RPEs) behave in classical conditioning and what we should expect to see if dopamine represents a "canonical" model-free RPE.
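
As a rough illustration of the kind of model this tutorial covers, the sketch below (an assumed setup, not the course's notebook) runs TD(0) learning over states indexed by time since CS onset, with the US delivered at a fixed, hypothetical delay. After training, the prediction error at the US shrinks toward zero while a positive error appears at CS onset.

```python
import numpy as np

# Minimal TD(0) sketch of Pavlovian conditioning (illustrative assumptions only).
# States index "time since CS onset"; the US (reward) arrives a fixed delay later.
# Pre-CS value is held at 0, so learning moves the prediction error from US to CS.

N_STATES, US_STATE = 10, 8          # hypothetical: US delivered 8 steps after CS onset
ALPHA, GAMMA, N_TRIALS = 0.2, 1.0, 500

V = np.zeros(N_STATES + 1)          # value of each post-CS state (last index = terminal)

for _ in range(N_TRIALS):
    deltas = np.zeros(N_STATES)
    for t in range(N_STATES):
        r = 1.0 if t == US_STATE else 0.0        # reward delivered only at the US
        delta = r + GAMMA * V[t + 1] - V[t]      # TD error = reward prediction error
        V[t] += ALPHA * delta                    # TD(0) update
        deltas[t] = delta
    cs_delta = GAMMA * V[0] - 0.0                # RPE at CS onset (pre-CS value kept at 0)

print(f"RPE at US: {deltas[US_STATE]:.3f}   RPE at CS onset: {cs_delta:.3f}")
```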

Lesson 3
Duration: 6:55

In this tutorial, you will use 'bandits' to understand the fundamentals of how a policy interacts with the learning algorithm in reinforcement learning.
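
The sketch below illustrates that interaction under assumed details (the arm probabilities, epsilon value, and update rule are hypothetical, not the course's exact exercises): an epsilon-greedy policy chooses arms while incremental sample-average updates estimate each arm's value.

```python
import numpy as np

# Minimal multi-armed bandit sketch (illustrative assumptions only): the policy
# decides which arm to sample, and the learning rule updates the sampled arm's value.

rng = np.random.default_rng(0)
TRUE_MEANS = np.array([0.2, 0.5, 0.8])   # hypothetical Bernoulli reward probabilities
EPSILON, N_STEPS = 0.1, 2000

Q = np.zeros(len(TRUE_MEANS))            # value estimate per arm
counts = np.zeros(len(TRUE_MEANS))       # number of pulls per arm

for _ in range(N_STEPS):
    # Policy: explore with probability epsilon, otherwise pick the best-looking arm
    if rng.random() < EPSILON:
        a = int(rng.integers(len(Q)))
    else:
        a = int(np.argmax(Q))
    r = float(rng.random() < TRUE_MEANS[a])      # Bernoulli reward
    counts[a] += 1
    Q[a] += (r - Q[a]) / counts[a]               # incremental sample-average update

print("Estimated arm values:", np.round(Q, 2))
```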

Lesson 4
Duration: 11:16

In this tutorial, you will learn how to act in the more realistic setting of sequential decisions, formalized by Markov Decision Processes (MDPs). In a sequential decision problem, the actions executed in one state not only may lead to immediate rewards (as in a bandit problem) but may also affect the states experienced next (unlike a bandit problem). Each individual action may therefore affect all future rewards. Making decisions in this setting thus requires considering each action in terms of its expected cumulative future reward.
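
A minimal sketch of this idea, under assumed details (a hypothetical 1-D chain environment and tabular Q-learning rather than the course's exact exercises), is shown below: an action's value reflects not just its immediate reward but the discounted rewards reachable from the state it leads to.

```python
import numpy as np

# Minimal tabular Q-learning sketch on a hypothetical chain MDP (illustrative only):
# moving right eventually reaches a rewarded terminal state, so the value of an
# action depends on the states it leads to, not only on its immediate reward.

rng = np.random.default_rng(0)
N_STATES, GOAL = 6, 5                    # states 0..5; reward only on reaching state 5
ALPHA, GAMMA, EPSILON, N_EPISODES = 0.5, 0.9, 0.1, 500

Q = np.zeros((N_STATES, 2))              # actions: 0 = left, 1 = right

for _ in range(N_EPISODES):
    s = 0
    while s != GOAL:
        a = int(rng.integers(2)) if rng.random() < EPSILON else int(np.argmax(Q[s]))
        s_next = max(s - 1, 0) if a == 0 else min(s + 1, GOAL)
        r = 1.0 if s_next == GOAL else 0.0
        # Q-learning update: Q(s, a) tracks expected cumulative (discounted) reward
        Q[s, a] += ALPHA * (r + GAMMA * np.max(Q[s_next]) - Q[s, a])
        s = s_next

print("Value of moving right from each state:", np.round(Q[:, 1], 2))
```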

Lesson 5
Duration: 9:10

In this tutorial, you will implement one of the simplest model-based reinforcement learning algorithms, Dyna-Q. You will understand what a world model is, how it can improve the agent's policy, and the situations in which model-based algorithms are more advantageous than their model-free counterparts.
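
For orientation, here is a minimal Dyna-Q sketch under assumed details (the chain environment, planning budget, and hyperparameters are hypothetical, not the course notebook): alongside direct Q-learning updates, the agent stores observed transitions in a simple world model and replays remembered state-action pairs for extra planning updates.

```python
import numpy as np

# Minimal Dyna-Q sketch (illustrative assumptions only): direct RL updates from real
# experience, plus extra updates replayed from a learned deterministic world model.

rng = np.random.default_rng(0)
N_STATES, GOAL = 6, 5                    # hypothetical chain: states 0..5, goal at 5
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1
N_EPISODES, K_PLANNING = 100, 10

Q = np.zeros((N_STATES, 2))              # actions: 0 = left, 1 = right
model = {}                               # world model: (s, a) -> (reward, next state)

def step(s, a):
    s_next = max(s - 1, 0) if a == 0 else min(s + 1, GOAL)
    return (1.0 if s_next == GOAL else 0.0), s_next

for _ in range(N_EPISODES):
    s = 0
    while s != GOAL:
        a = int(rng.integers(2)) if rng.random() < EPSILON else int(np.argmax(Q[s]))
        r, s_next = step(s, a)
        Q[s, a] += ALPHA * (r + GAMMA * np.max(Q[s_next]) - Q[s, a])   # direct RL
        model[(s, a)] = (r, s_next)                                    # update model
        for _ in range(K_PLANNING):                                    # planning steps
            (ps, pa), (pr, ps_next) = list(model.items())[int(rng.integers(len(model)))]
            Q[ps, pa] += ALPHA * (pr + GAMMA * np.max(Q[ps_next]) - Q[ps, pa])
        s = s_next

print("Learned values for moving right:", np.round(Q[:, 1], 2))
```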

Lesson 6
Duration: 33:25

This lecture highlights up-and-coming issues in the neuroscience of reinforcement learning.