Lead Instructor(s)
Jun 28 - 30, 2021
Live Virtual
Course Length
3 days
Course Fee
2.35 CEUs

An active area of research, reinforcement learning has already achieved impressive results in solving complex games and a variety of real-world problems. However, organizations that attempt to leverage these strategies often encounter practical industry constraints. In this dynamic course, you will explore the cutting edge of RL research and enhance your ability to identify the correct approach for applying advanced frameworks to pressing industry challenges.

Course Overview


Reinforcement learning (RL) is transforming machine learning applications across industries, and its potential is only beginning to be tapped. From natural language processing and computer vision to self-driving cars and gaming, this paradigm offers practical applications in industries as diverse as transportation, retail, finance, urban planning, and healthcare.

In this accelerated, three-day course, you’ll receive an advanced overview of the cutting-edge RL topics that are driving advancements in machine learning. Through interactive lectures and exercises, you’ll gain a multifaceted view of the development and potential of RL from the perspectives of statistics, optimal control, economics, operations research, and other disciplines.

Most of the course will be dedicated to in-depth overviews of key topics in active research, including offline reinforcement learning, the theory of RL, multi-agent RL, Monte Carlo Tree Search, hierarchical RL, and model-based RL exploration. Additional sessions will focus on practical considerations when using deep RL methods, such as deep learning architectures and what actually makes deep RL methods work.

You will also have the opportunity to put your learning into practice during hands-on clinics, in which you will use advanced algorithms to solve real-world problems and then discuss your solutions with the class and instructors during office hours. You will leave the course with a broad understanding of reinforcement learning as a tool, a mathematical framework, and an active field of study.


By completing this course, you will enhance your ability to:

  • Determine the reinforcement learning framework (e.g., goal-directed, hierarchical, offline reinforcement learning, bandits) that is best suited to solve a specific problem
  • Select the most promising algorithms for an already-formulated reinforcement learning problem
  • Recognize the limitations of reinforcement learning in order to judge whether a situation is suited for these strategies


This course is designed for mid-career professionals who are actively involved in or want to learn more about reinforcement learning. The strategies covered apply to a wide variety of fields, including robotics, automotive, manufacturing, urban planning and design, logistics, government and military, science and technology, retail, finance, healthcare, and pharmaceuticals.

Relevant job titles include, but are not limited to:

  • Research Scientist
  • Machine Learning Engineer
  • Software Engineer
  • Data Scientist
  • Data Analyst
  • Automation Engineer 
  • CTO
  • Product Manager
  • Program Manager

Participants should be familiar with the basics of RL, including exact dynamic programming algorithms, Q-learning, deep neural networks, machine learning libraries (e.g., PyTorch or TensorFlow), and basic deep RL methods (DQN, policy gradient methods).
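As an informal gauge of the Q-learning prerequisite, the following is a minimal sketch of tabular Q-learning on a toy problem. It is not course material: the five-state chain environment, the helper names (`step`, `train_q_table`, `greedy_policy`), and the hyperparameter values are all assumptions chosen for illustration. Participants who can follow the update rule and explain why a uniformly random behavior policy still yields a good greedy policy are likely comfortable with this part of the prerequisites.

```python
import random

# Hypothetical toy environment (not from the course): a 5-state chain.
N_STATES = 5          # states 0..4; state 4 is the terminal goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic chain dynamics: reward 1.0 only on reaching the goal."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train_q_table(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning; hyperparameters are illustrative assumptions."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Behavior policy: uniformly random actions (pure exploration).
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the greedy value of the
            # next state, with no bootstrap at the terminal state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train_q_table())` should select action 1 (move right) in every non-terminal state, since moving right is the shortest route to the only reward in this toy chain.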


  • Day 1 (9:30am - 7:30pm) 
    • Session 1 (1.5 hours): Intro, review of basic RL, overview, Why RL? 
      • Statistical / ML perspective
      • Optimal control / Operations Research perspective
      • Economics perspective (discounting)
      • Online vs. Offline
      • On-Policy vs. Off-Policy
    • Break (0.5 hours)
    • Session 2: Theoretical Results in RL (60 mins)
    • Lunch (1 hour)
    • Coffee chat (30 mins)
    • Session 3: Offline RL Theory & Applications (1.5 hours) 
      • Introduction to Offline RL (30 mins)
      • State-of-the-art in Offline RL (30 mins)
      • Applications of Offline RL (30 mins)
    • Break (0.5 hours)
    • Hands-on implementation (2 hours) 
      • Set up an environment into a format amenable to RL algorithms
      • Design reward functions to gain practical experience in reward hacking 
        • Discussion on approaches to circumvent reward hacking
      • Compare offline and online RL
    • Reception (1.5 hours) 
  • Day 2 (9:30am - 6:30pm) 
    • Session 1: Exploration vs. Exploitation (1.5 hours) 
      • Exploration using Learning Progress, Prediction Error, and State Visitation
      • Curriculum Learning
    • Break (0.5 hours)
    • Session 2 (1 hour): Goal-Based and Hierarchical RL 
      • Inverse Models, Hindsight Experience Replay
      • Approaches to HRL
    • Coffee chat (30 mins)
    • Lunch (1 hour)
    • Session 3 (Interactive): Using Models and Demonstrations to Improve Sample Efficiency (2 hours) 
      • Theory of improving sample efficiency
      • Implementing Models and Demonstrations
    • Break (0.5 hours)
    • Session 4: Deep Learning Architectures and RL (45 mins) 
      • Memory Based RL
      • Transformers and Self-Attention
      • Episodic Control
    • Break (15 mins)
    • Session 5: State-of-the-Art RL Algorithms (1 hour) 
      • TD3
      • Soft Actor-Critic (SAC)
      • Rainbow
      • PPO
  • Day 3 (9:30am - 7:30pm) 
    • Session 1 (2 hours): Problem Clinic and Casting Your Problem Into RL
    • Office Hours (1 hour): 10-minute sign-up slots
    • Lunch (1 hour)
    • Session 2 (1.5 hours): Applications of RL 
      • Operations Research
      • Navigation
      • Manipulation
      • Urban Planning
    • Break (0.5 hours)
    • Session 3 (2 hours): Miscellaneous Topics 
      • MCTS and its application to AlphaGo
      • Safety in RL
      • Are policy gradients true gradients?
      • Connection between RL and Evolutionary Algorithms
      • RL from the neuroscience view
    • AMA session (1.5 hours)