Model-Based Reinforcement Learning

Model Ensemble TRPO and Meta Policy Learning (MB-MPO)

Amulya Reddy Konda
1 min readMay 6, 2020

Model Ensemble TRPO

Ensemble policy learns policy across every member of the ensemble. Learns a single policy that is good across all members of the ensemble.

Meta Policy Learning (MB-MPO)

Learn an adaptive policy that can quickly adapt to any of the learned models.

--

--