Model-Based Reinforcement Learning
Model Ensemble TRPO and Meta Policy Learning (MB-MPO)
1 min readMay 6, 2020
Model Ensemble TRPO
Ensemble policy learns policy across every member of the ensemble. Learns a single policy that is good across all members of the ensemble.
Meta Policy Learning (MB-MPO)
Learn an adaptive policy that can quickly adapt to any of the learned models.