Model-Based Reinforcement Learning
Model Ensemble TRPO and Meta Policy Learning (MB-MPO)
Model Ensemble TRPO

Ensemble policy learns policy across every member of the ensemble. Learns a single policy that is good across all members of the ensemble.
Meta Policy Learning (MB-MPO)
Learn an adaptive policy that can quickly adapt to any of the learned models.

