Model-Based Reinforcement Learning

Model Ensemble TRPO

Ensemble policy learns policy across every member of the ensemble. Learns a single policy that is good across all members of the ensemble.

Meta Policy Learning (MB-MPO)

Learn an adaptive policy that can quickly adapt to any of the learned models.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store