Page 14 of 16 for Zero | This blog is no longer updated, but I’m still on my quest in RL. For anyone interested in discussing recent advances in AI/RL, please contact me by email: 122134545@qq.com or o.xlnwel@gmail.com

EM — Expectation-Maximization Algorithm

Discussion on the Expectation-Maximization (EM) algorithm and its application to GMMs

GPS-iLQR — Guided Policy Search with iLQR

Discussion on iterative Linear Quadratic Regulator with a local linear-Gaussian model

LQR — Linear-Quadratic Regulator

Discussion on the Linear Quadratic Regulator and its derivatives

MB-MF — Model-Based Model-Free

Discussion on the model-based model-free algorithm

SCG — Stochastic Computational Graphs

Discussion on stochastic computational graphs, a type of directed acyclic computational graph that includes both deterministic functions and conditional probability distributions.

GAE — Generalized Advantage Estimation

Discussion on a multi-step advantage estimation method for online reinforcement learning

TRPO, PPO

Discussion on two policy-based algorithms that restrict the step size of each policy update: Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO).

CG — Conjugate Gradient Method

Discussion on the conjugate gradient method in chaos :-)

Planning and Learning in Model-Based Reinforcement Learning Methods

Discussion on a series of algorithms in model-based reinforcement learning where planning and learning are intermixed.

GQN — Generative Query Network

Discussion on the generative query network, a brand new unsupervised scene-based generative network.