EM — Expectation-Maximization Algorithm
Discussion on the Expectation-Maximization (EM) algorithm and its application to Gaussian mixture models (GMMs)
GPS-iLQR — Guided Policy Search with iLQR
Discussion on the iterative Linear Quadratic Regulator (iLQR) with a local linear-Gaussian model
LQR — Linear-Quadratic Regulator
Discussion on the Linear Quadratic Regulator and its derivatives
MB-MF — Model-Based Model-Free
Discussion on the model-based model-free (MB-MF) algorithm
SCG — Stochastic Computational Graphs
Discussion on stochastic computational graphs, a type of directed acyclic computational graph that includes both deterministic functions and conditional probability distributions.
GAE — Generalized Advantage Estimation
Discussion on a multi-step advantage estimation method for online reinforcement learning
TRPO, PPO
Discussion on two policy-based algorithms that restrict the size of each policy update: Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO).
CG — Conjugate Gradient Method
Discussion on the conjugate gradient method in chaos :-)
Planning and Learning in Model-Based Reinforcement Learning Methods
Discussion on a series of algorithms in model-based reinforcement learning where planning and learning are intermixed.
GQN — Generative Query Network
Discussion on the Generative Query Network, an unsupervised scene-based generative network.