Page 13 of 16 for Zero | This blog is no longer updated, but I am still on my quest in RL. Anyone interested in discussing recent advances in AI/RL is welcome to contact me by email: 122134545@qq.com/o.xlnwel@gmail.com

AIRL — Adversarial Inverse Reinforcement Learning

We introduce a practical GAN-style IRL algorithm named adversarial inverse reinforcement learning (AIRL).

GAN-GCL

We build a connection between maximum entropy inverse reinforcement learning and generative adversarial networks.

GCL — Guided Cost Learning

We introduce a maximum entropy inverse reinforcement learning algorithm named guided cost learning (GCL).

PCL — Path Consistency Learning and More

Discussion on path consistency learning and its derivatives.

SAC — Soft Actor-Critic with Adaptive Temperature

We introduce an adaptive temperature to soft actor-critic (SAC).

SAC — Soft Actor-Critic

Discussion on soft actor-critic, a maximum entropy algorithm.

SVI — Soft Value Iteration

We use variational inference to address the optimism problem of the probabilistic graphical model introduced in the previous post.

PGM — Probabilistic Graphic Model

Discussion on statistical inference in a temporal probabilistic graphical model.

SL — Statistic Learning: A Connection to Neural Networks

We expand on the topic of latent variable models, in the sense that the latent variables model the underlying structure of the observed data, so that the model can perform statistical inference over these latent variables. We then build a connection between statistical learning and neural networks.