Zero | This blog is no longer updated, but I am still on my quest in RL. Anyone interested in discussing recent advances in AI/RL can contact me by email: 122134545@qq.com / o.xlnwel@gmail.com

PlaNet: Deep Planning Network

Discussion on a model-based reinforcement learning agent called PlaNet.

SIL - Self-Imitation Learning

Discussion on self-imitation learning, in which the agent exploits previous transitions that received better returns than it expected.

AdaNorm

We analyze layer normalization and discuss AdaNorm, an improvement on it.

UNREAL — Unsupervised Reinforcement and Auxiliary Learning

Discussion on UNsupervised Reinforcement and Auxiliary Learning (UNREAL), which aims to fully utilize training signals from environments to speed up learning and improve performance.

Time Limits in Reinforcement Learning

Discussion on the impact of time limits in reinforcement learning.

PtrNet: Pointer Network

Discussion on Pointer Network.

Ape-X DQfD

Discussion on several enhancements to Ape-X DQN.

Solving Rubik’s Cube with a Robot Hand

Discussion on an agent that, trained in simulation, can solve a Rubik’s Cube with a real robot hand.

Challenges of Real-World Reinforcement Learning

Discussion on several challenges of real-world reinforcement learning.

REM - Random Ensemble Mixture

Discussion on an RL algorithm that exploits off-policy data.