PlaNet: Deep Planning Network
Discussion on a model-based reinforcement learning agent called PlaNet
SIL - Self-Imitation Learning
Discussion on self-imitation learning, in which the agent exploits past transitions that yielded higher returns than it expected
AdaNorm
We analyze layer normalization and discuss its improvement, AdaNorm.
UNREAL — Unsupervised Reinforcement and Auxiliary Learning
Discussion on UNsupervised Reinforcement and Auxiliary Learning (UNREAL), which aims to fully utilize training signals from environments to speed up learning and improve performance.
Time Limits in Reinforcement Learning
Discussion on the impact of time limits in reinforcement learning
PtrNet: Pointer Network
Discussion on Pointer Network.
Ape-X DQfD
Discussion on several enhancements to Ape-X DQN.
Solving Rubik’s Cube with a Robot Hand
Discussion on an agent that, trained in simulation, can solve a Rubik’s Cube with a real robot hand.
Challenges of Real-World Reinforcement Learning
Discussion on several challenges of real-world reinforcement learning.
REM - Random Ensemble Mixture
Discussion on an RL algorithm that exploits off-policy data.