Page 6 of 16 for Zero | This blog is no longer updated, but I am still on my quest in RL. Anyone interested in discussing recent advances in AI/RL is welcome to contact me by email: 122134545@qq.com or o.xlnwel@gmail.com

The Deadly Triad

We analyze how the different components of DQN contribute to the emergence of the deadly triad.

TPPO — Truly PPO

We investigate the behavior of PPO and introduce new methods that enforce the trust region constraint.

3rd-place solution to MineRL 2019 Competition

Discussion on the 3rd-place solution to the MineRL 2019 Competition.

Anti-Aliasing

Discussion on aliasing in modern convolutional neural networks and how to address it with low-pass filters.

SENet: Squeeze-and-Excitation Network

Discussion on the Squeeze-and-Excitation Network, an architecture that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels.

EvoNorm

Discussion on EvoNorm, a set of uniform normalization-activation layers found by AutoML.

MobileNet

Discussion on the MobileNet family of architectures.

Math

We summarize some mathematical concepts used in deep reinforcement learning.

Combining EAs with RL

We summarize several recent works that combine evolutionary algorithms with reinforcement learning.

CLEAR — Continual Learning with Experience And Replay

Discussion on continual learning with experience and replay, a simple method that prevents catastrophic forgetting and improves the stability of learning.