The Deadly Triad
We analyze how different components of DQN contribute to the emergence of the deadly triad.
TPPO - Truly PPO
We investigate the behavior of PPO and introduce new methods that enforce the trust region constraint.
3rd-place solution to the MineRL 2019 Competition
Discussion on the 3rd-place solution to the MineRL 2019 Competition.
Anti-Aliasing
Discussion on aliasing in modern convolutional neural networks and how to address it with low-pass filters.
SENet: Squeeze-and-Excitation Network
Discussion on the Squeeze-and-Excitation Network, an architecture that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels.
EvoNorm
Discussion on EvoNorm, a set of uniform normalization-activation layers found by AutoML.
MobileNet
Discussion on the MobileNet family of architectures.
Math
We summarize some mathematical concepts used in deep reinforcement learning.
Combining EAs with RL
We summarize several recent works that combine evolutionary algorithms with reinforcement learning.
CLEAR - Continual Learning with Experience And Replay
Discussion on continual learning with experience and replay, a simple method that prevents catastrophic forgetting and improves the stability of learning.