Rainbow

Discussion on Rainbow, an integration of multiple improvements on DQN.

8 min read

c51 — Distributional Deep Q Network

Discussion on the distributional deep Q network (a.k.a. C51), an improvement to the deep Q network that replaces the action-value Q with a value distribution to capture the stochastic nature of the environment.

6 min read

Basic Policies in Reinforcement Learning

We discuss in detail some widely used policies in reinforcement learning, including the epsilon-greedy policy, the stochastic policy with temperature, upper confidence bound (UCB), and the gradient bandit algorithm.

5 min read

DQN — Deep Q Network

Discussion on Deep Q Network (DQN), a successful algorithm for discrete-action environments.

4 min read

Contrastive Predictive Coding

Discussion on contrastive predictive coding, a representation learning model for sequential data.

5 min read

Beta-VAE and Its Variants

Discussion on beta-VAE and its variants, which attempt to learn disentangled representations by heavily penalizing the corresponding correlation term.

9 min read

DIM — Deep INFOMAX

Discussion on Deep INFOMAX, a representation-learning method based on MINE that maximizes the mutual information between the input and its representation.

10 min read