PopArt: Preserving Outputs Precisely, while Adaptively Rescaling Targets
Discussion on a method that can learn values across many orders of magnitude.
SchedNet — Schedule Network
Discussion on a multi-agent reinforcement learning algorithm that schedules communication between cooperative agents.
PR2 — Probabilistic Recursive Reasoning
Discussion on a multi-agent reinforcement learning algorithm that recursively reasons about the opponents’ behavior.
MADDPG — Multi-Agent Deep Deterministic Policy Gradient
Discussion on a multi-agent reinforcement learning algorithm that follows the framework of centralized training with decentralized execution.
EMI — Exploration with Mutual Information
Discussion on a novel exploration method based on representation learning.
QWeb
Discussion on how to solve the web navigation problem using DQN.
MIRL — Mutual Information Reinforcement Learning
Discussion on a new regularization mechanism that leverages an optimal prior to explicitly penalize the mutual information between states and f.
SAGAN: Techniques in Self-Attention Generative Adversarial Networks
Discussion on several techniques involved in SAGAN, including self-attention, spectral normalization, and conditional batch normalization.
MB-MRL — Model-Based Meta-Reinforcement Learning
Discussion on a model-based meta-reinforcement learning algorithm that enables the agent to adapt quickly to changes in the environment.
PEARL — Probabilistic Embeddings for Actor-critic RL
Discussion on an off-policy meta-reinforcement learning algorithm that achieves state-of-the-art performance and sample efficiency.