RODE — Learning Roles to Decompose Multi-Agent Tasks
Discussion on RODE, a hierarchical MARL method that decomposes the action space into role-specific action subspaces by clustering actions according to their effects on the environment.
PWIL — Primal Wasserstein Imitation Learning
Discussion on Primal Wasserstein Imitation Learning.
Network Regularization in Policy Optimization
Discussion on the effect of network regularization in policy optimization.
HIDIO — Hierarchical RL by Discovering Intrinsic Options
Discussion on HIDIO, a hierarchical RL method that discovers task-agnostic options in a self-supervised manner while learning the downstream task.
IDAAC — Invariant Decoupled Advantage Actor-Critic
Discussion on IDAAC, which identifies and addresses the problem of using a shared representation for learning the policy and the value function.
DTSIL — Diverse Trajectory-conditioned Self-Imitation Learning
Discussion on Diverse Trajectory-conditioned Self-Imitation Learning, which encourages exploration by imitating diverse trajectories from the agent's own past experience.
TAC — Tsallis Actor Critic
Discussion on Tsallis Actor Critic, which generalizes maximum-entropy RL using Tsallis entropy regularization.
MARL — A Survey and Critique
An overview and critique of multi-agent reinforcement learning.
C++ Concurrency in Action — Chapter 9
Notes from Williams’ C++ Concurrency in Action, Chapter 9 on advanced thread management.
C++ Concurrency in Action — Chapter 8
Notes from Williams’ C++ Concurrency in Action, Chapter 8 on designing concurrent code.