RODE — Learning Roles to Decompose Multi-Agent Tasks
Discussion on RODE, a hierarchical MARL method that decomposes the action space into role-specific action subspaces by clustering actions according to their effects on the environment.
PWIL — Primal Wasserstein Imitation Learning
Discussion on Primal Wasserstein Imitation Learning.
Network Regularization in Policy Optimization
Discussion on the effect of network regularization in policy optimization.
HIDIO — Hierarchical RL by Discovering Intrinsic Options
Discussion on HIDIO, a hierarchical RL method that discovers task-agnostic options in a self-supervised manner while learning the downstream task.
IDAAC — Invariant Decoupled Advantage Actor-Critic
Discussion on IDAAC, which identifies and addresses the problem of using a shared representation for learning the policy and the value function.
DTSIL — Diverse Trajectory-conditioned Self-Imitation Learning
Discussion on Diverse Trajectory-conditioned Self-Imitation Learning, which encourages exploration by imitating diverse trajectories from the agent's own past experience.
TAC — Tsallis Actor Critic
Discussion on Tsallis Actor Critic, which generalizes maximum-entropy RL using Tsallis entropy regularization.
MARL — A Survey and Critique
An overview and critique of multi-agent reinforcement learning.
C++ Concurrency in Action — Chapter 9
Notes from Williams’ C++ Concurrency in Action, Chapter 9 on advanced thread management.
C++ Concurrency in Action — Chapter 8
Notes from Williams’ C++ Concurrency in Action, Chapter 8 on designing concurrent code.