PPG — Phasic Policy Gradient

A discussion of Phasic Policy Gradient (PPG), which uses separate networks for the policy and the value function and optimizes them in two alternating phases.
