Bayesian

Deep Reinforcement Learning-WindFarm

Reinforcement learning is known to be unstable, or even to diverge, when a nonlinear function approximator such as a neural network is used to represent the action-value function (also known as the Q function).
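To make the quantity in question concrete, the following is a minimal sketch of a Q function represented by a small neural network and updated with a one-step temporal-difference rule. It is an illustration under stated assumptions, not this repository's implementation: the function names (`init_q_network`, `q_values`, `td_update`), the single `tanh` hidden layer, and the values of `gamma` and `lr` are all hypothetical choices made for the example. Note that the bootstrap target is computed from the same parameters that are being updated, which is one commonly cited source of the instability mentioned above.

```python
import numpy as np


def init_q_network(state_dim, n_actions, hidden=16, seed=0):
    """Create parameters for a one-hidden-layer Q network (hypothetical layout)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (state_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, n_actions)),
        "b2": np.zeros(n_actions),
    }


def q_values(params, state):
    """Return Q(state, a) for every action a, plus the hidden activations."""
    h = np.tanh(state @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"], h


def td_update(params, s, a, r, s_next, done, gamma=0.99, lr=0.01):
    """Apply one semi-gradient Q-learning step in place and return the TD error."""
    q, h = q_values(params, s)
    q_next, _ = q_values(params, s_next)

    # Bootstrap target: it depends on the same parameters we are about to change.
    target = r + (0.0 if done else gamma * np.max(q_next))
    td_error = target - q[a]

    # Gradient of Q(s, a) w.r.t. the hidden pre-activation (target held constant).
    dpre = params["W2"][:, a] * (1.0 - h ** 2)

    # Move Q(s, a) toward the target.
    params["W2"][:, a] += lr * td_error * h
    params["b2"][a] += lr * td_error
    params["W1"] += lr * td_error * np.outer(s, dpre)
    params["b1"] += lr * td_error * dpre
    return td_error


if __name__ == "__main__":
    params = init_q_network(state_dim=3, n_actions=2)
    s = np.array([0.5, -0.2, 0.1])
    s_next = np.array([0.4, -0.1, 0.0])
    err = td_update(params, s, a=0, r=1.0, s_next=s_next, done=False)
    print("TD error on first update:", err)
```

On a single, fixed transition this update simply shrinks the TD error. The difficulty arises in practice because consecutive transitions are correlated and every update also shifts the targets for other state-action pairs.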