xg15 2 days ago

(2021), but still very interesting. The "post-overfitting" training strategy is especially unexpected.

esafak 2 days ago

The low sample efficiency of RL is well explained.