1 resource

  • Anirudh Goyal, Abram L. Friesen, Andrea ... | May 24th, 2022 | preprint

    Most deep reinforcement learning (RL) algorithms distill experience into parametric behavior policies or value functions via gradient updates. While effective, this approach has several disadvantages: (1) it is computationally expensive, (2) it can take many updates to integrate experiences into the parametric model, (3) experiences that are not fully integrated do not appropriately influence the agent's behavior, and (4) behavior is limited by the capacity of the model. In this paper we...
