Steven Hansen
Excited to announce our work on memory generalization in Deep RL is out now! We created a suite of 13 tasks with variants to test interpolation and extrapolation. Our new MRA agent outperforms baselines, but these tasks remain an open challenge. 1/n
Steven Hansen Oct 30
This was a hugely collaborative effort, led by Meire Fortunato, Ryan Faulkner, Melissa Tan, and myself under 's leadership, with Adrià Badia, Gavin Buttimore & also playing key roles. Many more of my colleagues were also very supportive 2/n
Steven Hansen Oct 30
Replying to @Zergylord
Results! 1) Some of these tasks are hard! Underfitting is still an issue in RL. 2) Extrapolation isn't impossible for Deep RL agents, but it requires the right inductive biases and is far from solved. 3) Adding a contrastive loss to an external memory helps. 3/n
Steven Hansen Oct 30
Replying to @Zergylord
Tasks! In addition to a standard train/test split based on partitioning some variable (e.g. color), we also pick a scalar variable (e.g. size of room). We can thus train on some values and test on unseen values inside the range (interp) or outside of the range (extrap) 4/n
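[The train/test split described above can be sketched as follows. This is an illustrative example, not code from the released task suite; the variable names and the specific room-size values are made up.]

```python
# Hypothetical sketch of the interpolation/extrapolation split:
# train on some values of a scalar task variable (e.g. room size),
# test on unseen values inside (interp) or outside (extrap) the training range.

def split_scalar_variable(values, holdout):
    """Partition scalar task-variable values into train / interp-test / extrap-test."""
    train = [v for v in values if v not in holdout]
    lo, hi = min(train), max(train)
    interp = [v for v in holdout if lo < v < hi]        # unseen, inside the training range
    extrap = [v for v in holdout if v < lo or v > hi]   # unseen, outside the training range
    return train, interp, extrap

room_sizes = [4, 5, 6, 7, 8, 9, 10]
train, interp, extrap = split_scalar_variable(room_sizes, holdout=[5, 7, 10])
# train = [4, 6, 8, 9]; interp = [5, 7]; extrap = [10]
```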
Steven Hansen Oct 30
Replying to @Zergylord
Memory Recall Agent! A new agent that combines 1) an external memory, 2) a contrastive auxiliary loss, and 3) jumpy backpropagation for credit assignment. Importantly, all of these pieces were validated through over 10 ablations! 5/n
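[For readers unfamiliar with piece 2), a minimal sketch of an InfoNCE-style contrastive loss over external-memory slots. This illustrates the general idea only, under assumed shapes and names; it is not the MRA implementation.]

```python
import numpy as np

def contrastive_memory_loss(query, memory, positive_idx, temperature=0.1):
    """query: (d,); memory: (n_slots, d); positive_idx: the slot that should match.

    Cross-entropy that pulls the query toward its positive memory slot
    and pushes it away from the other slots.
    """
    sims = memory @ query / temperature            # similarity of query to every slot
    sims -= sims.max()                             # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum())  # log-softmax over slots
    return -log_probs[positive_idx]

rng = np.random.default_rng(0)
memory = rng.normal(size=(8, 16))                  # 8 hypothetical memory slots
query = memory[3] + 0.01 * rng.normal(size=16)     # query close to slot 3
loss = contrastive_memory_loss(query, memory, positive_idx=3)
```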
Steven Hansen Oct 30
Replying to @NeurIPSConf
Feverishly working on preparing the tasks for an external just in time for . We hope these tasks represent an interesting challenge for the deep RL community. Excited to see what y'all can do with them! n/n back to work time
John O'Malia Oct 31
Replying to @Zergylord
This deep dive on memory and generalisation is a really important direction for moving RL forward. Nice work!