Steven Hansen
@Zergylord
Excited to announce our work on memory generalization in Deep RL is out now!
We created a suite of 13 tasks with variants to test interpolation and extrapolation.
Our new MRA agent outperforms baselines, but these tasks remain an open challenge.
arxiv.org/abs/1910.13406
1/n pic.twitter.com/jnnPr5ISk7
Steven Hansen
@Zergylord
30 Oct
This was a hugely collaborative effort, led by Meire Fortunato, Ryan Faulkner, Melissa Tan, and myself under @BlundellCharles's leadership, with Adrià Badia, Gavin Buttimore & @bigblueboo also playing key roles. Many more of my @DeepMindAI colleagues were also very supportive 2/n
Steven Hansen
@Zergylord
30 Oct
Results!
1) Some of these tasks are hard! Underfitting is still an issue in RL
2) Extrapolation isn't impossible for Deep RL agents, but it requires the right inductive biases and is far from solved
3) Adding a contrastive loss to an external memory is a good thing to do
3/n pic.twitter.com/LEbaXFtcqG
Steven Hansen
@Zergylord
30 Oct
Tasks!
In addition to a standard train/test split based on partitioning some variable (e.g. color), we also pick a scalar variable (e.g. size of room). We can thus train on some values and test on unseen values inside the range (interp) or outside of the range (extrap)
4/n pic.twitter.com/RDKaVbQWlz
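To make the split concrete, here is a minimal sketch of partitioning a scalar task variable into train / interpolation / extrapolation sets. The room sizes and holdout values are made-up illustrations, not the paper's actual task settings.

```python
# Hypothetical example: room size is the scalar variable being partitioned.
train_sizes = [4, 6, 8, 10]   # values seen during training
# Unseen values inside [4, 10] test interpolation; values outside test extrapolation.

def split(size):
    """Classify a task variant by its scalar-variable value."""
    if size in train_sizes:
        return "train"
    lo, hi = min(train_sizes), max(train_sizes)
    return "interpolate" if lo <= size <= hi else "extrapolate"

print(split(7))   # unseen, inside the training range
print(split(12))  # unseen, outside the training range
```

The same recipe applies to any scalar the environment exposes (room size, episode length, number of objects), alongside the standard categorical split (e.g. holding out colors).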
Steven Hansen
@Zergylord
30 Oct
Memory Recall Agent!
A new agent that combines
1) an external memory
2) contrastive auxiliary loss
3) jumpy backpropagation for credit assignment
Importantly, all of these pieces were validated through over 10 ablations!
5/n pic.twitter.com/5rQjDjQVYA
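For intuition on component (2), here is a hedged sketch of an InfoNCE-style contrastive loss over external-memory slots, written in plain NumPy. The function name, shapes, and temperature are illustrative assumptions, not the MRA agent's exact loss.

```python
import numpy as np

def contrastive_memory_loss(query, memory, pos_index, temperature=0.1):
    """InfoNCE-style loss: pull `query` toward memory[pos_index],
    push it away from the other memory slots (the negatives)."""
    q = query / np.linalg.norm(query)
    m = memory / np.linalg.norm(memory, axis=1, keepdims=True)
    logits = m @ q / temperature        # scaled cosine similarities, shape (slots,)
    logits -= logits.max()              # subtract max for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[pos_index]        # cross-entropy against the positive slot

rng = np.random.default_rng(0)
memory = rng.normal(size=(8, 16))                # 8 memory slots, 16-dim each
query = memory[3] + 0.01 * rng.normal(size=16)   # query nearly matching slot 3
loss = contrastive_memory_loss(query, memory, pos_index=3)
```

The auxiliary term shapes the memory contents so that reads retrieve the right slot, rather than relying on the task reward alone to train the memory.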
Steven Hansen
@Zergylord
30 Oct
Feverishly working on preparing the tasks for an external release just in time for @NeurIPSConf.
We hope these tasks represent an interesting challenge for the deep RL community.
Excited to see what y'all can do with them!
sites.google.com/corp/view/memo…
n/n
back to work time pic.twitter.com/eurQnUVA2R
John O'Malia
@john_omalia
31 Oct
This deep dive on memory and generalisation is a really important direction for moving RL forward. Nice work!