David Sussillo ☝️🤓 (@SussilloDavid) · Jul 22
#tweeprint
Universality and individuality in neural dynamics across large populations of recurrent networks
arxiv.org/abs/1907.08549.
With fantastic collaborators @niru_m, @ItsNeuronal, @MattGolub_Neuro, @SuryaGanguli.
Many recent studies find striking similarities between representations in biological brains 🧠 and artificial neural networks 🤖 trained to solve analogous tasks.
This is pretty crazy when you think about it because brains and ANNs have serious differences in their biophysical/architectural details.
This raises a fundamental question: How should we interpret the often striking representational similarity of biological and artificial networks? 👩‍🔬👨‍🔬
While the technology is not (yet!) there to address this question with biological brains, we can begin to address it theoretically by comparing various ANN architectures’ representations and dynamics against each other. ⚖️
E.g., when you train recurrent networks with different architectures on the same task, would you expect the solutions to look the same or different? In what ways?
We trained thousands of RNNs to solve simple tasks, to see whether these comparisons (and scientific conclusions) would be sensitive to the particulars of modeling choices, e.g. LSTM vs. GRU vs. vanilla RNN.
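To make this concrete, here is a minimal sketch of what such an architecture sweep can look like in PyTorch. This is not the preprint's training code; `FrequencyRNN` and all hyperparameters here are illustrative.

```python
import torch.nn as nn

# The three recurrent cores being compared.
CELLS = {"vanilla": nn.RNN, "gru": nn.GRU, "lstm": nn.LSTM}

class FrequencyRNN(nn.Module):
    """Recurrent core plus linear readout for a simple timing task."""
    def __init__(self, cell="gru", hidden_size=128):
        super().__init__()
        self.core = CELLS[cell](input_size=1, hidden_size=hidden_size,
                                batch_first=True)
        self.readout = nn.Linear(hidden_size, 1)

    def forward(self, x):          # x: (batch, time, 1)
        h, _ = self.core(x)        # h: (batch, time, hidden)
        return self.readout(h), h  # prediction and hidden states

# One model per architecture; also varying seeds, sizes, and other
# hyperparameters is what gets you to thousands of networks.
models = {name: FrequencyRNN(cell=name) for name in CELLS}
```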
We found evidence for both individuality 💅 and universality 🌌 in the solutions across different RNN architectures, with the geometry of representations tending to be more varied and the dynamics tending to be more universal.
Here’s an example from the preprint: a simple task to produce a sine wave whose frequency is proportional to a given input (command) frequency. We trained recurrent networks to solve this task. pic.twitter.com/AVebkXp8Kr
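A toy version of this task is easy to write down in numpy. The frequency range, time step, and command-to-frequency map below are assumptions for illustration, not the preprint's values.

```python
import numpy as np

def make_sine_batch(batch_size=64, n_steps=500, dt=0.01, rng=None):
    """Constant command input -> target sine whose frequency is
    proportional to the command. Returns inputs (batch, time, 1)
    and targets (batch, time, 1)."""
    rng = np.random.default_rng() if rng is None else rng
    cmd = rng.uniform(0.0, 1.0, size=(batch_size, 1))  # command in [0, 1]
    freq = 1.0 + 4.0 * cmd                             # assumed linear map (Hz)
    t = np.arange(n_steps) * dt
    inputs = np.repeat(cmd[:, None, :], n_steps, axis=1)
    targets = np.sin(2 * np.pi * freq * t[None, :])[..., None]
    return inputs.astype(np.float32), targets.astype(np.float32)
```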
Example state-space trajectories show some differences across architectures. pic.twitter.com/RNc3ePYb3b
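Trajectory plots like these typically come from projecting hidden states onto their top principal components. A minimal sketch (my own, not the preprint's plotting code):

```python
import numpy as np

def pca_project(hidden, n_components=3):
    """Project hidden states (batch, time, units) onto the top principal
    components of the pooled states, for 2-D/3-D trajectory plots."""
    flat = hidden.reshape(-1, hidden.shape[-1])
    mean = flat.mean(axis=0)
    # Rows of vt are principal directions of the centered state cloud.
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return (hidden - mean) @ vt[:n_components].T  # (batch, time, n_components)
```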
We compared the geometry of network representations using canonical correlation analysis (CCA). Taking intracluster distances as the yardstick for intercluster distances, CCA suggests that representations are sensitive to these modeling choices (each dot is a network). pic.twitter.com/nMG5LES3r1
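The thread doesn't spell out the distance computation, so here is one standard CCA recipe (in the spirit of SVCCA); the truncation and the use of 1 minus the mean canonical correlation as a distance are my assumptions.

```python
import numpy as np

def cca_distance(X, Y, n_keep=20):
    """CCA-based distance between two representations, each of shape
    (samples, units) with samples >= units. The canonical correlations
    are the singular values of Qx^T Qy after orthonormalizing each side."""
    qx, _ = np.linalg.qr(X - X.mean(axis=0))
    qy, _ = np.linalg.qr(Y - Y.mean(axis=0))
    rho = np.linalg.svd(qx.T @ qy, compute_uv=False)[:n_keep]
    return 1.0 - rho.mean()
```

Computing this for every pair of trained networks yields the distance matrix behind a plot like this one: if the points cluster by architecture, the geometry is architecture-sensitive.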
Next, we asked if we would reach the same conclusion if we compared the underlying dynamics 🌀. To do this, we used tools from dynamical systems theory (fixed points and linearization) to extract a simple dynamical portrait for each network. pic.twitter.com/ARueBiVffQ
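Fixed points are usually found numerically by minimizing the speed of the dynamics, q(h) = ||F(h, x) - h||^2, from many initial states. A hedged PyTorch sketch of that idea; `step_fn` (one step of the trained RNN) and `h_init` are illustrative names.

```python
import torch

def find_fixed_points(step_fn, h_init, x_const, n_iters=2000, lr=1e-2):
    """Minimize q(h) = ||step_fn(h, x) - h||^2 over a batch of candidate
    states h_init (e.g. sampled from task trajectories). States where q
    ends up near zero are approximate fixed points of the dynamics."""
    h = h_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([h], lr=lr)
    for _ in range(n_iters):
        opt.zero_grad()
        q = ((step_fn(h, x_const) - h) ** 2).sum(dim=-1).mean()
        q.backward()
        opt.step()
    return h.detach()
```

For a GRU core, for example, `step_fn` could be `lambda h, x: cell(x, h)` with `cell = nn.GRUCell(1, hidden_size)`, and `x_const` the constant command input at which you want the dynamical portrait.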
When we then compared network distances using fixed-point topology, we found no real difference across architectures (panel e), suggesting the topology may be universal. pic.twitter.com/yMCy4Ds1qe
Linearization of the dynamics reveals a common motif with some small differences across architectures. The common motif is a nearly linear solution for producing the oscillations; the varied part is the small degree of nonlinearity each network uses to generate them. pic.twitter.com/Uj1a6GTLZz
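The linearization itself is the Jacobian of the state update at a fixed point; a nearly linear oscillator shows up as a complex-conjugate eigenvalue pair near the unit circle. A sketch under the same assumptions as above (`step_fn` as before, unbatched `h_star`):

```python
import torch

def linearized_spectrum(step_fn, h_star, x_const):
    """Eigenvalues of the Jacobian of one RNN step at h_star, shape
    (units,). For a discrete-time oscillator, expect a complex pair with
    magnitude near 1 whose angle sets the rotation per time step."""
    jac = torch.autograd.functional.jacobian(
        lambda h: step_fn(h, x_const), h_star)  # (units, units)
    return torch.linalg.eigvals(jac)
```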
We hope this kind of in silico 🖥️ study helps advance the discussion about the use of ANNs in neuroscience. There are more examples in the preprint: arxiv.org/abs/1907.08549.