Tim Vieira
The fact that evaluating ∇f(x) is as fast as f(x) is very important and often misunderstood
Federico Vaggi · Aug 31
Replying to @xtimv
If I recall, given a function f: R^n -> R^m, adjoint (backward) methods scale with m, while forward sensitivity methods scale with n. In almost all of ML the output is a scalar loss (m = 1), so backward methods dominate.
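The m-vs-n scaling can be seen directly in JAX's two Jacobian transforms (the map `f` below is a made-up example): reverse mode needs one VJP sweep per output, forward mode one JVP sweep per input.

```python
import jax
import jax.numpy as jnp

# Made-up map f: R^n -> R^m for illustration.
def f(x):
    return jnp.stack([jnp.sum(x ** 2), jnp.prod(x)])  # m = 2

x = jnp.ones(4)  # n = 4

# Reverse mode builds the Jacobian row by row: m VJP sweeps (scales with m).
J_rev = jax.jacrev(f)(x)
# Forward mode builds it column by column: n JVP sweeps (scales with n).
J_fwd = jax.jacfwd(f)(x)

print(J_rev.shape)  # (m, n) = (2, 4)
```

With a scalar loss (m = 1), reverse mode needs just a single sweep regardless of n, which is why it dominates in ML.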
Tim Vieira · Aug 31
Replying to @F_Vaggi
Yup! And there is a rich space of hybrid forward-reverse methods for the general (n,m) setting depending on the underlying graph.
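One concrete hybrid, sketched in JAX (the function is a made-up example): forward-over-reverse differentiation, where a reverse sweep handles the R^n -> R gradient and a forward sweep over it recovers the Hessian columns.

```python
import jax
import jax.numpy as jnp

def f(x):  # made-up scalar function
    return jnp.sum(x ** 3)

x = jnp.array([1.0, 2.0])

# Hybrid forward-over-reverse: reverse mode for the gradient,
# then forward mode through it for the n Hessian columns.
H = jax.jacfwd(jax.grad(f))(x)
print(H)  # equals diag(6 * x) for this f
```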
Kyunghyun Cho · Aug 31
Replying to @xtimv
would love to hear what points are misunderstood often
Tim Vieira · Aug 31
Replying to @kchonyc
Beyond what I wrote up in the post?
Robert M. Gower · Sep 1
Replying to @xtimv
And it even extends to Hessian vector products which also have the same order of cost as evaluating the function itself!
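A sketch of that in JAX (made-up function, not from the thread): a Hessian-vector product computed without ever forming the Hessian, by pushing a forward-mode sweep through the reverse-mode gradient, at a small constant multiple of the cost of one evaluation of f.

```python
import jax
import jax.numpy as jnp

def f(x):  # made-up scalar function
    return jnp.sum(jnp.sin(x) ** 2)

x = jnp.zeros(3)
v = jnp.ones(3)

# Hessian-vector product H @ v without materializing H:
# a JVP (forward sweep) through grad(f) (a reverse sweep).
_, hvp = jax.jvp(jax.grad(f), (x,), (v,))
print(hvp)  # equals 2 * v at x = 0, since the Hessian there is 2I
```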
Petr Kungurtsev · Aug 31
Replying to @xtimv @superbobry
If you are interested in some other applications of this technique, this is how we do the back-propagation for shape optimization (shape == parameters) of PDE systems
Scott H. Hawley · Aug 31
Replying to @xtimv
I wrote in at the bottom with a question about this.