Kevin Clark
PhD student with .
13 Tweets
122 Following
534 Followers

Tweets
Kevin Clark · Jan 29
Seminar tomorrow on "Causal Mediation Analysis for Interpreting Neural NLP: The Case of Gender Bias". Please join us!
Kevin Clark · Jan 22
Seminar tomorrow: a speaker from Amazon on "A Discourse Centric Framework for Facilitating Instructor Intervention in MOOC Discussion Forums". Please join us!
Kevin Clark Retweeted
Hamlet 🇩🇴 🇺🇸 · Jan 10
🔥“we train a model on one GPU for 4 days that outperforms GPT (trained using 30x more compute) on the GLUE natural language understanding ... we match the performance of RoBERTa, the current state-of-the-art pre-trained transformer, while using less than 1/4 of the compute.” 🤯
Kevin Clark · Jan 10
Replying to @Mlbot4 @stanfordnlp @GoogleAI
Code will be released early February!
Kevin Clark · Dec 21
Replying to @paulomannjr @lmthang and 4 others
It is in TensorFlow, but the neural architecture is the same as BERT, so the pre-trained weights should be compatible with any PyTorch implementation of BERT, such as Hugging Face's transformers library.
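For readers who want to try that, here is a minimal sketch using Hugging Face's transformers in PyTorch. The checkpoint name `google/electra-small-discriminator` is an assumption based on the weights as later published, not something named in the tweet:

```python
# Sketch: loading the weights through Hugging Face's transformers
# (pip install transformers torch). The checkpoint name below is an
# assumption; substitute whatever name the released weights ship under.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/electra-small-discriminator")
model = AutoModel.from_pretrained("google/electra-small-discriminator")

inputs = tokenizer("The architecture is the same as BERT.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)
```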
Kevin Clark Retweeted
Sam Bowman · Nov 27
New analysis paper from my group! We zoom in on some of Clark et al.'s findings on syntax-sensitive attention heads in BERT (+RoBERTa, +...), and find interestingly mixed results.
Kevin Clark Retweeted
Urvashi Khandelwal · Nov 4
Excited to share new work!!! “Generalization through Memorization: Nearest Neighbor Language Models” We introduce kNN-LMs, which extend LMs with nearest neighbor search in embedding space, achieving a new state-of-the-art perplexity on Wikitext-103, without additional training!
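For intuition about how kNN-LMs work, a minimal numpy sketch of the interpolation at their core; the function, its parameters, and the toy datastore are illustrative, not the paper's code:

```python
import numpy as np

def knn_lm_probs(lm_probs, query, keys, values, vocab_size, k=4, lam=0.25, temp=1.0):
    """Interpolate a base LM's next-word distribution with a kNN distribution.

    lm_probs: (vocab_size,) next-word probabilities from the base LM
    query:    (d,) embedding of the current context
    keys:     (n, d) cached context embeddings (the datastore)
    values:   (n,) next-word ids paired with each cached context
    """
    # Squared L2 distance from the query to every cached context.
    dists = np.sum((keys - query) ** 2, axis=1)
    nn = np.argsort(dists)[:k]                 # the k nearest neighbors
    weights = np.exp(-dists[nn] / temp)        # closer contexts get more mass
    weights /= weights.sum()
    knn_probs = np.zeros(vocab_size)
    np.add.at(knn_probs, values[nn], weights)  # aggregate mass per vocabulary item
    return lam * knn_probs + (1.0 - lam) * lm_probs

# Toy usage with a random datastore (purely illustrative):
rng = np.random.default_rng(0)
V, n, d = 100, 50, 8
p = knn_lm_probs(np.full(V, 1.0 / V), rng.normal(size=d),
                 rng.normal(size=(n, d)), rng.integers(0, V, size=n), V)
assert np.isclose(p.sum(), 1.0)
```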
Kevin Clark Retweeted
Mike Lewis · Oct 31
Excited to share our work on BART, a method for pre-training seq2seq models by de-noising text. BART outperforms previous work on a bunch of generation tasks (summarization/dialogue/QA), while getting similar performance to RoBERTa on SQuAD/GLUE
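As a rough sketch of what "de-noising text" means here: corrupt the input and train the seq2seq model to reconstruct the original. The corruption below is a simplified take on BART-style text infilling (the paper samples span lengths from Poisson(3), including zero-length spans, which are omitted here):

```python
import numpy as np

rng = np.random.default_rng(0)

def text_infilling(tokens, mask_token="<mask>", mask_prob=0.3):
    """Replace random spans of tokens with a single mask token (simplified)."""
    out, i = [], 0
    while i < len(tokens):
        if rng.random() < mask_prob:
            out.append(mask_token)
            i += max(1, rng.poisson(3))  # hide a span; length ~ Poisson(3)
        else:
            out.append(tokens[i])
            i += 1
    return out

source = "the quick brown fox jumps over the lazy dog".split()
corrupted = text_infilling(source)
# Training pair: the encoder reads `corrupted`; the decoder is supervised
# to reconstruct `source`.
```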
Kevin Clark Retweeted
John Hewitt · Sep 10
How do we design probes that give us insight into a representation? In a paper with Percy Liang, our "control tasks" help us understand the capacity of a probe to make decisions unmotivated by the representation. paper: blog:
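The gist of a control task, sketched below: assign each word type a fixed random label, train the same probe on those labels, and compare accuracies. The gap between real-task and control-task accuracy ("selectivity") suggests how much the probe reads structure out of the representation rather than memorizing word identities. Everything here is illustrative, not the paper's code:

```python
import random

def make_control_labels(words, num_labels, seed=0):
    """Assign each word *type* a fixed random label; solvable only by
    memorizing word identities, never by reading linguistic structure."""
    rng = random.Random(seed)
    label_of = {}
    for w in words:
        if w not in label_of:
            label_of[w] = rng.randrange(num_labels)
    return [label_of[w] for w in words]

words = "the cat sat on the mat".split()
print(make_control_labels(words, num_labels=5))  # "the" gets the same label twice

# selectivity = accuracy(probe, real task) - accuracy(probe, control task)
```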
Kevin Clark Retweeted
Grzegorz Chrupała 🇪🇺 · Aug 1
The best paper award went to "What Does BERT Look At? An Analysis of BERT's Attention" by Kevin Clark, Urvashi Khandelwal, Omer Levy and Christopher D. Manning.
Kevin Clark · Jul 11
BAM! Our new paper presents "Born-Again Multi-Task Networks," a simple way to improve multi-task learning using knowledge distillation. With . Paper: Code:
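Schematically, the distillation step blends hard-label cross-entropy with a KL term pulling the multi-task student toward a single-task teacher's soft predictions. A hedged PyTorch sketch; `alpha`, `temp`, and the paper's teacher-annealing details are illustrative or omitted:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, alpha=0.5, temp=1.0):
    """Cross-entropy on gold labels plus KL toward the teacher's predictions."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / temp, dim=-1),
        F.softmax(teacher_logits / temp, dim=-1),
        reduction="batchmean",
    ) * temp ** 2  # standard scaling so gradients stay comparable across temps
    return alpha * hard + (1.0 - alpha) * soft
```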
Kevin Clark · Jun 27
Code for our paper "What Does BERT Look At? An Analysis of BERT's Attention" () has been released!
Kevin Clark · Jun 12
Check out our new paper "What Does BERT Look At? An Analysis of BERT's Attention" with ! Among other things, we show that BERT's attention corresponds surprisingly well to aspects of syntax and coreference.
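A minimal way to look at those attention maps yourself, using Hugging Face's transformers rather than the paper's released code (the layer and head picked below are arbitrary):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tokenizer("The lawyer questioned the witness because she was suspicious.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one tensor per layer, each (batch, heads, seq_len, seq_len)
attn = outputs.attentions[7][0, 10]  # layer 8, head 11 (arbitrary choice)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for i, tok in enumerate(tokens):
    print(f"{tok:>12} -> {tokens[int(attn[i].argmax())]}")  # most-attended token
```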