Thang Luong
@lmthang

Perplexity for a language model, by definition, is computed by first averaging all neg log predictions and then exponentiating. Does that help explain? towardsdatascience.com/perplexity-int…
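To make the order of operations concrete, here is a minimal Python sketch of that definition; the per-token probabilities are invented for illustration, not taken from any model:

```python
import math

# Hypothetical probabilities a model assigns to each observed token.
token_probs = [0.2, 0.5, 0.1, 0.4]

# Step 1: average the negative log-probabilities.
avg_neg_log = sum(-math.log(p) for p in token_probs) / len(token_probs)

# Step 2: exponentiate the average.
perplexity = math.exp(avg_neg_log)

print(f"perplexity = {perplexity:.2f}")  # lower is better
```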

Quoc Le
@quocleix
Jan 28

New paper: Towards a Human-like Open-Domain Chatbot. Key takeaways:
1. "Perplexity is all a chatbot needs" ;)
2. We're getting closer to a high-quality chatbot that can chat about anything
Paper: arxiv.org/abs/2001.09977
Blog: ai.googleblog.com/2020/01/toward… pic.twitter.com/5SOBa58qx3
|
||
|
|
||
|
lerner zhang
@lerner_adams
|
29. sij |
|
By perplexity do you mean average perplexity? I wonder if a weighted average perplexity would be better?
|
||
|
|
||
|
Stéphane Guillitte
@guillittes
|
29. sij |
|
Is perplexity calculated on words or BPE subwords?

Thang Luong
@lmthang
Jan 29

It's on subwords (with 8K units).
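Since the average runs per token, the choice of unit changes both the log-probabilities and the token count in the denominator. A hedged sketch of how the same sentence can score differently under word-level versus subword-level tokenization; the splits and probabilities below are invented for illustration:

```python
import math

def perplexity(probs):
    # exp of the mean negative log-probability over the token sequence
    return math.exp(sum(-math.log(p) for p in probs) / len(probs))

# Word-level: fewer tokens; a rare word gets one very low probability.
word_probs = [0.3, 0.02, 0.4]

# Subword-level (e.g. an 8K-unit vocabulary): the rare word splits into
# several common pieces, each easier for the model to predict.
subword_probs = [0.3, 0.2, 0.25, 0.3, 0.4]

print(f"word-level:    {perplexity(word_probs):.2f}")
print(f"subword-level: {perplexity(subword_probs):.2f}")
```

Because the token count depends on the tokenizer, perplexities computed over different vocabularies are not directly comparable.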
|
||
|
|
||
|
lerner zhang
@lerner_adams
|
29. sij |
|
Thanks
|
||
|
|
||
|
Danny Iskandar
@diskandartweet
|
29. sij |
|
Is there a simple term that normal people could understand?