@lmthang
Implications from the #MeenaBot project:
1. Perplexity might be "the" automatic metric that the field's been looking for.
2. Bots trained on large-scale social conversations & pushed hard for low perplexity will be good.
3. Safety layer is needed for respectful conversations! pic.twitter.com/WHrcstcglt

Thang Luong
@lmthang
Jan 28

Introducing #MeenaBot, a 2.6B-param open-domain chatbot with near-human quality. Remarkably, we show strong correlation between perplexity & humanlikeness!
Paper: arxiv.org/abs/2001.09977
Sample conversations: github.com/google-researc… twitter.com/GoogleAI/statu… pic.twitter.com/3xNSV4r4uB

Thang Luong
@lmthang
Jan 28

#MeenaBot is based on the Evolved Transformer (ET, an improved Transformer) & trained to minimize perplexity, the uncertainty of predicting the next word in a conversation. We built a novel "shallow-deep" seq2seq architecture: 1 ET block for encoder & 13 ET blocks for decoder. pic.twitter.com/Mv2d4Los3k
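
To make the perplexity objective mentioned above concrete, here is a minimal sketch. It assumes we already have the probability a model assigned to each correct next token; the function name `perplexity` and the toy inputs are illustrative and are not taken from the Meena code. Perplexity is the exponential of the average negative log-likelihood (the per-token cross-entropy), so lowering one lowers the other.

```python
import math


def perplexity(token_probs):
    """Perplexity of a sequence, given the probability the model assigned
    to each correct next token (hypothetical input format)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    # Average negative log-likelihood per token = cross-entropy in nats.
    cross_entropy = -sum(math.log(p) for p in token_probs) / len(token_probs)
    # Perplexity is the exponential of the cross-entropy.
    return math.exp(cross_entropy)


if __name__ == "__main__":
    # Toy example: the model gives each correct token probability 0.25.
    print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

A model that always assigns probability 0.25 to the correct token is as uncertain as a uniform choice among four options, which is why its perplexity comes out at about 4.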

Thang Luong
@lmthang
Jan 28

We design a new human evaluation metric, Sensibleness & Specificity Average (SSA), which captures key elements of natural conversations. SSA is also shown to correlate with humanlikeness while being easier to measure. Human scores 86% SSA, #MeenaBot 79%, other best chatbots 56%. pic.twitter.com/I7NKl2b9Tl
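
As a rough illustration of how an SSA score like the ones quoted above could be computed, here is a minimal sketch. It assumes each response carries two human labels, "sensible" and "specific", and that a response only counts as specific if it is also sensible; the label format and the function name `ssa` are hypothetical and not taken from the thread.

```python
def ssa(labels):
    """Sensibleness & Specificity Average over labeled responses.

    `labels` is a list of (sensible, specific) boolean pairs, one per
    response (a hypothetical format chosen for this sketch).
    """
    if not labels:
        raise ValueError("need at least one labeled response")
    n = len(labels)
    # Fraction of responses judged sensible.
    sensibleness = sum(1 for sensible, _ in labels if sensible) / n
    # Fraction judged specific; assumed to require being sensible too.
    specificity = sum(1 for sensible, specific in labels if sensible and specific) / n
    # SSA is the plain average of the two rates.
    return (sensibleness + specificity) / 2


if __name__ == "__main__":
    toy_labels = [(True, True), (True, False), (False, False), (True, True)]
    print(f"SSA = {ssa(toy_labels):.3f}")
```

In this toy set, three of four responses are sensible and two of four are specific, so the two rates are 0.75 and 0.5 and their average is 0.625.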

Sanuj
@sanuj_sharma
Feb 2

Meena is pretty impressive. Was minimizing perplexity (or cross-entropy) the only training objective? Or was there something more to ensure things like consistency during a conversation? It seems like SSA was only used for evaluation. Forgive me if I overlooked any paper details!

Thang Luong
@lmthang
Feb 2

You got it right. We only minimize cross-entropy during training & SSA was used during evaluation.

Andrew Sears
@andrew_sears
Jan 29

I've never heard "Dolphin power!" in a conversation before... perhaps training it on '90s sitcoms might improve the "Perplexity", or just keep that level of awesomeness in the model.
It also appears to have a propensity for saying cool!
Set this up as a Google Home action!

Danny Iskandar
@diskandartweet
Jan 29

Any timeline/plan to release it as open source?