This work is currently a blog post rather than a paper because I have been unsuccessful in empirically validating Tail Free Sampling against Top-K and Nucleus Sampling.
Trenton Bricken
@TrentonBricken
Dec 18

Generating sequences from a language model using Ancestral, Top-K, or Nucleus Sampling? Consider using Tail Free Sampling instead! trentbrick.github.io/Tail-Free-Samp…
👇Thread
Trenton Bricken
@TrentonBricken
Dec 18

Tail Free Sampling tries to ensure you sample diverse, high-quality sequences by finding where the probability distribution over the next token to be generated plateaus.
Here is an example with different hyperparameters: 0.9 (green) and 0.95 (blue) tend to work well. pic.twitter.com/6NslQvQqlg
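
Since the thread only gestures at the mechanism, here is a minimal NumPy sketch of one way the plateau-finding could work, assuming the cutoff is taken where the normalized absolute second derivative of the sorted probabilities has accumulated mass z (the 0.9/0.95 hyperparameter above). The function name `tail_free_sample` and the exact indexing are illustrative; see the blog post for the precise formulation.

```python
import numpy as np

def tail_free_sample(probs, z=0.95, rng=None):
    """Illustrative Tail Free Sampling over a next-token distribution.

    probs: 1-D array of next-token probabilities (summing to 1).
    z: how much of the normalized |second derivative| mass to keep;
       the thread suggests 0.9-0.95 tends to work well.
    """
    rng = rng or np.random.default_rng()

    # Sort probabilities in descending order, keeping original token ids.
    order = np.argsort(probs)[::-1]
    sorted_probs = probs[order]

    # Discrete first and second derivatives of the sorted distribution.
    d1 = np.diff(sorted_probs)
    d2 = np.diff(d1)

    # Normalize the absolute second derivative into a distribution over
    # positions, then keep tokens up to where its cumulative mass first
    # exceeds z -- beyond that point the curve has flattened into the "tail".
    weights = np.abs(d2) / np.abs(d2).sum()
    cutoff = np.argmax(np.cumsum(weights) > z)

    kept = sorted_probs[: cutoff + 1]
    kept = kept / kept.sum()  # renormalize over the surviving head

    # Sample a position within the head, then map back to a token id.
    return order[rng.choice(len(kept), p=kept)]
```

For instance, `tail_free_sample(model_probs, z=0.9)` would draw a token id from the head of the distribution, discarding everything past the plateau rather than a fixed K tokens (Top-K) or a fixed probability mass (Nucleus).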
Trenton Bricken
@TrentonBricken
Dec 18

I argue that this approach explicitly finds the set of "replaceable" tokens for a particular context, and that languages (including the language of biology) have this replaceability property.
If you're interested, please reach out and/or give me feedback.
Trenton Bricken
@TrentonBricken
Dec 18

(Neither Top-K nor Nucleus Sampling has undergone this kind of empirical validation before, probably for the very reasons I am finding it difficult!)
More details on what I have tried and why this validation is hard are in the blog post :)
trentbrick.github.io/Tail-Free-Samp…
Dylan Marshall
@dyl4nm4rsh4ll
Dec 18

Looks like an opportunity to introduce a new metric/benchmark for assessing the quality and diversity of generated text? That would allow comparison of methods (Top-K vs ...)?