The example below uses prompt-based speculative decoding. Specifically, n-gram hashing is used to suggest drafts of up to 64 tokens. The hasher keeps track of n-grams in the observed context, so it is mostly effective for coding tasks. Here is another demo:
29,592 views • 2 months ago • via X (Twitter)
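For readers who want a concrete picture of the n-gram drafting idea described in the post, here is a minimal Python sketch. It is not the implementation shown in the demo: the class `NgramDrafter`, its methods, the n-gram length of 3, and the helper `count_accepted` are all hypothetical names and choices made for illustration. Only the 64-token draft limit comes from the post. The sketch relies on Python's built-in dict hashing of tuples to map each observed n-gram to the position of the token that followed it.

```python
from typing import Dict, List, Tuple


class NgramDrafter:
    """Hypothetical sketch: suggest draft tokens by looking up the latest n-gram."""

    def __init__(self, n: int = 3, max_draft: int = 64) -> None:
        self.n = n                      # n-gram length (assumed value)
        self.max_draft = max_draft      # draft cap; 64 matches the post
        self.tokens: List[int] = []     # every token observed so far
        # Maps an n-gram to the index of the token that most recently followed it.
        self.index: Dict[Tuple[int, ...], int] = {}

    def observe(self, new_tokens: List[int]) -> None:
        """Append tokens (prompt or accepted output) and index their n-grams."""
        for tok in new_tokens:
            self.tokens.append(tok)
            pos = len(self.tokens) - 1  # index of the token just appended
            if pos >= self.n:
                # The n-gram that ends right before this token.
                key = tuple(self.tokens[pos - self.n:pos])
                self.index[key] = pos   # later occurrences overwrite earlier ones

    def draft(self) -> List[int]:
        """Return up to max_draft tokens that followed the current trailing n-gram."""
        if len(self.tokens) < self.n:
            return []
        key = tuple(self.tokens[-self.n:])
        start = self.index.get(key)
        if start is None:
            return []                   # n-gram not seen before: no draft
        return self.tokens[start:start + self.max_draft]


def count_accepted(draft: List[int], verified: List[int]) -> int:
    """Length of the draft prefix that agrees with the target model's tokens."""
    accepted = 0
    for d, v in zip(draft, verified):
        if d != v:
            break
        accepted += 1
    return accepted


if __name__ == "__main__":
    drafter = NgramDrafter(n=3, max_draft=4)
    drafter.observe([1, 2, 3, 4, 5, 6, 1, 2, 3])
    print(drafter.draft())              # [4, 5, 6, 1]
```

In a full speculative decoding loop, the target model would score the drafted tokens in one forward pass and keep only the prefix it agrees with, which is what `count_accepted` stands in for here. Drafts come only from text already seen, which fits the post's remark that the approach is mostly effective for coding tasks.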



