[Video failed to load]
The example below uses prompt-based speculative decoding. Specifically, n-gram hashing is used to suggest drafts of up to 64 tokens. The hasher keeps track of n-grams in the observed context, so it is mostly effective for tasks such as coding, where the output often repeats sequences that have already appeared. Here is another demo:
29,592 views • 2 months ago • via X (Twitter)
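
The n-gram lookup described in the post can be sketched roughly as follows. This is a minimal illustration, not the implementation shown in the demo: the class name `NgramDrafter`, the helper `accept_prefix`, the n-gram size of 3, and the use of a Python dict as the hash table are all assumptions. Only the 64-token draft limit comes from the post.

```python
class NgramDrafter:
    """Toy prompt-based draft proposer (hypothetical, for illustration only)."""

    def __init__(self, n=3, max_draft=64):
        self.n = n                  # n-gram size (assumed value)
        self.max_draft = max_draft  # draft length cap (64 in the post)
        self.tokens = []            # every token observed so far
        self.table = {}             # n-gram tuple -> index of the token that followed it

    def observe(self, new_tokens):
        """Record prompt or generated tokens and index the n-grams they complete."""
        for tok in new_tokens:
            i = len(self.tokens)    # index the new token will occupy
            if i >= self.n:
                # The n tokens just before position i are followed by this token.
                key = tuple(self.tokens[i - self.n:i])
                self.table[key] = i  # keep the most recent occurrence
            self.tokens.append(tok)

    def draft(self):
        """Propose up to max_draft tokens by replaying what followed the current suffix."""
        if len(self.tokens) < self.n:
            return []
        key = tuple(self.tokens[-self.n:])
        start = self.table.get(key)
        if start is None:
            return []               # suffix never seen before: no draft
        return self.tokens[start:start + self.max_draft]


def accept_prefix(draft, target_tokens):
    """Keep the longest draft prefix that matches the target model's own tokens."""
    accepted = []
    for d, t in zip(draft, target_tokens):
        if d != t:
            break
        accepted.append(d)
    return accepted


drafter = NgramDrafter(n=3, max_draft=4)
drafter.observe([1, 2, 3, 4, 5, 6, 1, 2, 3])
proposal = drafter.draft()          # tokens that followed the earlier (1, 2, 3)
print(proposal)
print(accept_prefix(proposal, [4, 5, 9, 9]))
```

In a real decoder the target model would score the whole draft in one forward pass and `target_tokens` would come from that pass. Here they are hard-coded to keep the sketch self-contained.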



