Added context to my tiny diffusion model to enable sequential generation of longer outputs! Currently the context is a quarter of the sequence length (seq_len=256, context_len=64). I have a theory that the less semantic value each token carries, the worse the “curse of parallel decoding” gets. With parallel decoding, we independently predict multiple...
89,040 views • 7 months ago • via X (Twitter)
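
A rough sketch of how context carry-over could extend generation past a single window, using the seq_len=256 and context_len=64 values from the post. Everything else is an assumption for illustration: the post does not show its code, so `dummy_denoise`, `generate_long`, and `VOCAB_SIZE` are hypothetical names, and the stand-in "model" only returns random token ids where a real diffusion model would iteratively unmask positions conditioned on the context.

```python
import random

SEQ_LEN = 256      # window size, from the post
CONTEXT_LEN = 64   # context size, from the post (a quarter of SEQ_LEN)
VOCAB_SIZE = 100   # hypothetical vocabulary size, for illustration only


def dummy_denoise(context, n_new, rng):
    """Hypothetical stand-in for the diffusion model.

    Returns n_new random token ids. A real model would fill these
    positions by denoising, conditioned on `context`.
    """
    return [rng.randrange(VOCAB_SIZE) for _ in range(n_new)]


def generate_long(total_len, denoise=dummy_denoise, seed=0):
    """Generate total_len tokens block by block.

    Each block after the first is conditioned on the last CONTEXT_LEN
    tokens generated so far, so one window holds the context plus
    SEQ_LEN - CONTEXT_LEN new tokens (assumed layout, not confirmed
    by the post).
    """
    rng = random.Random(seed)
    tokens = []
    while len(tokens) < total_len:
        context = tokens[-CONTEXT_LEN:]              # empty for the first block
        n_new = SEQ_LEN - len(context)               # room left in the window
        n_new = min(n_new, total_len - len(tokens))  # do not overshoot the target
        tokens.extend(denoise(context, n_new, rng))
    return tokens
```

Under this layout the first block fills all 256 positions, and every later block adds at most 192 new tokens on top of 64 carried-over ones, so `generate_long(1000)` takes five model calls.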

