Loading video...

Video Failed to Load

Go Home

PSA: DeepSeek R1 Distill Llama 70B speculative decoding version is now live on Groq Inc for Dev Tier. We just made fast even faster for instant reasoning. 🏁

44,675 views • 1 year ago •via X (Twitter)

11 Comments

Hatice Ozen's profile picture
Hatice Ozen1 year ago

1/5 What is speculative decoding? It's a technique that uses a smaller, faster model to predict a sequence of tokens, which are then verified by the main, more powerful model in parallel. The main model evaluates these predictions and determines which tokens to keep or reject.

Hatice Ozen's profile picture
Hatice Ozen1 year ago

2/5 Speculative decoding achieves faster inference because the main model can verify multiple tokens in parallel rather than generating them one-by-one. This parallel verification is significantly faster than traditional sequential token generation.

Hatice Ozen's profile picture
Hatice Ozen1 year ago

3/5 Think of it like pair programming where your junior dev (small model) writes the first draft of code, and the senior dev (large model) reviews and corrects it. When the junior gets it right and the draft aligns, you save a lot of time.

Hatice Ozen's profile picture
Hatice Ozen1 year ago

4/5 The efficiency comes from parallel verification - while the main model still verifies each token, it can do this simultaneously for many tokens. When wrong? No problem, the main model corrects course. This means much faster inference without having to compromise on quality.

Hatice Ozen's profile picture
Hatice Ozen1 year ago

5/5 Really excited for you all to try it. Will get around to doc updates, but you can just use the `deepseek-r1-distill-llama-70b-specdec` model ID to try. Let us know what else you'd like to see below and have fun building with instant reasoning! 💪

Ben Everman's profile picture
Ben Everman1 year ago

@GroqInc Any plans for 670B?

Jasper's profile picture
Jasper1 year ago

@GroqInc The speed is insane! Would love to have your machines and models on our platform

Mike Sulka's profile picture
Mike Sulka1 year ago

@GroqInc Great stuff!

Charlie Greenman's profile picture
Charlie Greenman1 year ago

@GroqInc cool

Rish e/acc's profile picture
Rish e/acc1 year ago

@GroqInc This is fkin awesome, I hadn’t heard of speculative deckding. Can you recommend any literature etc on the subject?

Hatice Ozen's profile picture
Hatice Ozen1 year ago

@GroqInc 100% agree and recommend this white paper to learn more:

Related Videos