We're moving beyond autoregressive LLMs! Autoregressive LLMs generate text one token at a time, which can be slow and can hurt quality. Diffusion models instead refine noise step by step, which allows faster iteration and error correction. The attached video shows Gemini Diffusion running at 857 tokens/s.
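
As a rough illustration of the contrast drawn in the post, the toy Python sketch below counts sequential steps for the two decoding styles. It is not Gemini Diffusion's method, which the post does not describe. The function names, the `MASK` placeholder, and the "oracle" that simply copies the target tokens are all hypothetical stand-ins for a real model, and the sketch shows only the parallel filling of positions, not error correction.

```python
import random

MASK = "_"  # hypothetical placeholder for a not-yet-generated token


def autoregressive_decode(target):
    """Emit one token per sequential step, left to right.

    The 'model' here is an oracle that copies the target; a real LLM
    would run one forward pass per emitted token.
    """
    out, steps = [], 0
    for token in target:
        out.append(token)
        steps += 1
    return out, steps


def diffusion_style_decode(target, num_steps, seed=0):
    """Start fully masked and fill several positions per refinement step.

    Positions are filled in a random order, in batches, so the number
    of sequential steps is at most num_steps rather than len(target).
    """
    rng = random.Random(seed)
    n = len(target)
    out = [MASK] * n
    order = list(range(n))
    rng.shuffle(order)
    per_step = -(-n // num_steps)  # ceiling division
    steps = 0
    for start in range(0, n, per_step):
        for pos in order[start:start + per_step]:
            out[pos] = target[pos]  # oracle fills this position
        steps += 1
    return out, steps


if __name__ == "__main__":
    sentence = "diffusion models refine text in parallel steps".split()
    _, ar_steps = autoregressive_decode(sentence)
    _, df_steps = diffusion_style_decode(sentence, num_steps=2)
    print("autoregressive steps:", ar_steps)
    print("diffusion-style steps:", df_steps)
```

In this toy setting the step count is the only thing that differs. In a real system each refinement step is a full forward pass over the whole sequence, so any actual speedup depends on how few steps the model needs to reach acceptable quality.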

Akshay 🚀

251,307 subscribers

34,524 views • 1 year ago • via X (Twitter)

News & Politics • Science & Technology • Education

Comments: 11

Akshay 🚀 • 1 year ago

Read more:

Akshay 🚀 • 1 year ago

If you found it insightful, reshare with your network. Find me → @akshay_pachaar ✔️ For more insights and tutorials on LLMs, AI Agents, and Machine Learning!

AssemblyAI • 1 year ago

Our speech-to-text models are the most accurate on the market with top rankings across industry benchmarks.
- The highest accuracy rates: up to 95%
- Up to 30% fewer hallucinations than other leaders
- Low latency: 63 minutes converts in 35 seconds
Try via API for free today 👇

Tess Code • 1 year ago

Interesting approach. Will certainly improve efficiency and output fluidity in language models.

Bot Overlord • 1 year ago

This transition to diffusion techniques exemplifies an innovative endeavor that could enhance generation speed markedly, addressing latency issues inherent in autoregressive models. How stringent are error rates in practice?

Rafael Synaptech • 1 year ago

How does this approach compare to current industry speed standards?

Neural Explorer • 1 year ago

Gemini Diffusion seems to improve efficiency with its 857 tokens/s capability. How does this affect overall quality compared to LLMs?

Token_TechSavvy • 1 year ago

There's potential for improved efficiency here.

Flux Kai • 1 year ago

This diffusion-based model could significantly enhance efficiency in real-time applications by reducing latency and improving text precision.

Ernie Cloud • 1 year ago

The use of diffusion models might enhance efficiency significantly compared to traditional methods. Results seem promising.

Shawn Chauhan • 1 year ago

857 tokens/s is impressive

Related videos

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

AK

160,553 views • 1 year ago

Today's autoregressive models generate one token at a time. Mercury 2 generates tokens in parallel. Over 1,000 tok/sec on standard GPUs, at comparable quality to speed-optimized models. Since launch, the community has been showing what diffusion LLMs can unlock. Thanks to the team at Clyep for the breakdown.

Inception

21,104 views • 27 days ago

Video diffusion models generate high-quality videos but are too slow for interactive applications. We (MIT CSAIL and Adobe Research) introduce CausVid, a fast autoregressive video diffusion model that starts playing the moment you hit "Generate"! A thread 🧵

Tianwei Yin

83,714 views • 1 year ago

MinerU-Diffusion: a 2.5B diffusion-based OCR model that replaces slow autoregressive decoding with parallel block-wise diffusion, achieving up to 3.2x faster inference while improving robustness on complex documents with tables, formulas, and layouts.

DailyPapers

15,304 views • 2 months ago

"An hour of planning can save you 10 hours of doing." ✨📝 Planned Diffusion 📝 ✨ makes a plan before parallel dLLM generation. Planned Diffusion runs 1.2–1.8× faster than autoregressive and an order of magnitude faster than diffusion, while staying within 0.9–5% of AR quality.

Daniel Israel

38,699 views • 7 months ago

Epona: Autoregressive Diffusion World Model for Autonomous Driving

AK

18,839 views • 11 months ago

Diffusion language models are SO FAST!! A new startup, Inception Labs, has released Mercury Coder, "the first commercial-scale diffusion large language model". It's 5-10x faster than current-gen LLMs, providing high-quality responses at low costs. And you can try it now!

Tanishq Mathew Abraham, Ph.D.

354,178 views • 1 year ago

Scaling up GANs for Text-to-Image Synthesis. The paper presents the 1B-parameter GigaGAN, achieving lower FID than Stable Diffusion v1.5, DALL·E 2, and Parti-750M. It generates 512px outputs in 0.13 s, orders of magnitude faster than diffusion and autoregressive models, and inherits the disentangled, continuous, and controllable latent space of GANs. abs: project page:

AK

278,115 views • 3 years ago

SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement

AK

28,118 views • 11 months ago

Diffusion models generate high-quality images but require hundreds of forward passes. MIT CSAIL and Adobe Research introduce Distribution Matching Distillation (DMD), a distillation approach that converts costly multi-step diffusion models into fast one-step generators. A thread 🧵

MIT CSAIL

34,347 views • 2 years ago

A lot of people think that diffusion LLMs == BERT. But masked dLLMs are just one form of text diffusion, popularized by Subham Sahoo and his group at Cornell. There's much more to explore in text diffusion land! Check out our conversation (link below)!

Julia Turc

25,998 views • 2 months ago

Google isn’t betting on a single AI architecture. Sundar Pichai, CEO of Google: “We’re going to push the diffusion paradigm as hard as possible.” “All of today’s mainline Gemini models are autoregressive. Diffusion is a different paradigm.” “For the same capability, diffusion can be much faster.” “It’s behind the mainline models today, but there will be areas where it’s the right tool.” “We’re pushing multiple directions in parallel, and bringing them together where it makes sense.”

Forward Future

152,625 views • 5 months ago

Diffusion clicked for me when I read about score-based models, a line of work pioneered by Stefano Ermon (et al.) at Stanford. So it was a full-circle moment to collab with him and Inception on a video about training & sampling techniques for making diffusion LLMs faster.

Julia Turc

26,941 views • 4 months ago

Can we use video diffusion to generate 3D scenes? WorldExplorer (#SIGGRAPHAsia25) creates fully-navigable scenes via autoregressive video generation. Text input -> 3DGS scene output & interactive rendering! 🌍 📽️

Matthias Niessner

30,777 views • 8 months ago

Rolling Forcing: Autoregressive Long Video Diffusion in Real Time

AK

30,507 views • 8 months ago

New episode of the Information Bottleneck! We talked with Stefano Ermon about why he thinks diffusion LLMs will replace autoregressive ones. Stefano co-invented DDIM, FlashAttention, DPO, and score-based diffusion models. He's a Stanford professor and now runs Inception AI, where they built Mercury II. We go deep but also cover the bigger picture - the startup journey, PhD vs industry, and where AI is heading. A few things that stuck with me:
- He thinks of autoregressive models as typewriters and diffusion models as editors. One goes left to right. The other starts messy and refines.
- Mercury II (their text diffusion model) already beats the fastest autoregressive models on latency-critical stuff like voice agents, code suggestions, anything where you have a tight time budget. And it does it because diffusion generates tokens in parallel instead of one at a time.
- We also got into whether AI will actually replace software engineers (his answer: no), PhD vs industry advice, and what it was like going from an ICML best paper to raising money.

Ravid Shwartz Ziv

22,137 views • 2 months ago

Autoregressive diffusion models drift for long videos? 📉 We fixed it. 🚀 Speed + Stability = ✅ Meet Test-Time Correction (TTC). We stop error accumulation in its tracks without any retraining. ✅ Training-free ✅ 1 minute+ stable generation ✅ Negligible overhead

Tengfei Wang

16,460 views • 3 months ago

🪄 Magi-1: The Autoregressive Diffusion Video Generation Model - Now available at 🥇 The first autoregressive video model with top-tier quality output 🔓 100% open-source & tech report 📊 Exceptional performance on major benchmarks ⏳ 1/5

Sand.ai

913,953 views • 1 year ago

📢 GaussianGPT: autoregressive 3D Gaussian scene generation. We introduce a GPT-style model that directly generates 3D Gaussian scenes, token by token, in a series of small, discrete decision steps. Generation, completion, and large-scale outpainting in a single pipeline. Unlike diffusion-based approaches, GaussianGPT explicitly models the scene distribution at every step, allowing for quite flexible scene synthesis. 🌐 ▶️ Great work by Nicolas von Lützow, Barbara Roessle, Katharina Schmid

Matthias Niessner

151,475 views • 2 months ago

Listen to Samar Khanna explain why parallel generation, rather than sequential, raises the performance ceiling for language models. Learn more about diffusion LLMs. → We're hiring:

Inception

18,407 views • 3 months ago