Large Language Diffusion with Masking (LLaDA) models are here, and their generation looks so fucking dope! 🤯 True to Yann LeCun's vision: ditch the auto-regressive bits and approximate the language distribution via Maximum Likelihood Estimation! So cool to watch the model denoise text from tokens in real time! -...

21,394 views • 1 year ago • via X (Twitter)
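
The "denoise text from tokens" behaviour the post describes can be pictured with a toy loop. This is a minimal sketch, not LLaDA's implementation: `toy_predictor` is a hypothetical stand-in for the trained mask predictor (it simply knows the target sentence and attaches a random confidence), and the rule of revealing the most confident guesses first is an assumed remasking strategy that the post itself does not specify.

```python
import random

MASK = "[MASK]"

def toy_predictor(tokens, target, rng):
    """Hypothetical stand-in for a mask predictor.

    For every masked position, return (guessed token, confidence).
    A real model would be a neural network scoring the whole sequence.
    """
    return {
        i: (target[i], rng.random())
        for i, tok in enumerate(tokens)
        if tok == MASK
    }

def denoise(target, steps, seed=0):
    """Start from a fully masked sequence and reveal tokens over `steps` rounds."""
    rng = random.Random(seed)
    tokens = [MASK] * len(target)
    history = [list(tokens)]
    for step in range(steps):
        preds = toy_predictor(tokens, target, rng)
        if not preds:
            break  # nothing left to unmask
        # Reveal an even share of the remaining masks each round,
        # keeping the most confident guesses and leaving the rest masked.
        remaining_steps = steps - step
        n_reveal = -(-len(preds) // remaining_steps)  # ceiling division
        best = sorted(preds, key=lambda i: preds[i][1], reverse=True)[:n_reveal]
        for i in best:
            tokens[i] = preds[i][0]
        history.append(list(tokens))
    return tokens, history

if __name__ == "__main__":
    sentence = "diffusion models denoise text in parallel".split()
    _, snapshots = denoise(sentence, steps=3)
    for snapshot in snapshots:
        print(" ".join(snapshot))
```

Each printed line is one round: positions fill in out of order rather than left to right. Note that the toy allocates the full fixed-length buffer up front and re-scores every masked position in each round.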

12 comments

Vaibhav (VB) Srivastav • 1 year ago

Check out the demo here:

Vaibhav (VB) Srivastav • 1 year ago

Model checkpoints here:

AssemblyAI • 1 year ago

Announcing: Our most advanced speech-to-text model goes beyond accuracy to capture the real-world complexity of human conversation and deliver reliable, source-of-truth audio data. Explore Universal-2 updates 👇

Chaithanya Kumar • 1 year ago

@ylecun @Stardust_nds Check this out, buddy: something that we have been discussing. Also LLaDA

HosseinAgha • 1 year ago

@ylecun This is interesting! Finally a new architecture for LLMs. I don't think this solves any of the @ylecun concerns with transformer-based autoregressive LMs: no world model, no video understanding, etc.

luis • 1 year ago

@ylecun Omg, I think this idea is perfect for making the answers more precise 🥵 hello 100 precision

marko. • 1 year ago

@ylecun Since you have to compute the whole maximum possible response length every time, what does this mean for VRAM requirements when deploying these models?

Futurist Avenue • 1 year ago

@ylecun How does this stack up with Inception?

AI at Meta • 1 year ago

Llama has now been downloaded over 1 Billion times! A note to: The researchers at Meta training these models — and those building on the research in other labs. The developers and enthusiasts on r/LocalLlama, @huggingface and more; experimenting with new models and creating derivatives. The small startups and big enterprises alike who are creating a new wave of AI-powered products, built with Llama. The global AI community. Your actions speak louder than words, thank you for making it abundantly clear — a billion times over — that open source AI is how we'll create the next wave of world changing technologies, together. 🦙❤️

Hunyuan • 1 year ago

Coming soon: HunYuan-T1, the first ultra-large Mamba-powered reasoning model! Stay tuned! 🚀

AK • 1 year ago

Bytedance just dropped DAPO on Hugging Face. An Open-Source LLM Reinforcement Learning System at Scale

Jeremy Howard • 1 year ago

Announcing fasttransform: a Python lib that makes data transformations reversible/extensible. No more writing inverse functions to see what your model sees. Debug pipelines by actually looking at your data. Built on multi-dispatch. Work w/ @R_Dimm

Related videos

We've officially released and open-sourced HunyuanImage 2.1, our latest text-to-image model. The new model delivers on our commitment to balancing performance and quality. With native 2K image generation, HunyuanImage 2.1 is an advanced open-source text-to-image model. 🎨

✨ New in 2.1:
🔹 Advanced Semantics: Supports ultra-long and complex prompts of up to 1000 tokens, and precisely controls the generation of multiple subjects in a single image.
🔹 Precise Chinese and English Text Rendering with seamless image–text integration: The model naturally integrates text into images, making it suitable for a wide range of applications such as product covers, illustrations, and poster design.
🔹 Rich Styles and High Aesthetic: Capable of generating images in various styles, including photorealistic portraits, comics, and vinyl figures, it delivers outstanding visual appeal and artistic quality.
🔹 High-Quality Generation: Efficiently produces ultra-high-definition (2K) images in the same time other models take to generate a 1K image.

HunyuanImage 2.1 uses two text encoders: a multimodal large language model (MLLM) to improve the model's image and text alignment capabilities, and a multi-language character-aware encoder to improve text rendering capabilities. The model is a single- and double-stream diffusion transformer with 17B parameters.

We've also open-sourced the weights of the accelerated version with meanflow, which reduces inference steps from 100 to just 8, and PromptEnhancer, the first industrial-grade rewriting model that enhances your prompts for more nuanced and expressive image generation.

Now, creators turn complex ideas (like posters with slogans or multi-panel comics) into visuals faster than ever. We're just getting started. Stay tuned for our native multimodal image generation model coming soon.

🌐Website: 🔗Github: 🤗Hugging Face: ✨Hugging Face Demo:

Tencent Hy

89,257 views • 9 months ago