Introducing SDXL Turbo: A real-time text-to-image generation model. SDXL Turbo achieves state-of-the-art performance with a new distillation technology, enabling single-step image generation with unprecedented quality, reducing the required step count from 50 to just one. The code, research paper, and weights for non-commercial use are now available on our...

976,312 views • 2 years ago • via X (Twitter)

Comments: 10

Tom Osman 🐦‍⬛ • 2 years ago

Can you let us catch our breath for 1 minute at least

Simon C • 2 years ago

The world is ridiculous right now. God. Real time images updating as I type. Amazing work @StabilityAI team. As usual.

Walgtech 👨🏻‍💻 • 2 years ago

First run 🤯😅

Smoke-away • 2 years ago

SDXL Turbo 🔥 Now I just need DALL-E Turbo @ChatGPTapp It's the Stable Diffusion/DALL-E 2 release cycle all over again 💯

Stable Diffusion 🎨 AI Art • 2 years ago

👀

無限 💀 • 2 years ago

LFG!!! Stability killing it out here :)

Fabella • 2 years ago

@replicate make it real.

Burkay Gur • 2 years ago

Just gonna drop a playground here:

Cavit Erginsoy • 2 years ago

Lol are you trolling OAI with that naming?

s3nh • 2 years ago

Pls I have a hangover from img2vid let me rest

Related videos

We've officially released and open-sourced HunyuanImage 2.1, our latest text-to-image model. The new model delivers on our commitment to balancing performance and quality. With native 2K image generation, HunyuanImage 2.1 is an advanced open-source text-to-image model.🎨

✨ New in 2.1:
🔹 Advanced Semantics: Supports ultra-long and complex prompts of up to 1000 tokens, and precisely controls the generation of multiple subjects in a single image.
🔹 Precise Chinese and English Text Rendering with seamless image–text integration: The model naturally integrates text into images, making it suitable for applications such as product covers, illustrations, and poster design.
🔹 Rich Styles and High Aesthetic: Capable of generating images in various styles, including photorealistic portraits, comics, and vinyl figures, it delivers outstanding visual appeal and artistic quality.
🔹 High-Quality Generation: Efficiently produces ultra-high-definition (2K) images in the same time other models take to generate a 1K image.

HunyuanImage 2.1 uses two text encoders: a multimodal large language model (MLLM) to improve the model's image and text alignment capabilities, and a multi-language character-aware encoder to improve text rendering capabilities. The model is a single- and double-stream diffusion transformer with 17B parameters. We've also open-sourced the weights of the accelerated version with meanflow, which reduces inference steps from 100 to just 8, and PromptEnhancer, the first industrial-grade rewriting model that enhances your prompts for more nuanced and expressive image generation.

Now, creators turn complex ideas, like posters with slogans or multi-panel comics, into visuals faster than ever. We’re just getting started. Stay tuned for our native multimodal image generation model coming soon.

🌐Website: 🔗Github: 🤗Hugging Face: ✨Hugging Face Demo:

Tencent Hy

89,257 views • 9 months ago

We’re excited to announce the release and open-sourcing of HunyuanImage 3.0, the largest and most powerful open-source text-to-image model to date, with over 80 billion total parameters, of which 13 billion are activated per token during inference. Its results are fully comparable to those of the industry’s flagship closed-source model.🚀🚀🚀

HunyuanImage 3.0 originates from our internally developed native multimodal large language model, with fine-tuning and post-training focused on text-to-image generation. This unique foundation gives the model a powerful set of capabilities:
✅ Reason with world knowledge
✅ Understand complex, thousand-word prompts
✅ Generate precise text within images

Unlike image generation models built on the traditional DiT architecture, HunyuanImage 3.0’s MoE architecture uses a Transfusion-based approach to deeply couple Diffusion and LLM training into a single, powerful system. Built on Hunyuan-A13B, HunyuanImage 3.0 was trained on a massive dataset: 5 billion image-text pairs, video frames, interleaved image-text data, and 6 trillion tokens of text corpora. This hybrid training across multimodal generation, understanding, and LLM capabilities allows the model to seamlessly integrate multiple tasks.

Whether you're an illustrator, designer, or creator, this is built to slash your workflow from hours to minutes. HunyuanImage 3.0 can generate intricate text, detailed comics, expressive emojis, and lively, engaging illustrations for educational content. The current release focuses solely on text-to-image generation; future updates will include image-to-image, image editing, multi-turn interaction, and more.

👉🏻Try it now: 🔗GitHub: 🤗Hugging Face:

Tencent Hy

412,572 views • 9 months ago