New text and image to video generation AI model Open-Sora-Plan-v1.3.0

51,838 views • 1 year ago • via X (Twitter)

6 comments

AK • 1 year ago

model:

Cesar Silva • 1 year ago

How can I use it?

Aditya Singh • 1 year ago

How many H100s do we need?

Agbomekhe Iwonii • 1 year ago

Waoh

Romain Abdel-Aal • 1 year ago

awesome :)

Jonah • 1 year ago

@TomLikesRobots So Sora is only on hugging face?

Related videos

We've officially released and open-sourced HunyuanImage 2.1, our latest text-to-image model. The new model delivers on our commitment to balancing performance and quality. With native 2K image generation, HunyuanImage 2.1 is an advanced open-source text-to-image model.🎨

✨ New in 2.1:
🔹Advanced Semantics: Supports ultra-long and complex prompts of up to 1000 tokens, and precisely controls the generation of multiple subjects in a single image.
🔹Precise Chinese and English Text Rendering with seamless image-text integration: The model naturally integrates text into images, making it suitable for applications such as product covers, illustrations, and poster design.
🔹Rich Styles and High Aesthetic: Capable of generating images in various styles, including photorealistic portraits, comics, and vinyl figures, it delivers outstanding visual appeal and artistic quality.
🔹High-Quality Generation: Efficiently produces ultra-high-definition (2K) images in the same time other models take to generate a 1K image.

HunyuanImage 2.1 uses two text encoders: a multimodal large language model (MLLM) to improve the model's image and text alignment capabilities, and a multi-language character-aware encoder to improve text rendering capabilities. The model is a single- and double-stream diffusion transformer with 17B parameters.

We've also open-sourced the weights of the accelerated version with meanflow, which reduces inference steps from 100 to just 8, and PromptEnhancer, the first industrial-grade rewriting model that enhances your prompts for more nuanced and expressive image generation.

Creators can now turn complex ideas, such as posters with slogans or multi-panel comics, into visuals faster than ever. We're just getting started. Stay tuned for our native multimodal image generation model coming soon.

🌐Website: 🔗Github: 🤗Hugging Face: ✨Hugging Face Demo:

Tencent Hy

89,257 views • 9 months ago