Introducing StyleDrop, a model that allows a significantly higher level of stylized text-to-image synthesis by using a few style reference images that describe the style for text-to-image generation, bypassing the burden of text prompt engineering.

80,377 views • 2 years ago • via X (Twitter)

10 Comments

kache • 2 years ago

model? code?

Alex Volkov (Thursd/AI) 🔜 AIENG summit NY • 2 years ago

Shoutout @natanielruizg 👏

Max (e/acc) • 2 years ago

Could you please introduce Gemini (at least pro) to EU, and Gemini Ultra to the world?🙂

Digital Adam • 2 years ago

@natanielruizg IPAdapter?

Nader Ale Ebrahim • 2 years ago

Impressive work, @GoogleAI! This innovation promises to simplify the process and enhance the quality of text-to-image generation. Keep pushing the boundaries of AI research! 🌟 #AI #Research #Innovation

takeyourmeds • 2 years ago

watchumean introducing that's old stuff in stable diffusion like a year old

Rom_AI • 2 years ago

Where is our access, Mr. Google?

Sumone . • 2 years ago

Just looking like a wow !!!

Mr.D • 2 years ago

That sounds like an exciting advancement in text-to-image synthesis! StyleDrop seems to offer a more efficient approach by utilizing style reference images instead of relying solely on text prompts. This could potentially lead to more accurate and diverse image generation. I'm curious to learn more about how this model works and the results it can produce. @mira_hurley @TimeForPlanX

xiutai • 2 years ago

@PublicAI_ #AI

Related Videos

We've officially released and open-sourced HunyuanImage 2.1, our latest text-to-image model. The new model delivers on our commitment to balancing performance and quality. With native 2K image generation, HunyuanImage 2.1 is an advanced open-source text-to-image model.🎨

✨ New in 2.1:
🔹Advanced Semantics: Supports ultra-long and complex prompts of up to 1000 tokens, and precisely controls the generation of multiple subjects in a single image.
🔹Precise Chinese and English Text Rendering with seamless image–text integration: The model naturally integrates text into images, making it suitable for a wide range of applications such as product covers, illustrations, and poster design.
🔹Rich Styles and High Aesthetic: Capable of generating images in various styles, including photorealistic portraits, comics, and vinyl figures, it delivers outstanding visual appeal and artistic quality.
🔹High-Quality Generation: Efficiently produces ultra-high-definition (2K) images in the same time other models take to generate a 1K image.

HunyuanImage 2.1 uses two text encoders: a multimodal large language model (MLLM) to improve the model's image and text alignment capabilities, and a multi-language character-aware encoder to improve text rendering capabilities. The model is a single- and double-stream diffusion transformer with 17B parameters.

We've also open-sourced the weights of the accelerated version with meanflow, which reduces inference steps from 100 to just 8, and PromptEnhancer, the first industrial-grade rewriting model that enhances your prompts for more nuanced and expressive image generation.

Now, creators turn complex ideas, like posters with slogans or multi-panel comics, into visuals faster than ever. We’re just getting started. Stay tuned for our native multimodal image generation model coming soon.

🌐Website: 🔗Github: 🤗Hugging Face: ✨Hugging Face Demo:

Tencent Hy

89,257 views • 9 months ago