Made this video (🎶) with a Midjourney v6 image! Started by upscaling/refining with Magnific.ai, pulled a Marigold depth map from that in ComfyUI, then used it as a displacement map in Blender, where I animated this camera pass with some relighting and narrow depth of field. 🧵1/12

135,658 views • 2 years ago • via X (Twitter)

Comments: 11

CoffeeVectors • 2 years ago

Here's the base image and the before/after in @Magnific_AI. Even though MJv6 has an upscaler, Magnific gave me better eyelid and skin details for this case. (Fun fact, this image was from a v4 prompt from summer last year, when MJ had just released a new beta upscaler.) 2/12

CoffeeVectors • 2 years ago

Next step was using the new Marigold Depth Estimation node in ComfyUI to get an extremely detailed depth map. Note that I'm saving the result as an EXR file (important for adjusting levels later), and that the remap and colorizing nodes are just for visualization. 3/12
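To illustrate why the float EXR matters while the remap is only for viewing, here is a minimal sketch in plain Python. The function name and the sample depth values are hypothetical, and this is not the ComfyUI node's actual code: it just shows how squeezing float depth into 8 bits can merge nearby values that the EXR would keep distinct.

```python
def remap_for_preview(depth):
    """Min-max normalize float depth values to 8-bit (0-255), for viewing only."""
    lo, hi = min(depth), max(depth)
    span = (hi - lo) or 1.0  # avoid division by zero on a completely flat map
    return [round(255 * (d - lo) / span) for d in depth]


# Hypothetical float depth samples: three very close values and one far value.
depth = [0.1000, 0.1001, 0.1002, 0.9000]
preview = remap_for_preview(depth)

print(preview)             # the three close values land on the same 8-bit level
print(len(set(depth)))     # distinct values in the float data
print(len(set(preview)))   # distinct values left after the 8-bit remap
```

The float data still holds the small differences, which is what later level and curve adjustments can stretch apart.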

CoffeeVectors • 2 years ago

To install, I just searched for Marigold with the custom node manager in the most updated version of ComfyUI. You can also refer to the GitHub repo. 4/12

CoffeeVectors • 2 years ago

I kept denoise and n_repeat at 10 initially then pushed the final to 40 & 30. I found that I got the best map detail around 1024 x 1024 resolution. Above 1600 I got muddy results, and above 2000 render times shot thru the roof/VRAM ceilings on my 4090 laptop. 5/12
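If your source image is larger than that sweet spot, one way to pick an input size is to scale the longest side to about 1024 and keep the aspect ratio. This is only a sketch: the helper name is made up, and snapping to a multiple of 8 is my assumption (a common convention for latent diffusion models), not something the thread or the node requires.

```python
def fit_longest_side(width, height, target=1024, multiple=8):
    """Scale (width, height) so the longest side is about `target` pixels.

    Both sides are snapped to a multiple of `multiple` (an assumed convention).
    """
    scale = target / max(width, height)

    def snap(value):
        return max(multiple, round(value * scale / multiple) * multiple)

    return snap(width), snap(height)


# Hypothetical source sizes, not taken from the thread.
print(fit_longest_side(4096, 4096))
print(fit_longest_side(3000, 2000))
```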

CoffeeVectors • 2 years ago

I imported the EXR into Photoshop & used layers of level/curve adjustments to make the final TIFF file for Blender. I tried to create distance between the tip of the nose and the rest of the face. I did this by eye/trial & error, so it's something you'll need to tinker with. 6/12
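For anyone who would rather script that adjustment, a levels pass can be sketched as below. This is a common textbook formulation (black point, white point, gamma), not Photoshop's exact implementation, and the numbers are placeholders to tinker with, not the values used for this piece.

```python
def levels(x, black=0.0, white=1.0, gamma=1.0):
    """Apply a basic levels adjustment to one normalized depth value (0.0-1.0).

    Values at or below `black` map to 0, values at or above `white` map to 1,
    and `gamma` > 1 lifts the midtones in between.
    """
    t = (x - black) / (white - black)
    t = min(1.0, max(0.0, t))  # clamp to the valid range
    return t ** (1.0 / gamma)


# Hypothetical samples: stretch the 0.2-0.8 band across the full range so
# nearby depths (say, nose tip versus cheeks) end up farther apart.
for value in (0.1, 0.5, 0.7, 0.9):
    print(value, "->", round(levels(value, black=0.2, white=0.8, gamma=1.0), 3))
```

Narrowing the black/white range increases the separation between depths inside it, at the cost of clipping everything outside.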

CoffeeVectors • 2 years ago

To do the displacement in Blender, I basically followed this tutorial. It explains the process better than I can. 7/12

CoffeeVectors • 2 years ago

Note that the displacement map I ended up with wasn't a perfect reconstruction of a nose and lips. I used my camera angles/lighting to reduce how noticeable the displacement artifacts were. It's not as robust as a fully modeled scene obvs, but it's much faster to make. 8/12

CoffeeVectors • 2 years ago

Here's my shader setup/modifier stack in Blender. In the shader I used a high metallic value to up the contrast and make the shader pop in 3D lighting (similar to metallic photo prints). Also, I gave the shader a low emission value to mix the original colors with relighting. 9/12

CoffeeVectors • 2 years ago

Then I added a red side light to accent things using a long, thin area light. I also keyframed a slowly strobing point light moving through the scene to showcase depth. Note how a low emission shader keeps detail in the shadows while allowing relighting to make changes. 10/12

CoffeeVectors • 2 years ago

I also used a camera with an f/0.8-2.0 aperture to get a very narrow depth of field, which creates bokeh, further showcases depth, and draws attention to the skin texture. Then I edited it to a Florence + the Machine song. 11/12
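To get a feel for how narrow that is, here's a rough calculation using the standard thin-lens depth-of-field formulas. The 50 mm focal length, 1 m subject distance, and 0.03 mm circle of confusion are assumptions for illustration only; the thread doesn't give the actual camera settings beyond the f-stop range.

```python
def depth_of_field_mm(f_number, focal_mm=50.0, subject_mm=1000.0, coc_mm=0.03):
    """Approximate total depth of field in mm (standard thin-lens formulas)."""
    hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
    if subject_mm >= hyperfocal:
        return float("inf")  # everything beyond the near limit is in focus
    near = subject_mm * (hyperfocal - focal_mm) / (hyperfocal + subject_mm - 2 * focal_mm)
    far = subject_mm * (hyperfocal - focal_mm) / (hyperfocal - subject_mm)
    return far - near


# Compare the two ends of the f-stop range mentioned above.
for f_number in (0.8, 2.0):
    print(f"f/{f_number}: about {depth_of_field_mm(f_number):.0f} mm in focus")
```

Under these assumed settings, only a few centimeters are sharp at either end of the range, which is why the focus falls off so quickly across the face.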

CoffeeVectors • 2 years ago

Hope this is helpful for some of you and let me know if you have any questions! And if you make something cool with these tools let me know! Would love to see how everyone is experimenting and pushing things forward! 12/12

Related videos

In collaboration with Intel, our Depth Fusion showcases the power of our LDM3D diffusion model in generating 360° views from text prompts provided by the user. The LDM3D diffusion model generates a 2D RGB image and its corresponding relative depth map, providing a complete RGBD representation of the text prompt. The LDM3D model is a specialized version of the Stable Diffusion v1.4 model that has been modified to fit both image and depth map data. The model was then fine-tuned on a subset of the LAION-400M dataset, a large-scale image-caption dataset. The depth maps used to fine-tune our model were generated by the DPTBeiT large 512 depth estimation model, which provides highly accurate relative depth estimates for each pixel. We take the generated 2D RGB image and depth map and use them to compute a 360° projection using TouchDesigner. TouchDesigner is a versatile platform that allows for the creation of immersive and interactive multimedia experiences. Our application harnesses the power of TouchDesigner to bring the generated 360° views to life, providing users with a unique and engaging way to experience their text prompts. Whether it's a description of a tranquil forest, a noisy cityscape, or a futuristic sci-fi world, our Depth Fusion can bring these concepts to life in vivid and immersive detail. - Scottie Fox, VP Engineering Blockade Labs ScottieFox #AI #VR #3D #gamedev #stablediffusion

Blockade Labs

11,439 views • 3 years ago