Finally! #PHORHUM -- our 3D human reconstruction model from a single image -- is available to the research community 🎉 PHORHUM is joint work with Mihai Zanfir & Cristian Sminchisescu. How to get access: 👇

Thiemo Alldieck

1,788 subscribers

12,764 views • 3 years ago • via X (Twitter)

Science & Technology • News & Politics • Education • #PHORHUM

4 Comments

Thiemo Alldieck • 3 years ago

1. Go to
2. Request access
3. We will get back to you shortly

If you or your group already has access to the "Google 3D Human Models" repository, you should already have access by now or will be given access in the next few days! Questions? DMs are open!

Babusi Nyoni • 3 years ago

@MihaiZanfir5 @CSminchisescu I waited so long for this you have no idea 🎉

Mohamed Abdelhamid 👨‍💻 • 3 years ago

@MihaiZanfir5 @CSminchisescu 😆 Amazing, I wish I had participated in that project. I proposed this idea for my graduation project 4 months ago, but unfortunately our professors were not experienced enough and rejected this idea and others. It was the same, but I wanted to apply it to Yu-Gi-Oh monster cards.

Mohamed Abdo • 3 years ago

@MihaiZanfir5 @CSminchisescu Great job, congrats,

Related Videos

🚀Turn Single Image into 3D Human🚀 #GeneMAN is a generalizable single-image 3D human reconstruction framework that turns in-the-wild images into high-quality 3D humans with ease 🔗Project: 📜Paper: 🧑‍💻Code:

Ziwei Liu

26,953 views • 1 year ago

Early R&D using our SP-6M dataset. Exploring image-to-3D reconstruction from single images, including heavily modified inputs (lighting, hair, etc). Still a work in progress.

3D Scanstore

53,071 views • 3 months ago

SAM 3D enables accurate 3D reconstruction from a single image, supporting real-world applications in editing, robotics, and interactive scene generation. Matt, a SAM 3D researcher, explains how the two-model design makes this possible for both people and complex environments. 🔗 Read the SAM 3D Objects research paper: 🔗 Read the SAM 3D Body research paper:

AI at Meta

17,858 views • 8 months ago

📢📢 Avat3r 📢📢 Avat3r creates high-quality 3D head avatars from just a few input images in a single forward pass with a new dynamic 3DGS reconstruction model. Video: Project: Our core idea is to make Gaussian Reconstruction Models animatable. We find that a simple cross-attention to an expression code sequence is already sufficient to model complex facial expressions. We then incorporate position maps from DUSt3R and feature maps from Sapiens to facilitate the prediction task. While DUSt3R's position maps act as a pixel-aligned initialization for the Gaussians' positions, the Sapiens feature maps help the cross-view transformer to match corresponding image tokens in the 4 input images. One major challenge in creating a 3D head avatar from smartphone images comes from inconsistent facial expressions when the subject could not remain perfectly static during the capture. We eliminate this static requirement by simply showing our model input images with different facial expressions during training. This technique makes our model robust to inconsistent input images later on. Finally, we show that although the model was trained with 4 input images, one can even create a 3D head avatar when only a single image is available. To achieve this, we employ a pre-trained 3D GAN to lift the single image to 3D and then render the 4 input images for our model. This allows us to create 3D head avatars from single images and even highly out-of-distribution examples like AI-generated faces, paintings or statues. Great work by Tobias Kirschstein from his internship at Meta with Javier Romero, Artem Sevastopolsky, and Shunsuke Saito

Matthias Niessner

74,763 views • 1 year ago

📢📢 PercHead: Perceptual Head Model for Single-Image 3D Head Reconstruction & Editing 📢📢 PercHead reconstructs realistic 3D heads from a single image and enables disentangled 3D editing via geometric controls and style inputs from images or text. At its core is a generalized 3D head decoder trained with perceptual supervision from DINOv2 and SAM 2.1. We find that our new perceptual loss formulation improves reconstruction fidelity compared to commonly used methods such as LPIPS. Our trained reconstruction model is able to generate 3D-consistent heads from a single input image. Even with challenging side-view inputs, the model robustly infers missing regions for a coherent, high-fidelity output. In addition, our architecture seamlessly adapts to downstream tasks: by swapping the encoder, we can transform the model into a disentangled 3D editing pipeline. In this scenario, we can control geometry through (potentially hand-drawn) segmentation maps, and condition style via image or text prompts. We also provide an interactive GUI to enable the exploration of our editing pipeline. 🌍 📽️ Great work by Antonio Oroz and Tobias Kirschstein

Matthias Niessner

18,855 views • 8 months ago

📢 A Recipe for Generating 3D Worlds From a Single Image 📢 Our recipe explains how existing generative models can be adapted with minimal training effort to generate 3D worlds from a single input image.

Katja Schwarz

13,970 views • 1 year ago

Meet our new and fast 3D sculpting model! ✅ Single image to mesh with detailed geometry on the order of minutes (more GPUs coming to get to <30-60s) ✅ Dense mesh re-topology using our custom language model (minutes) Available on all Cube plans:

Common Sense Machines

53,311 views • 1 year ago

Thrilled to share our new work on Reconviagen✨! A key challenge in 3D creation is the alignment of generative 3D with observational input. Our method solves this by grounding the generative process in 3D reconstruction. Try it at: #Reconstruction #AIGC

Chongjie Ye

11,624 views • 10 months ago

Our new work PSHuman reconstructs a detailed 3D human mesh from a single-view image in ~1min. The code has already been released, and you are welcome to try it! Project page: Codes: Paper:

Yuan Liu

53,252 views • 1 year ago

📢Pixel3DMM: Versatile Screen-Space Priors for Single-Image 3D Face Reconstruction📢 -> highly accurate face reconstruction by training powerful ViTs via surface normals and UV-coordinates estimation. The geometric cues from our 2D foundation model backbone constrain the 3DMM parameters, which allows us to achieve remarkable reconstruction accuracy - works for both single images and videos! In addition, we introduce a new 3D face reconstruction benchmark that evaluates both neutral and posed face geometry. 🌍 📷 Great work by Simon Giebenhain, Tobias Kirschstein, Martin Rünz, and Lourdes Agapito

Matthias Niessner

62,269 views • 1 year ago

🌟Transform an image into a 3D model in just 5 easy steps! 1️⃣Visit the Hunyuan 3D website and log in: 2️⃣Navigate to the "3D Creation" page 3️⃣Choose the "Image to 3D" feature 🎨 4️⃣Upload your image and click "Generate Immediately" 5️⃣Wait a moment, and voilà, your stunning 3D model is ready! ⏳✨ Watch our tutorial video to get started! 😸🎥

Hunyuan

101,468 views • 1 year ago

Krea 1 is now available to everyone. today, we're releasing the public beta of our first image model trained in collaboration with Black Forest Labs to offer superior aesthetic control and image quality. learn how to use it for free 👇

KREA AI

286,441 views • 1 year ago

Today, we are releasing Stable Video Diffusion, our first foundation model for generative AI video based on the image model, Stable Diffusion. As part of this research preview, the code, weights, and research paper are now available. Additionally, today you can sign up for our waitlist to access a new upcoming web experience featuring a Text-To-Video interface. To access the model & sign up for our waitlist, visit our website here:

Stability AI

1,024,532 views • 2 years ago

Marble is the first product from World Labs and is powered by our multimodal world model, which lets anyone create high-fidelity, persistent 3D worlds from just a single image, video, text prompt, or 3D layout. Read more at

World Labs

117,233 views • 8 months ago

3D editing is hard: you need to ground an image + instruction and generate a faithful 3D shape in one forward pass -- no test-time optimization. So, we steer pretrained image-to-3D representations to do text-guided 3D edits; no massive 3D edit-pair dataset needed. Key trap: the “no-edit” solution is a nasty local minimum. We fix it with preference optimization, pushing the model to actually edit. Steer3D is the second work that adapts alignment ideas from LLMs to the 3D modality. SAM 3D also used DPO to improve its 3D generations.

Georgia Gkioxari

116,061 views • 7 months ago

Mapping how neurons connect is one of neuroscience’s biggest challenges. MouseLight (from our Janelia Research Campus) is building a dataset of fully traced neurons — available to researchers everywhere, & helping reveal how brain circuits work & what goes wrong in disease.

HHMI

20,392 views • 3 months ago

Meta PARTNR is a research framework supporting seamless human-robot collaboration. Building on our research with Habitat, we’re open sourcing a large-scale benchmark, dataset and large planning model that we hope will enable the community to effectively train social robots.

AI at Meta

106,951 views • 1 year ago

HOLY SHIT! Sparc 3D is INSANE! All they need to do now is enable PBR texture support, object separation and decimation and this in my opinion completely replaces photogrammetry. The quality of the reconstruction of these 3D objects from a single image is mind blowing to me. Yes, the model is insanely dense to allow all of this detail but the fact that AI can generate with this level of fidelity is CRAZY!!! You can try it for free on Huggingface

Travis Davids

18,230 views • 1 year ago

I'm excited to share our new work, VistaDream, which generates a 3D Gaussian field from a single-view image. The code has already been released. Project page: (with interactive demos) Code: Paper:

Yuan Liu

79,592 views • 1 year ago