Code and data are now online for CameraHMR, our state-of-the-art parametric 3D human pose and shape (HPS) estimation method that will appear at #3DV2025. There are 4 key contributions that make it so accurate and robust: 1. To get accurate 3D shape and pose as well as good alignment...

21,647 views • 1 year ago • via X (Twitter)

9 Comments

Jerin Philip • 1 year ago

Nice work, currently looking into using it. When is the release of CamSMPLify + HumanFoV + DenseKP expected? What are the hardware requirements to train this end-to-end? 4D-Humans mentions 8 A100 GPUs for 7 days; does this require something similar?

Michael Black • 1 year ago

All code will be out before 3DV takes place.

Digital Currency • 2 years ago

From 3D modeling to VR/AR development, our MSc in Metaverse program equips you with the technical skills to excel in the rapidly evolving digital world. Don't miss out—enroll today! #UNIC #MScMetaverse

Duke 'Burrito Haver' Zero • 1 year ago

could you embed a sentiment layer keyed to facial expressions / body language?

Michael Black • 1 year ago

This method is just about pixels to parameters but, yes, relating these parameters to expressions and body language is something we are interested in. E.g., have a look at our work on generating moving people from audio:

lotsoflittleprojects • 1 year ago

Will this model get folded into Meshcapade?

Michael Black • 1 year ago

The next Meshcapade release will have many goodies that go beyond CameraHMR. Coming soon!

Matt Jaynes • 1 year ago

@camenduru Damn! Impressive!

JohnYue122333 • 1 year ago

MPI, awesome!
