byebye expensive motion tracking equipment 👋 ai makes motion capturing so easy now! nvidia presented GENMO last week, a new model that can generate and estimate human motion from text, audio, video, and 3D keyframes

Dreaming Tulpa 🥓👑

54,918 subscribers

73,693 views • 1 year ago • via X (Twitter)

Health & Wellness, Science & Technology, Education

10 comments

PowerBeatsVR • 3 years ago

Get ready for a full-body VR workout that’s fun, fast, and intuitive — Play PowerBeatsVR (Now on Meta Quest) 🔥

Paliesk Debesį • 1 year ago

Is it real time? Would love to use something like this for VR chat.

Adrian Werner • 1 year ago

It's not going to replace expensive motion tracking for high-end production because the fidelity is too low. But it is cool stuff to have for indie studios. It's not anything new; plenty of such systems are already in use. For example, inZOI has an in-house one.

NΞXUS STUDIO ⒶI • 1 year ago

Awesome, is it possible to generate a tracking shot of a car too?

Atiko 💎 • 1 year ago

Wow

BLENDER SUSHI 🫶 X - 24/7 Blenderian • 1 year ago

Fingers typing behind clothes :)

JSFILMZ • 1 year ago

ai mocap been out for like 5 years

WaveSpeedAI • 1 year ago

Cool!

Jorge • 1 year ago

some people might complain about "le ai is taking le work" but actually you still gotta know dem moves I've seen people doing mocaps at home and moving really weirdly, with like 3000€ equipment

rey • 1 year ago

@grok buddy, wt the hell is going on, I thought this wasn't suppose to come till 2030 ,r we in the singularity already 😂

Related videos

AI can synthesize dynamic textures on 3D meshes without UV maps! MeshNCA is a model that can generate dynamic textures from images, text prompts, and motion vector fields.

Dreaming Tulpa 🥓👑

79,605 views • 2 years ago

The free UE5.5 AI motion capture plugin is compatible with Unity, Blender, and VMC. It supports both real-time motion capture via webcam and video uploads to generate 3D motion data. Powered by a 1-billion-parameter motion model running locally on your computer, it requires 8GB of VRAM for real-time processing. Compatible with Nvidia, AMD, and Intel GPUs, it can leverage NVIDIA CUDA to significantly boost graphics computation efficiency.

CYANPUPPETS

17,842 views • 1 year ago

Next-level motion from @HeyGen_Official I got to play with the new AI Motion from HeyGen, and it blew my mind! You generate an image and describe the motion; then, you can generate long videos from that avatar. Sound effects included! 1. Tales from The Heartland

Alex Patrascu

57,728 views • 1 year ago

MotionGPT: Human Motion as a Foreign Language paper page: Though the advancement of pre-trained large language models unfolds, the exploration of building a unified model for language and other multi-modal data, such as motion, remains challenging and untouched so far. Fortunately, human motion displays a semantic coupling akin to human language, often perceived as a form of body language. By fusing language data with large-scale motion models, motion-language pre-training that can enhance the performance of motion-related tasks becomes feasible. Driven by this insight, we propose MotionGPT, a unified, versatile, and user-friendly motion-language model to handle multiple motion-relevant tasks. Specifically, we employ the discrete vector quantization for human motion and transfer 3D motion into motion tokens, similar to the generation process of word tokens. Building upon this "motion vocabulary", we perform language modeling on both motion and text in a unified manner, treating human motion as a specific language. Moreover, inspired by prompt learning, we pre-train MotionGPT with a mixture of motion-language data and fine-tune it on prompt-based question-and-answer tasks. Extensive experiments demonstrate that MotionGPT achieves state-of-the-art performances on multiple motion tasks including text-driven motion generation, motion captioning, motion prediction, and motion in-between.

AK

125,319 views • 3 years ago
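
The MotionGPT abstract above mentions turning 3D motion into discrete "motion tokens" through vector quantization. As a rough illustration only, the sketch below shows the generic nearest-codebook lookup that vector quantization relies on. The abstract does not give implementation details, so everything here is an assumption: the two-dimensional toy frames, the hand-written `codebook`, and the helper names `quantize_motion` and `decode_tokens` are hypothetical and are not taken from the paper, where the codebook would be learned from data.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize_motion(frames: Sequence[Sequence[float]],
                    codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each motion frame to the index of its nearest codebook entry."""
    tokens = []
    for frame in frames:
        distances = [squared_distance(frame, code) for code in codebook]
        tokens.append(distances.index(min(distances)))
    return tokens


def decode_tokens(tokens: Sequence[int],
                  codebook: Sequence[Sequence[float]]) -> List[List[float]]:
    """Turn token indices back into (approximate) motion frames."""
    return [list(codebook[t]) for t in tokens]


# Toy, hand-written codebook (assumption: a real one would be learned).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

# Toy "motion": four frames of a 2-D pose feature (assumption).
frames = [[0.1, -0.1], [0.9, 0.2], [0.2, 0.8], [0.05, 0.9]]

tokens = quantize_motion(frames, codebook)
reconstructed = decode_tokens(tokens, codebook)
print(tokens)
```

Once motion is a sequence of integer tokens like this, it can in principle be mixed with word tokens in one vocabulary, which is the idea the abstract describes as treating motion as a language.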

Grok Imagine Video 1.5 is live on AITOPIA. @xAI's latest video model is here. Generate from text or image. Native audio with dialogue, music, and effects. Realistic motion and strong prompt adherence. Try it now →

AITOPIA

46,050 views • 1 month ago

new Loopy model from ByteDance can generate whole videos of realistic face motion from just ONE IMAGE and a SOUND getting that feeling again...

the real deepfates

305,897 views • 1 year ago

animators are not needed anymore this 3D AI motion capture plugin can convert character movement from real video to 3D data and.. you can apply the motion to any 3D character.. link in comments

el.cine

64,585 views • 8 months ago

Splatter a Video can turn a video into a 3D Gaussian representation! This allows for enhanced video tracking, depth prediction, motion and appearance editing, and stereoscopic video generation.

Dreaming Tulpa 🥓👑

11,576 views • 2 years ago

Excited to share our latest work on 🎧spatial audio-driven human motion generation. We aim to tackle a largely underexplored yet important problem of enabling virtual humans to move naturally in response to spatial audio—capturing not just what is heard, but also where the sound is coming from. To this end, we introduce the Spatial Audio-Driven Human Motion (SAM) dataset—the first comprehensive dataset featuring paired high-quality human motion and spatial audio recordings. For benchmarking, we develop a generative framework for human MOtion generation driven by SPAtial audio, termed MOSPA, which learns to synthesize realistic and diverse human motions conditioned on spatial audio input. We hope this research could provide a foundation for future research in spatial perception, virtual characters, and embodied AI. The dataset and model will be open-sourced soon. A big thank you to our intern, Shuyang Xu, for the wonderful collaboration! Congratulations, Shuyang! Project page: Paper: Video: #Animation #CG #CV #AIGC #DL #Deeplearning #Motion #Graphics #AI #GenerativeAI

Zhiyang (Frank) Dou

14,610 views • 1 year ago

NVIDIA just revealed MotionBricks, a new AI model trained on more than 350,000 motion clips. What’s really impressive is that the same model can animate a digital character and control a humanoid robot in the real world.

Justin Ryan

84,193 views • 12 days ago

Sonilo has released Sound Effects 1.0, a video-native audio model that can read on-screen motion, scene context, and timing. > It can make sound effects synced to the footage. > Video-to-Sound-Effects runs automatically without requiring a prompt. > Text-to-Sound-Effects is able to generate standalone assets from a written description.

🚨 AI News | TestingCatalog

15,523 views • 9 days ago

Full body motion, audio driven timing, camera direction, and world control in one model. Omnia moves past stiff avatar clips and into real performance. It understands motion, sound, text, and the scene together, so what you create feels directed, not generated. Available now.

Hedra

353,217 views • 5 months ago

Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation paper page: Recent advances in generative modeling have led to promising progress on synthesizing 3D human motion from text, with methods that can generate character animations from short prompts and specified durations. However, using a single text prompt as input lacks the fine-grained control needed by animators, such as composing multiple actions and defining precise durations for parts of the motion. To address this, we introduce the new problem of timeline control for text-driven motion synthesis, which provides an intuitive, yet fine-grained, input interface for users. Instead of a single prompt, users can specify a multi-track timeline of multiple prompts organized in temporal intervals that may overlap. This enables specifying the exact timings of each action and composing multiple actions in sequence or at overlapping intervals. To generate composite animations from a multi-track timeline, we propose a new test-time denoising method. This method can be integrated with any pre-trained motion diffusion model to synthesize realistic motions that accurately reflect the timeline. At every step of denoising, our method processes each timeline interval (text prompt) individually, subsequently aggregating the predictions with consideration for the specific body parts engaged in each action. Experimental comparisons and ablations validate that our method produces realistic motions that respect the semantics and timing of given text prompts.

AK

126,585 views • 2 years ago
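
The timeline-control abstract above describes handling each timeline interval on its own at every denoising step and then merging the predictions per body part. The sketch below illustrates only that merging idea on toy data; it is not the paper's method. Several things are assumptions made for illustration: the `Interval` class, the fixed `BODY_PARTS` tuple, the hand-assigned body parts per prompt, the rule that a later interval overwrites an earlier one on shared body parts, and `toy_denoise`, which stands in for a real pre-trained motion diffusion model and simply labels each frame with its prompt.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

BODY_PARTS = ("legs", "torso", "arms")  # assumption: a coarse toy split


@dataclass
class Interval:
    prompt: str
    start: int                # first frame, inclusive
    end: int                  # last frame, exclusive
    parts: Tuple[str, ...]    # body parts this action controls (hand-assigned)


def toy_denoise(prompt: str, length: int) -> List[Dict[str, str]]:
    """Stand-in for one denoising prediction of a motion diffusion model."""
    return [{part: prompt for part in BODY_PARTS} for _ in range(length)]


def aggregate_step(
    num_frames: int,
    timeline: List[Interval],
    denoise: Callable[[str, int], List[Dict[str, str]]],
) -> List[Dict[str, Optional[str]]]:
    """Run the denoiser per interval, then stitch results per body part."""
    # None marks frame/body-part slots that no interval covers.
    combined: List[Dict[str, Optional[str]]] = [
        {part: None for part in BODY_PARTS} for _ in range(num_frames)
    ]
    for interval in timeline:
        length = interval.end - interval.start
        prediction = denoise(interval.prompt, length)
        for offset in range(length):
            for part in interval.parts:
                # Assumption: later intervals overwrite earlier ones.
                combined[interval.start + offset][part] = prediction[offset][part]
    return combined


timeline = [
    Interval("walk", 0, 4, ("legs", "torso", "arms")),
    Interval("wave", 2, 4, ("arms",)),
]
result = aggregate_step(4, timeline, toy_denoise)
for frame_index, frame in enumerate(result):
    print(frame_index, frame)
```

In this toy run the legs and torso follow "walk" for all four frames, while the arms switch to "wave" for frames 2 and 3, which mirrors the overlapping-interval case the abstract mentions.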

Introducing Wan 2.5 🚀 Create immersive AI videos with fluid motion, built-in audio and voice. Supporting text-to-video and start images Available now for all users

Magnific

170,229 views • 10 months ago

Loopy Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency paper page: With the introduction of diffusion-based video generation techniques, audio-conditioned human video generation has recently achieved significant breakthroughs in both the naturalness of motion and the synthesis of portrait details. Due to the limited control of audio signals in driving human motion, existing methods often add auxiliary spatial signals to stabilize movements, which may compromise the naturalness and freedom of motion. In this paper, we propose an end-to-end audio-only conditioned video diffusion model named Loopy. Specifically, we designed an inter- and intra-clip temporal module and an audio-to-latents module, enabling the model to leverage long-term motion information from the data to learn natural motion patterns and improving audio-portrait movement correlation. This method removes the need for manually specified spatial motion templates used in existing methods to constrain motion during inference. Extensive experiments show that Loopy outperforms recent audio-driven portrait diffusion models, delivering more lifelike and high-quality results across various scenarios.

AK

128,803 views • 1 year ago

Space potato? 🥔 Thanks to observations from Hubble and the Keck Observatory, astronomers were able to generate a 3D model of the galaxy M87. By tracking the motion of stars around the galaxy’s center, they determined that the galaxy is potato-shaped:

Hubble

275,318 views • 3 years ago

Pure Precision. Move AI is pushing 3D character animation and motion capture forward with these powerful releases: – Dex by Move AI: significantly improved hand & finger tracking – New desktop apps: Move Pro now available on Windows, macOS, and Linux – Editor (Beta): fine-tune motion input data with precision Built for animators, developers, and researchers who demand lifelike motion — with total control, without suits. Watch the full video here →

Move AI

33,144 views • 1 year ago

Tracking Everything Everywhere All at Once paper page: We present a new test-time optimization method for estimating dense and long-range motion from a video sequence. Prior optical flow or particle video tracking algorithms typically operate within limited temporal windows, struggling to track through occlusions and maintain global consistency of estimated motion trajectories. We propose a complete and globally consistent motion representation, dubbed OmniMotion, that allows for accurate, full-length motion estimation of every pixel in a video. OmniMotion represents a video using a quasi-3D canonical volume and performs pixel-wise tracking via bijections between local and canonical space. This representation allows us to ensure global consistency, track through occlusions, and model any combination of camera and object motion. Extensive evaluations on the TAP-Vid benchmark and real-world footage show that our approach outperforms prior state-of-the-art methods by a large margin both quantitatively and qualitatively.

AK

280,547 views • 3 years ago