MVP of Multiview Video → Camera parameters + 3D keypoints. Visualized with Rerun.

The basic pipeline as of right now looks like this:

1. Capture 🔴 – Using 4 iPhones and an Insta360 Go. iPhone videos are captured via Final Cut Pro Multicam for easy sync and the exocentric...
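To make the "camera parameters + 3D keypoints" step concrete, here is a minimal sketch of linear (DLT) triangulation: given projection matrices for several synced views and the 2D location of the same keypoint in each, it recovers the 3D point. This is an illustration under stated assumptions, not necessarily the method used in the project. The intrinsics, camera positions, and the helper names `projection_matrix`, `triangulate_dlt`, and `project` are all made up for the example; in a real pipeline the camera parameters would be estimated and the 2D keypoints would come from a detector.

```python
import numpy as np


def projection_matrix(K, R, t):
    """Build a 3x4 projection matrix P = K [R | t]."""
    return K @ np.hstack([R, t.reshape(3, 1)])


def project(P, X):
    """Project a 3D point X (shape (3,)) to pixel coordinates with P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


def triangulate_dlt(proj_mats, points_2d):
    """Triangulate one 3D point from two or more views via linear DLT.

    Each view contributes two rows to a homogeneous system A X = 0;
    the solution is the right singular vector with the smallest
    singular value.
    """
    rows = []
    for P, (u, v) in zip(proj_mats, points_2d):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # de-homogenize


# Hypothetical setup: four cameras sharing one intrinsic matrix,
# offset along the x-axis and all looking down +z.
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
cams = [
    projection_matrix(K, np.eye(3), np.array([tx, 0.0, 0.0]))
    for tx in (-1.0, -0.3, 0.3, 1.0)
]

# A made-up ground-truth keypoint and its noise-free 2D observations.
keypoint_3d = np.array([0.2, -0.1, 4.0])
observations = [project(P, keypoint_3d) for P in cams]

recovered = triangulate_dlt(cams, observations)
print(recovered)
```

With real detections the observations are noisy, so the DLT result is usually treated as an initial estimate and refined, for example by minimizing reprojection error.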

42,785 views • 1 year ago • via X (Twitter)

11 Comments

Pablo Vela • 1 year ago

Links (still a work in progress, but wanted to share for folks who want to dig in):
1. Saved RRD visualization – <
2. Multicam ego/exo sync app – <
3. 3D person detection + triangulation – <

Pablo Vela • 1 year ago

Posted the wrong link for the RRD! Updated here <

Jianyuan Wang • 1 year ago

@rerundotio Yeah, I guess VGGT cannot predict high-quality depth maps for dynamic multi-view inputs yet.

Pablo Vela • 1 year ago

@rerundotio Yeah, still, it performs better than I expected. Thankfully it produces extremely useful camera parameters. I'm sure it wouldn't be too hard to extend to 4D using something like as an additional dataset

Daniel • 1 year ago

@rerundotio Very cool

Pablo Vela • 1 year ago

@rerundotio Thank you! Slowly but surely 🫡

Neil Nie • 1 year ago

@rerundotio Impressive pipeline! Thanks for sharing!

Pablo Vela • 1 year ago

@rerundotio Thanks for the kind words!

StudioGaltMocap • 1 year ago

@rerundotio Very cool!

zeke • 1 year ago

@rerundotio Wow. This has so much potential. Can’t wait to see the progress.
